An evaluation that confirmed users loved the rebuilt Telemetry Explorer, then looked past the applause to the harder, more valuable problem underneath: people were not confident they were querying the right data to answer their question.
Telemetry Explorer is the rebuilt tool engineers use to chart how their cloud systems are behaving. Before its public launch, the team wanted to know whether the new design beat the old one. It did: a strong majority of users preferred it, and they loved being able to compare several metrics on one chart. But reading the interviews and the survey side by side, I saw the same struggle in both, no matter the company or skill level. People were happy with the new layout but not confident they were building the right query to answer their actual question, especially once it got complex. I reframed the readout around that gap, turning a simple thumbs-up into the brief for the next phase: lean on the interface itself to guide people toward the right query.
The study was framed as a simple go or no-go: did the rebuilt explorer beat the old one? It did, and stopping there would have produced a clean, satisfying, and slightly hollow readout.
The redesign itself was a real improvement. It moved the chart from a cramped side panel to the center of the screen, brought controls that used to be buried in menus into plain sight, and let people stack and compare several metrics on one chart.
Reading the interviews and the survey together, one thing repeated regardless of company or skill level: people liked the new layout but were not confident they were querying their data the right way. The satisfaction scores were really measuring the layout. The thing that would make or break the product sat one layer down, in building the query itself.
One survey item was written as a deliberately extreme statement, that the tool was already perfect and could not be improved, to confirm people were reading carefully rather than agreeing by default. They disagreed, which both validated the responses and showed they saw room to grow.
That reframe pointed the next phase at the query itself: guide people toward the right one with examples, in-product help, and eventually plain-language querying, so getting it right stops depending on trial and error.
The study was set up to answer one question: do users like the rebuilt explorer? They did, clearly, and the easy version of this readout was a single slide that said success, ship it. That would have been true and nearly useless. What stopped me was reading the interviews and the survey side by side rather than separately. The same struggle surfaced in both, at every skill level: people were not confident they were building the right query to answer their question, especially once it involved several steps or several metrics. The satisfaction scores were measuring the layout. The real risk to the product sat one layer underneath, in the querying itself.
“The hardest task is understanding the different metrics and analyzing the data to spot patterns and trends.”