Not long ago, usability measures and Web analytics were few and far between. The usual standards amounted to little more than task completion, error rates, and click streams. Yet, they served us well.
Some years ago, when I relayed one telling measure (how many clicks it took to find a book) to clients at a large metropolitan library group, the room fell silent. Finding a book on a library web site should have been, as my father was fond of saying, “as easy as shooting fish in a barrel.” In our test sessions, however, it took eight of 12 participants an average of 6.25 clicks to find John Grisham’s book, A Painted House. The benchmark for the task was one click.
All but a couple of the participants meandered through pages looking for the best-selling book without feeling they were progressing toward their goal. Some participants clicked 18 or 20 times before giving up. Of all the performance data in our 147-page report, this one piece of information, the number of clicks it took to find the Grisham book, moved the client to take action.
“The Three Cs”
This was back in 2002. Now, of course, we have more measures in our toolbox. Or, at least, we should. While the old standards are still useful, the digital spaces we try to improve have become much more complex, and so, too, have clients’ expectations for functionality and a return on their investment.
Whether you call this a Web 2.0 era or not, there is no disputing that most clients these days care more than they ever did before about the “Three Cs”: Customers, Competitors, and Conversion. Click streams have made room for bounce rates, search analytics, and so much more. If we play our cards right, we can reduce and synthesize the raw data and give our clients more meaningful information that foments action.
Emblematic Measures Have Teeth
Of all the data we report, certain measures are more meaningful than others. I call these emblematic measures. In dictionary terms, something emblematic is a “visible symbol of something abstract” that is “truly representative.”
In our presentation to the library group, the rate for the Grisham task was emblematic. That is, the measure was representative of the library website’s greater inadequacies: its failure to fulfill the basics of its fundamental purpose and meet its customers’ needs. In turn, the measure was understandable to the client on a visceral level because it was firmly planted in their business objectives.
“Emblematic measures ensure that the data are always in the service of the business,” writes Avinash Kaushik, author of Web Analytics: An Hour a Day. “I can’t tell you how many times I run into data simply existing for the sake of data with people collecting and analyzing data with the mindset that the sole reason for the business’s existence is so that it can produce data (for us to analyze!).”
However, not all of the measures we deliver to clients are emblematic, nor should they be. Emblematic measures need to epitomize the entire study’s findings eloquently and elegantly. In layman’s terms, emblematic measures are a lot like the best line from a classic movie: It’s not the only line, but it’s the one that is remarkable, memorable, and eminently quotable.
Emblematic measures are far from prescriptive, static, or context-free, too. With every bit of user experience research we conduct and on each and every site, the measures will surely vary, given the context of testing, the sample, the tasks assigned, the business objectives for the site, the functionality being studied, and so on.
Therein lies one challenge of our daily work.
The Site Abandonment Measure
Fast forward from 2002 to Summer 2006. During a usability test of a philanthropic extranet for a large foundation, we measured the occurrence of something we had seen happening a million times. We used to think it was just too obvious to formalize and report to clients.
But this time, we found our emblematic measure.
We call this measure a Site Abandonment Measure (SAM). We define a SAM as the percentage (or number) of participants who give up on a specific task (or set of tasks), leave a site altogether, and turn to another source—any source—to get a task done. Put simply, it’s the “I quit—I’ve had it with your site” rate.
When we asked our 15 participants to make a recommendation for a grant in support of a local Special Olympics team, 53 percent of the sample abandoned the task altogether. Participants told us they would complete the task elsewhere (usually by using a phone to call the Special Olympics or the foundation directly).
We also found the SAM was significant for informational tasks. When we asked participants to get the latest tax return for the Special Olympics group, 40 percent of the sample left the site altogether and went directly to the Special Olympics site for the information.
Overall, the SAM for the foundation’s site was 38.6 percent for the ten key tasks on the extranet. This showing was pretty dismal, especially given the context of our research. We were, after all, testing an extranet with the sole purpose of letting users manage their philanthropic funds, not an e-commerce site with click-through rates on ads. (There are no formal usability standards for unacceptable SAM rates, as far as we know.)
This means that, on average, on any one task, about six out of every 15 participants agreed to take on the task we asked of them, went through the first motions, and then eventually gave up not only on the task, but on the entire site.
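For readers who want to tally a SAM from their own session notes, the arithmetic can be sketched roughly as follows. This is a minimal sketch, not the calculation we ran for the report: the names `TaskResult`, `sam`, and `pooled_sam` are hypothetical, and the raw counts (8 of 15 and 6 of 15) are inferred from the 53 percent and 40 percent figures above rather than taken from the study data.

```python
from dataclasses import dataclass


@dataclass
class TaskResult:
    task: str           # short label for the task (hypothetical field name)
    participants: int   # participants who attempted the task
    abandoned: int      # participants who gave up and turned to another source


def sam(result: TaskResult) -> float:
    """Site Abandonment Measure for one task, as a percentage."""
    if result.participants <= 0:
        raise ValueError("participants must be positive")
    return 100.0 * result.abandoned / result.participants


def pooled_sam(results: list[TaskResult]) -> float:
    """SAM across several tasks, pooling every participant-task attempt."""
    attempts = sum(r.participants for r in results)
    abandoned = sum(r.abandoned for r in results)
    if attempts <= 0:
        raise ValueError("no attempts recorded")
    return 100.0 * abandoned / attempts


# Counts inferred from the reported percentages (assumption): 8 of 15 and 6 of 15.
grant = TaskResult("grant recommendation", participants=15, abandoned=8)
tax_return = TaskResult("latest tax return", participants=15, abandoned=6)

for result in (grant, tax_return):
    print(f"{result.task}: SAM = {sam(result):.0f} percent")
```

Pooling attempts is only one plausible way to combine tasks; averaging the per-task rates is another. With only two of the ten tasks entered here, `pooled_sam` will not reproduce the 38.6 percent overall figure.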
When we presented the findings to the client, the show-stopper was the Special Olympics task and the corresponding SAM. How could they have laid down cold, hard cash for a site that failed to let over half of the test participants make a grant recommendation online?
SAMs vs. SARs
SAMs may bring to mind Web analytics and their main use on e-commerce sites. Under the hood, of course, the data is as different as a Porsche from a Prius.
Web analytics, such as conversion rates and the narrower site abandonment rates (SARs) for measuring user interaction with shopping carts, leverage quantitative data extracted from transactional logs to measure macro-level interactions across a large sample of users. SAMs, on the other hand, use behavioral data from one-on-one test sessions to measure micro-level interactions with a small set of representative users.
As a measure in our toolbox, SAMs can tell us things about users that SARs cannot. When users “think aloud” during usability sessions, SAMs can give us some of the rest of the story behind the quantitative measure. They let us collect qualitative data about users’ frustrations, annoyances, barriers, and solutions. (Granted, there is always the issue of “self-report” in usability test sessions.)
According to Kaushik, there are, of course, emblematic Web analytics, too. Bounce rate, which measures the percentage of visitors who see only one page and leave, is a frequent one.
“Everyone (from the CEO on down) gets this metric right away and can understand if it is good or bad,” Kaushik says. “It is so hard to acquire traffic and everyone cares about the percentage of traffic that leaves right away. When I report that ‘your bounce rate is 60 percent,’ it simply horrifies the client and drives them to ask questions to take action.”
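To make the bounce rate concrete, here is a minimal sketch of the calculation, assuming all you have is a list of page counts per visit. The function name `bounce_rate` and the five sample visits are hypothetical, chosen only so the result matches the 60 percent in Kaushik’s example; real analytics tools may define a bounce somewhat differently.

```python
def bounce_rate(pages_per_visit: list[int]) -> float:
    """Percentage of visits that saw only one page before leaving."""
    if not pages_per_visit:
        raise ValueError("need at least one visit")
    bounces = sum(1 for pages in pages_per_visit if pages == 1)
    return 100.0 * bounces / len(pages_per_visit)


# Hypothetical page counts for five visits; three of them are single-page visits.
visits = [1, 4, 1, 1, 7]
print(f"Bounce rate: {bounce_rate(visits):.0f} percent")
```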
What’s in a Name?
Relying on a sexy metric or one type of usability measure alone is not always a sure way to reach a client with a call for change, though. The underlying data also has to speak to clients. This means practitioners have to work at breathing life into the data they package and deliver.
Kaushik recounts a story about taking existing metrics and segments and simply renaming them to make them more emblematic: “We were measuring five customer segments: (1) those who see less than one page on a site, (2) those who see three pages or less, (3) those who see more than three pages and did not buy, (4) those who place an order, and (5) those who buy more than once. These were valuable segments and something worth analyzing, but the internal clients would simply not connect with the segments until we renamed them to ‘Abandoners,’ ‘Flirters,’ ‘Browsers,’ ‘One-off-wonders,’ and ‘Loyalists.’”
The simple change in how the data was communicated had a huge impact by creating a story around it. Kaushik’s client had a greater understanding and instantly began asking how they could turn Flirters into Loyalists.
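The renaming itself is easy to picture as a small lookup. What follows is a rough sketch under stated assumptions, not Kaushik’s actual implementation: the function `segment` and the visitor tuples are hypothetical, buyers are classified before browsers, and “less than one page” is read as “one page or fewer” so that the five segments do not overlap.

```python
from collections import Counter


def segment(pages_viewed: int, orders: int) -> str:
    """Map one visitor's behavior to a named segment (thresholds are assumptions)."""
    if orders > 1:
        return "Loyalists"        # buy more than once
    if orders == 1:
        return "One-off-wonders"  # place a single order
    if pages_viewed > 3:
        return "Browsers"         # more than three pages, no purchase
    if pages_viewed <= 1:
        return "Abandoners"       # assumption: one page or fewer
    return "Flirters"             # two or three pages, no purchase


# Hypothetical visitors as (pages_viewed, orders) pairs.
visitors = [(1, 0), (2, 0), (3, 0), (6, 0), (5, 1), (9, 3)]
counts = Counter(segment(pages, orders) for pages, orders in visitors)
print(counts["Flirters"], "Flirters;", counts["Loyalists"], "Loyalists")
```

The logic is no different from reporting segments one through five; only the labels change, which is the point of the story.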
Hitting Clients Right between the Eyes
Sophocles wrote, “The truth is always the strongest argument.” Likewise, many practitioners rely on data to provide the best approximations of truth they can. With so much of our research focused on striving for accurate representations of something as amorphous, varied, and hotly debated as user behavior, we are a profession usually awash in data, practicing a less-than-perfect science.
When I was in graduate school, we discussed “construct validity” (http://books.google.com/books?id=eAdbEn-yZbcC&pg=PA190&lpg=PA190&dq=babbie+and+construct+validity&source=web&ots=k8tB76zIaW&sig=6-ww5WOHJhLKFk5siib3qUYheis#PPA190,M1). Construct validity refers to the extent to which a test offers approximate evidence that a certain measure (e.g., the task of finding a library book) accurately reflects the quality or construct (the proficiency of users in carrying out a frequently conducted task on a library site) that is to be measured.
It is essential, of course, to weigh the validity of the tasks we develop and the results delivered. But collecting all of the “right” data is not always enough.
“The problem is that we are so immersed in data in our professional or academic worlds that, to a great extent, we become disconnected with reality,” Kaushik says, “especially when we lose touch with the business side of things and we lose touch with customers and base our analysis on how four people in a lab carried out a task.”
Do your rigorous research justice by communicating the data in such a way that it reveals any significant shortcomings. No matter the size of your project, look for the emblematic measures. They will allow you to tell stories that hit clients right between the eyes and move them to action.
Many thanks to Avinash Kaushik for his email interview for this article on October 4 and 5, 2007.
Kaushik is the author of the book Web Analytics: An Hour a Day, writes the blog Occam’s Razor, and is the founder of Market Motive, a Silicon Valley startup that focuses on online marketing education. He is also the Analytics Evangelist for Google.