Your Guide to Online Research and Testing Tools

Written by: Bartosz Mozyrko

The success of every business depends on how well it meets its customers’ needs. To do that, you need to optimize your offer, your website, and your selling methods so that your customers are satisfied. The fields of online marketing, conversion rate optimization, and user experience design offer a wide range of online tools that can guide you through this process smoothly. Many companies use only one or two tools they are familiar with, but that might not be enough to gather the data necessary for improvement. To help you understand which tool is valuable and when, I created a framework that can help with your assessment. Once you broaden your horizons, it will be easier to choose the set of tools aligned with your business’s needs. The tools can be roughly divided into three basic categories:

  • User testing: Evaluate a product by testing it with users who take the study at the same time, in their natural context, and on their own devices.
  • Customer feedback: Capture customers’ expectations, preferences, and aversions directly from your website.
  • Web analytics: Provide detailed statistics about a website’s traffic, traffic sources, and measurement of conversions and sales.

To better understand when to use which tool, it is helpful to use the following criteria:

  • What people say versus what people do… and what they say they do
  • Why versus how much
  • Existing classifications of online tools

Specific services for each category are listed later in the article to help you get started.

What people say versus what people do… and what they say they do

What people say, what people do, and what they say they do are three entirely different things. People often lack the awareness or knowledge needed to provide accurate information. Anyone with experience in user research or conversion rate optimization who has spent time trying to understand users has seen firsthand that, more often than not, user statements do not match the observed data. People are not always able to fully articulate why they did the thing they just did. That is why it is often useful to compare information about opinions with information about behavior; the combination provides better insight. You can learn what people do by studying your website from your users’ perspective and drawing conclusions from observations of their behavior, such as click tracking or user session recordings. That approach, however, rests on testing particular theories about people’s behavior. There is a degree of uncertainty, and to validate the data you have gathered you will sometimes have to go one step further and simply ask your users, which lets you see the whole picture. You learn what people say by reaching out to your target group directly and asking them questions about your business.

Why versus how much

Some tools are better suited to answering questions about why something happens or how to fix a problem, whereas tools like web analytics do a much better job at answering how many and how much types of questions. Google Analytics tells you what percentage of people clicked through to a given page, but it doesn’t tell you why they did or did not do so. Knowing these differences helps you prioritize certain sets of tools and use them to fix the issues with the biggest impact on your business.

The following chart illustrates how different dimensions affect the types of questions that can be asked:

chart illustrates how different dimensions affect the types of questions that can be asked.
Source: http://www.nngroup.com/articles/which-ux-research-methods/

Choosing the right tool—infographics

There are a lot of tools out there these days that do everything from testing information architecture to remote observation. With more coming out every day, it can be hard to pick the one that will give you the best results for your specific purpose. To alleviate some of the confusion, many experts have tried to classify them according to different criteria. I have included some examples below for your convenience.

Which remote tool should I use? By Stuff On My Wall

A flow chart to evaluate remote tool choices.
Source: http://remoteresear.ch/moderated-research/

Choosing a remote user experience research tool by Nate Bolt

Another chart showing evaluation criteria for remote research tools.
Source: http://remoteresear.ch/categories/

The five categories of remote UX tools by Nate Bolt

Five categories of user research tools.
Source: http://remoteresear.ch/categories/

Four quadrants of the usability testing tools matrix by Craig Tomlin

Usability testing tools arranged in a quadrant chart.
Source: http://www.usefulusability.com/14-usability-testing-tools-matrix-and-comprehensive-reviews/

Tool examples

The tools listed below are best suited to web services. The world of mobile applications and services is too vast to skim over here and has enough material for an entirely separate article. The selection has been narrowed down so as not to overwhelm you with choice.

User testing

User testing and research are vital to creating successful websites, products, and services. Using one of the many existing tools and services for user testing is a lot easier nowadays than it used to be. The important thing is to find a tool or service that works for your website and then use it to gather real-world data on what works and what does not.

Survey: The most basic form of what people say. Companies often use online surveys to better understand their customers’ motives and opinions. You can ask respondents to answer in any way they choose or to select from a limited number of predetermined responses. Feedback straight from your customers is best used for determining their pain points or uncovering their possible needs (or future trends). Remember, however, that people cannot always communicate exactly what issue they are facing. Be like Henry Ford: do not give people faster horses when they ask for quicker transportation; invent the car.

Examples:
Typeform
Survey Gizmo

Card sorting: This asks your users either to sort the provided items into the categories that make the most sense to them or to create their own categories for the items. These two methods are called closed and open card sorting, respectively. The knowledge you gain about users’ mental models helps you rework your site’s information architecture. If you want information that sits between “what they do” and “what they say”, sorting is your best bet. Be sure to conduct this study with a larger group: a mental model that makes sense to one person may not be intuitive to others. Focus on the responses that align with each other, as they are likely the most representative set of categories.

Examples:
ConceptCodify
usabiliTEST
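
To get a feel for how card sort results are analyzed, here is a minimal Python sketch (not tied to any particular service, and using made-up data) that counts how often participants grouped the same pair of items together. Pairs with high agreement are good candidates for sharing a category.

```python
# A minimal sketch of open card sort analysis: for every pair of items, count
# how often participants placed them in the same group. High-agreement pairs
# are good candidates for living under the same category.
from itertools import combinations
from collections import Counter

# Hypothetical results: one dict per participant, mapping their category
# labels to the items they placed in that category.
sorts = [
    {"Billing": ["invoices", "refunds"], "Account": ["password", "profile"]},
    {"Money": ["invoices", "refunds", "password"], "Settings": ["profile"]},
    {"Payments": ["invoices", "refunds"], "My account": ["password", "profile"]},
]

pair_counts = Counter()
for participant in sorts:
    for group in participant.values():
        for a, b in combinations(sorted(group), 2):
            pair_counts[(a, b)] += 1

for (a, b), n in pair_counts.most_common():
    agreement = n / len(sorts)
    print(f"{a} + {b}: grouped together by {agreement:.0%} of participants")
```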

Click testing/automated static/design surveys: This lets you test screenshots of your design, so you can obtain detailed information about your users’ expectations of and reactions to a website at various stages of development. This moves into the territory of simply gathering data about your users’ actions, so you learn what they do. The study is usually conducted by giving a direct instruction: “Click the button that leads to sign-up.” Remember, however, that click testing alone is not sufficient; you need other tools that cover the “why” in order to fully understand the results.

Examples:
Usaura
Verify App

5-Second testing/first impression test: Because your testers have only five seconds to view a presented image, they are put under time pressure and must answer questions relying only on the near-subconscious impressions they formed. This helps you improve your landing pages and calls to action, since users mostly focus on the most eye-catching elements.

Examples:
UsabilityHub
Optimal Workshop Chalkmark

Diary studies: An extensive record of the thoughts, feelings, and actions of participants drawn from your target market. Events are recorded by the participants at the moment they occur. This provides firsthand insight into your customers’ needs by asking them directly about their experiences. It operates in a similar fashion to surveys, however, so remember that participants do not always convey clearly what they mean.

Examples:
FocusVision Revelation
Blogger

Moderated user studies/remote usability testing: Participants take this test in their natural environment, so their experiences are more genuine. Thanks to remote-meeting tools, participants and facilitators do not need to be in the same physical location. Putting the study in the context of the natural environment of whatever group you are studying gives you insight into unmodified behavior. The study is also a lot cheaper than lab-based alternatives.

Examples:
GoToMeeting
Skype

Self-moderated testing: Participants are expected to complete the tasks independently. Afterwards you receive videos of their test sessions, along with a report describing what problems your users faced and what to do to fix them. Services offering this type of testing usually deliver results quickly, so if you are in dire need of feedback, this is one possibility.

Examples:
Uxeria
UserTesting

Automated live testing/remote scenario testing: Very similar to remote testing, but the information provided is much more extensive and organized. The results include effectiveness ratios (success, error, abandonment, and timeout), efficiency ratios (time on task and number of clicks), and survey comments.

Examples:
UX Suite
Loop11
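
As an illustration of what those ratios mean, here is a minimal Python sketch computed over hypothetical session records; the services above calculate these figures for you.

```python
# A minimal sketch, assuming a simple made-up log format, of the effectiveness
# and efficiency ratios described above.
sessions = [  # hypothetical task outcomes for one scenario
    {"outcome": "success", "seconds": 42, "clicks": 5},
    {"outcome": "error", "seconds": 71, "clicks": 9},
    {"outcome": "abandon", "seconds": 15, "clicks": 2},
    {"outcome": "success", "seconds": 38, "clicks": 4},
    {"outcome": "timeout", "seconds": 120, "clicks": 12},
]

total = len(sessions)
for outcome in ("success", "error", "abandon", "timeout"):
    share = sum(s["outcome"] == outcome for s in sessions) / total
    print(f"{outcome}: {share:.0%}")  # effectiveness ratios

completed = [s for s in sessions if s["outcome"] == "success"]
print("avg time on task (successes):",
      sum(s["seconds"] for s in completed) / len(completed), "s")
print("avg clicks (successes):",
      sum(s["clicks"] for s in completed) / len(completed))
```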

Tree testing/card-based classification: This technique removes every distracting element of the website (ads, visual themes, etc.) and focuses only on a simplified text version of its structure. Through it you can evaluate the clarity of your navigation scheme and pinpoint the choke points that cause problems for users. It is a good method for testing prototypes, or for when you suspect a problem with your website lies in its basic structure.

Examples:
UserZoom
Optimal Workshop Treejack

Remote eye tracking/online eye tracking/visual attention tracking: Shows you where people focus their attention on your landing pages, layouts, and branding materials. This can tell you whether users are focused on the page, whether they are reading it or just scanning, how intense their attention is, and what their movement patterns look like. However, it cannot tell you with certainty whether users consciously notice something, or why they look at a given area. This can be remedied, for example, with think-aloud voiceovers in which participants tell you right away what they are experiencing.

a) Simulated: creates measurement reports that predict what a real person would most likely look at.

Examples:
VAS
Eyequant

b) Behavioral: finds out whether people actually notice conversion-oriented elements of the page and how much attention they pay to them.

Examples:
Attensee
EyeTrackShop

These are the individual features that stand out in the listed services. Nowadays, however, there is a trend toward combining various tools into a single platform. If you find more than one tool valuable for your business, you can use services such as UsabilityTools or UserZoom.

Customer feedback

Successful business owners know that it’s crucial to take some time to obtain customer feedback. Understanding what your customers think about your products and services will not only help you improve quality, but will also give you insights into what new products and services your customers want. Knowing what your customers think you’re doing right or wrong also lets you make smart decisions about where to focus your energies.

Live chats: An easy-to-understand way of communicating through the website interface in real time. Live chat lets you provide all the answers your customers could want, and by analyzing their questions and frequently raised issues you can decide what needs improvement. Live chat usually focuses on solving an immediate problem, so it tends to be used for smaller issues. The upside is that your customer feels acknowledged right away.

Examples:
LiveChat
Live Person

Insight surveys: These help you understand your customers through targeted website surveys. You can create surveys and prompts triggered by variables such as time on page, number of visits, the referring search term, or your own internal data; you can even target custom variables such as the number of items in a shopping cart. They are very specific, but they operate on the same principle as general surveys, so remember that participants will not always be able to give you satisfactory answers.

Examples:
Survicate
Qualaroo
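
To make the targeting idea concrete, here is a minimal Python sketch of the kind of rule such tools evaluate before showing a prompt. The attribute names and thresholds are hypothetical; real services configure these rules through their own dashboards rather than code.

```python
# A minimal sketch of survey targeting logic over hypothetical visitor attributes.
def should_show_survey(visitor: dict) -> bool:
    """Show the prompt only to engaged, returning visitors with items in their cart."""
    return (
        visitor.get("seconds_on_page", 0) >= 30
        and visitor.get("visit_count", 0) >= 2
        and visitor.get("cart_items", 0) >= 1
    )

print(should_show_survey({"seconds_on_page": 45, "visit_count": 3, "cart_items": 2}))  # True
print(should_show_survey({"seconds_on_page": 10, "visit_count": 1, "cart_items": 0}))  # False
```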

Feedback forms: A simple website application for receiving feedback from your visitors. You create a customized form, copy and paste the code into your site’s HTML, and start getting feedback. This is a basic tool for collecting feedback from your customers and for receiving and organizing the results. If you want to know the general opinion of your website and the experiences of your visitors (and you want participation to be completely voluntary), forms are a great option.

Examples:
Feedbackify
Kampyle

Feedback forums: Users enter a forum where they can propose and vote on items that need to be changed or discussed. That information lets you prioritize issues and decide what needs to be fixed first. The forums can also be used for communicating with users, for example to announce that you have introduced improvements to the product. Remember, however, that even the most popular issues might actually be the least important for improving your service or website; it is up to you to judge.

Examples:
UserVoice
Get Satisfaction

Online customer communities: These web-based gathering places for customers, experts, and partners let you engage with your customers directly and peer-to-peer, offering problem solving and feedback. They enable everyone to discuss problems, post reviews, brainstorm new product ideas, and engage with one another.

Examples:
Socious
Lithium

There are also platforms, such as UserEcho or Freshdesk, that merge several of these functions. They are an extremely popular answer to the growing demands of clients who prefer a single service with many features.

Website analytics

Just because analytics provide additional data about your site doesn’t mean that data is automatically valuable to your business. You want to find the errors and gaps in your website and fill them with functionality for your users and customers. The information gathered can then inform future decisions to improve your service.

Web analytics: All user movement is recorded and stored. Users’ privacy is preserved, as the data gathered is used only for optimization and cannot be tied to identifiable individuals. The data can later be used to evaluate and improve your service and website in pursuit of goals such as increasing the number of visitors or sales.

Examples:
Mint
Mixpanel
KISSmetrics
Woopra
Google Analytics

In-page web analytics: These differ from traditional web analytics in that they focus on users’ movement within a page rather than between pages. They are generally used to understand behavior for the purposes of optimizing a website’s usability and conversion.

a) Click tracking: This technique determines and records what users click while browsing the website. It draws a map of their movements, letting you see your users’ journeys step by step. If there is a problem with the website, this is one way to find out where that problem might have occurred.

Examples:
Gemius Heatmap
CrazyEgg
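
Conceptually, a click heatmap is just a two-dimensional histogram of click coordinates. The following minimal Python sketch, using made-up click data, aggregates clicks into grid cells the way a heatmap tool does before rendering.

```python
# A minimal sketch: bucket raw (x, y) click coordinates into grid cells.
from collections import Counter

CELL = 100  # bucket clicks into 100x100-pixel cells

clicks = [(130, 240), (145, 250), (820, 90), (150, 260), (815, 95)]  # hypothetical clicks

heat = Counter((x // CELL, y // CELL) for x, y in clicks)
for (cx, cy), count in heat.most_common():
    print(f"cell ({cx * CELL}-{cx * CELL + CELL - 1}, {cy * CELL}-{cy * CELL + CELL - 1}): {count} clicks")
```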

b) Visitor recording/user session replays: Every action and event is recorded as a video.

Examples:
Inspectlet
Fullstory

c) Form testing: This allows you to evaluate your web forms and identify areas that need improvement, for example, which fields make visitors leave the website before completing the form.

Examples:
Formisimo
UsabilityTools Conversion Suite
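
The core of form testing is a per-field drop-off funnel. Here is a minimal Python sketch, on hypothetical data, showing how the last field each visitor reached translates into a funnel you can inspect for problem fields.

```python
# A minimal sketch of form drop-off analysis from a hypothetical event log.
fields = ["email", "address", "card_number", "submit"]
visitors_last_field = ["email", "card_number", "submit", "address", "email", "submit"]

reached = {f: 0 for f in fields}
for last in visitors_last_field:
    # a visitor who stopped at a field also reached every field before it
    for f in fields[: fields.index(last) + 1]:
        reached[f] += 1

# A sharp drop between two consecutive fields points at the problem field.
total = len(visitors_last_field)
for f in fields:
    print(f"{f}: reached by {reached[f]}/{total} visitors")
```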

As with the previous groups, there is also a considerable number of analytics Swiss army knives offering various tools in one place. Examples include ClickTale, UsabilityTools, and MouseStats.

Conclusion

This is it: the finish line of this guide to online research tools. Online research is an extremely valuable asset that can provide important and surprising data. The number of tools available is indeed overwhelming, which is why you need to weigh the factors discussed here: what people say versus what they do, and why versus how much. That way you can decide exactly what you need to test in order to improve your service or obtain the information you require. Knowing what you want to do will help you narrow your choices and, as a result, pick the right tool. Hopefully, what you’ve read will help you choose the best usability tools for testing, and you will come away an expert in your own research sessions.

Three Ways to Improve Your Design Research with Wordle

Written by: Jeff Tang

“Above all else show the data.”
–Edward Tufte

Survey responses. Product reviews. Keyword searches. Forums. As UX practitioners, we commonly scour troves of qualitative data for customer insight. But can we go faster than line-by-line analysis? Moreover, how can we provide semantic analysis to project stakeholders?

Enter Wordle. If you haven’t played with it yet, Wordle is a free Java application that generates visual word clouds. It can provide a compelling snapshot of user feedback for analysis or presentation.

Using Wordle for content strategy

Wordle excels at comparing company and customer language. Here’s an example featuring one of Apple’s crown jewels, the iPad. This text comes from the official iPad Air web page. After common words are removed and stemmed:

iPad Air Wordle

Apple paints a portrait of exceptional “design” with great “performance” for running “apps.” Emotive adjectives like “incredible,” “new,” and “Smart [Cover]” are thrown in for good measure. Now compare this to customer reviews on Amazon.com:

iPad Air customer reviews Wordle

To paraphrase Jakob Nielsen, systems should speak the user’s language. And in this case, customers speak more about the iPad’s “screen” and “fast[er]” processor than anything else. Apps don’t even enter the conversation.

A split test on the Apple website might be warranted. Apple could consider talking less about apps, because users may consider them a commodity by now. Also, customer lingo should replace engineering terms. People don’t view a “display,” they look at a “screen.” They also can’t appreciate “performance” in a vacuum. What they do appreciate is that the iPad Air is “faster” than other tablets.

What do your company or clients say on their “About Us,” “Products,” or “Services” web pages? How does that compare to what users discuss?

Using Wordle in comparative analysis

Wordle can also characterize competing products. For example, take Axure and Balsamiq, two popular wireframing applications. Here are visualizations of recent forum posts from each website. (Again, popular words removed or stemmed.)

Axure Wordle

Balsamiq Wordle

Each customer base employs a distinct dialect. In the first word cloud, Axure users speak programmatically about panels (Axure’s building blocks), widgets, and adaptive design. In the Balsamiq cloud, conversation revolves more simply around assets, text, and projects.

These word clouds also illustrate product features. Axure supports adaptive wireframes; Balsamiq does not. Balsamiq supports Google Drive; Axure does not. Consider using Wordle when you want a stronger and more immediate visual presentation than, say, a standard content inventory.

Beyond comparative analysis, Wordle also surfaces feature requests. The Balsamiq cloud contains the term “iPad” from users clamoring for a tablet version. When reviewing your own Wordle creations, scan for keywords outside your product’s existing features. You may find opportunities for new use cases this way.

Using Wordle in iterative design

Finally, Wordle can compare word clouds over time. This is helpful when you’re interested in trends between time intervals or product releases.

Here’s a word cloud generated from recent Google Play reviews. The application of interest is Temple Run, a game with over 100 million downloads:

Temple Run Wordle

As you can see, players gush about the game. It’s hard to imagine better feedback.

Now let’s look at Temple Run 2, the sequel:

Temple Run sequel Wordle

Still good, but the phrase “please fix” clearly suggests technical problems. A user researcher might examine the reviews to identify specific bugs. When comparing word clouds over time, it’s important to note new keywords (or phrases) like this. These changes represent new vectors of user sentiment.

Wordle can also be tested at fixed time intervals, not just software versions. Sometimes user tastes and preferences evolve without any prompting.

Summary

Wordle is a heuristic tool that visualizes plaintext and RSS feeds. This can be quite convenient for UX practitioners to evaluate customer feedback. When seen by clients and stakeholders, the immediacy of a word cloud is more compelling than a typical PowerPoint list. However, keep the following in mind when you use Wordle:

  • Case sensitivity. You must normalize your words to lower (or upper) case.
  • Stemming. You must stem any significant words in your text blocks (a minimal preprocessing sketch follows this list).
  • Accuracy. You can’t get statistical confidence from Wordle. However, it essentially offers unlimited text input. Try copying as much text into Wordle as possible for best results.
  • Negative phrases. Wordle won’t distinguish positive and negative phrasing. “Good” and “not good” will count as two instances of the word “good.”
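
Here is that preprocessing sketch, in Python, with a deliberately crude suffix-stripping stemmer; a real stemmer such as Porter’s would do a better job, but the idea is the same: normalize case, drop stop words, and stem before pasting the text into Wordle.

```python
# A minimal preprocessing sketch for the caveats above: lowercase everything,
# drop common words, and apply a crude "stem" before pasting into Wordle.
import re

STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "is", "it", "for", "in"}

def crude_stem(word: str) -> str:
    # Deliberately naive suffix stripping; swap in a real stemmer if you have one.
    for suffix in ("ing", "ed", "er", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def prepare_for_wordle(text: str) -> str:
    words = re.findall(r"[a-z']+", text.lower())
    kept = [crude_stem(w) for w in words if w not in STOP_WORDS]
    return " ".join(kept)

reviews = "The screen is amazing and the faster processor makes apps load faster."
print(prepare_for_wordle(reviews))
```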

That’s it. I hope this has been helpful for imagining text visualizations in your work. Good luck and happy Wordling.

Tree Testing

Written by: Dave O’Brien

A big part of information architecture is organisation – creating the structure of a site. For most sites – particularly large ones – this means creating a hierarchical “tree” of topics.

But to date, the IA community hasn’t found an effective, simple technique (or tool) to test site structures. The most common method used — closed card sorting — is neither widespread nor particularly suited to this task.

Some years ago, Donna Spencer pioneered a simple paper-based technique to test trees of topics. Recent refinements to that method, some made possible by online experimentation, have now made “tree testing” more effective and agile.

How it all began

Some time ago, we were working on an information-architecture project for a large government client here in New Zealand. It was a classic IA situation – their current site’s structure (the hierarchical “tree” of topics) was a mess, they knew they had outgrown it, and they wanted to start fresh.

We jumped in and did some research, including card-sorting exercises with various user groups. We’ve always found card sorts (in person or online) to be a great way to generate ideas for a new IA.

Brainstorming sessions followed, and we worked with the client to come up with several possible new site trees. But were they better than the old one? And which new one was best? After a certain amount of debate, it became clear that debate wasn’t the way to decide. We needed some real data – data from users. And, like all projects, we needed it quickly.

What kind of data? At this early stage, we weren’t concerned with visual design or navigation methods; we just wanted to test organisation – specifically, findability and labeling. We wanted to know:
* Could users successfully find particular items in the tree?
* Could they find those items directly, without having to backtrack?
* Could they choose between topics quickly, without having to think too much (the Krug Test)1?
* Overall, which parts of the tree worked well, and which fell down?

Not only did we want to test each proposed tree, we wanted to test them against each other, so we could pick the best ideas from each.

And finally, we needed to test the proposed trees against the existing tree. After all, we hadn’t just contracted to deliver a different IA – we had promised a better IA, and we needed a quantifiable way to prove it.

The problem

This, then, was our IA challenge:
* getting objective data on the relative effectiveness of several tree structures
* getting it done quickly, without having to build the actual site first.

As mentioned earlier, we had already used open card sorting to generate ideas for the new site structure. We had done in-person sorts (to get some of the “why” behind our users’ mental models) as well as online sorts (to get a larger sample from a wider range of users).

But while open card sorting is a good “detective” technique, it doesn’t yield the final site structure – it just provides clues and ideas. And it certainly doesn’t help in evaluating structures.

For that, information architects have traditionally turned to closed card sorting, where the user is provided with predefined category “buckets” and asked to sort a pile of content cards into those buckets. The thinking goes that if there is general agreement about which cards go in which buckets, then the buckets (the categories) should perform well in the delivered IA.

The problem here is that, while closed card sorting mimics how users may file a particular item of content (e.g. where they might store a new document in a document-management system), it doesn’t necessarily model how users find information in a site. They don’t start with a document — they start with a task, just as they do in a usability test.

What we wanted was a technique that more closely simulates how users browse sites when looking for something specific. Yes, closed card sorting was better than nothing, but it just didn’t feel like the right approach.

Other information architects have grappled with this same problem. We know some who wait until they are far enough along in the wireframing process that they can include some IA testing in the first rounds of usability testing. That piggybacking saves effort, but it also means that we don’t get to evaluate the IA until later in the design process, which means more risk.

We know others who have thrown together quick-and-dirty HTML with a proposed site structure and placeholder content. This lets them run early usability tests that focus on how easily participants can find various sublevels of the site. While that gets results sooner, it also means creating a throw-away set of pages and running an extra round of user testing.

With these needs in mind, we looked for a new technique – one that could:
* Test topic trees for effective organisation
* Provide a way to compare alternative trees
* Be set up and run with minimal time and effort
* Give clear results that could be acted on quickly

The technique — tree testing

Luckily, the technique we were looking for already existed. Even luckier was that we got to hear about it firsthand from its inventor, Donna Spencer, the well-regarded information architect out of Australia, and author of the recently released book “Card Sorting”:http://rosenfeldmedia.com/books/cardsorting/.

During an IA course that Donna was teaching, she was asked how she tested the site structures she created for clients. She mentioned closed card sorting, but like us, she wasn’t satisfied with it.

She then went on to describe a technique she called “card-based classification”:http://www.boxesandarrows.com/view/card_based_classification_evaluation, which she had used on some of her IA projects. Basically, it involved modeling the site structure on index cards, then giving participants a “find-it” task and asking them to navigate through the index cards until they found what they were looking for.

To test a shopping site, for example, she might give them a task like “Your 9-year-old son asks for a new belt with a cowboy buckle”. She would then show them an index card with the top-level categories of the site:

She would then show them an index card with the top-level categories of the site.

The participant would choose a topic from that card, leading to another index card with the subtopics under that topic.

 The participant would choose a topic from that card, leading to another index card with the subtopics under that topic.

The participant would continue choosing topics, moving down the tree, until they found their answer. If they didn’t find a topic that satisfied them, they could backtrack (go back up one or more levels). If they still couldn’t find what they were looking for, they could give up and move on to the next task.

During the task, the moderator would record:
* the path taken through the tree (using the reference numbers on the cards)
* whether the participant found the correct topic
* where the participant hesitated or backtracked

By choosing a small number of representative tasks to try on participants, Donna found that she could quickly determine which parts of the tree performed well and which were letting the side down. And she could do this without building the site itself – all that was needed was a textual structure, some tasks, and a bunch of index cards.

Donna was careful to point out that this technique only tests the top-down organisation of a site and the labeling of its topics. It does not try to include other factors that affect findability, such as:
* the visual design and layout of the site
* other navigation routes (e.g. cross links)
* search

While it’s true that this technique does not measure everything that determines a site’s ease of browsing, that can also be a strength. By isolating the site structure – by removing other variables at this early stage of design – we can more clearly see how the tree itself performs, and revise until we have a solid structure. We can then move on in the design process with confidence. It’s like unit-testing a site’s organisation and labeling. Or as my colleague Sam Ng says, “Think of it as analytics for a website you haven’t built yet.”

So we built Treejack

As we started experimenting with “card-based classification” on paper, it became clear that, while the technique was simple, it was tedious to create the cards on paper, recruit participants, record the results manually, and enter the data into a spreadsheet for analysis. The steps were easy enough, but they were time eaters.

It didn’t take too much to imagine all this turned into a web app – both for the information architect running the study and the participant browsing the tree. Card sorting had gone online with good results, so why not card-based classification?

Ah yes, that was the other thing that needed work – the name. During the paper exercises, it got called “tree testing”, and because that seemed to stick with participants and clients, it stuck with us. And it sure is a lot easier to type.

To create a good web app, we knew we had to be absolutely clear about what it was supposed to do. For online tree testing, we aimed for something that was:
* Quick for an information architect to learn and get going on
* Simple for participants to do the test
* Able to handle a large sample of users
* Able to present clear results

We created a rudimentary application as a proof of concept, running a few client pilots to see how well tree testing worked online. After working with the results in Excel, it became very clear which parts of the trees were failing users, and how they were failing. The technique worked.

However, it also became obvious that a wall of spreadsheet data did not qualify as “clear results”. So when we sat down to design the next version of the tool – the version that information architects could use to run their own tree tests – reworking the results was our number-one priority.

Participating in an online tree test

So, what does online tree testing look like? Let’s look at what a participant sees.

Suppose we’ve emailed an invitation to a list of possible participants. (We recommend at least 30 to get reasonable results – more is good, especially if you have different types of users.) Clicking a link in that email takes them to the Treejack site, where they’re welcomed and instructed in what to do.

Once they start the test, they’ll see a task to perform. The tree is presented as a simple list of top-level topics:
In Treejack, the tree is presented as a simple list of top-level topics.

They click down the tree one topic at a time. Each click shows them the next level of the tree:
In Treejack, each click shows them the next level of the tree.

Once they click to the end of a branch, they have 3 choices:
* Choose the current topic as their answer (“I’d find it here”).
* Go back up the tree and try a different path (by clicking a higher-level topic).
* Give up on this task and move to the next one (“Skip this task”).

In Treejack, the participant selects an answer.

Once they’ve finished all the tasks, they’re done – that’s it. For a typical test of 10 tasks on a medium-sized tree, most participants take 5-10 minutes. As a bonus, we’ve found that participants usually find tree tests less taxing than card sorts, so we get lower drop-out rates.

Creating a tree test

The heart of a tree test is…um…the tree, modeled as a list of text topics.

One lesson that we learned early was to build the tree based on the content of the site, not simply its page structure. Any implicit in-page content should be turned into explicit topics in the tree, so that participants can “see” and select those topics.

Also, because we want to measure the effectiveness of the site’s topic structure, we typically omit “helper” topics such as Search, Site Map, Help, and Contact Us. If we leave them in, it makes it too easy for users to choose them as alternatives to browsing the tree.

Devising tasks

We test the tree by getting participants to look for specific things – to perform “find it” tasks. Just as in a usability test, a good task is clear, specific, and representative of the tasks that actual users will do on the real site.

How many tasks? You might think that more is better, but we’ve found a sizable learning effect in tree tests. After a participant has browsed through the tree several times looking for various items, they start to remember where things are, and that can skew later tasks. For that reason, we recommend about 10 tasks per test, presented in a random sequence.

Finally, for each task, we select the correct answers – 1 or more tree topics that satisfy that task.

The results

So we’ve run a tree test. How did the tree fare?

At a high level, we look at:
* Success – % of participants who found the correct answer. This is the single most important metric, and is weighted highest in the overall score.
* Speed – how fast participants clicked through the tree. In general, confident choices are made quickly (i.e. a high Speed score), while hesitation suggests that the topics are either not clear enough or not distinguishable enough.
* Directness – how directly participants made it to the answer. Ideally, they reach their destination without wandering or backtracking.

For each task, we see a percentage score on each of these measures, along with an aggregate score (out of 10):
Showing Treejack results with a percentage score of each measure and an aggregate score.

If we see an overall score of 8/10 for the entire test, we’ve earned ourselves a beer. Often, though, we’ll find ourselves looking at a 5 or 6, and realise that there’s more work to be done.

The good news is that our miserable overall score of 5/10 is often some 8’s and 9’s brought down by a few 2’s and 3’s. This is where tree testing really shines — separating the good parts of the tree from the bad, so we can spend our time and effort fixing the latter.

To do more detailed analysis on the low scores, we can download the data as a spreadsheet, showing destinations for each task, first clicks, full click paths, and so on.
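
If you want to re-slice that spreadsheet yourself, the three headline measures are straightforward to recompute. Here is a minimal Python sketch over hypothetical rows; Treejack reports these figures for you, so this is only to make the definitions concrete.

```python
# A minimal sketch of the Success, Directness, and Speed measures for one task,
# computed from hypothetical recorded attempts.
from statistics import median

correct = {"Home > Products > Belts"}  # correct destination(s) for this task
attempts = [  # where each participant ended up, whether they backtracked, time spent
    {"destination": "Home > Products > Belts", "backtracked": False, "seconds": 14},
    {"destination": "Home > Products > Belts", "backtracked": True, "seconds": 31},
    {"destination": "Home > Sale > Accessories", "backtracked": True, "seconds": 48},
    {"destination": "Home > Products > Belts", "backtracked": False, "seconds": 11},
]

success = sum(a["destination"] in correct for a in attempts) / len(attempts)
directness = sum(not a["backtracked"] for a in attempts) / len(attempts)
speed = median(a["seconds"] for a in attempts)

print(f"success: {success:.0%}, directness: {directness:.0%}, median time: {speed}s")
```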

In general, we’ve found that tree-testing results are much easier to analyse than card-sorting results. The high-level results pinpoint where the problems are, and the detailed results usually make the reason plain. In cases where a result has us scratching our heads, we do a few in-person tree tests, prompting the participant to think aloud and asking them about the reasons behind their choices.

Lessons learned

We’ve run several tree tests now for large clients, and we’re very pleased with the technique. Along the way, we’ve learned a few things too:
* Test a few different alternatives. Because tree tests are quick to do, we can take several proposed structures and test them against each other. This is a quick way of resolving opinion-based debates over which is better. For the government web project we discussed earlier, one proposed structure had much lower success rates than the others, so we were able to discard it without regrets or doubts.

* Test new against old. Remember how we promised that government agency that we would deliver a better IA, not just a different one? Tree testing proved to be a great way to demonstrate this. In our baseline test, the original structure notched a 31% success rate. Using the same tasks, the new structure scored 67% – a solid quantitative improvement.

* Do iterations. Everyone talks about developing designs iteratively, but schedules and budgets often quash that ideal. Tree testing, on the other hand, has proved quick enough that we’ve been able to do two or three revision cycles for a given tree, using each set of results to progressively tweak and improve it.

* Identify critical areas to test, and tailor your tasks to exercise them. Normally we try to cover all parts of the tree with our tasks. If, however, there are certain sections that are especially critical, it’s a good idea to run more tasks that involve those sections. That can reveal subtleties that you may have missed with a “vanilla” test. For example, in another study we did, the client was considering renaming an important top-level section, but was worried that the new term (while more accurate) was less clear. Tree testing showed both terms to be equally effective, so the client was free to choose based on other criteria.

* Crack the toughest nuts with “live” testing. Online tree tests suffer from the same basic limitation as most other online studies – they give us loads of useful data, but not always the “why” behind it. Moderated testing (either in person or by remote session) can fill in this gap when it occurs.

Conclusion

Tree testing has given us the IA method we were after – a quick, clear, quantitative way to test site structures. Like user testing, it shows us (and our clients) where we need to focus our efforts, and injects some user-based data into our IA design process. The simplicity of the technique lets us do variations and iterations until we get a really good result.

Tree testing also makes our clients happy. They quickly “get” the concept, the high-level results are easy for them to understand, and they love having data to show their management and to measure their progress against.

You can sign up for a free Treejack account at “Optimal Workshop”:http://www.optimalworkshop.com/treejack.htm.2

References

1. “Don’t Make Me Think”:http://www.amazon.com/Dont-Make-Me-Think-Usability/dp/0321344758, Steve Krug
2. Full disclosure: As noted in his “bio”:http://boxesandarrows.wpengine.com/person/35384-daveobrien, O’Brien works with Optimal Workshop.

MindCanvas Review

Written by: Sarah A. Rice

MindCanvas describes itself as a remote research tool that uses Game-like Elicitation Methods (GEMs) to gather insights about customers’ thoughts and feelings. It was developed by Uzanto Consulting, a web product strategy firm. When I first learned about MindCanvas, I understood it to be an online card sorting tool. Happily, it’s much more than that.

As a veteran IA consultant, I have used MindCanvas a handful of times during the course of different projects. I have also conducted card sorting exercises without the tool. I am thrilled to have a useful—and user-friendly—tool at my disposal. One of my main reasons for selecting MindCanvas was the reputation of one of its creators, Rashmi Sinha. She is well known and respected, and I felt assured that any tool designed by a fellow IA for IAs couldn’t be all that bad. I was right.

MindCanvas provides open and closed card sorting capabilities, as well as a host of other UT tools: Divide-the-Dollar, Clicky, Sticky, Concept Test, and FreeList. Clicky and Sticky allow users to react to a wireframe or prototype by answering questions about images and content, or applying stickies (Post-it–like notes) with attributes to a visual image. FreeList and Divide-the-Dollar allow you to elicit product ideas and prioritize them by having participants list and rank the features they find most useful. All of these methods offer easy-to-use interfaces to help your research participants along.

Deciding which MindCanvas method to use is one of the more complicated parts of the tool. Its card sorting methods are good for validating a site’s navigation or information hierarchy. You can also explore user needs and values and gather feedback on brand and positioning by using some of its more specialized UT methods. MindCanvas’ website and supporting help wiki provide information on selecting the appropriate testing method for your website or product.

Using MindCanvas

The basic process for using MindCanvas is as follows:

  1. After payment, sign an agreement to obtain a login and password.
  2. Decide which method (i.e. Sticky, FreeList, etc.) addresses your research needs.
  3. Create potential research questions and tasks based on the MindCanvas method you have selected.
    (I’ve used OpenSort and TreeSort.)
  4. Upload questions to MindCanvas’ Workbench.
  5. Test the research study and make changes until you are satisfied with it.
  6. Send out the test site URL to your participants.
  7. Monitor the study (i.e. see how many people have completed all the tasks).
  8. When the study is concluded, send a report request to the MindCanvas team.
  9. Receive the reports in visual form and download raw data from the MindCanvas site.
  10. Embed reports into PowerPoint or Word document and review results with client.

I usually take several days to review the reports before showing them to my consulting clients. Doing so allows me to more easily explain the results. (Here’s a pointer to anyone using MindCanvas: To view the results properly make sure PowerPoint is in “Slideshow” mode).

Strengths

MindCanvas has a couple shining strengths I’d like to illuminate:

  1. An engaging, easy-to-use interface for your customers or end users. It’s fairly self-explanatory and makes routine UT tasks fun.
  2. Stellar data visualization tools once your study is completed.

User Interface

MindCanvas’ interface is what sets it apart from other UT software I’ve seen. Its creators took their inspiration from the world of digital gaming to develop an interface that’s engaging for the person using it, while gathering important data for researchers. Its card sorting methods employ a floating hand to deal cards, which are then sorted by users. Another method gives users virtual gold coins to vote for their favorite product features. These exercises are enhanced by accompanying sound effects. I’ve received numerous comments from users describing MindCanvas’ exercises as “fun”. They have also commented that while they don’t understand how these exercises will help me build a better website or software interface, they still enjoyed the tasks and were pleased at the conclusion of the test.

The other online research tools I’ve reviewed offer more awkward interfaces. Sorting exercises take multiple steps or the online tasks are not intuitive and confuse research participants. I’m not interested in making my users become experts at online card sorting or other UT methods. I simply want to extract what they know or understand about a particular website or service.

According to Jess McMullin of nForm User Experience Consulting, “MindCanvas is unmatched as a remote research tool in its ability to provide creative methods for gathering data [and] engaging participants…”

Data Visualization

Another MindCanvas strength is its data output. Although you can obtain the raw data and analyze it yourself (assuming you have statistical software and know how to use it), the real benefit of MindCanvas is its easy-to-understand data visualizations, which showcase the results of your study. All my clients have received clear, easy-to-interpret answers to their research questions. The visualizations can be embedded into a PowerPoint slide or Word document, making them easily accessible. Your clients don’t have to rely on your interpretation of the data; they can interpret the data themselves if they choose. Every client who has viewed MindCanvas’ data visualizations has been impressed and wondered why it wasn’t used all along.

Weaknesses

I’ve used MindCanvas a handful of times and encountered some weaknesses:

  • Study Size: If you have a large client with complex, statistically rigorous research needs, MindCanvas is not for you. It has a limit of 200 users per study. Two hundred is plenty for most of my research needs, but some of my clients want to go beyond that.

  • Data Sorting: If you have complex user segmentation needs, MindCanvas has its limitations. It allows you to perform a single data sort to identify user sub-groups. For example, it’s easy to segment all male vs. female participants or all participants who are 21- to 50-years-old. If you need to segment 16- to 20-year-old females or men who only shop online (or any two parameters of your choice), you’ll need a different tool. There are ways around these limitations: You can create two separate research studies to deal with different users, or you can build more complex research questions to solicit the answers you need in order to sort the data required. However, these solutions have limitations of their own, so there is a trade-off.

  • Pricing Structure: The current pricing structure is $499 per study, with each accompanying report costing $99. This is adequate for quick-and-dirty research to resolve obvious user issues, but the pricing structure doesn’t scale well. For example, if you run a single study and want multiple reports for different audience segments, each $99 report adds up quickly. It can be difficult to budget up front before the research study is even developed, leaving the door open for cost increases. If a simple card sorting tool is all you need, check out WebSort, which costs $499 for three months of unlimited use and automatically generates a dendrogram. (Please note that MindCanvas offers much more than card sorting.)

  • Data Analysis Bottleneck: Some of the back-end data analysis is done by a human, who works on a schedule. All data reports are generated once a week. If you get your report order request to Uzanto by the Tuesday deadline, results will be available by Thursday. This might not work with your tight project schedule, in which case, you’re out of luck.

MindCanvas’s Workbench

MindCanvas is currently offered in self-service mode. This means that you (or your researcher) need to become familiar with the finer points of MindCanvas’ Workbench for constructing studies. The upside is that some parts are made easy, like being able to “copy” another study in order to create your own (a handy feature), or creating as many preliminary studies as you like before distributing the real thing.

Mindcanvas Workbench
Figure 1: Manage Activity

The downside is that some interface elements in the study creation console are a bit mysterious. For example, under Manage Study, it’s unclear whether the data has been downloaded 164 times or whether 164 participants have completed the study. The difference between Manage Study and Track Activity is also hazy. Manage Study allows you to specify where to send users after they have completed the study and to limit the number of participants or the length of the study, while Track Activity tells you how many people have completed the study. The Download Tracking CSV gives you access to a text file listing all participants’ URL information and their start and stop times.

Mindcanvas Workbench
Figure 2: Track Activity

The Workbench gives you access to MindCanvas’ powerful study creation module, but you can tell that most of the design effort went into the end user’s interface, not the study designer’s. Luckily, there is a wiki available that answers a lot of questions, and Uzanto’s consultants are very friendly and helpful with the occasional question.

Conclusion

The IA community can finally say that we have a tool designed for us. For so long, we’ve had to take existing tools and try to use them in ways not intended by their designers, sometimes with frustrating results, and we’ve had to develop clever and complicated workarounds. These issues are no longer a problem. It’s a tool for us, made by one of us. It’s about time!

Keeping Pace with Change

Written by: Samantha Bailey

Documentation is a little bit like broccoli; we all know it’s good for us, but many of us avoid it. It would be one thing if it were just a matter of getting things documented once, but the web is a never-finished medium, and producing “living” documentation that evolves alongside projects and sites can require a heroic level of time and energy. Unfortunately, this documentation void is a major contributor to bad information architecture design, especially as a site evolves over time. Blueprints and sitemaps let us look at complex information spaces from a bird’s-eye view, which helps us think more holistically and systematically about the design. But in order to do that, we have to have tools that support our work within the framework of our realities. And for the foreseeable future, this reality is likely to include projects that will never have time for documentation built into the resource allocation plan.

Intuitect is the most recent entrant into the emerging market for software that supports design and documentation of websites and information products from conception to prototyping. As such, it supports the creation of sitemaps to show hierarchical relationships, wireframes to show page-level components, and flow diagrams to document state and interactive components. Its features compete most closely with iRise and Axure; iRise being the first-to-market product focused on rapid prototyping and geared at the enterprise level, and Axure being the lighter, faster, cheaper entrant with a virtually identical feature set and more user friendly user interface (UI).

I have been leading user experience (UX) design teams for several years, which involves staying abreast of software tools that my team may find useful, evaluating software requests from my team, and making a business case for purchasing software. Because none of these tasks require me to have the level of proficiency possessed by the active practitioner, this review will focus more on my assessment of the strengths and weaknesses from a conceptual perspective, leaving the detailed evaluation of specific controls to a competent practitioner. While no longer in a practitioner role, I’ve had a significant amount of exposure to many of the products in this marketplace; in previous workplaces we used both Visio and iRise and in my current situation Visio, Axure, Illustrator, and Dreamweaver are all used in information design documentation.

From my perspective, there are three rather unique differentiators to Intuitect:

  • It is a licensed Visio add-on rather than an independent software program.
  • It is specifically catering to user experience professionals and works from a baseline assumption that users will be conversant in principles of information architecture and interaction design.
  • It has a fairly sophisticated approach to indexing structural components that acknowledges the dynamic nature of web design projects, which, by their very nature, are continuously undergoing revision.

The good, the bad, and the compatible

Specifically catering to UX professionals is an aspect of Intuitect that I find particularly compelling, because the software intrinsically acknowledges the relationship between page components, pages, and organizational groupings, something that is not fully realized in either iRise or Axure. As an information architect by training, I find a product with specific affordances for documenting navigation long overdue. The ability to move (in this case via tabs at the bottom of the screen) among sitemaps, wireframes, and flow diagrams supports thinking about information spaces more organically. It is the ability to see our design work through several lenses (sitemap/blueprint, page-level wireframe, interaction flow) that I find most compelling. This multi-view approach has the potential to be a powerful training and teaching tool while simultaneously supporting experienced practitioners in doing more complex design work. While experimenting with the tool I had the sensation that I was thinking less linearly and was able to visualize the interaction of components and pages in a more multi-dimensional way than was previously feasible with Visio or the other products I’ve used.

Pros and cons assert themselves the most baldly with Intuitect’s status as a Visio add-on. As someone who has used Visio extensively, I appreciate this integration, which allows Intuitect to rely on Visio’s existing strengths and extend and improve upon a mature software product’s formidable feature set. Anyone who has been using Visio regularly will likely be enthusiastic about the extended capabilities, as well as appreciative of the UI commonalities and the comparatively low barrier to entry in terms of pricing. Additionally, people who work in large organizations will likely experience few compatibility problems and may find it easier to have Intuitect accepted as a new software addition due to its association with Microsoft and the Office Suite’s level of penetration in large companies.

Of course, therein lies the rub—Intuitect is not available to Mac users (who typically use Omnigraffle) and may be overlooked by shops where Visio isn’t used regularly. In addition, it may have a higher learning curve for those who are unfamiliar with Visio since they’ll have to familiarize themselves with the peccadilloes of both components. Since Visio doesn’t have the same level of penetration as the other components of the Office suite, colleagues and clients who review UX deliverables may not have the software and will have to view deliverables as PDFs or an equivalent solution. Of course, anyone already using Visio is almost certainly already adept in responding to this limitation. (Because Intuitect currently only works with Visio 2003 Professional, folks who have not upgraded their Office suite recently will not be able to access the software without an upgrade.)

It may be my own professional bias at work, but I see few disadvantages to Intuitect’s UX-professional centric positioning; this is a niche that hasn’t been fully explored as other programs have tended to focus more on enabling the business segment to participate in the information design process. While it’s possible that some prospective users could be put off by the UX-specific positioning, I doubt that this would be likely to pose any greater barrier than the issues around Visio adoption described above. Most of the business analysts that I’ve worked with are Visio conversant and will readily grasp Intuitect’s design themes.

Indexing to the rescue


It is Intuitect’s approach to indexing that I find most exciting about the product. Unlike traditional approaches to web design documentation where the “boxes and arrows” have static relationships, Intuitect captures the logical relationships between pages and data structures and is able to cascade changes and maintain relationships as the design expands and contracts and as hierarchical relationships change.

I have long been frustrated by the clash that occurs between the realities of working in new media, where the only constant is change and the inherently static nature of documentation. To date, I haven’t encountered a satisfactory solution to the problem of updating documentation in real-time—updating in a way that cascades throughout the design without requiring a manual update of numbering schemes. Because creating and maintaining documentation is so mind-numbingly labor intensive, most organizations end up with one of several (unsatisfactory) approaches:

  • Generate sitemaps lifted from the website via an automated process—from a content management system, for example—which tend to be disconnected from the organic site design and development in progress;
  • Settle for woefully out-of-date and incomplete documentation; or, most commonly,
  • Don’t maintain this kind of documentation outside of very specific redesign efforts.

Lack of or limits to documentation are a common lament in many industries. I suspect a primary cause of poor documentation in the web design world is the omnipresent change factor: the next project is always underway before the current one is finished. If documentation were done for documentation’s sake, this would not be a concern. But it’s not: most information architectures are haphazard at best and impenetrable as the norm.

This product advances us another step down that road by introducing sophisticated indexing that can keep up with real-time design realities. With its approach to indexing, Intuitect offers the promise of painless evolution, and this strikes me as a feature that could become the product’s “secret” weapon in terms of developing customer loyalty.
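
To illustrate the general idea (this is not Intuitect’s actual mechanism, just a generic sketch), deriving index numbers from the tree on demand, rather than typing them in by hand, means that inserting or moving a page renumbers everything beneath it automatically:

```python
# A generic illustration of cascading hierarchical indexing: numbers are
# computed from the tree each time, so structural changes renumber themselves.
def number_tree(nodes, prefix=""):
    """Yield (index, title) pairs like ('2.1', 'Belts') for a nested list."""
    for i, (title, children) in enumerate(nodes, start=1):
        index = f"{prefix}{i}"
        yield index, title
        yield from number_tree(children, prefix=index + ".")

site = [
    ("Home", []),
    ("Products", [("Belts", []), ("Hats", [])]),
    ("Support", []),
]

for index, title in number_tree(site):
    print(index, title)

# Inserting a new top-level section simply shifts the indexes that follow it.
site.insert(1, ("About us", []))
print("--- after insert ---")
for index, title in number_tree(site):
    print(index, title)
```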

Intuitect is currently still in beta and, as such, there are still bugs, inconsistencies, gaps in the help content, and feature-set limitations to contend with, but this is definitely a product that information architects and interaction designers will want to familiarize themselves with, and one that may become indispensable to many in the UX design community.

For more information: