User Research With Small Business Owners: Best Practices and Considerations

Written by: Chelsey Glasson

The majority of our work at Google has involved conducting user research with small business owners: the small guys that are typically defined by governmental organizations as having 100 or fewer employees, and that make up the majority of businesses worldwide.

Given the many hurdles small businesses face, designing tools and services to help them succeed has been an immensely rewarding experience. That said, the experience has brought a long list of challenges, including those that come with small business owners being constantly on-call and strapped for time; when it comes to user research, the common response from small business owners and employees is, “Ain’t nobody got time for that!”

To help you overcome common challenges we’ve faced, here are a few tips for conducting successful qualitative user research studies with small businesses.


Recruiting tip #1: Give yourself an extra week, and then some

It generally takes more time to recruit for research projects with small businesses than what’s typical for consumer studies. There are several reasons why this is the case.

Existing user research participant pools tend to be light on small business representation, meaning recruiting for your project may have to start completely from scratch. Also, it can take time to track down the appropriate person at a small business to talk to—are you trying to reach the owner, the accountant, customer service staff, or…?  

Finally, small businesses are accustomed to companies trying to sell them new offerings or get them to sign up for product pilots, and many have been scammed into signing up for “free” pilots or services that end up turning into a perpetual sales pitch. Because of this, the chances of a small business owner or employee saying “No!” to participating in your user research are especially high.

Recruiting tip #2: Make sure you’re crystal clear on what type of business you want to recruit

There’s quite a bit of variation in terms of business environment, priorities, strategies and other factors across different types of businesses. Accidentally overlooking important criteria could be detrimental to a study.

For example, do you want to talk to a certain type of business, such as professional service, service area, or brick-and-mortar? Does it matter if your study participants are from B2B vs. B2C companies? What about online vs. offline businesses? Additional points of consideration include number of employees, business goals (e.g., does the business want to grow?), and revenue.

If you’re not sure if you’ve overlooked important criteria, ask for feedback from product managers, marketing professionals, and other user researchers who may have relevant information. It can also be helpful to see how entities such as the Small Business Administration categorize business types.

Recruiting tip #3: Make sure you’re crystal clear on whom you want to interview

When conducting research with small business owners, it’s common to assume that the business owner is involved in most decisions, but that’s often not the case.

Is it actually the business owner you’re interested in speaking to? Or do you need to talk to someone who’s responsible for a specific task, such as someone who manages online marketing decisions or handles the company’s financials? The larger the company, the higher the chances are of the company owner delegating responsibilities.

We typically ensure we’re speaking to the right type of person by asking screening questions specific to roles and responsibilities (see examples at the end of this article).

Recruiting tip #4: Avoid hobbyists disguised as business owners

It’s common for hobbyists (for example, people who casually sell certain services or offerings for personal enjoyment) to sign up for user research involving small business owners. On the surface they pass many of the screening criteria, but in reality their motivations and behaviors are quite different from those of a full-time business owner or employee. We typically screen out hobbyists via recruiting screening surveys by asking if potential study participants spend at least 30 hours per week in their role as business owner or employee of the business.

Recruiting tip #5: Recruit extra participants

When conducting research with consumers, we always recruit one extra participant in the event there is a no-show. When conducting research with small businesses, we’ll increase that number to two or more. Given how unpredictable the small business environment can be, we’ve found that the chances of last-minute cancellations or no-shows are much higher with small business owners and employees than they are with consumers.
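To get a feel for how many extras to recruit, here is a rough sketch that treats each participant as showing up independently with a fixed probability. The show-up rates and the target of five completed interviews are hypothetical placeholders, not figures from our studies; swap in numbers from your own recruiting history.

```python
from math import comb

def prob_enough_show(recruited: int, needed: int, show_rate: float) -> float:
    """Probability that at least `needed` of `recruited` participants show up,
    assuming each one shows up independently with probability `show_rate`."""
    return sum(
        comb(recruited, k) * show_rate**k * (1 - show_rate) ** (recruited - k)
        for k in range(needed, recruited + 1)
    )

# Hypothetical values for illustration only.
CONSUMER_SHOW_RATE = 0.90
SMALL_BUSINESS_SHOW_RATE = 0.75
NEEDED = 5  # interviews you actually want to complete

for extras in range(4):
    recruited = NEEDED + extras
    consumers = prob_enough_show(recruited, NEEDED, CONSUMER_SHOW_RATE)
    small_biz = prob_enough_show(recruited, NEEDED, SMALL_BUSINESS_SHOW_RATE)
    print(f"{extras} extra: consumers {consumers:.2f}, small businesses {small_biz:.2f}")
```

Under these assumptions, the lower the show-up rate, the more extras it takes to reach the same level of certainty, which matches our experience of needing two or more extras for small business studies.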


Incentive tip #1: Provide incentives other than cash

While incentives are a nice gesture, cash or gift cards are not a huge motivator, as they aren’t viewed as a worthwhile tradeoff for inconveniences that come with stepping away from running a business in order to participate in an interview.

What is motivating is providing small business owners and their employees with information and tips on how to run the business successfully: things like offering free accounting software, coaching on social media best practices, and personal access to a member of the support team for assistance. Another approach is to offer 15 minutes after the interview for free coaching and/or advice on a topic that makes sense given the study focus.

The small business community is tightly knit, and small businesses are often invested in each other’s success. Because of this, another option is to frame the study as an opportunity to improve offerings for all small businesses.

Even better, small business owners and employees love the opportunity to share feedback on tools and services they routinely use to run their business. If the product you’re testing or exploring touches upon tools and services already in use, it can be motivating to frame user interviews as an opportunity to shape the future of the offering being reviewed.

Finally, consider offering small business owners and employees the opportunity to participate in an exclusive Trusted Testers community, which provides the option to share feedback, receive “insiders” information and tips, and interact with and learn from other small businesses. We’ve found this option can be especially motivating for engaging in user research.


Interview tip #1: Consider in-person interviews

It can be hard for small business owners and employees to take time away from the business to participate in research that might be conducted at your lab or office. Likewise, for remote interviews, small business owners and employees don’t always have convenient access to needed technology at their place of business.

For these reasons, we’ve found that small businesses are much more likely to participate in user research if interviews take place at their place of business. This way they can tend to the business during interviews if needed and don’t have to waste valuable time setting up technology to participate in the interviews.

Also, conducting in-person interviews provides context often needed to understand complicated processes and workflows that business owners and their employees face.

Interview tip #2: Be flexible with scheduling

We’re also always especially flexible with scheduling when conducting research with small business owners and employees. In addition to leaving extra time between interviews, we usually leave an interview slot open in the event we have to move the schedule around suddenly. We’re also mindful of offering early morning or late evening interview times, especially if the verticals we’re focused on are service-oriented (restaurants, spas, etc.); trying to conduct a field visit during peak hours can be really intrusive for these types of businesses.

Interview tip #3: Be prepared for last-minute changes

The world of small business owners and their employees can be unpredictable, which is why we always schedule extra, backup participants for research. We’ve run into countless situations where a research participant cancels an interview at the last second on account of unexpected business or emergencies.

It’s also common for small business owners or employees to request location changes at the last second. For example, one time I (Chelsey) was scheduled to interview a business owner at his home (which is where he ran the business). He called five minutes before the interview explaining he wanted to be respectful of his roommates and asking if we could meet elsewhere. Good thing I had scoped out the area before this happened and had a nearby coffee shop in mind where we could talk!

Interview tip #4: Emphasize participant expertise early

When interviewing small business owners and employees, it’s common for them to want to seize the opportunity to get insider information or training on whatever topic is being explored. When this happens at the start of an interview, the interviewer becomes the expert for the remainder of the conversation which can prevent an open, honest dialogue.

To establish the participant as the expert early on in the conversation, there are a few things we’ll typically do. For starters, we always state that the goal of the study is to learn from the participant.

Next, we’ll ask the participant to give a tour of the business (if a site visit) and to explain what the day-to-day looks like in running it. During the tour and/or day-to-day explanation, we’ll call out pieces of information that are new to us and ask a few follow-up questions. This strategy usually does the trick in placing the research participant in expert mode and researchers as the student.

Interview tip #5: Bring extra NDAs  

I (Chelsey) will never forget, in kicking off an interview with a business owner in India, when I unexpectedly discovered several family members waiting to enthusiastically participate in the conversation!

The reality is that running a small business, whether in India, the US, or elsewhere, is rarely a solo operation. Consequently, we’ve found it’s common for interviewees to ask family, friends, and employees to join interviews. This isn’t a bad thing. In fact, in many situations it’s a wonderful surprise that can lead to an engaging, insightful conversation.

Because of how frequent this scenario can be, we now always make sure to bring extra copies of NDAs.


Reporting tip #1: Provide context

When socializing your findings to a product team, building empathy for the participants and their challenges is key. Reporting on consumer insights is relatively easy because most of us face similar challenges in our daily lives and we can easily identify with the participants. However, small business owners and employees face challenges that are less relatable.

Therefore, it’s important to create a narrative that includes the context of the merchant’s business and business practices. For example, what vertical do they work in? What does their day to day look like? How has their business evolved? What do they feel their customers’ needs are, and how does that in turn translate to their use of your product?

Additionally, keep in mind that your product won’t exist in a vacuum. Small business merchants are experimental, and are willing to try out numerous tools and services until they find one that meets most (usually not all) of their needs. Small business owners also value integration and may find creative ways to DIY integration that doesn’t already exist. It’s therefore not unusual that small business research may occasionally graze the edges of competitive analysis.

When crafting your report, create a story around the participants — what are their challenges and successes? How do they feel about their customers? How does (or could) your product fit into their business processes? Finally, video recordings and direct quotes are highly impactful and help emphasize the person behind the findings.

Reporting tip #2: Limit, but embrace, variation

Because small businesses are so varied in terms of vertical, structure, and practices, it takes a careful eye to draw unified or cohesive themes across what can sometimes seem like disparate participants.

Often in user research there is an impulse to sweep outliers under the rug. However, in small business research it can actually be helpful to call out and explain the outliers. They may represent an edge case that your team has an opportunity to address, or they might reveal something new about a vertical, business, or merchant type.  

Of course, as we mentioned earlier, it’s important to clearly define your intended participant group. Even with a clear definition of who you want to talk to, you can expect to see a healthy amount of variation among your study participants.

Concluding thoughts

Small businesses have different pressures and motivations than consumers that are important to consider in setting up a successful user study with business owners and those who help run small businesses. To get the most out of your time and theirs, study up on what might relieve these pressures and speak to motivations, and adjust your recruiting, incentives, and interview techniques accordingly.

Sample screening questions

Which of the following best describes the business where you work? Please select one.

Food and dining (e.g., restaurant, bar, food truck, grocery store) 1
Retail and shopping (e.g., clothing boutique, online merchandise store) 2
Beauty and fitness (e.g., nail salon, gym, hair salon, spa) 3
Medical and health (e.g., doctor, dentist, massage therapist, counselor) 4
Travel and lodging (e.g., hotel, travel agency, taxi, gas station) 5
Consulting services (e.g., management consulting, business strategy) 6
Legal services (e.g., lawyer, paralegal, bail bondsman) 7
Home services and construction (e.g., contractor, HVAC, plumber, cleaning services) 8
Finance and banking (e.g., accounting, insurance, financial planner, investor, banker) 9
Education (e.g., tutoring, music lessons, public or private school, daycare, university) 10
Entertainment (e.g., movie theatre, sports venue, comedy club, bowling alley) 11
Art / design (e.g., art dealers, antique restoration, photographer) 13
Automotive services (e.g., auto repairs, car sales) 14
Marketing services (e.g., advertising, marketing, journalism, PR) 15
Other 16

Thinking about the next 12 months, which of the following are overall goals for the business you own or work for? Select all that apply.

Acquire new customers 1
Conduct more business with existing customers 2
Target specific customer segments 3
Improve operational efficiency/capabilities 4
Expand to more locations 5
Develop new products/services 6
Offer training or development for my employees 7
Invest in improvements to physical locations (e.g., new paint, interior remodeling, etc.) 8
Maintain current business performance 9
Acquire competitors 10
Other 11
None of the above 99


How does your business operate? Please select all that apply.

You have a physical business location that customers visit (e.g., store, salon, restaurant, hotel, doctor’s office etc.) 1
Your business serves customers at their locations (e.g., taxi driver, realtor, locksmith, wedding photographer, plumber) 2
Your customers can purchase products and services from any location, online or by phone 3
Other 4

Which of the following best describes your role in your business?

Owner 1
Employee 2
Other 3

Which of the following are you responsible for at the business you own or work for? Please select all that apply.

Hiring employees 1
Managing employees 2
Business planning/Strategy 3
Marketing/Promotions 4
Finance/Accounting 5
Sales/Customer Service 6
Legal 7
IT 8
Other 9
None of the above 99

Which of the following best describes your current employment status? Please select one.

Work full-time (30 or more hours per week) 1
Work part-time (fewer than 30 hours per week) 2
Not employed 4
Student 5
Retired 6
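
As a rough illustration of how answers to these questions might be turned into a pass/fail screen, here is a minimal sketch. The answer codes come from the questions above; the `ScreenerResponse` class, the `qualifies` function, and the example study requirement (someone responsible for marketing) are hypothetical and would need adapting to your own survey tool and study.

```python
from dataclasses import dataclass, field

# Answer codes copied from the sample screening questions above.
ROLE_OWNER, ROLE_EMPLOYEE = 1, 2
FULL_TIME = 1   # "Work full-time (30 or more hours per week)"
MARKETING = 4   # "Marketing/Promotions"

@dataclass
class ScreenerResponse:
    """Hypothetical container for one person's survey answers."""
    role: int
    employment_status: int
    responsibilities: set = field(default_factory=set)

def qualifies(response: ScreenerResponse, required_responsibility: int) -> bool:
    """Apply the role, hobbyist, and responsibility checks described in the tips."""
    works_in_business = response.role in (ROLE_OWNER, ROLE_EMPLOYEE)
    is_full_time = response.employment_status == FULL_TIME  # screens out hobbyists
    owns_the_task = required_responsibility in response.responsibilities
    return works_in_business and is_full_time and owns_the_task

# Made-up responses for illustration.
candidates = {
    "full-time owner who handles marketing": ScreenerResponse(1, 1, {3, 4, 5}),
    "part-time hobbyist seller": ScreenerResponse(1, 2, {4}),
    "full-time employee in finance only": ScreenerResponse(2, 1, {5}),
}

for label, response in candidates.items():
    verdict = "invite" if qualifies(response, MARKETING) else "screen out"
    print(f"{label}: {verdict}")
```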

How to Determine When Customer Feedback Is Actionable

Written by: Naira Musallam

One of the riskiest assumptions for any new product or feature is that customers actually want it.

Although product leaders can propose numerous ‘lean’ methodologies to experiment inexpensively with new concepts before fully engineering them, anything short of launching a product or feature and monitoring its performance over time in the market is, by definition, not 100% accurate. That leaves us with a dangerously wide spectrum of user research strategies, and an even wider range of opinions for determining when customer feedback is actionable.

To the dismay of product teams desiring to ‘move fast and break things,’ their counterparts in data science and research advocate a slower, more traditional approach. These proponents of caution often emphasize an evaluation of statistical signals before considering customer insights valid enough to act upon.

This dynamic has meaningful ramifications. For those who care about making data-driven business decisions, the challenge that presents itself is: How do we adhere to rigorous scientific standards in a world that demands adaptability and agility to survive? Having frequently witnessed the back-and-forth between product teams and research groups, it is clear that there is no shortage of misconceptions and miscommunication between the two. Only a thorough analysis of some critical nuances in statistics and product management can help us bridge the gap.

Quantify risk tolerance

You’ve probably been on one end of an argument that cited a “statistically significant” finding to support a course of action. The problem is that statistical significance is often equated to having relevant and substantive results, but neither is necessarily the case.

Simply put, statistical significance exclusively refers to the level of confidence (measured from 0 to 1, or 0% to 100%) you have that the results you obtained from a given experiment are not due to chance. Statistical significance alone tells you nothing about the appropriateness of the confidence level selected nor the importance of the results.

To begin, confidence levels should be context-dependent, and determining the appropriate confidence threshold is an oft-overlooked proposition that can have profound consequences. In statistics, confidence levels are closely linked to two concepts: type I and type II errors.

A type I error, or false-positive, refers to believing that a variable has an effect that it actually doesn’t.

Some industries, like pharmaceuticals and aeronautics, must be exceedingly cautious against false-positives. Medical researchers, for example, cannot afford to mistakenly think a drug has an intended benefit when in reality it does not. Side effects can be lethal, so the FDA’s threshold for proof that a drug’s health benefits outweigh its known risks is intentionally onerous.

A type II error, or false-negative, has to do with the flip side of the coin: concluding that a variable doesn’t have an effect when it actually does.

Historically though, statistical significance has been primarily focused on avoiding false-positives (even if it means missing out on some likely opportunities) with the default confidence level at 95% for any finding to be considered actionable. The reality that this value was arbitrarily determined by scientists speaks more to their comfort level of being wrong than it does to its appropriateness in any given context. Unfortunately, this particular confidence level is used today by the vast majority of research teams at large organizations and remains generally unchallenged in contexts far different than the ones for which it was formulated.

Matrix visualising Type I and Type II errors as described in text.


But confidence levels should be representative of the amount of risk that an organization is willing to take to realize a potential opportunity. There are many reasons for product teams in particular to be more concerned with avoiding false-negatives than false-positives. Mistakenly missing an opportunity due to caution can have a more negative impact than building something no one really wants. Digital product teams don’t share many of the concerns of an aerospace engineering team and therefore need to calculate and quantify their own tolerance for risk.
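
To make this tradeoff concrete, the sketch below simulates many split tests at two confidence levels and counts both kinds of error. Everything numeric here is an assumption chosen for illustration: a 10% baseline conversion rate, a true lift to 13%, 500 users per variant, and a standard pooled two-proportion z-test as the decision rule.

```python
import random
from statistics import NormalDist

def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value from a pooled two-proportion z-test."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = (pooled * (1 - pooled) * (1 / n_a + 1 / n_b)) ** 0.5
    if se == 0:
        return 1.0
    z = (conv_b / n_b - conv_a / n_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

def conversions(rng, n, rate):
    """Simulate how many of n users convert at the given true rate."""
    return sum(rng.random() < rate for _ in range(n))

def error_rates(confidence, n=500, base=0.10, lifted=0.13, runs=1000, seed=1):
    """Return (false-positive rate, false-negative rate) over simulated tests."""
    rng = random.Random(seed)
    alpha = 1 - confidence
    false_pos = false_neg = 0
    for _ in range(runs):
        # No real difference: a "significant" result is a type I error.
        a, b = conversions(rng, n, base), conversions(rng, n, base)
        false_pos += two_proportion_p_value(a, n, b, n) < alpha
        # Real difference: a non-significant result is a type II error.
        a, b = conversions(rng, n, base), conversions(rng, n, lifted)
        false_neg += two_proportion_p_value(a, n, b, n) >= alpha
    return false_pos / runs, false_neg / runs

for confidence in (0.95, 0.80):
    fp, fn = error_rates(confidence)
    print(f"confidence {confidence:.0%}: false positives {fp:.1%}, false negatives {fn:.1%}")
```

Lowering the confidence threshold lets more false-positives through but misses fewer real effects. Which side of that trade matters more is the business question this section is about.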

To illustrate the ramifications that confidence levels can have on business decisions, consider this thought exercise. Imagine two companies, one with outrageously profitable 90% margins, and one with painfully narrow 5% margins. Suppose each of these businesses is considering a new line of business.

In the case of the high margin business, the amount of capital they have to risk to pursue the opportunity is dwarfed by the potential reward. If executives get even the weakest indication that the business might work they should pursue the new business line aggressively. In fact, waiting for perfect information before acting might be the difference between capturing a market and allowing a competitor to get there first.

In the case of the narrow margin business, however, the buffer before going into the red is so small that going after the new business line wouldn’t make sense with anything except the most definitive signal.
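
The thought exercise can be put into numbers with a deliberately simplified model. Assume (and this is only an assumption for illustration) that pursuing the new line costs everything except the margin up front, that the cost is lost if the bet fails, and that the margin is kept if it succeeds. The function name `break_even_confidence` is made up for this sketch.

```python
def break_even_confidence(margin: float, revenue: float = 100.0) -> float:
    """Minimum probability of success at which the bet has non-negative expected value.

    Toy model: expected value = p * gain - (1 - p) * loss, which is zero
    when p = loss / (loss + gain).
    """
    loss_if_wrong = (1 - margin) * revenue  # money spent and not recovered
    gain_if_right = margin * revenue        # profit kept if the bet works
    return loss_if_wrong / (loss_if_wrong + gain_if_right)

for label, margin in (("high margin (90%)", 0.90), ("narrow margin (5%)", 0.05)):
    print(f"{label}: pursue if you are at least {break_even_confidence(margin):.0%} confident")
```

In this toy model the break-even confidence works out to one minus the margin: about 10% for the high margin business and about 95% for the narrow margin one. Real decisions involve more than this, but it shows why a single default threshold cannot suit both companies.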

Although these two examples are obviously allegorical, they demonstrate the principle at hand. To work together effectively, research analysts and their commercially-driven counterparts should have a conversation around their organization’s particular level of comfort and make statistical decisions accordingly.

Focus on impact

Confidence levels only tell half the story. They don’t address the magnitude to which the results of an experiment are meaningful to your business. Product teams need to combine the detection of an effect (i.e., the likelihood that there is an effect) with the size of that effect (i.e., the potential impact to the business), but this is often forgotten on the quest for the proverbial holy grail of statistical significance.

Many teams mistakenly focus energy and resources acting on statistically significant but inconsequential findings. A meta-analysis of hundreds of consumer behavior experiments sought to qualify how seriously effect sizes are considered when evaluating research results. They found that an astonishing three-quarters of the findings didn’t even bother reporting effect sizes “because of their small values” or because of “a general lack of interest in discovering the extent to which an effect is significant…”

This is troubling, because without considering effect size, there’s virtually no way to determine what opportunities are worth pursuing and in what order. Limited development resources prevent product teams from realistically tackling every single opportunity. Consider for example how the answer to this question, posed by a MECLABS data scientist, changes based on your perspective:

In terms of size, what does a 0.2% difference mean? For, that lift might mean an extra 2,000 sales and be worth a $100,000 investment…For a mom-and-pop Yahoo! store, that increase might just equate to an extra two sales and not be worth a $100 investment.

Unless you’re operating at a Google-esque scale for which an incremental lift in a conversion rate could result in literally millions of dollars in additional revenue, product teams should rely on statistics and research teams to help them prioritize the largest opportunities in front of them.

Sample size constraints

One of the most critical constraints on product teams that want to generate user insights is the ability to source users for experiments. With enough traffic, it’s certainly possible to generate a sample size large enough to pass traditional statistical requirements for a production split test. But it can be difficult to drive enough traffic to new product concepts, and it can also put a brand unnecessarily at risk, especially in heavily regulated industries. For product teams that can’t easily access or run tests in production environments, simulated environments offer a compelling alternative.

That leaves product teams stuck between a rock and a hard place. Simulated environments require standing user panels that can get expensive quickly, especially if research teams seek sample sizes in the hundreds or thousands. Unfortunately, strategies like these again overlook important nuances in statistics and place undue hardship on the user insight generation process.

A larger sample does not necessarily mean a better or more insightful sample. The objective of any sample is for it to be representative of the population of interest, so that conclusions about the sample can be extrapolated to the population. It’s assumed that the larger the sample, the more likely it is going to be representative of the population. But that’s not inherently true, especially if the sampling methodology is biased.

Years ago, a client fired an entire research team in the human resources department for making this assumption. The client sought to gather feedback about employee engagement and tasked this research team with distributing a survey to the entire company of more than 20,000 global employees. From a statistical significance standpoint, only 1,000 employees needed to take the survey for the research team to derive defensible insights.

Within hours after sending out the survey on a Tuesday morning, they had collected enough data and closed the survey. The problem was that only employees within a few timezones had completed the questionnaire with a solid third of the company being asleep, and therefore ignored, during collection.

Clearly, a large sample isn’t inherently representative of the population. To obtain a representative sample, product teams first need to clearly identify a target persona. This may seem obvious, but it’s often not explicitly done, creating quite a bit of miscommunication for researchers and other stakeholders. What one person may mean by a ‘frequent customer’ could mean something different entirely to another person.

After a persona is clearly identified, there are a few sampling techniques that one can follow, including probability sampling and nonprobability sampling techniques. A carefully-selected sample size of 100 may be considerably more representative of a target population than a thrown-together sample of 2,000.
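
As one example of a probability sampling technique, here is a minimal sketch of proportional stratified sampling applied to the employee survey story above. The 20,000 headcount and the 1,000 responses come from that story; the three regions and their sizes are hypothetical, as are the function and variable names.

```python
import random
from collections import Counter

def stratified_sample(population, stratum_of, sample_size, seed=0):
    """Draw a random sample whose strata mirror their share of the population.

    Quotas are rounded per stratum, so the total can differ slightly from
    `sample_size` when the shares do not divide evenly.
    """
    rng = random.Random(seed)
    strata = {}
    for member in population:
        strata.setdefault(stratum_of(member), []).append(member)
    sample = []
    for members in strata.values():
        quota = round(sample_size * len(members) / len(population))
        sample.extend(rng.sample(members, quota))
    return sample

# Hypothetical regional split of the 20,000 employees.
HEADCOUNT = {"Americas": 8000, "EMEA": 6000, "APAC": 6000}
employees = [(region, i) for region, count in HEADCOUNT.items() for i in range(count)]

# What went wrong in the story: the first 1,000 responses came only from
# regions that were awake while the survey was open.
awake = [e for e in employees if e[0] != "APAC"]
first_come = random.Random(0).sample(awake, 1000)

stratified = stratified_sample(employees, lambda e: e[0], 1000)

print("first come, first served:", Counter(region for region, _ in first_come))
print("stratified by region:    ", Counter(region for region, _ in stratified))
```

Both samples contain 1,000 people, but only the stratified one includes every region in proportion to its headcount.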

Research teams may counter with the need to meet statistical assumptions that are necessary for conducting popular tests such as a t-test or Analysis of Variance (ANOVA). These types of tests assume a normal distribution, which generally occurs as a sample size increases. But statistics has a solution for when this assumption is violated and provides other options, such as non-parametric testing, which work well for small sample sizes.
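
One non-parametric option that works with very small samples is an exact permutation test, sketched below. It is offered as an example of the family of techniques, not as the specific test any particular research team uses. The task completion times are made-up numbers for two hypothetical form designs.

```python
from itertools import combinations
from statistics import mean

def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test on the difference in means.

    Makes no assumption about the shape of the underlying distribution:
    it asks how often a difference at least this large appears when the
    group labels are reassigned in every possible way.
    """
    pooled = list(group_a) + list(group_b)
    observed = abs(mean(group_a) - mean(group_b))
    indices = range(len(pooled))
    extreme = total = 0
    for chosen in combinations(indices, len(group_a)):
        chosen_set = set(chosen)
        a = [pooled[i] for i in chosen_set]
        b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total

# Hypothetical seconds to complete a task, six participants per design.
current_design = [41, 38, 45, 50, 39, 44]
new_design = [30, 28, 35, 33, 29, 31]

print(f"p-value: {permutation_p_value(current_design, new_design):.4f}")
```

Because every relabeling is enumerated, this approach only scales to small groups, which is exactly the situation where the normality assumption is hardest to defend.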

In fact, the strongest argument left in favor of large sample sizes has already been discounted. Statisticians know that the larger the sample size, the easier it is to detect small effect sizes at a statistically significant level (digital product managers and marketers have become soberly aware that even a test comparing two identical versions can find a statistically significant difference between the two). But a focused product development process should be immune to this distraction because small effect sizes are of little concern. Not only that, but large effect sizes are almost as easily discovered in small samples as in large samples.

For example, suppose you want to test ideas to improve a form on your website that currently gets filled out by 10% of visitors. For simplicity’s sake, let’s use a confidence level of 95% to accept any changes. To identify just a 1% absolute increase to 11%, you’d need more than 12,000 users, according to Optimizely’s stats engine formula! If you were looking for a 5% absolute increase, you’d only need 223 users.
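
The same relationship shows up in the classical sample size formula for comparing two proportions, sketched below. This is not Optimizely’s stats engine formula, so it will not reproduce the figures quoted above; it also assumes 80% power and a two-sided test, neither of which is stated in the example. What it does show is how quickly the required sample shrinks as the effect you care about grows.

```python
from math import ceil, sqrt
from statistics import NormalDist

def users_per_variant(baseline, target, confidence=0.95, power=0.80):
    """Classical sample size per variant for a two-sided two-proportion test."""
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    z_beta = NormalDist().inv_cdf(power)
    pooled = (baseline + target) / 2
    numerator = (
        z_alpha * sqrt(2 * pooled * (1 - pooled))
        + z_beta * sqrt(baseline * (1 - baseline) + target * (1 - target))
    ) ** 2
    return ceil(numerator / (target - baseline) ** 2)

small_lift = users_per_variant(0.10, 0.11)  # 10% -> 11%
large_lift = users_per_variant(0.10, 0.15)  # 10% -> 15%

print(f"1-point lift: {small_lift} users per variant")
print(f"5-point lift: {large_lift} users per variant")
print(f"ratio: {small_lift / large_lift:.0f}x")
```

Under these assumptions the one-point lift needs on the order of twenty times as many users as the five-point lift, because the required sample grows roughly with the inverse square of the effect size.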

But depending on what you’re looking for, even that many users may not be needed, especially if conducting qualitative research. When identifying usability problems across your site, leading UX researchers have concluded that “elaborate usability tests are a waste of resources” because the overwhelming majority of usability issues are discovered with just five testers.
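
The five-tester claim is usually explained with a simple diminishing-returns model: if each tester independently uncovers a fixed share of the problems, the share found by n testers is 1 - (1 - L)^n. The sketch below uses L = 31%, the per-tester rate commonly cited from Nielsen and Landauer’s work; treat that value as an assumption, since it varies by product and task.

```python
def share_of_problems_found(testers: int, per_tester_rate: float = 0.31) -> float:
    """Expected share of usability problems found by `testers` independent testers."""
    return 1 - (1 - per_tester_rate) ** testers

for n in (1, 3, 5, 10, 15):
    print(f"{n:>2} testers: {share_of_problems_found(n):.0%} of problems found")
```

With that assumed rate, five testers already find roughly 84% of the problems, and each additional tester adds less than the one before.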

An emphasis on large sample sizes can be a red herring for product stakeholders. Organizations should not be misled away from the real objective of any sample, which is an accurate representation of the identified, target population. Research teams can help product teams identify necessary sample sizes and appropriate statistical tests to ensure that findings are indeed meaningful and cost-effectively attained.

Expand capacity for learning

It might sound like semantics, but data should not drive decision-making. Insights should. And there can be quite a gap between the two, especially when it comes to user insights.

In a recent talk on the topic of big data, Malcolm Gladwell argued that “data can tell us about the immediate environment of consumer attitudes, but it can’t tell us much about the context in which those attitudes were formed.” Essentially, statistics can be a powerful tool for obtaining and processing data, but it doesn’t have a monopoly on research.

Product teams can become obsessed with their Omniture and Optimizely dashboards, but there’s a lot of rich information that can’t be captured with these tools alone. There is simply no replacement for sitting down and talking with a user or customer. Open-ended feedback in particular can lead to insights that simply cannot be discovered by other means. The focus shouldn’t be on interviewing every single user though, but rather on finding a pattern or theme from the interviews you do conduct.

One of the core principles of the scientific method is the concept of replicability—that the results of any single experiment can be reproduced by another experiment. In product management, the importance of this principle cannot be overstated. You’ll presumably need any data from your research to hold true once you engineer the product or feature and release it to a user base, so reproducibility is an inherent requirement when it comes to collecting and acting on user insights.

We’ve far too often seen a product team wielding a single data point to defend a dubious intuition or pet project. But there are a number of factors that could and almost always do bias the results of a test without any intentional wrongdoing. Mistakenly asking a leading question or sourcing a user panel that doesn’t exactly represent your target customer can skew individual test results.

Similarly, and in digital product management especially, customer perceptions and trends evolve rapidly, further complicating data. Look no further than the handful of mobile operating systems which undergo yearly redesigns and updates, leading to constantly elevated user expectations. It’s perilously easy to imitate Homer Simpson’s lapse in thinking, “This year, I invested in pumpkins. They’ve been going up the whole month of October and I got a feeling they’re going to peak right around January. Then, bang! That’s when I’ll cash in.”

So how can product and research teams safely transition from data to insights? Fortunately, we believe statistics offers insight into the answer.

The central limit theorem is one of the foundational concepts taught in every introductory statistics class. It states that the distribution of averages tends to be Normal even when the distribution of the population from which the samples were taken is decidedly not Normal.

Put as simply as possible, the theorem acknowledges that individual samples will almost invariably be skewed, but offers statisticians a way to combine them to collectively generate valid data. Regardless of how confusing or complex the underlying data may be, by performing relatively simple individual experiments, the culminating result can cut through the noise.
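
A small simulation makes the theorem visible. The sketch below draws from an exponential distribution, which is strongly skewed, and compares the skewness of the raw draws with the skewness of averages taken over samples of 50. The distribution, the sample size, and the number of samples are arbitrary choices for illustration.

```python
import random

def average(values):
    return sum(values) / len(values)

def skewness(values):
    """Population skewness: 0 for a symmetric distribution such as the Normal."""
    m = average(values)
    variance = average([(v - m) ** 2 for v in values])
    return average([(v - m) ** 3 for v in values]) / variance ** 1.5

rng = random.Random(42)

# Raw observations from a decidedly non-Normal population (true mean 1.0).
raw_draws = [rng.expovariate(1.0) for _ in range(20_000)]

# Averages of many small samples taken from that same population.
sample_means = [
    average([rng.expovariate(1.0) for _ in range(50)]) for _ in range(2_000)
]

print(f"skewness of raw draws:    {skewness(raw_draws):.2f}")
print(f"skewness of sample means: {skewness(sample_means):.2f}")
print(f"mean of sample means:     {average(sample_means):.3f}")
```

The raw draws stay heavily skewed, while the averages cluster almost symmetrically around the true mean, which is the behavior the theorem describes.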

This theorem provides a useful analogy for product management. To derive value from individual experiments and customer data points, product teams need to practice substantiation through iteration. Even if the results of any given experiment are skewed or outdated, they can be offset by a robust user research process that incorporates both quantitative and qualitative techniques across a variety of environments. The safeguard against pursuing insignificant findings, if you will, is to be mindful not to consider data to be an insight until a pattern has been rigorously established.

Divide no more

The moral of the story is that the nuances in statistics actually do matter. Dogmatically adopting textbook statistics can stifle an organization’s ability to innovate and operate competitively, but ignoring the value and perspective provided by statistics altogether can be similarly catastrophic. By understanding and appropriately applying the core tenets of statistics, product and research teams can begin with a framework for productive dialog about the risks they’re willing to take, the research methodologies they can efficiently but rigorously conduct, and the customer insights they’ll act upon.

Your Guide to Online Research and Testing Tools

Written by: Bartosz Mozyrko

The success of every business depends on how well it meets its customers’ needs. To do that, it is important to optimize your offer, your website, and your selling methods so your customers are satisfied. The fields of online marketing, conversion rate optimization, and user experience design have a wide range of online tools that can guide you through this process smoothly. Many companies use only one or two tools that they are familiar with, but that might not be enough to gather the data necessary for improvement. To help you better understand when each tool is valuable to use, I created a framework that can help in your assessment. Once you broaden your horizons, it will be easier to choose the set of tools aligned to your business’s needs. The tools can be roughly divided into three basic categories:

  • User testing: Evaluate a product by testing it with users who take the study simultaneously, in their natural context, and on their own device.
  • Customer feedback: Capture customers’ expectations, preferences, and aversions directly from a website.
  • Web analytics: Provide detailed statistics about a website’s traffic, traffic sources, and measurement of conversions and sales.

To better understand when to use which tool, it is helpful to use the following criteria:

  • What people say versus what people do… and what they say they do
  • Why versus how much
  • Existing classifications of online tools

Example services are listed in the latter part of the article to help you get started.

What people say versus what people do… and what they say they do

What people say, what people do, and what they say they do are three entirely different things. People often lack awareness or necessary knowledge which would enable them to provide correct information. Anyone who has any experience with user research or conversion rate optimization and has spent time trying to understand users has seen firsthand that more often than not user statements do not match the acquired data. People are not always able to fully articulate why they did that thing they just did. That’s the reason it’s sometimes good to compare information about opinions to information on behavior, as this mix can provide better insights. You can learn what people do by studying your website from your users’ perspective and drawing conclusions based on observations of their behavior, such as click tracking or user session recording. However, that is based on the idea that you test certain theories about people’s behavior. There is a degree of uncertainty, and to validate the data you’ve gathered, you will sometimes have to go one step further and simply ask your users, which will allow you to see the whole picture. Therefore, you can learn what people say by reaching out to your target group directly and asking them questions about your business.

Why versus how much

Some tools are better suited for answering questions about why or how to fix a problem, whereas tools like web analytics do a much better job at answering how many and how much types of questions. Google Analytics tells you the percentage of people who clicked through from one page to another, but it doesn’t tell you why they did or did not do that. Knowing these differences helps you prioritize certain sets of tools and use them to fix the issues that have the biggest impact on your business.

The following chart illustrates how different dimensions affect the types of questions that can be asked:

Chart illustrating how different dimensions affect the types of questions that can be asked.

Choosing the right tool—infographics

There are a lot of tools out there these days that do everything from testing information architecture to remote observation. With more coming out every day, it can be really hard to pick the one that will give you the best results for your specific purpose. To alleviate some of the confusion, many experts have tried to classify them according to different criteria. I have included some examples below for your convenience.

Which remote tool should I use? By Stuff On My Wall

A flow chart to evaluate remote tool choices.

Choosing a remote user experience research tool by Nate Bolt

Another chart showing evaluation criteria for remote research tools.

The five categories of remote UX tools by Nate Bolt

Five categories of user research tools.

Four quadrants of the usability testing tools matrix by Craig Tomlin

Usability testing tools arranged in a quadrant chart.

Tool examples

The examples of tools which I list below are best suited for web services. The world of mobile applications and services is too vast to be skimmed over and has enough material to be a different article completely. The selection is narrowed down in order to not overwhelm you with choice, so worry not.

User testing

User testing and research are vital to creating successful websites, products, and services. Nowadays, using one of the many existing tools and services for user testing is a lot easier than it used to be. The important thing is to find a tool or service that works for your website and then use it to gather real-world data on what works and what does not.

Survey: The most basic form of what people say. Online surveys are often used by companies to gain a better understanding of their customers’ motives and opinions. You can ask respondents to answer in any way they choose or ask them to select an answer from a limited number of predetermined responses. Feedback straight from your customers is best used for determining their pain points or figuring out their possible needs (or future trends). However, remember that people do not always communicate clearly what exactly the issue is that they are facing. Be like Henry Ford: Do not give people faster horses when they want quicker transportation. Invent a car.

Survey Gizmo

Card sorting: This method asks your users to categorize and sort provided items in the way that seems most logical to them, or to create their own categories for the items. The two variants are called closed and open card sorting, respectively. This will help you rework the information architecture of your site based on your users’ mental models. If you aim to obtain information that balances between “what they do” and “what they say”, sorting is your best bet. Be sure to conduct this study with a larger group: some mental models might make sense to one person but aren’t intuitive for others. Focus on the responses that align with each other, as they likely yield the most representative set of categories.


Click testing/automated static/design surveys: This lets you test screenshots of your design, so you can obtain detailed information about your users’ expectations and reactions to a website in various stages of development. This enters the territory of simply gathering data about the actions of your users, so you obtain information about what they do. The study is usually conducted by giving a direct instruction: “Click on the button which will lead to sign-ups”. Remember, however, that click testing alone is not sufficient. You need other tools that cover the “why” in order to fully understand the results.

Verify App

5-Second testing/first impression test: Because your testers have only five seconds to view a presented image, they are put under time pressure and must answer questions relying only on almost subconscious information they obtained. This enables you to improve your landing pages and calls to action, as users mostly focus only on the most eye-catching elements.

Optimal Workshop Chalkmark

Diary studies: An extensive database of all thoughts, feelings and actions of your user, who belongs to a studied target market. All events are being recorded by the participants at their moment of occurrence. This provides insights into firsthand needs of your customers, asking them directly about their experiences. Yet, it operates in a similar fashion to surveys, therefore remember that your participants do not always clearly convey what they mean.

FocusVision Revelation

Moderated user studies/remote usability testing: The participants of this test are located in their natural environment, so their experiences are more genuine. Thanks to the tools and software there is no necessity for participants and facilitators to be in the same physical location. Putting the study into context of a natural/neutral environment (of whatever group you are studying) gives you insight into unmodified behaviours. Also, the study is a lot cheaper than other versions.


Self-moderated testing: The participants of the test are expected to complete the tasks independently. Afterward you will obtain videos of their test sessions, along with a report describing what problems your users were facing and what to do in order to fix them. The services offering this type of testing usually deliver the responses quickly, so if you are in dire need of feedback, this is one of your options.


Automated live testing/remote scenario testing: Very similar to remote testing, yet the amount of information provided is much more extensive and organized. You get effectiveness ratios (success, error, abandonment and timeout), efficiency ratios (time on task and number of clicks), and survey comments as the results.

UX Suite

Tree testing/card-based classification: It is a technique which completely removes every distracting element of the website (ads, themes etc.) and focuses only on the simplified text version. Through this you can evaluate the clarity of your scheme and pinpoint the chokepoints that present problems to users. It is a good method to test your prototypes or if you want to detect a problem with your website and suspect the basic framework is at fault.

Optimal Workshop Treejack

Remote eye tracking/online eye tracking/visual attention tracking: Shows you where people focus their attention on your landing pages, layouts, and branding materials. This can tell you whether the users are focused on the page, whether they are reading it or just scanning, how intense their attention is, and what pattern their gaze follows. However, it cannot tell you exactly whether your users actually see something or not, or why exactly they look at a given part. This can be remedied, for example, with voiceovers, where the participants tell you right away what they feel.

a) Simulated: creates measurement reports that predict what a real person would most likely look at.


b) Behavioral: finds out whether people actually notice conversion-oriented elements of the page and how much attention they pay to them.

Attensee Eyetrack Shop

These are the singular features which are prominent elements of the listed services. However, nowadays there is a trend to combine various tools together, so they can be offered by a single website. If you happen to find more than one tool valuable for your business, you can use services such as UsabilityTools or UserZoom.

Customer feedback

Successful business owners know that it’s crucial to take some time to obtain customer feedback. Understanding what your customers think about your products and services will not only help you improve quality, but will also give you insights into what new products and services your customers want. Knowing what your customers think you’re doing right or wrong also lets you make smart decisions about where to focus your energies.

Live chats: An easy-to-understand way of communicating through the website interface in real time. Live chat enables you to provide the answers your customers want. By analyzing their questions and frequently raised issues, you can decide what needs improvement. Live chats usually focus on solving an immediate problem, so they are usually used for smaller issues. The plus is that your client will feel acknowledged right away.

Live Person

Insight surveys: They help you understand your customers through targeted website surveys. You can create targeted surveys and prompts by focusing them on variables such as the time on page, the number of visits, the referring search term, or your own internal data. You can even target custom variables such as the number of items in a shopping cart. However, they are very specific and operate on the same principle as general surveys, so keep in mind the risk that the survey participants won’t always be able to provide you with satisfactory answers.


Feedback forms: They are a simple website application for receiving feedback from your website visitors. You can create a customized form, copy and paste code into your site’s HTML, and start getting feedback. This is a basic tool for getting feedback from your customers, and for receiving and organizing the results. If you want to know the general opinion about your website and the experiences of your visitors (and you want it to be completely voluntary), then forms are a great option.


Feedback forums: Users enter a forum where they can propose and vote on items which need change or need to be discussed. That information allows you to prioritize issues and decide what needs to be fixed as fast as possible. The forums can also be used for communicating with users; for example, you can inform them that you introduced some improvements to the product. Remember, however, that even the most popular issues might actually be the least important for improving your service and/or website. It is up to you to judge.

Get Satisfaction

Online customer communities: You refer to your customer directly, peer-to-peer, and offer problem solving and feedback. Those web-based gathering places for customers, experts, and partners enable you to discuss problems, post reviews, brainstorm new product ideas, and engage with one another.


There are also platforms that merge some of these functionalities, such as UserEcho or Freshdesk. They are an extremely popular answer to the growing demands of clients who prefer to focus on a single service with many features.

Website analytics

Just because analytics provide you with some additional data about your site doesn’t mean it’s actually valuable to your business. You want to find the errors and holes within your website and fill them with additional functionality for your users and customers. Using the information gathered you can influence your future decisions in order to improve your service.

Web analytics: All movement of the users is recorded and stored. However, their privacy is safe, as the data gathered is used only for optimization and cannot be tied to identifiable individuals. The data can later be used for evaluating and improving your service and website in order to achieve your goals, such as increasing the number of visitors or sales.

Google Analytics

In-page web analytics: They differ from traditional web analytics in that they focus on the users’ movement within a page rather than between pages. These tools are generally used to understand behavior for the purposes of optimizing a website’s usability and conversion.

a) Click tracking: This technique is used to determine and record what the users are clicking with their mouse while browsing the website. It draws you a map of their movements, which allows you to see your user’s journey step by step. If there is a problem with the website, this is one of the methods for checking where that problem could have occurred.

Gemius Heatmap

b) Visitor recording/user session replays: Every action and event is recorded as a video.


c) Form testing: This allows you to evaluate the web form and identify areas that need improvement, for example which fields make your visitors leave the website before completing the form.

UsabilityTools Conversion Suite

As with the previous groups, there is also a considerable number of analytic Swiss army knives offering various tools in one place. Examples include ClickTale, UsabilityTools, and MouseStats.


This is it: the finish line of this guide to online research tools. Used well, these tools are an extremely valuable asset which can provide important and surprising data. The number of tools available is indeed overwhelming, which is why you need to consider the criteria listed above: what people say versus what they do, and why versus how much. This way you will reach a conclusion about what exactly you need to test in order to improve your service or obtain the required information. Knowing what you want to do will help you narrow your choices and, as a result, choose the right tool. Hopefully, what you’ve read will help you choose the best usability tools for testing, and you will end up an expert in your research sessions.

Online Surveys On a Shoestring: Tips and Tricks

Written by: Gabriel Biller

Design research has always been about qualitative techniques. Increasingly, our clients ask us to add a “quant part” to projects, often without much or any additional budget. Luckily for us, there are plenty of tools available to conduct online surveys, from simple ones like Google Forms and SurveyMonkey to more elaborate ones like Qualtrics and Key Survey.

Whichever tool you choose, there are certain pitfalls in conducting quantitative research on a shoestring budget. Based on our own experience, we’ve compiled a set of tips and tricks to help avoid some common ones, as well as make your online survey more effective.

We’ve organized our thoughts around three survey phases: writing questions, finding respondents, and cleaning up data.

Writing questions

Writing a good questionnaire is both art and science, and we strongly encourage you to learn how to do it. Most of our tips here are relevant to all surveys, but particularly important for the low-budget ones. Having respondents who are compensated only a little, if at all, makes the need for good survey writing practices even more important.

Ask (dis)qualifying questions first

A sacred rule of surveys is to not waste people’s time. If there are terminating criteria, gather those up front and disqualify respondents as quickly as you can if they do not meet the profile. It is also more sensitive to terminate them with a message “Thank you for your time, but we already have enough respondents like you” rather than “Sorry, but you do not qualify for this survey.”

Keep it short

Little compensation means that respondents will drop out at higher rates. Only focus on what is truly important to your research questions. Ask yourself how exactly the information you collect will contribute to your research. If the answer is “not sure,” don’t ask.

For example, it’s common to ask about a level of education or income, but if comparing data across different levels of education or income is not essential to your analysis, don’t waste everyone’s time asking the questions. If your client insists on having “nice to know” answers, insist on allocating more budget to pay the respondents for extra work.

Keep it simple

Keep your target audience in mind and be a normal human being in framing your questions. Your client may insist on slipping in industry jargon and argue that “everyone knows what it is.” It is your job to make the survey speak the language of the respondents, not the client.

For example, in a survey about cameras, we changed the industry term “lifelogging” to a longer, but simpler phrase “capturing daily routines, such as commute, meals, household activities, and social interactions.”

Keep it engaging

People in real life don’t casually say, “I am somewhat satisfied” or “the idea is appealing to me.” To make your survey not only simple but also engaging, consider using more natural language for response choices.

For example, instead of using standard Likert-scale “strongly disagree” to “strongly agree” responses to the statement “This idea appeals to me” in a concept testing survey, we offered a scale “No, thanks” – “Meh” – “It’s okay” – “It’s pretty cool” – “It’s amazing.” We don’t know for sure if our respondents found this approach more engaging (we certainly hope so), but our client showed a deeper emotional response to the results.

Finding respondents

Online survey tools differ in how much help they provide with recruiting respondents, but most common tools will assist in finding the sample you need, if the profile is relatively generic or simple. For true “next to nothing” surveys, we’ve used Amazon Mechanical Turk (mTurk), SurveyMonkey Audience, and our own social networks for recruiting.

Be aware of quality

Cheap recruiting may easily result in low quality data. While low-budget surveys will always be vulnerable to quality concerns, there are mechanisms to ensure that you keep your quality bar high.

First of all, know what motivates your respondents. Amazon mTurk commonly pays $1 for the so-called “Human Intelligence Task” that may include taking an entire survey. In other words, someone is earning as little as $4 an hour if they complete four 15-minute surveys. As such, some mTurk Workers may try to cheat the system and complete multiple surveys for which they may not be qualified.

SurveyMonkey, on the other hand, claims that their Audience service delivers better quality, since the respondents are not motivated by money. Instead of compensating respondents, SurveyMonkey makes a small donation to the charity of their choice, thus lowering the risk of people being motivated to cheat for money.

Use social media

If you don’t need thousands of respondents and your sample is pretty generic, the best resource can be your social network. For surveys with fewer than 300 respondents, we’ve had great success with tapping into our collective social network of Artefact’s members, friends, and family. Write a request and ask your colleagues to post it on their networks. Of course, volunteers still need to match the profile. When we send an announcement, we include a very brief description of who we look for and send volunteers to a qualifying survey. This approach costs little but yields high-quality results.

We don’t pay our social connections for surveys, but many will be motivated to help a friend and will be very excited to hear about the outcomes. Share with them what you can as a “thank you” token.

For example, we used social network recruiting in early stages of Purple development. When we revealed the product months later, we posted a “thank you” link to the article to our social networks. Surprisingly even for us, many remembered the survey they took and were grateful to see the outcomes of their contribution.


Over-recruit

If you are trying to hit a certain sample size for “good” data, you need to over-recruit so that you can remove the “bad” data. No survey is perfect and all can benefit from over-recruiting, but it’s almost a must for low-budget surveys. There are no rules, but we suggest over-recruiting by at least 20% to hit the sample size you need at the end. Since the whole survey costs you little, over-recruiting will equally cost little.
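As a quick illustration of the arithmetic, the sketch below computes how many respondents to recruit for a given target. The function name and the integer-percent parameter are hypothetical; the 20% default comes from the suggestion above, and it is read here as "recruit the target plus 20% of the target."

```python
import math


def recruits_needed(target_sample, over_recruit_pct=20):
    """Respondents to recruit so that target_sample can remain after cleanup.

    over_recruit_pct is a whole-number percentage added on top of the target
    (an assumed reading of "over-recruiting by at least 20%").
    """
    return math.ceil(target_sample * (100 + over_recruit_pct) / 100)


print(recruits_needed(300))  # recruit this many to end with about 300
```

With this reading, a target of 300 usable responses means recruiting 360 people, which leaves room to discard up to 60 responses during cleanup.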

Cleaning up data

Cleaning up your data is another essential step of any survey that is particularly important for the one on a tight budget. A few simple tricks can increase the quality of responses, particularly if you use public recruiting resources. When choosing a survey tool, check what mechanisms are available for you to clean up your data.

Throw out duplicates

As mentioned earlier, some people may be motivated to complete the same survey multiple times and even under multiple profiles. We’ve spotted this when working with mTurk respondents by checking their Worker IDs. We had multiple cases when the same IDs were used to complete a survey multiple times. We ended up throwing away all responses associated with the “faulty IDs” and gained more confidence in our data at the end.
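The duplicate check described above can be sketched in a few lines. This is only an illustration: it assumes each response is a dictionary with a hypothetical "worker_id" field, and it discards every response tied to an ID that appears more than once, as we did.

```python
from collections import Counter


def drop_repeat_workers(responses, id_field="worker_id"):
    """Split responses into those to keep and the IDs that were flagged.

    Any ID seen more than once is treated as faulty, and all of its
    responses are discarded. The field name is an assumption.
    """
    counts = Counter(r[id_field] for r in responses)
    kept = [r for r in responses if counts[r[id_field]] == 1]
    flagged = sorted(wid for wid, n in counts.items() if n > 1)
    return kept, flagged


# Made-up example data
responses = [
    {"worker_id": "A1", "answer": "yes"},
    {"worker_id": "B2", "answer": "no"},
    {"worker_id": "A1", "answer": "no"},
]
kept, flagged = drop_repeat_workers(responses)
print(kept)     # only the response from B2 survives
print(flagged)  # A1 appears twice, so it is flagged
```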

Check response time

With SurveyMonkey, you can calculate the time spent on the survey using the StartTime and EndTime data. We benchmarked the average time of the survey by piloting the survey in the office. This can be used as a pretty robust fool-proof mechanism.

If the benchmark time is eight minutes and you have surveys completed in three, you may question how carefully respondents were reading the questions. We flag such outliers as suspect and don’t include them in our analysis.
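A response-time check along these lines might look like the sketch below. It assumes the start and end times have been exported as ISO-style timestamp strings, and it uses a cutoff of half the benchmark time. Both the timestamp format and the 0.5 cutoff are assumptions for illustration, not a rule from any survey tool.

```python
from datetime import datetime


def minutes_spent(start_time, end_time):
    """Minutes between two ISO-style timestamps (assumed export format)."""
    start = datetime.fromisoformat(start_time)
    end = datetime.fromisoformat(end_time)
    return (end - start).total_seconds() / 60


def is_suspiciously_fast(start_time, end_time, benchmark_minutes, min_fraction=0.5):
    """Flag a response completed in less than min_fraction of the benchmark.

    min_fraction=0.5 is an arbitrary illustrative cutoff; choose your own.
    """
    return minutes_spent(start_time, end_time) < benchmark_minutes * min_fraction


# Benchmark of eight minutes, survey finished in three: flagged as suspect.
print(is_suspiciously_fast("2024-03-01 10:00:00", "2024-03-01 10:03:00", 8))
```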

Add a dummy question

Dummy questions help filter out respondents who answer survey questions quickly and at random. Dummy questions require the respondent to read carefully and then respond. People who click and type at random might answer correctly, but it is unlikely. If the answer is incorrect, this is another flag we use to mark a respondent’s data as suspect.
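Checking a dummy question is simple to automate. The sketch below assumes a hypothetical question with the ID "q_check" that instructs respondents to select "Blue"; both the ID and the expected answer are made up for illustration.

```python
def failed_attention_check(response, question_id="q_check", expected="Blue"):
    """Return True when the dummy question is missing or answered incorrectly.

    question_id and expected are hypothetical; use whatever dummy question
    your own survey contains.
    """
    answer = str(response.get(question_id, "")).strip().lower()
    return answer != expected.lower()


# Made-up responses: the second one picked the wrong option.
careful = {"q1": "Agree", "q_check": "Blue"}
careless = {"q1": "Agree", "q_check": "Green"}
print(failed_attention_check(careful), failed_attention_check(careless))
```

A failed check is best treated as one flag among several (alongside duplicates and response time) rather than as proof of a bad response.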

Low-budget surveys are challenging, but not necessarily bad, and with a few tricks you can make them much more robust. If they are used as an indicative, rather than definitive, mechanism to supplement other design research activities, they can bring “good enough” insights to a project.

Educate your clients about the pros and cons of low-budget surveys and help them make a decision whether or not they want to invest more to get greater confidence in the quantitative results. Setting these expectations up front is critical for the client, but you never know, it could also be a good tool for negotiating a higher survey budget to begin with!

Creativity Must Guide the Data-Driven Design Process

Written by: Rameet Chawla

Collecting data about design is easy in the digital world. We no longer have to conduct in-person experiments to track pedestrians’ behavior in an airport terminal or the movement of eyeballs across a page. New digital technologies allow us to easily measure almost anything, and apps, social media platforms, websites, and email programs come with built-in tools to track data.

And, as of late, data-driven design has become increasingly popular. As a designer, you no longer need to convince your clients of your design’s “elegance,” “simplicity,” or “beauty.” Instead of those subjective measures, you can give them data: click-through and abandonment rates, statistics on the number of installs, retention and referral counts, user paths, cohort analyses, A/B comparisons, and countless other analytical riches.

After you’ve mesmerized your clients with numbers, you can draw a few graphs on a whiteboard and begin claiming causalities. Those bad numbers? They’re showing up because of what you told the client was wrong with the old design. And the good numbers? They’re showing up because of the new and improved design.

But what if it’s not because of the design? What if it’s just a coincidence?

There are two problems with the present trend toward data-driven design: using the wrong data, and using data at the wrong time.

The problem with untested hypotheses

Let’s say you go through a major digital redesign. Shortly after you launch the new look, the number of users hitting the “share” button increases significantly. That’s great news, and you’re ready to celebrate the fact that your new design was such a success.

But what if the new design had nothing to do with it? You’re seeing a clear correlation—two seemingly related events that happened around the same time—but that does not prove that one caused the other.

Steven D. Levitt and Stephen J. Dubner, the authors of “Freakonomics,” have built a media empire on exposing the difference between correlation and causation. My favorite example is their analysis of the “broken windows” campaign carried out by New York City Mayor Rudy Giuliani and Police Commissioner William Bratton. The campaign coincided with a drop in the city’s crime rate. The officials naturally took credit for making the city safer, but Levitt and Dubner make a very strong case that the crime rate declined for reasons other than their campaign.

Raw data doesn’t offer up easy conclusions. Instead, look at your data as a generator of promising hypotheses that must be tested. Is your newly implemented user flow the cause of a spike in conversion rates? It might be, but the only way you’ll know is by conducting an A/B test that isolates that single variable. Otherwise, you’re really just guessing, and all that data you have showing the spike doesn’t change that.
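One common way to judge the result of such an A/B test is a two-proportion z-test, sketched below. This is an illustrative choice of method rather than the only valid one, and the conversion counts are invented. It also assumes visitors were randomly assigned to the two variants, which is what lets the test isolate the single variable.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two conversion rates.

    conv_* are conversion counts and n_* are visitor counts per variant.
    Returns (z, p_value) using the pooled standard error.
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Invented numbers: old flow converts 100 of 1000, new flow 130 of 1000.
z, p_value = two_proportion_z_test(100, 1000, 130, 1000)
print(round(z, 2), round(p_value, 3))
```

A small p-value suggests the difference is unlikely to be a coincidence, but only under the random-assignment assumption; a before-and-after comparison around a launch date does not meet it.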

Data can’t direct innovation

Unfortunately, many designers are relying on data instead of creativity. The problem with using numbers to guide innovation is that users typically don’t know what they want, and no amount of data will tell you what they want. Instead of relying on data from the outset, you have to create something and give it to users before they can discover that they want it.

Steve Jobs was a big advocate of this method. He didn’t design devices and operating systems by polling users or hosting focus groups. He innovated and created, and once users saw what he and his team had produced, they fell in love with a product they hadn’t even known they wanted.

Data won’t tell you what to do during the design process. Innovation and creativity have to happen before data collection, not after. Data is best used for testing and validation.

Product development and design is a cyclical process. During the innovation phase, creativity is often based on user experience and artistry — characteristics that aren’t meant to be quantified on a spreadsheet. Once a product is released, it’s time to start collecting data.

Perhaps the data will reveal a broken step in the user flow. That’s good information because it directs your attention to the problem. But the data won’t tell you how to fix the problem. You have to innovate again, then test to see if you’ve finally fixed what was broken.

Ultimately, data and analysis should be part of the design process. We can’t afford to rely on our instincts alone. And with the wealth of data available in the digital domain, we don’t have to. The unquantifiable riches of the creative process still have to lead design, but applying the right data at the right time is just as important to the future of design.