The success of every business depends on how well it meets its customers’ needs. To do that, it is important to optimize your offer, your website, and your selling methods so your customers are satisfied. The fields of online marketing, conversion rate optimization, and user experience design offer a wide range of online tools that can guide you through this process smoothly. Many companies use only the one or two tools they are familiar with, but that might not be enough to gather the data necessary for improvement. To help you better understand when and which tool is valuable to use, I created a framework that can help in your assessment. Once you broaden your horizons, it will be easier to choose the set of tools aligned with your business’s needs.
Design research has always been about qualitative techniques. Increasingly, our clients ask us to add a “quant part” to projects, often without much or any additional budget. Luckily for us, there are plenty of tools available to conduct online surveys, from simple ones like Google Forms and SurveyMonkey to more elaborate ones like Qualtrics and Key Survey.
Whichever tool you choose, there are certain pitfalls in conducting quantitative research on a shoestring budget. Based on our own experience, we’ve compiled a set of tips and tricks to help avoid some common ones, as well as make your online survey more effective.
We’ve organized our thoughts around three survey phases: writing questions, finding respondents, and cleaning up data.
Writing a good questionnaire is both art and science, and we strongly encourage you to learn how to do it well. Most of our tips here are relevant to all surveys, but they are particularly important for low-budget ones: when respondents are compensated only a little, if at all, good survey-writing practice matters even more.
Ask (dis)qualifying questions first
A sacred rule of surveys is to not waste people’s time. If there are terminating criteria, gather those up front and disqualify respondents as quickly as you can if they do not meet the profile. It is also more sensitive to terminate them with a message “Thank you for your time, but we already have enough respondents like you” rather than “Sorry, but you do not qualify for this survey.”
Keep it short
Little compensation means that respondents will drop out at higher rates. Focus only on what is truly important to your research questions. Ask yourself how exactly the information you collect will contribute to your research. If the answer is “not sure,” don’t ask.
For example, it’s common to ask about a level of education or income, but if comparing data across different levels of education or income is not essential to your analysis, don’t waste everyone’s time asking the questions. If your client insists on having “nice to know” answers, insist on allocating more budget to pay the respondents for extra work.
Keep it simple
Keep your target audience in mind and be a normal human being in framing your questions. Your client may insist on slipping in industry jargon and argue that “everyone knows what it is.” It is your job to make the survey speak the language of the respondents, not the client.
For example, in a survey about cameras, we changed the industry term “lifelogging” to a longer, but simpler phrase “capturing daily routines, such as commute, meals, household activities, and social interactions.”
Keep it engaging
People in real life don’t casually say, “I am somewhat satisfied” or “the idea is appealing to me.” To make your survey not only simple but also engaging, consider using more natural language for response choices.
For example, instead of using standard Likert-scale “strongly disagree” to “strongly agree” responses to the statement “This idea appeals to me” in a concept testing survey, we offered a scale “No, thanks” – “Meh” – “It’s okay” – “It’s pretty cool” – “It’s amazing.” We don’t know for sure if our respondents found this approach more engaging (we certainly hope so), but our client showed a deeper emotional response to the results.
Finding respondents
Online survey tools differ in how much help they provide with recruiting respondents, but most common tools will assist in finding the sample you need, if the profile is relatively generic or simple. For true “next to nothing” surveys, we’ve used Amazon Mechanical Turk (mTurk), SurveyMonkey Audience, and our own social networks for recruiting.
Be aware of quality
Cheap recruiting can easily result in low-quality data. While low-budget surveys will always be vulnerable to quality concerns, there are mechanisms to keep your quality bar high.
First of all, know what motivates your respondents. Amazon mTurk commonly pays $1 for a so-called “Human Intelligence Task,” which may include taking an entire survey. In other words, someone is earning as little as $4 an hour if they complete four 15-minute surveys. As such, some mTurk Workers may try to cheat the system and complete multiple surveys for which they may not be qualified.
SurveyMonkey, on the other hand, claims that their Audience service delivers better quality, since the respondents are not motivated by money. Instead of compensating respondents, SurveyMonkey makes a small donation to the charity of their choice, thus lowering the risk of people being motivated to cheat for money.
Use social media
If you don’t need thousands of respondents and your sample is fairly generic, the best resource can be your own social network. For surveys with fewer than 300 respondents, we’ve had great success tapping into the collective social network of Artefact’s members, friends, and family. Write a request and ask your colleagues to post it on their networks. Of course, volunteers still need to match the profile: when we send an announcement, we include a very brief description of whom we’re looking for and send volunteers to a qualifying survey. This approach costs little but yields high-quality results.
We don’t pay our social connections for surveys, but many will be motivated to help a friend and will be very excited to hear about the outcomes. Share with them what you can as a “thank you” token.
For example, we used social network recruiting in early stages of Purple development. When we revealed the product months later, we posted a “thank you” link to the article to our social networks. Surprisingly even for us, many remembered the survey they took and were grateful to see the outcomes of their contribution.
Over-recruit
If you are trying to hit a certain sample size of “good” data, you need to over-recruit so you can remove the “bad” data. No survey is perfect and all can benefit from over-recruiting, but it’s almost a must for low-budget surveys. There are no hard rules, but we suggest over-recruiting by at least 20% to hit the sample size you need at the end. Since the whole survey costs you little, the over-recruiting will cost equally little.
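The arithmetic is trivial, but it helps to make it explicit when budgeting a panel. A minimal Python sketch (the 20% pad mirrors our suggestion above; tune it to your recruiting source's quality):

```python
import math

def recruits_needed(target: int, overage: float = 0.20) -> int:
    """Respondents to recruit, padding the target sample by `overage`
    to absorb the responses you expect to throw out during cleanup."""
    return math.ceil(target * (1.0 + overage))

print(recruits_needed(300))        # 360 recruits for 300 clean responses
print(recruits_needed(1000, 0.3))  # 1300 recruits with a riskier 30% pad
```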
Cleaning up data
Cleaning up your data is another essential step of any survey that is particularly important for the one on a tight budget. A few simple tricks can increase the quality of responses, particularly if you use public recruiting resources. When choosing a survey tool, check what mechanisms are available for you to clean up your data.
Throw out duplicates
As mentioned earlier, some people may be motivated to complete the same survey multiple times, even under multiple profiles. We’ve spotted this when working with mTurk respondents by checking their Worker IDs: we had multiple cases where the same ID was used to complete a survey several times. We ended up throwing away all responses associated with the “faulty” IDs and gained more confidence in our data as a result.
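That cleanup step is easy to script. A Python sketch, assuming each response record carries a `worker_id` field (the actual column name in your export may differ):

```python
from collections import Counter

def drop_duplicate_workers(responses):
    """Drop *all* responses tied to a Worker ID that appears more than
    once: repeated IDs are treated as untrustworthy outright rather
    than deduplicated down to a single 'first' answer."""
    counts = Counter(r["worker_id"] for r in responses)
    return [r for r in responses if counts[r["worker_id"]] == 1]

responses = [
    {"worker_id": "A1", "answer": "yes"},
    {"worker_id": "B2", "answer": "no"},
    {"worker_id": "A1", "answer": "no"},   # same Worker ID, second pass
]
clean = drop_duplicate_workers(responses)
print([r["worker_id"] for r in clean])  # ['B2']
```

Note the deliberate choice to discard every response from a repeated ID, not just the extras; once an ID looks gamed, none of its answers can be trusted.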
Check response time
With SurveyMonkey, you can calculate the time a respondent spent on the survey from the StartTime and EndTime data. We benchmarked the average completion time by piloting the survey in the office, which makes for a fairly robust screening mechanism.
If the benchmark time is eight minutes and you have surveys completed in three, you may question how carefully respondents were reading the questions. We flag such outliers as suspect and don’t include them in our analysis.
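A rough Python sketch of that flag, assuming timestamps in a simple `YYYY-MM-DD HH:MM:SS` format (your export's actual format may differ) and flagging anything completed in under half the benchmark time:

```python
from datetime import datetime

BENCHMARK_MINUTES = 8     # average completion time from our office pilot
SUSPECT_RATIO = 0.5       # flag anything under half the benchmark

def is_suspect(start: str, end: str) -> bool:
    """Return True if the survey was completed implausibly fast."""
    fmt = "%Y-%m-%d %H:%M:%S"
    elapsed = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return elapsed.total_seconds() / 60 < BENCHMARK_MINUTES * SUSPECT_RATIO

print(is_suspect("2024-05-01 10:00:00", "2024-05-01 10:03:00"))  # True
print(is_suspect("2024-05-01 10:00:00", "2024-05-01 10:09:30"))  # False
```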
Add a dummy question
Dummy questions help filter out respondents who answer survey questions at random. A dummy question requires the respondent to read carefully before responding; people who click and type at random might still answer it correctly, but it is unlikely. If the answer is incorrect, that is another flag we use to mark a respondent’s data as suspect.
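The resulting filter is a one-liner. A toy Python sketch, assuming a `dummy` field holding the answer to an attention-check item such as “Select ‘Rarely’ for this question” (the field name and expected answer are made up for illustration):

```python
EXPECTED = "Rarely"   # the answer the attention-check item asks for

records = [
    {"id": 1, "dummy": "Rarely"},   # read the question, answered as asked
    {"id": 2, "dummy": "Often"},    # most likely clicking at random
]

kept = [r for r in records if r.get("dummy") == EXPECTED]
flagged = [r for r in records if r.get("dummy") != EXPECTED]

print([r["id"] for r in kept])     # [1]
print([r["id"] for r in flagged])  # [2]
```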
Low-budget surveys are challenging, but not necessarily bad, and with a few tricks you can make them much more robust. If they are used as an indicative, rather than definitive, mechanism to supplement other design research activities, they can bring “good enough” insights to a project.
Educate your clients about the pros and cons of low-budget surveys and help them decide whether they want to invest more to gain greater confidence in the quantitative results. Setting these expectations up front is critical for the client, and you never know: it could also be a good tool for negotiating a higher survey budget to begin with!
Web site optimization has become an essential capability in today’s conversion-driven web teams. In Part 1 of this series, we introduced the topic as well as discussed key goals and philosophies. In Part 2, I presented a detailed and customizable process. In this final article, we’ll cover communication planning and how to select the appropriate team and tools to do the job.
For many organizations, communicating the status of optimization tests is an essential practice. Imagine that your team has just launched an A/B test on your company’s homepage, only to learn that another team released new code the previous day that changed the homepage design entirely. Or imagine a customer support agent trying to help a user through the website’s forgot-password flow, unaware that the customer is seeing a different version because of an A/B test your team is running.
To avoid these types of problems, I recommend a three-step communication program:
- Pre-test notification
This is an email announcing that your team has selected a certain page or section of the site to target for its next optimization test, and that anyone with concerns should voice them immediately, before your team starts working on it. Give folks a day or two to respond. The email should include:
- Name/brief description of the test
- Affected pages
- Expected launch date
- Link to the task or project plan where others can track the status of the test
Here’s a sample pre-test notification.
- Pre-launch notification
This email is sent out a day or two before a new experiment launches. It includes all of the information from the Pre-Test Notification email, plus:
- Expected test duration
- Link to the results dashboard, if your tool provides one (some optimization tools create a unique page where interested parties can monitor test results in real time)
- Any other details that you care to mention, such as variations, traffic allocation, etc.
Here’s a sample pre-launch email.
- Test results
After the test has run its course and you’ve compiled the results into the Optimization Test Results document, send out a final email to communicate this. If you have a new winner, be sure to brag about it a little in the email. Other details may include:
- Brief discussion
- A few specifics, such as conversion rates, traffic and confidence intervals
- Next steps
Here’s a sample test results email.
Team size and selection
As is true with many things, good people are the most important aspect of a successful optimization program. Find competent people with curious minds who take pride in their work – this will be far more valuable than investment in any optimization tool or adherence to specific processes.
The following are recommendations for organizations of varying team sizes.
One person
It is difficult for one person to perform optimization well unless they are dedicated full-time to the job. If your organization can only cough up one resource, I would select either a web analytics resource with an eye for design, or a data-centric UX designer. For the latter profile, I don’t mean the type of designer who studied fine art and is only comfortable using Photoshop, but rather the type who likes wireframes, has poked around an analytics tool on their own, and is good with numbers. This person will also have to be resourceful and persuasive, since they will almost certainly have to borrow time and collaborate with others to complete the necessary work.
Two to three people
With a team size of two to three people, you are starting to get into the comfort zone. To the UX designer and web/data analytics roles, I would add either a visual designer or a front-end developer. Ideally, some team members would have multiple or overlapping competencies. The team will probably still have to borrow time from other resources, such as back-end developers and QA.
Four to five people
A team that is lucky enough to have five dedicated optimization resources has the potential to be completely autonomous. If your organization places such a high value on optimization, it may also have invested in sophisticated products or strategies for the job, such as complex testing software, data warehouses, etc. If so, you’ll need folks who are specifically adept at these tools, broadening your potential team to roles such as data engineers, back-end developers, content managers, project managers, or dedicated QA resources. A team of five would ideally have some overlap among skill sets.
Tool selection
The optimization market is hot and tool selection may seem complicated at first. The good news is that broader interest and increased competition are fueling an all-out arms race towards simpler, more user-friendly interfaces designed for non-technical folks. Data analysis and segmentation features also seem to be evolving rapidly.
My main advice if you’re new to optimization is to start small. Spend a year honing your optimization program, and after you’ve proven your value you can graduate to the more sophisticated (and expensive) tools. By the time you’re ready, your existing tool may well have advanced to keep up with your needs. Also realize that many of the cheaper tools can do the job perfectly well for most organizations, and that some organizations with the high-powered tools are not using them to their fullest capabilities.
A somewhat dated Forrester Research report from February 2013 assesses some of the big hitters, but notably absent are Visual Website Optimizer (VWO) and, at the very low end, Google’s free Content Experiments tool. Conversion Rate Experts keeps an up-to-date comparison table listing virtually all of today’s popular testing tools, but it only rates them along a few specific attributes.
I performed my own assessment earlier this year and here is a short list of my favorites:
- Visual Website Optimizer (VWO)
- Google Content Experiments
- Adobe Test & Target
Here are a few factors to consider when deciding on products:
Intuitive user interface
Luckily, most tools now have simple, WYSIWYG type of interfaces that allow you to directly manipulate your site content when creating test variations. You can edit text, change styles, move elements around, and save these changes into a new test variation. Some products have better implementations than others, so be sure to try out a few to find the best match for your team.
Targeting
Targeting allows you to specify which site visitors are allowed to see your tests. Almost all tools allow you to target site visitors based on basic attributes that can be inferred from their browser, IP address, or session. These attributes may include operating system, browser type/version, geographical location, day of week, time of day, traffic source (direct vs. organic vs. referral), and first-time vs. returning visitor.

More advanced tools also allow you to target individuals based on attributes (variables) that you define and programmatically place in your users’ browser sessions, cookies, or URLs. This allows you to start targeting traffic based on your organization’s own customer data. The most advanced tools allow you to import custom data directly into the tool’s database, giving you direct access to these attributes through their user interface, not only for targeting, but also for segmented analysis.
Analysis and reporting
Tools vary widely in their analysis and reporting capabilities, with the more powerful tools generally offering richer segmentation functionality. The simplest tools only allow you to view test results against a single dimension; for example, you can see how your test performed for visitors on mobile vs. desktop systems. The majority of tools now allow more complicated analyses along multiple dimensions and customized user segments. For example, you might be interested in how your test performed with visitors on mobile platforms, segmented by organic vs. paid vs. direct traffic.
Keep in mind that as your user segments become more specific, your optimization tool must rely on fewer and fewer data points to generate the results for each segment, thereby decreasing your confidence levels.
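You can see the effect with a quick back-of-the-envelope calculation using the normal approximation for a conversion rate's confidence interval (a standard formula, sketched here in Python):

```python
import math

def ci_half_width(p: float, n: int, z: float = 1.96) -> float:
    """Half-width of the normal-approximation 95% confidence interval
    for an observed conversion rate p measured over n visitors."""
    return z * math.sqrt(p * (1.0 - p) / n)

# A 5% conversion rate measured on the full test vs. ever-narrower segments:
for n in (20000, 2000, 200):
    print(f"n={n:>6}: 5.00% +/- {ci_half_width(0.05, n) * 100:.2f} points")
```

Each time a segment shrinks by a factor of ten, the margin of error grows by roughly a factor of three (the square root of ten), which is why deeply sliced segments need far more traffic to reach the same confidence.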
Server response time
Optimization tools work by adding a small snippet of code to your pages. When a user visits that page, the code snippet calls a server somewhere that returns instructions on which test variation to display to the user. Long server response times can delay page loading and the display of your variations, thereby affecting your conversions and reporting.
When shopping around, be sure to inquire about how the tool will affect your site’s performance. The more advanced tools are deployed on multiple, load-balanced CDNs and may include contractual service level agreements that guarantee specific server response times.
Customer support
Most optimization vendors provide a combination of online and telephone support, with some of the expensive solutions offering in-person set-up, onboarding, and training. Be sure to inquire about customer support when determining costs. A trick I’ve used in the past to test a vendor’s level of service is to call the customer support lines at different times of the day and see how fast they pick up the phone.
Price and cost structure
Your budget may largely determine your optimization tool options as prices vary tremendously, from free (for some entry tools with limited features) to six-figure annual contracts that are negotiated based on website traffic and customer support levels (Maxymiser, Monetate and Test & Target fall into this latter category).
Tools also vary in their pricing model, with some basing costs on the amount of website traffic and others charging more for increased features. My preference is towards the latter model, since the former is sometimes difficult to predict and provides a disincentive to perform more testing.
Integration with CMS/analytics/marketing platforms
Automated segmentation and targeting
Some of the advanced tools offer automated functionality that tries to analyze your site’s conversions and notify you of high-performing segments. These segments may be defined by any combination of recognizable attributes, and thus be far more complex than what your team could define on its own. For example, the tool might define one segment as female users on the Windows platform, living in California, who visited your site within the past 30 days. It might define a dozen or more of these complex micro-segments and, even more impressively, allow you to automatically redirect all future traffic to the winning variations specific to each segment. If implemented well, this intelligent segmentation has tremendous potential for your overall site conversions. The largest downside is that it usually requires a lot of traffic to make accurate predictions.
Automated segmentation is often an added cost to the base price of the optimization tool. If so, consider asking for a free trial period to evaluate the utility/practicality of this functionality before making the additional investment.
Synchronous vs. asynchronous page loading
Most tools recommend that you implement their services in an asynchronous fashion; in other words, that you allow the rest of your page’s HTML to load before pinging their service and potentially loading one of the test variations you created. The benefit of this approach is that your users won’t have to wait additional time before your control page starts to render in the browser. The drawback is that once the call to the optimization service returns, your users may see a page flicker as the control page is replaced by one of your test variations. This flickering effect, along with the additional time it takes to display the test variations, could potentially skew test results or surprise and confuse your users.
In contrast, synchronous page loading, which is recommended by some of the more advanced tools, makes the call to the optimization tool before the rest of the page loads. This ensures that your control group and variations are all displayed in the same relative amount of time, which should allow for more accurate test results. It also eliminates the page flicker effect inherent in asynchronous deployments.
By far, the most difficult step in any web site optimization program is the first one – the simple act of starting. With this in mind, I’ve tried to present a complete and practical guide on how to get you from this first step through to a mature program. Please feel free to send me your comments as well as your own experiences. Happy optimizing.
The Jewish Torah teaches that the Creator created our world through ten utterances–for example, “let there be light.”
The Jewish mystical tradition explains that these utterances correspond with ten stages in the process of creation. Every creative process in the world ultimately follows this progression, because it is really a part of the continual unfolding of the world itself, in which we are co-creators.
This article aims to present an overview of the mystical process of creation and the principle of co-creation, and to illustrate how it can guide bringing digital product ideas into reality–although it’s easy enough to see how this could translate to other products and services–in a way that ensures a great user experience, makes our creative process more natural, and makes its outcomes more fruitful.
And a note as you read: In Jewish mysticism, the pronoun “He” is used when referring to the transcendent aspect of the Creator that is the source of creation, and “She” when referring to the immanent aspect that pervades creation, because they are characterized by giving and receiving, respectively. Because this article discusses the transcendent aspect, the masculine pronoun is used.
The process of creation
Ten stages, four realms
The ten stages in the process of creation progressively create four realms.
Three triads create three spiritual realms, and the tenth stage creates our tangible reality, which is the culmination of creation. It is understood that creation becomes increasingly defined and tangible as the creative power flows from one realm to the next. When we participate in creation, our efforts naturally follow the same progression.
The four realms are traditionally referred to by Hebrew terms, so to make things easier I’ll refer to them using a designer’s day-to-day terms–ideation, design, implementation, and operation.
Before we dive in though, one more thing to note is that within each realm there is a three-stage pattern whereby the creation first becomes revealed, then delineated, and finally consolidated in a state of equilibrium. Hang in there, you’ll shortly see what this means.
The realm of ideation
In the beginning there was only the Creator, alone.
In the first three stages of creation, He simply created the possibility for a creation. This corresponds with the generation of business ideas.
Just as, before there was anything else, the idea to create the world had to arise in the Creator’s mind, so too the starting point of all products and services is the emergence of an idea–a simple and common example of which is “a digital channel will help our customers connect with us.”
Next, the seed sprouts a series of details to define it. In creation, the details included the fact that creation will be limited and that there is an order to its unfolding. In business, the idea undergoes an extrapolation to define its reach and scope. For example, “the digital channel will need product information, a shopping cart, a customer database, and a social function for customers’ reviews.”
The third stage in the process of creation is the preparation for bridging the gap between the abstract realm of potential where the Creator is still effectively alone, with a new reality of seemingly separate creations. Correspondingly, in business the third step requires bringing the idea from a place of theory to a point that it can be shared with others, such as presenting to decision makers and stakeholders, or briefing agents and consultants.
The realm of design
Now that it’s possible to distinguish between the Creator and His creation, the next three stages serve to coalesce the homogeneous creation into spiritual templates. This corresponds with the conceptual design of how the business idea may be realized.
The first stage in this realm is an expression of the Creator’s kindness, as He indiscriminately bestows life to all of creation. Correspondingly, the design process begins with telling the end-to-end story of the idea, from the user discovering the new product or service through to their consummate pleasure in using it, without our being too concerned with practical considerations. This could be captured in business process diagrams, but human-centred user journey maps or storyboards have proven more natural.
Next, the Creator expressed His attribute of judgement to establish the boundaries of His evolving creations. In business, we begin addressing practical considerations, such as time, budget, and technical constraints to define the boundaries of the concept. This generally involves analyzing the desired story to establish the finite set of practical requirements for realizing it. For digital products, the requirements are often closely followed by a business case, an information architecture, and a system architecture.
As mentioned, the third stage is where a consolidated state of equilibrium is reached to form the output of the realm. In creation, mystics describe the culmination of this realm as being sublime angels who are only identified by their function–for example to heal or to enact justice–and consider them to be the templates for these attributes, as they become manifest in the lower beings.
Similarly, we consolidate the business idea by sketching or prototyping how we envision it will become manifest. Typically we deliver low-fidelity interaction, product or service designs, which are often accompanied by a business plan and functional and technical specifications.
The realm of implementation
Using the spiritual templates, the next three stages serve to create individualized spiritual beings. This corresponds with implementing our conceptual designs into an actual digital product.
In creation, the life-force is now apportioned according to the ability of the created being to receive, similar to pouring hot liquid material into a statue mould. Correspondingly, we apply branding, colors, and shapes to bring the blueprint to life–the result being high-fidelity visual designs of what the digital product will actually look and feel like.
Next, the life-force solidifies to form the individual spiritual being, similar to when the hot liquid cools and the mould can be removed. This corresponds with slicing the visual designs to develop the front-end, developing the database, and integrating the back-end functionality.
The culmination of this realm is often depicted in artwork and poetry as being angels that have human form, wings, and individual names. They are, however, still spiritual beings, not physical beings like us. Correspondingly, at the final stage of implementation, there exists a fully functional digital product…in a staging environment.
The realm of operation
The culmination of the process of creation is our tangible reality, which is composed of physical matter and its infused life-force (part of which is our physical bodies infused with our souls). Bridging the infinitely large gap between the spiritual and physical realms is often considered the most profound step in the process of creation; yet, paradoxically, it is also the smallest conceptual step–from a spiritual being that looks and functions like a physical being to an actual physical being.
Correspondingly, launching a digital product into the live public domain can be the most daunting and exciting moment, yet it can be as easy as pressing a button to redirect the domain to point to the new web-server or to release the app on the app store.
At this point the Creator is said to have rested, observing His creation with pleasure. Similarly, it can be very satisfying to step back at this point and soak in how our initial seed of an idea has finally evolved into an actual operational reality–which will hopefully fulfill our business goals!
The principle of co-creation
By now we can appreciate why there seems to be a natural and logical sequence for the activities typically involved in creating a new product or service. Jewish mysticism, however, unequivocally adds that we are co-creators with the Creator. That is: we, created beings, are able to influence what the end product of creation will be, just as users can influence our products and services when we engage them during the creation process.
Jewish mysticism relates that the Creator consults with His retinue of angels to make decisions regarding His creation. This corresponds with our soliciting user input to validate the direction of our creative efforts, such as:
- during ideation, conducting research to ensure the ideas indeed meet users’ needs and desires;
- during design, conducting user validation to ensure the sensibility and completeness of the story, correlation of the framework with users’ mental models, and usability of the blueprints; and
- during implementation, conducting user testing to help smooth out any remaining difficulties or doubts in the user experience.
We are also taught that the Creator is monitoring human activity and makes adjustments accordingly. Similarly, at the stage of operation, it’s good practice to steer the finished product to better achieve business goals by monitoring the usage analytics.
Finally, we’re taught that the Creator desires our prayers beseeching Him to change our reality, similar to how we’ve come to understand that the most potent consideration is user feedback on the fully operational product.
On the surface it still seems as though the process of creation is a cascading “waterfall,” but we see that our world is constantly evolving–for example, more efficient transport, more sophisticated communication, more effective health maintenance–seemingly through our learning from experience to improve our efforts. In a simple sense, this can be likened to the “agile” feedback loop where learnings from one round of production are used to influence and improve our approach to the next round.
Jewish mysticism teaches, however, that under the surface our genuine efforts below arouse a magnanimous bestowal of ever-increasingly refined life-force into the creation. This can be understood as similar to a pleased business owner allocating increasingly more budget to continue work on an evidently improving product or service.
These days, it is becoming more common for businesses to implement a continuous improvement program, whereby an ongoing budget is allocated for this purpose. The paradigm of continually looking for ways to more effectively meet user needs and achieve business goals, such that findings can be fed back into the process of fleshing out the idea, designing, and then implementing, perfectly parallels the reality that we are co-creating an ever more refined world using ever-deepening resources.
But how can a compounding improvement continue indefinitely? Jewish mysticism explains that as the unlimited creative power becomes exponentially more revealed within our limited reality, there will eventually come a grand crescendo with the revelation of the Creator’s essential being, which is neither unlimited nor limited, but both simultaneously. This will be experienced as the messianic era: “In that era, there will be neither famine or war, envy or competition, for good will flow in abundance and all the delights will be freely available as dust. The occupation of the entire world will be solely to know their Creator.”1
Users front of mind at every stage
Before we get there, however, the above shows how every stage of the creative process has a unique effect on the user experience of the end product or service, so it serves us well to ensure that:
- The initial business idea meets an actual need or fulfils an actual desire of our users
- The concept is designed to function according to the user’s understanding and expectations
- The product or service is implemented in a way that is appealing and easy to use
- The operating product or service is continually improved to meet users’ evolving needs
By knowing each stage and each skill set’s proper place in the sequence and how to incorporate our learnings and user sentiment, we can achieve a more natural creative process for ourselves, our peers, and our clients and ensure the end product or service offers the best possible user experience, indefinitely.
| Creative activity | Co-creation activity | Output |
| --- | --- | --- |
| Ideation | User research | User pain points |
| Design | User focus groups | |
| Implementation | User testing | Staging product |
| Operation | Usage analytics and user feedback | Ideas for improvement |
References and further reading
- Sefer Likutei Amarim, “The Tanya”, by the Alter Rebbe, Rabbi Schneur Zalman of Liadi
- Sefer HaMaamorim Melukatim, by the Lubavitcher Rebbe, Rabbi Menachem Mendel Schneerson
- Basi L’Gani, by the Rebbe Rayatz, Rabbi Yosef Yitzchak Shneerson.
- Beshaah Shehikdimu 5672, “Ayin Beis”, by the Rebbe Rashab, Rabbi Shalom DovBer Schneersohn
- Mishneh Torah, Sefer Shoftim, Melachim uMilchamot, Chapter 12, Halacha 5, by the Rambam, Rabbi Moses ben Maimon
In the previous article we talked about why site optimization is important and presented a few important goals and philosophies to impart to your team. I’d like to switch gears now and talk about more tactical matters, namely process.
Establishing a well-formed, formal optimization process is beneficial for the following reasons.
- It organizes the workflow and sets clear expectations for completion.
- It establishes quality-control standards that reduce bugs and errors.
- It adds legitimacy to the whole operation, so that if stakeholders question it, you can explain the logic behind the process.
At a high level, I suggest a weekly or bi-weekly optimization planning session to perform the following activities:
- Review ongoing tests to determine if they can be stopped or considered “complete” (see the boxed section below). For tests that have reached completion, the possibilities are:
- There is a decisive new winner. In this case, plan how to communicate and launch the change permanently to production.
- There is no decisive winner or the current version (control group) wins. In this case, determine if more study is required or if you should simply move on and drop the experiment.
- Review data sources and brainstorm new test ideas.
- Discuss and prioritize any externally submitted ideas.
How do I know when a test has reached completion?
Completion criteria are a somewhat tricky topic and seemingly guarded industry secrets. These define the minimum requirements that must be true in order for a test to be declared “completed.” My personal sense from reading/conferences is that there are no widely-accepted standards and that completion criteria really depend on how comfortable your team feels with the uncertainty that is inherent in experimentation. We created the following minimum completion criteria for my past team at DIRECTV Latin America. Keep in mind that these were bare-bones minimums, and that most of our tests actually ran much longer.
For further discussion of the rationale behind these completion criteria, please see Best Practices When Designing and Running Experiments later in this article.
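Completion criteria like these can be encoded as a simple pre-flight check before declaring a test done. The thresholds below are illustrative placeholders, not the DIRECTV figures mentioned above; a minimal sketch:

```python
# Sketch of "minimum completion criteria" as a pre-flight check.
# All thresholds are illustrative placeholders, not DIRECTV's figures.

def meets_completion_criteria(days_running, visitors_per_variation,
                              conversions_per_variation,
                              min_days=14, min_visitors=1000, min_conversions=25):
    """True only when every bare-minimum criterion has been met."""
    return (days_running >= min_days
            and all(v >= min_visitors for v in visitors_per_variation)
            and all(c >= min_conversions for c in conversions_per_variation))

print(meets_completion_criteria(7, [5000, 5100], [120, 140]))   # too few days: False
print(meets_completion_criteria(21, [5000, 5100], [120, 140]))  # every criterion met: True
```

In practice you would treat these as minimums only; as noted above, most tests should run much longer.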
The creation of a new optimization test may follow a process that is similar to your overall product development lifecycle. I suggest the following basic structure:
The following diagram shows a detailed process that I’ve used in the past.
Step 1: Data analysis and deciding what to test
Step one in the optimization process is figuring out where to first focus your efforts. We used the following list as a loose prioritization guideline:
- Recent product releases, or pages that have not yet undergone optimization
- High-“value” pages:
  1. High revenue (e.g., shopping cart checkout pages, detail pages of your most expensive products)
  2. High traffic (e.g., homepage, login/logout)
  3. Highly “strategic” (pages that are highly visible internally or that management considers important)
- Poorly performing pages:
  1. Low conversion rate
  2. High bounce rate (for an excellent discussion of bounce rate, see Avinash Kaushik’s article)
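A loose guideline like this can be applied with a simple scoring function that ranks candidate pages. The weights, field names, and sample pages below are assumptions made for this sketch, not a prescribed formula:

```python
# Illustrative scoring of the loose prioritization guideline above.
# Weights, field names, and sample data are assumptions for this sketch.

def priority_score(page):
    """Higher score = better candidate for the next optimization test."""
    score = 0.0
    # Recent releases and never-optimized pages jump the queue.
    if page.get("recently_released") or not page.get("optimized", False):
        score += 3
    # High-"value" pages: revenue, traffic, strategic visibility.
    score += 2 * page.get("revenue_rank", 0.0)   # 0..1, normalized revenue
    score += 2 * page.get("traffic_rank", 0.0)   # 0..1, normalized traffic
    if page.get("strategic"):
        score += 1
    # Poorly performing pages: low conversion, high bounce.
    score += 2 * (1 - page.get("conversion_rank", 1.0))  # low conversion scores high
    score += 2 * page.get("bounce_rank", 0.0)            # high bounce scores high
    return score

pages = [
    {"name": "checkout", "optimized": True, "revenue_rank": 1.0,
     "traffic_rank": 0.6, "conversion_rank": 0.4, "bounce_rank": 0.2},
    {"name": "new landing page", "optimized": False, "revenue_rank": 0.3,
     "traffic_rank": 0.9, "conversion_rank": 0.2, "bounce_rank": 0.8},
]
for page in sorted(pages, key=priority_score, reverse=True):
    print(page["name"], round(priority_score(page), 2))
```

The point is not the particular weights but making the prioritization explicit so the team can debate and adjust it.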
Step 2: Brainstorm ideas for improvement
How to improve page performance is a topic as large as the field of user experience itself, and certainly beyond the scope of this article. One might consider improvements in copywriting, form design, media display, page rendering, visual design, accessibility, browser targeting… the list goes on.
My only suggestion for this process is to make it collaborative: harness the power of your team to come up with new ideas for improvement, including in the brainstorming sessions not only designers but also developers, copywriters, business analysts, marketers, and QA. Good ideas can (and often do) come from anywhere.
Adaptive Path has a great technique of collaborative ideation that they call sketchboarding, which uses iterative rounds of group sketching.
Step 3: Write the testing plan
An Optimization Testing Plan acts as the backbone of every test. At a high level, it is used to plan, communicate, and document the history of the experiment, but more importantly, it fosters learning by forcing the team to clearly formulate goals and analyze results.
A good testing plan should include:
- Test name
- Opportunities (what gains will come about if the test goes well)
- Expected dates that the test will be running in production
- Resources (who will be working on the test)
- Key metrics to be tracked through the duration of the experiment
- Completion criteria
- Variations (screenshots of the different designs that you will be showing your site visitors)
Here’s a sample optimization testing plan to get you started.
Step 4: Design and develop the test
Design and development will generally follow an abbreviated version of your organization’s product development lifecycle. Since test variations are generally simpler than full-blown product development projects, I try to use a lighter, more agile process.
If you do cut corners, be sure to skimp only on things like process artifacts or documentation, not on design quality. For example, be sure to perform some basic usability testing and user research on your variations. This small investment will create better candidates that are more likely to boost conversions.
Step 5: Quality assurance
When performing QA on your variations, be as thorough as you would with any other code release to production. I recommend at least functional, visual, and analytics QA. Even though many tools allow you to manipulate your website’s UI on the fly using interfaces that immediately display the results of your changes, the tools are not perfect and any changes that you make might not render perfectly across all browsers.
Keep in mind that optimization tools provide you one additional luxury that is not usually possible with general website releases – that of targeting. You can decide to show your variations to only the target browsers, platforms, audiences, etc… for which you have performed QA. For example, let’s imagine that your team has only been able to QA a certain A/B test on desktop (but not mobile) browsers. When you actually configure this test in your optimization tool, you can decide to only display the test to visitors with those specific desktop browsers. If one of your variations has a visual bug when viewed on mobile phones, for example, that problem should not affect the accuracy of your test results.
Step 6: Run the Test
After QA has completed and you’ve decided how to allocate traffic to the different designs, it’s time to actually run your test. The following are a few best practices to keep in mind before pressing the “Go” button.
1. Variations must be run concurrently
This first principle is almost so obvious that it goes without saying, but I’ve often heard the following story from teams that do not perform optimization: “After we launched our new design, we saw our [sales, conversions, etc…] increase by X%. So the new design must be better.”
The problem with this logic is that you don’t know what other factors might have been at play before and after the new change launched. Perhaps traffic to that page increased in either quantity or quality after the new design released. Perhaps the conversion rate was on the increase anyway, due to better brand recognition, seasonal variation, or just random chance. Due to these and many other reasons, variations must be run concurrently and not sequentially. This is the only way to hold all other factors consistent and level the playing field between your different designs.
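A common way to guarantee that variations run concurrently and that each visitor consistently sees the same design is to hash a stable visitor ID into a bucket. This is a sketch of the general technique, not any particular tool’s implementation:

```python
# Deterministic visitor bucketing: the same visitor always lands in the
# same variation, and all variations run side by side over the same period.
import hashlib

def assign_variation(visitor_id, experiment, variations=("control", "B")):
    """Hash a stable visitor ID (plus experiment name) into a variation."""
    digest = hashlib.md5(f"{experiment}:{visitor_id}".encode()).hexdigest()
    bucket = int(digest, 16) % len(variations)
    return variations[bucket]

# The same visitor is always bucketed the same way:
print(assign_variation("visitor-42", "bigger-trailer-cta"))
print(assign_variation("visitor-42", "bigger-trailer-cta"))
```

Because assignment depends only on the ID, seasonal traffic, brand effects, and random drift hit every variation equally, which is exactly the level playing field concurrent testing is meant to provide.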
2. Always track multiple conversion metrics
One A/B test that we ran on the movie detail pages of the DIRECTV Latin American sites was the following: we increased the size and prominence of the “Ver adelanto” (View trailer) call to action, guessing that if people watched the movie trailer, it might excite them to buy more pay-per-view movies from the web site.
Our initial hunch was right, and after a few weeks we saw that pay-per-view purchases were 4.8% higher with this variation than with the control. This increase would have resulted in a revenue boost of about $18,000/year in pay-per-view purchases. Not bad for one simple test. Fortunately, though, since we were also tracking other site goals, we noticed that this variation also decreased purchases of our premium channel packages (i.e., HBO and Showtime packages) by a whopping 25%! That would have decreased total revenue by much more than the uptick in pay-per-views, so we did not launch this variation to production.
It’s important to keep in mind that changes may affect your site in ways that you never would have expected. Always track multiple conversion metrics with every test.
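The trade-off in the story above comes down to simple arithmetic. The 4.8% and 25% figures are from the test; the baseline revenue numbers below are hypothetical, chosen so that the pay-per-view gain matches the $18,000/year estimate:

```python
# Net revenue impact of the trailer-CTA variation. The +4.8% and -25%
# lifts come from the test described above; the baseline revenues are
# hypothetical illustrations.
ppv_baseline_revenue = 375_000      # assumed annual pay-per-view revenue
premium_baseline_revenue = 900_000  # assumed annual premium-package revenue

ppv_gain = ppv_baseline_revenue * 0.048         # +4.8% lift
premium_loss = premium_baseline_revenue * 0.25  # -25% drop

net = ppv_gain - premium_loss
print(f"PPV gain:     ${ppv_gain:,.0f}")
print(f"Premium loss: ${premium_loss:,.0f}")
print(f"Net impact:   ${net:,.0f}")
```

With any plausible premium-package baseline, the secondary metric swamps the primary one, which is why tracking only the metric you set out to move can be so misleading.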
3. Tests should reach a comfortable level of statistical significance
I recently saw a presentation in which a consultant suggested that preliminary tests on email segmentation had yielded some very promising results.
In the chart above, the last segment of users (those who had logged in more than four times in the past year) had a conversion rate of .0139% (.139 upgrades per 1,000 emails sent). Even though a conversion rate of .0139% is dismally low by any standard, according to the consultant it represented an increase of 142% compared to the base segment of users, and thus a very promising result.
Aside from the obvious lack of actionable utility (does this study suggest that emails only be sent to users who have logged in more than four times?), the test contained another glaring problem. If you look at the “Upgrades” column at the top of the spreadsheet, you will see that the results were based on only five individuals purchasing an upgrade. Five total individuals out of almost eighty-four thousand emails sent! If, by pure chance, just one more person had purchased an upgrade in any of the segments, it could have completely changed the study’s implications.
While this example is not actually an optimization test but rather just an email segmentation study, it does convey an important lesson: don’t declare a winner for your tests until it has reached a “comfortable” level of significance.
So what does “comfortable” mean? Science reserves strict definitions for publishing results: “significant” means a 95% confidence level, and “highly significant” a 99% confidence level. Even at these levels, there remains a 5% and 1% chance, respectively, of your conclusions being wrong. Also keep in mind that higher confidence levels require more data (i.e., more website traffic), which translates into longer test durations. Because of these factors, I would recommend less stringent standards for most optimization tests: somewhere around 90-95% confidence, depending on the gravity of the situation (higher confidence levels for tests with more serious consequences or implications).
Ultimately, your team must decide on confidence levels that reflect a compromise between test duration and results certainty, but I would propose that if you perform a lot of testing, the larger number of true winners will make up for the fewer (but inevitable) false positives.
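For the curious, the standard calculation behind these confidence levels for conversion rates is a two-proportion z-test. This is a pure-Python sketch with hypothetical counts (optimization tools do this for you):

```python
# Two-proportion pooled z-test: the standard comparison of a control
# conversion rate against a variation's. Counts below are hypothetical.
import math

def z_statistic(conv_a, n_a, conv_b, n_b):
    """z-score for the difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)              # pooled rate
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

# 200 conversions of 10,000 visitors (control) vs 260 of 10,000 (variation):
z = z_statistic(200, 10_000, 260, 10_000)
print(round(z, 2))
# |z| >= 1.96 corresponds to ~95% confidence (two-sided);
# |z| >= 1.645 to ~90%.
print("significant at 95%?", abs(z) >= 1.96)
```

Run the same function on a test with only a handful of conversions, like the five upgrades in the email study above, and the z-score collapses, which is the formal version of “one more buyer could change everything.”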
4. The duration of your tests should account for any natural variations (such as between weekdays/weekends) and be stable over time
In a 2012 article on AnalyticsInspector.com, Jan Petrovic highlights an important pitfall of ending your tests too early. He discusses an A/B test he ran for a high-traffic site in which, after only a day, the testing tool reported that a winning variation had increased the primary conversion rate by an impressive 87%, at a reported 100% confidence level.
Jan writes, “If we stopped the test then and pat each other on the shoulder about how great we were, then we would probably make a very big mistake. The reason for that is simple: we didn’t test our variation on Friday or Monday traffic, or on weekend traffic. But, because we didn’t stop the test (because we knew it was too early), our actual result looked very different.”
After continuing the test for four weeks, Jan saw that the new design, although still better than the control, had leveled out to a more reasonable 10.49% improvement since it had now taken into account natural daily variation. He writes, “Let’s say you were running this test in checkout, and on the following day you say to your boss something like ‘hey boss, we just increased our site revenue by 87.25%’. If I was your boss, you would make me extremely happy and probably would increase your salary too. So we start celebrating…”
Jan’s fable continues with the boss checking the bank account at the end of the month and, upon seeing that sales had not actually increased by the 87% initially reported, reconsidering your salary increase.
The moral of the story: Consider temporal variations in the behavior of your site visitors, including differences between weekday and weekend or even seasonal traffic.
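A simple guard against this pitfall is to refuse to stop a test before it has spanned full weekly cycles, so every weekday and weekend day is sampled. The two-week minimum and dates below are illustrative choices, not a universal rule:

```python
# Guard against stopping too early: require the test window to cover at
# least two full weeks. The two-week rule and dates are illustrative.
from datetime import date

def covers_full_weeks(start, end, min_weeks=2):
    """True if the test has run for at least min_weeks complete weeks."""
    return (end - start).days >= 7 * min_weeks

print(covers_full_weeks(date(2024, 3, 1), date(2024, 3, 2)))   # one day: False
print(covers_full_weeks(date(2024, 3, 1), date(2024, 3, 29)))  # four weeks: True
```

Combined with the statistical completion criteria discussed earlier, a duration floor like this would have kept Jan’s 87% one-day “winner” running until it settled at its true 10.49%.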
Step 7: Analyze and Report on the Results
After your test has run its course and your team has decided to press the “stop” button, it’s time to compile the results into an Optimization Test Report. The Optimization Test Report can be a continuation of the Test Plan from Step 3, but with the following additional sections:
- Results
- Discussion
- Next steps
It is helpful to include graphs and details in the Results section so that readers can visually see trends and analyze data themselves. This will add credibility to your studies and hopefully get people invested in the optimization program.
The discussion section is useful for explaining details and postulating on the reasons for the observed results. This will force the team to think more deeply about user behavior and is an invaluable step towards designing future improvements.
In this article, I’ve presented a detailed and practical process that your team can customize to its own use. In the next and final article of this series, I’ll wrap things up with suggestions for communication planning, team composition, and tool selection.