This article investigates content recommender systems. Because Netflix is probably the best known recommendation system and numerous articles have been published about their system, I will concentrate on their content recommendation mechanism as representative of the type.
I will show that the Netflix mechanism contains characteristics of updated theories of emotion—mainly constructed emotions theory—but it still lacks several essential components.
The lack of these components can explain some inaccuracies in Netflix recommendations and can suggest broader implications.
Emotions: A background
The traditional view of emotions (Paul Ekman’s, for example) is that people are born with a set of emotions—fear, anger, sadness, and the like.
Because we are all born with emotions, the traditional view is that these basic emotions are similar across all human beings.
Lisa Barrett’s recent research has uncovered difficulties with this traditional theory.
One of the leading new theories is the constructed emotions theory. According to this view, emotions are learned, not innate. Different people therefore have different emotions, and the cultural environment shapes them.
No emotion is universal: some cultures have anger, sadness, fear, disgust, or happiness, and some cultures don’t.
The process of emotion begins in the brain. The brain tries to identify the physical environment, to understand what this environment has signified in the past, and what the cultural norms are related to this scenario. Following this analysis, the brain suggests an emotion most suitable to this context.
The context itself is also a factor; identical physical sensations in a different environment would produce a different interpretation and thus a different feeling.
A simplified history of recommendation systems
Content recommendation—Netflix’s, for example—shares general characteristics with the field of emotion analysis.
With recommendations, we try to predict the next thing a person will want to do or feel—such as when a person wants to feel frightened.
The prevailing opinion has been that as information accumulated by the recommendation system increases, accuracy will increase: More data, more accuracy. However, in the video recommendation field, the recommendations remain inaccurate despite the enormous amount of data available.
In a recommendation system, the system must analyze both the user and the content. My claim is that the main effort is focused on understanding the user rather than the content.
If we use the wrong method to understand what people want, no amount of data will make it more accurate.
An important part of any user experience department should be a consistent outreach effort to users both familiar and unfamiliar. Yet it is hard to both establish and sustain a continued voice amid the busyness of our schedules.
Recruiting, screening, and scheduling daily or weekly one-on-one walkthroughs can be daunting for someone in a small department who has more than just user research responsibilities, and the investment of time eventually outweighs the returns as both the number of participants and the size of the company grow.
This article is targeted at user experience practitioners at small- to mid-size companies who want to incorporate a component of user research into their workflow.
It first outlines a point of advocacy around why it is important to build user research into a company’s ethos from the very start and states why relying upon standard analytics packages is not enough. The article then addresses some of the challenges around being able to automate, scale, document, and share these efforts as your user base (hopefully) increases.
Finally, the article goes on to propose a methodology that allows for an adjustable balance between a department’s user research and product design and highlights the evolution of trends, best practices, and common avoidances found within the user research industry, especially as they relate to SaaS-based products.
Why conduct usability sessions?
User research is imperative to the success and prioritization of any software application–or any product, for that matter. Research should be established as an ongoing cycle, one that is woven into the fabric of the company, and should never drop off nor be simply ‘tacked on’ as acceptance testing after launch. By establishing a constant stream of unbiased opinions and open lines of communication that are immune to politics and ever-shifting strategies, research keeps design and development efforts grounded in what should already be the application’s first priority–the user.
A primary benefit of working with SaaS products is that you’re able to gain feedback in real time when any feature is changed. You don’t have to worry about obsolete versions or download packages–web-based software enables you to change direction quickly. Combining an ongoing research effort with popular software development methods such as agile or waterfall allows for an immediate response when issues with an application’s usability are found.
Different from analytics
SaaS products are unique in that they don’t need the same type of in-product tracking. Metrics such as page views or bounce rates are largely irrelevant, because the user could be spending their entire session configuring functions of a single feature on a single page.
For example, for our application here at Loggly, the user views an average of ~2 pages (predominantly login and then search) and spends, on average, eight times as long on search as on any other page. Progression is made within the page-level functions, not among multiple pages within the application’s structure.
Say your analytics package gives an indication that something is wrong with the setup flow or configuration screen, but you don’t yet have a good concept of at what point in the process the users are getting stuck.
Perhaps a button is getting click after click because it is confusing and unresponsive, not because it’s useful. Trying to solve this exclusively with an analytics package will pale in comparison to the feedback you’ll get from a single, candid user who hits the wall. As discussed later in this article, with screensharing you’re able to see the context in which the user is trying to achieve a specific task; the ‘why’ behind their confusion becomes more apparent than just the ‘what’ they are clicking on.
Determining a testing audience
The first component of defining any research effort should be to define who you want to talk to. Ideally, you’ll be able to have a mix of both new users and veterans that are able to provide a well-rounded feedback loop on both initial impressions of your application as well as historical perspective on evolution and found shortcomings after repeated use, but not all companies have this luxury.
Once in the door
Focus first on the initial steps the user has to take when interacting with your application. It seems obvious, but if these steps cannot be completed with maximum efficiency, the user will never progress to more advanced features.
Increasing the effectiveness of the flow through set-up, configuration, and properly defining a measure of activation will pay dividends to all areas of the application. This should be a metric that is tested, measured, and monitored closely, as it functions as a type of internal bounce rate. Ensuring that the top of the stream for the majority of application users is sound will guarantee improved usage further down the road to the deeper, buried interactions.
These advanced features should also be tracked and measured, with correlations that start to paint a profile of conversion. Some companies define conversion as free-to-paid; others do so in a more viral sense–conversion being defined as someone who has shared on social media or similar.
As you start to itemize these important features, you’ll get a better sense of the usage profile toward which you’re trying to point the user. For example, adding a listing record or customizing a page–these might match the profile of someone who is primed for repeat visits, someone who has created utility and a lasting connection and is ultimately ready to convert.
If there is a focus on recruiting participants who are newly signed-up users, then you’ll likely overlap with outbound sales efforts. Because your company’s sales and marketing funnel tries as hard as possible to convert trial users to paid, or paid to upgrade, the company’s priority will likely be on conversion, not on research.
Further, if a researcher tries to outreach for usability surveys at this point, from the user’s perspective (especially those deemed potential high-value customers) it would mean different prompts for different conversations with different people from various groups within your company, all competing for spots on their calendar. This gives a very hectic and frenetic impression of your company and should be avoided.
In the case of a SaaS product, sometimes the sales team has already made contact with potential customers, and many of these sales discussions involve demonstrations around populated, best-case scenarios (which showcase the full features) of your product.
As a result, you may find the participant has been able to ‘peek behind the curtain’ by watching the sales team provide these demonstrations, giving them an unfair advantage in how much they know before finally trying the product themselves. For the inexperienced user, your goal is to capture the genuine instinct of the uninitiated, not of those who have seen the ‘happy path’ and are trying to retrace the steps to that fully-populated view.
To make sure you’re not bumping heads with the sales and conversion team, ask if you can take their castoffs–the customers they don’t think will convert. You can pull these from their CRM application and automate personalized emails asking for their time. I’ll outline this method in further detail in the following section, because it pertains to the veteran users as well.
As described in a previous post, guerrilla testing at conferences is a great way of revealing what gets seen and what parts of the interface or concept get ignored. These participants are great providers of honest, unbiased feedback and haven’t been exposed to the product beyond some initial impressions of the concept.
Desiring the messy room
But what about the users who have been using your product for months now, those who have skin in the game and have already put their sweat and dollars behind customizing their experience? Surveying these participants allows us to see both where they’ve found utility and what areas need to be expanded upon. Surveying only the uninitiated won’t provide feedback on the nagging functional roadblocks that are found only after repeated use. These are the participants who will provide the most useful feedback–sessions where you can observe the environment they’ve created for themselves, the ‘messy room.’
To make an observational research analogy, a messy room is more telling of its occupant’s personality than an empty one. Given your product’s limitations, how has the participant been forced to find workarounds? Despite these workarounds, they’ve continued to use the product, in spite of how we expected them to use it–and the two can be strikingly different.
Find your friendly marketing representative/sales engineer at your company (or just roll your own) and discuss with them the best way to integrate a user experience outreach email into the company’s post-funnel strategy. For example, post-funnel would be after their trial periods have long since expired and the user is either comfortable in their freemium state or fully paid up.
As mentioned earlier, you can also harvest leads from the top of the funnel among the discarded CRM leads. However, you’ll likely have a greater percentage of sessions with users who are misfires–those indifferent or only just poking around the app, without yet a full understanding of what it might do. Thankfully, the opt-in approach to participation filters this out for the most part.
Focusing again on the recruitment of veteran, experienced users, another, more complex scenario would be to trigger this UX outreach email once a specific set of features has been initiated–giving off the desired signature of an advanced, informed user.
From a purely tenure-based perspective, six months of paid, active use should be enough time for users to establish a relationship with a piece of software, whether they love or hate it. If there is enough insight into the analytics side of the sales process, it would behoove you to also make sure that the user has had a minimum number of logins across these six months (or however long you allow users to mature).
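As a sketch of that tenure-plus-activity filter, here is what it might look like run against an exported customer list. The field names (`email`, `signup_date`, `login_count`) and thresholds are assumptions; adapt them to whatever your CRM or analytics export actually provides.

```python
from datetime import date

# Hypothetical rows exported from a CRM or analytics package.
customers = [
    {"email": "a@example.com", "signup_date": date(2023, 1, 10), "login_count": 42},
    {"email": "b@example.com", "signup_date": date(2023, 11, 1), "login_count": 3},
    {"email": "c@example.com", "signup_date": date(2022, 6, 5), "login_count": 9},
]

def veteran_candidates(rows, today, min_tenure_days=180, min_logins=10):
    """Keep mature, active users: at least six months since signup
    and a minimum number of logins over that period."""
    return [
        r["email"]
        for r in rows
        if (today - r["signup_date"]).days >= min_tenure_days
        and r["login_count"] >= min_logins
    ]

print(veteran_candidates(customers, today=date(2024, 1, 1)))  # ['a@example.com']
```

The thresholds become the knobs you tune as your user base matures: loosen `min_logins` early on, tighten it once you have enough veterans to choose from.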
Outreach emails triggered through the CRM should empower the recipient to make the experience of the product better, both for themselves and their fellow customers. Netflix does a great job of this by continually asking about the streaming quality or any delays around arrival times of their product.
I also recommend asking the users a couple of quantitative and qualitative questions, as this is a metric you should be tracking for your greater UX efforts already. These questions follow the guidelines of general SUS (System Usability Scale) practices that have been around for decades. Make the questions general enough that they can be re-used and compared going forward, without fear of needing to move the goalposts when features or company priorities change.
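The standard SUS scoring arithmetic is short enough to sketch in a few lines. This assumes the classic ten-item instrument with alternating positively and negatively worded statements, each answered on a 1–5 scale; if your questions deviate from that pattern, the item weighting below won’t apply.

```python
def sus_score(responses):
    """Compute a standard SUS score from ten 1-5 Likert responses.
    Odd-numbered items are positively worded (score - 1); even-numbered
    items are negatively worded (5 - score). The sum is scaled to 0-100."""
    assert len(responses) == 10 and all(1 <= r <= 5 for r in responses)
    total = sum(
        (r - 1) if i % 2 == 0 else (5 - r)  # i is 0-based, so even i = odd item
        for i, r in enumerate(responses)
    )
    return total * 2.5

# Strong agreement on positive items and strong disagreement on negative
# items yields the maximum score of 100.
print(sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1]))  # 100.0
```

Because the output is a single 0–100 number, it survives feature changes and re-prioritizations, which is exactly the goalpost stability argued for above.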
When engineering this survey, be sure to track which tier of customer is filling out these surveys, because both their experience and expectations could be wildly different. Remember also to capture the user’s email address as a hidden field so you can cross-reference against any CRM or analytics packages that are already identifying existing customers.
It depends on the complexity of your product, but typically 20–30 minutes is enough time to cover at least the main areas of functionality. Any longer, and you might encounter people not wanting to fit an entire hour block into their schedule. If these recorded sessions are kept to just a half-hour, I find that $25 is sufficient compensation for the duration, but your results may certainly vary.
In any type of session, do reiterate that this is neither a sales nor a support call; you’re researching how to make the product better. However, you should be comfortable avoiding (or sometimes suggesting) workarounds to optimize the participant’s experience, giving them greater value of use.
Tools of the trade
For implementation of the questionnaire, I hacked the HTML/CSS from a Google Form to exist as a self-hosted page while still pushing results, through the matching form and input IDs, to the extensible Google Spreadsheet.
There are a few tutorials that explain how to retain your branding while using Google’s services. I went through the trouble so I could share the URL of either the form or the raw results with anyone, without the need to create an account or log in. As we discuss the sharing component of these user research efforts, this will become more important. Although closed systems like SurveyMonkey or Wufoo are easy to get up and running, the extensibility of a raw, hosted result set does not compare.
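The mechanics of that hack reduce to posting the same named inputs to the form’s submission endpoint. A minimal sketch in Python for clarity: the form ID and `entry.*` input IDs below are placeholders, which you would replace with the real ones found by viewing the source of your own published Google Form.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Placeholder: substitute your published form's real ID.
FORM_URL = "https://docs.google.com/forms/d/e/FORM_ID/formResponse"

def build_payload(answers):
    """Map your self-hosted form's field names onto Google's input IDs.
    The entry.* IDs here are hypothetical examples."""
    field_ids = {"ease_of_use": "entry.1000001", "comments": "entry.1000002"}
    return urlencode({field_ids[k]: v for k, v in answers.items()})

def submit(answers):
    """POST the responses so they land in the linked Google Spreadsheet."""
    data = build_payload(answers).encode()
    return urlopen(Request(FORM_URL, data=data))

print(build_payload({"ease_of_use": "4"}))  # entry.1000001=4
```

The same mapping works from a self-hosted HTML page: keep Google’s `name="entry.*"` attributes on your restyled inputs and point the form’s `action` at the `formResponse` URL.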
Insert a prompt at the end of the questionnaire for the user to participate in a compensated user research session, linking to a scheduling application such as Calend.ly. This application has been indispensable for opt-in mass scheduling like this. Its features–gCal syncing, timezone conversion, daily session capping, email reminders, and custom messaging–are all imperative to a public-facing scheduling board. Anyone can grab a 30-minute time slot from your calendar with just your custom URL, embeddable at the end of your questionnaire.
To really scale this user research effort to the point where it can be automated, you cannot spend time negotiating mutually available times, converting time zones, and following up with confirmations. Calend.ly allows you to cap the number of participants who can grab blocks of your time, so you can set a maximum number of sessions per day, preventing a complete overload of bookings in your schedule.
As part of the scheduling flow within Calend.ly, a customizable input field asks the participant for their Skype handle so you can screen share together, and I’d advise the practitioner to create a separate Skype account for this usability effort. With every session participant you’ll add more and more seemingly random contacts, and any semblance of organization and purity in your personal contact list will be gone.
Calend.ly booking utility – a publicly-accessible reservation system.
Once the user is on the Skype call, ask for permission to record the call and make sure that you give a disclaimer that their information will be kept private and shared with no one outside the company. You might also add ahead of time that you’ll be happy to direct any support questions that come up to the proper technicians.
Permissions granted, be sure to reiterate to the participant the purpose and goal of the call, and give them license to say whatever they want, good or bad–you want to hear it. Your feelings won’t be hurt if they have frustrations or complaints about certain approaches or features of your product.
For recording the call, there are plenty of options out there, but I find that SnagIt is a good tool to capture video, especially given that the resolution and dimensions of the screen share tend to change based upon the participant’s monitor size. When compressing the output, a slow frame rate of 5–10 fps should suffice, saving you considerable file size when managing these large recordings.
When you’re walking the participant through the paces of the session, be sure to annotate the start time and any highlights or lowlights you see along the way. While in front of your desktop, a basic note-taking application (or even pad and paper) should suffice. This will allow you to go back after the session is finished and pull quotes for use elsewhere, such as PowerPoint presentations.
I always try to write a running diary of the transcript and, at the end, a summary of what areas of the application we explored and what feedback we gathered. Summarizing the typed transcript and posting the related recorded video files should take no more than 10 minutes, which will still keep your total per-participant time (including processing) to under an hour each–certainly manageable as part of your greater schedule.
Share the love (or hate)
I want to make sure that these sessions can be referred to by the executive and product management teams for use in their prioritization strategy. I set up an instance of MAMP/WordPress on a local box (we’re using one of the Mac Minis that power a dashboard display), which allows me to pass around the link internally and not have to deal with some of the issues around uploading large video files, as well as alleviate any permissions concerns about these sessions being out in the wild.
It’s also important to tag the posts attached to these files when you upload them. This allows faster indexing when trying to find evidence around a certain feature or function. Insert your written summary into the post content, and you’ll be better able to search on memorable quotes that might have been written down.
These resources can be very good for motivation internally, especially among the engineers who don’t often get to see people using the product they continually pour themselves into. They’ll also resonate with the product team, who will see first-hand what’s needed to re-prioritize for the next sprint.
After a while, you’ll start to build a great library of clips that you can draw knowledge from. There’s also a certain satisfaction in seeing the evolution of the product’s interface through these screengrabs. That which was shown to be confusing at one time may now be fixed!
Participant compensation can be fulfilled through Amazon or other online retailers; you can send a gift card to an email address, which you’ll be able to scrape as a hidden field from the spreadsheet of user inputs. Keep a running list of those you’ve reached out to and their responses.
You might also incorporate contacts met during sessions described in the Guerrilla Usability Testing at Conferences article, so you’ll be able to follow up when attending the next year’s conference to recruit again. After enough participants and feedback, think about establishing a customer experience council that you can follow up with for specific requests and outreach, even for quick vetting of opinions.
This article first outlined the strategies and motivation behind the research, advocating creating an automated workflow of continually-scheduled screenshares with customers, rather than trying to recruit participants individually. This methodology was then broken down to distinct steps of recruitment via email, gathering quantitative and qualitative feedback, and automating an opt-in booking of the sessions themselves. Finally, this article went on to discuss how to best leverage and organize this content internally, so that all might benefit from your process.
User research is imperative to the success and prioritization of any software application (or any product, for that matter). Yet too often we forget to consume our own product. Whether it be server log management, as I’ve chosen, or apartment listings or ecommerce purchases, shake off complacency and try to spend 30 minutes a week accomplishing typical user tasks from start to finish.
Also make it a point to conduct some of these sessions with those you work alongside; you’ll be surprised what you can find through simple repetition with a fresh set of eyes and ears. The research process and its dependencies do not have to be as intricate as the one listed above.
When your company starts to incorporate user opinion into its design and development workflow, it will begin to pay dividends, both in the perceived usability of your application and in the gathered metrics of user satisfaction.
The following is a composite of experiences I’ve had in the last year when talking with startups. Some dialog is paraphrased, some is verbatim, but I’ve tried to keep it as true as possible and not skew it towards anyone’s advantage or disadvantage.
As professionals in the user-centered design world, we are trained and inclined to think of product design as relying on a solid knowledge, frequently tested, of our potential users, their real-life needs and habits.
We’ve seen the return on investment in taking the time to observe users in their daily lives, in taking our ideas as hypotheses to be tested. But the founders and business people we often interview with have been trained in a different worldview, one in which their ideas are sprung fully formed like Athena from the brow of Zeus. This produces a tension when we come to demonstrate our value to their companies, their products, and their vision. We want to test; they want to build. Is there a way we can better talk and work together?
Most of my interactions with these startups were job interviews or consulting with an eye toward a more permanent position; the companies I spoke with ranged from “I’m a serial entrepreneur who wants to do something” to recent B-school grads in accelerator programs such as SkyDeck, to people I’ve met through networking events such as Hackers & Founders.
In these conversations, I tried to bring the good news of the value of user experience and user research but ran into a build-first mentality that not only devalues the field but also sets the startup on a road to failure. Our questions of “What are the user needs?” are answered with “I know what I want.” We’re told to forget our processes and expertise and just build.
Can we? Should we? Or how can we make room for good UXD practices in this culture?
“I did the hard work of the idea; you just need to build it”
Over the past two years, I’ve been lucky to find enough academic research and contract work that I can afford to be picky about full-time employment (hinging on the mission and public-good component of potential employers). But self-education, the freelance “UX Team of One,” and Twitter conversations can’t really match the learning and practice potential of working with others, so I keep looking for full-time UX opportunities.
This has lately, by happenstance, meant startups in the San Francisco Bay area. So I’ve been talking to a lot of founders/want-to-be-founders/entrepreneurs (as they describe themselves).
But I keep running into the build-first mentality. And this is often a brick wall. I’m not saying I totally know best, but the disconnect in worldviews is a huge impediment to doing what I can do–all of which I know can help a startup be better at its goals, so that it has a fighting chance to be in the 10–20% that doesn’t end up on the dust heap of history.
“Build first” plays out with brutal regularity. The founders have an idea, which they see as the hard part; I’ve actually had people say, “You just need to implement my idea.” They have heard about something called “UX” but see user experience design as but a simple implementation of their idea.
As a result, the meaning of both the U and the X get glossed over.
The started-up startup
We’ll start with the amalgam of a startup that had already made it into an accelerator program. A round of funding, a web site, an iOS app, an origin story on (as you’d expect) TechCrunch.
It began with a proof of concept: A giant wall, Photoshopped onto a baseball stadium, of comments posted by the app’s users. The idea was basically to turn commercial spaces into the comments thread below any HuffPo story (granted, a way to place more advertising in front of people). The company was composed of the founder, fresh from B-school; a technical lead also just out of school; a few engineers; and sales/marketing, which was already pitching to companies.
The company was juggling both the mobile and web apps and shooting for feature-complete from the word Go, though there were obvious issues, such as neither actually working and the lack of any existing comment walls or even any users. They were trying to build a house of cards with cards yet to be drawn.
In talking with the tech lead, I saw that they were aware of some issues (crashes, “it’s not elegant enough”) but didn’t see others (the web and mobile app having no consistent visual metaphors and interaction flows, typos, dead ends, and the like). To their credit, they wanted something better than what they had. Hence, hiring someone to do this “UX thing.” But what did they think UX was?
I had questions about the users. How did they differ from the customers–the locations that would host walls, which would generate revenue by serving ads to the users who posted comments?
I had questions about the company. What was their business process? What had they done so far?
This was, I thought, part of what being interviewed for a UX position would entail–showing how I’d go about thinking about the process.
I was more than ready to listen and learn; if I were to be a good fit, I’d be invested in making the product successful as well as developing a good experience for users. I was also prepared with some basic content strategy advice; suggestions about building a content strategy process seemed nicer than pointing out all the poor grammar and typos.
Soon, I was meeting with the founder. He talked about how a B-school professor had liked his idea and helped him get funding. I asked about the users. He responded by talking about selling to customers.
When he asked if I had questions, I asked, “What problem does this solve, for whom, and how do you know this?” It’s my standard question of any new project, and, I was learning, also a good gauge of where companies were in their process. He said he didn’t understand. He said that he had financial backing, so that was proof that there was a market for the app. What they wanted in a UX hire, he said, was someone to make what they had prettier, squash bugs, and help sell.
I got a bad feeling at that point; the founder dismissed the very idea of user research as distracting and taking time away from building his vision. Then I started talking about getting what they had in front of users, testing the hypotheses of the product, iterating the design based on this: all basic UX and Lean (and Lean UX!) to boot, at least to someone versed in the language and processes of both.
This, too, the founder saw as worse than worthless. He said it took resources away from selling and coding, and he thought that testing with users could leak the idea to competitors. So, no user research, no usability testing, no iteration of the design and product.
(Note on one of startups that’s part of this amalgam: As of this writing, there has been neither news nor updates to the company site since mid-2012, and though the app is still on the iTunes Store, it has too few reviews to have a public rating. This after receiving $1.2 million in seed funding in early 2012.)
The pre-start startup
I’ve also spoken with founders at earlier stages of starting up. One had been in marketing at large tech companies and wanted to combine publishing with social media. Another wrote me that they wanted to build an API for buying things online. I chatted with a B-school student who thought he’d invented the concept of jitneys (long story) and an economist who wanted to do something, though he wasn’t sure what, in the edu tech space. What they all had in common was a build-first mission. When I unpacked this, it became obvious that what they all meant was, “we don’t do research here.”
Like the company amalgam mentioned above, they all pushed back against suggestions to get out of the building (tm Steve Blank) to test their ideas against real users. Anything other than coding or even starting on the visual design of their products was seen as taking time away from delivering their ideas, which they were sure of (I heard a lot of “I took the class” and “we know the market” here).
And their ideas might end up being good ones–I can’t say. They seem largely well-intentioned, nice people. But when talking with them about how to make their product or service vital for users and therefore more likely to be a success, it soon becomes clear that what UX professionals see as vital tools and processes in helping create great experiences are seen quite differently by potential employers, to the point that even mentioning user research gets you shown the door. Politely, but still.
I’d like to bring up here the idea that we, as UX people, have perhaps contributed to the problem. The field is young and Protean, so the message of “what is UX?” can be garbled even if there were a good, concise answer. Also, in the past, user research has indeed been long and expensive and resulted in huge documents of requirements and so on, which the Lean UX movement is reacting to. So nobody’s totally innocent, to be sure. But that’s another article in the making (send positive votes to the editors).
One (anonymized) quote:
“Yep, blind building is a real disaster and time waste… I’ve seen huge brands go down that path… I have identified a great proof-of-concept market and have buy-in from some key players. My most immediate need, however, is a set of great product comps to help investors understand how the experience would work and what it might look like. I’ve actually done a really rough set of comps on my own, but while I’m a serious design snob, I am also terrible designer…”
So: Blind building is a real disaster, but she’s sketched out comps and just wants someone to make it look designed better. Perhaps she saw “buy in from some key players” as user research?
We had an extended exchange where I proposed lightweight, minimum-viable-product prototypes to test her hypotheses with potential users. She objected, afraid her idea would get out, that testing small parts of the idea was meaningless, that she didn’t have time, that it only mattered what the “players” thought, that she never saw this at the companies she worked at (in marketing).
Besides, her funding process was to show comps of how her idea would work to these key players, and testing would only appear to reduce confidence in her idea. (Later that week, I heard someone say how “demonstrating confidence” was the key ingredient in a successful Y Combinator application.)
“We’re looking for somebody who’s passionate about UI/UX to work with us on delivering this interface.
“Our industry specifics make us a game of throwing ideas around with stakeholders, seeing what sticks and building it as fast as possible. Speed unfortunately trumps excellence but all products consolidate while moving in the right direction.
“We certainly have the right direction business-wise and currently need to upgrade our interface. We require UX consulting on eliminating user difficulty in the process of buying, as well as an actual design for this.”
So: To him, it’s all about implementing an interface. Which, to him, is just smoothing user flows and, you know, coming up with a design. Frankly, I’m not sure how one could do this well, or with a user-centered ethic, without researching and interacting with potential users. I’m also not sure how to read his “upgrade our interface”; is that just picking better colors and shapes, in the absence of actual research and testing on whether it works well for users? That doesn’t strike me as useful, user-centric design. (During the interview process at Mozilla, I was asked the excellent question of how I’d distinguish art and design; I’m not sure I nailed the answer, but I suspect there’s more to design than picking colors and shapes.)
And I wasn’t sure even if he was receptive to the idea of users qua users in the first place. Before this exchange, when he described his business model, I pointed out that his users and his customers were two different sets of people and this can mean certain things from a design perspective. Given that his response was that they have been “throwing ideas around with stakeholders,” I gathered that his concept of testing with users was seeing what his funders liked. That did not bode well for actual user-centered design processes.
When I asked how they’d arrived at the current user flows and how they knew they were or weren’t good, he said that they internally step through them and show them to the investors (neither population is, again, the actual user). He was adamant both that talking to users would slow them down from building, and that because they were smart business people, they know they’re going in the right direction. It was at this point I thought that he and I were not speaking the same language.
I referred him to a visual designer I know who could do an excellent job.
I do not have the answers on how to bridge this fundamental gap between worldviews and processes. A good UX professional knows the value of user research and wants to bring that value to any company he or she joins. But though we can quote Blank, though we can show case studies, though we can show how a Gothelfian Lean UX process could be integrated into a hectic build schedule–when all this experience runs into a “build first” mentality, the experience and knowledge lose. At least in my experience. What is to be done?
What makes a marketing e-mail or newsletter effective? One can judge, for instance, by the number of users who opened the message or clicked on a specified element representing the primary action, such as a product link or button.
Those indicators measure user engagement precisely; however, they capture only the last phase of interaction with an e-mail or newsletter. The act of clicking a certain element in a marketing e-mail is the result of a longer process of identifying, assimilating, and analyzing its content. It is in those three steps that the decision to take action (or not) is made, and it is those three steps that are not analyzed or included in standard effectiveness measures, such as CTR or open rate.
Therefore, click-through rate and open rate measure only completed processes, ignoring interrupted ones. Moreover, those parameters do not tell us why a certain user decided to click or to abandon the message.
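In code, both metrics reduce to simple ratios over per-campaign counts. Here is a minimal sketch; the counts are made-up example numbers, and note that some tools compute CTR over opens rather than deliveries:

```python
# Hypothetical counts for one campaign
delivered = 10_000  # messages that reached an inbox
opened = 2_300      # messages opened at least once
clicked = 410       # messages with at least one click on the primary action

open_rate = opened / delivered    # share of delivered mail that was opened
ctr = clicked / delivered         # click-through rate over delivered mail
ctor = clicked / opened           # click-to-open rate: clicks among openers

print(f"open rate {open_rate:.1%}, CTR {ctr:.1%}, CTOR {ctor:.1%}")
```

Nothing in these ratios records a recipient who opened the message, scanned it, and gave up halfway through reading — which is exactly the gap the gaze data below tries to fill.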
One way to understand what is happening in users’ minds is to observe what they really see, which cannot be done using the traditional methods of e-mail research. Instead, we used eye tracking on a desktop computer to record the person’s gaze while looking at the e-mail message, checking which objects they looked at, for how long, and which elements within the whole field of vision attracted their attention the most.
To check what kind of impact certain characteristics of e-mails have on users, our team transformed some of the stimuli. For instance, we modified the location of the logo and the calls-to-action, changed the size of prices, or flopped photos to change the direction the person in the photo is facing.
Each of the stimuli used in the study had two versions–an original and a modified one. Each version was seen by 27 participants. All of the heat maps in the report are derived from averaging the 10-second-long scan paths of the 27 subjects.
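The averaging behind such heat maps can be sketched in a few lines: bin every fixation onto a coarse grid, weight it by its duration, and normalize. This is a toy illustration with invented coordinates, not the team’s actual pipeline:

```python
from collections import defaultdict

def heat_map(fixations, cell=50):
    """Bin (x, y, duration_ms) fixations from all participants into
    grid cells; longer fixations contribute more heat. Returns a dict
    mapping (col, row) -> heat, normalized so the hottest cell is 1.0."""
    grid = defaultdict(float)
    for x, y, duration in fixations:
        grid[(x // cell, y // cell)] += duration
    peak = max(grid.values())
    return {xy: heat / peak for xy, heat in grid.items()}

# Invented fixations pooled across participants: (x, y, duration in ms)
fixations = [(120, 80, 300), (130, 90, 450), (400, 300, 200)]
hm = heat_map(fixations)  # the first two fixations land in the same cell
```

Rendering the resulting grid as translucent color over a screenshot of the stimulus gives the familiar red-to-green heat-map overlay.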
Observations: Testing known principles and their variations
Our different observations confirm some of the generally known design principles, such as users’ deep-rooted dislike of homogenous blocks of text.
At the same time, some of our hypotheses were disproved. For instance, reducing the length of introductory text did not result in an increased number of users reading it. In fact, introductory text was so rarely read that a general recommendation from our research is to remove it altogether in favor of items that really matter.
Text and reading
Learning how to read and gaining experience in this activity shapes our perception since early childhood. In our (Western) culture, we read from left to right and from top to bottom. This becomes a strong habit and this strategy of scanning a visual stimulus is executed automatically, even if the viewed stimulus does not contain text.1
What is more, readers on the web are very selective.2 They constantly search for valuable content, but when the required amount of effort increases, their motivation plummets. Below, we further describe and illustrate those phenomena with examples from our study.
Blocks of text
It may sound like a truism, but it is always good to keep in mind that a homogenous block of text is not a good way to communicate with Internet users.2 One can often observe in eyetracking studies that users tend to skip this kind of content without making even the slightest attempt to read it.
Fortunately, there are some tips and tricks which can make the text more attractive to the user’s eye. First, formatting which includes clearly distinguishable headlines and leads often results in a phenomenon called F-pattern.
Readers have a strong tendency to scan headlines briefly, and they usually start to read from the top of the page. Their motivation to focus their attention on a written content decreases gradually, so you may expect that the first few headlines (counting from the top) will be read, and that the lower the headline is located, the less attention it will get.
Introduction text in an e-mail message
Reading requires time and effort, and the recipients of a newsletter want to quickly get exactly the information they are interested in (which usually means the special offers). It did not surprise us that introductory text in a newsletter was ignored most of the time.3
But what to include in the marketing message instead of introductory blah-blah text? The answer seems obvious–more valuable content, such as the products we want to present.
Our study confirmed that hypothesis: After cutting most of the introductory text out, the amount of attention focused on it did not change much. On the other hand, the products presented in the message benefited greatly in terms of attracting users’ gaze.
Properties of numbers
The next thing we wanted to focus on was whether numbers catch the eye. Nielsen4 suggested that numbers written as numerals are eye-catching, whereas numbers written out as words are not, because they are indistinguishable from an ordinary piece of text.
We studied how long the participants focused their gaze on numbers, depending on their size. The difference between small and large digits turned out to be statistically significant: the average difference between the small and large versions approximated 200 ms and 400 ms for the two prices depicted in the stimulus. From the psychophysiological perspective, this is a long time. The longer we fixate on an object, the deeper the processing and understanding of the visual information.5
Communication through images
Pictures: What’s worth it, and what’s not
One of the widely known phenomena which can be observed in eyetracking and usability studies is so-called banner blindness. In short, web users tend to act as if they were blind to advertisements or other types of redundant information, which can only distract them from completing the task. This adaptive mechanism applies as well to stock photos and to pictures which do not present the real products or people. Pictures without informational value may even pull the viewers’ attention away from the valuable content because they may be easily classified as an advertisement, which is usually neither informative nor relevant.
Directing users’ attention with faces
Some types of pictorial stimuli are almost always classified as important. One of them is certainly a human face. We are social animals, so we are perfectly wired to automatically read subtle social cues, such as decoding where another person’s attention is directed at the moment.
An example of how this reflexive mechanism works can be seen in the picture above. The participant automatically followed the gaze of the model right after noticing her face.
In the original version of this newsletter, the model looked straight ahead. We created a modified version in which the model is looking at the logo. We tested both versions with our participants and then examined whether there was a significant difference in the amount of time the participants fixated on the logo. In the modified version, the average time of focused gaze on the logo was significantly longer.
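A comparison like this boils down to two samples of per-participant fixation times. As a sketch, Welch’s t statistic (which tolerates unequal variances) can be computed with the standard library alone; the fixation times below are invented, not the study’s data:

```python
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic for two independent samples."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    return (mean(a) - mean(b)) / (va + vb) ** 0.5

# Invented per-participant fixation times on the logo, in ms
original = [180, 220, 150, 200, 170, 160, 210, 190]  # model looks ahead
modified = [420, 380, 450, 400, 360, 430, 390, 410]  # model looks at logo

t = welch_t(modified, original)
# Compare |t| against the t distribution's critical value for the
# appropriate degrees of freedom to judge statistical significance.
```

Whether the team used this exact test is not stated; the point is simply that “significantly longer” means the gap between the two samples is large relative to their variability.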
Our observations and recommendations are rooted in a number of studies focused on what recipients really see while looking at advertisements in email campaigns. Some of the effects repeated in our 2011 and 2013 studies; some of them were also confirmed in studies on the perception of e-mails and newsletters carried out by other teams.
But we should not forget that these are general principles, which may not hold for a particular creative due to various mitigating factors, such as the content of the e-mail, its size, and the level of audience engagement.
1. Ziming Liu (2005), “Reading behavior in the digital environment: Changes in reading behavior over the past ten years,” Journal of Documentation, Vol. 61, No. 6, pp. 700–712.
It is an honest question: how smart are your users? The answer may surprise you: it doesn’t matter. They can be geniuses or morons, but if you don’t engage their intelligence, you can’t depend on their brain power.
Far more important than their IQ (which is a questionable measure in any case) is their Effective Intelligence: the fraction of their intelligence they can (or are motivated to) apply to a task.
Take, for example, a good driver. They are a worse driver when texting or when drunk. (We don’t want to think about the drunk driver who is texting.) An extreme example, you say? Perhaps, but only by degree. A person who wins a game of Scrabble one evening may be late for work because they forgot to set their alarm clock. How could the same person make such a dumb mistake? Call it concentration or focus: we use more of our brain when engaged, and we need support when we are distracted.
So, what does a S.T.U.P.I.D. user look like?
“Fear is the mind killer”, Frank Herbert wrote. Our minds are malleable and easily affected by their context. The effect of stress on the brain is well known, if not well understood. People under stress take less time to consider a decision thoroughly, and they choose from the options presented to them rather than consider alternatives. Stress is often due to social pressures. Car salespeople know to not let a customer consider an offer overnight, but pressure them to buy right away.
Tiredness is one of the largest causes of industrial and motor vehicle accidents. Interfaces used by tired people should take into account their lowered sense of self-awareness and number of details that the user is likely to miss. A classic example of an interface used by sleepy people, the iPhone alarm clock is typically set right before bed. Unfortunately, it doesn’t ring if the phone is set to vibrate, the default state for many people. When a user sets the alarm, it would be useful to override the vibrate feature, or at least remind them that it won’t ring.
Training for enterprise applications is more often discussed than enacted. Users are thrown at an application with a manual and a Quick Reference Card. Applications that are not designed around the user’s workflow have to explain their conceptual model while they are being used: “where” things are stored, how to make changes, whom to send things to.
Complex systems that are used infrequently are a particular problem. In the design of the automated external defibrillator, it is assumed the user may have no knowledge of the science or training on the device, and will be using it in a chaotic, stressful environment. The frequency of use should drive design. Yearly processes, like doing your taxes, should assume that the users have never done it before. In rarely used interfaces, customization is likely to be less useful, but a comparison to the previous year’s entries is very useful, as it reminds users what they did before.
Nothing reduces effective intelligence faster than doing a boring task against one’s will.
More important than the user’s mental model of an application is their mental attitude toward the task. Someone sitting in the front passenger seat of a car may have the same field of view as the driver, but unless they are focused on it, they will not remember the path driven. Nothing reduces effective intelligence faster than doing a boring task against one’s will. When a user is passive, complexity becomes insurmountable. Games aimed at casual gamers know to keep the interaction model simple, using a flat navigation and avoiding “modes” (e.g. edit vs view).
User-centered design is a powerful approach because it recognizes that there are many reasons people use a system. Airline booking sites are used to buy tickets, but also to see if the family can afford to go on vacation. The designer should recognize that they cannot solve every problem, but should give users the tools to help themselves, to work independently of the application’s intended method. In internal enterprise systems, the top user request is often “export to Excel”. This often reflects that the system does not meet the user’s needs. Excel empowers the user to do ‘out of the box’ actions. It is the API to the real world.
…The top user request is often ‘export to excel’…. Excel empowers the user to do ‘out of the box’ actions. It is the API to the real world.
People are multi-tasking more than ever, whether it is simply listening to music while driving or playing Farmville while watching TV. Effective multi-tasking has been shown to be a myth, but it is a popular one. Paying “partial attention” to multiple activities has a significant impact on your perception of an interface. Users are often said to be on “autopilot”, clicking on things by shape rather than reading the text. An interface cannot rely on the user having a clear and consistent working memory across multiple screens. The task and details must be re-stated at each step to remind the user of the step they are on and what they need to do. Frequent, automatic saving of user-entered data is essential, especially as connections can time out.
Help S.T.U.P.I.D. users by designing S.M.A.R.T.
Start-ups often experience a shock when they emerge from the hothouse of heads-down development. Their intended customers barely have time to listen to their idea, let alone devote time to explore its features. The contrast between a small group of friends working intensely together on a single project and the varied needs and limited free time of their customers can be disheartening.
Projects often fail not because the idea is bad, but because the value their service will provide is not easily understood. The question I ask my team is “What problem, from the user’s point of view, are you solving?” It has to be a problem the user knows they have. If the problem is not obvious to the user, in terms they understand, the solution doesn’t matter. Focusing on the problem keeps a project from drifting into fantasy requirements: solutions looking for a problem.
Design teams often use themselves as model users, but…. The user knows nothing about the product, doesn’t understand the concept, and doesn’t care.
Design teams often use themselves as model users, but they are almost the perfect storm of differences between themselves and the users.
They know the product exists and what it is supposed to do.
They understand the internal concept, including its past and future ideas.
They care, personally, about the product. Their success depends on it.
The user has none of these things. The user knows nothing about the product, doesn’t understand the concept, and doesn’t care.
What can be done to make S.T.U.P.I.D. users S.M.A.R.T?
Why are simple apps popular these days? It is not that people don’t like features; it’s that instant comprehensibility trumps powerful features. In the old search engine wars, Google may have had a better search algorithm, but they became known for having a simpler design. Yahoo and others tried to become portals, losing sight of the user’s primary goal. I advise people to “design the mobile version first” to help them focus on the key user benefits.
The downside is that any successful project expands and adds features to address additional user needs. What starts out as “Writer for iPad” can end up as Microsoft Word. Simple is not always better, but keeping the new user in mind helps find the right balance.
An app is only as good as the user’s understanding of it. That starts with the name – is it cute, or does it explain what the app does? Is it “pidg.in” or “Automatic Mailbox”? The iPhone and iPad television ads were effective sales tools, but they also trained a generation simply by showing the apps in use. Each step of a workflow is subject to delays and distractions. Ecommerce sites know to reduce links during the final checkout process. With complex transactions, the risk is greater that the user will have lost their focus. Remind the user what they are doing in big title text. Focus on delivering Clear and Consistent messaging and instructions – for example, adding side notes like Ally.com’s password guidance.
Standard design patterns are good, but they also throw the user into autopilot. It makes sense to break them for critical decisions. The hard part is determining what a critical decision point is. Observing user behavior, customer service records, and identifying risks to the user’s data are good clues. If something is simple enough that the users are mostly on autopilot, for example installing software, make the default action a single click.
The dark side of users on “autopilot” is that they will regularly make mistakes by not paying attention. Mistakes are generally not obvious to a system, but it is good practice to highlight destructive actions and enable recovery. Capture data in little steps: saving form fields instead of form pages prevents large data loss. It’s a good idea to highlight and ask for confirmation on big, destructive changes, like deleting a database. “Undo”, common on computers but slow to come to the web, enables the user to recover from errors.
Gmail lets users undo moving a message to the trash.
Gmail also lets you restore your contacts if you accidentally make a large, destructive change.
Test in realistic situations
There is an essential flaw in the two-way mirror usability test method. In the interest of copying the form of the lab-coated scientist, these rooms create an artificial aura of “science”. But as ethnographic research can tell you, real-world usage is so different as to make the test questionable. It selects for a test population that is free in the middle of the day and motivated by $50 and M&Ms, puts them in an unfamiliar environment with a personal guide, and asks them to focus on a specific task with no distractions. This is about as unrealistic as it gets.
There is an essential flaw in the two-way mirror usability test method…. It selects for a test population that is free in the middle of the day, motivated by $50, and M&Ms.
In reality, the same person may have a child on their lap and only 10 minutes to look up a flight. The fact that an ecommerce session may expire after a few hours is trivial for some, but significant for people who only have a few hours a day to use the computer. “Universal Design” is a great approach, because methods to help specific disabilities tend to be useful to the general public.
Testing should go beyond the user interface and cover the basic business model. The Apple iTunes video download “rental” is for 24 hours. Unfortunately, people tend to watch movies at the same time each day, for example, after the kids go to bed. If your kids wake up, you have to finish it earlier the next day. Would it have killed them to make the rental 27 hours, so parents could actually use it?
Design for the right level of Effective Intelligence
Effective intelligence obviously varies across situations. People are ingenious at figuring out things they really want, but the simplest task is insurmountable to the unmotivated. Both scenarios are solvable, but an application that makes the wrong assumptions about its users will fail. (Interestingly, this study suggests that easier-to-use design can affect the user’s perception of difficulty, and encourage them to complete the task.)
One should adapt their strategy to the user’s desire and the problem’s complexity. Here’s an unscientific matrix for effective intelligence with software interfaces.
This matrix compares the amount a user desires to complete the task versus the complexity of the task to that user type. Different user types will have different measures of complexity, so one might create several matrices.
Low Desire, Low Complexity – The goal here is to finish these tasks as fast as possible. Follow standard design conventions, seek to eliminate steps.
Low Desire, High Complexity – Tasks that the user doesn’t want to do are a danger zone. Can the problem be reconsidered or eliminated?
High Desire, Low Complexity – The easiest quadrant.
High Desire, High Complexity – This is the most interesting quadrant. A self-training interface, (integrated help, training modules) can get the user started; they will often take it the rest of the way. Video games often have a “training” level to train the user on basic skills like moving around.
Effective Intelligence is a helpful concept in the design toolbox. User research and testing are the best ways to know your users, but knowing what may limit a user in reality helps design ways to make them smarter.
Like this article? Want to keep Stephen’s wisdom close at hand? Download the handy, cubicle-friendly, 61kb PDF to hang on a nearby wall and you’ll always remember to design SMART.