<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Boxes and Arrows &#187; John Ferrara</title>
	<atom:link href="http://boxesandarrows.com/author/ferrarajc/feed/" rel="self" type="application/rss+xml" />
	<link>http://boxesandarrows.com</link>
	<description>Boxes and Arrows is devoted to the practice, innovation, and discussion of design; including graphic design, interaction design, information architecture and the design of business.</description>
	<lastBuildDate>Tue, 21 May 2013 13:57:51 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	
		<item>
		<title>Applying Turing&#8217;s Ideas to Search</title>
		<link>http://boxesandarrows.com/applying-turings-ideas-to-search/</link>
		<comments>http://boxesandarrows.com/applying-turings-ideas-to-search/#comments</comments>
		<pubDate>Thu, 28 Aug 2008 21:03:13 +0000</pubDate>
		<dc:creator>John Ferrara</dc:creator>
				<category><![CDATA[Interactivity]]></category>
		<category><![CDATA[Interfaces]]></category>
		<category><![CDATA[Learning From Others]]></category>
		<category><![CDATA[Special topic: Search and Metadata]]></category>

		<guid isPermaLink="false">http://boxesandarrows.com/applying-turings-ideas-to-search/</guid>
		<description><![CDATA[Alan Turing's ideas about artificial intelligence have not panned out exactly as he expected back in the 1950s. These ideas, however, can be used in interface design. John Ferrara shows us how they apply to designing search.]]></description>
				<content:encoded><![CDATA[<p>Here&rsquo;s how the game works: You&rsquo;re on your computer, instant messaging away. One IM session is with a real person and the other is with an artificial intelligence (AI) program that&rsquo;s designed to pose as a human being&nbsp;by&nbsp;using a&nbsp;casual conversational tone. The AI is able to respond in complete sentences with realistic syntax to mask its identity, even throwing in slang, canned humor, or typos.</p>
<p style="margin-left: 80px;"><i>Q: Who&rsquo;s the most famous person in the world?</i></p>
<p style="margin-left: 80px;"><i>A: Used to be Tom Cruise, but hes gone a little crazy LOL <img src='http://www-boxesandarrows-com.zippykid.netdna-cdn.com/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' /> </i></p>
<p>Would you be able to sort out which is the person and which is the machine just by asking them questions?</p>
<p>This game is at the heart of a famous article written by Alan Turing, a critical figure at the inception of the computer age. The Turing test is intended to serve as a litmus test for whether a machine possesses humanlike&nbsp;intelligence.</p>
<p>Although Turing&rsquo;s article was written in 1950, you could still be confident today that if you ask enough questions you&rsquo;ll eventually win the game. It may take a while if the program is particularly well written, but the rough edges of the computer&rsquo;s abilities will inevitably begin to show. You&rsquo;ll catch it claiming to be uninformed about a mainstay of everyday life, failing to grasp an implication, or stringing together phrases with a mechanistic tone that gives it away.</p>
<p style="margin-left: 80px;"><i>Q: How would you describe a sunset to a sightless person?</i></p>
<p style="margin-left: 80px;"><i>A: The sun sets at the end of every day.</i></p>
<p>Gotcha.&nbsp;</p>
<p>&nbsp;</p>
<h2>The Turing Test and User Interfaces</h2>
<p>In December of 2006, while I was conducting usability testing of a search engine, it struck me that the Turing test has something important to teach us about interface design. It describes an ideal form of human-computer interaction in which people express their information needs in their own words, and the system understands and responds to their requests <i>as another human being would</i>. During my usability test, it became clear that this was the very standard to which my test participants held search engines.</p>
<p>Most of our interactions with a website are driven by dumb processes, where either the server or the client machine follows an unambiguous set of instructions: When I click on this link, retrieve that HTML page. When I click the &quot;Date&quot; column, rearrange the records in descending chronological order. When I select a term from a tag cloud, retrieve all documents tagged with that term and order them by their popularity scores.</p>
<p>Computers are intrinsically good at these types of things.</p>
<p>But search technology is different. It shortcuts around a site&#8217;s formal information architecture.&nbsp;When searching, the user doesn&rsquo;t need to figure out the mental model underlying the navigation and&nbsp;site structure; she just needs to say what she wants. Like the computer in Turing&rsquo;s thought experiment, the search engine needs to be able to parse the user&rsquo;s input and determine how to respond. That&rsquo;s easy for a person, but far more difficult for a computer.</p>
<p>Search engines can give the false impression that they speak English, which seems reasonable enough:&nbsp; I ask Google for something about &quot;mars exploration&quot;, and I get back a page full of links about just that (Figure 1). But of course even Google possesses nothing approaching a human understanding of language or ideas; its results are based on matching patterns and crunching quantifiable values.</p>
<p><img width="660" height="518" src="http://www-boxesandarrows-com.zippykid.netdna-cdn.com/files/banda/applying-turings/Figure_1.gif" alt="Google results for mars exploration" title="Turing article figure 1" /></p>
<p><i><font size="2">Figure 1: Google does well with this search; it only needs to match words.</font></i></p>
<p>For many purposes this works extremely well.&nbsp; But there&rsquo;s an enormous gap between any computer&rsquo;s capacity for understanding and that of a human being.&nbsp; Let&rsquo;s say that you want information about the space program that came just before the Apollo missions, but you can&rsquo;t remember what it was called.&nbsp; You search Google for: &ldquo;space mission before Apollo&rdquo;.</p>
<p>Like the program giving itself away in the Turing test, the edges of Google&rsquo;s abilities begin to show (Figure 2).&nbsp; The results focus on the keyword &ldquo;Apollo,&rdquo; which frequently shows up with the words &ldquo;space&rdquo; and &ldquo;mission,&rdquo; completely missing the intended meaning that&rsquo;s obvious to a human being.&nbsp; For this reason, the search fails.&nbsp; In our testing, this type of problem was often the underlying cause when users had difficulty searching successfully.<br />
&nbsp;</p>
<p>&nbsp;<img width="657" height="560" src="http://www-boxesandarrows-com.zippykid.netdna-cdn.com/files/banda/applying-turings/Figure_2.gif" alt="Google results for space missions before apollo.  All discuss Apollo missions, not previous missions." title="Turing article figure 2" /></p>
<address>&nbsp;<font size="2">Figure 2: For this search, the engine&nbsp;would have to&nbsp;match against ideas.</font></address>
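<p>To make the failure concrete, here is a minimal Python sketch of ranking by word overlap alone. The two documents and the <i>keyword_score</i> function are made up for illustration; this is not how Google actually ranks pages, only a toy version of matching words rather than ideas.</p>

```python
# Toy keyword matcher: counts how many query words appear in each document.
# The documents below are hypothetical and exist only for this example.
def keyword_score(query, document):
    query_words = set(query.lower().split())
    doc_words = set(document.lower().split())
    return len(query_words.intersection(doc_words))

documents = {
    "apollo": "the apollo space mission landed astronauts on the moon",
    "gemini": "project gemini flew two person crews in earth orbit",
}

query = "space mission before Apollo"
scores = {name: keyword_score(query, text) for name, text in documents.items()}

# The page about Apollo wins because it shares the most words with the query,
# even though the searcher wanted the program that came before it.
best = max(scores, key=scores.get)
```

<p>The word &ldquo;before&rdquo; carries the whole meaning of the request, yet a matcher like this treats it as just another token that happens not to appear anywhere.</p>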
<p>&nbsp;</p>
<h2>Implications for Design</h2>
<p>Users hold search to a human standard of understanding that computers cannot as yet achieve. This is more than just a curiosity: &nbsp;The Turing test has something to tell us about how we can better design our website search interfaces today. We can find opportunities by posing the question:<br />
&nbsp;</p>
<p style="margin-left: 40px; text-align: left;"><i>Assuming that current technology remains the same,&nbsp;what could we do that would make a computer more convincing in a Turing test?</i></p>
<p>&nbsp;</p>
<h3>The user&rsquo;s role</h3>
<p>If the user has not phrased her search clearly enough for another person to understand what she&rsquo;s trying to find, then it&rsquo;s not reasonable to expect that a comparatively &quot;dumb&quot; machine could do better. In a Turing test, the response to a question incomprehensible even to humans would prove nothing, because it wouldn&rsquo;t provide any distinction between person and machine.</p>
<p>In fact, server logs reveal that this is one of the most common reasons searches fail: users often provide only a vague description of what they want. Worse still, in testing we found that users had difficulty recognizing when their searches weren&rsquo;t well-phrased, and they tended to blame the&nbsp;poor results on the system, not themselves.</p>
<p>At first glance this problem may not seem to tell us very much about the design of search at all, since the user&rsquo;s skill is at issue. But in fact, the designer has the opportunity to help determine the user&rsquo;s input, making it easier for the search system to provide a better response. The Turing test is much easier to pass if you have some influence over the questions the user asks.</p>
<p><i>Suggest functions</i> show a list of popular search phrasings matching the characters the user has entered so far (Figure 3). The user can submit one just by clicking it.</p>
<p>&nbsp;</p>
<p><img src="http://www-boxesandarrows-com.zippykid.netdna-cdn.com/files/banda/applying-turings/Figure_3.gif" alt="options to complete query on solar; includes solar system, solar power, and others" /></p>
<address>
<p><font size="2" face="Arial">Figure 3: Search suggest functions show the most popular phrasings matching the text. The user&nbsp;can select any one of these to submit the search.</font></p>
</address>
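<p>The core of a suggest function can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the <i>search_log</i> list is invented, and a production system would read from real logs and handle far larger volumes.</p>

```python
from collections import Counter

# Hypothetical search log; a real one would come from the site's server logs.
search_log = [
    "solar system", "solar power", "solar system", "solar panels",
    "solar system", "solar power", "mars exploration",
]

def suggest(prefix, log, limit=5):
    """Return the most frequently submitted queries that start with prefix."""
    counts = Counter(query.lower() for query in log)
    prefix = prefix.lower()
    matches = [(query, count) for query, count in counts.items()
               if query.startswith(prefix)]
    # Most popular first; ties are broken alphabetically for stable output.
    matches.sort(key=lambda pair: (-pair[1], pair[0]))
    return [query for query, _ in matches[:limit]]
```

<p>Calling <i>suggest("sol", search_log)</i> returns the three &ldquo;solar&rdquo; phrasings ordered by how often they were submitted, which is the list the user would see under the search box.</p>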
<p>&nbsp;</p>
<p>Suggest functions verge on the revolutionary because they have two important effects on the usability of Search:</p>
<p style="margin-left: 40px;">1.&nbsp;Suggest functions&nbsp;encourage people to select the most specifically worded applicable search from the list. It takes no more work to click on a wordy, descriptive search than it does to click on a short, vague one. This provides more focused results.&nbsp;After implementing a suggest function on Vanguard&rsquo;s intranet, we found that the average length of the 100 most commonly submitted searches had increased by 29%.</p>
<p style="margin-left: 40px;">2.&nbsp;Suggest functions&nbsp;make optimization efforts more effective.&nbsp;In the case of Vanguard&rsquo;s suggest function, we found that the suggested phrasings were much more likely to be submitted than those not on the list.&nbsp;This means that optimizing pages for those suggested phrasings will benefit users more often.</p>
<p>This&nbsp;is a solution that solves a problem so concisely it&rsquo;s bound to become ubiquitous. I would expect that by mid-2010, your website will look behind the times if its search function doesn&rsquo;t include suggestions.</p>
<p>&nbsp;</p>
<h3>The search engine&rsquo;s role</h3>
<p>Let&rsquo;s assume that the user has done a good enough job of phrasing her search so that another person would have a clear understanding of what she&rsquo;s trying to find. With the user upholding her end of the bargain, the onus is then on the search engine to return the best available matches at the very top of the results list. If it doesn&rsquo;t, the search will have failed.</p>
<p>But just as the program in a Turing test will suffer from unavoidable deficiencies, so will search engines. Figure 4 shows typical rankings of the best match for the most commonly submitted, well-phrased queries returned by a fairly good website search engine. While the best result is often returned at the top of the list, there are many instances where it&rsquo;s positioned much further down. This unreliability is common to all search engines.&nbsp;</p>
<p><img src="http://www-boxesandarrows-com.zippykid.netdna-cdn.com/files/banda/applying-turings/Figure_4.gif" alt="Rankings of the best match for the most commonly submitted, well-phrased queries" />&nbsp;</p>
<address>Figure 4</address>
<p>&nbsp;</p>
<p>The Turing test again points toward a solution. The AI program would be more convincing if a human being provided it with canned responses to commonly asked questions. Take the &quot;most famous person&quot; example that opened this article:</p>
<p style="margin-left: 80px;"><i>A: Used to be Tom Cruise, but hes gone a little crazy LOL <img src='http://www-boxesandarrows-com.zippykid.netdna-cdn.com/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' /> </i></p>
<p>While a modern AI program could capably generate convincing responses, one with this kind of personality and cultural insight would almost certainly need to be prewritten. Imagine that the same Turing test is run tens of thousands of times with different participants. Over this many trials, you would be able to see trends in the kinds of questions people ask that give the computer away &ndash; and confidently predict that they would come up again in the future. You could then write custom responses for them, making it seem like the machine actually understands the questions.</p>
<p>Such trend data are readily available in your site&rsquo;s search logs. You can use a list of the most commonly submitted searches to write canned results, usually called &quot;best bets&quot;, to correct the underperforming searches. Best bets serve to fill gaps, patching irregularities in the quality of results. You can&rsquo;t write best bets for every query that will ever be submitted (what would be the point of a search engine?), but working from the search logs lets you have great impact with minimal work.</p>
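<p>As a rough sketch of how this could work, the Python below finds the most common queries in a log and places a hand-picked best bet ahead of whatever the engine returns. The log entries, the URLs, and the <i>best_bets</i> table are all hypothetical.</p>

```python
from collections import Counter

# Hypothetical data: queries from a search log and one hand-picked best bet.
search_log = ["timesheet", "benefits", "timesheet",
              "holiday schedule", "timesheet", "benefits"]
best_bets = {"timesheet": "/apps/timesheet"}

def top_queries(log, n):
    """List the n most commonly submitted queries, most frequent first."""
    return [query for query, _ in Counter(log).most_common(n)]

def search(query, engine_results):
    """Put the curated best bet, if any, ahead of the engine's own ranking."""
    curated = best_bets.get(query.lower().strip())
    if curated is None:
        return engine_results
    return [curated] + [url for url in engine_results if url != curated]
```

<p>Queries without a best bet pass through untouched, so the canned results only patch the searches you have chosen to fix.</p>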
<p>It may already have occurred to you that there&rsquo;s a special synergy between suggest functions and best bets. The former lets you influence the user&rsquo;s input; the latter lets you ensure that the system provides the best possible responses to common queries. They&rsquo;re especially effective in combination, allowing the designer to approach a search system design &ndash; or, for that matter, an AI program for a Turing test &ndash; such that it&nbsp;can be overwhelmingly successful.</p>
<p>&nbsp;</p>
<h2>The Future of Search</h2>
<p>The previous section was specifically limited to current technology. But the Turing test also points to opportunities for future improvements to search. I predict that two developments will contribute most to the advancement of search in the years to come: public ontologies and language parsers.</p>
<p>&nbsp;</p>
<h3>Public ontologies</h3>
<p>Computers fail the Turing test because words have no&nbsp;meaning to them. An ontology is a description of the relationships among things, and thus it imbues words with substance and meaning. Ontologies&nbsp;specify that a steering wheel is a part of a car, a car is a type of automobile, and automobiles are a means of transportation. In the future, we may expect that more search engines will include semantic functions that will make use of these resources to gain greater clarity about what a user&rsquo;s trying to find.</p>
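<p>A tiny ontology built from the example above can be sketched in Python. The dictionary layout and the <i>broader_terms</i> function are my own assumptions for illustration; real ontologies such as WordNet are far richer and use their own formats.</p>

```python
# Hypothetical mini-ontology built from the examples in the text.
# Each entry maps a term to (relationship, broader term).
ontology = {
    "steering wheel": ("part of", "car"),
    "car": ("type of", "automobile"),
    "automobile": ("means of", "transportation"),
}

def broader_terms(term):
    """Follow relationships upward and collect every broader term."""
    chain = []
    while term in ontology:
        _, term = ontology[term]
        chain.append(term)
    return chain
```

<p>A search engine with access to such a chain could recognize that a query about steering wheels is also, more broadly, a query about cars and transportation.</p>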
<p>Several such general-level, public ontologies are currently in development, such as Princeton University&#8217;s&nbsp;<a href="http://wordnet.princeton.edu/">WordNet</a>. But they&rsquo;re dwarfed by the total scope of human understanding across all cultural contexts and outpaced by the continuous development of new information.</p>
<p>I would expect an ontology-building tool to emerge using social factors to allow anyone in the world to contribute, much like a wiki. In time, such a resource might grow large enough to provide computers with an information base so broad and deep that it would become difficult to stump them in a Turing test.</p>
<p>&nbsp;</p>
<h3>Natural language parsers</h3>
<p>Most website search engines are currently based primarily on pattern-matching algorithms. By contrast, any computer in a Turing test must have a robust capability to parse human language. Such capabilities have long existed and even been implemented in search engines like Ask.com, but these functions have fallen into disfavor because few users phrase their searches in complete sentences.</p>
<p>People do, however, use phrases with syntactic structure in their searches. Words used in combination take on meanings that differ from their meanings when used alone. Computers that are sensitive to how an adjective modifies a noun or how a preposition introduces a phrase will come much closer to the user&rsquo;s expectation of a search engine that understands them as well as a human being would.</p>
<p>&nbsp;</p>
<h2>Conclusion</h2>
<p>Alan Turing predicted that 50 years from the time of his article, computers would be sophisticated enough to pass his test. It&rsquo;s now eight years past that date, and I&rsquo;m skeptical that his prediction will ever come true. But today, the thought experiment provides us with a pragmatic way of thinking about search, because the two domains are linked by a common element: the expectations of the user.</p>
<p>&nbsp;</p>
<h4>References</h4>
<p>Turing, A.M. (1950).&nbsp; <a href="http://www.cs.umbc.edu/471/papers/turing.pdf">Computing Machinery and Intelligence</a>. <i>Mind, </i>LIX&nbsp;(236), 433-460. <br />
Rosenfeld, L. (2008).&nbsp; &nbsp;<a href="http://www.slideshare.net/lrosenfeld/site-search-analytics-workshop-presentation">Site Search Analytics for a Better User Experience</a>.&nbsp; Presentation.&nbsp;</p>
<p>&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>http://boxesandarrows.com/applying-turings-ideas-to-search/feed/</wfw:commentRss>
		<slash:comments>15</slash:comments>
		</item>
		<item>
		<title>Search Behavior Patterns</title>
		<link>http://boxesandarrows.com/search-behavior-patterns/</link>
		<comments>http://boxesandarrows.com/search-behavior-patterns/#comments</comments>
		<pubDate>Wed, 30 Jan 2008 17:10:22 +0000</pubDate>
		<dc:creator>John Ferrara</dc:creator>
				<category><![CDATA[Big Ideas]]></category>
		<category><![CDATA[Design Principles]]></category>
		<category><![CDATA[Methods]]></category>
		<category><![CDATA[Special topic: Search and Metadata]]></category>

		<guid isPermaLink="false">http://boxesandarrows.com/search-behavior-patterns/</guid>
		<description><![CDATA[People often search for information online in idiosyncratic ways. It's rarely as straightforward as designers of search systems assume. John Ferrara gives us hope as he helps us think about a broader search ecology and identifies patterns in behavior that serve as the basis for good search design.]]></description>
				<content:encoded><![CDATA[<p>A search engine on an organization&rsquo;s website or intranet is often built to support an overly narrow model of user behavior, which goes something like this:</p>
<ul>
<li>User types in a search</li>
<li>Search engine gives back matching results</li>
<li>User reads the results and picks the best one</li>
</ul>
<p>Simple. Better still, it asks very little of the user interface&mdash;only that it provide some way to submit a search, and some list in response.</p>
<p>&nbsp;</p>
<p>However, such simple models overlook the fact that humans are complex, convoluted, capricious, mutable, moody, multifaceted beings with broadly differing backgrounds, competencies, and frames of reference. (1) In practice, this can make the requirements for search interfaces quite a bit more complicated.</p>
<p>The good news is that while users vary widely in the ways they search, their behaviors follow a limited number of identifiable patterns. By examining the factors that cause variability in user behavior and considering personas that illustrate those variations, we can identify common search behavior patterns and the interface affordances that support them.&nbsp;</p>
<h2>Factors that affect user behavior</h2>
<p>Search behavior is the result of interplay among several independent factors the user brings to the search operation, six of which are described below. Designers have no more control over these than they have over the color of the user&rsquo;s hair.&nbsp;</p>
<p><strong>1. Domain expertise</strong><br />
User behavior has a lot to do with a user&rsquo;s familiarity with the subject on which he or she is searching. When searching outside a domain of expertise, people will be less certain where to start, use less precise language, and have more difficulty evaluating search results. By contrast, experts in a field generally know what verbiage will work best, and so generally get better results, from which they&rsquo;re better able to discern the most useful documents. (2)</p>
<p><strong>2. Search experience</strong><br />
Users who have a better understanding of the breadth of a search engine&rsquo;s capabilities have more ways to go about finding information. If you know how to use Boolean operators, exact strings, filtering controls, and have proven strategies for exploiting search, then you have a much richer toolset at your disposal. But search experience also isn&rsquo;t an absolute requirement for success. We have seen that users who are short on technical know-how but rich in domain knowledge can often get by. On the other hand, technophiles can have great difficulty finding information in an unfamiliar body of knowledge.&nbsp;</p>
<p><strong>3. Cognitive style</strong><br />
User behavior is also influenced by the way users assimilate new information. Researchers like <a href="http://informationr.net/tdw/publ/unis/app7.4.html">Nigel Ford and his colleagues</a> have proposed a number of schemas to describe cognitive style, but for the purposes of search it makes sense to think of it as a spectrum ranging from global to analytical thinking.</p>
<ul>
<li>Global thinkers first try to build a broad level of understanding across related topics.</li>
<li>Analytical thinkers dive right into a single topic and research it thoroughly to resolve a specific problem.</li>
</ul>
<p>Most people lie somewhere between these extremes, sporadically using either cognitive style but tending more often toward one. (3)&nbsp;</p>
<p><strong>4. Goal type</strong><br />
Search goals will vary from one query to the next, and may be broadly classified into three categories as outlined by Andrei Broder in his article &ldquo;<a href="http://www.sigir.org/forum/F2002/broder.pdf">A Taxonomy of Web Search</a>:&rdquo;</p>
<ul>
<li>Navigational searches are efforts to reach a particular location, such as an intranet&rsquo;s timesheet application.</li>
<li>Informational searches seek out any documents relating to a topic, like a description of employee benefits.</li>
<li>Transactional searches occur when the user primarily wants to accomplish something online, like changing her benefits elections.&nbsp;</li>
</ul>
<p><strong>5. Mode of seeking</strong><br />
The extent to which users understand what they are trying to find determines their mode of seeking. The level of understanding can range from known items, where people know exactly what they need and how to describe it, to much more exploratory searches, where they have only a loose concept of what they want to find. (4) Furthermore, as Marcia Bates pointed out in her oft-cited 1989 paper &ldquo;<a href="http://www.gseis.ucla.edu/faculty/bates/berrypicking.html">The Design of Browsing and Berrypicking Techniques for the Online Search Interface</a>,&rdquo;  information needs are often unstable and may evolve as a user learns more about a subject area.&nbsp;</p>
<p><strong>6. Situational idiosyncrasies</strong><br />
To add a final layer of unpredictability, search behavior can vary for the same user with the same task, due to idiosyncrasies in external pressures, working context, temperament, or mood. For example, a user who is nearing a tight deadline is likely to behave very differently than a user who is just leisurely exploring the same topic out of general interest. People can also approach search tasks differently simply if they&rsquo;ve had a bad day, feel tired, stand to make money, or feel especially engaged in a topic.&nbsp;</p>
<h2>Personas</h2>
<p>Grounding abstract ideas in concrete personas can help bring all of these factors to life. Personas are descriptions of typical users that illustrate key attributes that are relevant to the design of a website or online system. An understanding of the motives underlying user actions, like those detailed above, provides a great starting point for authoring personas.</p>
<p>For instance, the hypothetical people described below each illustrate different areas of domain knowledge, and represent a spectrum of search experiences and cognitive styles. They will be used to relate the factors above to the common search behavior patterns that follow.</p>
<ul>
<li>Andrea is a technical wiz who is completely comfortable with search engines. She is a project manager for a mainframe manufacturing division of her company. Her cognitive style tends to be analytical.</li>
<li>Dmitry has moderate technical know-how. He works in the benefits administration division of his company&rsquo;s HR department. He learns new information globally about as often as he does analytically.</li>
<li>Kazue is generally uncomfortable with technology, but is a recognized expert in her field of instructional design. She tends to be a global thinker who prizes an understanding of the big picture.</li>
</ul>
<h2>Patterns of Behavior</h2>
<p>Despite the large number of variables tugging user actions this way and that, they translate into a relatively small number of common patterns of behavior. In my work, I&rsquo;ve observed six broad patterns, described below with recommendations for accommodating each.</p>
<p><b>1. Alternating between search and browse<br />
</b>When searching, users will often select a result that is closest to the topic they have in mind even if it isn&rsquo;t a precise match. They&rsquo;ll then follow the links on that page to find their target information. A global thinker like Kazue might do this if she were exploring an information goal outside of her domain expertise. Unable to formulate the search phrase precisely right, she would need to trust the results returned by the engine. Finding that they&rsquo;re promising but not quite right, she may switch to browsing before returning to the results page.</p>
<p>In effect, searching and browsing can function as a single behavior, with many people moving fluidly between both. These users see no distinction between the two, since both work in support of a single information seeking task. This means that improving the quality of a site&rsquo;s navigation will necessarily also make searches more successful.</p>
<p><span><em>Design recommendations:</em></span></p>
<ul>
<li>Support robust cross-linking on each page, so that when users reach pages that are near matches they can easily get to the best matches.</li>
<li>Include conventional hierarchical cues like breadcrumb trails and contextual navigation, as well as nonhierarchical, associative links among topically related pages. (4)</li>
<li>Don&rsquo;t let pages come to a dead end, without any links to other resources on the site.</li>
</ul>
<p>If Kazue is able to easily cross-link among related pages, this hybridized searching/browsing behavior will be more effective.&nbsp;</p>
<p><b>2. Minimizing the results set<br />
</b>Users sometimes measure the success of a query primarily by the number of results it returns. If they feel the number is too large, they add more terms in an effort to bring back a more manageable set. Given her understanding of how search engines determine relevance, you&rsquo;d expect Andrea to do this if she needed to quickly locate a known item within her domain expertise, like &ldquo;mainframe manufacturing.&rdquo;</p>
<p><em>Design recommendations:</em></p>
<ul>
<li>Allow users to filter the search results by categories, so they can reduce the number of results while making them more topical.</li>
<li>Include a numeric count of the total number of results returned for the query and the total number for each category.</li>
<li>Use &ldquo;and&rdquo; as the default operator rather than &ldquo;or,&rdquo; so the number of results narrows instead of growing as the user adds more terms.</li>
<li>Don&rsquo;t confound this behavior by truncating the total results set at a round number like 100 or 500; this makes it difficult for users like Andrea to gauge the quality of their queries.</li>
</ul>
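<p>The effect of the default operator can be seen in a small Python sketch. The three-document index is hypothetical, and real engines do much more than set comparisons, but the narrowing and widening behavior is the same in principle.</p>

```python
# Hypothetical index: document name mapped to the words it contains.
index = {
    "doc1": {"mainframe", "manufacturing", "schedule"},
    "doc2": {"mainframe", "support"},
    "doc3": {"manufacturing", "safety"},
}

def search(query, operator="and"):
    terms = set(query.lower().split())
    if operator == "and":
        # Every term must appear, so each added term can only shrink the set.
        return sorted(name for name, words in index.items()
                      if terms.issubset(words))
    # With "or", any single term is enough, so added terms can grow the set.
    return sorted(name for name, words in index.items()
                  if terms.intersection(words))
```

<p>With &ldquo;and&rdquo; as the default, adding &ldquo;manufacturing&rdquo; to a search for &ldquo;mainframe&rdquo; cuts the results from two documents to one. With &ldquo;or&rdquo;, the same addition expands them to all three, which is the opposite of what a user trying to minimize the results set expects.</p>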
<p><img width="599" height="443" alt="" src="http://www.boxesandarrows.com/files/banda/search-behavior/Figure_1.gif" /></p>
<p><em>Fig. 1: Filtering mechanisms help users narrow down searches that brought back too many results.</em></p>
<p><b>3. Surveying quickly<br />
</b>Some users scan through the results quickly, and if none of the titles strike them as an ideal match, they may proceed several pages deep into the results set. I&rsquo;ve seen these users go to the fifth or sixth page of results without hesitation, then go back to the initial results to look more carefully or submit another query.</p>
<p>For instance, Dmitry could do this to hedge his strategy if his task isn&rsquo;t fully defined. Hopeful that something will just pop out at him, he may do a quick scan of the first few pages, then fall back to another strategy if that doesn&rsquo;t work out.</p>
<p><em>Design recommendations:</em></p>
<ul>
<li>Ensure that result titles are comprehensible at a glance, including application files like PDFs and Word documents, which often return cryptic file names by default.</li>
<li>Highlight the terms that match the words originally submitted to help people scan the titles and descriptions more easily.</li>
<li>Allow users to change the number of results shown per page to avoid navigating through too many paginated results.</li>
</ul>
<p>These changes will allow Dmitry to evaluate pages more efficiently and successfully.</p>
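<p>Term highlighting, for example, can be sketched with a regular expression in Python. The bracket markers are placeholders I have assumed for the sake of a plain-text example; a real results page would wrap the matches in whatever markup its templates use.</p>

```python
import re

def highlight(text, query, open_mark="[", close_mark="]"):
    """Wrap each whole-word match of a query term in the given markers."""
    terms = [re.escape(term) for term in query.split()]
    if not terms:
        return text
    # Match any query term as a whole word, ignoring case.
    pattern = re.compile(r"\b(" + "|".join(terms) + r")\b", re.IGNORECASE)
    return pattern.sub(lambda m: open_mark + m.group(1) + close_mark, text)
```

<p>Matching is case-insensitive but preserves the original capitalization of the title, so the highlighted result still reads naturally at a glance.</p>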
<p><img width="529" height="481" alt="" src="http://www.boxesandarrows.com/files/banda/search-behavior/Figure_2.gif" /></p>
<p><em>Fig. 2: Search engines often return cryptic file names for application files like PDFs and PowerPoint slideshows.</em><br />
&nbsp;</p>
<p><b>4. Making immediate judgments<br />
</b>Other users look only at the first few results before deciding whether the query was successful or not. Finding nothing, these users may then resubmit the query or give up on search altogether.</p>
<p>Andrea, the analytical thinker, would be discriminating about a result&rsquo;s relevance to a narrowly defined informational goal. Confident in her expertise, she would also be quick to conclude that search is flawed if it cannot return a good match in the first few listings. This behavior requires that the best match be returned as close to the top of the list as possible.</p>
<p><em>Design recommendations:</em></p>
<ul>
<li>Optimize results for the most commonly submitted queries. Working from the search logs, try out each of the top queries and evaluate the quality of the top results returned, then optimize the content of those pages to improve their ranking.</li>
<li>When pages cannot be further optimized, include a manually generated &ldquo;Best Bets&rdquo; sidebar to force those matches to appear at the top. This gives the page a second chance to hit the specific target in Andrea&rsquo;s mind.&nbsp;</li>
</ul>
<p><b>5. Agonizing over the query<br />
</b>Sometimes users have difficulty translating the concept they want to find into a specific search phrase. They will often rewrite the query several times before submitting it, and then focus on revising it further if the results are not as they had expected them to be.</p>
<p>Less experienced users like Kazue are more likely to show this behavior, especially if the task isn&rsquo;t well defined and lies conceptually outside of their domain. Kazue may also be inclined to phrase the query generally enough to satisfy her global cognitive style, but fret over how general is too general.</p>
<p><em>Design recommendations:</em></p>
<ul>
<li>Consider providing tools that assist in formulating the query, such as suggestion functions that present searches similar to the one the user is typing.</li>
<li>Consider including lists of popular searches or automated storage of the user&rsquo;s previous queries, saved to a profile or cookie.</li>
</ul>
<p>Anytime that Kazue can select a query from a list rather than originating it from scratch, she will be able to search much more efficiently.</p>
<p><img width="409" height="265" alt="" src="http://www.boxesandarrows.com/files/banda/search-behavior/Figure_3.gif" /></p>
<p><em>Fig. 3: Suggest functions assist users with formulating queries when they don&rsquo;t quite know how to phrase their request.</em><br />
&nbsp;</p>
<p><b>6. Pogosticking<br />
</b>Some users click several results in rapid succession, quickly sampling each before settling on a best candidate to meet their needs. Jared Spool has described this as &ldquo;pogosticking&rdquo;&mdash;bouncing up and down between choices of uncertain relative value. This is the kind of behavior that Dmitry might resort to if the quick surveying behavior described for him above didn&rsquo;t yield anything. Assuming that his temperament is fairly tolerant and he isn&rsquo;t pressed for time, Dmitry may decide that he cannot determine the usefulness of pages without looking at them. These users need support for three primary tasks: assessing result listings, comparing result pages, and tracking work.</p>
<p><em>Design recommendations:</em></p>
<ul>
<li>Again, provide comprehensible titles and descriptions on the results page, as well as highlighted search terms.</li>
<li>Pages can be compared even more effectively if highlighting is extended to the result pages themselves (as is possible with the Yahoo! and Google toolbars).</li>
<li>Allow users the option to open results in a new browser window to assist comparison. Sites like <a href="http://www.ask.com">Ask</a> and <a href="http://www.easysearchlive.com">Easy Search Live</a> are experimenting with page previews.</li>
<li>Be sure to include a visited link color on the results page. This is absolutely essential for Dmitry to keep track of the pages he has already tried and rejected as he jumps to each of the matches from the hub listing page.</li>
</ul>
<p><img width="505" height="439" alt="" src="http://www.boxesandarrows.com/files/banda/search-behavior/Figure_4.gif" /></p>
<p><em>Fig. 4: Visited link colors help the user avoid revisiting results that have already been tried and rejected.</em></p>
<h2>Conclusion</h2>
<p>Search behavior varies with domain expertise and technical knowledge, cognitive style, goal, and mode of seeking. All of these factors will interact in complex ways to influence a user&rsquo;s actions. Even then, behaviors will vary depending upon whether at that moment the user is under pressure, in a good mood, or any number of other idiosyncrasies.</p>
<p>The point is that the designer cannot select the behavior that a user will follow when conducting a search. This may invite the impression that the design must be overly broad, providing any conceivable function regardless of the likelihood it will be used, because we cannot predict whether it will be needed. Fortunately, users&rsquo; actual behaviors do fall into generally describable patterns, each of which has dependencies upon specific affordances of the interface. This is how designers can better cater to what appears to be chaos: make available those capabilities that best support the range of known behavior patterns for your target personas.&nbsp;</p>
<h4>References</h4>
<p>(1) James Kalbach provides an overview of literature around this topic in his article &ldquo;<a href="http://www.internettg.org/newsletter/dec00/article_information_foragers.html">Designing for Information Foragers: A Behavioral Model for Information Seeking on the World Wide Web</a>.&rdquo;</p>
<p>(2) For more on expert search behavior, see these two articles: Christoph H&ouml;lscher &amp; Gerhard Strube (2000): &ldquo;<a href="http://www9.org/w9cdrom/81/81.html">Web Search Behavior of Internet Experts and Newbies</a>&rdquo;; and Suresh K. Bhavnani (2002): &ldquo;Domain-Specific Search Strategies for the Effective Retrieval of Healthcare and Shopping Information,&rdquo; <span class="caps">CHI 2002</span>, pp. 610-611.</p>
<p>(3) See Ryen W. White &amp; Steven M. Drucker (2007): &ldquo;Investigating Behavioral Variability in Web Search,&rdquo; International World Wide Web Conference 2007, pp. 21-30.</p>
<p>(4) See Donna Maurer (2006): &ldquo;<a href="http://www.boxesandarrows.com/view/four_modes_of_seeking_information_and_how_to_design_for_them">Four Modes of Seeking Information and How to Design for Them</a>.&rdquo;</p>
<p>(5) David Fiorito and Richard Dalton further described different types of navigation in their presentation at the 2004 IA Summit, &ldquo;<a href="http://www.iasummit.org/2004/finalpapers/FioritoDalton_Handout_or__final__paper.ppt">Creating a Consistent Enterprise Web Navigation Solution</a>.&rdquo;</p>
<p>&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>http://boxesandarrows.com/search-behavior-patterns/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Strategies for Improving Enterprise Search</title>
		<link>http://boxesandarrows.com/strategies-for-improving-enterprise-search/</link>
		<comments>http://boxesandarrows.com/strategies-for-improving-enterprise-search/#comments</comments>
		<pubDate>Tue, 11 Sep 2007 19:27:08 +0000</pubDate>
		<dc:creator>John Ferrara</dc:creator>
				<category><![CDATA[Findability]]></category>
		<category><![CDATA[Special topic: Search and Metadata]]></category>

		<guid isPermaLink="false">http://boxesandarrows.com/strategies-for-improving-enterprise-search/</guid>
		<description><![CDATA[Installing a search engine is just the beginning of creating an effective enterprise search system. John Ferrara addresses critical aspects of the user experience often overlooked or ignored.  ]]></description>
				<content:encoded><![CDATA[<p>It&#8217;s common for enterprise website developers to implement a search engine with out-of-the-box functionality, point it at their content repositories, and then just leave it at that. Search is becoming something of a neglected orphan, in part because packaged search products are relatively easy to implement, and then even more easily forgotten.</p>
<p>Unfortunately, the results are too often plagued by problems. You know something&#8217;s gone wrong when a perfectly clear query returns results that are not only irrelevant, but seemingly deranged. Pages with a logical relationship to the initial request compete for placement among what Jared Spool fittingly calls &#8220;wacko results.&#8221;<sup><a href="#fn1">1</a></sup>  The majority of participants walking into my usability tests report they don&#8217;t trust embedded site search to help them find what they&#8217;re looking for. </p>
<p>Quality search results only come about through applied effort, requiring in particular the skills of an information architect.<sup><a href="#fn2">2</a></sup>  And IAs must be ready to go well beyond their traditional front-end role, digging into the functional backend and source data of the search engine. This article outlines how we can bolster findability and win back users&#8217; confidence.</p>
<h1>Conceptualizing the Task</h1>
<p>The results of any given search are impossible to predict with precision (short of having tried it before). That&#8217;s because five distinct variables combine to determine its outcome (Figure 1):</p>
<ol>
<li><strong>Search engine.</strong> The algorithmic gears that parse the query and assign pages relevance.</li>
<li><strong>Content.</strong> The documents searched.</li>
<li><strong>Index.</strong>  A catalog of the locations of every word in every document. This is what allows Google to miraculously find 5 billion instances of the word &#8220;the&#8221; in 0.2 seconds.</li>
<li><strong>User input.</strong>  The keywords and other parameters the user submits.</li>
<li><strong>Results display.</strong>  The way the data returned by the search engine is presented.</li>
</ol>
<p><img src="http://www-boxesandarrows-com.zippykid.netdna-cdn.com/files/banda/strategies-for/ferrara_strategies_fig1.gif" width="500" height="334" alt="ferrara_strategies_fig1.gif" /></p>
<p><strong>Figure 1. Five variables that determine the success of a site search. </strong></p>
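<p>To make the idea of an index concrete, here is a minimal sketch of a catalog of word locations. It is an illustration only, not how any particular search product works; the page names and text are hypothetical.</p>

```python
import re
from collections import defaultdict

def build_index(documents):
    """Map each word to the (document id, word position) pairs where it occurs."""
    index = defaultdict(list)
    for doc_id, text in documents.items():
        for position, word in enumerate(re.findall(r"[a-z0-9']+", text.lower())):
            index[word].append((doc_id, position))
    return index

def lookup(index, word):
    """Return the sorted ids of documents containing the word, without rescanning the text."""
    return sorted({doc_id for doc_id, _ in index.get(word.lower(), [])})

# Hypothetical two-page site used only for illustration.
pages = {
    "jupiter.html": "Jupiter is the largest planet",
    "observing.html": "Observing the planets: Jupiter, Saturn, and Mars",
}
index = build_index(pages)
print(lookup(index, "Jupiter"))  # ['jupiter.html', 'observing.html']
print(lookup(index, "largest"))  # ['jupiter.html']
```

<p>Because the lookup reads the catalog rather than the documents, its cost depends on how often the word occurs, not on how much content the site holds.</p>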
<p>Critically, the search engine isn&#8217;t the only factor that determines the outcome, so search can&#8217;t be seen purely as a technology problem. It&#8217;s important for organizations to realize that their investment in search doesn&#8217;t end with the product&#8217;s implementation; the most successful approaches will go further to include strategies addressing all of the outside variables.</p>
<h1>Strategies</h1>
<p>Several engine products allow you to tweak the search engine&#8217;s algorithm itself, but I don&#8217;t recommend it. That would be like doing brain surgery to fix a speech impediment&mdash;whether or not you solve that problem, you&#8217;ll inevitably cause a great many more. Changing the algorithm affects all searches, including the ones that already work just fine. So it&#8217;s easiest to keep it stable and modify the factors surrounding it.</p>
<p>Taking the search engine as a constant, then, there are four variables that affect the quality of search. Strategies for improving each of these are proposed below.</p>
<h2>Strategy 1: Make the Content Machine-Readable</h2>
<p>Search engines can provide better results when they&#8217;re given better content. The trick is to provide a basis for inferring the content&#8217;s meaning.</p>
<h3>Structural Markup</h3>
<p>The XHTML structure of pages is relevant to the IA, because content that is more machine-readable will be easier to find using search. Pages should extensively use the correct semantic elements: &lt;h1&gt; through &lt;h6&gt;, &lt;p&gt;aragraph, &lt;q&gt;uotation, &lt;caption&gt;, and so on, as well as semantically named &#8220;class&#8221; attributes.  This will help the search engine compare the usage of terms among pages, to distinguish the central topic of a page from peripheral concepts (Figure 2). While IAs typically don’t mark up individual pages, they can influence the process by specifying template-level semantic elements in their wireframes and participating in periodic content reviews. </p>
<p><img src="http://www-boxesandarrows-com.zippykid.netdna-cdn.com/files/banda/strategies-for/ferrara_strategies_fig2.gif" width="550" height="413" alt="ferrara_strategies_fig2.gif" /></p>
<p><strong>Figure 2. Structural markup explains that Jupiter is the central topic of page A, while in page B it&#8217;s just one of several subpoints on observing planets.</strong></p>
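<p>One way a search engine might use structural markup is to weight terms by the element they appear in. The sketch below assumes that approach; the weights are hypothetical, and each page is represented as a list of (element name, text) pairs standing in for parsed XHTML.</p>

```python
import re
from collections import Counter

# Hypothetical weights: terms in headings count for more than terms in body text.
TAG_WEIGHTS = {"h1": 10, "h2": 5, "h3": 3, "p": 1}

def term_scores(blocks):
    """Score each term on a page by the weight of the elements it appears in."""
    scores = Counter()
    for tag, text in blocks:
        weight = TAG_WEIGHTS.get(tag, 1)  # unknown elements are treated as body text
        for word in re.findall(r"[a-z]+", text.lower()):
            scores[word] += weight
    return scores

# Page A is about Jupiter; page B mentions it as one subpoint.
page_a = [("h1", "Jupiter"), ("p", "Jupiter is a gas giant with many moons.")]
page_b = [("h1", "Observing the Planets"), ("h3", "Jupiter"), ("p", "Look for Jupiter after dusk.")]

print(term_scores(page_a)["jupiter"])  # 11
print(term_scores(page_b)["jupiter"])  # 4
```

<p>Both pages mention the term twice, yet page A scores higher because one of its mentions is the top-level heading.</p>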
<h3>Standard Meta Tags</h3>
<p>Most websites use keywords and descriptions in meta tags, but not often as part of a larger strategy. The first step is to create a controlled vocabulary, a standardized set of keywords.<sup><a href="#fn3">3</a></sup>  If you tag instructors as &#8220;teachers&#8221; on one page but &#8220;professors&#8221; on another, the search engine will have a hard time understanding that they&#8217;re the same thing. The keywords should also reflect actual terminology from the page itself (especially headings) and be reinforced in the description tag.</p>
<h3>More Metadata</h3>
<p>Go beyond keywords. Tags that describe the target audience groups, the sector of a financial service, or the cuisine of a recipe page provide more ways to compare and contrast the content; search engines will read as much meta information as you give them. There is a practical limit to how much you can do, which makes user-defined tags well worth considering.</p>
<h3>Ontology</h3>
<p>Humans know that pugs are dogs, and dogs chase cats, and cats play with yarn, but these relationships are lost on computers. An ontology is a list of concepts linked by the ways they relate to one another (Figure 3), helping the search engine grasp the content&#8217;s meaning. If your search product supports ontologies (several do), this can significantly improve the quality of the results.<sup><a href="#fn4">4</a></sup></p>
<p><img src="http://www-boxesandarrows-com.zippykid.netdna-cdn.com/files/banda/strategies-for/ferrara_strategies_fig3.gif" width="500" height="210" alt="ferrara_strategies_fig3.gif" /></p>
<p><strong>Figure 3. An ontology explains the relationships between concepts.</strong></p>
<h2>Strategy 2: Index All of the Right Data</h2>
<p>Indexes have made searching remarkably expedient, but the way they&#8217;re built has a lot to do with the quality and reliability of results. Proper indexing requires taking a hands-on approach, and the IA has an interest in working with the development team to influence it.</p>
<h3>Ignoring Unnecessary Content</h3>
<p>Search engines will automatically index the entire content of a page, regarding everything as equally important. This is a problem because the navigation, for example, will contain terms that are specifically relevant to the siblings, parents, and children of a page, and not to the page itself (Figure 4). There are several methods of excluding this content; the important thing is to make sure that it&#8217;s done, because this is one of the most common reasons why searches return bizarre results.</p>
<p><img src="http://www-boxesandarrows-com.zippykid.netdna-cdn.com/files/banda/strategies-for/ferrara_strategies_fig4.gif" width="500" height="366" alt="ferrara_strategies_fig4.gif" /></p>
<p><strong>Figure 4.  A search for &#8220;Neptune&#8221; may return results that include this page about Jupiter because the term &#8220;Neptune&#8221; appears here in the navigation.</strong></p>
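<p>As a rough sketch of one exclusion method, the indexer can skip any region of the page template that is labeled as navigation. The region labels here are assumptions made for illustration; actual products offer their own mechanisms.</p>

```python
import re

# Hypothetical region labels; a real template might mark these with class names or comments.
EXCLUDED_REGIONS = {"nav", "footer"}

def indexable_words(regions):
    """Collect the words from the regions of a page that should be indexed."""
    words = []
    for region, text in regions:
        if region in EXCLUDED_REGIONS:
            continue  # these terms describe other pages, not this one
        words.extend(re.findall(r"[a-z]+", text.lower()))
    return words

jupiter_page = [
    ("nav", "Mercury Venus Earth Mars Jupiter Saturn Uranus Neptune"),
    ("main", "Jupiter is the fifth planet from the Sun."),
    ("footer", "Contact us"),
]
words = indexable_words(jupiter_page)
print("neptune" in words)  # False
print("jupiter" in words)  # True
```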
<h3>Getting All Resources</h3>
<p>Users reasonably expect a search to return all of the website&#8217;s relevant publicly available documents.  Unfortunately, many search products can&#8217;t index .pdf, .doc, .xls, .ppt, and similar files, and you can forget about content locked away in audio or video files. The best fix is to convert application files to XHTML and provide transcripts or summaries of multimedia files. This can be a big job, so you may want to initially convert just the most commonly accessed documents.</p>
<h2>Strategy 3: Make the Most of User Input</h2>
<p>It can be difficult to figure out how to phrase a query. Users have to express what are often complicated concepts in that particular set of words that a given search engine will like. It&#8217;s important to make the most of what users submit on their first attempt, because they&#8217;re much less likely to make a second.<sup><a href="#fn5">5</a></sup></p>
<h3>Query Expansion</h3>
<p>All contemporary search vendors offer some type of query expansion, where the search engine automatically looks for words related to the ones the user actually entered (Figure 5). Word stemming, which searches for different forms of the same word, is usually enabled by default.  However, the thesaurus, which searches for equivalent and related terms, requires manual work.<sup><a href="#fn6">6</a></sup>
</p>
<p><img src="http://www-boxesandarrows-com.zippykid.netdna-cdn.com/files/banda/strategies-for/ferrara_strategies_fig5.gif" width="500" height="284" alt="ferrara_strategies_fig5.gif" /></p>
<p><strong>Figure 5.  Searches shouldn&#8217;t only look for the terms as the user entered them, but for related and alternate forms of those terms.</strong></p>
<p>You can go overboard defining synonyms, but the problem is usually too little (by which I mean &#8220;none at all&#8221;) rather than too much.<sup><a href="#fn7">7</a></sup> Search logs are the best resource for discovering synonyms, related terms, and common misspellings. Set up ongoing reviews to add terms that users actually submit to the thesaurus, drawn from the wealth of data that&#8217;s freely available in the logs. The number of successful first attempts will rise dramatically over time.</p>
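<p>The following minimal sketch combines the two kinds of expansion described above. The thesaurus entries are hypothetical, and the suffix stripping is a deliberately crude stand-in for the stemmer a real product would ship with.</p>

```python
# Hypothetical thesaurus entries, of the kind a team might collect from search logs.
THESAURUS = {
    "teacher": {"professor", "instructor"},
    "jupitor": {"jupiter"},  # common misspelling mapped to the preferred term
}

def stem(word):
    """Very rough suffix stripping; real engines use a proper stemmer."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def expand_query(query):
    """Return the original terms plus their stems and thesaurus equivalents."""
    expanded = set()
    for term in query.lower().split():
        root = stem(term)
        expanded.update({term, root})
        expanded.update(THESAURUS.get(root, set()))
    return expanded

print(sorted(expand_query("teachers")))
# ['instructor', 'professor', 'teacher', 'teachers']
```

<p>Each term found in a log review becomes one more entry in the thesaurus table, so the manual work accumulates in a single place.</p>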
<h3>Syntax Conventions</h3>
<p>Users should be able to submit searches in whatever way they learned to write them. Unfortunately, search engines have different syntaxes for the standard operators (And, Or, Not, exact string). You can&#8217;t rely on a help file&mdash;it&#8217;s one of people&#8217;s least favorite things to read. The parser should instead be scripted to accept all common syntax conventions, so the user doesn&#8217;t have to guess. It should also use &#8220;And&#8221; as the default operator, which will appropriately limit the results downward as more terms are added to the search.</p>
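<p>A parser along these lines could be sketched as follows. The list of accepted spellings is an assumption, not a survey of real engines, and the canonical output format is invented for the example.</p>

```python
import re

# Operator spellings accepted from the user, mapped to one canonical form (assumed list).
OPERATOR_ALIASES = {"and": "AND", "+": "AND", "or": "OR", "|": "OR", "not": "NOT", "-": "NOT"}

def parse_query(query):
    """Normalize a query into (operator, term) pairs, defaulting to AND."""
    clauses = []
    pending = "AND"  # default operator when the user gives none
    # Tokens are quoted phrases (optionally prefixed with + or -) or runs of non-space text.
    for token in re.findall(r'[+-]?"[^"]*"|\S+', query):
        alias = OPERATOR_ALIASES.get(token.lower())
        if alias:
            pending = alias  # standalone operator such as AND, OR, NOT
            continue
        if token[0] in "+-" and len(token) > 1:
            pending = OPERATOR_ALIASES[token[0]]  # prefix form such as +moons or -rings
            token = token[1:]
        clauses.append((pending, token.strip('"')))
        pending = "AND"
    return clauses

print(parse_query("jupiter AND moons NOT rings"))
print(parse_query("jupiter +moons -rings"))
# Both print [('AND', 'jupiter'), ('AND', 'moons'), ('NOT', 'rings')]
```

<p>Whichever convention the user learned, the engine receives the same normalized clauses, and a bare list of words is treated as an &#8220;And&#8221; search.</p>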
<h3>Assisting Query Formulation</h3>
<p>As users type, suggestion functions present a list of similar queries that other people have tried. This makes a lot of sense, since it can be difficult to put a complex idea into words or to recall the precise name of an item. Stellar examples of suggest functions include <a href="http://labs.google.com/suggest">Google Suggest</a>, <a href="http://livesearch.alltheweb.com/">AllTheWeb</a>, and <a href="http://www.apple.com">Apple&#8217;s website</a>.</p>
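<p>A simple suggestion function can be sketched as a prefix match against the search log, ranked by how often each query was submitted. This is one possible approach under that assumption, and the log entries are made up.</p>

```python
from collections import Counter

def suggest(prefix, query_log, limit=5):
    """Return the most frequent logged queries that start with what the user has typed."""
    prefix = prefix.strip().lower()
    if not prefix:
        return []
    counts = Counter(q.strip().lower() for q in query_log)
    matches = [(q, n) for q, n in counts.items() if q.startswith(prefix)]
    # Most frequent first; ties broken alphabetically for stable output.
    matches.sort(key=lambda item: (-item[1], item[0]))
    return [q for q, _ in matches[:limit]]

# Hypothetical search log.
log = ["jupiter moons", "jupiter", "Jupiter moons", "juno mission", "saturn rings", "jupiter mass"]
print(suggest("ju", log))
# ['jupiter moons', 'juno mission', 'jupiter', 'jupiter mass']
```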
<h2>Strategy 4: Build the Results Page Around the User&#8217;s Needs</h2>
<p>The results page should be designed to help users find matches for their interests as quickly as possible. This is closer to the IA&#8217;s typical interface design role, yet it&#8217;s still uncommon to see much more than the vendor&#8217;s out-of-the-box functionality on search results pages.</p>
<h3>Showing Relevance</h3>
<p>Sometimes a search engine will return the right results, but the user will fail to recognize them. Users need to see why results are relevant to their searches. There are two simple ways to do this.</p>
<p>The first is to show a text excerpt from the page that contains the terms from the user&#8217;s query, instead of the &lt;meta&gt; description field. The description may vary greatly from the user&#8217;s entered query, especially on long pages, and it may not be at all clear why a particular page was retrieved. Instead, an excerpt of the actual content that matches the search will directly explain why a user might want to click through to that page.</p>
<p>The second way to show relevance is to bold the terms in the excerpt that match terms in the user&#8217;s original query. That will help the user to quickly scan the page for the results that have the right words in the right context (Figure 6).</p>
<p><img src="http://www-boxesandarrows-com.zippykid.netdna-cdn.com/files/banda/strategies-for/ferrara_strategies_fig6.gif" width="500" height="375" alt="ferrara_strategies_fig6.gif" /></p>
<p><strong>Figure 6.  Excerpting and term highlighting allow the user to understand how each result relates to the query, and quickly identify the ones that are most relevant.</strong></p>
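<p>Both techniques can be sketched in a few lines. This is an illustrative approach, not a vendor feature: the excerpt is a fixed-width window around the first matching term, and the highlight markers default to square brackets so the example stays plain text. On a real results page the markers would be the opening and closing bold tags.</p>

```python
import re

def excerpt(text, terms, radius=40):
    """Return a window of text around the first query term found, or the opening text."""
    lowered = text.lower()
    positions = [lowered.find(term.lower()) for term in terms]
    positions = [p for p in positions if p != -1]
    start = max(min(positions) - radius, 0) if positions else 0
    end = min(start + 2 * radius, len(text))
    return text[start:end]

def highlight(snippet, terms, before="[", after="]"):
    """Wrap each whole-word occurrence of a query term in the given markers."""
    if not terms:
        return snippet
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(term) for term in terms) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda match: before + match.group(0) + after, snippet)

# Hypothetical page text.
page_text = "Jupiter is the largest planet. Its Great Red Spot is a storm."
print(highlight(excerpt(page_text, ["storm"], radius=10), ["storm"]))
# Spot is a [storm].
```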
<h3>Best Bets</h3>
<p>Despite all optimization efforts, search engines sometimes still miss strong associations that are obvious to people. In cases where particular keywords should be returning specific pages, it can be helpful to include a list of manually specified &#8220;Best Bets,&#8221; triggered by business rules (Figure 7).<sup><a href="#fn8">8</a></sup>  This reintroduces the designer&#8217;s influence into search, smoothing out irregularities in the reliability of automated results.</p>
<p><img src="http://www-boxesandarrows-com.zippykid.netdna-cdn.com/files/banda/strategies-for/ferrara_strategies_fig7.gif" width="500" height="292" alt="ferrara_strategies_fig7.gif" /></p>
<p><strong>Figure 7.  Best bets allow the designer to force particular pages to be returned when the user&#8217;s query contains a specific string.</strong></p>
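<p>A Best Bets rule table can be as simple as a mapping from trigger strings to pages. The sketch below assumes substring matching on the query; the triggers and URLs are hypothetical.</p>

```python
# Hypothetical business rules: a trigger string and the pages to pin for it.
BEST_BETS = {
    "extrasolar": ["/planets/extrasolar/"],
    "jupiter": ["/planets/jupiter/", "/planets/jupiter/moons/"],
}

def best_bets(query):
    """Return manually chosen pages for every trigger string found in the query."""
    normalized = query.lower()
    pinned = []
    for trigger, urls in BEST_BETS.items():
        if trigger in normalized:
            for url in urls:
                if url not in pinned:
                    pinned.append(url)
    return pinned

def results_with_best_bets(query, engine_results):
    """Place Best Bets first, then the engine results that are not already pinned."""
    pinned = best_bets(query)
    return pinned + [url for url in engine_results if url not in pinned]

print(results_with_best_bets("Jupiter moons", ["/news/", "/planets/jupiter/"]))
# ['/planets/jupiter/', '/planets/jupiter/moons/', '/news/']
```

<p>Keeping the rules in a table separate from the engine means they can be reviewed and edited without touching the ranking algorithm.</p>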
<h3>Conditional Content</h3>
<p>Taking Best Bets one step further, consider including contextually appropriate content in the search results page when a string in the user&#8217;s query indicates the user probably has a particular interest in mind.  For example, a user searching for &#8220;extrasolar planets&#8221; on an astronomy website might appreciate a results page that includes a list comparing the properties of all planets discovered beyond our solar system.</p>
<h1>Conclusion</h1>
<p>This article introduces just some of the steps that you can take to improve the overall search experience on your site. The reliability of enterprise search needs significant improvement to reestablish user confidence, and IAs should take the lead. To get there, a product&#8217;s out-of-the-box functionality must not be seen as the end, but as just the beginning. </p>
<p>
<strong>REFERENCES</strong></p>
<ul>
<li>
<p id="fn1"><sup>1</sup> Jared Spool: <a href="http://www.uie.com/brainsparks/2006/04/14/bbc-reports-users-lose-patience-with-poor-search-2/">&#8220;BBC Reports Users Lose Patience with Poor Search&#8221;</a></p></li>
<li>
<p id="fn2"><sup>2</sup> Lou Rosenfeld &#038; Peter Morville, <em>Information Architecture for the World Wide Web</em>, pp. 136-137.</p></li>
<li>
<p id="fn3"><sup>3</sup> Fred Leise, Karl Fast, and Mike Steckel: <a href="http://www.boxesandarrows.com/view/creating_a_controlled_vocabulary">&#8220;Creating a Controlled Vocabulary&#8221;</a></p></li>
<li>
<p id="fn4"><sup>4</sup> Tim Berners-Lee: <a href="http://www.sciam.com/article.cfm?articleID=00048144-10D2-1C70-84A9809EC588EF21&#038;catID=2">&#8220;The Semantic Web&#8221;</a></p></li>
<li>
<p id="fn5"><sup>5</sup> Jared Spool: <a href="http://www.uie.com/articles/users_search_once/">&#8220;People Search Once, Maybe Twice&#8221;</a></p></li>
<li>
<p id="fn6"><sup>6</sup> Christina Wodtke, <em>Information Architecture: Blueprints for the Web</em>, pp. 137-140.</p></li>
<li>
<p id="fn7"><sup>7</sup> Lou Rosenfeld &#038; Peter Morville, <em>Information Architecture for the World Wide Web</em>, pp. 188-189.</p></li>
<li>
<p id="fn8"><sup>8</sup> Chris Farnum: <a href="http://www.slideshare.net/ChrisFarnum/tuning-up-site-search-ia-summit-2007">&#8220;Tuning up Site Search&#8221;</a></p></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://boxesandarrows.com/strategies-for-improving-enterprise-search/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
	</channel>
</rss>
