Amanda Spink is one of the smartest people working on user behavior while using web search, yet when I mentioned her name to a friend who’s spent the last year working on the search user experience, he had never heard of her. The design community is woefully undereducated about search, and is often prone to redesigning Google and postulating what Yahoo! is doing wrong, rather than working to understand why search engines have chosen to do what they do. I suppose this shouldn’t be surprising, though, considering Spink’s work is more often seen in scholarly journals, such as New Directions in Human Information Behavior and Journal of the American Society for Information Science and Technology (brought to you by the same folks who bring you the IA Summit, yet rarely cracked by the working folks).
To help correct this problem, I shyly contacted my hero and overcame the time difference between sunny California and even sunnier Brisbane, Australia, with a series of email questions.
Christina Wodtke: When I joined Yahoo!, I had never worked on search before. Your article, “From E-Sex to E-Commerce: How Search Changes” was one of the most valuable in beginning to get my mind around the problem of search. Since reading that article, however, I haven’t seen much change from your findings. In your opinion, are users changing their search behavior, or are they still following the same patterns you found when you analyzed Excite’s data?
Amanda Spink: You make a good observation. Since 1997, our findings have come from analyzing large-scale web user data gathered from commercial web companies, including Excite, Ask Jeeves, Alltheweb.com, Alta Vista, Vivisimo, and Dogpile. Our research since 1997 shows some trends and changing patterns in general searching. However, looking at more recent data from Vivisimo and Dogpile, most web queries are still short—2 to 3 terms, and sessions include little query modification and are generally 2-3 queries in length.
Few people use advanced search features, and many queries include spelling and other mistakes that adversely affect the search results. People look at only a few result pages—not beyond the first or second results pages.
A small number of terms are used with high frequency, and many terms are used once. Web queries are very diverse in topic and some, such as people’s names, are unique.
CW: You’re referring to the “long tail” of a Zipf curve? How does this affect search engines’ strategies in providing relevant results?
AS: The long tail makes “relevant” retrieval very difficult, especially if users are only providing the search engine with two to three words on which to base relevance judgments. Despite the inherent “interactive” nature of search, much of search is not very interactive. The talk about personalization is an attempt to obtain more information from the user to help determine relevance.
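For readers who want to see the pattern Spink describes in their own site-search logs, here is a minimal sketch of the measurements involved: average query length, the most frequent terms, and the share of terms that appear only once. The sample queries and the `summarize` function are hypothetical illustrations, not data or methods from Spink's studies, and whitespace tokenization is a simplifying assumption.

```python
from collections import Counter

# Hypothetical sample of queries; a real analysis would read a search log.
queries = [
    "cheap flights",
    "weather",
    "cheap hotels brisbane",
    "amanda spink",
    "weather brisbane",
    "sailboats",
]


def summarize(queries):
    """Return mean query length, top terms, and the share of terms used once."""
    # Assumption: terms are split on whitespace and compared case-insensitively.
    term_lists = [q.lower().split() for q in queries]
    counts = Counter(term for terms in term_lists for term in terms)
    mean_length = sum(len(terms) for terms in term_lists) / len(term_lists)
    singletons = sum(1 for c in counts.values() if c == 1)
    return {
        "mean_terms_per_query": mean_length,
        "top_terms": counts.most_common(3),
        "share_of_terms_used_once": singletons / len(counts),
    }


summary = summarize(queries)
print(summary)
```

Even in this tiny sample, a few terms repeat while most distinct terms occur once, which is the shape of the long tail discussed above.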
CW: So are users changing their overall behavior at all?
AS: We are seeing a growth in more complex search behaviors. More people are searching for information using more than one search. This might mean repeat searches of the same query over time or modifying the queries in successive searches over time. Many people are multitasking or searching for information on more than one topic during a search session. People’s information needs are often quite complex in their home and work environments.
CW: How are they handling that?
AS: During multitasking search they may include two or more topics in one window, or open new windows for each topic and run searches concurrently.
CW: Could you talk a bit more about “complex search behaviors”?
AS: People’s information-seeking behavior can often be long and complex. Imagine a person is looking for information on cars. He conducts one search on one search engine, looks at the results, and tries another search engine, or goes back to the first search engine and repeats the same search with the same search terms, or he may add or remove some terms (query reformulation). This is called successive searching.
In addition, research shows that people often search for more than one topic during their interaction with a search engine. They may batch their topics due to time constraints or new topics may evolve during their search session. This is called multitasking search.
Both phenomena are examples of more complex behaviors beyond the one-topic, one-search paradigm that most search engines assume.
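As a rough illustration of the two behaviors Spink defines, the sketch below labels each step in a single session as a repeat, a reformulation, or a new topic. Using shared terms as a stand-in for "same topic" is an assumption made here for simplicity, not Spink's method; the `classify_step` function and the sample session are hypothetical.

```python
def classify_step(previous, current):
    """Label how `current` relates to `previous` using term overlap only."""
    prev_terms = set(previous.lower().split())
    curr_terms = set(current.lower().split())
    if prev_terms == curr_terms:
        return "repeat"          # same terms submitted again
    if prev_terms & curr_terms:
        return "reformulation"   # terms added or removed, some kept
    return "new topic"           # assumption: no shared terms means a topic change


# Hypothetical session from one user, in the order the queries were submitted.
session = [
    "used cars",
    "used cars",
    "used cars brisbane",
    "sailing classes michigan",
]

labels = [classify_step(a, b) for a, b in zip(session, session[1:])]
is_multitasking = "new topic" in labels
print(labels, is_multitasking)
```

Repeats and reformulations correspond to successive searching, and a topic change within the session is a crude signal of multitasking search. A real analysis would need a better notion of topic than term overlap.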
CW: If queries are still so short, what are some of the more successful disambiguation tools used by the search engine? Vivisimo and Clusty offer algorithmically generated groupings and present them as narrowing tools. But last time I was testing these tools with users, the narrowing options were essentially invisible to them. And despite Jakob Nielsen’s assurance (admittedly in 1999) that longer search boxes produce longer queries, I’ve never seen it happen. Are there better and worse ways to encourage richer queries from users?
AS: So far no commercial tool seems to be effective at helping users on a large scale. Search engines have not used longer boxes, so no one really knows what would happen on a large scale if text boxes were changed. The best way to encourage richer queries is to train users and expect them to put more effort into their search behavior. Search engines need to put more demands on the users. People don’t understand their own information behaviors, and they don’t really understand much about search or the web, so they will have to learn. It could take generations.
CW: Really? Many folks who revamp search, either by adding Google, Yahoo!, or another vendor, seem to be leaning toward long entry boxes. I’m thinking about CNN, NY Times.com, CNET.
AS: The Google and CNN text boxes may be a little bigger or longer than average, but not substantially longer. How about a structured textbox, like an electronic library catalog interface? How about a textbox that is 3 inches by 3 inches with lots of space for people to express themselves? If you give people a small text box, you’re probably constraining their expression of their information problem.
People need to feel they should play around with search and experiment. All they can really do at present is squeeze in a few words, press search, and look at a list of websites, a list that gives little indication of what the websites mean or how they are ranked. One major problem is that [designers of] search engines tend to think that one technique will do it! What they need to do is test combinations of many techniques, such as clustering, relevance feedback, etc. There is no silver bullet here.
CW: When you speak of training users, I’ve found that very challenging, much more with search than any other Internet paradigm. I’ve been in a lab with a fellow who has used Google for five years, and he never realized “cached” was there until I pointed it out. How can we train people who have “banner blindness” for most of the page?
AS: I think this is a major problem for the web industry. How to train billions of people? Whoever comes up with the best solution for that question may capture huge market share. The paradigm needs to change. Search is challenging and interactive, and maybe a “game” paradigm would help.
CW: The short-query phenomenon is fascinating. In a lab, I have asked people why they typed, say, “sailboats,” and they’ll say, “Oh, I’m interested in taking classes next summer when we’re up at the lake in Michigan,” yet none of the words made it into their query. Any insights?
AS: Our research shows that the most effective search terms are those submitted by the user, those that come from a user’s interaction with another person about their topic, and those they identify on the screen from the retrieved output. Stimulating users to talk with someone or something (an agent) about their information problem helps them generate terms and look at the results for additional terms.
CW: Hey, those sound like classic reference librarian techniques!
AS: One area that some web developers are exploring is classic reference librarian techniques. It’s an obvious area to explore to understand information behavior and how librarians have helped people with their information problems.
CW: “From E-Sex to E-Commerce” referred to a topical shift in searches. Are you continuing to see changes in what people are searching for?
AS: I think it’s important not to assume that “what people are searching for” means just U.S.-based search engines. There are major differences emerging in search in different global regions. For the more U.S.-based search engines, the topics seem to have stabilized somewhat, with business and e-commerce related searches being the largest category, followed by people, places and things, computers, and medical/health. Sex/porn and entertainment are now a smaller proportion of searches. From what I’ve seen about the Chinese search engines (e.g., Baidu), users are looking for entertainment and gaming. One could say that the Chinese search engine users are where the U.S. users were 5-10 years ago. As more Chinese business information is accessible via Baidu, the search topics may change. Also, currently less than 10% of the Chinese population search the web, so as that number increases, topics may change.
CW: There are endless articles these days about search privacy, and Google giving information to the feds. Is the ordinary person on the street worried about that?
AS: This is an important area for everyone. If search engines and the web are becoming the primary tool by which people are expected to access information, then privacy and the practices of governments (in regulation) and of companies are crucial. We saw telephones and TV in much the same way in the past, as involving privacy, commercial, and government interests. Also, because search is now ubiquitous, politicians will seek to gain political advantage or grounds for industry regulation. Ordinary people should be concerned about how political and commercial information policies will affect their access to the web.
CW: And you are studying the evolution of human information behavior. How far back are you going? Medieval libraries? Cavemen looking for the right painting?
AS: Obviously humans evolved information behaviors before preliterate societies through cave art, etc. Information behaviors evolved to help humans compete and cooperate, as the technologies evolved from cave art to the web. In fact, people may not change their information behaviors, but may have evolved over time to utilize a greater capacity for more complex information behaviors.
The Spink and Currier paper talks about the information behaviors of Darwin, Napoleon, and Casanova, all very effective people at finding and using information. And what we write about Casanova many people have found fascinating!
CW: Now you are being cruel! I’m going to have to renew my library card. Can you predict trends in behavior from your research? What’s next on the horizon?
AS: What’s next on the horizon is developing an understanding of how human information behaviors have evolved over human history, how they evolve over a person’s lifetime, how their search interactions develop over time, and how search in the aggregate is evolving over time. In other words, we need more longitudinal studies.
CW: Any bits of advice to practitioners about to attack the search tools on their sites? Lessons from web search?
AS: A key problem for practitioners is the lack of computer people trained in information and web retrieval, web design, and web usability. There is a lack of well-trained people, and not many industry consultants really understand search. Search is much harder than most people think, and the design of effective search tools is even harder. Practitioners need to really test any search engines they consider buying. Many companies claim that their search engines are effective and the best, but provide little real evidence for their claims.
Be careful of the search engine that promises effectiveness and superiority based on a “single” feature, e.g., linking or clustering. There is no silver bullet feature. We don’t yet have search engines that have adopted a more holistic attitude based on a real understanding of search, people’s information behavior, and what is really effective. Whoever takes that path effectively will gain competitive advantage in the marketplace.
For More Information
Spink, A., & Currier, J. (2006). Emerging evolutionary framework for human information behavior. In A. Spink & C. B. Cole (Eds.), New Directions in Human Information Behavior (pp. 13-31). Berlin: Springer.
Spink, A., & Cole, C. B. (2006). Human information behavior: Integrating diverse approaches and information use. Journal of the American Society for Information Science and Technology, 57(1), 25-35.
Spink, A., & Currier, J. (2006). Toward an evolutionary perspective on human information behavior: An exploratory study. Journal of Documentation, 62(2), 171-193.