MSWeb: An Enterprise Intranet #2

Written by: Peter Morville

Beyond taxonomies: selling services
The MSWeb team started out four years ago with a vision of the very broad but tricky area of taxonomies, and went to work figuring out how they could be built for use on the MSWeb portal. They tested and developed tools and vocabularies that improve content management as well as the searching and browsing of the MSWeb site.

This project is beginning to have an impact that goes far beyond the MSWeb site. Other major Microsoft intranet sites—those for human resources, finance, the library, and the information technology group—have begun to use some or all of the tools and taxonomies that were developed by the MSWeb team. And more than two dozen major sub-portals have implemented aspects of MSWeb’s search system. How has the MSWeb team succeeded at spreading its gospel through a huge organization like Microsoft, when similar efforts at smaller companies often fail?

The roots of MSWeb’s success are many. Let’s examine them.

Location, location, location
Because MSWeb is the company’s major intranet portal, just about everyone in the company uses it—94% of all Microsoft employees. The site is large and complex, providing the team with ample challenges and a test bed for trying out new solutions. Additionally, MSWeb’s enterprise-wide prominence has made for an excellent marketing opportunity for the team’s efforts and for information architecture in general.

Indeed, as a candidate site for an information architecture redesign, MSWeb is the ultimate low-hanging fruit: highly visible, frequently used by many in the company, rich in valuable content, important to management, and, finally, managed by an enlightened team that was aware of information architecture. You couldn’t ask for a better showcase for the value of good information architecture.

Helping where it hurts
Every information architecture project ultimately has two audiences: users and site managers/owners. It’s important to make both audiences happy, and the best way to do so is to fix what hurts.

The MSWeb team intentionally selected a major area—search—that would greatly benefit both users and managers, and designed its taxonomies to specifically improve search performance. Users’ experiences with searching were greatly improved through the integration of Best Bets into search results (more on Best Bets below). And the MSWeb team began to help site managers address search, sometimes by simply providing informal consulting, but also in more concrete ways such as providing a centrally managed crawling and indexing service. By encouraging units to develop resource records, the MSWeb team spawned the creation of a collection of content surrogates that references some of the most valuable content in the Microsoft intranet environment. And once these records were created, they made for great starting points for site crawling—robots simply followed the links embedded in the UCS’s records.

Just as the prominence of MSWeb gained exposure for the team’s efforts, the success of Best Bets validated the MSWeb approach. Both paved the way for improved collaboration between the MSWeb team and many other business units that were players in the Microsoft intranet environment.

Modular services
From the very start, the MSWeb team has looked for opportunities to develop its taxonomies and tools in a modular and therefore reusable fashion, and package them as services for the rest of the company. In fact, they’ve even branded their offerings as “Search and Taxonomies as a Service” (originally “Search as a Service”—and still referred to as SAS). The SAS console, displayed here, provides an excellent visualization of what SAS offers to its users.

The MSWeb team recognized that other business units would have a wide variety of needs, as well as existing tools on hand to address their own information architecture and content management challenges. They knew that no one could compel those business units to adopt 100% of the MSWeb approach. So the team designed SAS to be extremely modular, so Microsoft business units could take advantage of some services while passing on others.
For example, SAS offers access to MSWeb’s taxonomies through the MDR. Other units can manage and store their own taxonomies through the MDR as well, as long as they are willing to share their work. And to ensure quality in their taxonomies, those other business units can take advantage of taxonomy-related consulting services provided by SAS.

Different business units can access taxonomies from the MDR through the SAS console. Or, because the taxonomies are exportable in XML, units can develop their own interfaces, as did Microsoft’s library. This flexibility means that existing tools, homegrown or not, don’t need to be thrown out in favor of MSWeb’s version. Similarly, XML is used to export search results; this enables another unit’s site to leverage the records stored in UCS (assuming that their engine can accommodate XML). Even the MSWeb search interface is exportable, as it’s written using XSL.
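To make the XML export idea concrete, here is a minimal sketch of how a business unit might read an exported taxonomy into its own interface. The article does not describe the MDR's actual export format, so the element and attribute names (`taxonomy`, `term`, `label`, `variant`) and the sample data are assumptions made purely for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical export; the real MDR schema is not described in the article.
SAMPLE_EXPORT = """
<taxonomy name="products">
  <term id="t1" label="Windows 2000">
    <variant>Win2k</variant>
    <variant>w2k</variant>
  </term>
  <term id="t2" label="Active Server Pages">
    <variant>ASP</variant>
  </term>
</taxonomy>
"""


def load_taxonomy(xml_text):
    """Return {preferred label: [variant terms]} from an exported taxonomy."""
    root = ET.fromstring(xml_text.strip())
    return {
        term.get("label"): [variant.text for variant in term.findall("variant")]
        for term in root.findall("term")
    }


taxonomy = load_taxonomy(SAMPLE_EXPORT)
for label, variants in taxonomy.items():
    print(label, "->", ", ".join(variants))
```

Because the consuming unit only depends on the exported file, it can keep whatever publishing tools it already has and still share the central vocabulary.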

As discussed earlier, metadata schema are extensible, in effect allowing different business units to create customized versions of any schema. Records created using those schema are reusable through a highly flexible subscription process. And last but not least, optional crawling and indexing services are also made available by SAS to its client business units.

All of this flexibility leads to a huge number of possible SAS service configurations. A Microsoft business unit could handle most of its information architecture and content management needs using everything SAS has to offer, or it could operate its own publishing system that only imports taxonomies from the MDR. Or it might choose to go it completely alone. The decision is up to that business unit, and is impacted by the factors of users, content, and context that guide all information architecture work.

In the case of HRWeb, Microsoft’s human resources portal, the decision was made to use most SAS services. SAS was used to:

  • Identify content for crawling and indexing for use in searching
  • Create a category label taxonomy for browsing
  • Create Best Bets specifically for use in the HRWeb portal
  • Classify those Best Bets using HRWeb’s category label taxonomy
  • Provide access to the SAS high-quality search engine
  • Export Best Bets search results to HRWeb’s site

Perhaps most importantly, HRWeb drew on the MSWeb team’s expertise through a consulting relationship. MSWeb staff taught HRWeb’s team how to develop category labels through user-centered design (UCD) techniques such as contextual inquiry. The HRWeb team was also instructed in the art and science of cataloging resource records using descriptive vocabularies and the shared metadata schema. The resulting HRWeb site is shown here.

Currently, most units have small web development–related teams and limited resources, and are just beginning to delve into the sticky topics of taxonomies, searching, and browsing. As they learn about SAS, they are generally quite glad to take advantage of the tools and expertise already developed by the MSWeb team. But as each unit’s expertise and budget for information architecture grows, it will likely want to take on more and more control. The flexibility of its service modules will ensure that SAS can be configured to keep up with those changes.

Different kinds of flexibility
Aside from a focus on taxonomies, the major components of MSWeb’s approach—the tools and a flexible, modular, and somewhat entrepreneurial service model—draw little from library science. And as noted earlier, the taxonomies themselves, not to mention MSWeb’s operating definition of the word “taxonomy,” do not adhere to an orthodox library science approach.

This is a different flexibility than the kind that drives the SAS approach. The MSWeb team has been driven by a philosophy built on a flexibility of mind. Although many team members have library science backgrounds, they have left their disciplinary baggage at the door in order to achieve buy-in and support from colleagues from different backgrounds and with different perspectives.

For example, few, if any, graphic designers get excited by the thought of developing taxonomies. But anyone will listen to an open-minded colleague describing a good approach to solving a big problem. Because the MSWeb team was willing to be flexible in its terminology and outlook, they could communicate their taxonomy-based solutions more effectively to colleagues and clients who might be turned off by “library talk.” One senior designer on the MSWeb team described his realization of the value of the taxonomy approach and its basis in UCD techniques as the moment he “drank the Kool-Aid.” From that point on, he bought into the approach 100 percent.

The team was also successful because it was flexibly designed—not just LIS people, but technologists, technical communicators, designers, and strategists. In addition to lending the team more credibility with outsiders, the team’s interdisciplinary nature meant that many ideas were explained, translated, and fought over before they were ever exposed to outsiders. Interdisciplinary perspectives lead, as always, to a better and more marketable set of services.

Company savings
The MSWeb team understands the need for baby steps in any significant information architecture project. They’ve spent years developing taxonomies and supporting tools to use on MSWeb. And they’ve taken a gradual approach to rolling them out as SAS services to other business units.

But it’s also important to note that within three months of launching SAS, nine sub-portals had already implemented SAS-based search on their sites. Two of those had created site-specific category label taxonomies to support browsing, and another was in the process of doing so. All leveraged the MSWeb Best Bets results as part of their own search systems.

Quick adoption of SAS represents success for the MSWeb team, but has much greater significance to Microsoft as a whole. Besides the benefits to users, which we’ll describe below, an incredible amount of labor has been saved. It’s estimated that SAS has resulted in a cost savings of 45 person years in avoided work (based on taking the development effort—estimated at 5 person years—and multiplying it by 9—the number of business units that didn’t have to reinvent the SAS wheel). These savings were achieved with no increase in the MSWeb team’s staffing levels, and what was developed for MSWeb has been completely reusable by other business units.

Benefits to Users
As Microsoft’s intranet environment matured in the mid-’90s, it began to suffer from the same afflictions as most enterprise intranets: too many clicks to get to desired information, difficult site-wide navigation, and the best documents buried deep within search results. And, as mentioned earlier, users and their champions began to ask for taxonomies to make these problems go away.

The MSWeb team’s response is a work in progress. Four years is a brief moment in the lifespan of a large company and its information systems. The team is taking an evolutionary approach, avoiding unrealistic goals of fixing all problems for everyone in a few years. In this way, there are no false expectations. But even in four years, many concrete benefits have been realized, and taxonomies are at the forefront of these improvements. With category label taxonomy, for example, the labels are more representative and consistent, improving navigation within MSWeb and between Microsoft intranet sites.

Searching is also greatly improved. By encouraging resource record creation with UCS, MSWeb is able to identify valuable content in the intranet environment, and therefore can do a better job of crawling remote intranet content. Better crawling leads to more comprehensive indexing. Users are now querying indexes that represent both a much larger body of content and a higher-quality collection of content. More importantly, users’ queries are more powerful than before—they are able to take advantage of MSWeb’s descriptive vocabularies to reduce the ambiguity of individual search terms. Consider a search on “asp,” a very ambiguous term. During a search, the descriptive vocabularies stored in the MDR are automatically invoked to expand the search by including the different meanings (“Active Server Pages” and “application service providers”). These terms are also displayed as executable searches on the search results page to narrow or refine the search.
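The “asp” example can be sketched in a few lines. This is only an illustration of the idea of vocabulary-driven query expansion, not the MSWeb implementation: the dictionary structure and the function name `expand_query` are assumptions, and the only data taken from the article is the pair of meanings for “asp.”

```python
# Hypothetical vocabulary: each ambiguous term maps to its known meanings.
DESCRIPTIVE_VOCABULARY = {
    "asp": ["Active Server Pages", "application service providers"],
}


def expand_query(query, vocabulary):
    """Return (expanded_terms, refinements) for a raw query.

    expanded_terms is what would be sent to the search engine;
    refinements are the narrower searches offered on the results page.
    """
    meanings = vocabulary.get(query.strip().lower(), [])
    expanded_terms = [query] + meanings
    return expanded_terms, meanings


expanded, refinements = expand_query("asp", DESCRIPTIVE_VOCABULARY)
print(expanded)
print(refinements)
```

A query with no entry in the vocabulary simply passes through unchanged, so the expansion step never blocks a search.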

The MSWeb team has also helped pioneer a positive and increasingly common trend: “Best Bets.” These are search results that are the product of manual efforts. Often displayed before other, automatically generated results, Best Bets link a user to documents that a cataloger has determined to be highly relevant to the user’s initial search query. Best Bets are designed to address the “sweet spot” in searching, which consists of the few unique search queries that constitute the majority of all searches executed. Why not add value to the small number of frequently executed searches by adding Best Bets to their results?
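One way to picture the hybrid result list is as a simple merge: curated links first, automatic hits after, with duplicates removed. The sketch below assumes a lookup table keyed by normalized query text; the table, the URLs, and the function name `search_with_best_bets` are hypothetical and stand in for whatever the SAS search service actually does.

```python
# Hypothetical curated table: normalized query -> links chosen by a cataloger.
BEST_BETS = {
    "asp": [
        "http://example.invalid/asp-overview",
        "http://example.invalid/asp-reference",
    ],
}


def search_with_best_bets(query, best_bets, engine_results):
    """Place curated Best Bets ahead of automatically generated results."""
    curated = best_bets.get(query.strip().lower(), [])
    # Drop automatic hits that repeat a curated link, keeping engine order.
    automatic = [url for url in engine_results if url not in curated]
    return curated + automatic


# Stand-in for results returned by the crawler-built index.
engine_hits = [
    "http://example.invalid/misc-page",
    "http://example.invalid/asp-overview",
]
results = search_with_best_bets("ASP", BEST_BETS, engine_hits)
print(results)
```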

The screenshot shows the results for the search query “asp” from the MSWeb intranet, and you’ll note that the first five are all Best Bets. The components of the search results—resource title, URL, description, and categories—are drawn from the metadata schema, as the query searched an index of the controlled vocabulary terms assigned to these Best Bet records when they were indexed with UCS.

The MSWeb team uses a function provided as part of the SAS console to determine which searches merit Best Bet coverage. By invoking the console’s “View Query Logs” command and specifying a date range and collection, it’s possible to determine how many documents each query retrieved. If the “Where Query Returned” option is set to “0 Best Bets,” we can learn which of those high-retrieving queries do not have Best Bets associated with them, and create new Best Bets accordingly.
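The gap analysis described above can be approximated outside the console. The sketch below is an assumption-laden stand-in: it ranks queries by how often they appear in a log (one plausible reading of the “sweet spot”) and keeps those with no Best Bets yet. The log format, the sample data, and the function name `queries_needing_best_bets` are all hypothetical.

```python
from collections import Counter


def queries_needing_best_bets(query_log, covered_queries, top_n=3):
    """Return the most frequent logged queries that have no Best Bets yet."""
    counts = Counter(query.strip().lower() for query in query_log)
    uncovered = [
        (query, count)
        for query, count in counts.most_common()
        if query not in covered_queries
    ]
    return uncovered[:top_n]


# Hypothetical log for one date range and collection.
log = ["asp", "ASP", "asp", "benefits", "Benefits", "w2k"]
already_covered = {"asp"}
gaps = queries_needing_best_bets(log, already_covered)
print(gaps)
```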

Another SAS console function is “View Metrics.” Its “Ranked Hit Clickthrough” option provides a graphic representation of how often documents at each rank in a particular query’s search results are being clicked through. Typically, the Best Bets, ranked at the top, have a far higher clickthrough than other documents.

So, does this hybrid approach—the combination of manually and automatically generated results—actually help users? It may be too early to tell, but the initial data is promising. Users are performing 18% fewer searches since Best Bets were implemented; this might suggest that the results of their initial searches are more successful, reducing the need to submit follow-up searches. And, as shown here, users are clicking through the top results’ links close to twice as much as they had before Best Bets were implemented. This may suggest that users are finding Best Bets results to be more relevant than automatically generated results.

Overall, the MSWeb team has attempted to measure the cumulative impact that better browsing, searching, and content have had on users. Performing a task analysis exercise both before and after a major redesign, the team was rewarded with some hopeful results in terms of success rate, time on task, and number of clicks. The following table displays the results of the task analysis. The version 3.0 results were recorded in February 1999, prior to the implementation of the taxonomy-driven approach, and the version 4.01 results were recorded in July 1999, after the implementation of the taxonomy-driven approach.

Measure              v.3.0 Average           v.4.01 Average          Change
Task Success Rate    68.30%                  79%                     +10.7%
Time on Task         3 minutes 26 seconds    3 minutes 10 seconds    -16 seconds
Number of Clicks     13                      5                       -8 clicks


Certainly, other factors may have had an impact on these numbers. But even if we discount them, there is still ample anecdotal evidence to demonstrate the value of the MSWeb team’s efforts.

What’s Next
The initial success of MSWeb’s approach is exciting, but it’s just the first step over the course of many years and phases to come. To some degree, the team expects continued growth in what’s currently in place: more resource records, more robust taxonomies, and more sites coming on board and utilizing an increasing array of SAS services and MSWeb consulting. But the MSWeb team also hopes to try out some interesting new plans in the not-too-distant future.

One exciting possibility is an increased role for other business units in the creation of an even more mature infrastructure to support enterprise-wide information architecture and content management. MSWeb isn’t looking to own this endeavor, but move into a leadership role, with other units playing the role of partners. In this scenario, Microsoft will save money because its business units will engage in increased sharing of taxonomies and related tools and efforts. Additionally, a greater degree of awareness among content managers might result in more willingness to go along with future centralizing initiatives, such as requiring the registration of resources in order for them to be indexed for searching. This trade-off might make for a little more work on the part of content owners, but will result in improved searching for users, as well as much more efficient content management practices by establishing who’s responsible for what content, when it should be updated, and so on.

Even more exciting is the possibility of creating something of a Microsoft “semantic web” along the lines of what Tim Berners-Lee, creator of the Web, and others have recently proposed. A semantic web environment allows connections to be made automatically between related content objects. Some of the tools described in this chapter could be extended to support such automatic associations; for example, the taxonomies developed by different Microsoft business units could be “cross-walked,” meaning that relationships between similar terms or “nodes” in the taxonomies could be established. These relationships could go a long way toward improving search across Microsoft’s intranets, as content with different tags and similar content would be retrieved together. VocabMan and the SAS console already have built-in support for related tags, which will enable future cross-walking of taxonomies.
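As a rough illustration of cross-walking, the sketch below links nodes from two taxonomies whenever their labels or variant terms match after simple normalization. The article does not say how VocabMan or the SAS console would establish such relationships, so this matching rule, the sample taxonomies, and the function names are assumptions; a real crosswalk would likely involve human review.

```python
def normalize(term):
    """Lowercase a term and remove whitespace so 'Win 2k' matches 'Win2k'."""
    return "".join(term.lower().split())


def term_set(label, variants):
    """All normalized names by which a taxonomy node is known."""
    return {normalize(label)} | {normalize(variant) for variant in variants}


def crosswalk(taxonomy_a, taxonomy_b):
    """Return (node_a, node_b) pairs that share at least one name."""
    links = []
    for a_label, a_variants in taxonomy_a.items():
        a_terms = term_set(a_label, a_variants)
        for b_label, b_variants in taxonomy_b.items():
            if a_terms & term_set(b_label, b_variants):
                links.append((a_label, b_label))
    return links


# Hypothetical taxonomies from two business units.
hr_taxonomy = {"Windows 2000": ["Win2k"], "Benefits": ["401(k)"]}
library_taxonomy = {"Microsoft Windows 2000": ["Win 2k", "w2k"], "Medical insurance": []}
links = crosswalk(hr_taxonomy, library_taxonomy)
print(links)
```

Once such links exist, a search on a term from one unit’s taxonomy could also retrieve content tagged with the related node in another unit’s taxonomy.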

The concept of a semantic web offers much more potential. Alex Wade, Manager of Knowledge Access Services, sees a future where semantic objects—not physical documents—are the atoms that make up the MSWeb universe. He states: “We don’t draw many lines between objects today, and when we do, these are rarely delineated; now we’re moving to semantically derived relationships.” He’d like to see a semantic MSWeb provide access to people, places, and things that are connected by “strong rules” or relationships; once an initial set of rules is seeded, new rules can be inferred. This web of relationships could have a hugely beneficial impact in an intranet environment like Microsoft’s, where it’s often as important to find the right person as it is to find the right information. This transition requires a paradigm shift for information architects: as Alex suggests, we’ll need to “stop tagging documents and start drawing relationships between objects. Eventually they’ll have different types of hierarchical, associative, and equivalent relationships.”

MSWeb’s Achievement
Nothing that the MSWeb team did—whether considering the initial problem, coming up with an approach, or developing the tools and expertise to make it happen—can be described as revolutionary. Rather, these were rational steps taken to address complicated problems. So why discuss their work here?

Well, if you have ever worked in a large organization—or even many smaller ones—you know that what’s rational isn’t often what happens. The rational, the obvious, and the good often never make it off the drawing board, thanks to corporate strategies that change with the wind, extreme fluctuations in budgets, and, worst of all, the dreaded reorganization. And Microsoft isn’t immune to such problems; one MSWeb team member went through seven different managers and had three title changes in just five months.

The MSWeb team has developed some neat taxonomies and tools. But we’re recognizing the team for its most impressive achievement: successfully implementing a rational plan in a large, corporate environment. The team understood that only a holistic approach—one that accommodated content, users, and context—could make a difference. They also knew that enterprise-wide solutions require sufficient time—years, not months—to take hold.

If you’re taking on a similar challenge, we suggest you follow Vivian Bliss’s advice:

“…Improving information systems affects people, process and technologies. To not recognize that will spell doom. In other words, technology alone is not the answer. Just as merely tweaking the UI is not the answer, nor is building a taxonomy that is not flexible or able to be leveraged in publishing and finding. Another key is to have a multi-disciplinary team. Just one discipline does not have the answer.”


Louis Rosenfeld is an independent information architecture consultant.

Peter Morville is President and Founder of Semantic Studios, a leading information architecture and knowledge management consulting firm.

MSWeb: An Enterprise Intranet #1

Written by: Peter Morville

What is the Holy Grail of information architects? It’s the secret that will help them develop and maintain a centralized, user-centered information architecture for a large, distributed organization—the kind made up of all sorts of autonomous, bickering business units that have their own goals, their own sites, their own infrastructures, their own users, and their own ideas of how to go about things.

It’s nearly impossible to develop a successful information architecture against a backdrop of explosive content growth, content ROT, and the political twists and turns common in any organization. And, we’re sorry to say, we don’t have the Holy Grail. But we’ve had the privilege of getting up close to a large number of corporate intranets. And the best approach we’ve seen so far is that taken by Microsoft’s intranet portal (MSWeb) team.

Before you protest, we admit that yes, we understand that you probably don’t have the same resources at your disposal as does Microsoft’s team. But we think everyone can learn from their efforts; what they’re doing today is what most intranets will be doing in three to five years, for two reasons. First, MSWeb’s approach is flexible enough to be customized for many large organizations. And second, knowing Microsoft, it’s a reasonable bet that the good ideas described here will soon enough find their way into Microsoft’s product offerings (if they haven’t already) and into your IT department. So perhaps you’ll own a piece of this approach in the not-too-distant future. Let’s preview it here so you’ll be ready.

Challenges for the user
Like Microsoft itself, MSWeb is insanely huge and distributed. Let’s use some numbers to paint a picture of the situation. MSWeb contains:

  • 3,100,000+ pages
  • Content created by and for over 50,000 employees who work in 74 countries
  • 8,000+ separate intranet sites

With apologies to Herbert Hoover, Microsoft has put a web server in practically every employee’s pot. Employees, in turn, have responded by embracing the technology, as you’d expect from one of the world’s largest technology companies, and by churning out an impossibly huge volume of content.

But if you’re a typical Microsoft employee, these numbers also represent a bit of a problem. Microsoft estimates that a typical employee spends 2.31 hours per day engaging with information, and 50 percent of that time is used looking for that information. Although you already know how ambivalent we are about using such calculations to estimate actual costs to the organization, we think these numbers show that at least some valuable employee time is being wasted flailing about in this huge environment in search of information.

Here are just a few examples of how this chaotic environment hurts Microsoft employees.

Where to begin?—This is your typical case of “silo hell.” With as many as 8,000 possibilities available, employees have a hard time determining where they should begin looking for the information they need. While some starting points are obvious—check the human resources site for information on your medical insurance or 401(k) plan—other areas, such as technical information, are scattered throughout Microsoft’s intranet environment.

Inconsistent navigation systems—Navigation systems are quite inconsistent because they employ many different labeling schemes. Therefore, users are confused each time they encounter a new one. Not only does this inhibit navigation, it also muddles the user’s sense of place.

Same concept, different labels—Because different labels are used for the same concepts, users miss out on important information when they don’t search or browse for all the possible labels for those concepts. For example, users may search for “Windows 2000” without realizing that they also need to hunt for “Microsoft Windows 2000,” “Win 2000,” “Win2000,” “Win2k,” “Win 2k,” and “w2k.”

Different concepts, same label—Conversely, a term doesn’t always mean what you think it does. For example, ASP can mean “active server pages,” “application service providers,” or “actual selling price.” And the term “Merlin” has been used as the code name for three very different products.

Ignorance is not bliss—Often, users are happy when they get any relevant information. But in a knowledge-intensive environment like Microsoft’s, users are much more demanding—their jobs depend on finding the best information possible. In this case, employees often get frustrated because they don’t know when to stop searching. Is the content simply not there? Or is a server down somewhere? Or maybe they didn’t enter a good search query?

It’s not hard to see how a typical employee’s 1.155 hours per day might get burned up. In short, Microsoft employees face an expansive and confusing information environment that’s about as intimidating as the Web itself.

Challenges for the information architect
The flip side of this problem is how these numbers affect the people who are responsible for making Microsoft’s content, or aggregating that content into portals. Let’s make another comparison to the broader Web. Building and maintaining the Yahoo! portal has been a huge undertaking, spanning years and a gigantic collection of content—the Web as a whole. MSWeb is a portal too, and though 8,000 sites is a much more manageable number than what Yahoo! faces, consider the varying motives and concerns of those who own and maintain those independent sites. While Yahoo! can now get away with charging sites for inclusion in its directory, Microsoft can’t charge or compel site owners within the company to register. Instead, the MSWeb team has to create incentives for participation in its model. But the owners of the intranet’s various sites are too distracted by other concerns (such as serving their own constituencies) to consider how their site fits into the bigger picture of Microsoft’s intranet.

When a site is brought into the MSWeb fold, it comes with its own information architecture. Its organization and labeling systems and other tricky information architecture components must be integrated into the broader MSWeb architecture or be replaced altogether. For example, as many as 50 different variants of product vocabularies had been created in the Microsoft intranet environment. Fixing such problems is a messy and complicated challenge for any information architect.

And it gets even worse: all of those Microsoft intranet sites are backed up by a technical architecture of some sort. Some are designed, built, and maintained by in-house technical staff and are quite advanced and elaborate. At the other extreme are sites maintained by hand or by a simple tool like MS FrontPage. The technology architectures that support the Microsoft intranet environment vary widely in complexity, and the MSWeb team must determine ways to normalize and simplify the environment to make content management easier and more efficient. Additionally, many of these technology architectures are not designed to support a portal or any other sort of enterprise-wide information architecture, so that’s another crucial factor the MSWeb team must account for.

Does your head hurt yet?

We like taxonomies, whatever they are
Four years ago, many heads were already throbbing at Microsoft. And an odd and often misunderstood term—“taxonomies”—began to be heard in corridors at Redmond. Although they share a common “X,” “taxonomies” and “sexy” are two words that aren’t often seen together in public. So when “taxonomies” becomes a common part of everyday conversation, it’s a sure sign that an organization is ready for a deeper look into information architecture.

So Microsoft’s MSWeb team heard the word and knew that the time had come for a more ambitious approach to improving MSWeb. The team—populated by an impressive mix of information scientists, designers, technologists, and politically savvy managers—began to consider what users meant when they called for better (or any) taxonomies. Instead of the traditional biology-inspired definition, Microsoft’s employees thought of taxonomies as constructs that would help them search, browse, and manage intranet content more effectively.

In response, the MSWeb team developed a more generalized operating definition of taxonomies that would be more in line with how other employees were using the term. This flexibility—the willingness to speak the language of clients, rather than rigidly clinging to a “correct” but ultimately unpopular meaning—was key. It set the tone for successful communications between the MSWeb team and its clients throughout the organization.

Three flavors of taxonomies
The team defined taxonomies as any set of terms that shared some organizing principle. For example, descriptive vocabularies were seen as controlled vocabularies that described a specific domain (e.g., geography, or products and technologies) and included variant terms for the same concept. Metadata schema were collections of labeled attributes for a document, not unlike a catalog record. Category labels were sets of terms to be used for the options of navigation systems. These three areas comprised the foundation of the MSWeb approach. Better searching, browsing, and managing of information would be achieved by designing taxonomies that could be shared throughout the enterprise.
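The three flavors can be pictured as three small data structures. The sketch below is only an illustration: the class and field names are assumptions, the record attributes mirror the title, URL, description, and categories mentioned elsewhere in this case study, and the sample category labels are invented.

```python
from dataclasses import dataclass, field


@dataclass
class VocabularyTerm:
    """Descriptive vocabulary entry: one concept plus its variant terms."""
    preferred: str
    variants: list = field(default_factory=list)


@dataclass
class ResourceRecord:
    """Metadata schema instance: labeled attributes describing a document."""
    title: str
    url: str
    description: str
    categories: list = field(default_factory=list)


# Category labels: the terms offered as options in a navigation system.
CATEGORY_LABELS = ["Benefits", "Products", "Library"]


def invalid_categories(record, allowed_labels):
    """Return any categories on a record that are not approved labels."""
    return [label for label in record.categories if label not in allowed_labels]


term = VocabularyTerm("Windows 2000", ["Win2k", "w2k"])
record = ResourceRecord(
    title="401(k) plan overview",
    url="http://example.invalid/hr/401k",
    description="Summary of retirement plan options.",
    categories=["Benefits", "Retirement"],
)
print(invalid_categories(record, CATEGORY_LABELS))
```

Keeping the three structures separate is what lets one of them, such as the category labels, be customized for a site without touching the other two.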

Descriptive vocabularies for indexing
Developing terms to manually index important pieces of content seemed a smart proposition for the MSWeb team. It would complement automated indexing by the search engine, which was currently the primary means of making the site’s content available. But creating and applying descriptive vocabularies is an expensive proposition, especially within an information environment as large as Microsoft’s. And there are so many different ways to index content. So half the battle was in selecting which vocabularies would deliver the most value to the organization as a whole.

The MSWeb team considered a number of issues when deciding which vocabularies to develop. Not surprisingly, characteristics of the content drove many of the decisions.

Search Log Analysis—Queries from MSWeb’s search logs were stored in a SQL database, so they could be searched and analyzed easily. Search log analysis helped the MSWeb team gauge users’ content needs in their own words and determine appropriate vocabulary terms. Studying the most common queries in the log also gave the team a good overview of which content areas were generally most valuable to users.

Availability—The team looked for decent controlled vocabularies that had already been developed in-house or that were available commercially. Vivian Bliss, MSWeb Knowledge Management Analyst, puts it simply: “Don’t reinvent the wheel!” If there’s a useful vocabulary out there, it’s much cheaper to license and adapt it than to create a new one. Unfortunately, most of the required vocabularies were very specific to Microsoft’s content, and had to be custom-built in-house.
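As a rough illustration of the search log analysis described above, the sketch below counts the most common queries in a SQL table. The table name, column name, and sample queries are all invented for the example; the article does not describe MSWeb’s actual log schema.

```python
import sqlite3

# Hypothetical schema: one row per logged search query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE search_log (query TEXT NOT NULL)")

# Invented sample data, for illustration only.
sample_queries = [
    "benefits", "Benefits ", "holiday schedule", "benefits",
    "expense report", "holiday schedule", "campus map",
]
conn.executemany(
    "INSERT INTO search_log (query) VALUES (?)",
    [(q,) for q in sample_queries],
)


def top_queries(connection, limit=3):
    """Return the most frequent queries, ignoring case and surrounding spaces."""
    rows = connection.execute(
        """
        SELECT LOWER(TRIM(query)) AS normalized, COUNT(*) AS hits
        FROM search_log
        GROUP BY normalized
        ORDER BY hits DESC, normalized ASC
        LIMIT ?
        """,
        (limit,),
    )
    return rows.fetchall()


if __name__ == "__main__":
    for query, hits in top_queries(conn):
        print(f"{hits:>3}  {query}")
```

Normalizing case and whitespace before counting is a design choice made here so that trivially different spellings of the same query are tallied together; a real analysis would likely need more careful normalization.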

Other decisions were driven by business context. The MSWeb team considered such issues as:

Politics—The team was careful to talk with content stakeholders about what they felt they needed to make their content more accessible. In some cases, stakeholders were interested both in information architecture concepts and in committing to working with the MSWeb team. Others were interested in neither. Through such discussions, it became apparent which stakeholders were ready to participate and which weren’t.

Applicability—Some vocabularies were too specific to have broad value for users across the company. The MSWeb team instead focused on vocabularies with broader appeal and value.

After taking all of these considerations into account, the MSWeb team narrowed its development effort to the following vocabularies:

  • Geography
  • Languages
  • Proper names
  • Organization and business unit names
  • Subjects
  • Product, standards, and technology names

Developing some of these vocabularies was trickier than you might think. Geography, for example, had to be split into two separate vocabularies: general place names, and locations of Microsoft installations. On the other hand, the subject vocabulary development was simpler than it might have been: its development was constrained to addressing primarily equivalence relationships. The MSWeb team hasn’t added extensive hierarchical and associative relationships; that would require a huge effort and take resources away from developing other vocabularies that could provide broad benefits right away. (In the future, the team does plan to selectively address these other relationships as time and resources permit.)
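To make the idea of equivalence relationships concrete, here is a minimal sketch of a vocabulary that maps variant terms to a single preferred term. The terms and function names are hypothetical and are not taken from MSWeb’s actual subject vocabulary.

```python
# Hypothetical vocabulary: preferred term -> variant terms (invented examples).
SUBJECT_VOCABULARY = {
    "human resources": ["hr", "personnel"],
    "travel expenses": ["expense report", "t&e"],
}


def build_variant_index(vocabulary):
    """Map every preferred term and variant (lowercased) to its preferred term."""
    index = {}
    for preferred, variants in vocabulary.items():
        index[preferred.lower()] = preferred
        for variant in variants:
            index[variant.lower()] = preferred
    return index


VARIANT_INDEX = build_variant_index(SUBJECT_VOCABULARY)


def preferred_term(term):
    """Return the preferred term for a term, or None if it is not in the vocabulary."""
    return VARIANT_INDEX.get(term.strip().lower())
```

A flat mapping like this covers only equivalence. Hierarchical and associative relationships (broader, narrower, and related terms) would need a richer structure, which is the extra effort the team chose to defer.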

Metadata schema
Developed hand-in-hand with controlled vocabularies, metadata schema specify which attributes to record when describing or cataloging a content resource. While Microsoft’s descriptive vocabularies were driven by content and context, metadata schema were informed more by issues of users and content.

The MSWeb team developed a single schema that has value for both MSWeb and other intranet sites. Borrowing from the Dublin Core Metadata Element Set (see http://dublincore.org), MSWeb’s schema was intended to be sufficiently “stripped down” so that content owners would use it to describe resources, resulting in more records and therefore a more useful collection of content. The schema’s simplicity was balanced with the goal of providing enough descriptive information to augment searching and browsing by users.

The team also had to ensure that records produced using the schema would include fields useful for resource description, display, and integration with other parts of the information architecture (namely by integrating with search results and browsing schemes). The process used to develop this metadata schema was, in the words of one team member, “down and dirty.” Although more polished methodologies exist, sufficient resources were not available at the time for this initial schema development project. For this reason, it was important to structure the schema to include both a required “core” set of fields and the flexibility to support future extensions of the schema by other business units. To date, seven other major portals are using the metadata schema, and many have extended and customized it for their own context.

The schema’s core fields are:

  • URL Title—The name of the resource
  • URL Description—A brief description of the resource; suitable for display in a search result
  • URL—The address of the resource
  • ToolTip—Text displayed for a mouseover
  • Comment—Administrative information that helps manage a record (not seen by the end user)
  • Contact Alias—The name of the person responsible for this resource
  • Review Date—The date that the resource should be next reviewed (default setting is six months from when the record was created or last updated)
  • Status—The record’s status; e.g., “active” (the default), “deleted,” “inactive,” and “suggestion”; used for content management purposes

The schema has been commonly extended with these optional fields:

  • Strongly Recommended—Flags resources that are especially appropriate
  • Products—Terms from the product, standards, and technology names vocabulary that describe the subject matter of the resource
  • Category Label—Terms from the vocabulary of category labels; used to ensure that the resource is listed under the appropriate label in the site’s navigation system
  • Keywords—Terms from descriptive vocabularies used to describe the resource
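The core-plus-extensions structure listed above might be modeled along the following lines. This is a sketch under stated assumptions, not MSWeb’s implementation: the class name, the `created` field, the `extensions` dictionary, and the treatment of the four listed statuses as the complete set are all assumptions made for illustration.

```python
import calendar
from dataclasses import dataclass, field
from datetime import date
from typing import Optional

# Assumption: the article lists these statuses as examples; the real set may differ.
VALID_STATUSES = {"active", "deleted", "inactive", "suggestion"}


def add_months(start: date, months: int) -> date:
    """Return the date `months` later, clamping the day to the end of the month."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


@dataclass
class ResourceRecord:
    # Required core fields, named after the list above.
    url_title: str
    url_description: str
    url: str
    tooltip: str
    comment: str
    contact_alias: str
    # Core fields with the defaults the article mentions.
    status: str = "active"
    review_date: Optional[date] = None
    # Hypothetical bookkeeping field, needed only to compute the default review date.
    created: date = field(default_factory=date.today)
    # Optional extension fields, e.g. {"Keywords": [...], "Category Label": [...]}.
    extensions: dict = field(default_factory=dict)

    def __post_init__(self):
        if self.status not in VALID_STATUSES:
            raise ValueError(f"unknown status: {self.status!r}")
        if self.review_date is None:
            # Default review date: six months after the record was created.
            self.review_date = add_months(self.created, 6)
```

Keeping the extensions in a separate dictionary is one way to let other business units add their own fields without changing the required core.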

MSWeb began to use the metadata schema to create resource records in 1999; since then, over one thousand records have been created. These fuel the immensely useful “Best Bets” search results and hold huge potential for improving areas such as content management. We’ll describe the role of both metadata schema and “Best Bets” at Microsoft in greater detail later in this chapter.

Category labels
The third type of taxonomy—labels for the categories in site-wide navigation systems—was geared toward providing users of Microsoft intranet sites with navigational context. Category labels help users know where they are and where they can go. The MSWeb team employed a user-centered process for designing navigation systems, relying upon such useful standbys as card sorting and contextual inquiry. In [this screenshot], the category labels are shown on the left-hand side of the screen. Descriptions of nodes, displayed on the right-hand side, help catalogers choose the appropriate category label.

The initial set of category labels was developed solely for the MSWeb portal’s navigation system. But because the portal is so widely used and because the revised navigation represented a major upgrade for many users, the owners of other intranet sites began to approach the MSWeb team for assistance in developing their own navigation systems.

The MSWeb team responded by turning its user-centered design process and expertise into a service that other site owners could use. As collaboration with other sites increases, a “standard” intranet navigation system will eventually be created, likely a combination of predetermined intranet-wide options (e.g., another “core”) and a locally determined selection of choices (“extensions”) that would be informed by a shared set of guidelines. For now, the transitional stage of raising awareness and providing support to other site owners is considered a great leap forward, and a prerequisite to further navigation standardization.

How it comes together
The impact of all three taxonomies is clear from the MSWeb search results shown [here]. Category labels provide contextual navigation at the end of each “Best Bet” result (the first two displayed) and populate the “categories” site-wide navigation system on the left-hand side. Below that, the “terms” area displays two variants of the search term that come directly from the descriptive vocabularies. The “Best Bet” search results themselves are drawn from resource records based on a metadata schema.
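One way to picture how the three taxonomies could cooperate in a single search is sketched below. The article does not describe MSWeb’s matching logic, so the lookup shown here, along with every name, record, and URL, is an assumption made purely for illustration.

```python
from dataclasses import dataclass

# Descriptive vocabulary (hypothetical): variant term -> preferred term.
VARIANTS = {"hr": "human resources", "personnel": "human resources"}


@dataclass(frozen=True)
class Record:
    """A simplified resource record based on the metadata schema."""
    title: str
    url: str
    keywords: frozenset   # terms from descriptive vocabularies
    category_label: str   # term from the category label vocabulary


# Invented records, for illustration only.
RECORDS = [
    Record("HR Web", "http://example.com/hr",
           frozenset({"human resources"}), "Employee Services"),
    Record("Campus Maps", "http://example.com/maps",
           frozenset({"facilities"}), "Workplace"),
]


def best_bets(query, records=RECORDS, variants=VARIANTS):
    """Resolve the query to a preferred term, then return the matching records."""
    term = query.strip().lower()
    preferred = variants.get(term, term)
    matches = [r for r in records if preferred in r.keywords]
    return preferred, matches


if __name__ == "__main__":
    preferred, matches = best_bets("HR")
    print(f"Showing results for: {preferred}")
    for record in matches:
        print(f"{record.title} ({record.url}) [{record.category_label}]")
```

Here the descriptive vocabulary resolves the user’s wording, the resource records supply the results, and each record’s category label provides the navigational context shown alongside it.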

MSWeb’s “three taxonomies” approach is steeped in traditional library science, which isn’t surprising considering the backgrounds of many of those on the MSWeb team. But it’s important to note how willing the team was to abandon the traditional library science concepts that didn’t make sense in the intranet environment. For example, the team did not try to create “traditional” thesauri for its metadata schema and category label taxonomies. Other standards familiar to the LIS community, such as Dublin Core, weren’t initially adopted for MSWeb’s metadata schema because they were not appropriate at the time (although the Dublin Core schema may be partially or completely adopted by MSWeb at some point).

In the next issue (Sept. 9)—Beyond taxonomies: selling services, benefits to user and what’s next for MSWeb

Louis Rosenfeld is an independent information architecture consultant.

Peter Morville is President and Founder of Semantic Studios, a leading information architecture and knowledge management consulting firm.

The Age of Findability

Written by: Peter Morville

The Third Annual Information Architecture Summit in Baltimore compelled my first visit to the new, state-of-the-art terminal at Detroit Metropolitan Airport.

As I approached the airport on a cold March morning, perhaps I should have been excited. After all, the $1.2 billion Northwest World Gateway was billed as the terminal of the future. According to Northwest Airlines, I was about to have “one of the world’s greatest travel experiences.”

But in reality, I felt dread. I was late for my flight and desperately needed a restroom and a cup of coffee in exactly that order. What I didn’t need was the challenge of finding my way in a new airport.

After circumnavigating “the largest single parking structure in the world ever built at one time” three times, in search of long-term parking, I finally broke down, asked a security guard, and was told the signs for international parking actually lead to long-term parking. Of course!

Several circles of hell later, freshly sprung from the airport security checkpoint and a full-body pat down, I emerged into the spectacular center of Concourse A. High-arched ceilings soared above. Luxury retail stores lined the hall. Straight ahead, a black granite elliptical water fountain fired choreographed, illuminated streams of water, “representing the connections made via global travel.”

Unfortunately, what I couldn’t find was a sign pointing to one of the 475 public restroom stalls inside this 2-million square-foot complex. To cut a long and painful story short, I was 30,000 feet in the air before I finally got my cup of coffee.

Name that pain
Jakob Nielsen might say this airport has usability problems. Conduct a heuristic evaluation, run a few user tests, fix the worst blunders, and you’re on your way. That’s the great thing about usability. It applies to everything. Websites, software, cameras, fishing rods and airports. It’s one hell of a powerful word.

Lou Rosenfeld might say this airport has information architecture problems. But he probably wouldn’t. While maps and signs fit comfortably into the domain of information architecture, it’s a stretch to include the structural design of an airport terminal or the solicitation of feedback from frustrated travelers. Like it or not, information architecture has boundaries. Unfortunately, our clumsy two-word label isn’t quite as flexible as Jakob’s.

That’s why I say this airport has findability problems. The difficulty I had finding my way dominated all other aspects of the experience. Like usability, findability applies broadly across all sorts of physical and virtual environments. And, perhaps most important, it’s only one word!

Post-Hum(or)ous self-definition
At Argus Associates, we built a consulting firm that specialized in “information architecture” and we wrote a book to explain and explore the topic.

In the past year, our company has been post-hum(or)ously accused of practicing “Content IA,” a pejorative label that bothers me.

It’s absolutely true that we Argonauts brought the strengths and biases of library science to the IA table. And, we certainly focused more on organizing sites with massive amounts of content than on designing task and process flows for online applications.

However, this focus was indicative, not of a love for content, but of a passion for designing systems that help people find what they need.

Unfortunately, we couldn’t declare this passion too openly, because in the 1990s most customers weren’t buying “findability.”

At first, they focused on image and technology. Remember the early days of glossy brochure web sites and hyperactive Java applications? Later, they learned to ask for usability, scalability and manageability. They had felt some pain, but not enough.

In order to create a big tent, we sold “information architecture,” striking a delicate balance between our clients’ needs and wants. But all along, we maintained a deep conviction that, in the long run, the most important and challenging aspect of our work would involve enabling people to find stuff.

So, if you want to label the Argus brand of information architecture, rather than calling it Argus IA or Content IA or Polar Bear IA, I humbly suggest that you call it Findability IA. Or else!

Arrows over boxes
True to form, I’ve always resisted attempts to canonically define information architecture. In an emerging field, the last thing you want to do is prematurely place its identity inside a box, or should I say coffin?

However, information architecture is entering a new stage of maturity. IA roles and responsibilities are firming up. The IA community is taking shape. While we insiders argue over the minutiae, a de facto definition of information architecture has emerged and reached critical mass. There’s no going back.

On one level, this is wonderfully exciting. For many of us who labored in obscurity in the early 1990s, this is validation that our vision of the future wasn’t completely crazy.

But this is also frightening. With maturity comes rigidity. We’re finding ourselves trapped inside boxes of our own making. And those arrows that connect us to related disciplines and new challenges are looking mighty appealing.

After all, it’s a tough sell to argue that content management and knowledge management and social computing and participation economics are all components of the big umbrella of information architecture. The IA tent is simply not that big.

And yet, we information architects are fascinated by these topics. We yearn to escape our boxes and follow the arrows.

For me, findability delivers this freedom. It doesn’t replace information architecture. And it’s really not a school or brand of information architecture. Findability is about recognizing that we live in a multi-dimensional world, and deciding to explore new facets that cut across traditional boundaries.

Findability isn’t limited to content. Nor is it limited to the Web. Findability is about designing systems that help people find what they need.

The age of findability
Even inside the small world of user experience design, findability doesn’t get enough attention. Interaction design is sexier. Usability is more obvious.

And yet, findability will eventually be recognized as a central and defining challenge in the development of web sites, intranets, knowledge management systems and online communities.

Why? Because the growing size and importance of our systems place a huge burden on findability. As Lou posits, “despite this growth, the set of usability and interaction design problems doesn’t really change…(but) information architecture does get more and more challenging.”

Ample evidence exists to support this bold claim. Companies are failing to deliver findability. For example, a recent study by Vividence Research found poorly organized search results and poor information architecture design to be the two most common and serious usability problems.

This resonates with my experience interviewing users of Fortune 500 web sites and intranets. Some of these poor souls are ready to burst into tears as they recount their frustrations trying to find what they need inside these massive information spaces.

At the IA Summit, usability expert Steve Krug also agreed with this bold claim, noting that his company’s motto doesn’t apply to the challenges faced by information architects. Designing for findability is rocket surgery!

In the coming years, our work will only become more difficult. But that’s a good thing. Consider the following passage from a fascinating article written by business strategy guru Michael Porter:

“Companies need to stop their rush to adopt generic ‘out of the box’ packaged applications and instead tailor their deployment of Internet technology to their particular strengths…The very difficulty of the task contributes to the sustainability of the resulting competitive advantage.”1

That last sentence applies directly to the work we do. We all have a great deal of difficult and important work ahead. There’s an awful lot of findability in our future.

Where do we go from here?
I wrote this article to explore findability as both a word and a concept. I’d be very interested in your reactions. Does findability strike a chord? Are you intrigued by the design of findable objects? Are you ready to become a findability engineer? Or does this pseudo-word annoy you? Is findability overrated? Do you prefer a future filled with expensive, beautiful airports that just happen to be unnavigable? Comments please!

For more information:

  1. “Strategy and the Internet,” by Michael E. Porter in Harvard Business Review, March 2001.

Peter Morville is President of Semantic Studios, an information architecture and knowledge management consulting firm and co-author of the best-selling book, “Information Architecture for the World Wide Web.”