Icon Analysis

“She gave up the search for the mouse settings icon in seconds and opted to just use the ridiculously over-sensitive mouse.”

An icon search task that lasts longer than anticipated can result in user annoyance or even premature abandonment. I once changed the mouse settings on my laptop to be overly sensitive, and had a colleague use it to show me a data analysis technique she had been working on. She immediately noticed and asked permission to change the settings. At my resolution of 1400×1050, the icons in the Windows control panel folder render at 16×16 pixels. In addition, I had the list pre-sorted by comment rather than application name. Not used to these settings or dealing with mouse preferences, she gave up the search for the mouse settings icon in seconds and opted to just use the ridiculously over-sensitive mouse while demonstrating her analysis technique.

You may think she was justified if only using my system for a short time. If so, you’d be surprised to know this was no small demo! It went on for almost half an hour. She surfed the web to retrieve various files, used several applications, accessed her FTP space to download some of her own work, and showed the technique twice with different sets of user data. Scientist and user throughout, she sprinkled obscenities about the mouse amongst her thoughtful discussion of data analysis. I was astonished, and by then far too afraid to admit I had fooled with the mouse on purpose.

Two weeks later, I was discussing the analysis technique with another coworker and he said, “By the way, I heard your mouse is all messed up. I can fix that if you want.” Bad human-computer interaction (HCI) experiences travel fast! The issue could have been avoided if only the mouse settings icon had been easier to identify.

Inability to discriminate one icon from another and/or find an icon in a set can be far more disastrous than my anecdote above. Systems used by first responders in hazardous materials incidents (see MARPLOT, for example) rely on icon design to signify entity classification (e.g. small icon of a schoolhouse) and level of critical danger to an entity (e.g. a school icon is painted red on a map). Immediately recognizing danger to a school amongst lumber yards, garbage dumps, and plant nurseries is imperative; any time-slip in the search and discrimination task could delay notification and evacuation of hundreds of children. How then can we diagnose problems with icons that fail in this regard?

Search and discrimination of icons

The human visual system is a complex mechanism that encodes information using many channels in two major pathways. The magnocellular pathway (M pathway, or “big neurons”) contains channels sensitive to gross shape, luminance, and motion. The parvocellular pathway (P pathway—“small neurons”) contains channels sensitive to color and detailed shape (Nicholls et al., 1992). In order to discriminate between two different visual signals—icons, in our case—the signals encoded in the available channels must differ beyond some threshold. A common distinguishing technique is color. For example, try to find the red network settings icon on the right in figure 1.


Figure 1: Original icon list shown in the Windows control panel (left) and the icon list with the network icon highlighted red to enable a feature-based search (right).

Searching by some distinguishing feature like color is called (not surprisingly) a feature-based search. Feature-based searches are limited in a few ways: their effectiveness drops if we give every icon in the set its own unique color, and color differences exploit only one of the two visual pathways (the P pathway). Additionally, icons tend to be small in a UI, restricting shape differences to “detailed shape” information—also encoded in the P pathway. Ideally, we would like to design icons that differ purposefully along channels in both the M and P pathways.


Figure 2: Original Network Connections Icon with constituent M and P pathway representations.

An elegant technique for doing this leverages the core difference between the pathways. Large neurons are less densely packed in the retina than small ones, and this difference in spatial density leads to fundamentally different encodings of the visual image. Figure 2 shows an icon that has been filtered to simulate the way it would be encoded in the M and P pathways.

Images filtered in this manner can be judged for distinctiveness along both pathway dimensions, making discrimination and search tasks more economical. Distinctiveness in P pathway representations is easy enough to judge without filtering techniques; designers weigh color and detailed-shape decisions directly during the design process. The only tool a designer has to judge M pathway distinctiveness is the “squint test” (i.e., squint your eyes to obstruct sharp focus and rely mostly on dark and light values). The squint test, however, is not very practical for HCI and usability assessments; spatial frequency filtering is a better tool for simulating M pathway representations of icon images for evaluation purposes.

Spatial frequency filtering

The visual system maintains a set of scales that we associate with distance. If we see an object thought to have great size—say, a building—but that takes up little space on the retina (i.e., it looks very small), we immediately “perceive” it as being far away rather than perceiving it as a miniature building. The perception of scale is actually based on the encoding of visual spatial frequency (Schyns & Oliva, 1994).

This is interesting because you can encode images in specific spatial frequencies (Schyns & Oliva, 1999). View figure 3 from a foot away. Now stand back and view it from farther—say, 10 feet. Up close it is difficult to make out the image of Bill Frist in the center image. From farther away the image of Hillary Clinton disappears from the center image altogether. At both distances, however, the outer images of Frist and Clinton are easily discernible. This phenomenon is based on our inability to perceive high-frequency information from greater distances; if an image has no distinctive low-frequency component, it simply disappears when viewed from a distance.


Figure 3: Hillary Clinton (left), frequency composite of Hillary and Bill Frist (center), and original Bill Frist image (right).
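A composite like the one in figure 3 can be built with a pair of ideal frequency filters: keep the low frequencies of one image and add only the high frequencies (everything the low-pass discards) of another. Here is a minimal sketch in Python using numpy’s FFT (the article’s own examples use R; function names here are my own):

```python
import numpy as np

def lowpass(img, radius):
    """Keep only the frequency components within `radius` of the center
    (DC) of the shifted 2-D spectrum; discard everything else."""
    f = np.fft.fftshift(np.fft.fft2(img))
    h, w = img.shape
    y, x = np.ogrid[:h, :w]
    mask = (y - h // 2) ** 2 + (x - w // 2) ** 2 <= radius ** 2
    return np.real(np.fft.ifft2(np.fft.ifftshift(f * mask)))

def hybrid(low_img, high_img, radius):
    """Low frequencies of one image plus high frequencies of another.
    Up close the high-frequency image dominates; from a distance only
    the low-frequency image survives."""
    return lowpass(low_img, radius) + (high_img - lowpass(high_img, radius))
```

Because the ideal filter is linear and idempotent, low-pass filtering a hybrid recovers exactly the low-pass of the first image; the second image “disappears,” just as Clinton does at 10 feet.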

Not surprisingly, we hold specific spatial frequency registers for icons. Just as the color and shape choices for an icon design should be unique, so too should the frequency composition of the design. When a user searches through a UI to compare or find icons, his or her eyes jump all over the screen. The points where the eyes land are called fixation points, and the rapid eye movements between them are called saccades. Users see only roughly 1.5 degrees of visual angle in sharp focus (roughly the size of your thumbnail held at arm’s length); the rest of the image is processed in the M pathway and at lower spatial frequencies. At each fixation point, most of the icons in a UI fall outside of those 1.5 degrees. The key is to filter the icon images to ensure that they differ in low spatial frequency so as to preserve their uniqueness during visual search. (Filtering methods discussed here are based on the work of Loftus and Harley, 2005, who used filtering to create representations of faces at a distance.)
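To get a feel for how little falls inside that 1.5-degree window, here is a back-of-the-envelope calculation in Python. The 96 px/in. screen density is my assumption for illustration, not a figure from the article:

```python
import math

def foveal_span_px(distance_in, ppi=96.0, fovea_deg=1.5):
    """Width, in pixels, of the ~1.5 degrees of sharp foveal vision
    for a viewer `distance_in` inches from a `ppi` px/in. screen."""
    span_in = 2 * distance_in * math.tan(math.radians(fovea_deg / 2.0))
    return span_in * ppi

# At 24 in. from a 96 ppi display, sharp focus covers only about 60 px,
# barely wider than a single 48x48 icon; every other icon on screen is
# processed at low spatial frequency.
```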

The technique I show here requires the R statistical environment (R Development Core Team, 2005) and the add-on package called “rimage” (Windows, Linux, and OS X versions are available here). Once you have downloaded and installed R, you can download and install the “rimage” add-on from within the R program. (On Windows: start R, then select Packages » Install package(s). Choose a mirror from the dialog, then select the “rimage” package.)

Filtering instructions

After the R program is set up and the rimage package has been installed, you are ready to start. Collect a set of icons you wish to analyze and put them all into a single image using your favorite image editing program, as shown in figure 4. Save the image as a jpeg.


Figure 4

Start the R program. Load the rimage library and the icon collection image into R using the following commands in the console window:

> library(rimage)
> icons <- read.jpeg("address to your file")

Here “address to your file” is the full directory path to the icon collection image you saved. Make sure to enclose the address in quotes and use “/” instead of “\” to signify subdirectories. Mine looks like this:

> icons <- read.jpeg("C:/Documents and Settings/Queen/My Documents/icon-collection.jpg")

Press Enter and then view the image in a display window by typing:

> plot(icons)

Resize the window so that the images are full scale and not distorted. Now we’ll filter the images:

> plot(normalize(lowpass(icons,27.8)))

Fig 5

Figure 5: Filtered icon set

Some explanation is necessary here. The number 27.8 defines the radius of the frequency filter in frequency space. I’ll spare you the math lesson and give you a short table of calculations that solve for radial lengths based on user distance from the screen (calculations based on size-distance-invariance equations; see Gilinsky, 1951).

Icon Pixel Dimensions   Viewer Distance   Radius
[table values lost in conversion; each of four icon sizes was listed at viewing distances of 18 in., 24 in., and 36 in. with its corresponding filter radius. The radius of 27.8 used above corresponds to 48×48 icons viewed from 24 in.]

Using this table, you can see that I assumed the icons are 48×48 pixels and the viewer is 2 feet from the screen. As a general practice, filter the icons using every setting that might actually occur at use time and make sure the icons remain sufficiently unique (no studies elaborate on what counts as sufficient—so be overly cautious).
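For readers who would rather script this step outside R, the lowpass-and-normalize pipeline can be approximated in Python. This is a sketch assuming a 2-D grayscale array, not a drop-in replacement for rimage:

```python
import numpy as np

def lowpass(img, radius):
    # Ideal circular low-pass in frequency space (radius in pixels).
    f = np.fft.fftshift(np.fft.fft2(img))
    h, w = img.shape
    y, x = np.ogrid[:h, :w]
    keep = (y - h // 2) ** 2 + (x - w // 2) ** 2 <= radius ** 2
    return np.real(np.fft.ifft2(np.fft.ifftshift(f * keep)))

def normalize(img):
    # Rescale pixel values to the [0, 1] range for display,
    # analogous to what rimage's normalize() is used for here.
    lo, hi = img.min(), img.max()
    return (img - lo) / (hi - lo)

# icons = ...  # a 2-D grayscale array loaded from your icon sheet
# filtered = normalize(lowpass(icons, 27.8))
```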

I feel compelled to note that spatial frequency filtering is very different from simply blurring the image; blurring removes detail that the M pathway relies on for recognition. Figure 6 shows the very different results of frequency filtering and blurring.


Figure 6: The filtered image (left) is far more representative of what a user actually sees than the blurred image (right). Extreme differences can be seen in icons with tight detailed patterns such as the second icon on the bottom row.
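The difference is easy to demonstrate numerically: an ideal low-pass filter passes a low-frequency pattern completely untouched, while a blur (a Gaussian roll-off in frequency space) attenuates even the low frequencies the M pathway depends on. A small synthetic check in Python (the grating and filter sizes are illustrative choices, not values from the article):

```python
import numpy as np

def freq_filter(img, transfer):
    """Apply a transfer function defined over distance from the center
    of the shifted 2-D spectrum."""
    f = np.fft.fftshift(np.fft.fft2(img))
    h, w = img.shape
    y, x = np.ogrid[:h, :w]
    dist = np.sqrt((y - h // 2) ** 2 + (x - w // 2) ** 2)
    return np.real(np.fft.ifft2(np.fft.ifftshift(f * transfer(dist))))

n = 64
xs = np.arange(n)
# A low-frequency grating: 2 cycles across the image.
grating = np.tile(np.sin(2 * np.pi * 2 * xs / n), (n, 1))

ideal = freq_filter(grating, lambda d: (d <= 10).astype(float))   # ideal low-pass
blurred = freq_filter(grating, lambda d: np.exp(-(d ** 2) / 50))  # Gaussian roll-off

# The ideal filter returns the grating unchanged; the blur shrinks its amplitude.
```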

How effective are spatial frequency unique icons?

The following is a short study showing the benefits of using icons with unique low spatial frequency compositions. Ten users were shown 20 icon images (called “trial icons”) of varied size. Simultaneously, they were presented with two additional icon images and asked to click on the one that matched the trial icon, as shown in figure 7. Response times were recorded. The idea was to see whether icons with unique low-frequency compositions were easier to identify and therefore produced faster response times.


Figure 7: Experiment screenshot

With each presentation of a trial icon, the match icon (fig. 7, right) and distracter icon (fig. 7, left) had either similar or different low-frequency compositions. The response time data were then analyzed to determine whether having all three icons share similar low-frequency compositions slowed responses. If responses were slower, the match icon was assumed to be more difficult to identify. A box plot of the resulting dataset is shown in figure 8.


Figure 8: Dataset box plot

As you can see, on average, users identified icons with unique low spatial frequency compositions faster than those with compositions similar to the distracter icons. In fact, 75 percent of the response times in the similar-composition condition were no better than the average response when frequency differences were present. The frequency-unique icons produced identification times almost a half-second faster. Summing that difference over every icon search task at use time adds up to quite a bit of what could be critical decision-making time. Unique low-frequency compositions in icon designs make a noticeable difference.


  • MARPLOT: http://archive.orr.noaa.gov/cameo/marplot.html
  • Nicholls, J.G., Martin, A. R., and Wallace, B. G. (1992). From Neuron to Brain. Sinauer, 3rd edition.
  • Schyns, P. G., & Oliva, A. (1994). From blobs to boundary edges: Evidence for time- and spatial-scale-dependent scene recognition. Psychological Science, 5, 195–200.
  • Schyns, P. G., & Oliva, A. (1999). Dr. Angry and Mr. Smile: When categorization flexibly modifies the perception of faces in rapid visual presentations. Cognition, 69, 243–265.
  • Loftus, G. R., & Harley, E. M. (2005) Why is it easier to identify someone close than far away? Psychonomic Bulletin & Review, 12(1), 43-65.
  • Gilinsky, A.S. (1951) Perceived size and distance in visual space. Psychological Review, 58, 460-482.
  • R Development Core Team (2005) R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, http://www.R-project.org.

Interaction Modeling

“The relationship between actions and cognitive processes is important because it explains user behavior and translates to supportive arguments for good design solutions.”

Interaction modeling is a good way to identify and locate usability issues in the use of a tool. Several methods exist (see Olson & Olson, 1990, for a review of techniques). Modeling techniques are prescriptive in that they aim to capture what users will likely do; they are not descriptive of what users actually did.

Most methods—descriptive or prescriptive—fail to incorporate the relationship between user actions and cognitive processes. Models of cognitive processing, for example, might attempt to explain how or why a particular task is mentally difficult, yet the difficulty does not directly relate to observable user actions. Conversely, descriptive techniques such as path metrics, click stream analysis, and bread crumb tracking take little or no account of cognitive processes that lead to those user actions.

The relationship between actions and cognitive processes is important because it explains user behavior and translates to supportive arguments for good design solutions. Both prescriptive and descriptive techniques are necessary for characterizing the cognitive processing and user action (cog-action) relationship.

Collecting and reporting user data without presenting cog-action relationships results in poor problem definitions. In fact, practitioners often present no relationship between observed user behavior and the claim that there is a problem. Usability problems are presented as user verbalizations such as “I don’t know what I’m supposed to do here.” Although there is a cognitive reason why the user has fallen into this apparent uncertainty state, that reason is seldom presented. Further, the relationship between an identified problem and the solution to fix it is often not provided. If we don’t know why the behavior is a problem we can’t design a good solution.

This article presents a three-part method of interaction modeling where:

  • A prescriptive, preferred interaction model (PIM) is created
  • A descriptive user-interaction model (UIM) derived from an actual user study session is created
  • A model of problem solving and decision making (PDM) is used to interpret disparities between the first two models

Preferred Interaction Model (PIM)
A usability study design establishes success criteria. These criteria should be expressed as assumptions about user processes and behaviors. Creating a PIM, or the process you would like the user to follow, specifies those success criteria. Interaction models are a great tool for this endeavor and have enjoyed many published successes (for a good review of case studies, see Fischer 2000). There are three important things to remember about PIMs:

  • The PIM is created by the designer
  • The PIM should be based on task requirements, not functional requirements
  • The PIM exists in the system, not in the head of the user

Interaction models are typically quantitative frameworks requiring statistical validation. I use the term “model” in a more relaxed, qualitative way. The idea is to establish the PIM as a type of hypothesis or intended goal of development: “The system we designed supports X task/activity” (see, e.g., Soudack, 2003). The method presented here is a structured approach for handling that hypothesis based on observation and theory. It lends itself to quantitative methods but doesn’t require them.

Creating a PIM
A PIM is a type of flow diagram that represents the user’s probable decisions, cognitive processes, and actions. Here is a simple example for retrieving a phone number from a contact database (Fig. 1).

[Figure 1: PIM for a phone number retrieval interface of a contact database application]

PIM entities are the decision (diamond), cognitive process (rounded box), action (square box), and system signal (circle). The model ends with a special action entity represented by a rhombus shape.

The PIM starts with a decision (get #?), ensuring that the model can fit into the context of a larger model. Notice the sequencing of “cognitive state” then “action.” This mirrors the order of thinking and then acting that we would observe while watching users perform tasks. It also cues the modeler to encode a cognitive state (either a decision or a mental process) on either side of an action.

Decisions are modeled as yes/no or true/false statements. If multiple outcomes of a decision are necessary, consider using sub-decisions. For example, you might have a decision structure that looks like Figure 2.

[Figure 2: Nesting decisions can allow a complex outcome (choice A, B, or C, rather than just choice A or B)]

The granularity of the model detail can be determined based on the needs and constraints of the system. Perhaps a higher level model of the contact number example (Fig. 3) might be more useful with different study criteria.

[Figure 3: Models can be high level and need not articulate procedural, physical actions (e.g., click on red button, then move cursor to front of text, etc.)]

Frequently on projects, the PIM has been loosely established and exists in some unarticulated form. Parts of the PIM might be discovered within prototype interface mockups, development requirements, and/or the design plans of stakeholders. The PIM can be difficult to construct from such varied sources. However, completing it makes assumptions about preferred interaction explicit and testable. Clearly defined, testable assumptions are a necessity for this line of work.
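Even a qualitative PIM benefits from being written down in an explicit, checkable form. Below is a hypothetical encoding in Python of the phone-number example; the node ids, labels, and the `walk` helper are my own illustrations, not taken from the article’s figures:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    kind: str                 # "decision", "process", "action", "signal", or "end"
    label: str
    next: dict = field(default_factory=dict)  # decision choice (or "") -> node id

# Sketch of a PIM for retrieving a phone number from a contact database.
pim = {
    "d1": Node("decision", "need a number?", {"yes": "p1", "no": "end"}),
    "p1": Node("process", "recall contact name", {"": "a1"}),
    "a1": Node("action", "type name into search box", {"": "s1"}),
    "s1": Node("signal", "system displays matching contact", {"": "a2"}),
    "a2": Node("action", "read number from contact entry", {"": "end"}),
    "end": Node("end", "done", {}),
}

def walk(model, start, choices):
    """Trace a path through the model, consuming one choice per decision node."""
    path, node_id, choices = [start], start, list(choices)
    while model[node_id].kind != "end":
        node = model[node_id]
        key = choices.pop(0) if node.kind == "decision" else ""
        node_id = node.next[key]
        path.append(node_id)
    return path
```

Tracing `walk(pim, "d1", ["yes"])` yields the preferred path through all five entities, which can later be compared against an observed user trace.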

User state-trace analysis: recording the user-interaction model (UIM)
State-trace analysis (Bamber, 1979) compares a given group’s performance under controlled conditions to its performance under actual conditions. The method yields many interesting metrics and affordances. Collecting data for the UIM is somewhat similar to state-trace analysis yet differs in important ways: the UIM is collected under actual conditions (or conditions as close to actual as possible) and is then compared with the PIM.

Rather than attempting traditional state-trace analysis, user state-trace analysis focuses on the goal of the method. Here, we wish to capture qualitative behavioral data while observing users as they transition from cognitive states to action states. We then use these data to “trace” the user’s path through those states as they complete the provided task. The result is a model of the user’s performance that contains valuable information about decision-making and problem-solving based on the system interface in the context of the task. The UIM can be compared to the PIM because the two are similar in form and represent a similar process architecture.

Creating the UIM
User state-trace analysis is a type of coding that allows a researcher to trace the path of behavioral and cognitive states the user exhibits while completing a task. Use the same PIM entity types to create UIM diagrams. Instruct the user to “think out loud” and then trace the user’s path from cognitive processes to action states while they perform the provided task with the system.

This type of analysis has some caveats. First, real-time coding (i.e., recognizing, categorizing, and recording) of states is doomed to fail. The user might transition into states that are not well defined in terms of the task (e.g., an uncertainty process, or a stall action). The best practice is to videotape the study session and review it directly afterwards.

Expect to spend upwards of 10–20 times the video session length to complete a full-blown, accurate state-trace. Although a trace can be completed in an hour or less, plenty of extra time is spent determining salient user actions, arguing interpretations, and refining the complete trace model. As in most endeavors, the more decision makers involved, the more time spent.

If the task seems daunting, however, try restricting your level of trace detail to high level cognitive processes and actions and using the trace as an exploratory tool. This approach will drastically speed up the process.

When you start a trace diagram, it is a good idea to use the PIM as a reference point for your coding. Have a printout of the PIM on a data collection sheet so you can take notes on top of it. Above all else, be honest while collecting data. You shouldn’t find yourself making rationalizations such as:

  • “Well they pretty much did that state … I’ll just mark it on the UIM.”
  • “They weren’t actually in a ‘confused’ state for very long … I don’t think that counts.”
  • “This user isn’t even sticking to the task anymore … none of this really counts anymore.”

Be aware that there is a tendency to establish an entire process-action-process relationship before writing anything down. Instead, try to first recognize and label a few states and actions on your data collection sheet. Leave these observations as labels anywhere on the sheet, but do not “link them up.” As you complete a significant phase of a task, start to organize and edit your labeled entities. Working this way allows the trace to develop while relieving the analyst of the mental burden of guessing what will happen next.

Below is an example excerpt of a PIM and the constituent UIM retrieved from a user state-trace analysis session (Fig. 4). The representation in Figure 4 was obtained from a usability study of an interaction modeling tool. An analyst was asked to review a transcribed user study and assign models of decision making to various text passages.

[Fig 4: Excerpt from PIM/UIM of modeling tool process study.]

User state-traces provide several useful measurements:

  • Departures: the number and magnitude of states that happen outside the preferred model
  • Logical errors: the number of errors and recoverability from errors that resulted directly from departures
  • Lags: the amount of unproductive time spent in departures
  • Return steps: the number of obstructive returns to previous states
  • External processes: the dynamics of reoccurring processes that exist outside of the preferred model
  • Bandwidth constraints: the ability of the user to carry out cognitive processes and the amount of necessary resources available for them to do so
  • Action constraints: a cognitive process results in X possible actions though only a subset of these is available
  • Modal tracking: the discrepancy between application mode shifts and user mode shifts

Obtaining measurements from a user state-trace can result in a valuable dataset that reveals interesting patterns and trends. User state-trace analysis, however, is not a means of drawing inferences from these patterns nor is it a method of interpretation. A user state-trace reveals how the user performed the tasks, not why. The architecture of processes and actions exhibited by the user is generated by a cognitive mechanism the user engaged to deal with the task they were given. A better understanding of the underlying problem-solving and decision-making mechanisms will explain observed actions.
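With both the PIM and the UIM coded as state sequences, the first few measurements above can be computed mechanically. The sketch below is a hypothetical Python illustration; the state names are invented, and real traces would carry timestamps and richer labels:

```python
def departures(pim_states, uim_trace):
    """States observed in the trace that exist outside the preferred model."""
    preferred = set(pim_states)
    return [s for s in uim_trace if s not in preferred]

def return_steps(uim_trace):
    """Count returns to previously visited states (potentially obstructive loops)."""
    seen, returns = set(), 0
    for s in uim_trace:
        if s in seen:
            returns += 1
        seen.add(s)
    return returns

pim = ["decide to add", "enter character name", "press submit"]
uim = ["decide to add", "enter character name", "confused pause",
       "enter character name", "press submit"]
# departures(pim, uim) -> ["confused pause"]; return_steps(uim) -> 1
```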

Problem-solving and decision-making model (PDM)
Cognitive mechanisms assist in solving problems and/or making decisions in order to complete a task. These tend to fall into four basic classes (Lovett 2002):

  • Rule-based: The user decides that if X situation arises then they will take Y action. Rule-based models are often engaged when the interface adheres to a dialog-driven, procedural task design. Examples are grocery store checkout systems, and operating system setup wizards.
  • Experience-based: The user has been in this situation before. Last time they did X and it resulted in a desirable outcome, so they will do X again. Experience-based models are often engaged while performing tasks on systems users are familiar with. In usability studies, however, participants are frequently recruited based on the criterion of limited or no experience with a system.
  • Pattern-based: The user has seen a situation that appears to have all the same elements as the current situation. In the former situation, X resulted in a positive outcome, so they will do the analogous version of X here. Pattern-based models take surrogate roles for missing experience-based models. The mechanism that handles the pattern-to-experience-based replacement can itself be expressed as a model; in fact, this mechanism is regularly referred to as the user’s “mental model.”
  • Intuition-based: The user has a hunch that X will result in a desirable outcome so they will “follow their gut.” Intuition-based models are not well understood. Think of them as the user’s ability to distinguish patterns in the problem space that are far more detailed than the problem statement or situation will allow. Expert decision making is often categorized as intuition based.

To employ a model of problem solving and/or decision making (PDM) as an explanatory tool, it helps to diagram the model. An example of a rule-based mechanism is the satisficing model of decision making (Lovett 2002) (Fig. 5). In this model, a user chooses the first option they feel will accomplish the task, without considering other options or information. In the following example, the satisficing model is recruited to interpret departures observed between a PIM and the recorded UIM:

[Figure 5: The Satisficing model of decision making]
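In code form, satisficing is just a first-acceptable-option scan. The sketch below is a generic Python illustration (the option list and aspiration threshold are invented for the example):

```python
def satisfice(options, utility, aspiration):
    """Return the first option whose utility meets the aspiration level;
    later, possibly better options are never evaluated."""
    for option in options:
        if utility(option) >= aspiration:
            return option
    return None  # no option was "good enough"

# A satisficer takes the first good-enough choice (5), not the best one (9).
choice = satisfice([2, 5, 9], utility=lambda x: x, aspiration=4)
```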

Example Scenario: The application “Story Teller” (Fig. 6) allows a user to add characters from a story to a database. Characters are added to the database using the “add new character” function. Once a character is added, the application allows the user to count the frequency of appearance for a listed character. The PIM for adding a character to the database is illustrated in Figure 7. A user is recruited for participation in our study and given the simple task, “add a character to the character database and determine the number of times the character appears in the short story. The story file has already been loaded into the application.” Figure 8 shows the state-trace of what the user actually did.

[Figure 6: The Story Teller application interface with add character dialog]

[Figure 7: PIM of adding a character to the database]

[Figure 8: UIM obtained from a usability study session]

The UIM shows large departures. It appears that the user tried to “add a new character” every time they saw the character in the story. We might be tempted to explain this as a problem with labeling in the interface or poor task clarification during the study. We could instead employ the satisficing model to explain the departures for a richer interpretation:

  • Claim: There is a problem with the current interface.
  • Evidence: Large disparities between the PIM and state trace data (i.e., UIM) were observed.
  • Explanation: The user may be adhering to a satisficing model of decision making. The user continually adds the same character as a new entry to the database because the text field that accepts a character name is the first available entry point to the data input process. The text field also signals the user to recruit an experience-based model of problem solving: they copy and paste text to save time when the task is thought to require repetitive data entry. Additionally, the editable dropdown menu functions like a textbox, thereby invoking the “data-entry model” that prompts the copy-paste action.
  • Solution: Place a list of available (previously entered) characters below the “New Character” input box. If the list has the name of the character and a clearly identifiable number representing the current number of instances next to it, the user will be allowed to select the character name in the list box and press submit. This will fix the problem of re-adding the same character in addition to allowing the user to include character aliases in the count (e.g., Matt, and/or Matthew).

The satisficing model is an example of a rule-based model, but one that assumes the user is affected by cognitive bias. Common examples of cognitive-bias models are:

  • Anchoring and adjustment: Initial information presented to the user affects decisions more than information communicated to the user later in the task.
  • Selective perception: Users disregard information that they feel is not salient to completing the task.
  • Underestimating uncertainty: Users tend to feel they can always recover from faulty decisions in an application.

A good resource for the above models and a more detailed list can be found at the Wikipedia entry for decision making and problem solving (Wikipedia 2005).

It is a good practice to diagram several candidate cognitive-bias models before attempting to use them for explaining a specific departure. The diagramming allows you to get specific about exactly how the model explains the observed departure between the PIM and UIM. The final step is to include the cognitive-bias model as an annotation to the PIM with superimposed UIM (Fig. 9).

[Figure 9: Complete PIM, UIM, and PDM model integration]

The interaction-modeling technique provided here is useful for establishing usability success criteria and for uncovering usability issues. The PIM acts as a testable hypothesis, and the UIM establishes coded behavioral data. Major disparities observed between the PIM and UIM serve as evidence to support the claim that there is a viable usability issue. The use of cognitive decision-making and problem-solving models (PDMs) helps interpret and explain why the disparities exist. Together, these essential components (a viable usability claim, behavioral evidence, and theory-driven interpretations) inform the creation of, and rationale for, good user interface design solutions.


Olson, J. R., & Olson, G. M. (1990). The Growth of Cognitive Modeling in Human-Computer Interaction Since GOMS. Human-Computer Interaction, 5(2-3), 221-265.

Bamber, D. (1979). State trace analysis: A method of testing simple theories of causation. Journal of Mathematical Psychology, 19, 137-181.

Soudack, A. (2003). Don’t Test Users, Test Hypotheses. Boxes and Arrows, October 27.

Lovett, M.C. (2002). Problem Solving. In H. Pashler & D. Medin (Eds.),Stevens’ Handbook of Experimental Psychology: Memory and Cognitive Processes. New York: John Wiley & Sons.

Decision making. (2005, November 17). Wikipedia, The Free Encyclopedia. Retrieved December 1, 2005 from http://en.wikipedia.org/w/index.php?title=Decision_making.

Fischer, G. (2000). User Modeling in Human-Computer Interaction. User Modeling and User-Adapted Interaction, 11, 65-86.