Lecture 3 Learnability

Example

User Interface Hall of Shame

Source: Interface Hall of Shame

This dialog box, which appeared in a program that prints custom award certificates, presents the task of selecting a template for the certificate. This interface is clearly graphical. It’s mouse-driven–no memorizing or typing complicated commands. It’s even what-you-see-is-what-you-get (WYSIWYG)–the user gets a preview of the award that will be created.

So why isn’t it usable?

The first clue that there might be a problem here is the long help message on the left side. Why so much help for a simple selection task? Because the interface is bizarre! The scrollbar is used to select an award template.

Each position on the scrollbar represents a template, and moving the scrollbar back and forth changes the template shown.

This is a cute but poor use of a scrollbar. Notice that the scrollbar doesn’t have any marks on it. How many templates are there? How are they sorted? How far do you have to move the scrollbar to select the next one? You can’t even guess from this interface.

User Interface Hall of Shame

Source: Interface Hall of Shame

Normally, a horizontal scrollbar underneath an image (or document, or some other content) is designed for scrolling the content horizontally. A new or infrequent user looking at the window sees the scrollbar, assumes it serves that function, and ignores it. Inconsistency with prior experience and other applications tends to trip up new or infrequent users.

Another way to put it is that the horizontal scrollbar is an affordance for continuous scrolling, not for discrete selection. We see affordances out in the real world, too; a door knob says “turn me”, a handle says “pull me”.

We’ve all seen those apparently-pullable door handles with a little sign that says “Push”; and many of us have had the embarrassing experience of trying to pull on the door before we notice the sign. The help text on this dialog box is filling the same role here.

But the dialog doesn’t get any better for frequent users, either. If a frequent user wants a template they’ve used before, how can they find it? Surely they’ll remember that it’s 56% of the way along the scrollbar? This interface provides no shortcuts for frequent users. In fact, this interface takes what should be a random access process and transforms it into a linear process. Every user has to look through all the choices, even if they already know which one they want. The computer scientist in you should cringe at that algorithm.
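
To make the algorithmic complaint concrete, here is a minimal sketch in Python. The template names and both helper functions are hypothetical (the real program's templates are not known), and the sketch assumes the scrollbar starts at the first template. It only counts how many templates a user has to look at under each design.

```python
# Hypothetical template names; the original program's list is not known.
TEMPLATES = ["Blue Ribbon", "Gold Star", "Honor Roll",
             "Perfect Attendance", "Team Player"]

def views_with_scrollbar(target: str, templates: list[str]) -> int:
    """Unmarked scrollbar: step through templates in order until the target shows up."""
    for views, name in enumerate(templates, start=1):
        if name == target:
            return views
    raise ValueError(f"no template named {target!r}")

def views_with_listbox(target: str, templates: list[str]) -> int:
    """List box: every name is visible, so a user who knows the name picks it directly."""
    if target not in templates:
        raise ValueError(f"no template named {target!r}")
    return 1

print(views_with_scrollbar("Perfect Attendance", TEMPLATES))  # 4
print(views_with_listbox("Perfect Attendance", TEMPLATES))    # 1
```

The scrollbar cost grows with the position of the template in the list, while the list box cost stays constant for a user who already knows what they want.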

Even the help text has usability problems. “Press OKAY”? Where is that? And why does the message have a ragged left margin? You don’t see ragged left too often in newspaper and magazine layout, and there’s a good reason.

On the plus side, the designer of this dialog box at least recognized that there was a problem–hence the help message. But the help message is indicative of a flawed approach to usability.

Usability can’t be left until the end of software development, like package artwork or an installer. It can’t be patched here and there with extra messages or more documentation. It must be part of the process, so that usability bugs can be fixed, instead of merely patched.

How could this dialog box be redesigned to solve some of these problems?

The Example, Redesigned

Source: Interface Hall of Shame

Here’s one way it might be redesigned. The templates now fill a list box on the left; selecting a template shows its preview on the right. This interface suffers from none of the problems of its predecessor: list boxes clearly afford selection to new or infrequent users; random access is trivial for frequent users. And no help message is needed.

Definition

Initial Learnability

The system should be easy to learn by the class of users for whom it is intended.

Michelsen 1980

the effort required for a typical user to be able to perform a set of tasks … with a predefined level of proficiency

Santos & Badre

Allowing users to rapidly begin to work with the system

Holzinger

Extended Learnability

ease at which new users can begin effective interaction and achieve maximal performance

Dix et al.

Minimally useful with no formal training, and should be possible to master the software

Rieman

Taxonomy

Grossman et al., A Survey of Software Learnability: Metrics, Methodologies and Guidelines

Combination

Learning = Discovery + Understanding + Memory

How We Learn

How We Don't Learn

When computers first appeared in the world, there were some assumptions about how people would learn how to use the software. Programmers assumed that users would read the manual first–obviously not true.

Companies assumed that their employees would take a class first–not always true. Even now that we have online help built into virtually every desktop application, and web page help often just a search engine query away, users don’t go to the help first or read overviews.

All these statements have to be caveated, because in some circumstances–some applications, some tasks, some users–these might very well be the way the user learns. Very complex, professional-level tools might well be encountered in a formal training situation–that’s how pilots learn how to use in-cockpit software, for example. And some users (very few of them) do read manuals.

Nearly all the general statements we make in this class should be interpreted as “It Depends.” There will be contexts and situations in which they’re not true, and that’s one of the complexities of UI design.

Learning by Doing

So users don’t try to learn first – instead, they typically try to do what they want to do, and explore the interface to see if they can figure out how to do it. This practice is usually called learning by doing, and it means that the user is starting out with a goal already in mind; they are more interested in achieving that goal than in learning the user interface (so any learning that happens will be secondary); and the burden is on the user interface to clearly communicate how to use it and help the user achieve their first goal at the same time.

Seeking Help

Only when they get stuck in their learning-by-doing will a typical user look for help. This affects the way help systems should be designed, because it means most users (even first-timers) are arriving at help with a goal already in mind and an obstacle they’ve encountered to achieving that goal. A help system that starts out with a long text explaining The Philosophy of the System will not work. That philosophy will be ignored, because the user will be seeking answers to their specific problem.

Modern help systems understand this, and make it easy for the user to ask the question up front, rather than wading through pages of explanation.

Lessons for Designers

The fact that users are learning our interfaces by actually using them has some implications for how we should design them.

First, we should know something about what the users’ goals actually are – collecting information about that is a critical feature of the user-centered design process that we’ll talk about in a few lectures. If we’re designing for the wrong goals, users are going to struggle to figure out how to do what they want in our system.

Second, the UI should be the primary teacher of how to use it. The UI itself must communicate as clearly as possible how it’s supposed to be used, so that users can match their goals with appropriate actions in the system. Later, we’ll talk about a few specific techniques for doing this–affordances, feedback, and information scent.

Third, when the user does have to resort to help, that help should be searchable and goal-directed. Providing a 30-minute video tutorial probably won’t help people who learn by doing.

Learning by Watching

  • Learn lots from watching other users
  • Unfortunately, many tools are used alone

One more way that we learn how to use user interfaces is by watching other people use them. That’s a major way we navigate an unfamiliar subway system, for example.

Unfortunately much of our software–whether for desktops, laptops, tablets, or smartphones–is designed for one person, and you don’t often use it together with other people, reducing the opportunities for learning by watching. Yet seeing somebody else do it may well be the only way you can learn about some features that are otherwise invisible. For example, you probably know how to use Alt-Tab to switch between windows. How did you learn that? The UI itself certainly didn’t communicate it to you. Pinch-zooming on smartphones and tablets is similar–but pinch-zooming may have benefited from mass media advertising showing us all how to use it.

Social computing is changing this situation somewhat. We’ll look at Twitter in a moment, and see that you can learn some things from other people even though they’re not sitting next to you.

Information Architecture

Example

Let's order some personal training lessons at DAPER

In 2019 I wanted to order some personal training lessons at DAPER.

What's wrong?

I navigated to the recreation page and saw this. Where should I go?

What's wrong?

It turns out that the order page for personal training is "Buy Series Sales" (itself a mystery) under account information, which makes no sense.

New Improved Site

Today's version of the site is much improved. There's a top-level menu listing common user goals.

New Improved Site

And the dropdown for workout includes sensible categories, including private lessons.

Information Architecture

No user enjoys content that is disorganized and difficult to navigate. Because finding the information they need takes so long, people would rather leave the website immediately than keep searching for what they want, even if the content is useful and the user interface is pretty.

Information Scent

Users depend on visible cues to figure out how to achieve their goals with the least effort. For information gathering tasks, like searching for information on the web, it turns out that this behavior can be modeled much like animals foraging for food. An animal feeding in a natural environment asks questions like: Where should I feed? What should I try to eat (the big rabbit that’s hard to catch, or the little rabbit that’s less filling)? Has this location been exhausted of food that’s easy to obtain, and should I try to move on to a more profitable location? Information foraging theory claims that we ask similar questions when we’re collecting information: Where should I search? Which articles or paragraphs are worth reading? Have I exhausted this source, should I move on to the next search result or a different search? (Pirolli & Card, “Information Foraging in Information Access Environments,” CHI ‘95.)

An important part of information foraging is the decision about whether a hyperlink is worth following – i.e., does this smell good enough to eat? Users make this decision with relatively little information – sometimes only the words in the hyperlink itself, sometimes with some context around it (e.g., a Google search result also includes a snippet of text from the page, the site’s domain name, the length of the page, etc.). These cues are information scent – the visible properties of a link that indicate how profitable it will be to follow the link. (Chi et al., “Using Information Scent to Model User Information Needs and Actions on the Web”, CHI 2001.)
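
As a rough illustration of scent, the Python sketch below scores each link by the fraction of the user's goal words that appear in its visible text. This is a deliberately crude stand-in, not the model used by Chi et al.; the function name is made up, and the link labels are loosely based on the DAPER example earlier in these notes.

```python
def scent(goal: str, link_text: str) -> float:
    """Fraction of the goal's words that appear in the link's visible text."""
    goal_words = set(goal.lower().split())
    link_words = set(link_text.lower().split())
    if not goal_words:
        return 0.0
    return len(goal_words & link_words) / len(goal_words)

goal = "personal training lessons"
links = [
    "Buy Series Sales",
    "Private lessons and personal training",  # hypothetical label
    "Account information",
]

# The user follows whichever link smells most like the goal.
best = max(links, key=lambda text: scent(goal, text))
print(best)  # Private lessons and personal training
```

Even this toy measure shows why "Buy Series Sales" fails: none of the words a user has in mind appear in the label.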

Hierarchy of Exploration Costs

For the user, collecting information scent cues is done progressively, with steadily increasing cost.

Some properties can be observed very quickly, with a glance over the interface: detecting affordances (like buttons or hyperlinks, if they’re well designed), recognizing icons (like a magnifying glass), or short and very visible words (like Search in big bold text).

With more effort, the user can read: long labels, help text, or search result snippets. Reading is clearly more expensive than glancing, because it requires focusing and thinking.

Still more time and effort is required to hover the mouse or press down, because your hands have to move, not just your eyes. We inspect menubars and tooltips this way. Note that tooltips are even more costly, because you often have to wait a moment for the tooltip to appear.

Clicking through a link or bringing up a dialog box is next, and actually invoking a command to see its effect is the costliest way to explore.

Exploration is important to learning. But much of this reading has been about techniques for reducing the costs of exploration, and making the right feature more obvious right away. An interface with very poor affordances will be very expensive to explore. Imagine a webpage whose links aren’t distinguished by underlining or color – you’ve just taken away the Glance, and forced the user to Read or Hover to discover what’s likely to be clickable. Now imagine it in a foreign language – you’ve just taken away Read. Now get rid of the mouse cursor feedback – no more Hover, and the user is forced to Click all over the place to explore. Your job as a designer is to make the user’s goal as easy to recognize in your user interface as possible.
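
The thought experiment above can be sketched in a few lines of Python. The enum and function names are invented for this illustration, and the numeric values are only ordinal ranks (cheapest first), not measured costs.

```python
from enum import IntEnum

class Explore(IntEnum):
    """Exploration actions, ranked from cheapest to costliest."""
    GLANCE = 1
    READ = 2
    HOVER = 3
    CLICK = 4
    INVOKE = 5

def cheapest_action(available: set[Explore]) -> Explore:
    """Return the cheapest exploration action the interface still supports."""
    return min(available)

everything = set(Explore)
no_link_styling = everything - {Explore.GLANCE}        # links not underlined or colored
foreign_language = no_link_styling - {Explore.READ}    # labels can't be read
no_cursor_feedback = foreign_language - {Explore.HOVER}  # no hover feedback either

for label, available in [
    ("well designed", everything),
    ("no link styling", no_link_styling),
    ("foreign language", foreign_language),
    ("no cursor feedback", no_cursor_feedback),
]:
    print(label, "->", cheapest_action(available).name)
```

Each design flaw removes the cheapest remaining option, so the user's minimum cost of exploring climbs from GLANCE to READ to HOVER to CLICK.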

Give Good Information Scent

Hyperlinks in your interface – or in general, any kind of feature, including menu commands and toolbar buttons – should provide good, appropriate information scent.

Examples of bad scent include misleading terms, incomprehensible jargon (like “Set Program Access and Defaults” on the Windows XP Start menu), too-general labels (“Tools”), and overlapping categories (“Customize” and “Options” found in old versions of Microsoft Word).

Examples of good scent can be seen in the (XP-style) Windows Control Panel on the left, which was carefully designed. Look, for example, at “Printers and Other Hardware.” Why do you think printers were singled out?

Presumably because task analysis (and collected data) indicated that printer configuration was a very common reason for visiting the Control Panel. Including it in the label improves the scent of that link for users looking for printers. (Look also at the icon – what does that add to the scent of Printers & Other Hardware?)

Date, Time, Language, and Regional Options is another example. It might be tempting to find a single word to describe this category – say, Localization – but its scent for a user trying to reset the time would be much worse.

Bad & Good Information Scent

Here are some examples from the web. Poor information scent is on the left; much better is on the right.

The first example shows an unfortunately common pathology in web design: the “click here” link. Hyperlinks tend to be highly visible, highly salient, easy to pick out at a glance from the web page – so they should convey specific scent about the action that the link will perform. “Click here” says nothing. Your users won’t read the page, they’ll scan it.
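
One way to catch this pathology before users do is a simple check over a page's link labels. The Python sketch below is only an illustration: the list of vague phrases is an assumption, and the function name and sample labels are made up.

```python
import re

# Assumed list of labels that say nothing about the destination; extend as needed.
VAGUE_LINK_TEXT = re.compile(r"click here|here|more|read more|link", re.IGNORECASE)

def vague_links(link_texts: list[str]) -> list[str]:
    """Return the link labels that carry no scent about where they lead."""
    return [text for text in link_texts if VAGUE_LINK_TEXT.fullmatch(text.strip())]

labels = ["Click here", "Download the course syllabus", "more", "Contact the instructor"]
print(vague_links(labels))  # ['Click here', 'more']
```

The fix for each flagged label is to move the descriptive words into the link itself, for example linking "Download the course syllabus" rather than a trailing "click here".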

Notice that the quality of information scent depends on the user’s particular goal. A design with good scent for one set of goals might fail for another set. For example, if a shopping site has categories for Music and Movies, then where would you look for a movie soundtrack? One solution to this is to put it in both categories, or to provide “See Also” links in each category that direct the user sideways in the hierarchy.

Lots of scent but hard to scan/glance

Here’s an example of going overboard with information scent. There is so much text in the main links of this page (Search listings…, Advertise…, See…, Browse…) that it interferes with your ability to Glance over the page. A better approach would be to make the links themselves short and simple, and use the smaller text below each link to provide supporting scent.

Interaction Learnability

User Interface Hall of Shame

Source: Interface Hall of Shame

IBM’s RealCD is CD player software, which allows you to play an audio CD in your CD-ROM drive.

Why is it called “Real”? Because its designers based it on a real-world object: a plastic CD case. This interface has a metaphor, an analog in the real world.

Metaphors are one way to make an interface more learnable, since users can make guesses about how it will work based on what they already know about the interface’s metaphor.

Unfortunately, the designers’ careful adherence to this metaphor produced some remarkable effects, none of them good.

Here’s how RealCD looks when it first starts up. Notice that the UI is dominated by artwork, just like the outside of a CD case is dominated by the cover art. That big RealCD logo is just that–static artwork. Clicking on it does nothing.

There’s an obvious problem with the choice of metaphor, of course: a CD case doesn’t actually play CDs. The designers had to find a place for the player controls–which, remember, serve the primary task of the interface–so they arrayed them vertically along the case hinge. The metaphor is dictating control layout, against all other considerations.

Slavish adherence to the metaphor also drove the designers to disregard all consistency with other desktop applications. Where is this window’s close box? How do I shut it down? You might be able to guess, but is it obvious? Learnability comes from more than just metaphor.

User Interface Hall of Shame

Source: Interface Hall of Shame

But it gets worse. It turns out, like a CD case, this interface can also be opened. Oddly, the designers failed to sensibly implement their metaphor here. Clicking on the cover art would be a perfectly sensible way to open the case, and not hard to discover once you get frustrated and start clicking everywhere. Instead, it turns out the only way to open the case is by a toggle button control (the button with two little gray squares on it). Opening the case reveals some important controls, including the list of tracks on the CD, a volume control, and buttons for random or looping play. Evidently the metaphor dictated that the track list belongs on the “back” of the case. But why is the cover art more important than these controls? A task analysis would clearly show that adjusting the volume or picking a particular track matters more than viewing the cover art.

And again, the designers ignore consistency with other desktop applications. It turns out that not all the tracks on the CD are visible in the list. Could you tell right away? Where is its scrollbar?

User Interface Hall of Shame

Source: Interface Hall of Shame

We’re not done yet. Where is the online help for this interface?

First, the CD case must be open. You had to figure out how to do that yourself, without help. With the case open, if you move the mouse over the lower right corner of the cover art, around the IBM logo, you’ll see some feedback. The corner of the page will seem to peel back. Clicking on that corner will open the Help Browser.

The aspect of the metaphor in play here is the liner notes included in a CD case. Removing the liner notes booklet from a physical CD case is indeed a fiddly operation, and alas, the designers of RealCD have managed to replicate that part of the experience pretty accurately. But in a physical CD case, the liner notes usually contain lyrics or credits or goofy pictures of the band, which aren’t at all important to the primary task of playing the music. RealCD puts the help in this invisible, nearly unreachable, and probably undiscoverable booklet.

This example has several lessons: first, that interface metaphors can be horribly misused; and second, that the presence of a metaphor does not at all guarantee an “intuitive”, or easy-to-learn, user interface. (There’s a third lesson too, unrelated to metaphor–that beautiful graphic design doesn’t equal usability, and that graphic designers can be just as blind to usability problems as programmers can.)

Fortunately, metaphor is not the only way to achieve learnability. In fact, it’s probably the hardest way, fraught with the most pitfalls for the designer. In this lecture, we’ll look at some other ways to achieve learnability.

More UI Hall of Shame

Here’s another bizarre interface, taken from a program that launches housekeeping tasks at scheduled intervals.

The date and time look like editable fields (affordance), but you can’t edit them with the keyboard. Instead, if you want to change the time, you have to click on the Set Time button to bring up a dialog box.

This dialog box displays time differently, using 12-hour time (7:17 pm) where the original dialog used 24-hour time (consistency). Just to increase the confusion, it also adds a third representation, an analog clock face.

So how is the time actually changed? By clicking mouse buttons: clicking the left mouse button increases the minute by 1 (wrapping around from 59 to 0), and clicking the right mouse button increases the hour. Sound familiar? This designer has managed to turn a sophisticated graphical user interface, full of windows, buttons, and widgets, and controlled by a hundred-key keyboard and two-button mouse, into a clock radio!
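
To see how costly this is, the Python sketch below counts the clicks needed to set a time. It assumes that each click adds exactly one, that minutes wrap from 59 to 0 without carrying into the hour, and that hours cycle through all 24 values; the function name is hypothetical.

```python
def clicks_needed(current: tuple[int, int], target: tuple[int, int]) -> int:
    """Count mouse clicks to move from current (hour, minute) to target.

    Assumes increment-only controls: minutes wrap 59 -> 0 with no carry,
    and hours wrap 23 -> 0.
    """
    cur_h, cur_m = current
    tgt_h, tgt_m = target
    minute_clicks = (tgt_m - cur_m) % 60  # left-button clicks
    hour_clicks = (tgt_h - cur_h) % 24    # right-button clicks
    return minute_clicks + hour_clicks

# Moving the time back by a single minute takes 59 clicks.
print(clicks_needed((19, 17), (19, 16)))  # 59
# Changing 7:17 pm to 7:05 am takes 48 + 12 clicks.
print(clicks_needed((19, 17), (7, 5)))    # 60
```

Typing the same time into an editable field would take a handful of keystrokes, regardless of how far the new time is from the old one.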

Perhaps the worst part of this example is that it’s not a result of laziness. Somebody went to a lot of effort to draw that clock face with hands. If only they’d spent some of that time thinking about usability instead.

Affordances and Signifiers

Affordances & Signifiers

Affordances

  • Actual properties of a thing that determine how the thing could be used
  • Depend on thing and you

Signifiers

  • Hint/indication of an affordance
  • Should match true affordances

Affordance refers to the actual properties of a thing, primarily the properties that determine how the thing could be operated. Chairs have properties that make them suitable for sitting. Signifier refers to perceived properties of a thing that hint at an affordance. Doorknobs are the right size and shape for a hand to grasp and turn. A button’s properties say “push me with your finger.” Scrollbars say that they continuously scroll or pan something that you can’t entirely see. Signifiers are how an interface communicates nonverbally, telling you how to operate it.

Signifiers are rarely innate – they are learned from experience. We recognize properties suitable for sitting on the basis of our long experience with chairs. We recognize that listboxes allow you to make a selection because we’ve seen and used many listboxes, and that’s what they do.

Note that signifiers can lie about affordances. A facsimile of a chair made of papier-mâché has a signifier for sitting, but it doesn’t actually afford sitting: it collapses under your weight. Conversely, a fire hydrant has no signifier for sitting, since it lacks a flat, human-width horizontal surface, but it actually does afford sitting, albeit uncomfortably.

Recall the textbox from the alarm clock, whose signifier (type a time here) disagrees with what it can actually do (you can’t type, you have to push the Set Time button to change it). Or the door handle here, whose nonverbal message (signifier) clearly says “pull me” but whose label says “push” (which is presumably what it actually affords). The parts of a user interface should agree in signifier and actual affordances.
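
The agreement rule can be stated as a simple check. In the Python sketch below, the `Widget` class and its two flags are a hypothetical model, not part of any real toolkit. The first two widgets reflect the scheduling dialog described earlier, and the third is invented for contrast.

```python
from dataclasses import dataclass

@dataclass
class Widget:
    name: str
    looks_editable: bool  # signifier: what the appearance suggests
    is_editable: bool     # affordance: what the widget actually allows

def mismatches(widgets: list[Widget]) -> list[str]:
    """Return the names of widgets whose signifier disagrees with their affordance."""
    return [w.name for w in widgets if w.looks_editable != w.is_editable]

dialog = [
    Widget("date field", looks_editable=True, is_editable=False),
    Widget("time field", looks_editable=True, is_editable=False),
    Widget("task name field", looks_editable=True, is_editable=True),  # invented
]
print(mismatches(dialog))  # ['date field', 'time field']
```

Either side of a mismatch can be fixed: make the field actually editable, or change its appearance so that it stops promising something it can't do.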

The original definition of affordance (from psychology) referred only to actual properties, but when it was imported into human computer interaction, perceived properties became important too. Actual ability without any perceivable ability is an undesirable situation. We wouldn’t call that an affordance. Suppose you’re in a room with completely blank walls. No sign of any exit – it’s missing all the usual cues for a door, like an upright rectangle at floor level, with a knob, and cracks around it, and hinges where it can pivot. Completely blank walls. But there is actually an exit, cleverly hidden so that it’s seamless with the wall, and if you press at just the right spot it will pivot open. Does the room have an “affordance” for exiting? To a user interface designer, no, it doesn’t, because we care about how the room communicates what should be done with it. To a psychologist (and perhaps an architect and a structural engineer), yes, it does, because the actual properties of the room allow you to exit, if you know how.

Don Norman, who originally imported the psychology term affordance into design and HCI, and popularized it with his wonderful book The Design of Everyday Things, now regrets the confusion of using the same word, “affordance”, for both the perception of use and the reality of use. He proposes signifier as a better word for a perceived affordance, so that affordance can be reserved for actual affordances, as in psychology. (Don Norman, “Signifiers, not affordances”, Interactions 2008.)

Use Appropriate Signifiers

Here are some more examples of commonly-seen affordances in graphical user interfaces. Buttons and hyperlinks are the simplest form of affordance for actions. Buttons are typically metaphorical of real-world buttons, but the underlined hyperlink has become an affordance all on its own, without reference to any physical metaphor.

Downward-pointing arrows, for example, indicate that you can see more choices if you click on the arrow. The arrow actually does double-duty – it makes visible the fact that more choices are available, and it serves as a hotspot for clicking to actually make it happen.

Texture suggests that something can be clicked and dragged – relying on the physical metaphor, that physical switches and handles often have a ridged or bumpy surface for fingers to more easily grasp or push.

Mouse cursor changes are another kind of affordance – a visible property of a graphical object that suggests how you operate it. When you move the mouse over a hyperlink, for example, you get a finger cursor. When you move over the corner of a window, you often get a resize cursor; when you move over a textbox, you get a text cursor (the “I-bar”).

Finally, the visible highlighting that you get when you move the mouse over a menu item or a button is another kind of affordance. Because the object visibly responds to the presence of the mouse, it suggests that you can interact with it by clicking.

Evolution of Hyperlinks and Buttons

Hyperlinks and buttons have evolved and changed significantly. The top row shows how hyperlinks and buttons looked circa 1995 (on NCSA Mosaic, the first widely-used web browser, which used the Motif graphical user interface toolkit). What properties did they have that distinguished them and made them clickable? Which of those properties have been lost over time, presumably as users become more familiar with these objects? The drive toward simplicity is a constant force in aesthetics and user interface design, so affordances tend to diminish rather than increase.

The bottom row shows a hyperlink which has been simplified too far, and an HTML button that has not only been simplified but has also lost its mouse cursor affordance. This goes too far.

What’s Wrong With This?

The story of affordances isn’t purely reductionist. Sometimes you can’t boil the affordance down to a single property like its color or a 3D border. This thing here is a button; but it’s so large, and has such a disproportionate relationship between the area and the label, that it loses its sense of clickability.

What Can You Do With This Page?

Here is the Campus Preview Weekend 2011 website. If the user wants an overview of all the events happening that weekend, the user may end up just clicking through the days individually, because those links (at the bottom) are the most salient affordances for interaction.

But it turns out that the graphic in the center of the page is actually a link to a nifty search interface that lets the user look at all the event listings in addition to other cool functionalities. Unfortunately the graphic doesn’t have strong affordances for interaction. It’s mostly a big logo, so what does a typical user do? Glance at it and then ignore it, scanning the page instead for things that look like actions, such as the clearly marked hyperlinks at the bottom. The “click here to search” text in the logo doesn’t work.

(example and explanation due to Dina Betser)

Signifiers depend on User & Culture

Interaction Styles

Recognition vs. Recall

It’s important to make a distinction between recognition (remembering with the help of a visible cue, also known as knowledge in the world) and recall (remembering something with no help from the outside world–purely knowledge in the head). Recognition is far, far easier than uncued recall.

Psychology experiments have shown that the human memory system is almost unbelievably good at recognition. In one study, people looked at 540 words for a brief time each, then took a test in which they had to determine which of a pair of words they had seen on that 540-word list. The result? 88% accuracy on average! Similarly, in a study with 612 short sentences, people achieved 89% correct recognition on average.

Note that since these recognition studies involve so many items, they are clearly going beyond working memory, despite the absence of elaborative rehearsal. Other studies have demonstrated this by extending the interval between the viewing and the testing. In one study, people looked briefly at 2,560 pictures, and then were tested a year later–and they were still 63% accurate in judging which of two pictures they had seen before, significantly better than chance. One more: people were asked to study an artificial language for 15 minutes, then tested on it two years later–and their performance in the test was better than chance.

Interaction Style #1: Command Language

User types commands in an artificial language
  • all knowledge in the head
  • low learnability
  • still used, e.g. Google query operators

The earliest computer interfaces were command languages: job control languages for early computers, which later evolved into the Unix command line.

Although command languages are rarely the first choice of a user interface designer nowadays, they still have their place–often as an advanced feature embedded inside another interaction style. For example, Google’s query operators form a command language. Even the URL in a web browser is a command language, with particular syntax and semantics.
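
A short Python sketch shows the syntax hiding inside a URL. The URL here is an example chosen for illustration; it embeds a query that uses Google's `site:` operator, so one command language is nested inside another.

```python
from urllib.parse import urlsplit, parse_qs

# Example URL chosen for illustration.
url = "https://www.google.com/search?q=learnability+site%3Amit.edu"

parts = urlsplit(url)
print(parts.scheme)  # https
print(parts.netloc)  # www.google.com
print(parts.path)    # /search

# The query string has its own syntax: key=value pairs, with "+" and "%3A" as escapes.
query = parse_qs(parts.query)
print(query["q"][0])  # learnability site:mit.edu
```

None of this structure is visible in the interface; a user has to carry it in their head, which is exactly the learnability cost of a command language.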

Interaction Style #2: Menus and Forms

A menu/form interface presents a series of menus or forms to the user. Traditional (Web 1.0) web sites behave this way. Most graphical user interfaces have some kind of menu/forms interaction, such as a menubar (which is essentially a tree of menus) and dialog boxes (which are essentially forms).

Interaction Style #3: Direct Manipulation

Next we have direct manipulation: the preeminent interface style for graphical user interfaces. Direct manipulation is defined by three principles [Shneiderman, Designing the User Interface, 2004]:

  1. A continuous visual representation of the system’s data objects. Examples of this visual representation include: icons representing files and folders on your desktop; graphical objects in a drawing editor; text in a word processor; email messages in your inbox. The representation may be verbal (words) or iconic (pictures), but it’s continuously displayed, not displayed on demand. Contrast that with the behavior of ed, a command-language-style text editor: ed only displayed the text file you were editing when you gave it an explicit command to do so.
  2. The user interacts with the visual representation using physical actions or labeled button presses. Physical actions might include clicking on an object to select it, dragging it to move it, or dragging a selection handle to resize it. Physical actions are the most direct kind of actions in direct manipulation–you’re interacting with the virtual objects in a way that feels like you’re pushing them around directly. But not every interface function can be easily mapped to a physical action (e.g., converting text to boldface), so we also allow for “command” actions triggered by pressing a button–but the button should be visually rendered in the interface, so that pressing it is analogous to pressing a physical button.
  3. The effects of actions should be rapid (visible as quickly as possible), incremental (you can drag the scrollbar thumb a little or a lot, and you see each incremental change), reversible (you can undo your operation–with physical actions this is usually as easy as moving your hand back to the original place, but with labeled buttons you typically need an Undo command), and immediately visible (the user doesn’t have to do anything to see the effects; by contrast, a command like “cp a.txt b.txt” has no immediately visible effect).

Why is direct manipulation so powerful? It exploits perceptual and motor skills of the human machine–and depends less on linguistic skills than command or menu/form interfaces. So it’s more “natural” in a sense, because we learned how to manipulate the physical world long before we learned how to talk, read, and write.

Interaction Style #4: Speech Dialog

A fourth interaction style–once the province of research, but now increasingly important in real deployed apps–is speech dialog in natural language. (This exchange is from the Mercury system, a flight-search system developed at MIT in the 1990s, which could be used over the phone.)

Speech dialog leans heavily on knowledge in the head. Much of this knowledge is “natural”–in the sense that humans learn how to speak and understand their native language very early in their lives, and have a special innate facility for spoken interaction. But beyond the mechanics of speaking, the user still needs to learn what they can say. What functionality is available in the system? What can I ask for? This is a fundamental problem even in human-human interaction, and is the reason why fast-food restaurant drive-through windows display a menu.

Comparison of Interaction Styles

Let’s compare and contrast the four styles: command language (CL), menus and forms (MF), direct manipulation (DM), and speech dialog (SD).

Learnability: knowledge in the head vs. knowledge in the world. CL requires significant learning. Users must put a lot of knowledge into their heads in order to use the language, by reading, training, practice, etc. (Or else compensate by having manuals, reference cards, or online help close at hand while using the system.) The MF style puts much more information into the world, i.e. into the interface itself. Well-designed DM also has information in the world, delivered by the affordances, feedback, and constraints of the visual metaphor. Since recognition is so much easier than recall, this means that MF and DM are much more learnable and memorable than CL or SD.

Error messages: CL, MF, and SD often have error messages (e.g. “you didn’t enter a phone number”), but DM rarely needs error messages. There’s no error message when you drag a scrollbar too far, for example; the scrollbar thumb simply stops, and the visual constraints of the scrollbar make it obvious why it stopped.

Efficiency: Experts can be very efficient with CL and SD, since they don’t need to wait for and visually scan system prompts, and many CL systems have command histories and scripting facilities that allow commands to be reused rather than constantly retyped. Efficient performance with MF interfaces demands good shortcuts (e.g. keyboard shortcuts, tabbing between form fields, typeahead). Efficient performance with DMs is possible when the DM is appropriate to the task; but using DM for a task it isn’t well-suited for may feel like manual labor with a mouse.

User type: CL is generally better for expert users, who keep their knowledge active and who are willing to invest in training and learning in exchange for greater efficiency. MF, DM, and SD are generally better for novices and infrequent users.

Synchrony: Command languages are synchronous (first the user types a complete command, then the system does it). So are menu systems and forms; e.g., you fill out a web form, and then you submit it. Speech requires turn-taking between the system and user, so it’s synchronous as well. DM, on the other hand, is asynchronous: the user can point the mouse anywhere and do anything at any time. DM interfaces are necessarily event driven.
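The difference between the two control flows can be sketched in a few lines of Python. This is only an illustration under assumed names (run_command_loop, EventDispatcher, and the event names are hypothetical, not any real toolkit’s API): the synchronous loop consumes one complete command at a time, while the event-driven dispatcher registers handlers up front and lets the user decide what happens next.

```python
from collections import defaultdict

def run_command_loop(commands, handlers):
    """Synchronous style: each complete command is processed in order,
    and the next one is not read until the current one finishes."""
    results = []
    for line in commands:
        name, *args = line.split()
        results.append(handlers[name](*args))
    return results

class EventDispatcher:
    """Event-driven style: handlers are registered up front, and events
    may arrive in any order, whenever the user produces them."""
    def __init__(self):
        self._handlers = defaultdict(list)

    def on(self, event_name, handler):
        # Register a callback for one kind of event (hypothetical names).
        self._handlers[event_name].append(handler)

    def dispatch(self, event_name, **data):
        # Call every handler registered for this event.
        return [handler(**data) for handler in self._handlers[event_name]]
```

In the first style the program owns the control flow; in the second, the user does, which is why DM toolkits are built around callbacks or listeners.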

Combining Multiple Interaction Styles

Real user interfaces often combine multiple interaction styles to make up for deficiencies in one style. For example, the Siri system built into iOS has both speech dialog (the user speaks something like “wake me up in one hour”, and the system replies with speech) and menu/form (the alarm time and on/off setting can be manipulated here).

Example: Twitter’s Tweet Creation UI

Let’s look at Twitter’s interface–specifically, let’s focus on the interface for creating a new tweet. What aspects of this interface are knowledge-in-the-world, and what aspects require knowledge in the head? In what way is Twitter a hybrid of a command language and a menu/form interface?

Twitter is actually an unusual kind of command interface in that examples of “commands” (formatted tweets generated by other users) are constantly flowing at the user. So the user can do a lot of learning by watching on Twitter. On the other hand, learning by doing is somewhat more embarrassing, because your followers can all see your mistakes (the incorrect tweets you send out while you’re still figuring out how to use it).

Self Disclosure

Self-disclosure is a technique for making a command language more visible, helping the user learn the available commands and syntax. Self-disclosure is useful for interfaces that have both a traditional GUI (with menus and forms and possibly direct manipulation) as well as a command language (for scripting). When the user issues a command in the GUI part, the interface also displays the command in the command language that corresponds to what they did. A primitive form of self-disclosure is the address bar in a web browser–when you click on a hyperlink, the system displays to you the URL that you could have typed in order to visit the page. A more sophisticated kind of self-disclosure happens in Excel: when you choose the sum function from the toolbar, and drag out a range of cells to be summed, Excel shows you how you could have typed the formula instead. (Notice that Excel also uses a tooltip, to make the syntax of the formula more visible.)
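As a rough sketch of the idea (not a description of how Excel is implemented), the code below turns a GUI gesture, dragging out a range of cells for a sum, into the formula text the user could have typed. The function names and the zero-based (row, column) representation of the drag are assumptions made for illustration.

```python
def column_letter(index):
    """Convert a zero-based column index to spreadsheet letters (0 -> A, 26 -> AA)."""
    letters = ""
    index += 1
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters

def disclose_sum(start, end):
    """Return the formula text equivalent to dragging out a range for a sum.
    start and end are zero-based (row, column) pairs; the drag may go in
    any direction, so the corners are normalized first."""
    (r1, c1), (r2, c2) = start, end
    top, bottom = sorted((r1, r2))
    left, right = sorted((c1, c2))
    return f"=SUM({column_letter(left)}{top + 1}:{column_letter(right)}{bottom + 1})"
```

An interface built this way would display the returned string (for a drag from the first to the fifth cell of column A, =SUM(A1:A5)) next to the result, so that each GUI action doubles as a lesson in the command language.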

On the bottom is another example of self-disclosure: Google’s Advanced Search form, which allows the user to specify search options by selecting them from menus, the results of which are also displayed as a command-based query (“microsoft windows” “operating system” OR OS -glass -washing site:microsoft.com) which can be entered on the main search page. (example suggested by Geza Kovacs)

Try It: Google Autosuggest to Find Learnability Problems

  • Look at the suggested queries for prefixes such as:
    • “photoshop how to”
    • “iphone how to”
    • “android how to”
  • What kind of goals do you see?
  • What kind of goals don’t you see?
  • What does it say about the learnability of the UI for that task?
Adam Fourney, Richard Mann, and Michael Terry. “Characterizing the Usability of Interactive Applications Through Query Log Analysis.” CHI 2011

Search engines have become even more important than in-application help systems, however. And a wonderful thing about search engines is that they show us query suggestions, so we can get some insight into the goals of thousands of other users. What is it that they’re trying to do with their iPhone, but isn’t easily learnable from the interface?

Conceptual Models

Models

Regardless of interaction style, learning a new system requires the user to build a mental model of how the system works. Learnability can be strongly affected by difficulties in building that model.

A model of a system is a way of describing how the system works. A model specifies what the parts of the system are, and how those parts interact to make the system do what it’s supposed to do.

For example, at a high level, the model of Twitter is that there are other users in the system, you have a list of people that you follow and a list of people that follow you, and each user generates a stream of tweets that are seen by their followers, mixed together into a feed. These are all the parts of the system. At a more detailed level, tweets and people have attributes and data, and there are actions that you can do in the system (viewing tweets, creating tweets, following or unfollowing, etc.). These data items and actions are also parts of the model.

Three Models in UI Design

There are actually several models you have to worry about in UI design:

  • The system model (sometimes called implementation model) is how the system actually works.
  • The interface model (or manifest model) is the model that the system presents to the user through its user interface.
  • The user model (or conceptual model) is how the user thinks the system works.

A cell phone presents the same simple interface model as a conventional wired phone, even though its system model is quite a bit more complex. A cell phone conversation may be handed off from one cell tower to another as the user moves around. This detail of the system model is hidden from the user.

As a software engineer, you should be quite familiar with this notion. A module interface offers a certain model of operation to clients of the module, but its implementation may be significantly different.

In software engineering, this divergence between interface and implementation is valued as a way to manage complexity and plan for change. In user interface design, we value it primarily for other reasons: the interface model should be simpler and more closely reflect the user’s model of the actual task.

Note that we’re using model in a more general and abstract sense here than when we talk about the model-view-controller pattern (which you may have heard of previously, and which we’ll discuss more in a future lecture). In MVC, the model is a software component (like a class or group of classes) that stores application data and implements the application behavior behind an interface. Here, a model is an abstracted description of how a system works. The system model on this slide might describe the way an MVC model class behaves (for example, storing text as a list of lines). The interface model might describe the way an MVC view class presents that system model (e.g., allowing end-of-lines to be “deleted” just as if they were characters). Finally, the user model isn’t software at all; it’s all in the user’s mind.

Example: Back vs. Previous

Here’s an example drawn directly from graphical user interfaces: the Back button in a web browser. What is the model for the behavior of Back? Specifically: how does the user think it behaves (the mental model), and how does it actually behave (the system model)?

The system model is that Back goes back to the last page the user was viewing, in a temporal history sequence.

But on a web site that has pages in some kind of linear sequence of their own–such as the result pages of a search engine (shown here) or multiple pages of a news article–the user’s mental model might easily confuse these two sequences, thinking that Back will go to the previous page in the web site’s sequence. In other words, that Back is the same as Previous! (The fact that “back” and “previous” are close synonyms, and that the arrow icons are almost identical, strongly encourages this belief.)

Most of the time, this erroneous mental model of Back will behave just the same as the true system model. But it will deviate if the user mixes the Previous link with the Back button–after pressing Previous, the Back button will behave like Next!
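A toy simulation makes the deviation concrete. This is a minimal sketch, assuming that a browser’s history is simply a temporal list of visited pages and that Previous and Next are ordinary links; the Browser class and the page names are hypothetical.

```python
class Browser:
    """Toy browser: history is a temporal list of visited pages."""
    def __init__(self, start_page):
        self.history = [start_page]
        self.position = 0

    @property
    def current(self):
        return self.history[self.position]

    def visit(self, page):
        """Following any link (including Previous or Next) adds to history."""
        del self.history[self.position + 1:]   # drop any forward entries
        self.history.append(page)
        self.position += 1

    def back(self):
        """Back returns to the page viewed just before this one."""
        if self.position > 0:
            self.position -= 1
        return self.current

browser = Browser("results page 1")
browser.visit("results page 2")   # user clicks the Next link
browser.visit("results page 1")   # user clicks the Previous link
browser.back()                    # lands on "results page 2", just like Next
```

The history holds page 1, page 2, page 1 in the order visited, so stepping back in time from the second visit to page 1 moves forward in the site’s own sequence.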

A nice article with other examples of tricky mental model/system model mismatch problems is “Mental and conceptual models, and the problem of contingency” by Charles Hannon, interactions, November 2008.

Example: Previous

What does previous do here?

Example: Graphical Editing

Consider image editing software. Programs like Photoshop and Gimp use a pixel editing model, in which an image is represented by an array of pixels (plus a stack of layers). Programs like PowerPoint and Illustrator, on the other hand, use a structured graphics model, in which an image is represented by a collection of graphical objects, like lines, rectangles, circles, and text. In this case, the choice of model strongly constrains the kinds of operations available to a user. You can easily tweak individual pixels in Microsoft Paint, but you can’t easily move an object once you’ve drawn it into the picture.

Example: Text Editing

Similarly, most modern text editors model a text file as a single string, in which line endings are just like other characters. But it doesn’t have to be this way. Some editors represent a text file as a list of lines instead.

When this implementation model is exposed in the user interface, as in old Unix text editors like ed, line endings can’t be deleted in the same way as other characters. ed has a special join command for deleting line endings.
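The two models can be contrasted in a short sketch. The function names are hypothetical, and the second function only imitates the effect of a join operation on a list of lines; it is not ed’s actual implementation.

```python
def delete_char_string_model(text, index):
    """Single-string model: a newline is deleted like any other character."""
    return text[:index] + text[index + 1:]

def join_lines_model(lines, line_number):
    """List-of-lines model: there is no newline character to delete, so
    removing a line ending needs its own operation, which merges the line
    at line_number (zero-based) with the line after it."""
    merged = lines[line_number] + lines[line_number + 1]
    return lines[:line_number] + [merged] + lines[line_number + 2:]
```

In the string model, deleting index 2 of "ab\ncd" yields "abcd" with no special case. In the list model the same edit is a different operation on a different kind of object, and an interface that exposes the list directly forces the user to learn that distinction.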

User’s Model May Be Wrong

The user’s model may be totally wrong without affecting the user’s ability to use the system. A popular misconception about electricity holds that plugging in a power cable is like plugging in a water hose, with electrons traveling from the power company through the cable into the appliance. The actual system model of household AC current is of course completely different: the current changes direction many times a second, and the actual electrons don’t move far, and there’s really a circuit in that cable, not just a one-way tube. But the user model is simple, and the interface model supports it: plug in this tube, and power flows to the appliance.

But a wrong user model can also lead to problems. Consider a household thermostat, which controls the temperature of a room. If the room is too cold, what’s the fastest way to bring it up to the desired temperature?

Some people would say the room will heat faster if the thermostat is turned all the way up to maximum temperature. This response is triggered by an incorrect mental model about how a thermostat works: either the timer model, in which the thermostat controls the duty cycle of the furnace, i.e. what fraction of time the furnace is running and what fraction it is off; or the valve model, in which the thermostat affects the amount of heat coming from the furnace. In fact, a thermostat is just an on-off switch at the set temperature. When the room is colder than the set temperature, the furnace runs full blast until the room warms up. A higher thermostat setting will not make the room warm up any faster. (Norman, Design of Everyday Things, 1988)
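The on-off switch model is easy to check with a small simulation. This is a deliberately simplified sketch: the heating rate of 0.5 degrees per minute is an assumed value, and heat loss to the outside is ignored.

```python
def minutes_to_reach(target, setpoint, start_temp, heat_per_minute=0.5):
    """Minutes until the room reaches target under the on-off switch model.
    The furnace runs at full output whenever the room is below setpoint;
    heat_per_minute is an assumed, constant heating rate."""
    if setpoint < target:
        raise ValueError("setpoint below target: the furnace would shut off early")
    temp, minutes = start_temp, 0
    while temp < target:
        if temp < setpoint:          # furnace on: always full blast
            temp += heat_per_minute
        minutes += 1
    return minutes
```

Calling minutes_to_reach(20, 20, 15) and minutes_to_reach(20, 30, 15) gives the same answer: the setpoint decides when the furnace stops, not how hard it works. Under the valve model the second call would have been faster, which is exactly what the mistaken user expects.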

These incorrect models shouldn’t simply be dismissed as the mistakes of “ignorant users.” (Remember, the user is always right! If there’s a consistent problem in the interface, it’s probably the interface’s fault.) These user models for heating are perfectly correct for other systems: a car heater and a stove burner both use the valve model. And users have no problem understanding the model of a dimmer switch, which performs the analogous function for light that a thermostat does for heat. When a room needs to be brighter, the user model says to set the dimmer switch right at the desired brightness.

The problem here is that the thermostat isn’t effectively communicating its model to the user. In particular, there isn’t enough feedback about what the furnace is doing for the user to form the right model.

Consistency

There’s a general principle of learnability: consistency. This rule is often given the hifalutin’ name the Principle of Least Surprise, which basically means that you shouldn’t surprise the user with the way a command or interface object works.

Similar things should look, and act, in similar ways. Conversely, different things should be visibly different.

Kinds of Consistency

There are three kinds of consistency you need to worry about:

  1. Internal consistency within your application
  2. External consistency with other applications on the same platform
  3. Metaphorical consistency with your interface metaphor or similar real-world objects

The RealCD interface discussed before has problems with both metaphorical consistency (CD jewel cases don’t play; you don’t open them by pressing a button on the spine; and they don’t open as shown), and with external consistency (the player controls aren’t arranged horizontally as they’re usually seen; and the track list doesn’t use the same scrollbar that other applications do).

Metaphors

Metaphors are one way you can bring the real world into your interface. RealCD is an example of an interface that uses a strong metaphor in its interface.

A well-chosen, well-executed metaphor can be quite effective and appealing, but be aware that metaphors can also mislead.

The advantage of metaphor is that you’re borrowing a conceptual model that the user already has experience with. A metaphor can convey a lot of knowledge about the interface model all at once. It’s a notebook. It’s a CD case. It’s a desktop. It’s a trashcan.

Each of these metaphors carries along with it a lot of knowledge about the parts, their purposes, and their interactions, which the user can draw on to make guesses about how the interface will work.

Some interface metaphors are famous and largely successful. The desktop metaphor – documents, folders, and overlapping paperlike windows on a desk-like surface – is widely used and copied. The trashcan, a place for discarding things but also for digging around and bringing them back, is another effective metaphor – so much so that Apple defended its trashcan with a lawsuit, and imitators are forced to use a different look. (Recycle Bin, anyone?)

But a computer interface must deviate from the metaphor at some point – otherwise, why aren’t you just using the physical object instead? At those deviation points, the metaphor may do more harm than good. For example, it’s easy to say “a word processor is like a typewriter,” but you shouldn’t really use it like an old-fashioned manual typewriter. Pressing Enter every time the cursor gets close to the right margin, as a typewriter demands, would wreak havoc with the word processor’s automatic word-wrapping.

The basic rule for metaphors is: use it if you have one, but don’t stretch for one if you don’t.

Appropriate metaphors can be very hard to find, particularly with real-world objects. The designers of RealCD stretched hard to use their CD-case metaphor (since in the real world, CD cases don’t even play CDs), and it didn’t work well.

Metaphors can also be deceptive, leading users to infer behavior that your interface doesn’t provide. Sure, it looks like a book, but can I write in the margin? Can I rip out a page?

Metaphors can also be constraining. Strict adherence to the desktop metaphor wouldn’t scale, because documents would always be full-size like they are in the real world, and folders wouldn’t be able to have arbitrarily deep nesting.

The biggest problem with metaphorical design is that your interface is presumably more capable than the real-world object, so at some point you have to break the metaphor. Nobody would use a word processor if it really behaved like a typewriter. Features like automatic word-wrapping break the typewriter metaphor, by creating a distinction between hard carriage returns and soft returns.

Most of all, using a metaphor doesn’t save an interface that does a bad job communicating itself to the user. Although RealCD’s model was metaphorical – it opened like a CD case, and it had a liner notes booklet inside the cover – these features had such poor visibility and perceived affordances that they were ineffective.

Natural Mapping: Consistency of Layout

Another important principle of learnability is good mapping of functions to controls.

Consider the spatial arrangement of a light switch panel. How does each switch correspond to the light it controls? If the switches are arranged in the same fashion as the lights themselves, it is much easier to learn which switch controls which light.

A direct mapping means that the physical layout of the controls matches the physical arrangement of their functions. In a direct mapping, control A is to the left of control B if and only if function A (e.g. the light that’s turned on or off) is to the left of function B.
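The “if and only if” condition can be written as a small check. This sketch assumes each control and each function has a single horizontal position and that no two share the same position; the function and parameter names are hypothetical.

```python
from itertools import combinations

def is_direct_mapping(control_x, function_x, controls_to_functions):
    """True if the left-to-right order of controls matches the order of the
    functions they operate. control_x and function_x map names to
    horizontal positions; controls_to_functions maps control -> function."""
    for a, b in combinations(controls_to_functions, 2):
        controls_ordered = control_x[a] < control_x[b]
        functions_ordered = (function_x[controls_to_functions[a]]
                             < function_x[controls_to_functions[b]])
        if controls_ordered != functions_ordered:
            return False   # this pair is crossed, so the mapping is not direct
    return True
```

Three switches wired to three lights in the same left-to-right order pass the check; swapping the wiring of any two switches fails it.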

Direct mappings are not always easy to achieve, since a control may be oriented differently from the function it controls. Light switches are mounted vertically, on a wall; the lights themselves are mounted horizontally, on a ceiling. So the switch arrangement may not correspond directly to a light arrangement.

So sometimes the best that can be done is a natural mapping: the physical layouts are not identical, but a simple mental model can transform the controls to the functions and vice versa. The turn signal switch in most cars is a stalk that moves up and down (vertically), but the function it controls is a signal for turning left or right, horizontally. But the mapping is natural, because the turn signal stalk sits on the steering column, and the user can mentally map it to moving the stalk in the direction that the steering wheel will be turned, rather than the direction the car will move.

Other examples of mapping, both good and bad, include:

  • Stove burners. Many stoves have four burners arranged in a square, and four control knobs arranged in a row. Which knobs control which burners? Most stoves don’t make any attempt to provide even a natural mapping.
  • An audio mixer for DJs (proposed by Max Van Kleek for the Hall of Fame) has two sets of identical controls, one for each turntable being mixed. The mixer is designed to sit in between the turntables, so that the left controls affect the turntable to the left of the mixer, and the right controls affect the turntable to the right. The mapping here is direct.
  • The controls on the RealCD interface don’t have a natural mapping. Why not?
  • The Segway’s controls have a direct mapping. Why?
  • Here’s a meta question. What’s wrong with the mapping of this bulleted list with respect to the slide above?

Internal Consistency in Wording

Another important kind of consistency, often overlooked, is in wording. Use the same terms throughout your user interface. If your interface says “share price” in one place, “stock price” in another, and “stock quote” in a third, users will wonder whether these are three different things you’re talking about.

Don’t get creative when you’re writing text for a user interface; keep it simple and uniform, just like all technical writing.

Here are some examples from the Course VI Underground Guide web site – confusion about what’s a “review” and what’s an “evaluation”.

External Consistency in Wording: Speak the User’s Language

  • Use common words, not techie jargon

    • But use domain-specific terms where appropriate

  • Allow aliases/synonyms in command languages

External consistency in wording is important too – in other words, speak the user’s language as much as possible, rather than forcing them to learn a new one. If the user speaks English, then the interface should also speak English, not Geekish. Technical jargon should be avoided. Use of jargon reflects aspects of the system model creeping up into the interface model, unnecessarily. How might a user interpret the dialog box shown here? One poor user read “type” as a verb, and dutifully typed M-I-S-M-A-T-C-H every time this dialog appeared. The user’s reaction makes perfect sense when you remember that most computer users do just that, type, all day. But most programmers wouldn’t even think of reading the message that way. Yet another example showing that you are not the user.

Technical jargon should only be used when it is specific to the application domain and the expected users are domain experts. An interface designed for doctors shouldn’t dumb down medical terms.

When designing an interface that requires the user to type in commands or search keywords, support as many aliases or synonyms as you can. Different users rarely agree on the same name for an object or command. One study found that the probability that two users would mention the same name was only 7-18% (Furnas et al., “The vocabulary problem in human-system communication,” CACM v30 n11, Nov. 1987).

Incidentally, there seems to be a contradiction between these guidelines. Speaking the User’s Language argues for synonyms and aliases, so a command language should include not only delete but erase and remove too.

But consistency in wording argues for only one command name, lest the user wonder whether these are three different commands that do different things. One way around the impasse is to look at the context in which you’re applying the heuristic. When the user is talking, the interface should make a maximum effort to understand the user, allowing synonyms and aliases. When the interface is speaking, it should be consistent, always using the same name to describe the same command or object. This is quite similar to the 6.033 system design principle of being loose on inputs and strict on outputs.
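Here is a minimal sketch of that resolution for a tiny command language. The alias table uses the three names mentioned above; the function names and the choice of “delete” as the canonical name are assumptions for illustration.

```python
# Every accepted spelling maps to one canonical command name.
CANONICAL = {"delete": "delete", "erase": "delete", "remove": "delete"}

def parse_command(user_input):
    """Loose on input: accept any known alias, ignoring case and extra spaces.
    Returns (canonical_name, arguments), or None if the command is unknown."""
    words = user_input.split()
    if not words:
        return None
    name = CANONICAL.get(words[0].lower())
    if name is None:
        return None
    return name, words[1:]

def describe(command, args):
    """Strict on output: the interface always uses the canonical name."""
    return f"{command} {' '.join(args)}".strip()
```

Whether the user types erase, Remove, or delete, every message, log entry, and help page produced through describe says delete.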

What if the interface is smart enough to adapt to the user – should it then favor matching its output to the user’s vocabulary (and possibly the user’s inconsistency) rather than enforcing its own consistency? Perhaps, but adaptive interfaces are still an active area of research, and not much is known.

Feedback

Actions Should Have Immediately Visible Effects

Hand-in-hand with affordances is feedback: how the system changes visibly when you perform an action.

When the user invokes a part of the interface, it should appear to respond. Push buttons should depress and release. Scrollbar thumbs and dragged objects should move with the mouse cursor. Pressing a key should make a character appear in a textbox.

Low-level feedback is provided by a view object itself, like push-button feedback. This kind of feedback shows that the interface at least took notice of the user’s input, and is responding to it. (It also distinguishes enabled widgets from disabled ones, which don’t respond at all.)

High-level feedback is the actual result of the user’s action, like changing the state of the model.

Perceptual Fusion

One interesting effect of the human perceptual system is perceptual fusion. Here’s an intuition for how fusion works. Our “perceptual processor” runs at a certain frame rate, grabbing one frame (or picture) every cycle, where each cycle takes T_p seconds. Two events occurring less than the cycle time apart are likely to appear in the same frame. If the events are similar – e.g., Mickey Mouse appearing in one position, and then a short time later in another position – then the events tend to fuse into a single perceived event – a single Mickey Mouse, in motion.

The cycle time of the perceptual processor can be derived from a variety of psychological experiments over decades of research (summarized in Card, Moran, Newell, The Psychology of Human-Computer Interaction, Lawrence Erlbaum Associates, 1983). 100 milliseconds is a typical value which is useful for a rule of thumb.

But it can range from 50 ms to 200 ms, depending on the individual (some people are faster than others) and on the stimulus (for example, brighter stimuli are easier to perceive, so the processor runs faster).

Perceptual fusion is responsible for the way we perceive a sequence of movie frames as a moving picture, so the parameters of the perceptual processor give us a lower bound on the frame rate for believable animation. 10 frames per second is good enough for a typical case, but 20 frames per second is better for most users and most conditions.
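The arithmetic behind those numbers is just the reciprocal of the cycle time. The helper below is a hypothetical illustration, not a perceptual model.

```python
def min_frame_rate(cycle_time_ms):
    """Frames per second needed so that consecutive frames arrive no further
    apart than one perceptual cycle of cycle_time_ms milliseconds."""
    return 1000 / cycle_time_ms
```

With the typical T_p of 100 ms this gives 10 frames per second; with the fast end of the range, 50 ms, it gives 20 frames per second, which is why the higher rate holds up for more users and conditions.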

Perceptual fusion also gives an upper bound on good computer response time. If a computer responds to a user’s action within T_p time, its response feels instantaneous with the action itself. Systems with that kind of response time tend to feel like extensions of the user’s body. If you used a text editor that took longer than T_p response time to display each keystroke, you would notice.

Fusion also strongly affects our perception of causality. If one event is closely followed by another – e.g., pressing a key and seeing a change in the screen – and the interval separating the events is less than T_p, then we are more inclined to believe that the first event caused the second.

Response Time

Perceptual fusion provides us with some rules of thumb for responsive feedback.

If the system can perform a command in less than 100 milliseconds, then it will seem instantaneous, or near enough. As long as the result of the command itself is clearly visible – e.g., in the user’s locus of attention – then no additional feedback is required.

If it takes longer than the perceptual fusion interval, then the user will notice the delay – it won’t seem instantaneous anymore. Something should change, visibly, within 100 ms, or perceptual fusion will be disrupted. Normally, however, ordinary low-level feedback is enough to satisfy this requirement, such as a push-button popping back, or a menu disappearing.

One second is a typical turn-taking delay in human conversation – the maximum comfortable pause before you feel the need to fill the gap with something, even if it’s just “uh” or “um”. If the system’s response will take longer than a second, then it should display additional feedback. For short delays, the hourglass cursor (or spinning cursor, or throbber icon shown here) is a common design pattern. For longer delays, show a progress bar, and give the user the ability to cancel the command.
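These rules of thumb can be collected into a small decision function. The 100 ms and one-second thresholds come from the discussion above; the five-second cutoff between a “short” and a “longer” delay is an assumption chosen for illustration, as are the names and the returned labels.

```python
FUSION_MS = 100        # perceptual fusion interval
TURN_TAKING_MS = 1000  # typical conversational turn-taking delay
LONG_DELAY_MS = 5000   # assumed cutoff between "short" and "longer" delays

def choose_feedback(expected_ms):
    """Pick the kind of feedback suited to an expected response time."""
    if expected_ms < FUSION_MS:
        return "result only"                 # feels instantaneous
    if expected_ms <= TURN_TAKING_MS:
        return "low-level feedback"          # e.g. button pops back
    if expected_ms <= LONG_DELAY_MS:
        return "busy cursor"                 # hourglass or throbber
    return "progress bar with cancel"
```

In practice the expected time is itself an estimate, so a system might start with a busy cursor and switch to a progress bar once the delay turns out to be long.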

Note that progress bars don’t necessarily have to be precise. (This one is actually preposterous – who cares about 3 significant figures of progress?) An effective progress bar has to show that progress is being made, and allow the user to estimate completion time at least within an order of magnitude – a minute? 10 minutes? an hour? a day?

A UIST 2006 paper by Harrison et al. called Rethinking the Progress Bar showed that you could lie in the progress bar (for example, showing 3/4 complete when the task is actually only 1/2 complete) in ways that caused users to feel like they didn’t wait as long.

Feedback Visibility Depends on Locus of Attention

The metaphor used by cognitive psychologists for how attention behaves in perception is the spotlight: you can focus your attention on only one input channel in your environment at a time. This input channel might be a location in your visual field, or it might be a location or voice in your auditory field. You can shift your attention to another channel, but at the cost of giving up your previous focus.

So when you’re thinking about how to make something important visible, you should think about where the user’s attention is likely to be focused – their document? The text cursor? The animated banner ads on the web site?

Raskin (The Humane Interface, 2000) has a good discussion of attention as it relates to mode visibility. Raskin argues that we should think of it as the locus of attention, rather than focus, to emphasize that it’s merely the place where the user’s attention happens to be, and doesn’t necessarily reflect any conscious focusing process on the user’s part.

The status bar probably isn’t often in the locus of attention. There’s an amusing story (possibly urban legend) about a user study mainly involving ordinary spreadsheet editing tasks, in which every five minutes the status bar would display “There’s a $50 bill taped under your chair. Take it!” In a full day of testing with more than a dozen users, nobody took the money. (Alan Cooper, The Inmates Are Running the Asylum.)

But there’s also evidence that many users pay no attention to the status bar when they’re considering whether to click on a hyperlink; in other words, the URL displayed in the status bar plays little or no role in the link’s information scent (which we’ll discuss next). Phishing web sites (fake web sites that look like major sites like eBay or PayPal or CitiBank and try to steal passwords and account information) exploit this to hide their stinky links.

The Mac OS menubar has a similar problem: it’s at the periphery of the screen, so it’s likely to be far from the user’s locus of attention. The menubar is the primary indicator of which application is currently active (in Mac OS, an application can be active even if it has no windows on the screen). Users who rely heavily on keyboard shortcuts may therefore make mode errors, in this case sending a keyboard command to the wrong application, because they aren’t attending to the state of the menubar.

What about the shape of the mouse cursor? Surely that’s reliably in the user’s locus of attention? It may be likely to be in the user’s center of vision (or fovea), but that doesn’t necessarily mean they’re paying attention to the cursor, as opposed to the objects they’re working with. Raskin describes a mode error he makes repeatedly with his favorite drawing program, despite the fact that the mode is clearly indicated by a different mouse cursor.

Visible Navigation State

So far we’ve been looking at how to make the set of available actions visible. Let’s turn now to visualizing the state of the system.

Navigation is one important kind of state to visualize – i.e., where am I now? On the Web, in particular, users are in danger of getting lost as they move around in deep, information-rich sites. Breadcrumb trails show where you are as a path through the site’s hierarchy (e.g. Travel, Guides, North America), in a very compact form. Showing the hierarchy in a tree widget with the current node highlighted is another way to do it, but costs more screen space and complexity.
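
A breadcrumb trail can be computed by walking from the current page up to the top of the hierarchy. The sketch below assumes a hypothetical `PARENT` table that maps each page to its parent; the names `breadcrumb_trail` and `render`, and the " > " separator, are illustrative choices rather than any particular site’s implementation.

```python
from typing import Dict, List, Optional

# Hypothetical site hierarchy: each page maps to its parent (None = top level).
PARENT: Dict[str, Optional[str]] = {
    "Travel": None,
    "Guides": "Travel",
    "North America": "Guides",
}


def breadcrumb_trail(page: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Return the path from the top of the hierarchy down to `page`."""
    trail = []
    current: Optional[str] = page
    while current is not None:
        trail.append(current)
        current = parent[current]  # step up one level
    trail.reverse()  # collected bottom-up, displayed top-down
    return trail


def render(trail: List[str]) -> str:
    """Join the trail into the compact one-line form shown to the user."""
    return " > ".join(trail)


if __name__ == "__main__":
    print(render(breadcrumb_trail("North America", PARENT)))
    # prints: Travel > Guides > North America
```

In a real interface each element except the last would typically be a link, so the trail also shows where else the user can go.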

Pagination and highlighted tabs are similar patterns that show the user where they are, along with some context of where else they could go.

Visible Model State

It hardly seems necessary to say that the system model should be visualized in an interface. That’s one of the essential properties of a direct-manipulation interface: a continuous visual representation of the state of the application.

The hard design issues in model visibility tend to lie in what to make visible (i.e. which aspects of the model), and how to display it (i.e., in what representation). We’ll discuss the how, the visual representation, in much greater detail in a future lecture on graphic design.

The what may involve a tension between visibility and simplicity; visibility argues for showing more, but simplicity argues for showing less. Understanding the users and their tasks (a technique called task analysis, which we’ll discuss in a future class) helps resolve the tension. For example, Microsoft Word displays a word count continuously in the status bar, since counting words is an important subtask for many users of Word (such as students, journalists, and book authors). Making it always visible saves the user from having to invoke a word-count command.

Visible View State

  • Selection highlight
  • Selection handles
  • Drag & drop mouse cursor
    • dragging
    • can’t drop
  • Keyboard focus

Still other state is stored in the view (or controller), not in the backend model. This “view state” is the current state of the user’s interaction with the interface.

Selections are particularly important. When the user selects an object to operate on, highlight the object somehow. Don’t just leave the selection invisible and implicit. Selection highlighting provides important feedback that the selection operation was successful; it also shows the current state of the selection if the user has forgotten what was previously selected.

A common technique for showing a selection highlight in text is reverse video (white text on dark colored background). For shapes and objects, the selection highlight may be done by selection handles, or by a dotted or animated border around the object (“crawling ants”). Selection handles are appealing because they do double-duty – both indicating the selection, and providing visible affordances for resizing the object.

When the user selects objects or text and then operates on the selection with a command, keep it selected, especially if it changes appearance drastically or moves somewhere else. If the selected thing is offscreen when the user finally invokes a command on it, scroll it back into view. That allows the user to follow what happened to the object, so they can easily evaluate its final state. Similarly, if the user makes a selection and then invokes an unrelated command (like scrolling or sorting or filtering, none of which actually use the selection), preserve the selection, even if it means you have to remember it and regenerate it. User selections, like user data, are precious, and contribute to the visibility of what the system is doing.
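
One way to preserve a selection across sorting or filtering is to remember which items are selected rather than which rows are. The sketch below is a minimal illustration under that assumption; the `ListView` class and its method names are hypothetical, not taken from any real toolkit.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class ListView:
    """Hypothetical list view that tracks selection by item id, not row index."""
    items: List[Tuple[int, str]]                 # (item_id, label) pairs
    selected_ids: Set[int] = field(default_factory=set)

    def select_row(self, row: int) -> None:
        # Store the identity of the item, so the selection follows the item.
        self.selected_ids.add(self.items[row][0])

    def sort_by_label(self) -> None:
        # Reordering rows does not touch selected_ids, so the selection survives.
        self.items.sort(key=lambda item: item[1])

    def selected_rows(self) -> List[int]:
        """Regenerate the highlighted row indices from the stored ids."""
        return [row for row, (item_id, _) in enumerate(self.items)
                if item_id in self.selected_ids]


if __name__ == "__main__":
    view = ListView([(1, "pear"), (2, "apple"), (3, "fig")])
    view.select_row(0)           # select "pear"
    view.sort_by_label()         # "pear" moves to the last row
    print(view.selected_rows())  # prints: [2]
```

A view that stored only row indices would highlight "apple" after the sort, silently changing what the user had selected.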

Another form of view state is the state of an input controller, like a drag & drop operation. Drag & drop is often indicated by a cursor change.
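
The cursor change can be thought of as a direct function of the controller’s state. The sketch below assumes three states (idle, dragging over a target that accepts the drop, and dragging over one that doesn’t); the cursor names are placeholders, since each toolkit defines its own.

```python
from enum import Enum


class DragState(Enum):
    IDLE = "idle"
    DRAGGING = "dragging"          # over a target that accepts the drop
    CANNOT_DROP = "cannot_drop"    # over a target that rejects the drop


# Hypothetical cursor names; a real toolkit defines its own.
CURSOR_FOR_STATE = {
    DragState.IDLE: "arrow",
    DragState.DRAGGING: "drag",
    DragState.CANNOT_DROP: "no-drop",
}


def drag_state(is_dragging: bool, target_accepts: bool) -> DragState:
    """Derive the controller state from what the user is currently doing."""
    if not is_dragging:
        return DragState.IDLE
    return DragState.DRAGGING if target_accepts else DragState.CANNOT_DROP


def cursor_for(is_dragging: bool, target_accepts: bool) -> str:
    """Pick the cursor that makes the current drag state visible."""
    return CURSOR_FOR_STATE[drag_state(is_dragging, target_accepts)]


if __name__ == "__main__":
    print(cursor_for(is_dragging=True, target_accepts=False))  # prints: no-drop
```

Deriving the cursor from the state in one place keeps the feedback from drifting out of sync with what the controller will actually do on mouse release.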

And one more important form of view state is keyboard focus – showing which UI element on the screen will receive any keystrokes that the user types. For textboxes, the keyboard focus is conventionally shown by a blinking cursor. For other kinds of UI elements, like buttons and hyperlinks and windows, a colored outline or title bar shows the keyboard focus.

Useless Feedback vs. Useful Feedback

Feedback is important, but don’t overdo it. This dialog box demands a click from the user. Why? Does the interface need a pat on the back for finishing the conversion? It would be better to just skip on and show the resulting documentation.

Summary: Learnability