
that will exercise the features of the IVR, and provide as little “instruction” as possible. It is fair and often informative to include a task that the IVR will not support.
It allows you to see if a person can determine that the IVR will not do what she
was trying to accomplish or if better feedback is required. Tasks that contain
incorrect or partial information can be used to test the error recovery elements
of the IVR. It is a good idea to test the error recovery paths in addition to the
“sunny-day path” (in which the user does everything just as the IVR expects).
Our preferred test regimen is one or more iterative usability tests on prototypes of the IVR to improve usability, utility, and accessibility, followed by a summative usability test on the last prototype or preferably the production system to
characterize the expected performance of the IVR. Iterative testing of a prototype
can quickly improve a design. It is especially effective if a developer who can
modify the prototype is on hand during testing. Using this method, weakness in
the design can be corrected overnight between participants or, in some cases, as
soon as between tasks. With IVRs, it is also helpful if the voice talent who recorded
the prototype is available during the study. In many cases you may fill the roles
of designer, tester, voice talent, and prototyper, giving you full control and
responsibility to use the iterative process to get the IVR in final form.
7.5.2 Signal Detection Analysis Method
One technique that can be used to great effect in IVR testing borrows from signal
detection theory, which is explained quite well in Wickens’s Engineering Psychology and Human Performance (1984). Observation determines whether the
user actually completes a task. It can be quite instructive, however, to ask
the user if he believes he has completed the task successfully. One hopes that
each task will result in both successful task completion and perceived success, a
“hit” in signal detection terms. The second best outcome is a “correct rejection,”
where the user fails the task and correctly believes that she has failed. In these
two cases, the user has an accurate picture of her situation and can make an intel-
ligent decision as to what to do next. The other two conditions, actual failure with
perceived success (a false alarm) and actual success perceived as a failure
(a miss), cause significant problems if they occur in deployed systems.
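The four outcomes above can be sketched as a small classifier. This is a hypothetical helper, not part of the method as published; the function name and the tallying step are illustrative assumptions. It simply maps each (actual, perceived) pair from a test session to its signal detection term:

```python
from collections import Counter


def classify_outcome(actual_success: bool, perceived_success: bool) -> str:
    """Map an (actual, perceived) task result to its signal detection term."""
    if actual_success and perceived_success:
        return "hit"                 # success, and the user knows it
    if actual_success and not perceived_success:
        return "miss"                # success the user fails to recognize
    if not actual_success and perceived_success:
        return "false alarm"         # failure the user believes succeeded
    return "correct rejection"       # failure, and the user knows it


# Tally outcomes across a set of observed test tasks (sample data).
observations = [
    (True, True), (True, True), (False, True), (True, False), (False, False),
]
tally = Counter(classify_outcome(a, p) for a, p in observations)
```

A tally like this makes the problem cases visible at a glance: any nonzero count of false alarms or misses flags tasks whose feedback needs rework before deployment.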
Given the task of making a car reservation at an airport, imagine what hap-
pens to the user. In a system that generates false alarm states, the user believes
that he has rented a car when in fact he has not. Most often, some part of the
interaction has given the user the impression that he has finished before having
completed all necessary steps. Perhaps a case of too much feedback, too soon—
or a poorly organized process that does not linearly drive the user to a successful
conclusion. In any event, the user confidently hangs up, boards a flight, and lands
thousands of miles from home without a car reserved. Systems that generate
misses cause users to repeat an already completed process more times than
they intend. The user is not getting the feedback that he is done, or