EIG, Inc

3767 Crow Canyon Road
San Ramon, CA  94582

1 888 344 4487 (Phone)
1 925 215 8465 (Fax)

The "Be A Good Machine" Blog
Keep up with the best practices and some thoughtful essays from our industry-leading designers.

Harrison Ford Views Images
Written by Bruce Balentine   
Command and control has been popularly accepted as applicable to speech technology for some time. Here is my favorite sequence from Blade Runner, that provocative film noir of the early eighties. You’ll recall that Harrison Ford plays the part of Rick Deckard, who finds and terminates fugitive replicants—androids, created by the Tyrell Corporation, that are "more human than human."

Deckard: “Enhance two-five-four-one-seventy-six.”

Viewer: [A grid appears over the image, and a square selection region clicks across the grid to outline a target at lower left. Attractive electronic-sounding earcons reinforce the motion.]

brr-brr-brr … beep-beep-beep-beep … blip-blip …. Whirr …

Deckard: “Enhance.”

Viewer: click-click-click-click …

Deckard: “Stop.”

Viewer: [halts display motion and sound]

Deckard: “Move in.”

Viewer: [Closing in on a hand in the photograph] clack-clack-clack …clickety-clickety … click-clack …

Deckard: “Stop.”

Viewer: [freezes in place]

Deckard: “Pull out and track right … stop.”

The dialogue continues in this way for some time. The interaction is known as a command and control dialogue. The device that Deckard is commanding is some kind of image viewer—he inserts a color photograph into a scanner slit, and the image of it appears on a color display. Deckard sits on his couch and drinks whiskey while commanding the viewer by voice.

Deckard: “Center and pull back. … check forty-five right…. center and stop … enhance thirty-four to forty-six”

Viewer: [continuous electromechanical sounds] Whirr ... click-clack-click …

Deckard: “Pan right and pull back … enhance thirty-four to forty-six … pull back”

Viewer: Clickety-clack-clack … whirr … click-clack-click-clack …

Deckard: “Wait a minute …”

Viewer: [halts immediately]

Deckard: “Go right. … Stop … enhance fifty-seven nineteen … track forty-five left”

Viewer: [Arrives at a fuzzy extreme close-up of a person’s chin.]

Deckard: “Gimme a hard copy right there.”

The example shows a fascinating switch of user behavior from highly structured to highly casual as Deckard focuses in on the data rather than the operating requirements of the user interface. Naturally, since this is a movie, the user interface adjusts spontaneously to this alteration in user attitude and behavior.

What I mean is, Deckard gives the viewer very specific coordinates (things like “enhance fifty-seven nineteen”) in a way that shows he knows the image viewer’s user interface intimately. So when he switches to a more casual form of speech (“Wait a minute” or, later, “Gimme a hard copy right there”), the scene exposes a certain incongruity. Maybe it’s the whiskey talking, but Deckard has certainly moved from a command and control interface to a conversational one. Quite a good ASR, n’est-ce pas?

But then again, it is the year 2019.
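
To make the incongruity concrete, here is a minimal sketch of what a strict command and control grammar for the viewer might look like. Everything in it is an assumption: the verb list, the argument rules, and the name `parse_command` are reconstructed only from the lines of dialogue quoted above, and it handles one command per utterance. The structured commands parse; the casual ones fall outside the grammar.

```python
import re

# Hypothetical grammar, inferred only from the dialogue quoted in this post.
NUMBER_WORDS = set(
    "one two three four five six seven eight nine "
    "ten eleven twelve thirteen fourteen fifteen sixteen seventeen eighteen nineteen "
    "twenty thirty forty fifty sixty seventy eighty ninety".split()
)
DIRECTIONS = {"left", "right"}
VERBS = ["move in", "pull out", "pull back", "enhance", "stop",
         "center", "track", "pan", "check", "go"]


def parse_command(utterance):
    """Return (verb, args) if the utterance fits the grammar, else None."""
    # Keep only the words; punctuation and hyphens are treated as separators.
    words = re.findall(r"[a-z]+", utterance.lower())
    text = " ".join(words)
    for verb in VERBS:
        if text == verb or text.startswith(verb + " "):
            args = text[len(verb):].split()
            # Arguments may only be number words, directions, or "to".
            if all(w in NUMBER_WORDS or w in DIRECTIONS or w == "to" for w in args):
                return verb, args
    return None


for line in ["Enhance fifty-seven nineteen", "Stop.",
             "Wait a minute", "Gimme a hard copy right there."]:
    print(line, "->", parse_command(line))
```

A recognizer built this way would have rejected both “Wait a minute” and “Gimme a hard copy right there,” which is exactly why the viewer's easy acceptance of them stands out.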

 
Activity Corner: Listening
Written by Bruce Balentine   

Here’s a little activity break. I’ve adapted this exercise, designed to help you clean your ears, from R. Murray Schafer.

Materials Required

  • Writing pad
  • Pen or pencil
  • A nice quiet place to sit and listen
  • Various other listening spots, loud or soft

Step 1: Listen

Sit for a moment and just listen. The exercise is best if you start in a quiet place at a quiet time, perhaps on a Sunday morning.

Step 2: Make a List of Sounds

Write down every single sound that you hear. Your list will be short at first, but will grow as you learn to listen more closely and attentively.

  • Start with sounds close to you.
  • Use the prompting questions later in this essay to trigger your attention.
  • As you run out of close-up sounds, listen for more distant samples.
  • List both human and nonhuman sounds.

Step 3: Repeat at a Different Place and Time

Notice that it’s tricky at first: you can only hear a few sounds. And the loud ones tend to obscure the soft ones. But give it some time, and you’ll start to hear more and more. If you have chosen a loud soundscape (for example, a public place with traffic noise), it will probably take you longer to learn to hear the soft sounds clearly. But you’ll hear them, eventually.

This exercise is an easy one to just skip, as you can already imagine the point of it. But I strongly encourage you to actually do the activity. Spend a good solid half-hour if you can—or at least 15 minutes if that’s all you can afford—just listening as keenly as possible to your environment. You will be amazed at the effect it has on you emotionally, and even more amazed at the respect you will acquire for human hearing.

Here are just a few of the things that you might think about while listening. How many sounds can you hear that are less than 10 feet from you? Is there a radio or TV nearby? If so, get that out of the way first. Listen to the TV. How many people are in the scene? Are there any sound effects? Do you recognize the actors? The music? What is the program that’s on? If it’s under your control, switch off the TV or radio. If it’s not under your control, then tune it out mentally (yes, you do know how to do that, don’t you? You do it all the time).

Now, with the media held temporarily at bay, listen again close-up. Do you have a dog or a cat? Is there a fan? Any other white noise? Can you hear your own breathing?

How many man-made appliances do you hear right now? How many separate sounds does each make? Now farther away, how many man-made sounds? Can you hear traffic? If so, how many vehicles? Of what type? How fast are they going? Is somebody off in the distance blowing leaves or cutting tree branches? Is that the regular garbage truck or the recycling truck? How can you tell?

Now think about nature. Birds? Squirrels? A rustling leaf? Ah, the wind! Now listen to that wind. Is it just one thing, “the wind”? Or can you hear inside of it? Yes, you can! There’s that whoosh as it picks up leaves, but there’s also a higher-pitched flutter: a small branch is shaking as the wind flies by. Dust against the wall. A tiny “tick” as a speck of wood strikes the window. A “ploop” as a piece of debris is loosed from a tree and dropped to the ground. What was it? A small pebble? No, didn’t sound hard enough. It must have been a tiny piece of dried mud, perhaps a dirt dauber’s nest.

Do you hear kitchen sounds?

Listen outside for people noises. Kids playing? How many? What’s the age range? Where are they exactly? Are people talking in a normal voice anywhere near? Are there loud people farther away—perhaps an argument just a few houses down? How many total people can you hear in this stretch of listening time? What about music? How many different sources of music do you hear right now?

Now EXCLUDE from your attention everything that you have written thus far. Search as hard as you can for any sound that you have not yet noticed. Listen! Did you catch the air conditioning system “click” as it switched off or on? Did you notice the sound of your swallowing—you know that you’ve swallowed at least once in this amount of time. Is there a gurgle in your bowel, or a whistle in your lung? When you open your mouth, is there a tiny liquid sound as you pull your tongue downward from the roof of your mouth? You should be able to write down at least 10 sounds that come directly from you—either from inside your body, or from objects that your body is touching. Maybe more if you’ll just listen.

This activity can be made into a fun contest. The person with the longest list of sounds wins the prize. It’s truly amazing how many sounds there are, and how few of them we tend to pay direct attention to. And yet all of them are perceived—that is, are part of our perception—because if they weren’t, then attending wouldn’t help. Try it. You’ll thank me. And tell your friends.

BTW, did you notice the sound of your own writing?

 
The Mixed Initiative Solution
Written by Bruce Balentine   

To give expert users a shortcut through the two-question solution, you can make the yes-no question mixed-initiative, which is easy to build.

  • Do you know the check number?

Some users may answer, "yes, it's check number 406." This is called a mixed-initiative reply because the user has initiated a new answer to a question that you haven't yet asked. These users are now able to answer both the yes-no question and the follow-on check number question in a single turn.

The core problem with mixed-initiative (MI) solutions is that callers must experiment to discover that they can answer in a more flexible way, so most callers will never take advantage of the shortcut. On the other hand, this solution is forgiving when users do experiment.
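
Here is a minimal sketch of how a mixed-initiative reply to "Do you know the check number?" might be interpreted. It assumes the recognizer hands back plain text with the number already written as digits; the names `parse_reply` and `next_step` and the step labels are hypothetical, chosen only for illustration.

```python
import re


def parse_reply(utterance):
    """Interpret a reply to "Do you know the check number?".

    Returns (knows_number, check_number); either part may be None.
    Assumes the recognizer returns text with digits, e.g. "406".
    """
    text = utterance.lower()
    match = re.search(r"\d+", text)
    check_number = int(match.group()) if match else None

    # Volunteering a number counts as an implicit "yes".
    if check_number is not None or re.search(r"\b(yes|yeah|yep)\b", text):
        knows = True
    elif re.search(r"\b(no|nope)\b", text):
        knows = False
    else:
        knows = None
    return knows, check_number


def next_step(utterance):
    """Pick the follow-on dialogue step (hypothetical step labels)."""
    knows, number = parse_reply(utterance)
    if number is not None:
        return f"LOOKUP {number}"      # both questions answered in one turn
    if knows is True:
        return "ASK_NUMBER"            # plain "yes": ask the second question
    if knows is False:
        return "LIST_RECENT_CHECKS"    # plain "no": offer the alternative
    return "REPROMPT"


print(next_step("yes, it's check number 406"))
print(next_step("yes"))
print(next_step("no"))
```

The plain "yes" and "no" paths behave exactly as in the two-question solution, so callers who never discover the shortcut lose nothing.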

 
The Two-Question Solution
Written by Bruce Balentine   

Another solution to the problem of yanking back the turn is to split one prompt into two. Instead of:

  • What is the check number? ... or say, "list my recent checks."

Ask the user explicitly, and use "yes-no" to bifurcate the population:

  • Do you know the check number?

If the user answers "yes," drop to the prompt that solicits the number: "Then say the number now," or just "What's the number?" If the user answers "no," then follow on with the alternative dialogue: "Here's a list of your recent checks ..." The population has been divided into those who know and those who don't, and each group gets a different follow-on dialogue.

The problem with this solution is that ALL users now must pass through two questions instead of one.
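
The branching can be sketched as a small dialogue function. This is an illustration under assumptions, not a real telephony API: `ask` is a hypothetical callable that plays a prompt and returns the caller's recognized reply as text, and `scripted_caller` is a stand-in used only to exercise the flow.

```python
def two_question_dialogue(ask):
    """Split one overloaded prompt into a yes-no question plus a follow-on.

    `ask` is a hypothetical callable: ask(prompt) -> recognized reply text.
    """
    answer = ask("Do you know the check number?").strip().lower()
    if answer == "yes":
        # Those who know the number get the soliciting prompt.
        number = ask("What's the number?").strip()
        return {"path": "number", "check_number": number}
    if answer == "no":
        # Those who don't know get the alternative dialogue.
        return {"path": "list", "prompt": "Here's a list of your recent checks ..."}
    return {"path": "reprompt"}


def scripted_caller(replies):
    """Test double: answers each prompt with the next scripted reply."""
    remaining = iter(replies)
    return lambda prompt: next(remaining)


print(two_question_dialogue(scripted_caller(["yes", "406"])))
print(two_question_dialogue(scripted_caller(["no"])))
```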

 
The Reverse Solution
Written by Bruce Balentine   

One simple solution to the problem of yanking back the turn is to invert the question. Instead of:

  • Say your account number, or say I don't know it.

Reverse the two parts of the question:

  • If you know your account number, say it now.

With this solution, the conditional part of the question comes first, and therefore has no turn-taking implications. The user has no answer to "If you know your account number ..." and so does not perceive that it's her turn to talk. The imperative that follows, "say it now," becomes the sole turn-taking juncture.

The core problem with this solution is that it leaves users hanging if they don't know their account number. The 80-20 rule applies here. As long as most callers know the number, then this solution is acceptable. Most callers who don't know the answer infer correctly that their job is simply to wait silently.

There will be few out-of-grammar (OOG) utterances (although you still have vulnerability to background noise). In most cases, those who don't know the number are served more slowly, but the majority are served quickly.
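
A minimal sketch of the reversed prompt, assuming a hypothetical `listen(prompt, timeout)` callable that returns the recognized text, or `None` if the caller stays silent until the timeout. The 4-second timeout and the result labels are illustrative assumptions; what happens on the fallback path is left open here.

```python
PROMPT = "If you know your account number, say it now."


def reverse_prompt(listen, timeout_seconds=4.0):
    """Play the reversed prompt once and classify the outcome.

    `listen` is a hypothetical callable: listen(prompt, timeout) -> text or None.
    """
    reply = listen(PROMPT, timeout_seconds)
    if reply is None:
        # Silence: the caller does not know the number and simply waited.
        return ("fallback", None)
    digits = "".join(ch for ch in reply if ch.isdigit())
    if digits:
        return ("account", digits)
    # Anything else is out of grammar (possibly background noise).
    return ("out_of_grammar", reply)


print(reverse_prompt(lambda prompt, timeout: "1234 5678"))
print(reverse_prompt(lambda prompt, timeout: None))
```

The silent path costs the caller the full timeout, which is the "served more slowly" trade-off described above.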

 
(More) Yanking Back the Turn
Written by Bruce Balentine   

Last time I talked about the challenge of helping users who may not know how to reply to a prompt. The same solution is often applied to a related challenge -- offering additional options after a short pause:

Please say the name of the Acme product for which you want technical support ... (3 second pause) ... You can also say, "repair status," or "find a store."

It has the same basic flaw, doesn't it? The design yanks back the turn. Indeed, you can think of this specific design solution as a common knee-jerk response to a broad, general prompting puzzle. What do I do when I have a simple prompt -- one that is appropriate to a large subset of my users -- along with a collection of special cases? The solution of generating two prompts separated by a pause is almost always weak. You can find many examples:

  • What is the check number? ... or say, "list my recent checks."
  • When do you want to pay? ... say a date, or say "today."
  • Please say the date of the claim. ... for example, "January thirteenth, two-thousand eight."

As you can see, the same flawed solution is used for coaching, instructions, examples, alternatives, special cases -- anytime it seems appropriate to add "a little extra" to a prompt with the goal of "clarifying" expectations.

As you consider some possible solutions, bear in mind that the core problem is a turn-taking problem -- the machine asks a question and then yanks back the turn.
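
To see why the pause is the trouble spot, here is a rough timing sketch. It assumes a fixed prompt with no barge-in handling, and the reaction and reply durations in the usage lines are made-up illustrative values, not measurements; only the 3-second pause comes from the Acme example above. The function name `turn_collision` is hypothetical.

```python
def turn_collision(pause_seconds, reaction_seconds, reply_seconds):
    """Classify what happens when a prompt asks, pauses, then resumes.

    pause_seconds:    silence between the question and the add-on text
    reaction_seconds: delay before the caller starts to answer
    reply_seconds:    duration of the caller's answer
    Assumes the machine resumes on schedule regardless of caller speech.
    """
    if reaction_seconds >= pause_seconds:
        return "waited"      # machine resumed before the caller spoke
    if reaction_seconds + reply_seconds <= pause_seconds:
        return "finished"    # the whole answer fit inside the pause
    return "collision"       # machine resumed on top of the caller


# A 3-second pause, as in the Acme example.
print(turn_collision(3.0, 1.0, 1.5))
print(turn_collision(3.0, 2.0, 2.5))
print(turn_collision(3.0, 3.5, 1.0))
```

Only callers who answer quickly and briefly escape cleanly; anyone who hesitates or gives a longer answer gets talked over.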

 

Copyright © 2010 EIG. All Rights Reserved.