Roger Hart

Technical Author - Red Gate Software

When to be quiet: does video need audio?

Published Friday, March 27, 2009 1:21 PM

Yesterday I gave a rather bumbling presentation to the Cambridge ISTC group on video for user assistance. The first thing I did was apologise to anybody who had read my previous blog, since the talk repeated most of it. Reciprocally: if you went to the talk, sorry. This blog recapitulates some of yesterday's discussion and examples.

Scroll to the end for examples and references if you're not so keen on the blathering. 

In brief

Research suggests that learners perform significantly better when presented with a well-designed combination of audio and video. Where either audio or textual content is redundant, however, learning is drastically hindered. Given that using only carefully crafted explanatory text alongside visuals yields lower but substantial gains in the quality of learning, where you can't use audio, you should work on sensibly combining words and pictures.

That's pretty uncontroversial if you buy into cognitivist learning theory. If you don't, there's a vast and separate discussion, and one I'm quite happy to ignore for now.

I would add, however, that creating well-designed audio information does present costs and problems.

Great stuff about audio

We know from studies like When Auditory Presentations Should and Should Not Be a Component of Multimedia Instruction (Leahy, Wayne, Chandler, Paul, & Sweller, John (2003), Applied Cognitive Psychology) that an audio narrative which repeats written text creates profoundly unhelpful cognitive clutter. We also know that auditory/verbal and spatial/visual processing occupy separate channels into working memory, and that of the two, the visual tends to occur first.

So the first thing to be wary of is a narration that repeats onscreen text, or distracts from text embedded in graphics or diagrams.

Studies like Jakob Nielsen's eye-tracking work suggest that where video has little going on to hold visual attention, focus wanders. Although they arrive through separate modalities, the two streams absolutely must be complementary if you want to maximise access to working memory, and thereby the likelihood of reinforcing learning.

The flip side of this is that if you're quite deliberately not watching the screen, for whatever reason, you can still receive information from an audio track.

Something that came up in yesterday's discussion, which I'd not really considered, is the humanising effect of audio. Disastrous for reception when it goes wrong, but voice can build trust and rapport, as well as reassuring users that even a complex problem is not, in fact, painfully abstract and intractable.

Bart's videos for ANTS Memory Profiler and Richard's for Exchange Server Archiver are pretty good at this.

Unfortunately, it's not just about information design. If your video is sitting within an application (especially if it's the primary or only help) there are substantial usability concerns. Likewise (albeit differently) video living in a web context brings with it web-based usability issues, especially around hyperlinks and chunking.

Problems with audio

At its simplest, it's hard to chunk audio. A narration is often providing an organising overview, or something like a structuring narrative. This is great for reinforcing mental schemata, but lousy if you want to break up the narration and add navigation. A video in the slide/chapter mould often has audio that begins instantly, or is already in mid flow, if you jump between slides. Unlike a textual web page, or even one graphic in a sequence, it's hard to find meaningful context when skipping through an audio presentation. Users will likely bring the expectations of their wider context to bear on your video - so if it's on the web, for instance, you'll easily annoy by being static, linear, and generally non-web-like.

Continuous audio also introduces problems of pacing. Self-pacing is potentially an enormous strength of multimedia learning, and a key element in cognitivist learning strategies. The ability to move through and assimilate content as required to solve problems, rather than didactically in a lesson format, produces strong results for adult learners. It's also a cornerstone of user interaction with information on the web.

The crux of the problem here is that if you speed up somebody talking, they sound like a Disney character on a sugar high. Audio may also stop synchronizing with visuals, or content may be missed, if you attempt to modify the pace of delivery. A lot of this can be avoided with careful scripting, but we're still left with issues around delivery being too slow, too fast, or too inflexible for any given user.

There's an elephant in the room here too: talent. A lot of audio delivery is startlingly shoddy, however well-meaning. Where cost is an issue, it's unlikely you'll have access to competent voice actors or even seasoned and charismatic presenters.

The last big difficulty is around audience. Different listeners have different perceptions of accents, for example, and audio localization (if that's something you do) is pretty costly. Given the problems around information redundancy, it's also worth considering a complete redesign rather than a verbatim transcript if you're supporting deaf users.

So what should we do?

Basically, we should weigh up column A and column B. Audio helps, a lot, but it's hard to do well, presents problems for wider usability that risk alienating an audience, and is expensive to do with any real panache. The same goes for graphics, visual effects, and music, incidentally, as you can see in the examples at the end.

Personally, and given the benefits of carefully combined words and pictures, usable self-pacing, and the fact that our content lives on the web, I'm inclined to do without sound. Especially for material where we expect dipping-in and context switches.

What I'd really like to know, and maybe there's some research out there, is the value of sloppily executed but well-designed audio. Does the talent/accent/localization/pacing stuff actually matter? Something too slick could even alienate in an instructional context: how likely are we to switch off if something looks like glib marketing?

Intuitively, I'd guess that if people are going to watch your video content from start to finish, and in advance of using an application, amateurish production is just fine.

Examples

References

I particularly recommend the first two:

An investigation of behaviorist and cognitive approaches to instructional multimedia design
Deubel, Patricia. (2003), Journal of Educational Multimedia and Hypermedia

e-Learning and the Science of Instruction
Colvin Clark, Ruth & Mayer, Richard (2008), John Wiley

F-Shaped Patterns For Reading Web Content
Jakob Nielsen

Learner and Information Characteristics in the Design of Powerful Learning Environments
Paas, Fred & Kester, Liesbeth (2006), Applied Cognitive Psychology

When Auditory Presentations Should and Should Not Be a Component of Multimedia Instruction
Leahy, Wayne, Chandler, Paul, & Sweller, John (2003), Applied Cognitive Psychology

Storyboard Development for Interactive Multimedia Training
Orr, Kay (1994), Journal of Interactive Instruction Development

Comments

 

Alex.Davies said:

Mm, interesting. I'd never really thought about the problems of audio apart from the effort in producing it. Not being able to skip through is really obvious now I think about it.
March 30, 2009 12:22 PM
