Sarah Horton
Usability Tips You Can Use: Designing Accessible Audio
Last updated Oct 17, 2003.
Text is hands-down the most accessible format for conveying information to the broadest possible Web audience. Unlike images, video, and audio, text can be both seen and heard. Users who can’t hear can read text, and users who can’t see can have text read aloud by software. This doesn’t mean that we should forgo the richness of presentation that comes from images and audio—pages thick with text present their own set of challenges—but that information conveyed through these channels must also be available as text. Fortunately, Web technology offers a variety of methods for presenting media content with equivalent text.
Accessible audio is a perfect illustration of the broad benefits of universal design because access to equivalent text is helpful for everyone. A text transcript can be read and indexed by software, making audio- and video-based content easier to find. Users who have technical issues with Web-based media (slow Internet access, software incompatibilities, and so on) can access the information via the text transcript. A transcript can also be more efficient than audio alone: because reading is generally faster than listening, people working from a transcript may take in the information more quickly.
So how do you go about putting into text the information contained in an audio recording? Keep reading to find out, and to learn what to do with a transcript once you have one.
Audio Only
Say you’ve recorded a commentary on the ethics of breeding fluorescent pigs. You plan to publish the commentary as a podcast and also post it on your blog. For your commentary to be accessible to users who can’t hear, you need a text transcript.
Transcribing the Audio
Preparing a transcript is a snap if you started with a script. In this case, just review the recording and revise the script for accuracy, noting any sounds that add to the narrative (for example, giggles, grunts, snorts). Preparing a transcript without a script is another story. In this case, you’ll need to play and pause, play and pause, while typing the text. For those of us who lack training, the process of transcribing what’s spoken may be slow and tedious, but is certainly not impossible.
"Whoa there, Missy," you say. "With printed documents, we use software to scan pages and convert the dots on the page into characters and words. Why can’t we use a similar tool to convert spoken words to text?" Well, you can. Speech-to-text technology is available in tools such as Dragon NaturallySpeaking. If you’re doing regular commentaries, it might make sense to spend the time configuring the software to recognize your voice and create accurate transcripts. On the other hand, the technology is not as robust or accurate as the OCR technology used for scanning a printed document, and you may spend more time correcting errors than you would have had you started from scratch.
For those with money to spend, contracting with a transcription service may be the best option. Some Web-based companies offer low prices and quick turnaround for audio transcripts of online media. I can’t recommend a specific service, but googling "podcast transcription service" should provide some options.
Publishing the Transcript
With a complete and accurate transcript in hand, post the text using standards-based, semantic markup and then sit back and reap the benefits of improved findability and broader access. For an example of a site that provides audio transcripts, see NASA Podcasting.
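As a rough illustration of what "semantic markup" can mean for a transcript, the following Python sketch wraps a plain-text transcript in a heading and paragraph elements. The function name `transcript_to_html`, the `transcript` class name, and the choice of an `<h2>` heading are assumptions made for this example, not part of any particular publishing tool.

```python
import re
from html import escape


def transcript_to_html(title: str, transcript: str) -> str:
    """Wrap a plain-text transcript in a heading and <p> elements.

    Blank lines in the input are treated as paragraph breaks.
    """
    paragraphs = [
        " ".join(block.split())  # collapse line breaks inside a paragraph
        for block in re.split(r"\n\s*\n", transcript)
        if block.strip()
    ]
    # "transcript" is a hypothetical class name used only for styling hooks.
    parts = ['<div class="transcript">', f"  <h2>{escape(title)}</h2>"]
    parts += [f"  <p>{escape(p)}</p>" for p in paragraphs]
    parts.append("</div>")
    return "\n".join(parts)


print(transcript_to_html(
    "Commentary: Fluorescent Pigs",
    "Welcome to the show.\n\n[snorts] Today we ask a hard question.",
))
```

Escaping the text before inserting it keeps stray characters such as `<` or `&` in the transcript from being read as markup.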
Audio and Video
Now suppose that your commentary contains a video track—photos or video footage of fluorescent pigs, perhaps. For truly accessible audio, you’ll need to go one step further and synchronize the text and video to create captions.
Adding Timecode
For use as captions, the transcript needs to contain timecodes that indicate when to display each text segment along with the video. Some transcription services provide timecodes as part of their service. For the do-it-yourself captioner, try Media Access Generator (known as MAGpie) from the National Center for Accessible Media (NCAM). MAGpie doesn’t automagically add timecodes to an audio transcript. Instead, it is a facilitating tool that offers a comfortable and efficient working environment in which to create captions (see Figure 1).
To create captions using MAGpie, you import the audio transcript, play the associated video, and press a function key to mark each break. MAGpie inserts timecodes and creates a text file that can be used for captions:
[00:06:08.26] The realist in the area of race relations seeks to combine the truths of two opposites
[00:06:18.22] while avoiding the extremes of both.
[00:06:22.90] So the realist would agree with the optimist that we have come a long, long way
[00:06:29.16] but he would balance this by agreeing with the pessimist
[00:06:33.93] that we have a long, long way to go before this problem is solved.
[00:06:41.84] And it is this realistic position that I would like to use as a basis for our thinking together
[00:06:49.45] as we think of the future of race relations in the United States.
[00:06:55.44] We have made significant strides.
[00:06:59.72] We have come a long, long way.
[00:07:03.06] But we have a long, long way to go.
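Because the timecoded file is plain text, it is easy to check or reuse with a short script. The Python sketch below reads timecodes in the bracketed form shown above and returns each caption with its start time in seconds. It assumes the last two digits are hundredths of a second; if your file uses frames or another unit, the arithmetic would need to change. The function name `parse_captions` is made up for this example.

```python
import re

# Matches a timecode such as [00:06:08.26] followed by the caption text,
# which runs until the next opening bracket or the end of the input.
TIMECODE = re.compile(r"\[(\d{2}):(\d{2}):(\d{2})\.(\d{2})\]\s*([^\[]*)")


def parse_captions(text: str) -> list[tuple[float, str]]:
    """Return (start_seconds, caption_text) pairs in document order."""
    captions = []
    for hours, minutes, seconds, fraction, body in TIMECODE.findall(text):
        # Assumption: the final field is hundredths of a second.
        start = int(hours) * 3600 + int(minutes) * 60 + int(seconds) + int(fraction) / 100
        captions.append((start, body.strip()))
    return captions


sample = (
    "[00:06:55.44] We have made significant strides. "
    "[00:06:59.72] We have come a long, long way."
)
for start, body in parse_captions(sample):
    print(f"{start:8.2f}s  {body}")
```

Note that this simple pattern stops at any opening bracket, so caption text that itself contains bracketed notes (for example, a sound description) would be cut short and would need a stricter pattern.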
Synchronizing Captions with Video
A variety of methods are available for synchronizing captions and video, and the method you choose depends in large part on the format of your media. For the purpose of simplicity, let’s look at publishing captioned QuickTime video using Synchronized Multimedia Integration Language (SMIL).
SMIL, pronounced "smile," is a happy little markup language for multimedia presentations. Using SMIL is a bit like working with page layout software to pull together and position different elements into one presentation. In the following SMIL example, the location of the elements on the page is defined in the <layout> section, and the file source and duration of each element are defined in the <body> section. Enclosing the video and text in a <par> element causes them to play in parallel (simultaneously).
<smil>
  <head>
    <layout>
      <root-layout width="400" height="380" />
      <region id="video" width="400" height="300" />
      <region id="caption" width="360" height="60" left="20" top="320" />
    </layout>
  </head>
  <body>
    <par dur="1:00:00.00">
      <video dur="1:00:00.00" region="video" src="video.mov" alt="Video" />
      <textstream dur="1:00:00.00" region="caption" src="captions.txt" alt="Captions" />
    </par>
  </body>
</smil>
For those familiar with markup languages, SMIL is just another dialect. If you’re faint of heart, don’t panic: authoring tools such as MAGpie and GoLive generate SMIL markup for you.
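To give a feel for what generating SMIL involves, here is a minimal Python sketch that builds the same structure as the example above using the standard library's XML module. It is not how MAGpie or GoLive work internally; the function name `build_smil` and its parameters are invented for illustration, and the layout values are simply copied from the example.

```python
import xml.etree.ElementTree as ET


def build_smil(video_src: str, captions_src: str, duration: str) -> str:
    """Build a SMIL document that plays a video and a caption stream in parallel."""
    smil = ET.Element("smil")

    # Layout: where the video and the caption region sit in the presentation.
    layout = ET.SubElement(ET.SubElement(smil, "head"), "layout")
    ET.SubElement(layout, "root-layout", width="400", height="380")
    ET.SubElement(layout, "region", id="video", width="400", height="300")
    ET.SubElement(layout, "region", id="caption",
                  width="360", height="60", left="20", top="320")

    # Body: children of <par> are presented at the same time.
    par = ET.SubElement(ET.SubElement(smil, "body"), "par", dur=duration)
    ET.SubElement(par, "video", dur=duration, region="video",
                  src=video_src, alt="Video")
    ET.SubElement(par, "textstream", dur=duration, region="caption",
                  src=captions_src, alt="Captions")

    ET.indent(smil)  # pretty-print; requires Python 3.9 or later
    return ET.tostring(smil, encoding="unicode")


print(build_smil("video.mov", "captions.txt", "1:00:00.00"))
```

Generating the file this way means the duration and file names are written in one place, so the video and caption entries cannot drift out of step.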
Resources
To learn more about audio transcription and captioning, check these resources:
- NCAM’s Rich Media Accessibility is a great source for tutorials and examples on a variety of media access strategies, including captioning.
- For stylistic considerations with regard to captioning, consult the WGBH Media Access Group’s page Suggested Styles and Conventions for Closed Captioning. Also, Joe Clark discusses the nuances of captioning and transcription in his book Building Accessible Websites (New Riders, 2002, ISBN 073571150X).
- For more on creating SMIL presentations using QuickTime, visit Apple’s QuickTime and SMIL.
- Be sure to check out the Accessibility section of the Informit Web Design Reference Guide.
- My book Access by Design: A Guide to Universal Usability for Web Designers provides details and examples to teach you how to optimize page designs to work more effectively for all users, particularly those with special accessibility needs.

