Why is this important?

Some users may be unable to see video content clearly or at all. Important unspoken content – such as a character opening a door to reveal something interesting – must be conveyed to those users through text or audio.

Consider also the environments in which users consume video content. It is not always possible or convenient to watch video content as well as listen to it:

  • when you have a visual impairment; 
  • when it is uncomfortable to watch screens for a sustained period; 
  • when cooking a meal; 
  • when the visual content is not the primary means of communication.

Screen reader users can use text transcripts to understand video-only content. Audio descriptions help users understand the visual content whilst watching the video, and still let them hear the original spoken dialogue in real time.


Provide an audio description

For visual content, audio descriptions – detailing a character’s body language, expressions and movements, scene changes and on-screen text – provide the best user experience. However, they can be time-consuming and expensive to produce. If you regularly produce video, factor the production of audio descriptions into the creation process. For example, creating audio descriptions from a storyboard will be a lot easier than starting from scratch.

Provide a transcript for video content

A transcript is a presentation of the dialogue and any non-spoken audio content as text. Where possible, provide a descriptive transcript that also includes a text description of the visual content – details of a character’s body language, expressions and movements, scene changes and on-screen text. Descriptive transcripts are required to make video content accessible to users with both visual and hearing impairments.

Make the transcript available through a link or as on-page content, near the embedded video. Provide timings alongside every line so that users can skip forwards and backwards whilst following the transcript.
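
The timings for a transcript could be generated from data rather than typed by hand. The following is a minimal TypeScript sketch, not a definitive implementation: the `TranscriptCue` shape and the function names `formatTiming`, `cueHeading` and `findCueAt` are assumptions made for illustration and do not come from any particular video player or library.

```typescript
// Hypothetical shape for one timed block of a descriptive transcript.
interface TranscriptCue {
  startSeconds: number;
  endSeconds: number;
  text: string;
}

// Format a number of seconds as H:MM:SS, for example 5 becomes "0:00:05".
function formatTiming(totalSeconds: number): string {
  const whole = Math.max(0, Math.floor(totalSeconds));
  const hours = Math.floor(whole / 3600);
  const minutes = Math.floor((whole % 3600) / 60);
  const seconds = whole % 60;
  const pad = (n: number): string => (n < 10 ? "0" + n : String(n));
  return `${hours}:${pad(minutes)}:${pad(seconds)}`;
}

// Build the timing line shown above each block of transcript text.
function cueHeading(cue: TranscriptCue): string {
  return `${formatTiming(cue.startSeconds)} – ${formatTiming(cue.endSeconds)}`;
}

// Find the cue that covers a playback position, so that a page could
// highlight it or jump to it. Returns undefined when no cue covers that time.
function findCueAt(
  cues: TranscriptCue[],
  positionSeconds: number
): TranscriptCue | undefined {
  for (const cue of cues) {
    if (positionSeconds >= cue.startSeconds && positionSeconds < cue.endSeconds) {
      return cue;
    }
  }
  return undefined;
}
```

A page might call `findCueAt` with the player's current time to highlight the matching block of the transcript, although how that is wired up depends on the player in use.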

Consider the placement of audio and text equivalents in your designs

If others are responsible for creating and entering audio and text equivalents for their video content, speak to them to understand their requirements and make provision for them in your designs. Ensure any linked or on-page content is easy to find and is in close proximity to its video.

Examples of good practice

A screenshot of a TV programme with the set-top box interface overlaid on the picture. There are options to turn Audio description on or off, turn subtitles on or off, and go to 'help'.
Figure 1

Turning on ‘Audio description’ will result in any important, visual-only content – such as establishing shots, reactions or plot lines unaccompanied by dialogue – being announced between lines of dialogue.

Descriptive transcripts with timings allow users to follow the accompanying video content

0:00:00 – 0:00:05

This video has many close-up shots of an event booking system called EventDiary. It begins by fading in the sound of a large group of people chattering. A light jazz piano melody fades in.

0:00:05 – 0:00:13

We hear Christine White speak: "It’s incredible that everybody made it today. Having real-time feedback about the number of attendees has really helped me plan everything from catering to gift bags."

On-screen text reads: "Christine White – Organiser of the Digital Conference 2021"

A close-up shot of a participant looking at a streaming conference fades onto the screen. We transition to a sliding shot of a group of participants at lunch, during a conference.

0:00:15 – 0:00:10

The piano melody and chattering crowd fade to silence. A close-up shot of Alan’s face is now in view.

Squire: "Hi, my name is Alan Squire from EventDiary"

On-screen text reads: "Alan Squire – Product Manager at EventDiary"

0:00:14 – 0:00:20

Squire: “I’m going to show you how to book a live streaming event using EventDiary”

0:00:21 – 0:00:26

“Here is an example of a straightforward event that we might want to start promoting on our website.”

A close-up shot of a poster for a streaming conference fades onto the screen.

0:00:27 – 0:00:29

“It’s a webinar on accessibility.”

0:00:30 – 0:00:32

“The first things we need to note are”

0:00:32 – 0:00:38 

“the date that it’s happening and the video platform we’re going to use.”

The shot pans across the poster, highlighting the date.
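
Timing lines like those in the transcript above can also be read back into seconds, for example so that each block could link to the matching point in the video. The following is a minimal TypeScript sketch under stated assumptions: timing lines take the form `H:MM:SS – H:MM:SS`, and `parseTiming` and `parseTimingRange` are hypothetical helper names, not part of any library.

```typescript
// Convert a timing such as "0:00:21" (H:MM:SS) into a number of seconds.
// Returns undefined when the text is not in that form.
function parseTiming(timing: string): number | undefined {
  const match = /^(\d+):([0-5]\d):([0-5]\d)$/.exec(timing.trim());
  if (!match) {
    return undefined;
  }
  return Number(match[1]) * 3600 + Number(match[2]) * 60 + Number(match[3]);
}

// Split a timing line such as "0:00:21 – 0:00:26" into start and end seconds.
// Accepting either an en dash or a hyphen between the timings is an assumption.
function parseTimingRange(
  line: string
): { start: number; end: number } | undefined {
  const parts = line.split(/\s+[–-]\s+/);
  if (parts.length !== 2) {
    return undefined;
  }
  const start = parseTiming(parts[0]);
  const end = parseTiming(parts[1]);
  if (start === undefined || end === undefined) {
    return undefined;
  }
  return { start, end };
}
```

The start value returned by `parseTimingRange` could then be passed to whatever seek control the video player provides.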


WCAG 2.1

  • 1.2.1 Audio-only and Video-only (Pre-recorded) (A)
  • 1.2.3 Audio Description or Media Alternative (Pre-recorded) (A)
  • 1.2.5 Audio Description (Pre-recorded) (AA)

EN 301 549 v 2.1.2

  • Audio-only and Video-only (Pre-recorded)
  • Audio Description or Media Alternative (Pre-recorded)
  • Audio Description (Pre-recorded)


Further reading