Creating accessible video and audio instruction


Instruction statement

This instruction is designed to help staff involved in web publishing to maintain a high standard of video and animated content on the RMIT web presence.


This instruction does not apply to:

  • courseware, including scholarly work, student work and teaching and learning materials
  • websites that have no relationship to RMIT (for example, personal or private sites)

Instruction steps and actions

Accessibility standards are set out in the Creating video, audio and animation content procedure. You should also refer to the Web Accessibility Standards for audio and video for detailed information on how to meet accessibility requirements.

In brief, these standards require that users who cannot access audio or video content are given text alternatives, audio alternatives, or both, with enough description to convey the information that content contains.

This includes an appropriate combination of the following:

1. Transcripts

1.1 Transcripts are text descriptions of audio or video content that anyone can access, including deaf or blind users, screen reader users, people less proficient in the language, and search engines. A transcript typically includes all spoken words, with additional descriptions to convey sound effects and visuals.

1.2 Transcripts must be provided for all pre-recorded video and audio content.

2. Captions

2.1 Captions are text versions of spoken words and important sound effects that make a video accessible to people who do not have access to the audio. Captions appear as subtitles on the screen, synchronised with the video.

Captions can be either open or closed. Open captions are a permanent part of the video. Closed captions are most often used for web video and rely on the video player to display them on request.

Captions must be provided for all pre-recorded video.

2.2 Captions can be created with free online tools in formats to suit the player (e.g. YouTube or JWPlayer). Captions can be time-consuming to type, edit and synchronise, so planning for them early in the process (e.g. during planning or scripting) will make the work easier. Captions for YouTube videos can be partly automated with a voice recognition service, but they still need to be checked and edited.
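As an illustration of what a caption file contains, the sketch below builds a small WebVTT file, one widely used caption format. It assumes your video player accepts WebVTT, so check the player's documentation first. The function names and the sample cues are hypothetical and are not part of any RMIT tool.

```python
def format_timestamp(seconds):
    """Format a time in seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def build_webvtt(cues):
    """Build WebVTT text from (start_seconds, end_seconds, text) tuples."""
    lines = ["WEBVTT", ""]  # required header, then a blank line
    for start, end, text in cues:
        if end <= start:
            raise ValueError("each cue must end after it starts")
        lines.append(f"{format_timestamp(start)} --> {format_timestamp(end)}")
        lines.append(text)
        lines.append("")  # a blank line separates cues
    return "\n".join(lines)


# Hypothetical sample cues: spoken words plus an important sound effect.
sample_cues = [
    (0.0, 2.5, "Welcome to the campus tour."),
    (2.5, 4.0, "[applause]"),
]

if __name__ == "__main__":
    print(build_webvtt(sample_cues))
```

Saving the output with a .vtt extension gives a closed-caption file that can be uploaded alongside the video. The timings and wording still need to be checked against the finished recording.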

3. Audio description

3.1 Audio description is narration, delivered during natural pauses in the audio, that describes what is happening visually in the video so that it is accessible to people who cannot see it. It is provided as an audio file to accompany the video content.

3.2 This is a requirement only for relevant visuals not already covered by the spoken dialogue.

3.3 Not all video needs audio description. For example, you do not need audio description where people speak directly to camera or to each other, or for on-screen text that is also read aloud or woven into what is said.

3.4 You will need audio description of things like charts and diagrams where their information is important to the users’ comprehension.

3.5 Ideally a video will be designed with visual elements described in the audio, so audio description is not required.

3.6 Transcribing videos can be a time-consuming task. It may be more efficient to outsource this task to a specialist video and audio transcription service.
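The requirements in sections 1 to 3 can be summarised as a simple checklist. The sketch below is only an illustration of that logic for pre-recorded content. The function and parameter names are hypothetical, and the procedure and the Web Accessibility Standards remain the authoritative source.

```python
def required_alternatives(media_type, visuals_covered_by_audio=True):
    """List the alternatives needed for one piece of pre-recorded content.

    media_type: "audio" or "video".
    visuals_covered_by_audio: False when relevant visuals (for example
    charts or diagrams) are not already described in the spoken audio.
    """
    if media_type not in ("audio", "video"):
        raise ValueError("media_type must be 'audio' or 'video'")

    needed = ["transcript"]  # 1.2: all pre-recorded video and audio
    if media_type == "video":
        needed.append("captions")  # 2.1: all pre-recorded video
        if not visuals_covered_by_audio:
            needed.append("audio description")  # 3.2: uncovered visuals only
    return needed


if __name__ == "__main__":
    # A talking-head interview: the spoken audio already covers the visuals.
    print(required_alternatives("video"))
    # A video that shows an unexplained diagram.
    print(required_alternatives("video", visuals_covered_by_audio=False))
```

Designing a video so that its visuals are described in the audio, as recommended in 3.5, keeps the second argument true and avoids the need for a separate audio description.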

4. Apply the standards listed in the Writing for the web instruction and Web spelling list to all transcript and caption text.

Digital and Customer Experience Strategy can provide advice on how to ensure these accessibility requirements can be met.

Web Manager
