This chapter is from the book

Rolling custom controls

One truly spiffing aspect of the <video> and <audio> media elements is that they come with a super easy JavaScript API. The API’s events and methods are the same for both <audio> and <video>. With that in mind, we’ll stick with the sexier media element: the <video> element for our JavaScript discussion.

As you saw at the start of this chapter, Anne van Kesteren has spoken about the new API and about the new simple methods such as play(), pause() (there’s no stop method: simply pause and move to the start), load(), and canPlayType(). In fact, that’s all the methods on the media element. Everything else is events and attributes.
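Since there’s no stop method, the “pause and move to the start” recipe can be wrapped in a small helper. This is only a sketch: stopMedia is a hypothetical name, not part of the media API, and it assumes it is handed an audio or video element.

```javascript
// stopMedia is a hypothetical helper, not part of the media API.
// It emulates "stop" by pausing and then rewinding the playhead.
function stopMedia(media) {
  media.pause();
  media.currentTime = 0;
  return media;
}
```

In a page you might call it as stopMedia(document.getElementsByTagName('video')[0]).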

Table 4.3 provides a reference list of media attributes, methods, and events.

Table 4.3 Media Attributes, Methods, and Events

  • Attribute groups: error state, network state, ready state, playback state
  • Playback state attributes: currentTime, startTime
  • Video-only attributes: width, height, videoWidth, videoHeight, poster
  • Methods: addTrack(label, kind, language)

Using JavaScript and the new media API, you have complete control over your multimedia—at its simplest, this means that you can easily create and manage your own video player controls. In our example, we walk you through some of the ways to control the video element and create a simple set of controls. Our example won’t blow your mind—it isn’t nearly as sexy as the <video> element itself (and is a little contrived!)—but you’ll get a good idea of what’s possible through scripting. The best bit is that the UI will be all CSS and HTML. So if you want to style it your own way, it’s easy with just a bit of web standards knowledge—no need to edit an external Flash Player or similar.

Our hand-rolled basic video player controls will have a play/pause toggle button and allow the user to scrub along the timeline of the video to skip to a specific section, as shown in Figure 4.3.

Figure 4.3 Our simple but custom video player controls.

Our starting point will be a video with native controls enabled. We’ll then use JavaScript to strip the native controls and add our own, so that if JavaScript is disabled, the user still has a way to control the video as we intended:

<video controls>
  <source src="leverage-a-synergy.webm" type="video/webm" />
  <source src="leverage-a-synergy.mp4" type="video/mp4" />
  Your browser doesn't support video.
  Please download the video in <a href="leverage-a-synergy.webm">WebM</a> or <a href="leverage-a-synergy.mp4">MP4</a> format.
</video>
<script>
var video = document.getElementsByTagName('video')[0];
// strip the native controls; our custom ones take over from here
video.removeAttribute('controls');
</script>

Play, pause, and toggling playback

Next, we want to be able to play and pause the video from a custom control. We’ve included a button element to which we’ll bind a click handler that does the play/pause work. Throughout my code examples, when I refer to the play object it will refer to this button element:

<button class="play" title="play">&#x25BA;</button>

We’re using &#x25BA;, which is a geometric XML entity that looks like a play button. Once the button is clicked, we’ll start the video and switch the value to two pipes using &#x2590;, which looks (a little) like a pause, as shown in Figure 4.4.

Figure 4.4 Using XML entities to represent play and pause buttons.

For simplicity, I’ve included the button element as markup, but as we’re progressively enhancing our video controls, all of these additional elements (for play, pause, scrubbing, and so on) should be generated by the JavaScript.
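As a rough sketch of what generating the button from script could look like, the helpers below build the same button as the markup above and place it directly after the video. The names createPlayButton and addPlayButton are hypothetical, and the document is passed in as a parameter so that the functions don’t depend on a global.

```javascript
// createPlayButton is a hypothetical helper. It builds the same button
// as the markup version, so the control only exists when scripting works.
function createPlayButton(doc) {
  var button = doc.createElement('button');
  button.className = 'play';
  button.title = 'play';
  button.innerHTML = '&#x25BA;'; // parsed as markup, so the reference is fine here
  return button;
}

// Insert the button directly after the video it controls.
function addPlayButton(video) {
  var button = createPlayButton(video.ownerDocument);
  // insertBefore with a null reference node appends at the end of the parent
  video.parentNode.insertBefore(button, video.nextSibling);
  return button;
}
```

Calling addPlayButton(video) once the native controls have been stripped returns the new button, ready to have its click handler bound.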

In the play/pause toggle, we have a number of things to do:

  1. If the user clicks on the toggle and the video is currently paused, the video should start playing. If the video has previously finished, and our playhead is right at the end of the video, then we also need to reset the current time to 0, that is, move the playhead back to the start of the video, before we start playing it.
  2. Change the toggle button’s value to show that the next time the user clicks, it will toggle from pause to play or play to pause.
  3. Finally, we play (or pause) the video:
play.addEventListener('click', function () {
  if (video.paused || video.ended) {
    if (video.ended) {
      video.currentTime = 0; // rewind a finished video before replaying
    }
    this.innerHTML = '▐▐'; // &#x2590;&#x2590; doesn't need escaping here
    this.title = 'pause';
    video.play();
  } else {
    this.innerHTML = '►'; // &#x25BA;
    this.title = 'play';
    video.pause();
  }
}, false);

The problem with this logic is that we’re relying entirely on our own script to determine the state of the play/pause button. What if the user was able to pause or play the video via the native video element controls somehow (some browsers allow the user to right click and select to play and pause the video)? Also, when the video comes to the end, the play/pause button would still show a pause icon. Ultimately, we need our controls always to relate to the state of the video.

Eventful media elements

The media elements fire a broad range of events: when playback starts, when a video has finished loading, if the volume has changed, and so on. So, getting back to our custom play/pause button, we strip the part of the script that deals with changing its visible label:

play.addEventListener('click', function () {
  if (video.ended) {
    video.currentTime = 0; // reset a finished video
  }
  if (video.paused) {
    video.play();
  } else {
    video.pause();
  }
}, false);

In the simplified code, if the video has ended we reset it, and then toggle the playback based on its current state. The label on the control itself is updated by separate (anonymous) functions we’ve hooked straight into the event handlers on our video element:

video.addEventListener('play', function () {
  play.title = 'pause';
  play.innerHTML = '▐▐';
}, false);
video.addEventListener('pause', function () {
  play.title = 'play';
  play.innerHTML = '►';
}, false);
video.addEventListener('ended', function () {
  this.pause(); // triggers the pause handler, which restores the play label
}, false);

Whenever the video is played, paused, or has reached the end, the function associated with the relevant event is now fired, making sure that our control shows the right label.

Now that we’re handling playing and pausing, we want to show the user how much of the video has downloaded and therefore how much is playable. This would be the amount of buffered video available. We also want to catch the event that says how much video has been played, so we can move our visual slider to the appropriate location to show how far through the video we are, as shown in Figure 4.5. Finally, and most importantly, we need to capture the event that says the video is ready to be played, that is, there’s enough video data to start watching.

Figure 4.5 Our custom video progress bar, including seekable content and the current playhead position.
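The event that reports how much of the video has been played is timeupdate. Here is a minimal sketch of moving the playhead with it; percentPlayed and trackPlayhead are hypothetical names, and playhead is assumed to be an element positioned inside the seek bar.

```javascript
// Convert the playback position to a percentage of the total duration.
// duration is NaN until the metadata has loaded, so guard against that.
function percentPlayed(currentTime, duration) {
  if (!duration || !isFinite(duration)) {
    return 0;
  }
  return Math.min(100, (currentTime / duration) * 100);
}

// playhead is assumed to be an element positioned inside the seek bar.
function trackPlayhead(video, playhead) {
  video.addEventListener('timeupdate', function () {
    // "this" is the video element the event fired on
    playhead.style.left = percentPlayed(this.currentTime, this.duration) + '%';
  }, false);
}
```

Keeping the arithmetic in its own function makes it easy to reuse when the user scrubs along the timeline.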

Monitoring download progress

The media element has a “progress” event, which fires once the media has been fetched but potentially before the media has been processed. When this event fires, we can read the video.seekable object, which has a length property and start() and end() methods; both methods take the index of the range you’re asking about. We can update our seek bar (shown in Figure 4.5 in the second frame with the whiter colour) using the following code (where the buffer variable is the element that shows how much of the video we can seek and has been downloaded):

video.addEventListener('progress', updateSeekable, false);
function updateSeekable() {
  var ranges = this.seekable; // end() needs the index of a range
  var endVal = ranges && ranges.length ? ranges.end(ranges.length - 1) : 0;
  buffer.style.width = (100 / (this.duration || 1) * endVal) + '%';
}

The code binds to the progress event, and when it fires, it gets the percentage of video that can be played back compared to the length of the video. Note the keyword this refers to the video element, as that’s the context in which the updateSeekable function will be executed. The duration attribute is the length of the media in seconds.

However, there are some issues with Firefox. In previous versions the seekable length didn’t match the actual duration, and in the latest version (5.0.1) seekable seems to be missing altogether. So to protect ourselves from the seekable time range going a little awry, we can also listen for the durationchange event and default to the duration of the video as backup:

video.addEventListener('durationchange', updateSeekable, false);
video.addEventListener('progress', updateSeekable, false);
function updateSeekable() {
  var ranges = this.seekable;
  // fall back to the full duration when seekable is missing or empty
  var endVal = ranges && ranges.length ? ranges.end(ranges.length - 1) : this.duration;
  buffer.style.width = (100 / (this.duration || 1) * endVal) + '%';
}

It’s a bit rubbish that we can’t reliably get the seekable range. Alternatively we could look to the video.buffered property, but sadly since we’re only trying to solve a Firefox issue, this value in Firefox (currently) doesn’t return anything for the video.buffered.end() method—so it’s not a suitable alternative.
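For browsers where video.buffered does report something, it can be read defensively. The sketch below is an assumption about how you might guard the call rather than a fix for the Firefox issue; bufferedEnd is a hypothetical helper name.

```javascript
// bufferedEnd is a hypothetical helper. It returns the end time, in seconds,
// of the last buffered range, or 0 when nothing usable is reported.
function bufferedEnd(media) {
  var ranges = media.buffered;
  if (!ranges || !ranges.length) {
    return 0;
  }
  try {
    return ranges.end(ranges.length - 1); // end() takes a range index
  } catch (e) {
    return 0; // be defensive in case the call throws
  }
}
```

The result can be fed into the same percentage calculation used for the seekable range.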

When the media file is ready to play

When your browser first encounters the video (or audio) element on a page, the media file isn’t ready to be played just yet. The browser needs to download and then decode the video (or audio) so it can be played. Once that’s complete, the media element will fire the canplay event. Typically this is the time you would initialise your controls and remove any “loading” indicator. So our code to initialise the controls would typically look like this:

video.addEventListener('canplay', initialiseControls, false);

Nothing terribly exciting there. The control initialisation enables the play/pause toggle button and resets the playhead in the seek bar.
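One possible shape for that initialisation is sketched below. It assumes the play button starts out disabled and that a playhead element marks the current position in the seek bar; createInitialiser and playhead are hypothetical names.

```javascript
// A hypothetical shape for initialiseControls. play is the toggle button;
// playhead is assumed to be the element that marks the current position.
function createInitialiser(play, playhead) {
  return function initialiseControls() {
    play.disabled = false;      // enable the play/pause toggle
    playhead.style.left = '0%'; // reset the playhead in the seek bar
  };
}
```

You could then register it with video.addEventListener('canplay', createInitialiser(play, playhead), false).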

However, sometimes this event won’t fire right away (or when you’re expecting it to). Sometimes the video suspends download because the browser is trying to prevent overwhelming your system. That can be a headache if you’re expecting the canplay event, which won’t fire unless you give the media element a bit of a kicking. So instead, we’ve started listening for the loadeddata event. This says that there’s some data that’s been loaded, though not necessarily all the data. This means that the metadata is available (height, width, duration, and so on) and some media content—but not all of it. By allowing the user to start playing the video at the point in which loadeddata has fired, browsers like Firefox are forced to go from a suspended state to downloading the rest of the media content, which lets them play the whole video.

You may find that in most situations, if you’re doing something like creating a custom media player UI, you might not need the actual video data to be loaded—only the metadata. If that’s the case, there’s also a loadedmetadata event which fires once the first frame, duration, dimensions, and other metadata is loaded. This may in fact be all you need for a custom UI.

So the correct point in the event cycle to enable the user interface is the loadedmetadata event:

video.addEventListener('loadedmetadata', initialiseControls, false);

A race to play video

Here’s where I tell you that as much as native video and audio smells of roses, there’s a certain pong coming from somewhere. That somewhere is a problem in the implementation of the media element that creates what’s known as a “race condition.”

The problem is that it’s possible, though not likely, for the browser to load the media element before you’ve had time to bind the event listeners.

For example, if you’re using the loadedmetadata event to listen for when a video is ready so that you can build your own fancy-pants video player, it’s possible that the native video HTML element may trigger the events before your JavaScript has loaded.


There are a few workarounds for this race condition, all of which would be nice to avoid, but I’m afraid it’s just something we need to code for defensively.

Workaround #1: High Event Delegation

In this workaround, we need to attach an event handler on the window object. This event handler must be above the media element. The obvious downside to this approach is that the script element is above our content, and risks blocking our content from loading (best practice is to include all script blocks at the end of the document).

Nonetheless, media events can be caught at the window object: although they don’t bubble, a listener registered for the capture phase (the final true argument to addEventListener) sees them on their way down to the element. So when the loadedmetadata event reaches our window-level listener, we check where the event originated from, via the target property, and if that’s our element, we run the setup code. Note that in the example below, I’m only checking the nodeName of the element; you may want to run this code against all audio elements or you may want to check more properties on the DOM node to make sure you’ve got the right one.

<script>
function audioloaded() {
  // setup the fancy-pants player
}

window.addEventListener('loadedmetadata', function (event) {
  if (event.target.nodeName === 'AUDIO') {
    // set this context to the DOM node
    audioloaded.call(event.target);
  }
}, true);
</script>

<audio src="hanson.mp3">
  <p>If you can read this, you can't enjoy the soothing sound of the Hansons.</p>
</audio>

Workaround #2: High and Inline

Here’s a similar approach using an inline handler:

<script>
function audioloaded() {
  // setup the fancy-pants player
}
</script>

<audio src="hanson.mp3" onloadedmetadata="audioloaded.call(this)">
  <p>If you can read this, you can't enjoy the soothing sound of the Hansons.</p>
</audio>

Note that in the inline event handler I’m using .call(this) to set the this keyword to the audio element the event fired upon. This means it’s easier to reuse the same function later on if browsers (in years to come) do indeed fix this problem.

By putting the event handler inline, the handler is attached as soon as the DOM element is constructed, therefore it is in place before the loadedmetadata event fires.

Workaround #3: JavaScript Generated Media

Another workaround is to insert the media using JavaScript. That way you can create the media element, attach the event handlers, and then set the source and insert it into the DOM.
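The order of operations is the whole point here, so a minimal sketch may help. createAudio is a hypothetical helper, and the document is passed in as a parameter rather than read from a global.

```javascript
// createAudio is a hypothetical helper: bind first, then set the source.
function createAudio(doc, src, onReady) {
  var audio = doc.createElement('audio');
  audio.addEventListener('loadedmetadata', onReady, false); // 1. attach handlers
  audio.src = src;                                          // 2. set the source
  doc.body.appendChild(audio);                              // 3. insert into the DOM
  return audio;
}
```

Because the listener is bound before the element has a source, there is nothing for the browser to load ahead of your handler.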

Remember: if you do insert the media element using JavaScript, you need to either insert all the different source elements manually, or detect the capability of the browser and insert the src attribute that the browser supports, for instance WebM (video/webm) for Chrome.
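Capability detection can lean on canPlayType(), which answers with an empty string, 'maybe', or 'probably'. The following is a sketch under the assumption that you keep your own list of candidate files; pickSource and the shape of that list are hypothetical.

```javascript
// pickSource is a hypothetical helper. sources is a list of
// { src, type } objects in order of preference.
function pickSource(media, sources) {
  for (var i = 0; i < sources.length; i++) {
    var answer = media.canPlayType(sources[i].type);
    // only 'maybe' and 'probably' count as playable
    if (answer === 'maybe' || answer === 'probably') {
      return sources[i].src;
    }
  }
  return null; // nothing in the list is playable
}
```

The returned value can then be assigned to the src attribute of the element you created.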

I’m not terribly keen on this solution because it means that those users without JavaScript don’t get the multimedia at all. Although a lot of HTML5 is “web applications,” my gut (and hopefully yours, too) says there’s something fishy about resorting to JavaScript just to get the video events working in a way that suits our needs. Even if your gut isn’t like mine (quite possible), big boys like Google wouldn’t be able to find and index your amazing video of your cat dancing along to Hanson if JavaScript was inserting the video. So let’s move right along to workaround number 4, my favourite approach.

Workaround #4: Check the Readystate

Probably the best approach, albeit a little messy (compared to a simple video and event handler), is to simply check the readyState of the media element. Both audio and video have a readyState with the following states:

  • HAVE_NOTHING = 0
  • HAVE_METADATA = 1
  • HAVE_CURRENT_DATA = 2
  • HAVE_FUTURE_DATA = 3
  • HAVE_ENOUGH_DATA = 4

Therefore if you’re looking to bind to the loadedmetadata event, you only want to bind if the readyState is 0. If you want to bind before it has enough data to play, then bind if readyState is less than 4.

Our previous example can be rewritten as:

<audio src="hanson.mp3">
  <p>If you can read this, you can't enjoy the soothing sound of the Hansons.</p>
</audio>
<script>
function audioloaded() {
  // setup the fancy-pants player
}

var audio = document.getElementsByTagName('audio')[0];

if (audio.readyState > 0) {
  // metadata is already here, so the event has been and gone
  audioloaded.call(audio);
} else {
  audio.addEventListener('loadedmetadata', audioloaded, false);
}
</script>

This way our code can sit nicely at the bottom of our document, and if JavaScript is disabled, the audio is still available. All good in my book.
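If you find yourself repeating this check for different events, it can be generalised. The helper below is a sketch, not an established pattern from the spec; whenReady is a hypothetical name, and minState is the lowest readyState value your callback can work with.

```javascript
// whenReady is a hypothetical helper that generalises the readyState check.
// eventName is the event that signals minState has been reached.
function whenReady(media, minState, eventName, callback) {
  if (media.readyState >= minState) {
    callback.call(media); // the event has already fired
  } else {
    media.addEventListener(eventName, function handler() {
      // run once only, then clean up after ourselves
      media.removeEventListener(eventName, handler, false);
      callback.call(media);
    }, false);
  }
}
```

For the audio example this would read whenReady(audio, 1, 'loadedmetadata', audioloaded).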

Will this race condition ever be fixed?

Technically I can understand that this issue has always existed in the browser. Think of an image element: if the load event fires before you can attach your load event handler, then nothing is going to happen. You might see this if an image is cached and loads too quickly, or perhaps when you’re working in a development environment and the delivery speed is like Superman on crack—the event doesn’t fire.

Images don’t have ready states, but they do have a complete property. When the image is being loaded, complete is false. Once the image is done loading (note this could also result in it failing to load due to some error), the complete property is true. So you could, before binding the load event, test the complete property, and if it’s true, fire the load event handler manually.
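The same defensive pattern for images might look like the sketch below; onImageLoad is a hypothetical helper name.

```javascript
// onImageLoad is a hypothetical helper mirroring the readyState check.
function onImageLoad(img, handler) {
  if (img.complete) {
    handler.call(img); // already loaded (or failed): run the handler now
  } else {
    img.addEventListener('load', handler, false);
  }
}
```

Because complete is also true after a failed load, the handler may want to check that the image has dimensions before using it.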

Since this logic has existed for a long time for images, I would expect that this same logic is being applied to the media element, and by that same reasoning, technically this isn’t a bug, as buggy as it may appear to you and me!

Fast forward, slow motion, and reverse

The spec provides an attribute, playbackRate. By default, the assumed playbackRate is 1, meaning normal playback is at the intrinsic speed of the media file. Increasing this attribute speeds up the playback; decreasing it slows it down. Negative values indicate that the video will play in reverse.

Not all browsers support playbackRate yet (only WebKit-based browsers and IE9 support it right now), so if you need to support fast forward and rewind, you can hack around this by programmatically changing currentTime:

function speedup(video, direction) {
  if (direction == undefined) direction = 1; // or -1 for reverse

  if (video.playbackRate != undefined) {
    video.playbackRate = direction == 1 ? 2 : -2;
  } else { // do it manually
    video.setAttribute('data-playbackRate', setInterval((function playbackRate() {
      video.currentTime += direction;
      return playbackRate; // allows us to run the function once and setInterval
    })(), 500));
  }
}

function playnormal(video) {
  if (video.playbackRate != undefined) {
    video.playbackRate = 1;
  } else { // do it manually
    clearInterval(video.getAttribute('data-playbackRate'));
  }
}

As you can see from the previous example, if playbackRate is supported, you can set positive and negative numbers to control the direction of playback. In addition to being able to rewind and fast forward using the playbackRate, you can also use a fraction to play the media back in slow motion using video.playbackRate = 0.5, which plays at half the normal rate.
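A slow-motion toggle is a small step from there. This sketch assumes playbackRate is supported; toggleSlowMotion is a hypothetical name.

```javascript
// toggleSlowMotion is a hypothetical helper: it flips between half speed
// and normal speed, and returns the rate now in effect.
function toggleSlowMotion(video) {
  video.playbackRate = video.playbackRate === 0.5 ? 1 : 0.5;
  return video.playbackRate;
}
```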

Full-screen video

For some time, the spec prohibited full-screen video, but it’s obviously a useful feature so WebKit did its own proprietary thing with webkitEnterFullscreen(). WebKit implemented its API in a way that could only be triggered by the user initiating the action; that is, like pop-up windows, full-screen views can’t be created unless the user performs an action like a click. The only alternative to this bespoke solution by WebKit would be to stretch the video to the browser window size. Since some browsers have a fullscreen view, it’s possible to watch your favourite video of Bruce doing a Turkish belly dance in full screen, but it would require the user to jump through a number of hoops, which is something we’d all like to avoid.

In May 2011, WebKit announced it would implement Mozilla’s full-screen API. This API allows any element to go full-screen (not only <video>); you might want full-screen <canvas> games or video widgets embedded in a page via an <iframe>. Scripts can also opt in to having alphanumeric keyboard input enabled during full-screen view, which means that you could create your super spiffing platform game using the <canvas> API and it could run full-screen with full keyboard support.

As Opera likes this approach, too, we should see something approaching interoperability. Until then, we can continue to fake full-screen by going full-window by setting the video’s dimensions to equal the window size.
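Faking full-screen this way could be sketched as follows. fillWindow is a hypothetical helper, and the window is passed in as a parameter; it only handles the dimensions, so you would still need CSS to pin the video to the top left of the page.

```javascript
// fillWindow is a hypothetical helper that fakes full-screen by sizing
// the video to the browser window. It returns a function that restores
// the original dimensions.
function fillWindow(video, win) {
  var original = { width: video.width, height: video.height };
  video.width = win.innerWidth;
  video.height = win.innerHeight;
  return function restore() {
    video.width = original.width;
    video.height = original.height;
  };
}
```

Keeping hold of the returned function gives you a simple way to leave the full-window view again.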
