What is Hyperaudio?

In my not-so-spare time I work on a new technology called Hyperaudio. A question I'm frequently asked is 'What exactly is Hyperaudio?' Well, it can be a lot of things, but I often find it useful to distill it into a sentence. I got it down to this: 'Hyperaudio is to audio as Hypertext is to text.' I usually pause at this point because that statement is loaded with implications.

It's only quite recently that we've started to see the emergence of the audio interface as a serious tool. Apple's Siri speech interface certainly hit the headlines when it was first rolled out, and Google's audio search continues to improve. This is driven in part by the massive uptake of smartphones and the desire to use them hands-free, but also by the fact that audio requires only partial attention and can convey emotion very well.

We're still a long way from the concept of seamless audio interfaces as portrayed on the Starship Enterprise, but with the integration of HTML5 audio into the web page, things are starting to move.

Integration is the key word here. HTML5 audio allows us to make audio fully part of the web experience, although we could always do this to some extent with Flash. When my colleagues and I created jPlayer a few years back, it was out of a desire to structure our audio players with HTML, style them with CSS and control them from JavaScript. The idea at the time was to eliminate only the visual aspects of Flash. Now we concentrate on HTML5 audio and video and use Flash just as a fallback. Looking forward, if you're going to develop for mobile you might as well forget Flash, as many mobile platforms, including Android as well as iOS, will not support it at the OS level.

So what is Hyperaudio? Hyperaudio is a series of technologies, built upon the foundations of HTML5 audio, which aim to make audio a first-class citizen of the web. In particular, but not exclusively, we are concerned with the spoken word.

What Hyperaudio hopes to achieve:

1. make audio searchable
2. make audio linkable
3. make audio navigable
4. dynamically generate audio
5. convert speech to text
6. represent audio visually

When we represent spoken audio as text we immediately start to open up the content. People can see at a glance what that content is. Transcripts, then, underpin much of the work undertaken under the Hyperaudio umbrella so far. As soon as we convert speech to text we are able to scan, search and link to that content. Add timings and we can use that transcript as a form of navigation. We can call these hyperlinked transcripts hyper-transcripts.
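To make the idea of a timed transcript a little more concrete, here is a minimal sketch in TypeScript. It is not taken from any Hyperaudio code: the `TimedWord` shape, the sample timings and the function names are all assumptions made for illustration. The only outside convention it leans on is the `#t=` temporal syntax from the W3C Media Fragments URI specification, used here to build a link to a point in the audio.

```typescript
// A word in a transcript plus the time (in seconds) at which it is spoken.
// Hypothetical shape, chosen for illustration only.
interface TimedWord {
  text: string;
  start: number;
}

// Made-up sample data; real timings would come from a transcription step.
const transcript: TimedWord[] = [
  { text: "Hyperaudio", start: 0.0 },
  { text: "is", start: 0.9 },
  { text: "to", start: 1.1 },
  { text: "audio", start: 1.3 },
  { text: "as", start: 1.8 },
  { text: "hypertext", start: 2.0 },
  { text: "is", start: 2.8 },
  { text: "to", start: 3.0 },
  { text: "text", start: 3.2 },
];

// Searchable: return the start time of the first occurrence of a phrase, or null.
function findPhrase(words: TimedWord[], phrase: string): number | null {
  const target = phrase.toLowerCase().split(/\s+/).filter((w) => w.length > 0);
  if (target.length === 0) return null;
  for (let i = 0; i + target.length <= words.length; i++) {
    const candidate = words
      .slice(i, i + target.length)
      .map((w) => w.text.toLowerCase());
    const first = words[i];
    if (first !== undefined && candidate.join(" ") === target.join(" ")) {
      return first.start;
    }
  }
  return null;
}

// Navigable: index of the word being spoken at a given time, or -1 before the first word.
// Assumes the words are sorted by start time.
function wordIndexAtTime(words: TimedWord[], seconds: number): number {
  let index = -1;
  for (let i = 0; i < words.length; i++) {
    const w = words[i];
    if (w === undefined || w.start > seconds) break;
    index = i;
  }
  return index;
}

// Linkable: build a URL that points at a moment in the media (Media Fragments "#t=").
function linkToTime(mediaUrl: string, seconds: number): string {
  return `${mediaUrl}#t=${seconds}`;
}

console.log(findPhrase(transcript, "hypertext is")); // 2
console.log(wordIndexAtTime(transcript, 1.5));       // 3 (the word "audio")
console.log(linkToTime("talk.mp3", 2));              // talk.mp3#t=2
```

In a page, clicking a word would set the player's position to that word's `start`, and the index returned by `wordIndexAtTime` would tell you which word to highlight as the audio plays.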

So let's look at and listen to a few Hyperaudio demos to get a feel for what it can do.

- An early demo created for Denmark's biggest radio station, DR (dual language)
- A more visual demo for WNYC's famous RadioLab programme
- Hyperaudio Pad - a tool for manipulating media through its transcripts (work in progress)
- Breaking Out - an experiment in dynamically generated speech

Many of these demos take advantage of the great work that is being done with the Popcorn.js library. While Popcorn.js is often associated with video, it can equally be applied to audio. In short, Popcorn allows you to trigger events at set times in a piece of media.
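The underlying idea of firing events at set times can be sketched without any library at all. The TypeScript below is a hand-rolled illustration, not Popcorn's API: `CueList`, `cue` and `update` are names invented for this example. It assumes something reports the current playback position to it regularly, which in a browser would typically be the media element's `timeupdate` event together with its `currentTime` property.

```typescript
// One scheduled action. Hypothetical shape for this sketch.
interface TimedCue {
  time: number;          // seconds into the media
  action: () => void;    // what to do when playback reaches `time`
  fired: boolean;        // whether it has already run on this pass
}

class CueList {
  private cues: TimedCue[] = [];
  private lastTime = 0;

  // Register a callback to run once playback reaches `time`.
  cue(time: number, action: () => void): void {
    this.cues.push({ time, action, fired: false });
    // Keep cues in time order so they fire in order after a jump forward.
    this.cues.sort((x, y) => x.time - y.time);
  }

  // Feed in the current playback position (in seconds).
  update(currentTime: number): void {
    if (currentTime < this.lastTime) {
      // Playback moved backwards: re-arm cues that lie ahead of the new position.
      for (const c of this.cues) {
        if (c.time > currentTime) c.fired = false;
      }
    }
    this.lastTime = currentTime;
    for (const c of this.cues) {
      if (!c.fired && currentTime >= c.time) {
        c.fired = true;
        c.action();
      }
    }
  }
}

// Usage, with playback simulated by a list of positions.
const firedLog: string[] = [];
const cues = new CueList();
cues.cue(2, () => { firedLog.push("show title"); });
cues.cue(5, () => { firedLog.push("highlight quote"); });

for (const t of [0, 1, 2.1, 3, 5.2]) {
  cues.update(t);
}
console.log(firedLog); // [ 'show title', 'highlight quote' ]

// In a browser you might wire it up like this (assuming an <audio> element `audio`):
// audio.addEventListener("timeupdate", () => cues.update(audio.currentTime));
```

Because `timeupdate` fires only a few times per second, a cue in this sketch runs at the first update at or after its time rather than at the exact instant, which is usually fine for things like highlighting a line of transcript.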

Other libraries and services that may be of interest to keen audio hackers are Speak.js and Echoprint.

As you can see, Hyperaudio has many applications. We can use it to create a range of compelling experiences, educational applications and even tools, and it is not difficult to see how this technology can be applied to the medium of games.

Excitingly, as most video contains audio, much of Hyperaudio can also be applied to video. In fact, the HTML5 APIs for audio and video are very similar.

If this article has piqued your interest in Hyperaudio, please feel free to join the growing Hyperaudio community.

This blog post has been written by Mark Boas

Mark makes, writes about and promotes new and open web technologies. Co-founder of Happyworm, a small web agency and maker of the jPlayer media framework, he enjoys pushing the limits of the browser using HTML5 and JavaScript. Though a generalist at heart, Mark spends much of his time playing with web-based media and real-time communications and is actively involved in helping news organisations worldwide as part of the Knight-Mozilla OpenNews initiative. A lover of all things audio, he often finds that this passion drives his work, and he is currently enjoying the challenge of taking audio ‘somewhere new’ with his Hyperaudio experiments. You can follow Mark on Twitter.

Copyright © 2013 Dada Entertainment, LLC. All rights reserved.