This article was kindly submitted by Wendy Reid, Senior QA at Rakuten Kobo Inc and one of the co-chairs of the Publishing Working Group of the W3C. She is the editor of the audiobooks profile of Web Publications, the focus of this blog piece. Wendy recently presented on the new audiobook standard at the DPUB Summit in Paris.
When Thomas Edison recorded the first audiobook in 1877, he probably didn’t think of it as anything other than a way to sell more phonographs. Audiobooks got their real start, and their reputation as a medium for reading accessibility, in the 1930s, when the Library of Congress and the American Foundation for the Blind (AFB) developed a program for talking books. The talking books program was created to provide reading materials for wounded war veterans and people with visual disabilities, and the model would be recreated in other countries in the years that followed.
My first exposure to the world of audiobooks came in grade 10. My English teacher, a man well known for his coke-bottle glasses and sweater vests, popped a cassette tape into a player on his desk and played us a recording of Shakespeare’s Romeo and Juliet. It was pretty good, certainly more interesting than reading the worn copy in my backpack or listening to my classmates stumble over Shakespeare’s peculiar use of English. However, it was a cassette in a time when I carried an iPod around in my school bag and feverishly downloaded MP3s from torrent sites. Audiobooks, though better than reading my school books, seemed antiquated. When I picked up an audiobook again at a summer data entry job, it was on a CD. I later discovered I could torrent them off music sites too, but my tiny 8 GB iPhone 4s couldn’t handle more than one at a time, and I always seemed to lose track of my place.
Little did I know then that in a few years I would be working on audiobooks on the other side of the table. Instead of torrenting them, I’d be working on an app to purchase, download, and listen to them, all on my iPhone 6s. I would spend six months of my life listening almost exclusively to audiobooks: some I enjoyed, some I hated, many in languages I didn’t understand, all for the purpose of releasing a product. It is a product I now use almost every week.
Last year the Publishing Working Group at W3C started work on audiobooks. It had become apparent to us that there was a strong business need for standardization in the industry, especially as it was seeing a newfound popularity.
It would shock many people to know that audiobooks are produced without any sort of unifying specification at all.
Today, if a publisher wants to produce a new audiobook, they are responsible for a few things: they have to produce the audio files, a cover, a track list, and any supplemental content that they want to include with the book (this can be things like graphs or photos). Once they have done that, they often send their files either to distributors or directly to retailers. Each of those entities may or may not have a preferred “standard” they expect, and the requirements can differ considerably from one to the next.
This fragmentation means that end users, depending on the platform they use (and they are almost always siloed to one), can get many different experiences.
Those reading platforms also have to account for a lot of data problems, such as incorrect chapter lengths, missing track data, and a missing or incorrect table of contents.
The Publishing Working Group looked at all of these problems, as well as use cases that we thought were underserved by the current implementation of audiobooks—specifically accessibility. Our specification, now a public working draft, addresses our four main classes of use case:
- Listening: a user should be able to listen to their content without input or interruption
- Portability: a user should be able to download, stream, or take their content offline
- Navigation: a user should be able to know when and where they are in their audiobook
- Accessibility: regardless of ability, a user should be able to enjoy their content
Our specification makes seamless listening possible via the reading order section of the manifest. This tells reading systems, or the web, that as long as no other input is present, this is the order in which the content should be presented. It does not preclude fast forwarding or rewinding, but if a user chooses not to interact with the listening medium, they get their content in order.
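As a rough illustration of that default behaviour, here is a minimal Python sketch. It assumes the manifest has already been loaded into a dictionary with a `readingOrder` list whose items carry a `url`. The file names, the durations, and the `next_track` and `default_playback` helpers are hypothetical and are not taken from the specification.

```python
# Hypothetical manifest fragment: only the parts needed to decide playback order.
manifest = {
    "name": "Romeo and Juliet",
    "readingOrder": [
        {"url": "audio/act1.mp3", "duration": "PT1200S"},
        {"url": "audio/act2.mp3", "duration": "PT1500S"},
        {"url": "audio/act3.mp3", "duration": "PT900S"},
    ],
}


def next_track(manifest, current_url):
    """Return the URL that follows current_url in the reading order.

    Passing None asks for the first track; None is returned at the end.
    """
    urls = [item["url"] for item in manifest["readingOrder"]]
    if current_url is None:
        return urls[0] if urls else None
    index = urls.index(current_url)
    return urls[index + 1] if index + 1 < len(urls) else None


def default_playback(manifest):
    """List the tracks in the order a player would present them with no user input."""
    order = []
    current = next_track(manifest, None)
    while current is not None:
        order.append(current)
        current = next_track(manifest, current)
    return order
```

A player built this way can still seek or skip freely. The reading order only decides what plays next when the listener does nothing.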
The audiobooks specification is designed for the web first, meaning that streaming was a major use case, but we also recognized the need for an offline distribution model where content could be downloaded and transported in a single piece. For that we have developed a packaging specification that addresses this problem. It means that content creators can bundle their content together into a single file, and users can download that file and enjoy their content wherever they choose to open it.
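To make the single-file idea concrete, here is a hedged Python sketch that bundles a manifest and its audio into one ZIP archive and then checks that every reading order entry is actually inside it. The ZIP container, the `publication.json` file name, and both helper functions are assumptions made for illustration; the packaging specification itself defines the real rules.

```python
import io
import json
import zipfile

# Assumed name for the manifest file at the root of the package.
MANIFEST_NAME = "publication.json"


def build_package(manifest, resources):
    """Bundle a manifest and its resources into one in-memory ZIP archive.

    `resources` maps archive paths (for example "audio/act1.mp3") to bytes.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_STORED) as archive:
        archive.writestr(MANIFEST_NAME, json.dumps(manifest, indent=2))
        for path, data in resources.items():
            archive.writestr(path, data)
    return buffer.getvalue()


def missing_resources(package_bytes):
    """Return reading order URLs that are listed in the manifest but absent from the package."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as archive:
        names = set(archive.namelist())
        manifest = json.loads(archive.read(MANIFEST_NAME))
    return [
        item["url"]
        for item in manifest["readingOrder"]
        if item["url"] not in names
    ]
```

A check like `missing_resources` is the kind of validation a retailer could run on ingestion to catch the missing-track problems described earlier.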
The audiobooks specification has a specific provision for the Table of Contents. We allow content creators to create an HTML document for the table of contents, which means they can create a rich document with the necessary structure of the book, and audio platforms can use it for display and information. That data can be used to help the user understand where they are in their audiobook at any time.
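Here is a small Python sketch of how a platform might use such a document to answer "where am I?". It assumes the table of contents is a list of links inside a `nav` element, and that an entry can point partway into a file with a simple `#t=<seconds>` fragment. The markup, the file names, and the helper functions are illustrative assumptions rather than the specification's definitive form.

```python
from html.parser import HTMLParser

# Hypothetical table of contents document.
TOC_HTML = """
<nav role="doc-toc">
  <ol>
    <li><a href="audio/act1.mp3">Act I</a></li>
    <li><a href="audio/act2.mp3">Act II</a></li>
    <li><a href="audio/act2.mp3#t=600">Act II, Scene 2</a></li>
  </ol>
</nav>
"""


class TocParser(HTMLParser):
    """Collect (href, label) pairs from every link that has an href."""

    def __init__(self):
        super().__init__()
        self.entries = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.entries.append((self._href, "".join(self._text).strip()))
            self._href = None


def parse_toc(html):
    parser = TocParser()
    parser.feed(html)
    return parser.entries


def split_target(href):
    """Split 'file.mp3#t=600' into ('file.mp3', 600.0); no fragment means 0.0."""
    url, _, fragment = href.partition("#")
    if fragment.startswith("t="):
        return url, float(fragment[2:].split(",")[0])
    return url, 0.0


def current_entry(entries, url, position):
    """Return the label of the last entry in `url` that starts at or before `position` seconds.

    Assumes entries appear in playback order, as they would in a table of contents.
    """
    label = None
    for href, text in entries:
        entry_url, start = split_target(href)
        if entry_url == url and start <= position:
            label = text
    return label
```

Because the labels come straight from the publisher's HTML, the platform displays the book's own structure instead of guessing chapter names from file names.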
Audiobooks are often considered accessible by default; talking books were designed for the blind, after all. However, print disability is a broader category than blindness, and we needed to address all of the possible users of the specification. For that reason, the Audiobooks specification will use the Synchronized Media specification to give content creators a way to sync audio and textual content for an optimal experience. The specification also allows content creators to reference accessibility metadata within the manifest (on top of any ONIX they may use) so that users can understand exactly what content they are receiving and whether it meets their needs.
The audiobooks specification is moving towards Recommendation status with the W3C. At this point in our process we are looking for feedback and implementations, so please reach out to us on our GitHub at https://github.com/w3c/wpub or to the editor at firstname.lastname@example.org (you can also reach me on Twitter @wendy_a_reid if you are really keen).