YouTube Gets Auto-Captioning From Google Speech Tech

The video sharing site now allows anyone to generate captions for videos with coherent spoken English.

Thomas Claburn, Editor at Large, Enterprise Mobility

March 4, 2010


In a move to make its massive store of video content more accessible, Google's YouTube is making automated caption generation available to all YouTube users.

YouTube initially added a caption feature in 2008. Last November, it introduced auto-captioning for a select group of partners.

Now, any video with a clear audio track can be captioned automatically, thanks to the speech-to-text algorithms that power Google Voice Search. The exception is video for which auto-captioning has been disallowed, an option available to some of YouTube's content partners.

What's more, those captions can be translated from English into one of 50 supported languages at the viewer's discretion.

At the moment, auto-captioning works only on videos with spoken English, but Google product manager Hiroto Tokusei says in a blog post that YouTube plans to support captioning in more languages in the months ahead.

In a related effort, Google is also working to turn Android phones into universal translators through a combination of speech-to-text and translation technology.

"For content owners, the power of auto-captioning is significant," said Tokusei. "With just a few quick clicks your videos can be accessed by a whole new global audience. And captions can make it easier for users to discover content on YouTube."

Captions, as text content, are useful to Google as a way to improve search relevancy. And with the volume of information that Google has to manage -- over 20 hours of video are uploaded to YouTube every minute -- every improvement helps.

Although speech-to-text conversion isn't perfect, Tokusei says that Google's technology is getting better. Video owners can also improve caption files by downloading them, making corrections, and then uploading them back to YouTube.

Other Google accessibility projects include a talking RSS reader for Android devices, support in Google Chrome for WAI-ARIA (the Accessible Rich Internet Applications Suite), and support for the AxsJAX framework.

About 650 million people live with a disability, according to the UN.

Professor Adrian Davis of the British MRC Institute of Hearing Research estimates that by 2015 more than 700 million people will have hearing loss of more than 25 dB, a consequence of aging and of exposure to noise, among other causes.

Last October, Google consolidated its accessibility resources at a single Web address.


About the Author


Thomas Claburn has been writing about business and technology since 1996, for publications such as New Architect, PC Computing, InformationWeek, Salon, Wired, and Ziff Davis Smart Business. Before that, he worked in film and television, having earned a not particularly useful master's degree in film production. He wrote the original treatment for 3DO's Killing Time, a short story that appeared in On Spec, and the screenplay for an independent film called The Hanged Man, which he would later direct. He's the author of a science fiction novel, Reflecting Fires, and a sadly neglected blog, Lot 49. His iPhone game, Blocfall, is available through the iTunes App Store. His wife is a talented jazz singer; he does not sing, which is for the best.
