LifeHacker recently posted something about PodZinger, a website that allows you to search the content inside video and audio files. That's right - using speech recognition technology, it converts the audio into text and makes that text searchable. I'm not sure about its accuracy - it's hard enough for people to work out what others are saying sometimes, let alone get a computer to do it. What happens if multiple people speak at once, like in a conversation? It's a nice step though, and they now have the ability to search inside YouTube's huge collection of videos.
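To make the idea a bit more concrete, here is a rough Python sketch of what searching inside audio could look like once the speech has been turned into text. It assumes the recogniser produces timestamped chunks of text; the sample transcript and the `search_transcript` function are made up for illustration, and this is not a description of how PodZinger actually works.

```python
# Hypothetical transcript: (start time in seconds, recognised text) pairs.
# This data is invented; a real speech recogniser would have to produce it.
transcript = [
    (0, "welcome to the show"),
    (12, "today we talk about video search"),
    (47, "speech recognition turns audio into text"),
    (95, "thanks for listening"),
]


def search_transcript(segments, query):
    """Return the start times of segments whose text contains the query."""
    query = query.lower()
    return [start for start, text in segments if query in text.lower()]


# A player could jump straight to each returned timestamp.
print(search_transcript(transcript, "speech recognition"))
```

The hard part is still the recognition itself: plain substring matching like this only finds what the recogniser heard correctly.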
That said, I think that apart from searching, another good use of this technology would be to summarise a video/audio file into a set of dot points, allowing people to jump to the sections they want to listen to or watch. If this were reliable, I'd use it, because right now I avoid many podcasts and videos: unlike text content, I can't skim through a video/audio file. I spend enough time reading my feeds as it is (I've already cut digg from my RSS feed) - add watching/listening to podcasts and I won't be doing anything else.
With more and more media getting out there, video/audio searching is definitely needed, though right now I still think there's a bit of work to do before it's accurate and reliable enough. I'm sure Google, Yahoo & Microsoft are working on similar technologies, not only for their online search engines but also for their desktop search products, so soon enough this should become a workable reality.