Johns Hopkins engineers are developing a speech recognition system that will help historians sift through thousands of hours of interviews, recorded in languages other than English, with Holocaust survivors and witnesses. The system is intended to be a key component of an innovative "audio search engine" that would allow historians and educators to comb easily through a vast collection of videotaped interviews to find personal accounts of specific Holocaust experiences.
The Hopkins engineers are part of a multi-institution team that just received a $7.5 million National Science Foundation grant to be disbursed over five years. The NSF grant was awarded to the Los Angeles-based Survivors of the Shoah Visual History Foundation to fund research efforts at Johns Hopkins, IBM and the University of Maryland. Advances made by the researchers in the automatic processing of video for search and retrieval in online systems will be applied to the Shoah Foundation's archive of more than 51,000 video interviews with Holocaust survivors and witnesses.
The Shoah Foundation, established by filmmaker Steven Spielberg in 1994 to videotape and preserve the testimony of survivors of the Holocaust, has recorded and digitized more than 116,000 hours of material in more than 30 languages. To make this archive useful, staff members have begun reviewing each English-language tape and indexing it according to critical times, places and incidents described in the interview.
Because this process is time-consuming and costly, particularly when it involves interviews in other languages, the NSF grant will support the development of a computer system that should be able to review the tapes and recognize important words and phrases. "Some of the technology for doing this with English-language recordings already exists," says Bill Byrne, a co-principal investigator for the project and an associate research professor in the Johns Hopkins Department of Electrical and Computer Engineering. "Our goal is to develop new techniques to streamline the process and lower the cost of developing systems in new languages."
During the project's first year, Byrne and his colleagues in the Whiting School of Engineering's Center for Language and Speech Processing will work on a speech recognition system for Holocaust interviews conducted in the Czech language. The team has already established ties with university researchers in Prague, who will assist with the effort from the Czech Republic. "After we build speech recognition systems to process Czech-language testimonies, we will explore opportunities to develop systems in other Central European languages," Byrne says.
Although speech recognition systems are available in most major languages, these systems work well only in carefully controlled environments. For example, the technology is most reliable when it is asked to recognize a limited number of words or phrases that are spoken slowly and clearly. But interviews with survivors of the Holocaust often contain speech that is difficult to understand because the speakers have strong accents or are highly emotional when they recount their experiences. Because of these characteristics, existing speech recognition systems are not capable of producing accurate transcriptions of most Holocaust testimony. An additional hurdle is that speech recognition systems are not available for some of the languages in the Shoah Foundation collection. For these reasons, Byrne says, additional basic research into speech and language technology is needed to produce systems that can process this diverse collection of speakers and languages.
Byrne points out that his team is not trying to build a system that will produce a word-for-word transcription of each interview, which may be beyond the reach of current technology. "We want to build a speech recognition system that is good enough to recognize most of the words that a historian or educator might enter into the audio search engine," he says.
For their portions of the project, IBM will develop speech processing and search technology, and the University of Maryland will develop cataloging and search technology.