Title: Optical Character Recognition for Most of the World's Languages
Speaker: Dr. Ashok C. Popat, Research Scientist, Google
Website: http://ewh.ieee.org/r6/scv/sps/
Abstract:
Much of interpersonal communication is linguistic, and people exchange linguistic information primarily through speech and through graphical symbolic representations of speech utterances, i.e., writing, printing, typing, etc. In the modern digital age we can represent written communication as sequences of bits grouped into Unicode code points, a scheme capable of representing many if not most of the world's extant languages. But much of the world's recorded information is still in visual rather than digital Unicode form; it is in books, newspapers, manuscripts, and letters; on post-its, whiteboards, street signs, or video captions. It may also take the form of a gesture on a touch pad or mobile phone screen, offering an alternative to keyboard text entry. The conversion of all of these representations to Unicode for use in the digital world is generally known as Optical Character Recognition (OCR). How might an OCR system be designed to handle all of the world's languages? I'll explain some challenges that make this nontrivial and describe an approach we're exploring at Google.
Biography:
Ashok C. Popat received the SB and SM degrees from the Massachusetts Institute of Technology in Electrical Engineering in 1986 and 1990, respectively, and the PhD from the MIT Media Lab in 1997. He is a Research Scientist at Google in Mountain View. Prior to joining Google in 2005, he worked at Xerox PARC. His interests include signal processing, data compression, machine translation, and pattern recognition. He enjoys running, skiing, sailing, hiking, and spending time with his wife and two daughters.