Move Over, Siri! This New Tool Could Take Conversational AI Far and Wide
Imagine you want to build a product like Google Home or Amazon Echo that can talk to humans. You teach it words, their definitions, and the grammar to string them together into sentences. But for the product to speak and understand when it’s spoken to, it needs to know pronunciations, too.
Now, researchers from The Graduate Center, CUNY, have created a tool to help with that. Called WikiPron, the tool mines pronunciation data from Wiktionary, the free online dictionary. The paper on their work appears in the anthology of the Association for Computational Linguistics.
The tool will be valuable for creating speech recognizers and synthesizers in new languages, especially less common languages that don’t have many resources available. Though other researchers have created similar tools, they didn’t release them to the public, whereas WikiPron is open source.
Authors on the paper include Ph.D. students Lucas Ashby and Yeonju Lee-Sikka; master’s students Elizabeth Garza, Alan Wong, and Sean Miller; and Professor Kyle Gorman.
Wiktionary has pronunciation data for over 900 languages. Each word entry gives the word’s pronunciation as a string of International Phonetic Alphabet (IPA) symbols. WikiPron extracts these transcriptions from the website.
The researchers used WikiPron to collect 1.7 million word pronunciations in 165 languages from Albanian to Zulu, even including the constructed language of Esperanto. Next, they used this database to train a system to predict the IPA representation of new words. This kind of artificial intelligence would be valuable for “smart” products or assistive text-to-speech technologies.
The major perk of using Wiktionary is its huge variety of languages. Anyone will be able to use WikiPron to create pronunciation databases for languages that don’t currently have many pronunciation resources available.
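For readers curious what working with such a pronunciation database might look like, here is a minimal sketch in Python. It assumes the data is stored as plain text with one entry per line: the word, a tab, and then the IPA symbols separated by spaces. The function name parse_lexicon and the three sample entries are illustrative only and are not taken from WikiPron's actual output.

```python
from collections import defaultdict


def parse_lexicon(lines):
    """Map each word to a list of pronunciations.

    Assumed input format (an assumption, not confirmed by the article):
    word<TAB>space-separated IPA segments, one entry per line.
    """
    lexicon = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        word, _, pron = line.partition("\t")
        if not pron:
            continue  # skip lines with no pronunciation column
        # A word may appear on several lines if it has several pronunciations.
        lexicon[word].append(pron.split())
    return dict(lexicon)


# Hypothetical sample entries, written by hand for illustration.
sample = [
    "cat\tk æ t\n",
    "read\tɹ iː d\n",
    "read\tɹ ɛ d\n",
]

lexicon = parse_lexicon(sample)
for word, prons in lexicon.items():
    print(word, [" ".join(p) for p in prons])
```

Keeping every pronunciation for a word, rather than only the first, matters for words such as "read" that are spelled one way but spoken in more than one way.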
As a challenge to other researchers, Gorman and colleagues also created a shared task in which participants must build artificial intelligence tools that can predict the pronunciation of unfamiliar words in 15 languages, based on databases created by WikiPron.