TAIPEI, Taiwan - Winbond Electronics Corp. claims to have developed a low-power system-on-chip that translates text into audible speech that sounds more natural than the robotic voice common to most computer-synthesized speech. The company is set to unveil the chip Thursday (Nov. 1).
Using English and Mandarin as its base languages, the chip is able to process text and generate voice samples by accessing a programmable database of acoustic elements drawn from human voice recordings. The chip could enable items such as a teddy bear that lulls a child to sleep by reading a bedtime story with the pre-programmed voice of Winnie the Pooh.
In the short term, however, Winbond is setting its sights on more sophisticated markets. Topping the list are power-sensitive mobile devices, such as PDAs, cell phone accessories and pagers. Also on the radar are automotive applications such as telematics systems and car stereos.
"We are looking at devices that don't necessarily have a really powerful processor on board," said Hezi Saar, product marketing manager at Winbond. "Usually most of the accessories for handheld devices don't have the power to run text-to-speech algorithms and they don't have the huge memory capacity to support this feature."
The company is hoping that growing markets in wireless mobile data delivery, where e-mail and short messaging applications are popular, will drive demand for accessories that translate text to speech. By integrating Mandarin support into the first version of the chip, Winbond is hoping to ride the coattails of a fast-growing mobile market in China.
Last summer, China eclipsed the United States as the largest market for cell phone users, and although wireless data networks are still nascent, their user base is growing rapidly. China is also making aggressive moves toward developing a third-generation wireless network. "It doesn't matter what kind of technology you have, either 3G, GPRS or whatever; you are going to get text, you are going to get e-mails or short messages, and there will be a need to convert this text to speech" by users too busy to digest the rising flow of streaming information, Saar said.
Winbond's new WTS701 chip integrates a text processor, a smoothing filter and a patented multilevel storage array, the company said. The chip scans incoming text, encoded in either ASCII or Unicode, and converts words, common abbreviations and numbers into speech patterns that are analyzed for phonetic interpretation.
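The text-processing step described above can be pictured with a minimal Python sketch. It is an illustration only, not Winbond's algorithm: the abbreviation table, the function names and the English-only number rules are all assumptions made for the example.

```python
import re
from typing import List

# Hypothetical abbreviation table; the chip's real dictionary is not public.
ABBREVIATIONS = {"dr.": "doctor", "st.": "street", "msg": "message"}

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def number_to_words(n: int) -> str:
    """Spell out a non-negative integer; values of 1000 or more are read digit by digit."""
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("" if ones == 0 else " " + ONES[ones])
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        tail = "" if rest == 0 else " " + number_to_words(rest)
        return ONES[hundreds] + " hundred" + tail
    return " ".join(ONES[int(d)] for d in str(n))


def normalize(text: str) -> List[str]:
    """Turn raw text into a list of speakable words."""
    words = []
    for token in text.lower().split():
        if token in ABBREVIATIONS:           # expand known abbreviations first
            words.append(ABBREVIATIONS[token])
            continue
        stripped = token.strip(".,!?;:")     # drop surrounding punctuation
        if re.fullmatch(r"[0-9]+", stripped):
            words.extend(number_to_words(int(stripped)).split())
        elif stripped:
            words.append(stripped)
    return words


print(normalize("Meet Dr. Lee at 42 Main St."))
```

A real text-to-speech front end would follow this step with a phonetic lookup for each word; that stage is omitted here because the article gives no detail on it.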
The end result is mapped into samples that are piped out of the chip's analog storage array. The signal is then smoothed by routing it through a low-pass filter and is available as an analog signal, or it can be passed through an encoder/decoder for digital audio output.
The multilevel storage (MLS) memory system allows each EEPROM cell to hold one of up to 256 different voltage levels, the equivalent of 8 bits, which is up to 8x the capacity of conventional memories, Saar said. "If you look at other TTS [text-to-speech] solutions you see enormous flash or memory accompanying the processor, but we can integrate it into one chip using MLS," he said. "That technology allows us to record a real human voice and then extract specific speech elements out of it."
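The arithmetic behind the 8x figure can be checked with a short Python sketch. It assumes a conventional cell distinguishes two levels (one bit) and that levels are evenly spaced; the function names are hypothetical and do not reflect how the WTS701 actually programs its cells.

```python
import math


def bits_per_cell(levels: int) -> float:
    """Information capacity of a cell that can hold `levels` distinct voltages."""
    return math.log2(levels)


def quantize(sample: float, levels: int = 256) -> int:
    """Map a sample in [0.0, 1.0] to the index of the nearest evenly spaced level."""
    clamped = min(max(sample, 0.0), 1.0)
    return round(clamped * (levels - 1))


mls_bits = bits_per_cell(256)        # multilevel cell, per the article
binary_bits = bits_per_cell(2)       # assumed conventional one-bit cell
print(mls_bits, mls_bits / binary_bits)
```

Under those assumptions, 256 levels give log2(256) = 8 bits per cell, eight times the single bit of a two-level cell, which matches the "up to 8x" claim.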
While only two languages are currently supported, Winbond said others will be developed according to market demand.
To prime interest in the chip, Winbond is giving potential product developers an SMS reader that connects to cell phones. It is also stressing the chip's low power consumption of 35 milliamps in active mode and 5 microamps in standby. The chip is housed in a 56-pin TSOP.
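For a rough sense of what those current figures mean in a battery-powered accessory, the following Python sketch estimates average draw and runtime for the chip alone. The duty cycle and the 700 mAh battery capacity are hypothetical values picked for illustration, and the rest of the system's consumption is ignored.

```python
ACTIVE_MA = 35.0      # active-mode current quoted by Winbond
STANDBY_MA = 0.005    # 5 microamps standby, expressed in milliamps


def average_current_ma(duty_cycle: float) -> float:
    """Average current when the chip is active for `duty_cycle` of the time."""
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be between 0 and 1")
    return duty_cycle * ACTIVE_MA + (1.0 - duty_cycle) * STANDBY_MA


def runtime_hours(battery_mah: float, duty_cycle: float) -> float:
    """Hours a battery of `battery_mah` would last powering only this chip."""
    return battery_mah / average_current_ma(duty_cycle)


# Hypothetical case: chip speaks 1 percent of the time on a 700 mAh battery.
print(average_current_ma(0.01), runtime_hours(700.0, 0.01))
```

Because standby current is thousands of times lower than active current, the average is dominated by how often the chip is actually speaking.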