Voice-generating technology hitting all the right notes
Artificial voice generators generally receive a lot of bad press, but this week was an exception. Two developments in the communications market were announced to worldwide acclaim: a silent-speech device incorporating an automatic translation tool with a twist, and a bespoke voice synthesizer that was showcased on the Oprah Winfrey Show.
Silence was certainly speaking volumes at the CeBIT trade fair in Germany this week, when scientists from the Karlsruhe Institute of Technology (KIT) demonstrated a device capable of ‘lipreading’ the movements made during speech and transforming them into sound. The technology in question is called Silent Sounds, which, according to AFP, works by electromyography – ‘monitoring the muscular movements produced when we speak and converting them into electrical pulses that can then be turned into speech.’
Currently, the device works through a number of electrodes attached to the skin, but it is anticipated that within a decade the technology will become an everyday feature of mobile phones, once it can be integrated into handsets. It is said to be 99 per cent accurate at the moment, but its success with different accents or technical language remains to be seen.
Silent Sounds does boast another feature, however: an automatic translation application that translates the input language into an output language of the user’s choice. At the moment it is mainly European languages that are on the menu; the developers explained that support for Chinese, for example, would require more development to incorporate ‘tone’.
But this type of technology is also important for the medical world and could help improve the quality of life for people who no longer have the ability to speak because of an operation, illness or accident. Such was the case for American film critic Roger Ebert, who lost his voice four years ago following an operation. This week he unveiled a bespoke piece of voice-generating software on the Oprah Winfrey Show, which has enabled him to speak again for the first time since the surgery that robbed him of his voice.
The software was developed by Edinburgh speech synthesis company CereProc, and what makes it stand out is that the computer-generated output sounds like Mr Ebert’s own voice rather than an electronic reproduction. The BBC reported that this was made possible by taking recordings of Mr Ebert’s voice, breaking them down into individual sounds, completing a transcription stage and finally reassembling everything. The user types out what he or she would like to say and the computer generates a ‘human’ voice. Mr Ebert commented that ‘It still needs improvements, but at least it sounds like me.’
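For readers curious about what ‘breaking down and reassembling’ might look like in practice, here is a toy sketch of the general idea, sometimes called concatenative synthesis. It is not the company’s actual method. The function names are made up for illustration, single letters stand in for speech sounds, short lists of numbers stand in for audio, and each letter of a transcript is assumed to cover an equal share of its recording.

```python
# Toy illustration only: real systems work on audio and phonetic units,
# not on letters and short lists of numbers.

def build_inventory(recordings):
    """Break transcribed recordings into one stored snippet per unit.

    `recordings` is a list of (transcript, samples) pairs. Each character
    of the transcript is assumed to cover an equal share of the samples.
    """
    inventory = {}
    for transcript, samples in recordings:
        if not transcript:
            continue  # nothing to align against
        step = len(samples) // len(transcript)
        for i, unit in enumerate(transcript):
            # Keep the first snippet seen for each unit.
            inventory.setdefault(unit, samples[i * step:(i + 1) * step])
    return inventory


def synthesize(text, inventory):
    """Reassemble stored snippets in the order given by the typed text."""
    output = []
    for unit in text:
        if unit not in inventory:
            raise KeyError(f"no recorded snippet for unit {unit!r}")
        output.extend(inventory[unit])
    return output


# Two pretend recordings with their transcripts.
recordings = [("ab", [1, 2, 3, 4]), ("ca", [5, 6, 7, 8])]
inventory = build_inventory(recordings)

# "Typing" a new word reuses snippets from both recordings.
result = synthesize("cab", inventory)
```

Here the typed word ‘cab’ never appears in either recording, yet it can still be assembled from pieces of both, which is the point of the approach: the more recordings there are to draw on, the more the output can be expected to sound like the original speaker.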
These innovative technologies could well become commonplace in the future, and what may seem like science fiction today may be an everyday communication tool once the products are market ready. For example, ongoing work on the Silent Sounds device includes developing a system that is operable in offices. Budding MI5 agents, military personnel, cinema-goers wishing to communicate from inside the theatre and even commuters will surely be adding it to their wish lists.
Further development and plenty of tweaking are undoubtedly the order of the day for these devices, and the jury is still out on how successfully the automatic translation application will deal with the nuances and complexities of language. However, from offering silent communication to those who want it for security reasons to the truly life-changing experience of giving people their voice back, voice-generating technology is certainly hitting all the right notes.