People who have amyotrophic lateral sclerosis (ALS) know how hard it can be to communicate, but a new app developed by Microsoft researchers, called GazeSpeak, may make speaking with the eyes a reality.
GazeSpeak, to be unveiled in May at the Conference on Human Factors in Computing Systems in Colorado, is an artificial intelligence (AI) mobile app that converts eye movements into speech.
The new app runs on the listener’s smartphone, which he or she points at the person with ALS. That person looks at a sticker on the back of the phone, which contains a grid instructing the speaker where to look, depending on where letters are grouped. The phone then tracks the speaker’s eye movements and registers them as letters.
“For example, to say the word ‘task’ they first look down to select the group containing ‘t’, then up to select the group containing ‘a’, and so on,” Xiaoyi Zhang, who developed GazeSpeak while he was still an intern at Microsoft, told New Scientist magazine’s Timothy Revell.
Then, as with smartphone texting, the app selects the letters from the different groups and predicts four words the speaker might be saying, reading the top one aloud. The system also takes into account personalized words, like names of people or places that a user might say more frequently.
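The idea of decoding a word from a sequence of letter groups can be illustrated with a minimal Python sketch. It is not GazeSpeak's actual code: the letter grouping, the tiny vocabulary with made-up usage counts, and the function names encode and predict are all assumptions chosen for illustration. The grouping only follows the article's example in which "t" is selected by looking down and "a" by looking up.

```python
# Hypothetical grouping of letters by gaze direction (an assumption,
# chosen so that "t" is "down" and "a" is "up", as in the example above).
GROUPS = {
    "up": "abcdef",
    "left": "ghijkl",
    "right": "mnopqr",
    "down": "stuvwxyz",
}
LETTER_TO_DIRECTION = {
    letter: direction
    for direction, letters in GROUPS.items()
    for letter in letters
}

# Made-up vocabulary mapping each word to a usage count. Personalized
# words (names, places) could simply be given higher counts.
VOCABULARY = {"task": 120, "wash": 80, "sash": 5, "test": 300, "desk": 90}


def encode(word):
    """Return the sequence of gaze directions that would spell `word`."""
    return tuple(LETTER_TO_DIRECTION[letter] for letter in word.lower())


def predict(gestures, vocabulary, top_n=4):
    """Return up to `top_n` words matching the gestures, most frequent first."""
    target = tuple(gestures)
    matches = [word for word in vocabulary if encode(word) == target]
    # Sort by descending count, breaking ties alphabetically.
    matches.sort(key=lambda word: (-vocabulary[word], word))
    return matches[:top_n]


candidates = predict(["down", "up", "down", "left"], VOCABULARY)
print(candidates)     # candidate words for the gesture sequence
print(candidates[0])  # the word that would be read aloud
```

Because several letters share each direction, one gesture sequence can match more than one word. In this sketch, ranking the candidates by how often each word is used stands in for the app's word prediction.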
To indicate he or she is about to finish a word, the speaker may wink or look straight ahead for two seconds.
“We’re using computer vision to recognize the eye gestures, and AI to do the word prediction,” said Meredith Morris of Microsoft Research.
Until now, most people with ALS have used the low-tech version of a similar system, with a board displaying letters grouped in a grid. An interpreter then tracks the speaker’s eye movements as he or she selects letters.
In tests with 20 people without ALS who tried both methods, GazeSpeak was faster than traditional boards. It took an average of 78 seconds to complete a sentence using GazeSpeak, compared with 123 seconds using the boards.
Some ALS patients and their interpreters also tested the app, and one patient who tried the system typed a sentence in just 62 seconds. He said the process would have been even faster in real life, because his interpreter could easily have predicted what he was likely to say.
Most other eye-tracking systems currently available rely on infrared cameras, which are often expensive and bulky and work poorly in sunlight. GazeSpeak is portable and comparatively cheap, as it only requires a smartphone.
Apple’s app store will soon carry GazeSpeak. In addition, its developers will make the source code available for free, so that other software developers can help improve it.