Understanding voice recognition technology

Since its humble beginnings in the 1950s, voice recognition technology has made great strides, but there are still many challenges to making it work the way most people think it should work. (Source: nevarpp via 123RF)
(DATA DOCTORS) -

Q: Is there a voice recognition system that can replace typing that actually works well?

A: As much as technology has advanced since Bell Labs developed "Audrey" in the 1950s, a system that could only recognize digits spoken by a single user, we’re still far from what you’ve seen in science fiction movies.

Commands vs. dictation
Voice command platforms, like the ones many automated phone systems use, are reasonably effective because they severely limit the number of verbal commands you can use.
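
To see why a short command list helps, here is a minimal sketch in Python. It is only an illustration: it works on text that has already been transcribed rather than on audio, the `COMMANDS` menu and the `match_command` helper are made up for this example, and the standard library's `difflib` stands in for whatever matching a real phone system does.

```python
import difflib

# Hypothetical menu for an automated phone system (not taken from any real product).
COMMANDS = ["balance", "payments", "operator"]


def match_command(heard, cutoff=0.6):
    """Return the closest allowed command for a possibly misheard word, or None.

    With only a few valid answers, a slightly garbled word can still be
    snapped to the nearest command instead of being taken literally.
    """
    matches = difflib.get_close_matches(heard.lower(), COMMANDS, n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(match_command("bilance"))  # a misheard "balance"
print(match_command("xyzzy"))    # nothing on the menu is close
```

Open dictation has no such short list to fall back on, so every misheard word has to be resolved against the whole language.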

Natural speech recognition is what most people want, and that’s a challenge that has yet to be met well enough for it to be widely adopted.

We’re surrounded by options from Apple, Google and Amazon that offer some form of voice command or recognition, but they are far from perfect, as we all know well.


Accurate dictation has been the challenge that many very sophisticated companies, including IBM, have been trying to solve for over 60 years.

To put the problem into perspective, a system with 90 percent accuracy means that, on average, every 10th word is wrong; 95 percent accuracy gives us a 1 in 20 ratio, and even at 98 percent, we’re still looking at roughly 1 in 50 words being incorrect.

With an average paragraph in the 100-150 word range, you can start to see how the time we may save in generating the text can get eaten up in editing what was captured.
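
The arithmetic is easy to check for yourself. The short Python sketch below is a rough illustration, not a model of any real product: the function names are made up for this example, and it assumes every word has the same, independent chance of being recognized correctly, which real speech does not strictly follow.

```python
def expected_errors(word_count, accuracy):
    """Average number of wrong words in a passage of `word_count` words.

    Assumes each word is independently correct with probability `accuracy`
    (a simplification made for this illustration).
    """
    return word_count * (1.0 - accuracy)


def chance_of_clean_passage(word_count, accuracy):
    """Probability that every word in the passage is right, under the same assumption."""
    return accuracy ** word_count


for accuracy in (0.90, 0.95, 0.98):
    for words in (100, 150):
        errors = expected_errors(words, accuracy)
        clean = chance_of_clean_passage(words, accuracy)
        print(f"{accuracy:.0%} accuracy, {words} words: "
              f"about {errors:.1f} errors, {clean:.1%} chance of no errors")
```

Under that assumption, even a 98 percent accurate system averages two to three wrong words per paragraph, and a 100-word paragraph comes out error-free only about 13 percent of the time.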


Throw in the way our voices change when we’re sick, various accents, the speed at which we speak and a host of other variables, and you start to understand how much more sophisticated a processor the human brain is.

The context problem
Another huge challenge is context, both in command and dictation technology. Google recently started to bridge the context gap with its latest Google Assistant technology that allows you to have more of a conversation.

For example, you can ask, "Do I need an umbrella today?" and after it responds, you can follow up with, "What about tomorrow?"

Another advance in context is being made possible by what many consider the "creepy" factor of today’s technology. Because our smartphones can remember virtually everything we’ve done in the past and can consider our current location or what we’ve been searching for online or in a mapping program, they can use this additional information to better understand our verbal commands.


Tips for being successful
If dictation is your key need, the company that’s been at it the longest, as far as a consumer product goes, is Dragon NaturallySpeaking.

As good as the program is, expecting to install the software and have it magically become your new way of "typing" will guarantee failure. You are, in a sense, going to be learning a new language.

If you aren’t willing to take the time to train yourself to speak to your computer, you shouldn’t bother spending the money.

You’ll also need to make sure that you have the proper hardware to be successful, such as enough processing power, RAM and a good microphone, so be sure to review the system requirements before taking the plunge.



Copyright 2017 KPHO/KTVK (KPHO Broadcasting Corporation). All rights reserved.