Modev Blog

Is Voice Authentication Secure?

Odds are that you've unlocked your phone using either your finger or your face recently. That's called biometric authentication. The idea behind biometric authentication is to use a fixed physical attribute (one that doesn't continuously change, like your face or your fingerprints) to verify your identity. It's convenient, quick, and, best of all, it doesn't require you to remember a password or PIN.

But biometric authentication isn't just about fingerprints and faces. There's a third player: voice. While voice authentication isn't as popular as fingerprint or facial recognition, its use in biometric authentication is growing. And while the tech is perhaps not 100% ready to replace fingerprint or facial recognition, there are nonetheless compelling reasons to use it as a form of authentication. And as artificial intelligence keeps advancing, voice authentication will undoubtedly go mainstream.

Voice Authentication: How Does it Work?

Voice authentication uses one's voice, rather than a password (which can be pretty weak), to uniquely identify individuals. This typically works in two steps.

First, we need to create a template of our voice for the authentication system to use. This template is usually called a 'stored model voice print,' and it works as follows: You recite a predetermined series of words/sentences. As the vocal recognition system records your voice, it reduces each spoken word to shorter segments made up of dominant frequencies called formants. Each formant comprises several tones that, taken together, identify your unique voice print. This recording process is repeated several times to ensure that the system can deal with any natural variability in your speech patterns.

The second part is you speaking to your device to authenticate yourself. It, again, records your voice as you speak and makes another voice print, which it will compare to the original print. If they match, you're authenticated. If they don't, you're not.
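To make the two steps concrete, here is a minimal sketch in Python. It assumes each recording has already been reduced to a short list of numbers (a "feature vector"); how real systems extract those numbers from audio is out of scope here. The function names, the made-up numbers, the use of cosine similarity, and the 0.95 threshold are all assumptions for illustration, not how any particular product works.

```python
import math

def average_print(samples):
    """Enrollment: average several feature vectors into one stored voice print."""
    count = len(samples)
    return [sum(values) / count for values in zip(*samples)]

def cosine_similarity(a, b):
    """Similarity of two vectors, from -1 (opposite) to 1 (same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def is_match(stored_print, new_print, threshold=0.95):
    """Authentication: accept only if the new print is close enough to the stored one."""
    return cosine_similarity(stored_print, new_print) >= threshold

# Hypothetical feature vectors from three enrollment recordings (made-up numbers).
enrollment = [
    [0.9, 0.1, -0.3],
    [1.0, 0.2, -0.2],
    [0.8, 0.0, -0.4],
]
stored = average_print(enrollment)

same_speaker = [0.85, 0.15, -0.25]   # close to the enrolled voice
other_speaker = [-0.2, 0.9, 0.4]     # a very different voice

print(is_match(stored, same_speaker))   # True
print(is_match(stored, other_speaker))  # False
```

Averaging the enrollment recordings is one simple way to absorb natural variability in speech; the threshold then controls the trade-off between rejecting the real user and accepting an impostor.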

What Are the Benefits of Voice Authentication?

For one, voice authentication lends itself to remote verification: it works over an ordinary phone call. It also allows users to be authenticated without sharing any personal information over the phone. Nor does any personal or biometric data need to be stored on an end-user device.

Another benefit is how it levels the playing field for certain demographics, like the elderly and the disabled. Voice authentication enables these groups to access their accounts without needing to remember/type in a password or provide any personal information.

We also like the fact that no special equipment (smartphone, specific application) is required. Depending on the service, you only need a regular phone line or a web browser. That simply makes our day-to-day more inclusive as technology no longer acts as an access barrier.

Is Voice Authentication Secure?

The short answer is 'yes,' with a 'but'… Let's unpack that a little.

So voice authentication is secure. As long as the server on which your voice print is stored is secure, and your voice is encrypted as it travels over phone lines or the internet to reach that server, voice authentication is considered secure, or at least as secure as any other means of authentication. That was the 'yes' part. Now onto the 'but'…

The vulnerability of voice authentication resides in our ability to synthesize voice, often called voice cloning or deepfakes. The same tech that enables voice recognition also allows for voice cloning. Voice cloning is the process of taking portions of recorded speech and applying artificial intelligence (AI) to extract the speech patterns from those speech samples. Once the AI has analyzed the data, it can output highly realistic speech that the person who recorded the original speech samples never uttered.

The risk comes from the relative ease with which we can spoof someone's speech today. There's even anecdotal evidence of people using speech from YouTube videos to trick a voice authentication system. And that's a problem.

To tackle that issue, the industry has so far developed two main ways to prevent fraud in voice authentication systems: liveness detection and continuous authentication.

  • Liveness detection - Liveness detection attempts to validate that the voice sample (or fingerprint, or other biometric) comes from a live human being rather than from a bot or a recording. For voice, that means checking things like whether irregularities in the audio are natural speech or artifacts that betray a recording being played back.
  • Continuous authentication - Continuous authentication verifies an individual's identity repeatedly over the length of a session instead of only once. Within the context of voice authentication, the idea is to detect potential issues like callers changing in the middle of a phone call, for example.
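Continuous authentication in particular is easy to picture in code. The sketch below assumes the call is split into short segments and that each segment has already been scored against the stored voice print (higher means a closer match). The function name, the 0.8 threshold, and the "two bad segments in a row" rule are hypothetical choices made for this example.

```python
def monitor_session(segment_scores, threshold=0.8, max_failures=2):
    """Return the index of the segment where the session gets flagged, or None.

    A single low score is tolerated (a cough, background noise); the session
    is flagged only after `max_failures` consecutive low-scoring segments.
    """
    failures = 0
    for index, score in enumerate(segment_scores):
        if score >= threshold:
            failures = 0          # a good segment resets the counter
        else:
            failures += 1
            if failures >= max_failures:
                return index      # the caller may have changed mid-call
    return None

# Hypothetical scores: the speaker changes after the third segment.
speaker_changes = [0.93, 0.91, 0.88, 0.42, 0.39, 0.41]
# Hypothetical scores: one noisy segment, same speaker throughout.
single_dip = [0.90, 0.50, 0.90, 0.92]

print(monitor_session(speaker_changes))  # 4
print(monitor_session(single_dip))       # None
```

Requiring consecutive failures is a design choice: it trades a slightly slower reaction for fewer false alarms on a noisy line.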

So How Should We Use Voice Authentication?

Well, the best way to use voice authentication is actually the way it's being used today: as a factor in multi-factor authentication (MFA). MFA is a method used to secure your authentication by requiring at least two factors of proof, like a password and a one-time PIN from a token authenticator device, for example.

MFA has three main types of factors:

  1. Knowledge - Something you know (a password)
  2. Possession - Something you have (a token authenticator device)
  3. Inherence - Something you are (a biometric)

Voice authentication is in the inherence category (something you are).
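One detail worth spelling out: MFA means two different types of factor, not just two checks. A voice print plus a fingerprint are both inherence, so together they still count as one factor type. Here is a small Python sketch of that rule; the `Factor` enum and `satisfies_mfa` function are names invented for this illustration.

```python
from enum import Enum

class Factor(Enum):
    KNOWLEDGE = "something you know"    # e.g. a password
    POSSESSION = "something you have"   # e.g. a token authenticator device
    INHERENCE = "something you are"     # e.g. a voice print

def satisfies_mfa(verified_factors, required=2):
    """True if the passed checks span at least `required` distinct factor types."""
    return len(set(verified_factors)) >= required

# Password + voice: two different factor types.
print(satisfies_mfa([Factor.KNOWLEDGE, Factor.INHERENCE]))   # True
# Voice + fingerprint: two checks, but both are inherence.
print(satisfies_mfa([Factor.INHERENCE, Factor.INHERENCE]))   # False
```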

Attacks on accounts protected with MFA are much more difficult to pull off. The odds of an attacker being able to compromise two or more factors in the authentication process aren't nil, but they're rather low. And voice authentication is an excellent and secure factor within an MFA scheme.

Wrap Up

So voice authentication is a secure means of identification, but it has vulnerabilities related to voice spoofing. We may well solve that problem in the long term, but we're not there yet. Still, until that day comes, voice authentication remains an excellent way to secure your accounts - if only as a factor within MFA. But I don't doubt that the day will come when voice is used as a primary means of identification. 

Good things come to those who wait.

About Modev

Modev was founded in 2008 on the simple belief that human connection is vital in the era of digital transformation. Today, Modev produces market-leading events such as VOICE Global, presented by Google Assistant, VOICE Summit, and the award-winning VOICE Talks internet talk show. Modev staff, better known as "Modevators," include community building and transformation experts worldwide. To learn more about Modev, and the breadth of events offered live and virtually, visit modev.com.
