Can You Fool Voice Biometrics?

Biometrics is widely regarded as the strongest authentication method because it ties an identity to an actual individual (‘what you are’ rather than ‘what you know’ or ‘what you have’).

While organizations aim to stay up to date and deploy the most secure authentication technologies, fraudsters are constantly looking for ways to defeat authentication and hack systems, including those secured with biometrics. We have all watched movies where someone cuts off an enemy’s finger to access a secret room, or removes an eyeball so it can be read by a retina scanner and grant access to classified information. While reality is normally not as violent, fraudsters do find ways to fool biometric systems. Using a photo or a mask against face recognition, or a fake finger against fingerprint access, are just a couple of examples.

Focusing on voice biometrics for contact centers, I always try to keep in mind what a fraudster would do. Fraudsters looking to fool voice biometrics commonly rely on two practices: playing a recording that matches the target’s voiceprint, and using synthetic speech.

There are several ways to tackle these fraudulent practices. In this post I would like to stress the importance of two of them: liveness detection and continuous authentication.

Liveness detection

Liveness detection is an anti-spoofing measure built into a biometric system, meant to confirm that the biometric sample submitted for authentication comes from a real, live person.

Liveness detection is critical against presentation attacks, where fraudsters try to break into a biometric system using fake fingerprints, masks, high-definition pictures and so on. When it comes to voice biometrics, the most common spoofing attempt is playback: using an audio recording of the victim. A more recent way to try to overcome voice biometrics is the use of synthetic speech tools. Following Adobe’s 2016 teaser of Project Voco and Google’s WaveNet research, companies like Voysis and Lyrebird made a leap forward in generating artificial voice. While synthetic speech still has its limitations and isn’t yet accessible to everyone, it is quite clear that fraudsters will be able to use it at some point in the future.

Playback and synthetic speech are not yet common practices among fraudsters (and to be honest, if I were a fraudster, I would simply target organizations that don’t use voice biometrics; their systems are much easier to hack…). Still, we, as authentication providers, keep an eye on these evolving technologies to ensure they won’t put our customers at risk. That is why we treat liveness detection as a top priority and keep our products up to date against fraud manipulations. To do this, NICE Real-Time Authentication extracts a set of audio features that confirm the speaker is a real person rather than a recording or a synthetic rendition of an individual’s voice.
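To illustrate the general idea only (this is not NICE’s implementation or product logic), the sketch below assumes a couple of hand-crafted spectral features and a hypothetical pre-trained classifier that separates live speech from playback or synthetic audio:

```python
# Illustrative sketch only -- not an actual product algorithm. It assumes a
# hypothetical pre-trained binary classifier (any object with a scikit-learn
# style predict() method) that separates live speech from spoofed audio.
import numpy as np

def extract_liveness_features(samples: np.ndarray, sample_rate: int) -> np.ndarray:
    """Compute two toy spectral features sometimes associated with
    replay and synthesis artifacts: high-band energy ratio and spectral flatness."""
    spectrum = np.abs(np.fft.rfft(samples))
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)

    # Energy above 4 kHz relative to total energy: playback through a
    # loudspeaker often distorts this balance.
    high_ratio = spectrum[freqs > 4000].sum() / (spectrum.sum() + 1e-9)

    # Spectral flatness (geometric mean / arithmetic mean of the spectrum):
    # synthetic speech can sound unnaturally "clean".
    flatness = np.exp(np.mean(np.log(spectrum + 1e-9))) / (np.mean(spectrum) + 1e-9)

    return np.array([high_ratio, flatness])

def is_live_speech(samples: np.ndarray, sample_rate: int, classifier) -> bool:
    """Return True if the (hypothetical) classifier labels the audio as
    coming from a live speaker rather than a recording or TTS engine."""
    features = extract_liveness_features(samples, sample_rate).reshape(1, -1)
    return bool(classifier.predict(features)[0] == 1)
```

Real systems use far richer feature sets and models than this; the point is simply that liveness is decided from properties of the audio itself, not from the voiceprint match alone.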

Continuous authentication

Continuous authentication is a technology that continuously verifies an individual’s identity throughout a session rather than just at the entry login point.

Most authentication methods applied at the contact center - such as Knowledge-Based Authentication (KBA), One-Time Password (OTP) and even facial biometrics where applicable - verify the caller’s identity at the entry point. From that point onwards, the assumption is that the person on the line is the same individual who was successfully authenticated at the entry point. But what if someone is authenticated, and then someone else takes over the interaction? Continuous authentication addresses exactly that!

Continuous authentication changes the perspective of authentication from a one-time event to an ongoing process: the system constantly checks that the person on the call is indeed the same person who was authenticated, perhaps just a few seconds earlier.
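A minimal sketch of that loop, assuming a hypothetical verify(voiceprint, chunk) scoring function and an illustrative acceptance threshold (neither is part of any real product API):

```python
# Minimal sketch of the continuous-authentication idea, not a product API.
from typing import Callable, Iterable

MATCH_THRESHOLD = 0.8  # illustrative score above which the speaker is accepted

def continuously_authenticate(voiceprint,
                              audio_chunks: Iterable,
                              verify: Callable) -> bool:
    """Re-verify the caller on every chunk of call audio instead of only once.

    `verify(voiceprint, chunk)` is assumed to return a similarity score in
    [0, 1]. If any chunk falls below the threshold (for example, a different
    speaker has taken over the call), the session is flagged immediately.
    """
    for chunk_number, chunk in enumerate(audio_chunks, start=1):
        score = verify(voiceprint, chunk)
        if score < MATCH_THRESHOLD:
            print(f"Possible speaker change at chunk {chunk_number} (score={score:.2f})")
            return False
    return True
```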

Most vendors still assume that authenticating once, at the beginning of the call, is sufficient, and do not consider cases in which the speaker changes during the call. That assumption is not a safe one, especially when fraudsters are involved.

Having worked with contact centers for more than 30 years, NICE is familiar with the risks associated with voice authentication and has made continuous authentication an integral part of its voice biometrics solution.

At this point I’m going to stop myself from revealing any more insider information on how we do that. After all, fraudsters may read this post too…