Researcher breaks Google CAPTCHA using speech-to-text AI


The battle between bots and cloud services just took another turn as a researcher broke Google's CAPTCHA technology using artificial intelligence (AI) — again.

CAPTCHA stands for Completely Automated Public Turing test to tell Computers and Humans Apart. It uses puzzles designed to be solvable only by humans to stop automated bots from signing into accounts or registering for new ones. The problem is that AI allows computers to perform more human-like tasks, and security researchers have repeatedly used this fact to help computers solve CAPTCHAs.

Now, researcher Nikolai Tschacher claims to have solved the second version of Google's CAPTCHA implementation, known as reCAPTCHA. By default, this system presents a visual puzzle, asking users to select the portions of an image containing a certain object. However, there is an audio option for visually impaired users that lets them type in the words they hear.

"The idea of the attack is very simple," says Tschacher in his blog post. "You grab the mp3 file of the audio reCAPTCHA and you submit it to Google's own Speech to Text API."

The post includes a video demonstration of the attack, which shows the computer 'listening' to an audio snippet of the words "fastest drives currently" from reCAPTCHA and automatically submitting them to the Speech to Text API. The API returns the correct text, and the computer enters it automatically into the reCAPTCHA.

Google has updated its technology repeatedly over the years to stay one step ahead of researchers like Tschacher. A team at the University of Maryland broke the search giant's system using the same technique in 2017. They published the code for their technique, called unCAPTCHA, and Google updated reCAPTCHA to evade their algorithm.

The update thwarted unCAPTCHA, but Tschacher's technique modifies the same code to make it work again with a 97% success rate. Other researchers have published anti-CAPTCHA research, including one team that unveiled an attack on Google's system at Black Hat Asia in 2016. California-based AI company Vicarious also created software that broke CAPTCHAs via visual processing in 2017.

This is just another step in the cat-and-mouse game between CAPTCHA techniques and attackers, which seems to be taking two distinct paths. One of them is to passively analyze user behavior, including elements such as their typing cadence, which areas of the sites they visit in what order, and their mouse or touch activity.

Google has already implemented behavioral analysis in the third version of its bot-detection system, which examines how visitors interact with a website to detect bots. It uses a baseline of real traffic to individual websites to determine what's normal, enabling it to spot unusual activity.
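Google has not published how reCAPTCHA scores traffic, so the following is only a toy illustration of the baseline idea, not its actual method. It assumes a single signal (requests per minute from one visitor) and flags values that sit more than a chosen number of standard deviations from the site's normal traffic. The function names, the sample numbers, and the threshold of 3.0 are all hypothetical.

```python
from statistics import mean, stdev


def anomaly_score(baseline, observed):
    """Return how many standard deviations `observed` sits from the baseline mean."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: any deviation at all is unusual.
        return 0.0 if observed == mu else float("inf")
    return abs(observed - mu) / sigma


def looks_automated(baseline, observed, threshold=3.0):
    """Flag activity that falls far outside what the baseline says is normal."""
    return anomaly_score(baseline, observed) > threshold


# Hypothetical requests-per-minute samples from ordinary visitors to one site.
baseline_rpm = [4, 6, 5, 7, 5, 6, 4, 5]

print(looks_automated(baseline_rpm, 6))    # a visitor within the normal range
print(looks_automated(baseline_rpm, 120))  # a client sending far more requests
```

A real system would combine many signals, such as typing cadence and mouse movement, rather than relying on one number.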

The other option is to make the tests harder, using games or other challenges that are more difficult to solve. However, to be inclusive, those tests would have to be accessible to visually impaired users.

Threatpost reports that Tschacher's unCAPTCHA revision even works on reCAPTCHA Version 3. In an interview with the publication, Tschacher warned the technique might be challenging to scale because Google uses rate-limiting to stop bots hammering its systems with too many queries. The company also fingerprints the software agents accessing its system.
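The article does not say how Google implements its rate-limiting. One common approach is a token bucket, sketched below as an assumption rather than a description of Google's system: each client gets a bucket that refills at a steady rate, and a request is refused once the bucket is empty. The class name and parameters are hypothetical, and the clock is injectable so the behavior can be checked without waiting.

```python
import time


class TokenBucket:
    """Allow short bursts up to `capacity`, then limit to `refill_per_second`."""

    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.clock = clock
        self.tokens = float(capacity)  # start with a full bucket
        self.last = clock()

    def allow(self):
        """Return True if the request may proceed, False if it is rate-limited."""
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        # Add the tokens earned since the last call, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Usage: a burst of three requests is allowed, the fourth is refused.
fake_time = [0.0]
bucket = TokenBucket(capacity=3, refill_per_second=1.0, clock=lambda: fake_time[0])
burst = [bucket.allow() for _ in range(4)]
print(burst)
```

In practice a service would keep one bucket per client, keyed by something like an IP address or the fingerprint of the software agent.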

Danny Bradbury

Danny Bradbury has been a print journalist specialising in technology since 1989 and a freelance writer since 1994. He has written for national publications on both sides of the Atlantic and has won awards for his investigative cybersecurity journalism work and his arts and culture writing. 

Danny writes about many different technology issues for audiences ranging from consumers through to software developers and CIOs. He also ghostwrites articles for many C-suite business executives in the technology sector and has worked as a presenter for multiple webinars and podcasts.