GUEST ESSAY: A breakdown of Google’s revisions to streamline its ‘reCAPTCHA’ bot filter

By Emma Yulini

Most of us internet users are familiar with CAPTCHAs: challenges or tests designed to filter out bots (automated programs) and let only legitimate human users in.


The basic principle behind CAPTCHA is fairly simple: the test must be as difficult as possible (if not impossible) for bots to solve, but at the same time easy enough for human users that it doesn't hurt the user experience.

This principle is precisely where the trouble with CAPTCHAs begins. Today’s bots are really advanced, and AI is now pretty reliable at solving CAPTCHAs. So, we have to make CAPTCHAs more difficult, but we all know how annoying CAPTCHA challenges can be, and we’ll simply bounce from a site that features an even more difficult one.

This is why Google invented the invisible reCAPTCHA and other newer versions of reCAPTCHA.

reCAPTCHA

What actually is reCAPTCHA? Simply put, there are many different companies offering CAPTCHA solutions at the moment, and reCAPTCHA is Google’s brand of CAPTCHA solution.

If you’ve been on the internet long enough, you might remember the first iteration of reCAPTCHA (now called reCAPTCHA v1), where we were shown a pair of words, one of them scratched, distorted, or otherwise obscured so that only a human user could identify it.

This distortion method, meant to fool a bot’s OCR (Optical Character Recognition), was a big innovation back then. It convinced Google to purchase the reCAPTCHA company in 2009, and reCAPTCHA v1 continued to be reliable and popular into the 2010s.

That is, however, no longer the case. reCAPTCHA v1 was officially shut down in 2018 and was replaced by reCAPTCHA v2, which includes the newer invisible reCAPTCHA.

Invisible reCAPTCHA

The invisible reCAPTCHA is actually just one of several different versions of reCAPTCHA v2.


Most likely you’ve stumbled upon the “I’m not a robot” reCAPTCHA checkbox, which is also a type of reCAPTCHA v2. Google calls this version the “No CAPTCHA reCAPTCHA.”

As users, we only need to click on the checkbox, but it is actually a pretty advanced technology. To put it simply, Google analyzes the client’s behavior before, during, and after clicking the checkbox to determine whether the client is a human user. Google looks at various signals, reportedly including your browser history (if you are using Chrome), mouse movement, typing patterns, and so on.

Only if Google is still unsure whether you are a human or a bot will you be presented with a “select all images with xxxx” challenge.

The invisible reCAPTCHA, however, is even more advanced, and all the user sees is a small reCAPTCHA badge on the page.

Invisible reCAPTCHA uses the same method of analyzing the client’s behaviors and activities, but no user interaction is required, not even clicking the checkbox.

Similarly, only when Google is not sure whether the client is a bot or a human will the client be challenged with a CAPTCHA test.

reCAPTCHA v3

While Google’s algorithm for analyzing human behavior is extremely good, the invisible reCAPTCHA and the rest of reCAPTCHA v2 aren’t perfect, and the CAPTCHA test shown when Google isn’t sure whether you are a human user has drawn a lot of complaints from users.

Not to mention, there are CAPTCHA farm services that are used by cybercriminals to solve the reCAPTCHA v2 challenge. The human workers of the CAPTCHA farm will solve the reCAPTCHA challenge and then send the response token back to the hacker. Voila! Now we’ve rendered reCAPTCHA v2 powerless.

(Let’s admit it, those “click on xxx images” tests are really annoying and time-consuming).

This is why Google released reCAPTCHA v3.

reCAPTCHA v3 uses even more advanced technologies to monitor how a client interacts with the website. While the actual process is very complicated, in general Google monitors a user’s interactions, and for each action reCAPTCHA v3 returns a score between 0.0 and 1.0. The closer the score is to 0.0, the more likely the client is a bot; the closer it is to 1.0, the more likely the client is a human.
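To make the scoring a little more concrete, here is a minimal Python sketch of how a site's server might read that score. It assumes Google's documented siteverify endpoint and its JSON response fields (success, score, action). The helper names parse_verification and fetch_score, and the choice to treat any failed or mismatched check as a score of 0.0, are illustrative assumptions and not part of reCAPTCHA itself.

```python
import json
import urllib.parse
import urllib.request

# Google's server-side verification endpoint for reCAPTCHA tokens.
VERIFY_URL = "https://www.google.com/recaptcha/api/siteverify"


def parse_verification(payload: dict, expected_action: str) -> float:
    """Return the v3 score from a siteverify response.

    Assumption for this sketch: a failed verification or an unexpected
    action name is treated as the most bot-like score, 0.0.
    """
    if not payload.get("success"):
        return 0.0
    if payload.get("action") != expected_action:
        return 0.0
    return float(payload.get("score", 0.0))


def fetch_score(secret_key: str, token: str, expected_action: str) -> float:
    """Send the client's token to Google and return the resulting score."""
    data = urllib.parse.urlencode(
        {"secret": secret_key, "response": token}
    ).encode()
    with urllib.request.urlopen(VERIFY_URL, data=data, timeout=5) as resp:
        payload = json.load(resp)
    return parse_verification(payload, expected_action)
```

Keeping the parsing separate from the network call means the scoring logic can be checked without contacting Google, for example by passing parse_verification a hand-written dictionary.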

Problem for Webmasters

While reCAPTCHA v3 is really good in theory, setting it up on a website can be a nightmare for webmasters.

The webmaster must define score thresholds for every possible action on the website, deciding for each one between three possible responses:

• Pass the user as a legitimate human user and serve their requests

• Provide a reCAPTCHA v2 challenge when the score is not definitive

• Block the user immediately

For example, if a user’s score is below 0.3 for a certain action, which response should we give? What about if it’s 0.5?

As you can imagine, this can be a very time-consuming and difficult process, especially for larger websites.
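One way to keep those decisions manageable is to hold the thresholds in a single table. The Python sketch below is only a rough illustration, not a recommended configuration: the action names, the threshold numbers, and the decide helper are all hypothetical.

```python
from enum import Enum


class Response(Enum):
    ALLOW = "allow"          # pass the user as a legitimate human
    CHALLENGE = "challenge"  # fall back to a reCAPTCHA v2 challenge
    BLOCK = "block"          # block the user immediately


# Hypothetical per-action thresholds: (block_below, allow_from).
# Real values would have to be tuned against each site's own traffic.
THRESHOLDS = {
    "login": (0.3, 0.7),
    "comment": (0.2, 0.5),
}
DEFAULT_THRESHOLDS = (0.3, 0.5)


def decide(action: str, score: float) -> Response:
    """Map a reCAPTCHA v3 score for an action to one of three responses."""
    block_below, allow_from = THRESHOLDS.get(action, DEFAULT_THRESHOLDS)
    if score < block_below:
        return Response.BLOCK
    if score >= allow_from:
        return Response.ALLOW
    return Response.CHALLENGE
```

With these made-up numbers, a login attempt scoring 0.5 falls between the two thresholds and gets a v2 challenge, while a comment with the same score is allowed through.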

Also, reCAPTCHA v3 relies on analyzing users’ interactions on the website, and to do this accurately it first needs to analyze a large enough volume of data. Otherwise, it won’t be able to objectively measure what normal human interaction looks like on that specific site. This can be an issue if the site is relatively new and has relatively little traffic.

Closing Thoughts

While invisible reCAPTCHA and reCAPTCHA v3 are pretty reliable, no CAPTCHA solution is 100% effective, and sooner or later there’ll be more advanced AIs that can render them totally useless.

So, if you really want to protect your site from spam, DDoS, and other malicious bot activities, you’ll need a bot management solution: a powerful alternative to CAPTCHA solutions that can protect your site in real time without disrupting user activities.

About the essayist: Emma Yulini is a professional blogger who has written more than 500 articles on a variety of tech topics.
