we recommend including reCAPTCHA verification on forms or actions as
well as in the background of pages for analytics.
Note: You can execute reCAPTCHA as many times as you'd like with
different actions on the same page.
(https://developers.google.com/recaptcha/docs/v3)
Based on this comment, should I be executing only the grecaptcha.execute function in the background of the page or do I also need to verify that the token is correct, also in the background?
I am asking with regard to improving the reCAPTCHA v3 machine-learning model.
Thanks
I am a little confused about the implementation of Google's reCAPTCHA v3. The docs say that the algorithm decides the user's score based on a number of actions. How is it watching those actions if they occur when the reCAPTCHA is not being used?
For example, I have users "vote" on various categories. These votes happen on several different pages within a react app and possibly over more than one session. However, the only time that the recaptcha is called is when the user submits the form. Do I need to somehow notify the captcha to watch the voting actions? Does the code need to be imported in all components or on a higher component so that all voting ones have access to it?
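One way to make reCAPTCHA available to every voting component is a small shared helper that each component calls with its own action name. The following is only a sketch: `createRecaptchaReporter` and `reportAction` are hypothetical names, and the `grecaptcha` object is passed in as a parameter (in a browser it would be `window.grecaptcha`, available once the v3 `api.js` script is loaded) so the helper can be exercised without the real library.

```javascript
// Hypothetical helper: wraps a grecaptcha object so that any component can
// report an action. The object is injected rather than read from the global
// scope, which keeps the helper testable outside a browser.
function createRecaptchaReporter(grecaptcha, siteKey) {
  return function reportAction(action) {
    return new Promise((resolve, reject) => {
      // ready() defers the call until the reCAPTCHA library has loaded.
      grecaptcha.ready(() => {
        // execute() resolves with a token for the given action name.
        grecaptcha.execute(siteKey, { action }).then(resolve, reject);
      });
    });
  };
}
```

A top-level module could create the reporter once, for example `const reportAction = createRecaptchaReporter(window.grecaptcha, '<your site-key>')`, and each voting component could then call `reportAction('vote')`.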
Brief Summary
Let's start with a brief introduction of what a Google reCaptcha farm is - a service that bot developers can query via an API to automate solving Google reCaptcha:
The bot is blocked by a Captcha challenge.
It makes an API call to the Captcha farm with the website’s Captcha public key & its domain name as parameters.
The Captcha farm asks one of its workers to solve the Captcha.
After ~30-45 seconds, the Captcha is solved and the bot obtains its response token.
The bot solves the Captcha by submitting the response token.
In short, solving a Captcha is as simple as calling a function in the bot's code. The attacker doesn't even need to interact directly with the Google reCaptcha by clicking on it. If the attackers know the structure and the URL of the Google reCaptcha callback, i.e. the request where the website sends the Google reCaptcha response token after a successful response has been submitted (which is straightforward by looking at the devtools), they can prove that they've solved a Captcha without even using a real browser.
Problem
My website is fully integrated with Google reCaptcha V2 (Invisible reCaptcha). The implementation follows all the steps listed in the documentation, and it worked like a charm until now. Over time, we have experienced different kinds of attacks that tried to infiltrate our login. The one that caused the biggest problem was a dictionary attack combined with an automated Google reCaptcha solving mechanism. The attackers are using farms (or maybe scripts) that solve the Google reCaptcha and generate unique response codes, which are then used by a bot network (different IP addresses around the world, User-Agents, browser fingerprints, etc.). With these codes, Google reCaptcha is taken out of the picture and we MUST use different mechanisms to block the attackers.
Question
I reviewed the Google reCaptcha documentation multiple times, along with different topics related to this problem, but couldn't find an easy way to prevent such an attack. I have a few questions and would be very grateful if somebody could answer them:
Is it possible to bind the Google reCaptcha response code to a code challenge, cookie or something similar in order to ensure that the code is generated by the exact client?
Is there any way to distinguish the Google reCaptcha codes, taken from a farm/script and the ones generated by the exact client?
I found that there are some solutions such as DataDome, which are very expensive. Is there something similar at a lower price, or an algorithm that I can implement on my own?
Big thanks in advance!
Script
Below is a simplification of the script that acts like a Google reCaptcha farm:
bypassReCaptcha();

function bypassReCaptcha() {
    grecaptcha.render(createPlaceholder(), buildConfiguration());
    grecaptcha.execute();
}

function createPlaceholder() {
    document.body.innerHTML += '<div class="g-recaptcha-hacker"></div>';
    return document.getElementsByClassName('g-recaptcha-hacker')[0];
}

function buildConfiguration() {
    return {
        size: 'invisible',
        badge: 'bottomleft',
        sitekey: '<your site-key>',
        callback: (reCaptchaResponse) => localStorage.setItem('reCaptchaResponse', reCaptchaResponse)
    };
}
I am using server-side validation, something like this:
curl -X POST 'https://www.google.com/recaptcha/api/siteverify?secret=<your secret>&response=<generated code from above>&remoteip=<client IP address>'
It seems that the remoteip parameter is not working as expected: the validation succeeds regardless of the client IP. I checked some topics and it seems that this is a common problem:
Google reCAPTCHA's remoteip parameter is ignored
Is there any reason to include the remote ip when using reCaptcha?
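For reference, the siteverify call shown earlier with curl could also be made from server-side JavaScript with the parameters sent in the POST body rather than the query string, which keeps the secret out of URL logs. This is a minimal sketch under that assumption; `buildSiteverifyRequest` is a hypothetical helper name, and it only builds the request without sending it.

```javascript
// Hypothetical helper: builds the URL and options for a siteverify POST
// request. Nothing is sent here; the caller decides how to perform the call.
function buildSiteverifyRequest(secret, token, remoteIp) {
  const params = new URLSearchParams({ secret, response: token });
  // remoteip is optional, so include it only when a value is available.
  if (remoteIp) {
    params.set('remoteip', remoteIp);
  }
  return {
    url: 'https://www.google.com/recaptcha/api/siteverify',
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
      body: params.toString(),
    },
  };
}
```

The result can be handed to any HTTP client, for example `fetch(request.url, request.options)` in an environment where `fetch` is available.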
I'm migrating from Google reCAPTCHA v2 to v3. As they are quite different, I have a question.
I used to place reCAPTCHA v2 only on web pages that contain a form, to make users click and keep bots out. That much is clear, but with reCAPTCHA v3 there is no checkbox to click (reCAPTCHA v3 analyzes the user's behaviour and clicks).
So... should I place reCAPTCHA v3 only on pages with forms, or on each and every page I have (so that reCAPTCHA can observe how the user interacts with the site)?
I would disagree with Galzor’s answer. The documentation says that
The score is based on interactions with your site and enables you to take an appropriate action for your site.
It’s “site” and not page. It goes on to say
reCAPTCHA works best when it has the most context about interactions with your site, which comes from seeing both legitimate and abusive behavior. For this reason, we recommend including reCAPTCHA verification on forms or actions as well as in the background of pages for analytics.
To me that last sentence means “every page with analytics on my site” — i.e. every page, whether it has a form on it or not. Which then gives rise to all sorts of privacy concerns, see also here.
Now my question is: what does the “reCAPTCHA verification” refer to? Including the api.js script or executing something or… 🤔
Unfortunately, the docs don’t spell this out clearly.
Addendum
(Feb 2023)
I switched to hCaptcha and their docs are also somewhat unclear. However, their customer service responded with
You should add the script and the DOM container with hCaptcha widget only on the contact form page and then call our /siteverify endpoint to validate the user.
and
Same scenario for second case, add it only on the sign up page and if validated within our side the user should be able to log in.
Based on that response I added the CAPTCHA only to the Contact page of my website and to the Sign Up page of the webapp.
Not sure this would also apply to Google’s CAPTCHA, though.
I don't think it should go on every page. Most users will find it too intrusive on all pages. In my opinion, use it only on pages with a form.
Placement on your website
reCAPTCHA v3 will never interrupt your users, so you can run it whenever you like without affecting
conversion. reCAPTCHA works best when it has the most context about
interactions with your site, which comes from seeing both legitimate
and abusive behavior. For this reason, we recommend including
reCAPTCHA verification on forms or actions as well as in the
background of pages for analytics.
Source: https://developers.google.com/recaptcha/docs/v3
The above document says we should integrate reCAPTCHA v3 on multiple pages. So the question is: do we really need to generate and verify a token on each page, or is generating the token enough?
For example:
grecaptcha.execute(reCaptchaPublicKey, {action: 'cartpage'}).then(function(token) {
    // skip verification
});
Note:
On the form where I want to block bots, I generate a token and pass it to the server along with the user's form data. On the server side, I validate the token using the API and get a score in the response, which I use to decide on further action, for example blocking the user action if the score is low.
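The server-side decision described in this note could be sketched roughly as follows. The function works on the parsed JSON body returned by siteverify, which for v3 includes `success`, `score`, `action` and `hostname`. The function name `evaluateVerification`, the option names, the reason strings and the 0.5 threshold are all illustrative assumptions, not recommended values.

```javascript
// Hypothetical helper: decides whether to accept a request based on the
// parsed siteverify response. The default threshold of 0.5 is an assumption
// and should be tuned for the site in question.
function evaluateVerification(result, { expectedAction, expectedHostname, minScore = 0.5 } = {}) {
  if (!result || result.success !== true) {
    return { allowed: false, reason: 'verification-failed' };
  }
  // Reject tokens that were issued for a different host.
  if (expectedHostname && result.hostname !== expectedHostname) {
    return { allowed: false, reason: 'hostname-mismatch' };
  }
  // Reject tokens that were issued for a different action.
  if (expectedAction && result.action !== expectedAction) {
    return { allowed: false, reason: 'action-mismatch' };
  }
  if (typeof result.score !== 'number' || result.score < minScore) {
    return { allowed: false, reason: 'low-score' };
  }
  return { allowed: true, reason: 'ok' };
}
```

Checking the action and hostname in addition to the score means that a token generated for another page or another site is not accepted for the protected form.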
No. Calling grecaptcha.execute with the appropriate action (use 'homepage' for traffic on your homepage) is enough to make the reCAPTCHA service count and process the visit.
The token that is provided to your callback is requested from the reCAPTCHA service by the reCAPTCHA client script. Sending it to your server, only to send it back to the reCAPTCHA service to get the score, makes no sense if you don't use the score.
Google announced Invisible ReCAPTCHA is coming soon. For now, if you want to integrate the new reCAPTCHA to your site or app you can register here.
I do have 2 site keys whitelisted for the new Invisible reCaptcha and I've started "playing" with their examples: see them here https://developers.google.com/recaptcha/docs/invisible
Yes, when the page loads the reCAPTCHA is invisible, but when the form is submitted the reCAPTCHA challenge appears every time. You have to click on images, draw something around something else, etc.
I've been testing this on different servers, on 2 different sites that have the site key approved for the Invisible reCaptcha, with different browsers and from different locations. The behavior is the same: Google shows the challenge when the form is submitted on all 3 examples they have on their page.
Is this what we should expect?
Just as with the checkbox, if it can't reliably determine if you aren't a bot, you get a challenge. I can confirm that the invisible part does work when you are detected as a human.
Actually, you have to accept the Terms of Service when you create a new reCAPTCHA site, which state that
You agree to explicitly inform visitors to your site that you have implemented the Invisible reCAPTCHA on your site and that their use of the Invisible reCAPTCHA is subject to the Google Privacy Policy and Terms of Use.