I am using Google reCAPTCHA for form validation. Many customers have complained that the image selections are too difficult. In the Google reCAPTCHA report, I see more failed requests than passed requests. Is this common?
So my questions are:
Is there a way to monitor this kind of situation or improve it?
If I get stuck in the image selection, is there any way to get back to the normal checkbox captcha?
Thanks.
As the picture shows, there are more failures than passes.
Picture for spam index and average response time:
You can tune the difficulty in Security Preference, under Advanced Settings of the reCAPTCHA settings of your site.
Not really, since image selection only appears when reCAPTCHA thinks the checkbox alone is not enough to determine that the user is human.
Related
I've implemented reCAPTCHA v3 per the docs, but in our contact form, when I do the callback to get the token and verify it in the backend, I'm getting a score of 0.9... all the time. Well, ONE came back at 0.7, but we're getting considerable amounts of bot spam, which one would think shouldn't score 0.9. We went to v3 because v2 was letting through just as much spam.
If you have normal scores, can you answer a couple quick questions?
Are you loading it on every page (in the template), or just on your contact form?
What (if anything) are you using actions for? The docs just kind of show "submit", and I'm not sure what else to issue a challenge for that isn't a "submit".
Is there something I'm missing?
To test a low score, you can simulate a browser with a custom user-agent, for example "Googlebot/2.1".
Screenshot from my chrome dev tools
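For the backend side of this thread, here is a minimal sketch of how a parsed v3 siteverify response might be evaluated, assuming the response carries the documented `success`, `score` and `action` fields. The function name `assessV3Verdict`, the `expectedAction` parameter and the 0.5 default threshold are hypothetical choices for illustration, not values taken from the posts above.

```javascript
// Sketch only: decide whether to accept a request based on a parsed
// reCAPTCHA v3 siteverify response. `expectedAction` and `minScore`
// are hypothetical parameters chosen for this example.
function assessV3Verdict(verdict, expectedAction, minScore = 0.5) {
  // A missing or unsuccessful verification is always rejected.
  if (!verdict || verdict.success !== true) {
    return { accept: false, reason: 'verification-failed' };
  }
  // The action returned by Google should match the one the page requested.
  if (verdict.action !== expectedAction) {
    return { accept: false, reason: 'action-mismatch' };
  }
  // Reject scores below the site's own threshold.
  if (typeof verdict.score !== 'number' || verdict.score < minScore) {
    return { accept: false, reason: 'low-score' };
  }
  return { accept: true, reason: 'ok' };
}

// Example with a made-up response object.
const sampleVerdict = { success: true, score: 0.9, action: 'contact' };
console.log(assessV3Verdict(sampleVerdict, 'contact'));
```

Checking the action as well as the score is one way to use actions for something other than "submit": each page or form can request its own action name and the backend can reject tokens issued for a different one.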
I have a website where people can interact with different objects to view specific content. I would like to know which objects get the most interactions by real people. For example there are thumbnails of images and I would like to know when a user clicks on a thumbnail to view an image.
To do this I thought I would create a psql table with thumbnail_id and an IP address, where every single view is stored (to ensure every combination of thumbnail and ip is only counted once and people can't just spam click it).
And so every time a click happens, a post request on a /views endpoint with the thumbnail id attached is made in the background.
The problem is, some people may be incentivized to create bots to auto-click certain images from many different IPs.
So I was wondering if I could use reCAPTCHA v3 to tell real users from bots, by including a token with every view request.
But would this be too much for my backend to handle (it would have to talk to Google's servers every time anybody views an image, which might be every few seconds for each user, and I would be billed while the server waits for a response), or too expensive, since I have to pay Google for every request? Or is there some other obvious problem with this?
I'm asking since I have only ever found recaptcha used for single form validation and never for traffic measurements, even though that seems like a pretty obvious use case.
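The counting scheme described above could be sketched roughly as follows. This is an in-memory illustration only: the `ViewCounter` class, its method names and the 0.5 threshold are hypothetical, the `Set` stands in for a unique (thumbnail_id, ip) constraint in the database, and the score is assumed to come from a separate reCAPTCHA v3 verification step that is not shown.

```javascript
// Sketch only: count each (thumbnailId, ip) pair at most once, and only
// when the reCAPTCHA v3 score passes a threshold. All names are hypothetical.
class ViewCounter {
  constructor(minScore = 0.5) {
    this.minScore = minScore;
    this.seen = new Set();   // stands in for a UNIQUE (thumbnail_id, ip) constraint
    this.counts = new Map(); // thumbnailId -> number of counted views
  }

  static key(thumbnailId, ip) {
    return JSON.stringify([thumbnailId, ip]);
  }

  // True if this pair has not been counted yet, i.e. verifying the token
  // could still change the result. Repeat clicks can skip verification.
  shouldVerify(thumbnailId, ip) {
    return !this.seen.has(ViewCounter.key(thumbnailId, ip));
  }

  // Record a view; returns true only if it was actually counted.
  record(thumbnailId, ip, score) {
    if (score < this.minScore) return false;
    const key = ViewCounter.key(thumbnailId, ip);
    if (this.seen.has(key)) return false;
    this.seen.add(key);
    this.counts.set(thumbnailId, (this.counts.get(thumbnailId) || 0) + 1);
    return true;
  }

  viewsFor(thumbnailId) {
    return this.counts.get(thumbnailId) || 0;
  }
}

const counter = new ViewCounter();
counter.record(7, '203.0.113.5', 0.9);
counter.record(7, '203.0.113.5', 0.9); // repeat click, not counted again
console.log(counter.viewsFor(7));
```

Calling `shouldVerify` before contacting Google is one way to keep the number of verification requests down, since repeat clicks from an already counted pair never need a round trip.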
Brief Summary
Let's start with a brief introduction of what a Google reCaptcha farm is - a service that bot developers can query via an API to automate solving Google reCaptcha:
The bot is blocked by a Captcha challenge.
It makes an API call to the Captcha farm with the website’s Captcha public key & its domain name as parameters.
The Captcha farm asks one of its workers to solve the Captcha.
After ~30-45 seconds, the Captcha is solved and you obtain its response token.
The bot solves the Captcha by submitting the response token.
In short, solving a Captcha is as simple as calling a function in the bot's code. The attacker doesn't even need to interact directly with the Google reCaptcha by clicking on it. If the attackers know the structure and the URL of the Google reCaptcha callback, i.e. the request where the website sends the Google reCaptcha response token after a successful response has been submitted (which is straightforward by looking at the devtools), they can prove that they've solved a Captcha without even using a real browser.
Problem
My website is fully integrated with Google reCaptcha V2 (Invisible reCaptcha). The implementation follows all steps listed in the documentation. It worked like a charm until now. Over time, we have experienced different kinds of attacks that tried to break into our login. The one that caused the biggest problem was a dictionary attack combined with an automated Google reCaptcha solving mechanism. The attackers use farms (or maybe scripts) that solve the Google reCaptcha and generate unique response codes, which are then used by a bot network (different IP addresses around the world, User-Agents, browser fingerprints, etc.). With these codes, the Google reCaptcha is taken out of the picture and we MUST use different mechanisms to block the attackers.
Question
I reviewed the Google reCaptcha documentation multiple times, along with different topics related to this problem, but couldn't find an easy way to prevent such an attack. I have a few questions and would be very grateful if somebody could answer them:
Is it possible to bind the Google reCaptcha response code to a code challenge, cookie or something similar in order to ensure that the code is generated by the exact client?
Is there any way to distinguish the Google reCaptcha codes, taken from a farm/script and the ones generated by the exact client?
I found that there are some solutions such as DataDome, which are very expensive. Is there something similar at a lower price, or an algorithm that I can implement on my own?
Big thanks in advance!
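As one illustration of "an algorithm that can be implemented on my own", a per-account limit on failed logins slows a dictionary attack no matter how many IP addresses or solved captchas the attacker has. The sketch below is an assumption-laden example, not something from the reCaptcha documentation: the `LoginThrottle` class, its defaults (5 failures per 15 minutes) and the in-memory storage are all hypothetical, and it does not tell farm-generated codes apart from genuine ones.

```javascript
// Sketch only: sliding-window limit on failed login attempts per account.
// All names and default values here are hypothetical.
class LoginThrottle {
  constructor(maxFailures = 5, windowMs = 15 * 60 * 1000) {
    this.maxFailures = maxFailures;
    this.windowMs = windowMs;
    this.failures = new Map(); // account -> array of failure timestamps (ms)
  }

  // Keep only the failures that are still inside the window.
  recentFailures(account, now) {
    const all = this.failures.get(account) || [];
    const recent = all.filter((t) => now - t < this.windowMs);
    this.failures.set(account, recent);
    return recent;
  }

  recordFailure(account, now = Date.now()) {
    const recent = this.recentFailures(account, now);
    recent.push(now);
  }

  // True while the account has reached the failure limit within the window.
  isBlocked(account, now = Date.now()) {
    return this.recentFailures(account, now).length >= this.maxFailures;
  }

  // Call after a successful login to clear the history.
  reset(account) {
    this.failures.delete(account);
  }
}

const throttle = new LoginThrottle(3, 1000);
throttle.recordFailure('alice', 0);
throttle.recordFailure('alice', 100);
throttle.recordFailure('alice', 200);
console.log(throttle.isBlocked('alice', 300));
```

Keying the limit on the account rather than on the client IP is the design choice that matters here, because the bot network described above rotates IP addresses.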
Script
Below is a simplification of the script that acts like a Google reCaptcha farm:
// Entry point: render an invisible widget and trigger it immediately.
bypassReCaptcha();

function bypassReCaptcha() {
  grecaptcha.render(createPlaceholder(), buildConfiguration());
  grecaptcha.execute();
}

// Append a container element for the widget and return it.
function createPlaceholder() {
  document.body.innerHTML += '<div class="g-recaptcha-hacker"></div>';
  return document.getElementsByClassName('g-recaptcha-hacker')[0];
}

// Widget options; the callback stores the response token for later use.
function buildConfiguration() {
  return {
    size: 'invisible',
    badge: 'bottomleft',
    sitekey: '<your site-key>',
    callback: (reCaptchaResponse) => localStorage.setItem('reCaptchaResponse', reCaptchaResponse)
  };
}
I am using a server-side validation - something like this:
curl -X POST 'https://www.google.com/recaptcha/api/siteverify?secret=<your secret>&response=<generated code from above>&remoteip=<client IP address>'
It seems that the remoteip parameter is not working as expected: the validation succeeds regardless of the client IP. I checked some topics and it seems that this is a common problem:
Google reCAPTCHA's remoteip parameter is ignored
Is there any reason to include the remote ip when using reCaptcha?
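Since remoteip does not appear to bind the code to a client, the fields that the siteverify response does return can at least be checked on the server. The sketch below assumes a parsed response with the documented `success`, `hostname` and `challenge_ts` fields; the function name `checkTokenContext` and the `maxAgeMs` limit are hypothetical. This narrows the window in which a code can be reused, but it does not by itself distinguish a farm from a real client, and a tight age limit may also reject slow legitimate users.

```javascript
// Sketch only: sanity-check a parsed siteverify response on the server.
// `expectedHostname` and `maxAgeMs` are hypothetical parameters.
function checkTokenContext(verdict, expectedHostname, maxAgeMs, now = Date.now()) {
  if (!verdict || verdict.success !== true) return 'verification-failed';
  // The code should have been issued for a widget on our own site.
  if (verdict.hostname !== expectedHostname) return 'hostname-mismatch';
  // challenge_ts is an ISO timestamp; reject codes that are too old.
  const issuedAt = Date.parse(verdict.challenge_ts);
  if (Number.isNaN(issuedAt)) return 'bad-timestamp';
  if (now - issuedAt > maxAgeMs) return 'token-too-old';
  return 'ok';
}

// Example with made-up values and a fixed "current time".
const exampleNow = Date.parse('2024-01-01T12:01:00Z');
const exampleVerdict = {
  success: true,
  hostname: 'example.com',
  challenge_ts: '2024-01-01T12:00:00Z'
};
console.log(checkTokenContext(exampleVerdict, 'example.com', 30000, exampleNow));
```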
Our customer service team is an important user of our website. In their work they frequently send requests to the part of our website that is protected by invisible reCaptcha (v2). I think that is why reCaptcha marks their actions as suspicious, and they keep getting the reCaptcha where you need to select photos containing a certain object, which makes their work quite a bit more time-consuming. Is there a solution for this? Perhaps whitelisting our IP, so that traffic from our IP is never considered suspicious and the image reCaptcha does not show?
I couldn't find the answer in the documentation, so I hope that someone can help!
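Whether reCaptcha itself can be told to trust an IP is not settled here, but one option that stays entirely under the site's control is to skip the captcha on the server for known office addresses. The sketch below is a hypothetical illustration: the function name `requiresCaptcha` and the addresses are made up, and it assumes the server sees the real client IP rather than a proxy address.

```javascript
// Sketch only: decide on the server whether a request needs the captcha.
// The allowlisted addresses below are documentation examples, not real ones.
const trustedOfficeIps = new Set(['198.51.100.10', '198.51.100.11']);

function requiresCaptcha(clientIp, trustedIps = trustedOfficeIps) {
  // Requests from allowlisted addresses skip both the widget and verification.
  return !trustedIps.has(clientIp);
}

console.log(requiresCaptcha('198.51.100.10')); // office address
console.log(requiresCaptcha('203.0.113.77'));  // any other visitor
```

If the site sits behind a load balancer or CDN, the client IP has to be taken from a header that only the trusted proxy can set, otherwise the allowlist can be spoofed.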
I'm migrating from Google reCAPTCHA v2 to v3. As they are quite different, I have a question.
I used to place my reCAPTCHA v2 only on web pages that contain a form, to make users click and keep bots out. That much is clear, but with reCAPTCHA v3 there is NO checkbox to click on (reCAPTCHA v3 analyzes the user's behaviour and clicks).
So... should I place reCAPTCHA v3 only on pages with forms, or on each and every page I have (so that reCAPTCHA can observe how the user interacts with the site)?
I would disagree with Galzor’s answer. The documentation says that
The score is based on interactions with your site and enables you to take an appropriate action for your site.
It’s “site” and not page. It goes on to say
reCAPTCHA works best when it has the most context about interactions with your site, which comes from seeing both legitimate and abusive behavior. For this reason, we recommend including reCAPTCHA verification on forms or actions as well as in the background of pages for analytics.
To me that last sentence means “every page with analytics on my site” — i.e. every page, whether it has a form on it or not. Which then gives rise to all sorts of privacy concerns, see also here.
Now my question is: what does the “reCAPTCHA verification” refer to? Including the api.js script or executing something or… 🤔
Unfortunately, the docs don’t spell this out clearly.
Addendum
(Feb 2023)
I switched to hCaptcha and their docs are also somewhat unclear. However, their customer service responded with
You should add the script and the DOM container with hCaptcha widget only on the contact form page and then call our /siteverify endpoint to validate the user.
and
Same scenario for second case, add it only on the sign up page and if validated within our side the user should be able to log in.
Based on that response I added the CAPTCHA only to the Contact page of my website and to the Sign Up page of the webapp.
Not sure this would also apply to Google’s CAPTCHA, though.
I don't think it should go on every page. Most users will find it too intrusive on all pages. In my opinion, use it only on pages with a form.