After reading through the documentation, I understand that reCAPTCHA makes it difficult for bots to submit forms, which certainly reduces spam.
Apart from this, are there other advantages to using reCAPTCHA?
Some articles indicate that reCAPTCHA is triggered when a request comes from a proxy or a virtual machine (at least the first time). Is this really needed, or rather, what is the advantage of it?
Also, does reCAPTCHA do anything to prevent bots from crawling the website? I doubt it, because that would affect search engine crawlers as well.
The documentation says "reCAPTCHA protects you against spam and other types of automated abuse." What are the other types of automated abuse in this context?
Well, it doesn't matter whether the bot is friendly or malicious. Some webmasters don't want bots on their website at all, and some bots do not respect the robots.txt that tells them to keep off the lawn. Besides, web crawlers should not be on pages that require users to post information about themselves.
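For reference, robots.txt is only a polite request; nothing enforces it. A minimal rule asking every crawler to stay off the whole site looks like this:

    # Ask all crawlers to stay off the entire site
    User-agent: *
    Disallow: /

A malicious bot is free to ignore this, which is where a CAPTCHA on the form itself comes in.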
To quote the website, "reCAPTCHA offers more than just spam protection. Every time our CAPTCHAs are solved, that human effort helps digitize text, annotate images, and build machine learning datasets. This in turn helps preserve books, improve maps, and solve hard AI problems."
I am making an application that shows real-time status for a Valorant game: players alive, the type of weapon each player has, time remaining, etc.
Is it possible to use Riot Valorant API to do this for live matches or for previously played matches?
As far as I know, you can't. But I think you should try with Riot Games' official production API, not the development API.
Let me know if you find something relevant.
(This is adding onto Sanskar's answer, which I cannot comment on as I lack the required 'reputation')
I'm aware that this is an old question, but for anyone who happens to have stumbled upon it: there is no way to obtain real-time in-game events. There is, however, a way to retrieve certain data from a match, just not in an official way, since it goes against Riot Games' TOS on the use of third-party software. I wouldn't worry about this too much as long as you do not ruin the competitive integrity of the game by giving yourself an in-game advantage over others. I personally have been using this for over a year now and have not received any form of punishment for doing so.
Anyhow, back to the actual question of this thread: check out this document of API endpoints that were scraped by monitoring the HTTP traffic of the Riot client: https://github.com/techchrism/valorant-api-docs/tree/trunk/docs/ You'll need to obtain certain authorization tokens for the Valorant account through whatever methods are available to you (I pray that they are lawful ones :) ), which depends heavily on the type of endpoint. There are wrappers for these endpoints already written by other users on GitHub, and you can always ask for help in the small community of developers using them, linked in the README of the GitHub page above.
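As a rough illustration, here is what calling one of those endpoints might look like in Ruby. This is a hypothetical sketch: the host, path, and headers come from the community docs linked above, they are unofficial and may change at any time, and the tokens are placeholders you would have to obtain yourself.

    require "net/http"
    require "json"
    require "uri"

    access_token = "YOUR_ACCESS_TOKEN"     # from your own account's auth flow
    entitlements = "YOUR_ENTITLEMENTS_JWT" # likewise a placeholder
    match_id     = "SOME_MATCH_ID"
    shard        = "na"                    # your account's shard

    # Unofficial match-details endpoint as documented by the community;
    # not guaranteed by Riot and subject to change.
    uri = URI("https://pd.#{shard}.a.pvp.net/match-details/v1/matches/#{match_id}")
    req = Net::HTTP::Get.new(uri)
    req["Authorization"]           = "Bearer #{access_token}"
    req["X-Riot-Entitlements-JWT"] = entitlements

    res = Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(req) }
    puts JSON.parse(res.body).keys if res.is_a?(Net::HTTPSuccess)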
REMEMBER NOT TO DO ANYTHING WITH THIS THAT WOULD CREATE AN UNFAIR ADVANTAGE, OR ANYTHING ELSE THAT A RIOT EMPLOYEE WOULD NOT APPROVE OF :)
I got a request from the business side with a list of questions, asking me to investigate the possibility of integrating Google reCAPTCHA with our site. One of the questions is:
How many tries of the puzzle does user get?
and the second one:
What happens if the user fails the puzzle as many times as they are allowed to attempt?
I spent a few hours trying to find proper answers to the questions above and, unfortunately, had no success. There is no such information on the official site, and Google Search did not help either.
Google reCAPTCHA does not have a default "unsuccessful attempt limit", and I'm not aware of any option to set one up. CAPTCHAs are not intended to turn away humans (or hackers), regardless of how many tries it takes.
CAPTCHAs (an acronym for "Completely Automated Public Turing test to tell Computers and Humans Apart") are unnecessary unless your site is at risk of excessive scraping or automated spam.
Invisible CAPTCHAs seem to be the preferred choice nowadays, as they reduce user annoyance with the security feature.
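For context, the pass/fail decision happens server-side: you POST the widget's token to Google's siteverify endpoint and get back a success flag, so any attempt limit would be logic you build around that call yourself. A minimal Ruby sketch (the endpoint and parameters are from Google's docs; the retry counter and the Sinatra-style session/params/halt helpers are illustrative assumptions):

    require "net/http"
    require "json"
    require "uri"

    # Verify the token that the reCAPTCHA widget submitted with the form.
    def recaptcha_passed?(secret_key, response_token)
      uri = URI("https://www.google.com/recaptcha/api/siteverify")
      res = Net::HTTP.post_form(uri, "secret"   => secret_key,
                                     "response" => response_token)
      JSON.parse(res.body)["success"] == true
    end

    # A hypothetical per-session retry limit -- reCAPTCHA itself has none.
    MAX_ATTEMPTS = 5
    session[:captcha_failures] ||= 0
    unless recaptcha_passed?(ENV["RECAPTCHA_SECRET"], params["g-recaptcha-response"])
      session[:captcha_failures] += 1
      halt 429 if session[:captcha_failures] >= MAX_ATTEMPTS
    end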
Here are links to:
Google's reCaptcha demo
Google's Invisible reCaptcha demo
FunCaptcha Verification by Puzzle
21 Free CAPTCHA Sources
I am researching whether the following is possible and if so how I could go about achieving it.
We collect reviews for businesses from their customers, and we'd like to post these reviews to Google Places as part of the reviews each business already has there.
I was wondering how I would go about getting our website to “push” this data to the Google Places website. I've done lots of searching on the APIs but have found nothing that says whether or not it's possible.
Currently the Google Places API does not have write capability for reviews; it only has read capability. Right now only ratings are available, but I suspect reviews might come someday too.
You can, however, send check-in signals and fix Places through the API. Hopefully Google will add the ability to send and receive reviews.
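To illustrate the read side, a Place Details request returns the aggregate rating (a sketch against the current Places web service; the place_id and key are placeholders):

    require "net/http"
    require "json"
    require "uri"

    # Read-only lookup of a place's aggregate rating.
    uri = URI("https://maps.googleapis.com/maps/api/place/details/json")
    uri.query = URI.encode_www_form(place_id: "YOUR_PLACE_ID", key: "YOUR_API_KEY")

    details = JSON.parse(Net::HTTP.get(uri))
    puts details.dig("result", "rating")  # there is no matching write endpoint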
If you're looking to get your content added to Google, you may want to talk to their content partnerships team: http://www.google.com/support/mapcontentpartners/
Since Google's local and maps initiatives are under the same people, that would be the place to go.
I too looked into this as it would be of huge value to companies if possible.
My research led me to believe that it is not possible, and that it could violate Google's TOS, with negative results for the company's Places page.
Instead, I built a workaround that makes it really easy for companies to collect feedback and get their own customers to submit the reviews: http://dallasmarketingservices.com/survey-local-unveiled-how-online-reviews-affect-your-local-business/
Maybe we will see this in the future though.
I'm using the twitter gem to build a Twitter bot in Ruby. I am trying to make it self-sustaining, as it were, so I want it to generate its own content to tweet by scraping tweets from users outside its social circle (and then perhaps garbling them with a Markov chain generator).
Which one is a better strategy?
Search for tweets via the API
Load Twitter pages and scrape tweets with Hpricot or Nokogiri
Also, how can I try to ensure the base tweets come from outside my bot's followers' friends so it's harder to tell it's a bot?
At the moment I use a .yml file with tweets I generated by hand, which is far from ideal.
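For the garbling step, the kind of word-level Markov chain I have in mind is roughly this (an untested sketch):

    # Map each word to the words observed to follow it, then
    # random-walk the table to generate a new "garbled" tweet.
    def build_chain(tweets)
      chain = Hash.new { |h, k| h[k] = [] }
      tweets.each do |tweet|
        tweet.split.each_cons(2) { |a, b| chain[a] << b }
      end
      chain
    end

    def generate(chain, max_words = 20)
      out = [chain.keys.sample]
      (max_words - 1).times do
        followers = chain[out.last]
        break if followers.empty?
        out << followers.sample
      end
      out.join(" ")
    end

    chain = build_chain(["the cat sat on the mat", "the dog sat on the log"])
    puts generate(chain)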
There are two questions here.
It's always better to use an API where one is available. This will future-proof your bot against randomly breaking when a simple HTML element is changed, and it also allows the website (i.e., Twitter) to rate-limit your searches in case you put too high a load on the service. Although this is unlikely for Twitter, it's good practice.
Sometimes the information you want is unobtainable via the API. In that case, you should consider whether you really need to scrape it, and if so, how to limit yourself so you stay polite.
Basically, if the API allows you to do what you want, use it for maintainability.
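For example, with the twitter gem the search stays entirely within the API (a sketch against the modern gem; the credentials are placeholders):

    require "twitter"

    # Standard twitter-gem client setup; all four values are placeholders.
    client = Twitter::REST::Client.new do |config|
      config.consumer_key        = "CONSUMER_KEY"
      config.consumer_secret     = "CONSUMER_SECRET"
      config.access_token        = "ACCESS_TOKEN"
      config.access_token_secret = "ACCESS_TOKEN_SECRET"
    end

    # Pull recent tweets matching a query via the API instead of scraping.
    client.search("some topic", result_type: "recent").take(10).each do |tweet|
      puts tweet.text
    end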
As for your second question, I do not have any experience with the Twitter API. Is there a method to get the Twitter IDs of all your followers, and of who they follow? If not, you'll be forced to scrape as mentioned earlier, if you really do need this information.
Once you have a list of those whom your followers follow, you can check whether the ID of the poster of what you want to repost falls inside this set.
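If the gem does expose follower and friend ID lookups (the modern twitter gem does, via follower_ids and friend_ids), the set can be built without scraping. A sketch, reusing the client from above; note it costs one API call per follower, so rate limits add up fast:

    require "set"

    # IDs of everyone my followers follow -- the circle to avoid.
    known_ids = client.follower_ids.to_a.flat_map do |follower_id|
      client.friend_ids(follower_id).to_a
    end.to_set

    candidate = client.search("some topic", result_type: "recent").first
    puts "outside the circle" unless known_ids.include?(candidate.user.id)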
Would you consider retweeting for this aspect of the bot?
One thing to also note is performance. If you were to scrape the website, you would have to download the entire page and then parse it (which is processor-intensive as it is), as opposed to hitting the API, which returns only JSON/XML data.
So from strictly a performance standpoint, I would go with the API.
I want to know how an advertising network like AdWords is built. What kind of systems display the ads, and what kind of systems search for keywords in the content of the publisher's website?
Google has a spider which indexes the content of pages in its AdSense network. The ads are pulled in with JavaScript. The actual algorithms that decide which ads to display on a page are closely guarded secrets. Google uses Python a lot, so odds are most of the backend uses that.
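Nobody outside Google knows the actual ranking logic, but the core idea of contextual targeting can be sketched as keyword overlap between the page and each ad (a toy illustration only, not Google's algorithm):

    require "set"

    # Toy contextual matcher: score each ad by how many of its keywords
    # appear in the page text, then serve the highest-scoring ad.
    ads = {
      "Cheap Flights" => %w[travel flight airline holiday],
      "Running Shoes" => %w[running marathon shoes fitness],
    }

    page_text  = "Book your next holiday: compare airline and flight deals"
    page_words = page_text.downcase.scan(/[a-z]+/).to_set

    best_ad, _ = ads.max_by { |_title, kws| (kws.to_set & page_words).size }
    puts "Serving: #{best_ad}"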
To make this question approachable you need to specify what level and type of detail you need. Are you looking for a broad understanding of the information architecture and flow? Do you need pseudocode or code for the search/parse algorithms? What exactly do you need?