Getting the geocoder to fetch more requests - Ruby

I am working with the geocoder gem and would like to process a larger number of requests from one IP. By default the Google API allows only 2,500 requests per day.
Please share your thoughts on how I can make more requests than the limit allows.

As stated before: using only the Google API, the only way around the limitation is to pay for it. Or, in a shadier way, make the requests from more than one IP/API key, which I would not recommend.
But to stay on the safe side I would suggest mixing the services up, since there are a few more geocoding APIs out there - for free.
With the right gem, mixing them is also not a big issue:
http://www.rubygeocoder.com/
It supports a couple of them with a nice interface. You would pretty much only have to add some rate-limiting counters to make sure you stay within the limits of each provider.
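Just to illustrate the counter idea (this is a language-agnostic sketch, not the geocoder gem's API - the provider names, daily limits and URLs are placeholders):

```typescript
// Rotate between several geocoding providers, tracking a simple daily counter
// per provider. Providers, limits and URL templates are illustrative only.
type Provider = {
  name: string;
  dailyLimit: number;
  used: number;
  url: (query: string) => string;
};

const providers: Provider[] = [
  { name: "google", dailyLimit: 2500, used: 0, url: q => `https://geocoder-a.example/geocode?q=${encodeURIComponent(q)}` },
  { name: "nominatim", dailyLimit: 10000, used: 0, url: q => `https://geocoder-b.example/search?q=${encodeURIComponent(q)}` },
];

async function geocode(query: string): Promise<unknown> {
  // Use the first provider that still has quota left today.
  const provider = providers.find(p => p.used < p.dailyLimit);
  if (!provider) throw new Error("All daily quotas exhausted");

  provider.used += 1; // in a real setup, reset the counters once a day
  const response = await fetch(provider.url(query));
  return response.json();
}
```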
Or go the heavyweight route of implementing your own geocoding, for example with your own running OpenStreetMap database. The data can be downloaded here: http://wiki.openstreetmap.org/wiki/Planet.osm#Worldwide_data
Which way is best depends on what your actual requirements are and what resources you have available.

Related

Rate-limit an API (Spring MVC)

I'm looking for the best and most efficient way to implement (or reuse an existing) rate limiter to protect all my REST API URLs. The protection I'm looking for is a "calls per second per user" limiter.
I had a look around the net and what came up was the use of either Redis or Guava's RateLimiter.
To be honest I have never used Redis and I'm really not familiar with it. But judging by its docs, it seems to have a quite robust rate-limiting story.
I have also had a look at Guava's RateLimiter, and it looks a bit easier to use (no Redis installation needed, etc.).
So I would like some suggestions on what would be, in my case, the best solution. Is using Redis overkill?
Have any of you already tried RateLimiter? Is it a good solution? Is it scalable?
PS: I am also open to solutions other than the two mentioned above if you think there are better choices.
Thank you!
If you are trying to limit access to your Spring-based REST API, you should use the token-bucket algorithm.
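To illustrate the algorithm itself (a minimal sketch of the idea only, not the Bucket4j or Guava API; the class and method names are made up):

```typescript
// Minimal token bucket: the bucket holds up to `capacity` tokens, refills at
// `refillPerSecond`, and each request consumes one token or gets rejected.
class TokenBucket {
  private tokens: number;
  private lastRefill = Date.now();

  constructor(private capacity: number, private refillPerSecond: number) {
    this.tokens = capacity;
  }

  tryConsume(): boolean {
    // Refill proportionally to the time elapsed since the last call.
    const now = Date.now();
    const elapsedSeconds = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefill = now;

    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;  // request allowed
    }
    return false;   // request rejected (e.g. answer with HTTP 429)
  }
}
```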
There is the bucket4j-spring-boot-starter project, which uses the bucket4j library to rate-limit access to the REST API. You can configure it via the application properties file. There is an option to limit access based on IP address or username.
If you are using Netflix Zuul, you could use Spring Cloud Zuul RateLimit, which supports several storage options: Consul, Redis, Spring Data and Bucket4j.
Guava's RateLimiter blocks the current thread, so if there is a burst of asynchronous calls against the throttled service, many threads will be blocked and you might exhaust the pool of free threads.
Perhaps the Spring-based library Kite meets your needs. Kite's "rate-limiting throttle" rejects requests once the principal reaches a configurable limit on the number of requests in some time period. The rate limiter uses Spring Security to determine the principal involved.
But Kite is still a single-JVM approach. If you do need a cluster-aware approach, Redis is the way to go.
There is no hard rule; it totally depends on your specific situation. Given that you have never used Redis, I would recommend Guava's RateLimiter. Compared to Redis, a completely new NoSQL system for you, Guava's RateLimiter is much easier to get started with. By adding a few lines of code, you are able to distribute permits at a configurable rate. What is left to do is to adapt it to fit your needs, like providing the rate limit on a per-user basis (see the sketch below).
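A rough sketch of that per-user adaptation, reusing the TokenBucket sketch from above rather than Guava's actual API (the bucket parameters are arbitrary):

```typescript
// One independent bucket per principal; the user id comes from whatever your
// framework provides (e.g. the authenticated principal).
const buckets = new Map<string, TokenBucket>();

function allowRequest(userId: string, callsPerSecond = 5): boolean {
  let bucket = buckets.get(userId);
  if (!bucket) {
    bucket = new TokenBucket(callsPerSecond, callsPerSecond);
    buckets.set(userId, bucket);
  }
  return bucket.tryConsume();
}
```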

Sitecore with DMS vs caching server - how do you handle it?

We're planning to introduce DMS to our customer's Sitecore installation. It's a rather popular site in our country, and we have to use a caching proxy server (Nginx in this case) to make it high-traffic-proof.
However, as far as we know, it's not possible to use all the DMS features with the caching proxy enabled - for example personalization of content: if a page gets cached, it won't be personalized.
Is there a way to make use of all the DMS features with the proxy cache turned on? If not, how do you handle this problem for high-traffic sites - is it buying more Content Delivery servers to carry the load, or extending the current server with better hardware (RAM, CPU, bandwidth)?
You might try moving away from your proxy caching for some pages, or even all of them:
- There's no reason not to use a CDN for static assets and media library assets, so stick with that.
- Leverage Sitecore's built-in HTML cache for sublayouts/renderings - there are quite a few caching options.
- Use Sitecore's Debug feature to track down the slowest components on your site.
- Consider using indexes instead of "fast" or Sitecore queries.
- Don't do descendants queries ("//*") - I often see this when calculating the selected state for navigation; hint: go the other way and calculate the ancestors of the current page.
#jammykam wrote an excellent answer on this over here.
John West wrote a great blog post on this also, though a bit older.
Good luck!
I've been wondering about this myself.
I have been thinking of implementing an AJAX web service that:
- talks to the DMS and returns JSON
- allows you to render the personalized components client-side
- allows you to trigger analytics events
I have been googling around and I haven't found anyone who has done it and published the information yet. The only place I have found something similar is actually in the mobile SDK, but I haven't had a chance to delve into it yet.
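A rough sketch of what the client side of that idea might look like - the endpoint paths, payload shape and element ids are all hypothetical, this is just the shape of the approach:

```typescript
// Fetch personalized fragments as JSON from a (hypothetical) service sitting
// in front of the DMS, so the surrounding page can still come from the proxy cache.
type PersonalizedFragment = { placeholderId: string; html: string };

async function loadPersonalizedContent(pageId: string): Promise<void> {
  const response = await fetch(`/api/personalization?page=${encodeURIComponent(pageId)}`, {
    credentials: "include", // so the service can identify the visitor
  });
  const fragments: PersonalizedFragment[] = await response.json();

  // Swap the personalized markup into its placeholders client-side.
  for (const fragment of fragments) {
    const target = document.getElementById(fragment.placeholderId);
    if (target) target.innerHTML = fragment.html;
  }

  // A second (hypothetical) endpoint could register analytics events.
  await fetch("/api/analytics/page-visited", { method: "POST", body: pageId });
}
```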
I have also not been able to use proxy server caching and DMS together successfully. For extremely high loads, I have recommended that clients follow the standard optimization and scaling guidelines, especially architecting for proper Sitecore sublayout and layout caching for as much of the site as possible. With that caching in place, follow it up by distributing across multiple Content Delivery nodes with load balancing to help support high volume with personalization at the same time.
I've heard that other CMSs with personalization use a JavaScript approach to load the personalized content client-side, but I would be worried about losing track of the analytics data that is gathered when personalized content is loaded and interacted with.

Are there performance issues with being a client of your own API?

Take Twitter for example: twitter.com is a client of their own API. Could this be one of the reasons why Twitter is quite 'slow'?
Reference: http://engineering.twitter.com/2010/09/tech-behind-new-twittercom.html
Would you recommend using your own API for your main website/app?
If using your own API is OK, what are the ways to avoid performance issues?
Regarding using your own API: it's about trade-offs. In the Twitter example, by using their own API they were able to "allocate more resources to the API team." For them, that benefit outweighed the performance hit. There are other benefits not mentioned there either, like being the first to vet your API and having a single unified entry point into the system. There are drawbacks as well, which are mentioned in the link you posted.
For your application, you should look at the architectural qualities you want to achieve, balance them against the constraints you are given, and make your own choice. If ultra-high performance is at the top of the list, then craft your solution to meet that goal.
Regarding performance when using your own API: again, it depends. In the Twitter case they knew they would be accessing the API from JavaScript, so the physical hops are Browser --> Server --> DB. There is no way to get around these hops if you are doing client-server development. In the link you posted they talked about going directly to the DB. Yes, that would be faster, but I'm not sure how you would do that from a JavaScript client. I suppose if they had used WebSockets to a custom API it would have been faster, but at what development cost?
Summary: it's not that using their own API was the performance hit; it's that they wanted the client to be an HTTP hop away.
Please note that none of these comments address what the server --> DB calls look like, the caching strategy, or any of the other dozen things that could be a bottleneck.

Do any web servers optimized for single-page applications exist?

When we build a single-page application, the web server basically does only one thing: it returns some data when the client asks for it (in JSON format, for example). So any server-side language (PHP, RoR) or tool (Apache, Nginx) can do it.
But is there a language/tool that works better for this sort of single-page application, which generates lots of small requests that need low latency and sometimes a permanent connection (for realtime and push features)?
SocketStream seems like it matches your requirements quite well: "A phenomenally fast real-time web framework for Node.js ... dedicated to creating single-page real time websites."
SocketStream uses WebSockets to get the lowest latency for the real-time portion. There are several examples on the site to build from.
If you want lots of small realtime requests with pushed data, you should take a look at socket-type connections.
Check out Node.js with Socket.io.
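As a minimal illustration of the Socket.IO route (server side only; the event names and payloads here are made up):

```typescript
// Minimal Socket.IO server: persistent connections, small JSON messages,
// and server push. Run with `npm install socket.io`.
import { createServer } from "http";
import { Server } from "socket.io";

const httpServer = createServer();
const io = new Server(httpServer);

io.on("connection", (socket) => {
  // Answer small client requests over the same persistent connection.
  socket.on("get-data", (query: unknown, callback: (data: unknown) => void) => {
    callback({ items: [], query }); // hypothetical payload
  });
});

// Push a realtime update to every connected client once per second.
setInterval(() => {
  io.emit("tick", { now: Date.now() });
}, 1000);

httpServer.listen(3000);
```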
If you really want to optimize for speed, you could try implementing a custom HTTP server that just fits your needs, for example with the help of Netty.
It's blazingly fast and has examples for HTTP and WebSocket servers included.
Also, taking a look at G-WAN may be worthwhile (though I have not tried that one yet).
http://en.wikipedia.org/wiki/Nginx could be appropriate

Google Visualization API

I want a real and honest opinion: what do you think of the Google Visualization API?
Is it reliable to use? When I was reading the documentation I noticed that there are a lot of issues and defects to overcome. Also, can I use it to retrieve data from a MySQL database?
Thank you.
I am currently evaluating it. Compared to other JavaScript data visualization frameworks, I think it has a lot going for it:
- dynamic loading is built in
- diverse: many visualization types to choose from
- looks really great!
- the framework mostly takes care of picking whichever implementation fits the current browser
- service-based: you don't need to download anything in advance
- unified data source: just create one data table and have multiple visualizations draw from that data (see the sketch below)
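For instance, a single DataTable feeding two different charts looks roughly like this (a sketch only; it assumes the Google Charts loader script is already on the page, and the "pie_div"/"bar_div" element ids are placeholders):

```typescript
// One shared DataTable, two visualizations drawing from it.
declare const google: any; // provided by the Google Charts loader script

google.charts.load("current", { packages: ["corechart"] });
google.charts.setOnLoadCallback(draw);

function draw(): void {
  const data = new google.visualization.DataTable();
  data.addColumn("string", "Country");
  data.addColumn("number", "Visits");
  data.addRows([
    ["US", 500],
    ["DE", 300],
    ["BR", 200],
  ]);

  new google.visualization.PieChart(document.getElementById("pie_div"))
    .draw(data, { title: "Visits by country" });
  new google.visualization.BarChart(document.getElementById("bar_div"))
    .draw(data, { title: "Visits by country" });
}
```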
As a disadvantage, I'd like to mention security. Because it's all service-based, it is not so transparent what happens when you pass data into these API calls. And as far as I know, the API is free but not open source, so I can't really check what is going on behind the covers.
I think the Google Visualization API really shines if you want to very quickly whip up a visualization gadget for use in a blog or so, and you are not interested in deploying all kinds of plugins and libraries (for example, with jQuery-based frameworks, you may need to manage multiple JavaScript libraries that work together to deliver the goods). If, on the other hand, you are creating an application that you want to sell, you might want to keep more control over which components you are using, and I would probably consider using something like Flot.
But like I said, I am only evaluating at the moment; I am not using this in production.
Works really great for me. Can be customized fairly easily. Haven't seen any scaling issues. No data is exposed so security should not be an issue. - Arunabh Das
One point I want to add here is that the Google Visualization API cannot be downloaded; it's not available for offline usage. So an application that uses it must always be connected to the internet, otherwise I think it won't be able to render charts. Due to this limitation, this API cannot be used in some applications for which an internet connection is not available.
I am currently working on a web-based application that will have the Google Visualization API added to it, and from a developer's perspective the API is very limited in what you can do with each individual chart. If I had a choice, I would probably look at dojox charting, just because of the extra flexibility that framework gives you.
If you are building any kind of large web application that will use charting extensively, then I would not recommend the Google Visualization API; it does not have enough flexibility for a large web application.
I am using the Google Visualization API and I want to stress that they still won't let you download it, which means that if their servers are down, your app will be down if it depends on it. I have been using it for about 4 months, and they have crashed on me once, so I'd say they're pretty reliable, and their documentation is really nice.
