Distributed rate limiting algorithm [closed] - algorithm

It's difficult to tell what is being asked here. This question is ambiguous, vague, incomplete, overly broad, or rhetorical and cannot be reasonably answered in its current form. For help clarifying this question so that it can be reopened, visit the help center.
Closed 10 years ago.
I am working on a pricing platform on which I have to implement a distributed rate limiting algorithm. I have k gateways that provide x services. Any gateway can provide any service (via a load balancer). A customer buys a number of calls per second to a service, and their calls can be routed through any gateway. So, does anybody know a good algorithm for updating call counters on all gateways in order to limit customer calls?
Two important indicators, regarding this algorithm, are the network overhead and the deviation between the number of accepted calls and the rate limit.
Thanks!
Edit
I just want to know if there is a "well-known" algorithm.

I've implemented a solution based on this article (archive.org). I think the algorithm is called Leaky Bucket, and it works fine. It's not perfect, since it allows the entire quota to be used in a burst, but overall it's very fast with node.js and Redis. The difference between the number of accepted requests and the rate limit can be quite high and depends on the ratio between the sample window and the bucket size.
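A minimal sketch of the shared-counter idea, assuming a fixed-window counter rather than the exact algorithm from the article. `FixedWindowLimiter`, its parameters, and the dict standing in for the shared store are hypothetical names. In a Redis deployment the read-then-write below would need to be a single atomic increment (for example `INCR` plus an expiry) so that all gateways can update the same counter safely.

```python
import time


class FixedWindowLimiter:
    """Counts calls per (customer, service) in fixed time windows.

    Every gateway is assumed to talk to the same `store`, so the limit
    applies to the customer as a whole and not to a single gateway.
    """

    def __init__(self, limit, window_seconds, store=None, clock=time.time):
        self.limit = limit              # calls allowed per window
        self.window = window_seconds    # window length in seconds
        self.store = store if store is not None else {}  # stand-in for Redis
        self.clock = clock              # injectable so the logic can be tested

    def allow(self, customer, service):
        # All calls in the same window share one counter key.
        window_index = int(self.clock() // self.window)
        key = (customer, service, window_index)
        # NOT atomic here: with a real shared store, use one atomic increment.
        count = self.store.get(key, 0) + 1
        self.store[key] = count
        return count <= self.limit


# Two gateways sharing one store enforce a single limit of 2 calls/second.
shared = {}
fake_now = [0.0]
gateway_a = FixedWindowLimiter(2, 1.0, store=shared, clock=lambda: fake_now[0])
gateway_b = FixedWindowLimiter(2, 1.0, store=shared, clock=lambda: fake_now[0])
results = [gateway_a.allow("cust", "svc"),
           gateway_b.allow("cust", "svc"),
           gateway_a.allow("cust", "svc")]
```

Like the solution described above, this lets a customer spend the whole quota at the start of a window, so the short-term rate can exceed the limit around window boundaries. One round trip to the store per call is also the main source of network overhead.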

Related

Is a caching layer needed when using a nosql data store? [closed]

Closed 10 years ago.
Do we really need a caching layer when using a NoSQL datastore? I understand the use case with a traditional SQL database: the overhead of query processing can be avoided by a simple key/value lookup. With NoSQL, that no longer applies. Also, the cache almost always runs on separate instances, which means it has the same network delay as accessing the NoSQL datastore.
Thanks!
Caching is simply a tool for performance optimization and should be treated as such. This means doing some load testing to see what gains (if any) your performance enhancements give.
Most NoSQL servers do make claims to be much faster than traditional RDBMS but only testing it out will tell you if they're faster for your applications and infrastructure.

aws - How many users can a small ec2 serve [closed]

Closed 10 years ago.
If you take an average dynamic web site, what would be the peak number of users that one small EC2 instance could serve concurrently? Please don't send "it depends..." answers; I just need the crudest estimate. Thanks.
Well... it depends! :) I am sure you can serve a great many static images with a high-performance web server like nginx. But you will only be able to serve a small number of users if you run a complete Java enterprise stack.
There are so many factors involved that nobody can give even the crudest estimate. Some points to consider are your app, the processing it does, how many resources it needs, your server infrastructure... there are too many variables to give a correct answer.
Therefore I suggest the following: develop a comparable set of test tools. Try to mimic the load pattern of your users as closely as possible (it would, for example, be possible to replay an Apache access log). Measure how many requests you are able to serve. Tune your config, measure again. Change servers, measure again. This is the only way to get any results.
Tools include Siege, multi-mechanize, ab and probably a lot more.
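As a rough illustration of the log-replay idea, here is a minimal sketch that pulls the requests out of Apache access log lines and pushes them through a caller-supplied function while timing them. `parse_requests`, `replay`, and the `send` callback are hypothetical names, and the regular expression assumes the common/combined log format. It replays sequentially, so it models a single client; the tools listed above issue concurrent requests and are the better choice for real measurements.

```python
import re
import time

# Matches the quoted request in an Apache common/combined log line, e.g.
# 127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326
REQUEST_RE = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*"')


def parse_requests(log_lines):
    """Yield (method, path) for every line that contains a request."""
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match:
            yield match.group("method"), match.group("path")


def replay(log_lines, send, clock=time.perf_counter):
    """Send each logged request via send(method, path).

    Returns (number of requests, requests per second).
    """
    start = clock()
    count = 0
    for method, path in parse_requests(log_lines):
        send(method, path)
        count += 1
    elapsed = clock() - start
    rate = count / elapsed if elapsed > 0 else float("inf")
    return count, rate
```

In practice `send` would issue the HTTP request against the server under test; rerunning the same replay after each config or server change gives comparable numbers.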

TDD Naïve Text Search Algorithm [closed]

Closed 10 years ago.
I need to test-drive a naïve string search algorithm.
http://en.wikipedia.org/wiki/String_searching_algorithm
Can someone shed some light on how I could approach the issue?
Should my tests only test external behaviour (i.e. the indexes at which the pattern occurs, irrespective of the algorithm used)?
Or should I be algorithm specific and test drive algorithm specific implementations?
Or should I be algorithm specific and test drive algorithm specific implementations?
This largely depends on how your class will be used. Testing the public contract is usually the way to go (and it's fairly easy to write decent tests for that), so unless your clients can somehow make use of knowledge of the implementation details, I'd stick to that.
Note that having the specific algorithm on paper can help you pinpoint a few basic tests, without writing strictly implementation-related tests, such as:
invalid input (empty strings, nulls)
input being too large/too small (like, pattern exceeding searched string length - what do you do then?)
valid input, yet matching nothing
This should give you a basic entry point for more implementation-specific testing. Keep in mind that data-driven testing can help you avoid needing implementation-level knowledge altogether, and with a large enough data set it might be enough to verify the algorithm's correctness as well.
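A minimal sketch of what contract-level, data-driven tests could look like. `find_all`, `CASES`, and `run_cases` are hypothetical names, and the choices to reject an empty pattern and to return an empty list for an over-long pattern are assumptions made for the example, not requirements of the algorithm.

```python
def find_all(text, pattern):
    """Naive search: return every index at which pattern occurs in text."""
    if text is None or pattern is None:
        raise TypeError("text and pattern must be strings")
    if pattern == "":
        raise ValueError("pattern must not be empty")
    hits = []
    # Try every possible start position; the range is empty when the
    # pattern is longer than the text.
    for start in range(len(text) - len(pattern) + 1):
        if text[start:start + len(pattern)] == pattern:
            hits.append(start)
    return hits


# Data-driven cases: (text, pattern, expected indexes).
CASES = [
    ("abracadabra", "abra", [0, 7]),
    ("aaaa", "aa", [0, 1, 2]),   # overlapping matches
    ("abc", "x", []),            # valid input, yet matching nothing
    ("ab", "abc", []),           # pattern exceeding the searched string
    ("", "a", []),               # empty text
]


def run_cases(search):
    """Check any search function against the table, ignoring how it works."""
    for text, pattern, expected in CASES:
        assert search(text, pattern) == expected, (text, pattern)
```

Because `run_cases` only looks at inputs and outputs, the same table can be reused unchanged if the naive loop is later swapped for a different algorithm.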

Designing a parallel algorithm for DDOS prevention? [closed]

Closed 11 years ago.
My multicore machine receives packets and distributes them evenly (sort-of round robin) among its cores. Each core should decide whether to let the packet pass or drop it, depending solely on its origin address.
I need to find a lock-less algorithm and data structure to allow this. Can you help?
If you are happy to use Java, or to look at the design of the Java source, you could choose a random key and then retrieve a queue from http://docs.oracle.com/javase/1.5.0/docs/api/java/util/concurrent/ConcurrentHashMap.html. Given the queue, you could add the packet to it without blocking if it were a http://docs.oracle.com/javase/1.5.0/docs/api/java/util/concurrent/ConcurrentLinkedQueue.html.
Much of java.util.concurrent is due to Doug Lea, who has information on it at http://gee.cs.oswego.edu/dl/concurrency-interest/index.html.
Possibly overkill for your particular problem, but might satisfy a general request for info on data structures in this area.

Algorithm to check consistency? [closed]

Closed 11 years ago.
Assume:
There are a hundred students, and each one of them is working on a common project.
Ideally, being consistent implies that a student works on the project at least once every day.
If we have data like this:
Student 1 work day 1(worked)
day 2(worked)
day 3(took a break)
etc
Now, is there any algorithm that can be used to check and rank students based on consistency?
EDIT:
This is not a homework problem. I am developing a plugin in Java that rates group work according to consistency, so I was wondering if there is an algorithm that can accurately measure consistency. I was thinking about using standard deviation, but if there is something more precise, it would help.
I believe the quantity you are looking for is called variance. It describes consistency if you use, say, the amount of time each student works each day.
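A minimal sketch of ranking by variance, assuming each student's record is a list of hours worked per day (0 for a day off). `rank_by_consistency` and the sample data are hypothetical.

```python
from statistics import pvariance


def rank_by_consistency(hours_by_student):
    """Return student names ordered from most to least consistent.

    hours_by_student maps a name to the hours worked on each day.
    Lower variance means steadier daily effort.
    """
    scored = {name: pvariance(hours) for name, hours in hours_by_student.items()}
    return sorted(scored, key=lambda name: scored[name])


sample = {
    "student 1": [2, 2, 0, 2],   # one break
    "student 2": [1, 1, 1, 1],   # same effort every day
    "student 3": [6, 0, 0, 0],   # everything on the first day
}
ranking = rank_by_consistency(sample)
```

Note that a student who never works also has zero variance, so the score probably needs to be combined with something like the share of days worked before it is used for rating.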
