How is this website is still rate limiting despite me using multiple proxies? - proxy

I have 10 proxies servers in AWS EC2, created using squid proxy.
I have also set forwarded_for to off in this, to ensure that is can't be detected as a proxy.
The load balancer (nginx in EC2) is using a round robin algorithm to ensure equal distribution among proxies among requests.
I am ending requests to a public API.
I have tested my proxies by creating my own server (python flask) and checking the requests....request.remote_addr are the ip addresses of the proxy servers as expected and is using round robin.
There are no headers which include the source ip address. And in fact, by looking at request.dict in python, I cannot see the source IP address anywhere.
How is the API still rate limiting me? The rate limit error occurs after the same number of requests whether I use proxies or not.

Related

Load testing a web app which has a load balancer

I wrote a Jmeter test (that uses different user credentials) to load test a web app which has a load balancer and all it forwards the requests to a single node. How can I solve this?
I used the DNS Cache manager but that did not work.
Are there any other tools which I could use? (I looked into AWS Load testing but that too won't work because all the containers would get the same set of user credentials and when parallel tests are run they would fail.)
It depends on the load balancing mechanism used in your load balancer, it might be the case it's looking into the source IP address and forwarding requests from the same IP to the same backend node. You can try using multiple IP addresses (or aliases) and see whether it makes the difference. See IP Spoofing With JMeter: How to Simulate Requests from Different IP Addresses article for more details.
Also adding DNS Cache Manager might be not sufficient, you can try configurign a custom DNS resolver, i.e. 1.1.1.1 as the DNS server so each thread would resolve the underlying IP address on its own

Can I use CloudFlare's Proxy with an AWS Application Load Balancer?

I have created a listener on the LB (load balancer) with rules so that requests to different subdomains. I have set a CNAME for each subdomain in Cloudflare.
The problem is when I try to use Proxy feature, when I turn it off, my page works without a problem, but when i turn it on, it results in a time out.
There is a way to use Proxy feature with an LB?
The very simple answer is yes, this is very possible to do.
The longer answer is that a timeout indicates that some change in the request by Cloudflare creates a timeout for your AWS servers. Just a couple of examples could be:
A firewall rule that times out Cloudflare's IPs
Servers are not respecting X-Forwarded-For header and all requests from a small group of IPs are messing with application logic
It would help to know if the requests are reaching your load balancer and furthermore if the servers behind the LB are receiving the requests.

Google Compute Engine load balancing keep session

I have 2 TomEE servers on google's machines. they both serves the same application.
The web application have login page with jaas. an both servers works with the same DB.
when Im tring to access the servers separatly everything works fine.
but when I try to access via the load-balancer It's look like the load balancer hopping my requests between the two servers and therefore my web app not working well since the VM that I didn't login to rejects my requests.
my problem is how to make the session works well when loadbalancing the servers?
You want to look at the sessionAffinity feature of load balancer.
Specifically, per the load balancer target pool docs:
sessionAffinity
[Optional] Controls the method used to select a backend virtual machine instance. You can only set this value during the creation of the target pool. Once set, you cannot modify this value. The hash method selects a backend based on a subset of the following 5 values:
Source / Destination IP
Source / Destination Port
Layer 4 Protocol (TCP, UDP)
Possible hashes are:
NONE (i.e., no hash specified) (default)
5-tuple hashing, which uses the source and destination IPs, source and destination ports, and protocol. Each new connection can end up on any instance, but all traffic for a given connection will stay on the same instance if the instance stays healthy.
CLIENT_IP_PROTO
3-tuple hashing, which uses the source and destination IPs and the protocol. All connections from a client will end up on the same instance as long as they use the same protocol and the instance stays healthy.
CLIENT_IP
2-tuple hashing, which uses the source and destination IPs. All connections from a client will end up on the same instance regardless of protocol as long as the instance stays healthy.
5-tuple hashing provides a good distribution of traffic across many virtual machines. However, a second session from the same client may arrive on a different instance because the source port may change. If you want all sessions from the same client to reach the same backend, as long as the backend stays healthy, you can specify CLIENT_IP_PROTO or CLIENT_IP options.
In general, if you select a 3-tuple or 2-tuple method, it will provide for better session affinity than the default 5-tuple method, but the overall traffic may not be as evenly distributed.
Caution: If a large portion of your clients are behind a proxy server, you should not use CLIENT_IP_PROTO or CLIENT_IP. Using them would end up sending all the traffic from those clients to the same instance.

Algorithm/Method to save costs of running filtering proxy on AWS EC2

I am trying to set up a Squid Proxy combined with DansGuardian Content filtering engine on EC2. I will be filtering traffic from mobile(IOS/Android) clients via this filtered proxy but that could mean a lot of traffic flowing through my system, since I will have to route all of the traffic through the DNS, which inturn could mean a lot Amazon EC2 costs!. Is there a known method/standard in which I can direct only known blacklisted traffic via this proxy in a cost effective manner?. Things I have explored include creating blacklists on the device and filtering right there , but that might mean I have to keep going back and changing (adding or removing sites) and this is not really feasible anyway.

Getting (non-HTTP) Client IP with load-balancer

Say I want to run something like the nyan cat telnet server (http://miku.acm.uiuc.edu/) and I need to handle 10,000 concurrent connections total. I have 10 servers in addition to a load balancer. Each server can handle 1,000 concurrent connections, and I want to put a load balancer in front of it to randomly divide the traffic to the 10 servers.
From what I've read, it's fairly simple for a load balancer to pass an HTTP request (along with the client IP) to the backend server, perhaps with FastCGI or with an X- header.
What would be the simplest way for the load balancer to pass the client IP to the backend server in this case with a simple TCP server? Would a hardware load balancer be needed, or are there ways to do this simply through software?
In other words, is there a uniform way to pass client IP when load balancing for non-HTTP stuff? The same way Google gets client IP when they load-balances Google Talk XMPP server or their Gmail IMAP server
This isn't for anything in specific; I'm just curious about if and how it can be done. Thanks in advance!
The simplest way would be for the load balancer to make itself completely invisible and pass the connection on with the source and destination IP address unmolested. For this to work, the same IP address must be assigned (as a loopback address, not to a physical interface) to all 10 servers and that would be the IP address the clients connect to. Internet traffic to that IP address has to go to the load balancer. The load balancer must be the default gateway for the servers.

Resources