How to cache YouTube videos? - caching

There are times I watched a particular video multiple times, Is there a way to cache videos on the DNS level thus whenever I open a video which is cached, the DNS server loads the video from the local server thus having no change on the client side.

DNS server resolves and can cache only the ip address of the server, not the content hosted on the ip address.
For caching the content of a given url, you need to use some proxy cache server eg. squid.

Related

Laravel Request IP Address: will Requests coming from VPNs show the same IP address or not?

Currently I am developing an HTTP server and I am using the throttle (access limitation per minute) functionality of Laravel based on IP address.
However I am afraid that when a VPN and/or Proxy Server is used by different people the incoming request will show the same IP address. The rate limitation is included only to prevent dedicated DOS attacks and I don't want the user of my website to be blocked by rate limitation if they are using a VPN.
First of all, I don't have a solid understanding of how IP addresses are obtained and stored in the Request object. I assume it is included in the HTTP request header however I wasn't able to find it in Google Chrome's developer tool, "Network" tab. The developer tool only shows the destination address and not the source ip address in the "Request Header" session.
Next, I don't have a testing environment where I can test whether the IP address will be the same when sending by different machines using the same VPN, hence I have to ask the question here.
Any help would be appreciated.
will Requests coming from VPNs show the same IP address or not?
Yes, it will show up as the same IP address as this is the whole purpose of using a VPN service, to change the user IP address.
However, if you want to detect if a user is using VPN there are third-party services to help you with that https://ipinfo.io/

DNS solution for Dante SOCKS proxy

I am trying to build a SOCKS solution for forward proxy. I am using dante SOCKS proxy as I have heard that big companies like google uses it as forward proxy solution.
on the SOCKS server, I am allowing based on FQDN's like google.com:443
Now the problem is, when the client constructs the packet, it tries to resolve google.com and gets X.X.X.X and sends connect request to SOCKS server. Now when the server receives the packets, it tries to reconstruct the packet to send out to internet, the server again does DNS resolution and if the server gets response as Y.Y.Y.Y, then it doesn't allow client's request as the destination IP in the client's request is different then the server's resolved IP address.
There was a solution in dante client which tells client to put a dummy destination address 0.0.0.1 and sends request to server and server processes it properly then. However that is creating a problem with internal domains as after using that dns resolution method, every requests goes through dante server :(
Please let me know
If there is any solution through which would help me in maintaining a DNS record expiry DC wide for e.g. google.com resolves to X.X.X.X and I should be able to resolve to this same IP address on 100's of DNS client and in case if the record changes, then it should immediately change/expire on client.
Any other proxy/socks solution which should be transparent to applications for forward proxy
I went ahead with this solution in case anyone is curious to see the solution.
I used PowerDNS Auth Server with Pipe backend. The requests would land to PowerDNS server for resolution, it will pass on all the data to Pipe backend script with ABI, the script analysis the requests, sees if it is present under cached variable/memory map, if it is cache hit, it will respond using cached DNS records else it will use a DNS resolver to resolve that query like a resolver resolves normally.
PowerDNS version lower than 4.1 supports Pipe backend + resolver. This way, the request would first land to pipe backend script, if the script doesn't have any entries cached, it will not respond or will respond blank and then PowerDNS would resolve it with the mentioned resolver server in the configuration. However with version 4.1 and above, the resolver part is removed from PowerDNS Auth server hence you need to handle that behaviour via Pipe backend script.
It depends on your client. Firefox, for example, sends hostname to SOCKS proxy without resolving it. You can confirm that by Wireshark.
PS. assume you are using a SOCKS5/4a proxy. SOCKS4 does not support hostname. Ref: https://en.wikipedia.org/wiki/SOCKS#SOCKS4a

How does the proxy mechanism work with proxy settings in browser

We often find columns like Address, Port in web browser proxy settings. I know when we use proxy to visit a page, the web browser request the web page from the proxy server, but what I want to know is how the whole mechanism works? I have observed that many ISP allow only access to a single IP(of their website) after we exhausted our free data usage. But when we enter the site which we wants to browse in proxy URL and then type in the allowed IP, the site get loaded. How this works?
In general, your browser simply connects to the proxy address & port instead of whatever IP address the DNS name resolved to. It then makes the web request as per normal.
The web proxy reads the headers, uses the "Host" header of HTTP/1.1 to determine where the request is supposed to go, and then makes that request itself relaying all remaining data in both directions.
Proxies will typically also do caching so if another person requests the same page from that proxy, it can just return the previous result. (This is simplified -- caching is a complex topic.)
Since the proxy is in complete control of the connection, it can choose to route the request elsewhere, scrape request and reply data, inject other things (like ads), or block you altogether. Use SSL to protect against this.
Some web proxies are "transparent". They reside on a gateway through which all IP traffic must pass and use the machine's networking stack to redirect outgoing connections to port 80 to a local port instead. It then behaves the same as though a proxy was defined in the browser.
Other proxies, like SOCKS, have a dedicated protocol that allows non-HTTP requests to be made as well.
There are 2 types of HTTP proxies, there are the ones that are reversed and the ones that
are forward.
The web browser uses a forward proxy, basically it is sending all http traffic through the proxy, the proxy will take this traffic out to the internet. Every http packet that comes out from your computer, will be send to the proxy before going to the target site.
The ISP blocking does not work when using a proxy because, every packet that comes out from your machine is pointing to the proxy and not to the targe site. The proxy could be getting internet through another ISP that has no blocks whatsoever.

Loadbalancing web sockets

I have a server which supports web sockets. Browsers connect to my site and each one opens a web socket to www.mydomain.example. That way, my social network app can push messages to the clients.
Traditionally, using just HTTP requests, I would scale up by adding a second server and a load balancer in front of the two web servers.
With web sockets, the connection has to be directly with the web server, not the load balancers, because if a machine has a physical limit of say 64k open ports, and the clients were connecting to the load balancer, then I couldn't support more than 64k concurrent users.
So how do I:
get the client to connect directly to the web server (rather than the load balancer) when the page loads? Do I simply load the JavaScript from a node, and the load balancers (or whatever) randomly modifies the URL for the script, every time the page is initially requested?
handle a ripple start? The browser will notice that the connection is closed as the web server shuts down. I can write JavaScript code to attempt to reopen the connection, but the node will be gone for a while. So I guess I would have to go back to the load balancer to query the address of the next node to use?
I did wonder about the load balancers sending a redirect on the initial request, so that the browser initially requests www.mydomain.example and gets redirected to www34.mydomain.example. That works quite well, until the node goes down - and sites like Facebook don't do that. How do they do it?
Put a L3 load-balancer that distributes IP packets based on source-IP-port hash to your WebSocket server farm. Since the L3 balancer maintains no state (using hashed source-IP-port) it will scale to wire speed on low-end hardware (say 10GbE). Since the distribution is deterministic (using hashed source-IP-port), it will work with TCP (and hence WebSocket).
Also note that a 64k hard limit only applies to outgoing TCP/IP for a given (source) IP address. It does not apply to incoming TCP/IP. We have tested Autobahn (a high-performance WebSocket server) with 200k active connections on a 2 core, 4GB RAM VM.
Also note that you can do L7 load-balancing on the HTTP path announced during the initial WebSocket handshake. In that case the load balancer has to maintain state (which source IP-port pair is going to which backend node). It will probably scale to millions of connections nevertheless on decent setup.
Disclaimer: I am original author of Autobahn and work for Tavendo.
Note that if your websocket server logic runs on nodejs with socket.io, you can tell socket.io to use a shared redis key/value store for synchronization.
This way you don't even have to care about the load balancer, events will propagate among the server instances.
var io = require('socket.io')(3000);
var redis = require('socket.io-redis');
io.adapter(redis({ host: 'localhost', port: 6379 }));
See: Socket IO - Using multiple nodes
But at some point I guess redis can become the bottleneck...
You can also achieve layer 7 load balancing with inspection and "routing functionality"
See "How to inspect and load-balance WebSockets traffic using Stingray Traffic Manager, and when necessary, how to manage WebSockets and HTTP traffic that is received on the same IP address and port." https://splash.riverbed.com/docs/DOC-1451

Setting up a Server as a CDN

We've got a server and domain we use basically now as a big hard drive with video files and images (hosted by MediaTemple). What would it take to setup this server and domain as a CDN?
I saw this article:
http://www.riyaz.net/blog/how-to-setup-your-own-cdn-in-30-minutes/technology/890/
But that looks to be aliasing to the box, and not actually moving the content. Our content is actually hosted on a different box.
One of the tenets of a CDN is that content is geographically close to the client - if you only have one CDN server (rather than several replicated servers), it's not a CDN.
However, you can still get some of the benefits of a CDN. Browsers will typically only fetch 8 resources in parallel from any given hostname. You can give your 'CDN' server several subdomain hostnames and round-robin requests.
www1.example.com
www2.example.com
www3.example.com
...
This will effectively triple the number of concurrent requests a browser will make to your server, as it will see the three hostnames as three separate web servers.
Its basically like you creating a "best route possible" server for your client.
What you basically does is putting multiple IP addresses to one HOSTNAME example.
non static content *(Dynamic pages) are on WWW.Example.Com
Whereas the JSP,AVI etc are stored on media.cdn.example.com
media.cdn.example.com looks up as 1.2.3.4;8.8.9.9;103.10.4.5;etc
so the router on the user end will find nearest to that location and that will be your cdn.
another way is to force content be served using a certain route, and as such, pushes the router to do the same.

Resources