AWS CloudFront CDN alternate domain using CNAMEs - browser caching

Does a web browser cache files based on what is shown in the URL bar, or by where the file actually comes from?
Consider the following two CloudFront distributions:
distro1.cloudfront.net
distro2.cloudfront.net
A CNAME record points www.foo.com to distro1.cloudfront.net.
Suppose I change the CNAME to point to distro2.cloudfront.net instead.
The source is changing, but the address in the URL bar is not...
Will browsers notice the different source and request a fresh file, or just load the cached version (assuming they have a cached version)?
Thank You!!
-C

A browser should not notice that the IP address is different and decide the locally-cached object needs to be refreshed. If it does notice... that is a broken implementation. Browsers cache objects by URL, not by where the bytes happened to come from.
A web site can have many, many different IP addresses, all at the same time, all with the same content... and, conversely, a single IP address can have many, many different web sites behind it. Either way, the underlying IP address, and any intermediate CNAME targets, are implementation details that the browser has to remain unaware of for caching purposes.
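Roughly speaking, an HTTP cache keys its entries on the request URL (plus any headers named in Vary), never on the resolved IP address or CNAME chain. A minimal Python sketch of that idea, with a made-up cache_key helper:

```python
from urllib.parse import urlsplit

def cache_key(url):
    """Derive a cache key from the URL alone; the IP address or CNAME target
    the browser happens to resolve never enters into it."""
    parts = urlsplit(url)
    key = f"{parts.scheme}://{parts.netloc}{parts.path or '/'}"
    if parts.query:
        key += "?" + parts.query
    return key

# Whether www.foo.com currently points at distro1 or distro2,
# a cached copy of this URL is found under the same key.
print(cache_key("https://www.foo.com/assets/app.js"))
```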

Related

My website is not reflecting the CSS changes I made

This issue is connected with one specific domain. I have 10+ more domains from the same registrar. This one domain is on a different webhosting account than the rest of the domains (the same webhosting company, though).
Whenever I make changes to the CSS, the changes are not reflected until I change my IP address via VPN. Even then, it only refreshes once, and then I need to change the IP again to see the next change. Sometimes not even that helps.
This happens on different internet networks.
The website runs on WordPress, but I have tested it with a separate set of files outside of WordPress.
Does anyone have a clue what it may be and how could it be resolved? Thank you!
I have tried both broadband and a mobile network, and it's the same scenario on both. This makes me believe that it's not a router or device issue (local cache). It goes without saying that I have cleared the cache and DNS multiple times.
One thing to mention is that all of my domains run through Cloudflare - yet only one is affected.
My webhosting company is not very helpful this time and has only checked whether my IP is blocked, which I think is useless given the above scenario.
All of my other 10+ domains reflect the changes immediately, even without clearing the cache.
Just in case anyone is experiencing something similar: it was due to Cloudflare. I set the nameservers to point directly to the hosting provider and that fixed the issue.
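If you would rather keep Cloudflare in front of the site instead of bypassing it, purging its cache after deploying CSS changes is usually enough to make them visible. A rough sketch against the Cloudflare v4 purge endpoint; the zone ID and API token are placeholders, and the token is assumed to have cache-purge permission:

```python
import requests

ZONE_ID = "your-zone-id"      # placeholder
API_TOKEN = "your-api-token"  # placeholder

resp = requests.post(
    f"https://api.cloudflare.com/client/v4/zones/{ZONE_ID}/purge_cache",
    headers={"Authorization": f"Bearer {API_TOKEN}"},
    # Purge everything; alternatively pass {"files": [...]} with the
    # individual stylesheet URLs that changed.
    json={"purge_everything": True},
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```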

Changes to VMs in Availability Set and Load Balancing

I have gone through the whole process of testing and setting up two or even three VMs under availability sets and load-balanced endpoints, and I have noticed that when accessing the domain, different VM instances are loaded (I put different titles on each instance of a CMS web site to test the availability). The main reason I am looking into this is that the current VM/web site has had problems when Windows ran its periodic updates, which at times stopped FTP or changed the server settings.
While this is working almost the way I thought it would, my question is about what happens when the client this will be set up for makes changes to the CMS web site. My thought is that changes made through the CMS would only apply to one VM instance in the availability set, and since the load balancer serves different instances, different changes could end up on each VM in the availability set.
What I am trying to determine, but am not finding anything concrete on, is whether there is a way to set up shared storage or a system that mirrors any changes to each VM so that the web site stays consistent, or whether using an availability set for the current VM and web site is still appropriate.
If anyone can give me some insight that would be great.
Is using the server's file system necessary for the CMS software? Could the CMS software read/write content to/from a database instead?
If using the server's file system is the only option, you could probably set up a file share on one server that all the other servers would work against. This creates the problem, though, that if the primary server (the one containing the file share) goes down for some reason, the site goes down as well.
Another option could be to leverage Web Deploy to help publish the content changes. Here are two blog posts that discuss this further:
http://www.wadewegner.com/2013/03/using-windows-azure-virtual-machines-to-publish-and-synchronize-a-web-farm/
http://michaelwasham.com/2012/08/13/publishing-and-synchronizing-web-farms-using-windows-azure-virtual-machines/
This really depends on the CMS system you're using.
Some CMS systems, especially modern ones, persist settings in shared storage, such as a SQL Server database, and thus any actions that users take in the CMS are stored in that shared storage and available to all web servers hosting the CMS.
Other CMS systems may not be compatible with load-balanced web servers. Doing file sharing/replication/etc of the files stored on local servers may or may not work, depending on the particular CMS and its architecture. I would really try to avoid this approach.
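To illustrate the shared-storage pattern: every VM in the set reads and writes CMS content through one shared database rather than its local disk, so whichever instance the load balancer picks serves the same content. A rough Python sketch, assuming a hypothetical shared SQL Server reachable from all instances (the connection string, table, and helper names are made up):

```python
import pyodbc

# Hypothetical shared database that every VM in the availability set points at.
CONN_STR = (
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=shared-sql.example.com;DATABASE=cms;UID=cms_app;PWD=secret"
)

def save_page(slug, html):
    """Persist page content in the shared database, not the local file system."""
    conn = pyodbc.connect(CONN_STR, autocommit=True)
    try:
        cur = conn.cursor()
        cur.execute("UPDATE pages SET html = ? WHERE slug = ?", html, slug)
        if cur.rowcount == 0:
            cur.execute("INSERT INTO pages (slug, html) VALUES (?, ?)", slug, html)
    finally:
        conn.close()

def load_page(slug):
    """Whichever instance the load balancer picks reads the same content."""
    conn = pyodbc.connect(CONN_STR)
    try:
        row = conn.cursor().execute(
            "SELECT html FROM pages WHERE slug = ?", slug
        ).fetchone()
        return row[0] if row else None
    finally:
        conn.close()
```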

Image storage on different server

I can see that all big sites store their images on a completely different server. What are the benefits of this practice?
Load balancing.
Separation of dynamic and static content.
Static content is served from servers which are geographically (or in network terms) close to the client.
(Update) I forgot to mention that browsers used to limit the number of concurrent requests to the same server or domain (I don't know if that limit still applies), and using different domain names allowed the site to work around this limitation.
This way each kind of server serves the resources it is tuned for, so clients get pages faster.
This way, the browser won't send cookies when requesting images.
It also enables the use of location-aware CDNs for images only.
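To make the cookie point concrete, here is a small sketch using the requests library's cookie jar: a session cookie scoped to the main site is not attached to requests for a separate image host (both hostnames are placeholders):

```python
import requests

jar = requests.cookies.RequestsCookieJar()
# Session cookie scoped to the main site only.
jar.set("session", "abc123", domain="www.example.com", path="/")

# The jar returns the cookie for the main site...
print(jar.get("session", domain="www.example.com"))     # abc123
# ...but nothing for the separate image host, so image requests
# carry no cookie bytes at all.
print(jar.get("session", domain="images.example.com"))  # None
```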

Advantages of having css, js and media subdomains

What are the benefits of separating css, js and media folders under subdomains like
css.domain-name.com
js.domain-name.com
media.domain-name.com
I know that scalability begins with static/media files, but does serving them from a subdomain have any advantage?
If so, to what degree should I do that? For example, if I allow photo uploads, should I put my "uploads" folder under the media subdomain?
Thanks
I'd separate uploads from the static files used in the generic layout (e.g. logos, icons, etc.) so it's a lot easier to clear the existing files when uploading a new design, without having to worry about the uploads being deleted or overwritten.
As for the domain names, I wouldn't split the files that way. One subdomain for static files, one for uploads - fine. But I wouldn't go as far as adding one for scripts or stylesheets.
Using subdomains can have advantages, though: depending on the web server, you can configure the whole virtual host to adhere to specific rules, e.g. not providing directory listings, not allowing access to any files other than images, or refusing to deliver hotlinked files (without having to worry about specific subdirectories). It can also make it easier to move the files to another host later on, e.g. moving media files or downloads to a cloud hosting service.
Considering your example I'd use the following sub domains:
www.domain-name.com (basic web presence)
static.domain-name.com or media.domain-name.com (serving support files like js, css, images, etc. - stuff that doesn't change and can be cached for a long time)
uploads.domain-name.com (serving uploaded files)
Don't overcomplicate it, as you're not gaining any additional performance that way (unless you're utilizing different servers and expecting heavy load). In fact, page load might be slower (due to additional DNS lookups) and you might run into security limitations, e.g. regarding cookie validity/accessibility or cross-domain scripting.
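As a small illustration of that layout, a site might build its asset URLs with helpers like these (the hostnames and function names are just examples):

```python
# Example hosts following the layout above; both are placeholders.
STATIC_HOST = "https://static.domain-name.com"
UPLOADS_HOST = "https://uploads.domain-name.com"

def static_url(path):
    """URL for long-cached layout assets (css, js, logos, icons)."""
    return f"{STATIC_HOST}/{path.lstrip('/')}"

def upload_url(path):
    """URL for user-uploaded files, kept apart so a redesign never touches them."""
    return f"{UPLOADS_HOST}/{path.lstrip('/')}"

print(static_url("css/site.css"))    # https://static.domain-name.com/css/site.css
print(upload_url("photos/123.jpg"))  # https://uploads.domain-name.com/photos/123.jpg
```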
There are mainly two reasons for doing this:
Scaling - static content and dynamic content have different scaling parameters, so the more you separate the web servers serving dynamic and static content, the more you can scale each according to your website's requirements. E.g. if you host a photo site, you may end up with 10 times more static servers than dynamic ones. Static servers are usually much more lightweight than full-featured application servers.
Cookies - cookies are only sent to the domain they are assigned to, so cookies set for e.g. www.xyz.com will not be sent to sub.xyz.com.
It probably makes no sense to go into more detail than static[1-n].xyz.com, but that really depends on what you want to do.
As for your "uploads" folder question: preferably, the images uploaded to your main domain would be served by a static server (serving content on your subdomain).
For JS, this seems like a bad idea. There are security restrictions on what you can do with JS when dealing with a different domain, which would be the case here.
For media files and downloads - basically BLOB storage - the story is a bit different. For high-traffic sites, separating these may be required in order to create a good load-balancing structure and to avoid putting unnecessary load on the web application servers (the media subdomain can point to different servers, reducing the load on the app servers while allowing massive load balancing for just the binary data serving).

1 A-record for every subdomain (10000+); any potential issues? Any other solution?

Most solutions I've read here for supporting subdomain-per-user at the DNS level are to point everything to one IP using *.domain.com.
It is an easy and simple solution, but what if I want to point the first 1000 registered users to serverA and the next 1000 registered users to serverB? This is the preferred approach for us, to keep our software and hardware clustering costs down.
[Diagram quoted from the MS IIS site: http://learn.iis.net/file.axd?i=1101]
The most logical solution seems to be one A record per subdomain in the zone data files. BIND doesn't seem to have any size limit on zone data files; they are only restricted by available memory.
However, my team is worried about the latency of getting a new subdomain up and ready, since creating a new subdomain consists of inserting a new A record and restarting the DNS server.
Is the performance impact of restarting the DNS server something we should worry about?
Thank you in advance.
UPDATE:
It seems most of you suggest I use a reverse proxy setup instead:
[Diagram quoted from the MS IIS site: http://learn.iis.net/file.axd?i=1102 - ARR is IIS 7's reverse proxy solution]
However, here are the CONS I can see:
single point of failure
cannot strategically set up servers in different locations based on IP geolocation.
Use the wildcard DNS entry, then use load balancing to distribute the load between servers, regardless of which client it is.
While you're at it, skip the URL rewriting step and have your application determine which account it is based on the URL as entered (you can just as easily determine what X is in X.domain.com as in domain.com?user=X).
EDIT:
Based on your additional info, you may want to develop a "broker" that stores which clients should access which servers. Make that public facing, then draw on the resources associated with the client as stored in the broker. Your front end can be load balanced, and you can then pull from the file/DB servers based on who the client is.
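A rough sketch of both ideas in Python: pull the account name out of the Host header (the X in X.domain.com) and look up which back-end server that account lives on in a broker table. The base domain, the mapping, and the helper names are all made up for illustration:

```python
BASE_DOMAIN = "domain.com"  # placeholder base domain

# Hypothetical broker table: which back-end serves which account.
ACCOUNT_TO_BACKEND = {
    "alice": "serverA.internal:8080",
    "bob": "serverB.internal:8080",
}

def account_from_host(host):
    """Return X for a host of the form X.domain.com, else None."""
    host = host.split(":", 1)[0].lower()  # drop any port
    suffix = "." + BASE_DOMAIN
    if host.endswith(suffix):
        sub = host[: -len(suffix)]
        if sub and sub != "www":
            return sub
    return None

def lookup_backend(host):
    """Broker lookup: map the requested host to the back-end that owns it."""
    account = account_from_host(host)
    return ACCOUNT_TO_BACKEND.get(account) if account else None

print(lookup_backend("alice.domain.com"))   # serverA.internal:8080
print(lookup_backend("bob.domain.com:80"))  # serverB.internal:8080
```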
The front-end proxy with a wild-card DNS entry really is the way to go with this. It's how big sites like LiveJournal work.
Note that this is not just a TCP-layer load balancer - there are plenty of solutions that will examine the host part of the URL to figure out which back-end server to forward the query to. You can easily do it with Apache running on a low-spec server with suitable configuration.
The proxy ensures that each user's session always goes to the right back-end server and most any session handling methods will just keep on working.
Also the proxy needn't be a single point of failure. It's perfectly possible and pretty easy to run two or more front-end proxies in a redundant configuration (to avoid failure) or even to have them share the load (to avoid stress).
I'd also second John Sheehan's suggestion that the application just look at the left-hand part of the URL to determine which user's content to display.
If using Apache for the back-end, see this post too for info about how to configure it.
If you use tinydns, you don't need to restart the nameserver when you modify its database, and it should not be a bottleneck because it is generally very fast. I don't know whether it performs well with 10000+ entries, though (it would surprise me if it didn't).
http://cr.yp.to/djbdns.html
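To give a sense of scale: with tinydns, adding the per-user records just means appending "+fqdn:ip:ttl" lines to its data file and rebuilding data.cdb (typically by running make, which invokes tinydns-data); no daemon restart is needed. A rough sketch, with placeholder domain and server IPs and the 1000-users-per-server split from the question:

```python
# Emit one tinydns A record per user subdomain ("+fqdn:ip:ttl" lines).
SERVERS = ["192.0.2.10", "192.0.2.11"]  # placeholder IPs for serverA, serverB
USERS_PER_SERVER = 1000
TTL = 300

def tinydns_lines(usernames):
    for i, user in enumerate(usernames):
        ip = SERVERS[min(i // USERS_PER_SERVER, len(SERVERS) - 1)]
        yield f"+{user}.domain.com:{ip}:{TTL}"

usernames = [f"user{n:05d}" for n in range(2500)]
with open("data", "a") as f:  # tinydns data file
    for line in tinydns_lines(usernames):
        print(line, file=f)
# Afterwards, rebuild data.cdb (e.g. run "make" in the tinydns root directory).
```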
