Backblaze B2 Native URL vs. Friendly URL - transaction caps?

I'm using B2 as storage for the images on a website we're developing.
Since there are millions of images to be served from B2, I'm wondering whether there's any difference between using the native URL and the friendly URL provided by B2 in terms of transaction caps.
See sample URLs below:

There is no difference between those two URLs, because both are classed as Transaction Class B. This means their pricing is identical, and so is the free daily limit.
If you want to reduce your B2 transactions, consider putting CloudFlare in front with caching turned on. Another benefit of CloudFlare is that Backblaze won't charge you for bandwidth when you download through CloudFlare.
https://www.cloudflare.com/bandwidth-alliance/backblaze/
Should you need to download from a private bucket via CloudFlare, try using this CloudFlare Worker script I wrote: https://gitlab.com/snippets/1867981
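For reference, here is a minimal sketch of what such a worker can look like (this is not the linked script; the host, bucket name, and download-authorization token are placeholders, with the token coming from a b2_get_download_authorization call):

// Cloudflare Worker sketch: proxy requests to a private B2 bucket.
const B2_HOST = 'https://f002.backblazeb2.com';       // placeholder
const BUCKET_NAME = 'my-private-bucket';              // placeholder
const AUTH_TOKEN = '<download-authorization-token>';  // placeholder

addEventListener('fetch', event => {
  event.respondWith(handleRequest(event.request));
});

async function handleRequest(request) {
  // Map the incoming path onto B2's friendly download URL scheme.
  const url = new URL(request.url);
  const b2Url = B2_HOST + '/file/' + BUCKET_NAME + url.pathname;

  // Forward the request with B2's authorization header attached and
  // let Cloudflare cache the response to cut down B2 transactions.
  return fetch(b2Url, {
    headers: { Authorization: AUTH_TOKEN },
    cf: { cacheEverything: true }
  });
}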

Related

Can I/should I add CloudFront to my web service running on Elastic Beanstalk?

Current situation
I have a Java Tomcat application running on Elastic Beanstalk. The application is a web service that receives search queries and returns the results in XML format. The web service is only updated with new data once a month, so any query sent at the end of the month will return identical results to one sent at the start of the month.
We take advantage of EB's load balancing, so usually just one EC2 instance is running, but at times of peak usage another EC2 instance may be started.
To allow deployment of new versions to Elastic Beanstalk we have a domain name on Route53, with a subdomain mapped to the EB application; customers use this subdomain to call the web service.
This is working reasonably well, except that peak usage can be considerably higher than normal usage, which means more instances need to be started, increasing cost, and response times slow down even with the extra machine.
Should I use CloudFront
I was wondering if I could use CloudFront to cache these responses. I'm making these assumptions:
There would be fewer peaks and troughs on EB.
It would save me money, assuming CloudFront requests are cheaper than the extra load on EB.
It would improve response times for customers not near my EB server, e.g. my EB server is based in the EU but I have many US customers.
If so, how do I do it
I went to create a CloudFront distribution, but the Origin Domain Name field only listed my S3 buckets, not my EB domain, so I haven't gone any further.
I always put CloudFront in front of any solution I deliver on AWS. In response to your specific questions:
Most likely yes: it would offload some of the work that would otherwise go to an EC2 instance, so it might prevent an extra instance from spinning up sometimes.
Maybe, maybe not. It might save you money, but it's also possible that it could end up costing you a fortune. CloudFront can be abused by a hacker, if for no other reason than to run up a huge bill, so you may want to add a billing alert so you are not surprised.
Yes, in all likelihood it will improve the responsiveness of your website. That's the prime reason I always use it.
CloudFront does allow you to serve dynamic content: http://aws.amazon.com/cloudfront/dynamic-content/ However, from what I read there, it caches query results based on a URL pattern. Would that be compatible with your site's usage?
Information on how to specify an EC2 instance as a CloudFront custom origin can be found here: http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/CustomOriginBestPractices.html
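To make that concrete, here is a hedged sketch of the relevant pieces of a distribution config, written as a JavaScript object (the field names follow the CloudFront API; the domain name and IDs are placeholders):

// Abridged CloudFront distribution config pointing at a custom
// (Elastic Beanstalk / EC2) origin rather than an S3 bucket.
const distributionConfig = {
  Origins: {
    Quantity: 1,
    Items: [{
      Id: 'eb-origin',
      DomainName: 'myapp.eu-west-1.elasticbeanstalk.com', // placeholder
      CustomOriginConfig: {
        HTTPPort: 80,
        HTTPSPort: 443,
        OriginProtocolPolicy: 'match-viewer'
      }
    }]
  },
  DefaultCacheBehavior: {
    TargetOriginId: 'eb-origin',
    ViewerProtocolPolicy: 'allow-all',
    // Forward the query string so each distinct search query is
    // cached as its own object; responses can then live for a day.
    ForwardedValues: { QueryString: true, Cookies: { Forward: 'none' } },
    MinTTL: 0,
    DefaultTTL: 86400
  }
};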

A better solution for hosting static files besides Amazon S3

I made a mobile application in static HTML, which mirrors my WordPress site.
The first version was completely static; all the texts were in the mobile HTML application.
Today I updated my application to pull data from WordPress with AJAX.
The problem is that now, with so many requests being made, the S3 bucket can't keep up.
Despite the page having gone from 6kb to 83kb, it is still slower because of the AJAX calls.
Is it possible to host static applications on some other Amazon service?
For the static content, you should probably be looking at AWS CloudFront instead of S3. As the product page itself puts it:
Amazon CloudFront is a content delivery web service. It integrates with other Amazon Web Services products to give developers and businesses an easy way to distribute content to end users with low latency, high data transfer speeds, and no minimum usage commitments.
Another thing you can leverage is AJAX caching, which will make your page load much faster from the second visit onwards. You may also want to use nginx on your server for caching (this will reduce your server load).
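As an illustration of the AJAX caching idea, here is a small sketch that keeps responses in localStorage so repeat visits render without hitting the network at all; the endpoint and renderPosts() are invented for the example:

// Cache AJAX responses in localStorage so repeat page loads can
// render immediately without waiting on the network.
function cachedGet(url, callback) {
  var key = 'ajax-cache:' + url;
  var cached = localStorage.getItem(key);
  if (cached !== null) {
    callback(JSON.parse(cached));   // served instantly from cache
    return;
  }
  $.getJSON(url, function (data) {
    localStorage.setItem(key, JSON.stringify(data));
    callback(data);
  });
}

// Usage: fetch the WordPress data once, then reuse it.
cachedGet('/api/posts', function (posts) {
  renderPosts(posts);   // renderPosts() is hypothetical
});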

Can Azure CDN propagate changes to all nodes with just one "miss"?

I'm using an Azure CDN endpoint on a hosted service (meaning, not a Blob Storage CDN endpoint).
The service lazily renders images, and once they are rendered they are practically static (I can safely use Cache-Control: public, max-age=31536000).
In the naive implementation, there will be up to** X misses (X times the service will render an image), where X is the number of CDN nodes around the world.
There are two workarounds, as I see it:
The lazy created images are stored in Blob Storage, and later pulled from there.
Implement a cache in the Cloud Service.
Is there a way to propagate files to all nodes? Is there a better solution than having two caching layers (Cloud Service cache / Blob Storage + CDN)?
** "Up to", depending on the geographical location of web requests. In my case, all around the world.
There is currently no way for you to push something to one of the remote CDN nodes. This is a feature many folks have asked Microsoft for in the CDN product.
Both workarounds would work. The first benefits from not having to re-render the images for the other CDN nodes at all, and it reduces the load on the server, since it wouldn't be serving them up again after rendering them the first time. However, if the request has to reach the server anyway before redirecting to a version already in Blob Storage, you could just as easily return the cached image directly. I think it depends on how many images you are talking about: if you have a LOT of them, I'd lean more toward the first option.
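A sketch of the first workaround in Express-style JavaScript (the blob store and render step below are in-memory stand-ins for the real Blob Storage client and rendering code):

// On a CDN miss the request reaches this handler: serve from the
// blob store if the image was already rendered, otherwise render
// it once, persist it, and return it.
const express = require('express');
const app = express();

const blobCache = new Map();                 // stand-in for Blob Storage
const blobStore = {
  get: async id => blobCache.get(id),
  put: async (id, img) => { blobCache.set(id, img); }
};

async function renderImage(id) {
  return Buffer.from('rendered:' + id);      // placeholder for the real render
}

app.get('/images/:id', async (req, res) => {
  let image = await blobStore.get(req.params.id);
  if (!image) {
    image = await renderImage(req.params.id);   // expensive, happens once
    await blobStore.put(req.params.id, image);  // later misses skip the render
  }
  // The long max-age means each CDN node only ever asks once.
  res.set('Cache-Control', 'public, max-age=31536000');
  res.type('image/png').send(image);
});

app.listen(8080);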

Not getting improvements by using CDN

I've just added a CDN distribution using Amazon CloudFront to my Rails application on Heroku, and it's working OK.
My homepage serves around 11 static assets. I've run some tests using http://www.webpagetest.org/ and there is no difference in performance (load times) between using the CDN and not using it.
Is there any particular reason why this could be happening?
My region is Latin America btw, so it's using the All locations edge option.
Thanks.
The main benefit of using a CDN from Amazon or others is that your assets are hosted on fast, reliable servers and the traffic is offloaded from your own server; if you already have a fast dedicated server, you won't see a considerable boost.
Another benefit is that the files may already be cached in the user's browser (from visiting other websites that use the same CDN), so visitors get a better experience even the first time they visit your site.
A couple of suggestions.
If the site CSS is one of the static assets that you have moved to CloudFront, I would try moving it back to your main server.
Since page rendering can't start until the site CSS is downloaded, you want to serve it as fast as possible. If it's coming from a CDN, it requires a DNS lookup and connection to a second domain.
Also, use the waterfall display from webpagetest.org to pinpoint where the bottlenecks are.
Good luck!

Is it better to use Cache or CDN?

I was studying browser performance when loading static files, and this question came up.
Some people say that using a CDN for static files (e.g. Google Code, jQuery latest, AJAX CDN, ...) is better for performance, because the files are requested from a different domain than the web page itself.
Another way to improve performance is to set the Expires header to some months in the future, forcing the browser to cache the static files and cutting down on requests.
I'm wondering which approach is best for performance, and whether I can combine both.
Ultimately it is better to employ both techniques if you are doing web performance optimization (WPO) of a site, also known as front-end optimization (FEO). They work amazingly well hand in hand. Although if I had to pick one over the other, I'd definitely pick caching any day. In fact, I'd say it's imperative that you set up proper resource caching for all web projects, even if you are going to use a CDN.
Caching
Setting Expires headers and caching resources is a must and should be done 100% of the time. There really is no excuse for not doing it. On Apache this is super easy to configure after enabling mod_expires.c and mod_headers.c. The HTML5 Boilerplate project has a good implementation example in its .htaccess file, and if your server is something else like nginx, lighttpd, or IIS, check out these other server configs.
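If your stack is Node rather than Apache, the same far-future caching idea can be sketched with Express (the directory and duration are illustrative):

// Far-future caching for static resources: express.static sets the
// Cache-Control max-age header for everything served from ./public.
const express = require('express');
const app = express();

app.use(express.static('public', { maxAge: '1y' }));

app.listen(8080);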
Here's a good read if anyone is interested in learning about caching: Mark Nottingham's Caching Tutorial
Content Delivery Network
You mentioned Google Code, jQuery latest, and the AJAX CDN. I want to touch on CDNs in general, including those you pay for to host your own resources, but the same applies if you are simply using jQuery's hosted files or loading something from http://cdnjs.com/, for example.
I would say a CDN is less important than setting server-side caching headers, but it can provide significant performance gains. Your results will vary depending on the provider.
This is especially true if you have a worldwide audience and the CDN provider has many worldwide edge locations. It will also reduce your web hosting bandwidth significantly and your CPU usage (a bit), since you're offloading some of the work of delivering resources to the CDN.
A CDN can, in rarer cases, have a negative impact on performance, if the latency to the CDN ends up being higher than to your own server. Also, if you over-optimize and employ too much parallelization of resources (using multiple subdomains like cdn1, cdn2, cdn3, etc.), it is possible to slow down the user experience through the overhead of extra DNS lookups. A good balance is needed here.
One other negative is what happens if the CDN is down. It has happened, and will happen again, and it is more likely with a free CDN. If the CDN goes down for whatever reason, so does your site; it is yet another potential single point of failure (SPOF). For JavaScript resources you can get clever: load the resource from the CDN and, should that fail for whatever reason, detect it and load a local copy. Here's an example of loading jQuery from ajax.googleapis.com with a fallback (taken from the HTML5 Boilerplate):
<script src="//ajax.googleapis.com/ajax/libs/jquery/1.8.2/jquery.min.js"></script>
<script>window.jQuery || document.write('<script src="js/vendor/jquery-1.8.2.min.js"><\/script>')</script>
Besides the obvious free resources out there (jQuery, Google APIs, etc.), if you're using a CDN you may have to pay a usage fee, which adds to hosting costs. With some CDNs you even pay extra to use certain locations; for example, Asian nodes might cost more than North American ones.
For public applications, go for CDN.
Caching helps with repeated requests, but not with the first request.
To ensure a fast load on the first page visit, use a CDN; chances are pretty good that the file is already cached from another site.
As others have mentioned, CDN results are of course heavily cached too.
However, if you have an intranet website, you might want to host the files yourself, as they typically load faster from an internal source than from a CDN.
You then also have the option to combine several files into one to reduce the number of requests, as in the sketch below.
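A minimal build-step sketch of that combining idea (file names are placeholders):

// Merge several scripts into one bundle so the page makes a single
// request; the joining ';' guards against unterminated statements.
const fs = require('fs');

const files = ['js/plugins.js', 'js/main.js'];   // placeholders
const bundle = files
  .map(f => fs.readFileSync(f, 'utf8'))
  .join('\n;\n');

fs.writeFileSync('js/bundle.js', bundle);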
A CDN has the benefit of providing multiple servers and automatically routing traffic to the location closest to your client. This can result in faster, location-optimized delivery.
Also, static content doesn't require special application servers (unlike dynamic content), so offloading it to a CDN removes that traffic from your servers entirely. A streaming video clip may be too big to cache, or should not be cached, but you don't necessarily want to serve that bandwidth yourself; a CDN will take on that traffic for you.
It is not always about the cache. A small application web server may want to provide just the dynamic content while needing a solution for heavy-hitting media that rarely changes. CDNs handle that scaling issue for you.
Agree with @Anthony_Hatzopoulos (+1).
A CDN complements caching; in some cases, it can even help optimize caching directives.
For example, a company I work for has integrated behavior-learning algorithms into its CDN to identify and dynamically cache generated objects.
Ordinarily, these objects would be uncacheable (i.e. they carry a Cache-Control: max-age=0 HTTP header). But in this case, the system is able to identify caching opportunities and override the original HTTP header directives (for example, for a dynamically generated popular product page that should be cached, or a popular search results page that, while generated dynamically, is still presented time after time in the same form to thousands of users).
And yes, before you ask, the system can also identify personalized data and verify freshness, to prevent false positives... :)
Implementing such an algorithm was only possible thanks to reverse-proxy CDN technology. This is an example of how a CDN and caching can complement each other to create better, smarter acceleration solutions.
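To make the override idea concrete, here is a toy reverse-proxy sketch in Cloudflare-Worker-style JavaScript; isPopular() stands in for the (proprietary) behavior-learning heuristic and is entirely invented:

addEventListener('fetch', event => {
  event.respondWith(handle(event.request));
});

async function handle(request) {
  const response = await fetch(request);

  // The origin marks everything uncacheable (Cache-Control: max-age=0),
  // but a learning layer has flagged some generated pages as
  // effectively static, so we override the header for those.
  if (isPopular(request.url)) {
    const patched = new Response(response.body, response);
    patched.headers.set('Cache-Control', 'public, max-age=300');
    return patched;
  }
  return response;
}

function isPopular(url) {
  // Placeholder: a real system would learn this from traffic patterns.
  return url.indexOf('/search?') !== -1 || url.indexOf('/product/') !== -1;
}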
The expert answers above explain CDN technology and caching perfectly, so I'll just add my personal experience.
I worked on a Joomla/VirtueMart site that unfortunately couldn't be upgraded to new Joomla and VirtueMart versions because the product pages had too many customized fields. Once traffic reached 900 visitors a day, many users couldn't put items in their basket, because every order action triggered lots of JS and AJAX calls and took too much time.
After optimizing the site we decided to use a CDN, and performance genuinely improved: according to GTmetrix, the YSlow score went from 50% before to 74% after optimization + CDN.
https://gtmetrix.com/reports/www.florihana.com/jWlY35im
From the CDN's dashboard you can also see which data centers serve the most traffic and incur the most data charges, which is useful marketing insight.
That said, be careful when configuring a CDN: watch the purge/expiry times and keep the number of CDN resources manageable, because if the CDN has a problem you'll need to figure out which resources it was serving.
Hope this helps.
