Performance degradation using Azure CDN?

I have experimented quite a bit with the CDN from Azure, and I thought I was home safe after a successful setup using a web-role.
Why the web-role?
Well, I wanted the benefits of compression and caching headers, which I was unable to obtain the normal blob way. And as an added bonus, the case-sensitivity constraint was eliminated as well.
Enough about the choice of CDN serving; while all content was previously served from the same domain, I now serve more or less all "static" content from cdn.cuemon.net. In theory, this should improve performance, since browsers can fetch content in parallel from "multiple" domains instead of just one.
Unfortunately this has led to a decrease in performance, which I believe has to do with the number of hops before content is served (using a tracert command):
C:\Windows\system32>tracert -d cdn.cuemon.net
Tracing route to az162766.vo.msecnd.net [94.245.68.160]
over a maximum of 30 hops:
1 1 ms 1 ms 1 ms 192.168.1.1
2 21 ms 21 ms 21 ms 87.59.99.217
3 30 ms 30 ms 31 ms 62.95.54.124
4 30 ms 29 ms 29 ms 194.68.128.181
5 30 ms 30 ms 30 ms 207.46.42.44
6 83 ms 61 ms 59 ms 207.46.42.7
7 65 ms 65 ms 64 ms 207.46.42.13
8 65 ms 67 ms 74 ms 213.199.152.186
9 65 ms 65 ms 64 ms 94.245.68.160
C:\Windows\system32>tracert cdn.cuemon.net
Tracing route to az162766.vo.msecnd.net [94.245.68.160]
over a maximum of 30 hops:
1 1 ms 1 ms 1 ms 192.168.1.1
2 21 ms 22 ms 20 ms ge-1-1-0-1104.hlgnqu1.dk.ip.tdc.net [87.59.99.217]
3 29 ms 30 ms 30 ms ae1.tg4-peer1.sto.se.ip.tdc.net [62.95.54.124]
4 30 ms 30 ms 29 ms netnod-ix-ge-b-sth-1500.microsoft.com [194.68.128.181]
5 45 ms 45 ms 46 ms ge-3-0-0-0.ams-64cb-1a.ntwk.msn.net [207.46.42.10]
6 87 ms 59 ms 59 ms xe-3-2-0-0.fra-96cbe-1a.ntwk.msn.net [207.46.42.50]
7 68 ms 65 ms 65 ms xe-0-1-0-0.zrh-96cbe-1b.ntwk.msn.net [207.46.42.13]
8 65 ms 70 ms 74 ms 10gigabitethernet5-1.zrh-xmx-edgcom-1b.ntwk.msn.net [213.199.152.186]
9 65 ms 65 ms 65 ms cds29.zrh9.msecn.net [94.245.68.160]
As you can see from the above trace route, all external content is delayed for quite some time.
It is worth noting that the Azure service is set up in North Europe and I am located in Denmark, which is why this trace route seems a bit .. hmm .. over the top?
Another issue might be that the web-role runs on two extra-small instances; I have not yet found the time to try two small instances, but I know that Microsoft limits extra-small instances to 5 Mbps of WAN bandwidth, whereas small and above get 100 Mbps.
I am just unsure whether this applies to the CDN as well.
Anyway - any help and/or explanation is greatly appreciated.
And let me state that I am very satisfied with the Azure platform - I am just curious about the matters mentioned above.
Update
New tracert without the -d option.
Inspired by user728584, I did some research and found this article, http://blogs.msdn.com/b/scicoria/archive/2011/03/11/taking-advantage-of-windows-azure-cdn-and-dynamic-pages-in-asp-net-caching-content-from-hosted-services.aspx, which I will investigate further with regard to public cache-control and the CDN.
This does not explain the excessive hop count, but I hope a skilled network professional can help cast light on the matter.
Rest assured that I will keep you posted on my findings.

Not to state the obvious, but I assume you have set the Cache-Control HTTP header to a large value, so that your content was not being evicted from the CDN cache and served from Blob Storage when you did your tracert tests?
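For anyone serving from blob storage, that header can be set through the storage client; a minimal sketch, assuming the original Windows Azure .NET storage client library (Microsoft.WindowsAzure.StorageClient) and made-up container/blob names:

using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.StorageClient;

class SetBlobCacheControl
{
    static void Main()
    {
        // Point the storage client at the account backing the CDN endpoint.
        CloudStorageAccount account = CloudStorageAccount.Parse("<storage connection string>");
        CloudBlobClient client = account.CreateCloudBlobClient();
        CloudBlob blob = client.GetContainerReference("cdn-assets")
                               .GetBlobReference("scripts/site.js");

        // One week; anything comfortably longer than your test window will do.
        blob.Properties.CacheControl = "public, max-age=604800";
        blob.SetProperties();
    }
}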
There are quite a few edge servers near you so I would expect it to perform better: 'Windows Azure CDN Node Locations' http://msdn.microsoft.com/en-us/library/windowsazure/gg680302.aspx
Maarten Balliauw has a great article on usage and use cases for the CDN (this might help?): http://acloudyplace.com/2012/04/using-the-windows-azure-content-delivery-network/
Not sure if that helps at all, interesting...

Okay, after I implemented public cache-control headers, the CDN appears to do what is expected: delivering content from x number of nodes in the CDN cluster.
The above comes with the caveat that it is based on what I experience - it has not been measured for concrete validation.
However, this link supports my theory: http://msdn.microsoft.com/en-us/wazplatformtrainingcourse_windowsazurecdn_topic3:
The time-to-live (TTL) setting for a blob controls for how long a CDN edge server returns a copy of the cached resource before requesting a fresh copy from its source in blob storage. Once this period expires, a new request will force the CDN server to retrieve the resource again from the original blob, at which point it will cache it again.
Which was my assumed challenge: the CDN-referenced resources kept polling the original blob.
Credit must also be given to this link (provided by user728584): http://blogs.msdn.com/b/scicoria/archive/2011/03/11/taking-advantage-of-windows-azure-cdn-and-dynamic-pages-in-asp-net-caching-content-from-hosted-services.aspx.
And the final link for now: http://blogs.msdn.com/b/windowsazure/archive/2011/03/18/best-practices-for-the-windows-azure-content-delivery-network.aspx
For ASP.NET pages, the default behavior is to set cache control to private. In this case, the Windows Azure CDN will not cache this content. To override this behavior, use the Response object to change the default cache control settings.
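In a web role this override boils down to a few lines per page or handler; a minimal sketch for an ASP.NET Web Forms page (the class name and the 72-hour lifetime are illustrative):

using System;
using System.Web;
using System.Web.UI;

// Mark the response as publicly cacheable so the Windows Azure CDN will cache it.
public partial class CdnContentPage : Page
{
    protected void Page_Load(object sender, EventArgs e)
    {
        Response.Cache.SetCacheability(HttpCacheability.Public); // default is private
        Response.Cache.SetMaxAge(TimeSpan.FromHours(72));        // Cache-Control: public, max-age=259200
        Response.Cache.SetExpires(DateTime.UtcNow.AddHours(72)); // matching Expires header
    }
}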
So my conclusion so far for this little puzzle is that you must pay close attention to your cache-control headers (which are often set to private for obvious reasons). If you skip the web-role approach, the TTL defaults to 72 hours, so you may never experience what I experienced; it will just work out of the box.
Thanks to user728584 for pointing me in the right direction.

Related

Why does Google's YouTube Data API v3 "Queries per day" Quota increase more than I'm calling it?

I have the default 10,000 queries per day. I added some careful debugging output to my program that's making calls to see exactly how many calls I'm making:
https://youtube.googleapis.com/youtube/v3/search
https://youtube.googleapis.com/youtube/v3/channels with 21 channelIds
https://youtube.googleapis.com/youtube/v3/search
https://youtube.googleapis.com/youtube/v3/channels with 2 channelIds
https://youtube.googleapis.com/youtube/v3/search
https://youtube.googleapis.com/youtube/v3/channels with 39 channelIds
https://youtube.googleapis.com/youtube/v3/search
https://youtube.googleapis.com/youtube/v3/channels with 16 channelIds
https://youtube.googleapis.com/youtube/v3/search
https://youtube.googleapis.com/youtube/v3/channels with 1 channelId
https://youtube.googleapis.com/youtube/v3/search
https://youtube.googleapis.com/youtube/v3/channels with 8 channelIds
https://youtube.googleapis.com/youtube/v3/search
https://youtube.googleapis.com/youtube/v3/channels with 37 channelIds
https://youtube.googleapis.com/youtube/v3/search
https://youtube.googleapis.com/youtube/v3/channels with 11 channelIds
total search = 8
total channels = 8 calls with 21+2+39+16+1+8+37+11 = 135 channelIds
8 + 8 + 135 = 151
I only made 16 HTTP requests: 8 to search and 8 to channels. The id field in the channels call is a comma-separated list of ids, so counting each of those as another "hit", that's 135 more, for a total of 151.
But after running this, my quota reported 808 hits against my 10,000 limit! Why the huge delta between what I'm calling and the quota hit?
Each search has maxResults set to 50; do I get charged for each result returned rather than for each HTTP call?
Correction: you have 10,000 units per day, which does not mean that you can make 10,000 requests.
The YouTube Data API uses a quota system to ensure that developers use the service as intended and do not create API clients that unfairly reduce service quality or limit access for others.
Projects that enable the YouTube Data API have a default quota allocation of 10,000 units per day, an amount sufficient for the majority of our API users. You can see your quota usage on the Quotas page in the API Console.
As you can see, the YouTube API is cost based: each request incurs a cost against your total quota.
For example, the search.list method costs 100 units per request.
That gives you 10000 / 100 = 100.
You can make 100 such requests before you run out of quota.
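To tie that back to the 808 you saw - assuming the unit costs listed in the quota calculator (search.list = 100 units, channels.list = 1 unit per request, no matter how many channelIds you batch or how many results come back) - the numbers line up exactly:

using System;

// Back-of-the-envelope check with unit costs assumed from the quota calculator:
// search.list = 100 units, channels.list = 1 unit per request, regardless of
// how many channelIds are batched or how many results (maxResults=50) come back.
int searchRequests = 8;
int channelsRequests = 8;

int unitsUsed = searchRequests * 100 + channelsRequests * 1; // 800 + 8
Console.WriteLine(unitsUsed); // 808 - matching the reported quota usage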
Useful links:
Quota cost calculator
Intro to YouTube API and cost based quota

Postgres connect time delay on Windows

There is a long delay between "forked new backend" and "connection received", from about 200 to 13000 ms. Postgres 12.2, Windows Server 2016.
During this delay the client is waiting for the network packet to start the authentication. Example:
14:26:33.312 CEST 3184 DEBUG: forked new backend, pid=4904 socket=5340
14:26:33.771 CEST 172.30.100.238 [unknown] 4904 LOG: connection received: host=* port=56983
This was discussed earlier here:
Postegresql slow connect time on Windows
But I have not found a solution.
After rebooting the server, the delay is much shorter, about 50 ms. It then gradually increases over the course of a few hours. There are about 100 clients connected.
I use IP addresses only in "pg_hba.conf", and "log_hostname" is off.
BitDefender is running on the server, but switching it off did not help. Furthermore, the Postgres files are excluded from BitDefender checks.
I used Process Monitor, which revealed the following: forking the postgres.exe process takes 3 to 4 ms. Then, after loading DLLs, postgres.exe looks for custom and extended locale info for 648 locales. It finds none of them. This locale search takes 560 ms (though there is a gap of 420 ms within it). Perhaps this step can be skipped by setting a connection parameter. After reading some TCP/IP parameters, there are no events for 388 ms. This period overlaps the 420 ms gap mentioned above. Then postgres.exe creates a thread. The total connection time measured by the client was 823 ms.
Locale example, performed 648 times:
"02.9760160","RegOpenKey","HKLM\System\CurrentControlSet\Control\Nls\CustomLocale","REPARSE","Desired Access: Read"
"02.9760500","RegOpenKey","HKLM\System\CurrentControlSet\Control\Nls\CustomLocale","SUCCESS","Desired Access: Read"
"02.9760673","RegQueryValue","HKLM\System\CurrentControlSet\Control\Nls\CustomLocale\bg-BG","NAME NOT FOUND","Length: 532"
"02.9760827","RegCloseKey","HKLM\System\CurrentControlSet\Control\Nls\CustomLocale","SUCCESS",""
"02.9761052","RegOpenKey","HKLM\System\CurrentControlSet\Control\Nls\ExtendedLocale","REPARSE","Desired Access: Read"
"02.9761309","RegOpenKey","HKLM\System\CurrentControlSet\Control\Nls\ExtendedLocale","SUCCESS","Desired Access: Read"
"02.9761502","RegQueryValue","HKLM\System\CurrentControlSet\Control\Nls\ExtendedLocale\bg-BG","NAME NOT FOUND","Length: 532"
"02.9761688","RegCloseKey","HKLM\System\CurrentControlSet\Control\Nls\ExtendedLocale","SUCCESS",""
No events for 388 ms:
"03.0988152","RegCloseKey","HKLM\System\CurrentControlSet\Services\Tcpip6\Parameters\Winsock","SUCCESS",""
"03.4869332","Thread Create","","SUCCESS","Thread ID: 2036"

Load testing results: site now serves fewer users

I was load testing the website at 30 users/second and it was working fine, but now it cannot even serve 25 users/second. The website is a search-engine kind of site. Between these two tests, we started the crawler to get some sites crawled and then stopped it again before the load test. 30 users/second was working fine before the crawler was turned on. Elasticsearch is used as the database for the website, and it went down saying no nodes were available.
I've used a standard thread group. Below is the configuration.
Total Samples: 250
Ramp up (sec) : 10
Loop count : 1
When I checked the results table, everything showed green, but when we hit the site it throws a NoNodeAvailableException.

Strategies for reducing network delay from 500 milliseconds to 60-100 milliseconds

I am building autocomplete functionality and realized that the time taken between the client and the server is too high (in the range of 450-700 ms).
My first stop was to check whether this was the result of server delay.
But as you can see from the nginx logs, the request time (the last column) is almost always 0.001, so the server is hardly a cause for concern.
So it became very evident that I am losing time between the server and the client. My benchmark is Google Instant's response times, which are often in the range of 30-40 milliseconds: orders of magnitude lower.
Although it's easy to say that Google has the massive infrastructure to deliver at this speed, I wanted to push myself to learn whether this is possible for someone who is not at that level. If not 60 milliseconds, I would at least like to shave off 100-150 milliseconds.
Here are some of the strategies I’ve managed to learn.
Enable httpd slowstart and initcwnd
Ensure SPDY if you are on https
Ensure results are http compressed
Etc.
What are the other things I can do here?
E.g.:
Does having a persistent connection help?
Should I reduce the response size dramatically?
Edit:
Here are the ping and traceroute numbers. The site is served via Cloudflare from a Linode machine in Fremont.
mymachine-Mac:c name$ ping site.com
PING site.com (160.158.244.92): 56 data bytes
64 bytes from 160.158.244.92: icmp_seq=0 ttl=58 time=95.557 ms
64 bytes from 160.158.244.92: icmp_seq=1 ttl=58 time=103.569 ms
64 bytes from 160.158.244.92: icmp_seq=2 ttl=58 time=95.679 ms
^C
--- site.com ping statistics ---
3 packets transmitted, 3 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 95.557/98.268/103.569/3.748 ms
mymachine-Mac:c name$ traceroute site.com
traceroute: Warning: site.com has multiple addresses; using 160.158.244.92
traceroute to site.com (160.158.244.92), 64 hops max, 52 byte packets
1 192.168.1.1 (192.168.1.1) 2.393 ms 1.159 ms 1.042 ms
2 172.16.70.1 (172.16.70.1) 22.796 ms 64.531 ms 26.093 ms
3 abts-kk-static-ilp-241.11.181.122.airtel.in (122.181.11.241) 28.483 ms 21.450 ms 25.255 ms
4 aes-static-005.99.22.125.airtel.in (125.22.99.5) 30.558 ms 30.448 ms 40.344 ms
5 182.79.245.62 (182.79.245.62) 75.568 ms 101.446 ms 68.659 ms
6 13335.sgw.equinix.com (202.79.197.132) 84.201 ms 65.092 ms 56.111 ms
7 160.158.244.92 (160.158.244.92) 66.352 ms 69.912 ms 81.458 ms
I may well be wrong, but personally I smell a rat. Your times aren't justified by your setup; I believe that your requests ought to run much faster.
If at all possible, generate a short query using curl and intercept it with tcpdump on both the client and the server.
It could be a bandwidth/concurrency problem on the hosting. Check out its diagnostic panel, or try estimating the traffic.
You can try saving a query response to a static file and then requesting that file (taking care not to trigger the local browser cache...), to see whether the problem lies in processing the data (either server- or client-side).
Does this slowness affect every request, or only the autocomplete ones? If the latter, and no matter what nginx says, it might be some inefficiency/delay in retrieving or formatting the autocompletion data for output.
Also, you can try serving a static response bypassing nginx altogether, in case this is an issue with nginx (and for that matter: have you checked nginx's error log?).
One approach I didn't see you mention is to use SSL session caching: you can add the following to your nginx conf to make sure that a full SSL handshake (a very expensive process) does not happen with every connection request:
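# keep a shared 10 MB TLS session cache and allow reuse for 10 minutes, so returning clients skip the full handshake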
ssl_session_cache shared:SSL:10m;
ssl_session_timeout 10m;
See "HTTPS server optimizations" here:
http://nginx.org/en/docs/http/configuring_https_servers.html
I would recommend using New Relic if you aren't already. It is possible that your server-side code is the issue; if you think that might be the case, there are quite a few free code-profiling tools.
You may want to consider preloading the autocomplete options in the background while the page is rendered, and then saving a trie (or whatever structure you use) on the client in local storage. When the user starts typing in the autocomplete field, you would not need to send any requests to the server; instead you would query local storage.
Web SQL Database and IndexedDB introduce databases to the clientside.
Instead of the common pattern of posting data to the server via
XMLHttpRequest or form submission, you can leverage these clientside
databases. Decreasing HTTP requests is a primary target of all
performance engineers, so using these as a datastore can save many
trips via XHR or form posts back to the server. localStorage and
sessionStorage could be used in some cases, like capturing form
submission progress, and have seen to be noticeably faster than the
client-side database APIs.
For example, if you have a data grid component or an inbox with
hundreds of messages, storing the data locally in a database will save
you HTTP roundtrips when the user wishes to search, filter, or sort. A
list of friends or a text input autocomplete could be filtered on each
keystroke, making for a much more responsive user experience.
http://www.html5rocks.com/en/tutorials/speed/quick/#toc-databases

how to determine where the bottleneck is in my slow (home) internet?

I recently got a new cable modem from my ISP (Rogers in Canada; old modem was a "Webstar" something, the new modem is a "SMC D3GN-RRR"). Since I got the new modem, it feels like my internet access is slower.
What I'm perceiving is that sometimes when I enter a URL and hit enter, there is a delay -- a slight delay, but it lasts half a second to two or three seconds -- before the web page loads. Once the web page starts loading it loads fast, but there's that delay while it's looking it up or something.
I have a MacBook Pro, an Apple AirPort Extreme wireless router, and the new cable modem.
Is there some kind of tool, or cool UNIX command (traceroute, or something?), I can run to see how much time it takes to jump from device to device, so I can "prove" where the delay is?
Just FYI, here's a "traceroute www.google.com", in case it's useful. I don't know what this means. :)
traceroute www.google.com
traceroute: Warning: www.google.com has multiple addresses; using 173.194.75.105
traceroute to www.l.google.com (173.194.75.105), 64 hops max, 52 byte packets
1 10.0.1.1 (10.0.1.1) 4.455 ms 1.204 ms 1.263 ms
2 * * *
3 * 69.63.255.237 (69.63.255.237) 36.694 ms 30.209 ms
4 69.63.250.210 (69.63.250.210) 44.503 ms 41.303 ms 46.039 ms
5 gw01.mtmc.phub.net.cable.rogers.com (66.185.81.137) 40.504 ms 34.937 ms 44.493 ms
6 * * *
7 * 216.239.47.114 (216.239.47.114) 58.605 ms 37.710 ms
8 216.239.46.170 (216.239.46.170) 56.073 ms 57.250 ms 64.373 ms
9 72.14.239.93 (72.14.239.93) 70.879 ms
209.85.249.11 (209.85.249.11) 114.399 ms 59.781 ms
10 209.85.243.114 (209.85.243.114) 72.877 ms 80.151 ms
209.85.241.222 (209.85.241.222) 82.524 ms
11 216.239.48.159 (216.239.48.159) 82.227 ms
216.239.48.183 (216.239.48.183) 80.065 ms
216.239.48.157 (216.239.48.157) 79.660 ms
12 * * *
13 ve-in-f105.1e100.net (173.194.75.105) 76.967 ms 71.142 ms 80.519 ms
Same problem for me. My download and upload numbers are great. It looks like DNS, in my case, is responding extremely slowly (not due to slow RTT or line speeds; maybe the DNS server itself is overloaded, or perhaps DNS in this region is overloaded or under some form of attack). If I'm correct, it isn't trivial for us as "customers" to demonstrate that this is the explanation, or to have it fixed, unfortunately.
