How to log/instrument in a performance-critical web service?

I'm working on a server push service, kind of like Pusher/PubNub. The most critical part, the one that handles client polling, currently uses Node.js and Redis, and it's working just fine except for one thing: I don't really have a good idea of what's going on in the app.
The whole idea is based around long polling, which means there are a LOT of requests going back and forth, and a lot of checking Redis to see if there's something new. The problem is that I don't really know how to monitor such a thing.
Let's say that there are 10 000 users online on average on a single site. At a polling interval of 5 seconds this would result in at least 2000 log entries every second. How do I manage that many logs? Should I use something like logstash to collect them to have at least some idea of what's going on in the app?
Would it be a good idea to have all instrumentation turned off, and only enable it with something like kill -s USR2?
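For example, something along these lines is what I have in mind (a minimal Node.js sketch; the trace helper is purely illustrative):

```javascript
var instrumentationEnabled = false;

// Flip detailed instrumentation at runtime with: kill -s USR2 <pid>
process.on('SIGUSR2', function () {
  instrumentationEnabled = !instrumentationEnabled;
  console.log('instrumentation ' + (instrumentationEnabled ? 'on' : 'off'));
});

// Call this from the hot paths; it is a no-op unless the signal has been sent.
function trace(message) {
  if (instrumentationEnabled) {
    console.log(Date.now() + ' ' + message); // swap for a real logger/collector
  }
}
```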
I thought about using the Redis MONITOR command to gather some data, but just running it slows Redis down by 50%, not to mention the overhead of analyzing the incoming data.
How do people generally handle this? Is there any good book or something about building high-load & high-availability applications?

Related

How many users should an EC2 Micro Instance be able to handle with only an nginx server?

I have an iOS social app.
This app talks to my server for updates and retrieval fairly often, mostly small text as JSON. Sometimes users will upload pictures, which my web server then uploads to an S3 bucket. No pictures or any other type of file will be retrieved from the web server.
The EC2 Micro Ubuntu 13.04 instance runs PHP 5.5, PHP-FPM and nginx. Caching is handled by ElastiCache using Redis, and the database is a separate m1.large MongoDB server. The content can be fairly dynamic, since the newsfeed changes often.
I am a total newbie when it comes to configuring nginx for performance, and I am trying to see whether I've configured my server properly or not.
I am using Siege to test my server load, but I can't find any statistics on how many concurrent users / page loads my system should be able to handle, so I don't know whether I've done something right or something wrong.
How many concurrent users / page loads should my server be able to handle?
If I can't get hold of statistics from experience, what would count as easy, medium, and extreme load for my micro instance?
I am aware that there are several other questions asking similar things, but none provide any sort of estimates for a similar system, which is what I am looking for.
I haven't tried nginx on a micro instance, for the reasons Jonathan pointed out: if you use up your CPU burst you will be throttled very hard and your app will become unusable.
If you want to follow that path, I would recommend:
Try to cap CPU usage for nginx and php5-fpm to make sure you do not go over the threshold that triggers CPU penalties. I have no idea what that threshold is. I believe the main problem with a micro instance is maintaining consistent CPU availability; if you go over the cap, you are screwed.
Try to use fastcgi_cache if possible. You want to hit php5-fpm only when really needed (see the sketch after these points).
Keep in mind that gzipping on the fly will eat a lot of CPU, and I mean a lot (for an instance that has almost no CPU power). If you can use gzip_static, do it, but I believe you cannot.
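For illustration, a minimal fastcgi_cache (microcache) sketch; the cache path, zone name and php5-fpm socket path are assumptions, so adjust them to your layout:

```nginx
# Hypothetical paths and zone name -- tune sizes and validity to your traffic.
fastcgi_cache_path /var/cache/nginx levels=1:2 keys_zone=microcache:10m max_size=64m inactive=10m;

server {
    listen 80;
    root /var/www/html;

    location ~ \.php$ {
        include fastcgi_params;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        fastcgi_pass unix:/var/run/php5-fpm.sock;

        fastcgi_cache microcache;
        fastcgi_cache_key "$scheme$request_method$host$request_uri";
        fastcgi_cache_valid 200 10s;                    # even a few seconds keeps most hits away from PHP
        fastcgi_cache_use_stale error timeout updating; # serve stale rather than spike the CPU
    }
}
```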
As for statistics, you will need to gather them yourself. I have statistics for m1.small but none for micro. Start by making nginx serve a static HTML file of just a few kB. Run Siege in benchmark mode with 10 concurrent users for 10 minutes and measure. Make sure you are sieging from a stronger machine.
siege -b -c10 -t600s 'http://<private-ip>/test.html'
You will probably see the effects of CPU throttling just by doing that! What you want to keep an eye on is the transactions per second and how much throughput nginx can serve. Keep in mind that the m1.small max is 35 MB/s, so the micro will manage even less.
Then move on to a JSON response. Try gzipping. See how many concurrent requests per second you can get.
And don't forget to come back here and report your numbers.
Best regards.
Micro instances are unique in that they use a burstable profile. While you may get up to 2 ECUs of performance for a short period of time, once the burst allotment is used up you will be limited to around 0.1 or 0.2 ECU. Eventually the allotment resets and you can get 2 ECUs again.
Much of this is going to come down to how CPU/Memory heavy your application is. It sounds like you have it pretty well optimized already.

How many dynos are required to host a static website on Heroku?

I want to host a static website on Heroku, but not sure how many dynos to start off with.
This page, https://devcenter.heroku.com/articles/dyno-requests, says that the number of requests a dyno can serve depends on the language and framework used. But I have also read somewhere that one dyno only handles one request at a time.
A little confused here: should one web dyno be enough to host a static website with very small traffic (<1000 views/month, <10/hour)? And how would you go about estimating additional dynos as traffic starts to increase?
Hope I worded my question correctly. Would really appreciate your input, thanks in advance!
A little miffed since I had a perfectly valid answer deleted, but here's another attempt.
Heroku dynos are single-threaded, so they can deal with one request at a time. If you had a dynamic page (PHP, Ruby, etc.) you would look at how long a page takes to respond at the server; say it took 250 ms to respond, then a single dyno could deal with 4 requests per second. Adding more dynos increases concurrency, NOT performance. So with 2 dynos, in this scenario, you would be able to deal with 8 requests per second.
Since you're only talking about static pages, their response time should be much faster than this. The best way to tell whether you need more is to look at your heroku log output and see if you have sustained levels of the 'queue' value; that means the dynos are unable to keep up and requests are being queued for processing.
Since most HTTP 1.1 clients will create two TCP connections to the webserver when requesting resources, I have a hunch you'll see better performance for a single client if you start two dynos, so the client's pipelined resource requests can be handled in parallel as well.
You'll have to decide if it is worth the extra money for the (potentially slight) improved performance of a single client.
If you ever anticipate having multiple clients requesting information at once, then you'll probably want more than two dynos, just to make sure at least one is readily available for additional clients.
In this situation I would stay with one dyno: the first one is free, and the second one puts you over the monthly minimum and starts to incur costs.
But you should also realize that with one dyno on Heroku, the app will go to sleep if it hasn't been accessed recently (I think this is around 30 minutes). In that case it can take 5-10 seconds to wake up again, which gives your users a very slow initial experience.
There are web services that will ping your site, testing its response and keeping it awake; http://www.wekkars.com/ for example.

Heroku: web dyno vs. worker dyno? How many/what ratio do I need?

I was curious about the difference between web and worker dynos on Heroku. They give a one-sentence explanation on their pricing page, but that just left me confused. How do I know how many of each to pick? Is there a ratio I should aim for? I'm pretty new to this stuff, so can someone give an in-depth explanation, or maybe some way I can calculate how many, and which kind of, dynos I would need?
Also, I'm confused about what they mean by the number of hours for each dyno.
http://www.heroku.com/pricing
I also happened upon this article. As one of the suggested solutions, it says to increase the number of dynos. Which type of dyno are they referring to here?
http://devcenter.heroku.com/articles/backlog-too-deep
Your best indication of whether you need more dynos (aka processes on Cedar) is your heroku logs. Make sure you upgrade to expanded logging (it's free) so that you can tail your log.
You are looking for the heroku.router entries, and the value you are most interested in is the queue value; if this is constantly more than 0, it's a good sign you need to add more dynos. Essentially it means that more requests are coming in than your processes can handle, so they are being queued. If they are queued too long without returning any data, they will time out.
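For reference, a router entry from that era looked roughly like this (exact fields varied; queue and wait are the ones to watch, and as the update below notes they were later removed):

```
heroku[router]: at=info method=GET path=/ dyno=web.1 queue=0 wait=0ms connect=2ms service=45ms status=200 bytes=1234
```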
There's no ideal ratio, I'm afraid; you could have an app doing 100 requests a second that needs many web processes but makes no use of workers at all. You only need worker processes if you are doing processing in the background, like sending emails, etc.
P.S. For "backlog too deep", it's the web dyno process that would cause it.
UPDATE: On March 26, 2013, Heroku removed the queue and wait fields from the log output:
queue and wait fields have been removed from router log messages. Also, the Heroku router no longer sets the X-Heroku-Dynos-In-Use, X-Heroku-Queue-Depth and X-Heroku-Queue-Wait-Time HTTP headers for incoming requests.
Dynos are basically processes that run on your instance. With the new Cedar stack, they can be set up to execute any arbitrary shell command. For web applications, you generally have one process called "web" which is responsible for responding to HTTP requests from users. All other processes are what were previously called "workers." These run continuously in the background for things like cron, processing queues, and any heavy computation that you don't want to tie up your web processes. You can also scale each type of process, so that multiple processes of each type will be booted up for additional concurrency. The amount of each that you use really depends on the needs of your application and the load it receives. You can use tools like the New Relic plugin to monitor these things. Take a look at the articles on the Process Model and the Procfile in Heroku's dev center for more details.
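For illustration, a minimal Procfile for that model might look like the following; the script names are hypothetical and would be whatever your app actually runs:

```
web: node server.js
worker: node worker.js
```

Each process type can then be scaled independently, e.g. heroku ps:scale web=2 worker=1.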
A number of people have mentioned that there is no known ratio, and that the ratio of web workers to "background" workers you will want depends on how you designed your application; that is correct. However, I thought it might be useful to add that, as a general rule of thumb, you want your web workers, and thus the controller actions they serve, to be lightning quick and very lightweight, to reduce latency in response times from browser actions. If some browser action would take more than, say, about half a second of real time to serve, then you will probably want to architect some sort of system that pushes the bulk of that action onto a queue.
You would then design offline worker dynos to service this queue. They can take much longer, because there are no HTTP responses pending on their output. Perhaps the page you rendered from the initial browser request that pushed the action will serve some JavaScript that checks every 5 seconds whether the job has finished, or something along those lines.
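A rough sketch of that client-side check, assuming a hypothetical /jobs/<id>/status endpoint that returns JSON with a done flag, and a showResult function defined elsewhere on the page:

```javascript
// Poll every 5 seconds until the background job reports completion.
function pollJob(jobId) {
  var timer = setInterval(function () {
    var xhr = new XMLHttpRequest();
    xhr.open('GET', '/jobs/' + jobId + '/status');
    xhr.onload = function () {
      var status = JSON.parse(xhr.responseText);
      if (status.done) {
        clearInterval(timer);
        showResult(status.result); // render the finished work however the page needs
      }
    };
    xhr.send();
  }, 5000);
}
```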
I still can't give you a ratio to work with for the same reason others have given, but hopefully this helps you decide how to architect your app. (I should also mention this is just one design out of many valid ones.)
https://stackoverflow.com/a/19965981/1233555 - Heroku has gone to random routing, so some dynos can have queues stacking up (while they serve a lengthy request) while other dynos are free. Avoid this by making sure that all requests are handled very quickly in your web dynos. This will reduce the number of web dynos you need, while requiring more worker dynos.
You also need to care about your web app supporting concurrency, which only some Rails configs do: try Unicorn, or carefully written code (with I/O that doesn't block the EventMachine) on Thin.
You probably have to try, rather than calculate, to see how many dynos of each kind you need. Make sure New Relic reports the dyno queue; see the link above.
Short answer is that you need as many as you need to keep your queues down.
As John describes, if you start seeing a queue in your logs, then you need more dynos. If you start seeing your background queues getting too long (how you get this info depends on what you have implemented), then you need more workers.
There is no ratio as it's very much dependent on your application design and usage.

AJAX Polling Question - Blocking Or Frequent?

I have a web application that relies on very "live" data - so it needs an update every 1 second if something has changed.
I was wondering what the pros and cons of the following solutions are.
Solution 1 - Poll A Lot
So every 1 second, I send a request to the server and get back some data. Once I have the data, I wait for 1 second before doing it all again. I would detect client-side if the state had changed and take action appropriately.
Solution 2 - Block A Lot
So I start a request to the server that will time out after 30 seconds. The server keeps an eye on the data by checking it once per second. If the server notices the data has changed, it sends the data back to the client, which takes action appropriately.
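A minimal sketch of that blocking variant in Node.js, under the assumption that the latest state lives in an in-memory object and clients pass the version they last saw; a real service would check Redis or similar instead:

```javascript
var http = require('http');
var url = require('url');

// Stand-in for "the data"; updated elsewhere when live events arrive.
var latest = { version: 0, payload: null };

http.createServer(function (req, res) {
  var since = parseInt(url.parse(req.url, true).query.since || '0', 10);
  var waited = 0;

  var timer = setInterval(function () {
    waited += 1000;
    if (latest.version > since) {
      clearInterval(timer);
      res.writeHead(200, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify(latest));
    } else if (waited >= 30000) {   // give up after 30 seconds
      clearInterval(timer);
      res.writeHead(204);           // nothing new; the client simply re-polls
      res.end();
    }
  }, 1000);                         // the "check once per second" from above

  req.on('close', function () { clearInterval(timer); });
}).listen(8080);
```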
Scenario
Essentially, the data is reasonably small in size but changes at random intervals based on live events. The thing is, the web UI will be running something in the region of 2,000 instances, so do I have 2,000 requests per second coming from the UI, or 2,000 long-running requests that each last up to 30 seconds?
Help and advice would be much appreciated, especially if you have worked with AJAX requests under similar volumes.
One common solution for such cases is to use static JSON files. Server-side scripts update them whenever the data changes, and they are served by a fast, light webserver (like nginx). Since the files are static and small, the webserver will serve them straight out of cache, very quickly.
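A small sketch of the updating side in Node.js; the file paths are placeholders, and the write-then-rename is just one way to avoid serving a half-written file:

```javascript
var fs = require('fs');

// Write the new state to a temp file, then rename it into place; rename is atomic
// on the same filesystem, so nginx never sees a partially written data.json.
function publish(state) {
  var json = JSON.stringify(state);
  fs.writeFile('/var/www/live/data.json.tmp', json, function (err) {
    if (err) return console.error('write failed:', err);
    fs.rename('/var/www/live/data.json.tmp', '/var/www/live/data.json', function (err) {
      if (err) console.error('rename failed:', err);
    });
  });
}
```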
Consider a better architecture. Implementing this kind of messaging system is trivial to do right in something like Node.js: message dispatch will be instantaneous, and you won't need to poll for your data on either side.
You don't need to rewrite your whole system: the data producer could simply POST the updates to the Node.js server instead of writing them to a file, and as a bonus you don't even need to waste time on disk I/O.
If you started without knowing any Node.js, you could still be done in a couple of hours, because you can just hack up the chat example.
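A rough sketch of that shape, assuming a single hypothetical /update endpoint the producer POSTs to, with clients simply waiting on any other URL:

```javascript
var http = require('http');
var events = require('events');

var bus = new events.EventEmitter();
bus.setMaxListeners(0); // potentially thousands of waiting clients

http.createServer(function (req, res) {
  if (req.method === 'POST' && req.url === '/update') {
    // The data producer pushes new state here instead of writing a file.
    var body = '';
    req.on('data', function (chunk) { body += chunk; });
    req.on('end', function () {
      bus.emit('update', body); // wake every waiting client immediately
      res.end('ok');
    });
  } else {
    // Clients wait here until something changes; no per-second polling anywhere.
    var onUpdate = function (data) {
      res.writeHead(200, { 'Content-Type': 'application/json' });
      res.end(data);
    };
    bus.once('update', onUpdate);
    req.on('close', function () { bus.removeListener('update', onUpdate); });
  }
}).listen(8080);
```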
I can't comment yet, but I agree with geocar. Running live or almost-live web services with just polling will leave you stuck between a rock and a hard place.
You could also look into WebSockets to allow push, as that sounds like a better solution than polling at anywhere from every second to every 30 seconds.
Good luck!

Why are my basic Heroku apps taking two seconds to load?

I created two very simple Heroku apps to test out the service, but it's often taking several seconds to load the page when I first visit them:
Cropify - Basic Sinatra App (on github)
Textile2HTML - Even more basic Sinatra App (on github)
All I did was create a simple Sinatra app and deploy it. I haven't done anything to mess with or test the Heroku servers. What can I do to improve response time? It's very slow right now and I'm not sure where to start. The code for the projects is on GitHub if that helps.
If your application is unused for a while, it gets unloaded (from server memory).
On the first hit it gets loaded, and it stays loaded until some time passes without anyone accessing it.
This is done to save server resources. If no one uses your app, why keep resources busy and not let someone who really needs them use them?
If your app has a lot of continuous traffic, it will never be unloaded.
There is an official note about this.
You might also want to investigate the caching options you have on Heroku w/ Varnish and Memcached. These are persisted independent of the dynos.
For example, if you have an unchanging homepage, you can cache that for extended periods in Varnish by adding Cache-Control headers to the response. Then your users won't experience the load hit until they want to "do something" rather than when they arrive.
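For example, an unchanging page could send a response header along these lines (the max-age value is just a placeholder; Sinatra's cache_control helper will emit it for you):

```
Cache-Control: public, max-age=3600
```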
You should check out Tom Robinson's answer to "Scalability: How Does Heroku Work?" on Quora: http://www.quora.com/Scalability/How-does-Heroku-work
Heroku divides up server resources among many different customers/applications. Your app is allotted blocks of computing power. Heroku partitions based on resource demand. When you have a popular application that demands more power, you can pay for more 'dynos' (application containers) and then get a larger chunk of the pie in return.
In your case though, you are running a free app that few people--if any outside of you--are visiting/using. Therefore, Heroku cuts down on the resources you're getting by unloading your app--putting it in hibernation essentially--until there is a request made to your address. When that happens, and your app has been idling for a long time, it takes time to reload.
Add 1 extra dyno to keep your app from falling asleep, if that reload time is important.
I am having the same problem. I deployed a Rails 3 (1.9.2) app last night and it's slow. I know that 1.9.2/Rails 3 is in BETA on Heroku but the support ticket said it should be fine using some instructions they sent me.
I understand that the first request after a long time takes the longest. Makes sense. But when simply loading pages that don't even connect to a DB sometimes takes 10 seconds, that's pretty bad.
Anyway, you might want to try what I'm going to do: profile the app and see how long it takes locally. If it's taking 400 ms locally, then something is wrong there. But if I get 50 ms locally and it still takes 10 seconds on Heroku, then something is definitely wrong.
Obviously caching helps, but you only get 5 MB for free, and once again, with ONE person using the site it shouldn't be that slow.
I had the same problem with every app I have put up via Heroku's free account. There are now options for adding dynos so that your app does not get unloaded while it sits idle, and you can also try using Redis or Memcached for caching. But I used a hacky solution for my small-scale project: I basically built a web scraper, put it inside an infinite loop with a sleep, and ta-da, the website actually had much better response times (I guess it was not getting unloaded because of the script).
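A rough sketch of that kind of keep-alive pinger in Node.js; the URL and the interval are placeholders:

```javascript
var http = require('http');

// Hit the app every 5 minutes so the free dyno never idles long enough to be unloaded.
setInterval(function () {
  http.get('http://your-app.herokuapp.com/', function (res) {
    res.resume(); // drain the response; we only care that the request happened
  }).on('error', function (err) {
    console.error('keep-alive ping failed:', err.message);
  });
}, 5 * 60 * 1000);
```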

Resources