My app doesn't get a lot of traffic, but the response time still spikes occasionally and can stay that way until I force a restart. It doesn't seem correlated with traffic.
I am experiencing a spike in traffic once every 60-90 minutes that slows my Heroku app to a crawl for the duration of the spike. New Relic reports response times of 20-50 seconds per request, with 99% of that down to the Heroku router queuing requests. The request count goes from an average of around 50-100rpm up to 400-500rpm.
Looking at the logs, it looks like a scraping bot or spider trying to access a lot of content pages on the site. However, it's not all coming from a single IP.
What can I do about it?
My sysadmin / devops skills are pretty minimal.
Guy
Have your host-based firewall throttle those requests. Depending on your setup, you can also add Nginx into the mix, which can throttle requests too.
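For example, on a Linux host you could use the iptables "recent" module to drop sources that open too many new connections in a short window. This is a rough sketch; the port, window, and hit count are illustrative and should be tuned for your traffic:

    # Record each new connection to port 80, then drop any source IP that
    # opens more than 10 new connections within 10 seconds (illustrative values).
    iptables -A INPUT -p tcp --dport 80 -m state --state NEW -m recent --set --name HTTP
    iptables -A INPUT -p tcp --dport 80 -m state --state NEW -m recent --update --seconds 10 --hitcount 10 --name HTTP -j DROP

If you put Nginx in front, its limit_req_zone and limit_req directives can do the same kind of throttling at the HTTP layer, keyed by client IP.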
So I'm using Goliath to develop an API endpoint, /list/users. It is very simple: it just queries MySQL and returns the result.
A single request takes about 53.84ms (the logged response time), but if I load test the server with ab using 10 concurrent requests, I can only get about 20 requests/second.
Meanwhile, when I hit the endpoint in Chrome, I see a wait time of 400ms.
What is wrong? How can I improve it?
I also created a Node.js version of /list/users. A single request also takes about 50ms, but I can get 130 requests/second under the same load test, and the wait time is only about 10ms.
Did I do something wrong? Is there any setting that needs to be changed for Goliath?
I'd also like to know why Node.js can serve more requests/second when the single-request response time is the same.
Did you run Goliath in production mode? In development it does code reloading, which will negatively impact performance. -e prod will put the server in production mode.
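For example (the file name and port here are placeholders for your own app):

    # Start the server in production mode so per-request code reloading is off.
    ruby app.rb -e prod -p 9000

    # Then re-run the benchmark against it.
    ab -n 1000 -c 10 http://127.0.0.1:9000/list/users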
I have been testing some viewport issues for mobile and probably ran
git push heroku master
about 50 times in the last 3 hours. I am now seeing from the Google speed test that:
Reduce server response time: In our test, your server responded in 8.9 seconds. There are many factors that can slow down your server response time. Please read our recommendations to learn how you can monitor and measure where your server is spending the most time.
This wasn't popping up earlier this morning, when it was under .5 seconds. Did I damage one of my dynos on the Heroku servers? My site isn't really getting any traffic yet, so I haven't been doing any staging tests.
What is the best way to test production?
I was reading through this but was wondering if there is a better way to test production quickly.
https://devcenter.heroku.com/articles/multiple-environments
Thanks,
Jeff
There's nothing wrong with pushing many times in a row, but every time you push, your dynos will cycle. This takes something like 5 to 15 seconds depending on the size of your slug.
Generally this means that the first query sent to your app at the moment your dynos are cycling might hang for about that long. If Google checked your server's speed at that time, then that explains the response time. However, there shouldn't be any lasting effects after you finish pushing repeatedly.
If I recall correctly, there is a Heroku Labs option that cycles dynos to eliminate this pause, basically taking down some of your dynos and cycling them while the others are still up, but I do not recommend using it: it makes code pushes very unpredictable and can result in two versions of your app being live at the same time.
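If you want to try it anyway, I believe this is the "preboot" Labs feature, toggled through the Heroku CLI (the app name is a placeholder):

    heroku labs:enable preboot --app your-app-name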
Often when troubleshooting performance using Google Chrome's Network panel, I see different times and wonder what they mean.
Can someone validate that I understand these properly:
Blocking: Time blocked by the browser's limit on concurrent requests to the same domain (???)
Waiting: Waiting for a connection from the server (???)
Sending: Time spent transferring the file from the server to the browser (???)
Receiving: Time spent by the browser analyzing and decoding the file (???)
DNS Lookup: Time spent resolving the hostname.
Connecting: Time spent establishing a socket connection.
Now, how would someone fix long blocking times?
And how would someone fix long waiting times?
Sending is the time spent uploading the data/request to the server. It occurs after the connection is established and before waiting. For example, if I post back an ASPX page, this would indicate the amount of time it took to upload the request (including the form values and session state) back to the ASP server.
Waiting is the time after the request has been sent, but before a response from the server has been received. Basically this is the time spent waiting for a response from the server.
Receiving is the time spent downloading the response from the server.
Blocking is the amount of time between the UI thread starting the request and the HTTP GET request getting onto the wire.
The order these occur in is:
Blocking*
DNS Lookup
Connecting
Sending
Waiting
Receiving
*Blocking and DNS Lookup might be swapped.
The network tab does not indicate time spent processing.
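As a rough cross-check outside the browser, curl can print comparable phase timings (the URL is a placeholder):

    # dns ~ DNS Lookup, connect ~ Connecting, waiting ~ Waiting (time to first byte)
    curl -s -o /dev/null -w "dns: %{time_namelookup}s  connect: %{time_connect}s  waiting: %{time_starttransfer}s  total: %{time_total}s\n" http://example.com/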
If you have long blocking times then the machine running the browser is running slowly. You can fix this by upgrading the machine (more RAM, faster processor, etc.) or by reducing its workload (turn off services you do not need, close programs, etc.).
Long wait times indicate that your server is taking a long time to respond to requests. This either means:
The request takes a long time to process (like if you are pulling a large amount of data from the database, large amounts of data need to be sorted, or a file has to be found on an HDD which needs to spin up).
Your server is receiving too many requests to handle all requests in a reasonable amount of time (it might take .02 seconds to process a request, but when you have 1000 requests there will be a noticeable delay).
The two issues (long waiting + long blocking) are related. If you can reduce the workload on the server by caching, adding new servers, and reducing the work required for active pages, then you should see improvements in both areas.
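A quick way to check whether your responses are actually cacheable is to inspect the headers of a HEAD request (the URL is a placeholder):

    # Look for Cache-Control / Expires / ETag headers in the response.
    curl -sI http://example.com/ | grep -iE 'cache-control|expires|etag'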
You can read a detailed official explanation from the Google team here. It is a really helpful resource, and your info falls under the Timeline view section.
Resource network timing shows the same information as the resource bar in the Timeline view. Answering your question:
DNS Lookup: Time spent performing the DNS lookup. (You need to find out the IP address of site.com, and this takes time.)
Blocking: Time the request spent waiting for an already established connection to become available for re-use. As was said in another answer, it does not depend on your server; it is the client's problem.
Connecting: Time it took to establish a connection, including TCP handshakes/retries, DNS lookup, and time connecting to a proxy or negotiating a secure connection (SSL). Depends on network congestion.
Sending: Time spent sending the request. Depends on the size of the sent data (mostly small, because a request is almost always a few bytes unless you submit a big image or a huge amount of text), network congestion, and the client's proximity to the server.
Waiting: Time spent waiting for the initial response. This is mostly the time your server takes to process the request and respond: how fast your server is calculating things, fetching records from the database, and so on.
Receiving: Time spent receiving the response data. Similar to Sending, but now you are getting data from the server (the response is usually bigger than the request), so it also depends on size, connection quality, and so on.
"Blocking: Time the request spent waiting for an already established connection to become available for re-use. As was said in another answer, it does not depend on your server; it is the client's problem."
I do not agree with the statement above. All else being the same [my machine's workload], my browser shows very little "Blocking" time for one website and a long blocking time for another.
So if the wait for one of the six connections per host + proxy negotiation** is high, it is mostly the cascading effect of the server's slowness OR bad page design [too much being sent across the wire, too many times].
** whatever "Proxy Negotiation" means! Nobody explains this very well, particularly where no local/CDN proxy is actually involved.
IIS includes a worker process health check "ping" feature that pings worker processes every 90 seconds by default and recycles them if they don't respond. I have an application that is chronically putting app pools into a bad state, and I'm curious whether there is any reason not to lower this time to force IIS to recycle a failed worker process more quickly. Searching the web, all I can find is people increasing the time to allow for debugging. It seems like 90 seconds is far too high for a web application, but perhaps I'm missing something.
Well, the obvious answer is that in some situations requests could take longer than 90 seconds for the worker process to return. If you can't imagine a situation where this would be appropriate for your application, then feel free to lower it.
I wouldn't recommend going much lower than 30 seconds, as I can see situations where you get into recycle loops. You can test and see what makes sense in your situation; I would recommend Siege for load testing to see how your application behaves.
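If you do lower it, I believe the 90-second default corresponds to the app pool's processModel.pingResponseTime attribute (processModel.pingInterval controls how often the ping is sent). A sketch using appcmd, where the pool name is a placeholder:

    rem Give the worker process 30 seconds instead of 90 to answer the health-check ping.
    %windir%\system32\inetsrv\appcmd set apppool "MyAppPool" /processModel.pingResponseTime:00:00:30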