I recently enabled the beta Ruby Language Metrics in Heroku (docs)
I have 2 Performance-M web dynos running. The Puma pool metrics are below.
The usage seems unexpectedly low to me over the first 2 hours since enabling this. Am I missing something, or are these numbers roughly what I should expect?
Puma Pool Usage Metrics screenshot
So did I recently. I believe what you are seeing is simply that there are no data for that time frame. This view corresponds to what I see before the deploy that activated the metrics.
I'm trying to understand the Dyno Load section of my app's metrics. My app has five worker dynos. Given that, if I see a Load Max or Load Avg of 2 or 2.5, then I should be OK, right? With this setup, would my goal be to keep the load under five (1 for each dyno)? Is that the correct way to view this?
The load you see in Heroku Metrics is per dyno. Each dyno reports its own load, and Load Max is the maximum of those values.
So expecting 5 to be a good value because you have 5 dynos isn't right.
You need to evaluate that value based on the type of dynos you have, as larger dyno types have a bigger CPU share and can handle more load.
Heroku recommends (here) keeping Free, Hobby and Standard dynos between 0.5 and 1.0 load.
Performance-M dynos can go to 3.0 or 4.0, and Performance-L dynos can go up to 16.0.
See also dyno sizes and their CPU share: https://devcenter.heroku.com/articles/dyno-types
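As a rough illustration of evaluating load per dyno rather than per app, here is a minimal Ruby sketch. The ceilings are the figures quoted above; the hash keys, the helper name `overloaded?`, and the sample loads are made up for the example and are not Heroku identifiers.

```ruby
# Upper load bounds per dyno type, taken from the figures quoted above.
# The keys are illustrative labels, not official Heroku identifiers.
LOAD_CEILING = {
  "free"          => 1.0,
  "hobby"         => 1.0,
  "standard"      => 1.0,
  "performance-m" => 4.0,
  "performance-l" => 16.0
}.freeze

# Returns true when a single dyno's load is above the ceiling for its type.
# Raises KeyError for a dyno type that is not in the table.
def overloaded?(dyno_type, load)
  load > LOAD_CEILING.fetch(dyno_type.downcase)
end

# Load is reported per dyno, so each dyno is checked on its own
# instead of comparing the value with the number of dynos.
sample_loads = { "worker.1" => 2.5, "worker.2" => 0.8 } # hypothetical values
sample_loads.each do |name, load|
  status = overloaded?("standard", load) ? "over the ceiling" : "ok"
  puts "#{name}: load #{load} is #{status}"
end
```

With these assumptions, a load of 2.5 on a Standard dyno is already too high, even though the app has five dynos.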
If I have a server with 1 core, how many puma workers, threads and what database pool size is appropriate?
What's the general rule of thumb here?
Not an easy answer.
The two main sources of information are:
Puma github repository (the authors' point of view)
Heroku's web page (the main big user's point of view)
Unfortunately, they are inconsistent, mostly because Heroku has different deployment metrics and terminology.
So I ended up following the Puma repository guidelines, which say:
One worker per core
Threads to be determined according to available RAM and the application
Threads = connection pool size
So the number of threads is mostly a matter of trial and error.
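The arithmetic behind those guidelines can be sketched in a few lines of Ruby. This is only an illustration: `PumaSizing`, `puma_sizing`, and the starting value of 5 threads are hypothetical and not part of Puma's API, and the thread count is the number you would tune by measurement.

```ruby
# Sketch of the guideline: one worker per core, pool size equal to threads.
# PumaSizing and puma_sizing are hypothetical helpers, not Puma API.
PumaSizing = Struct.new(:workers, :threads, :db_pool, :total_db_connections)

def puma_sizing(cores:, threads_per_worker: 5)
  workers = cores               # one worker process per core
  db_pool = threads_per_worker  # each thread may hold one connection
  # The connection pool is per process, so the database sees
  # up to workers * db_pool connections in total.
  PumaSizing.new(workers, threads_per_worker, db_pool, workers * db_pool)
end

sizing = puma_sizing(cores: 1, threads_per_worker: 5)
puts sizing.to_h
```

For the single-core server in the question, this gives one worker, and the database pool simply follows whatever thread count you settle on.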
How does the NewRelic agent know how many instances are running or how much RAM the app is using?
I'm wondering how it can glean so much without running an agent on the system as I thought you could only run your application code on Heroku, and not just any process?
I'll assume you're referring to Ruby, but the agent is not actually different on Heroku except for some accounting integrations due to New Relic/Heroku partnerships.
If you're using the New Relic Add-On for Heroku, the memory usage displayed in the "Instances" tab is an average per process.
New Relic will only track the instances that are being used by the app in the time window that you select.
I am using New Relic Standard and Rails 3 on Heroku to build a web site, but I'm not sure which indicators in New Relic I should keep an eye on to decide when to scale up the web dynos.
Say, when indicator A reaches level X, I should add one dyno to bring it back down.
Thank you!
Primarily you want to be looking at your logs and at the queue attribute on the heroku[router] lines. If it starts going up (and, importantly, staying up), then too many requests are being queued and they can't be processed fast enough.
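A minimal Ruby sketch of watching that attribute, assuming the router lines carry a `queue=<number>` key/value pair. The sample lines, the method names, and the threshold and window values are hypothetical; only the `heroku[router]` tag and the `queue=` key are relied on, so check them against your own log output.

```ruby
# Matches a queue=<digits> key/value pair anywhere in a log line.
QUEUE_PATTERN = /\bqueue=(\d+)/

# Collects the queue values from router lines, skipping everything else.
def queue_depths(lines)
  lines.filter_map do |line|
    next unless line.include?("heroku[router]")
    match = QUEUE_PATTERN.match(line)
    match && match[1].to_i
  end
end

# True when the queue stayed above `threshold` for `window`
# consecutive router lines, i.e. it went up and stayed up.
def sustained_queue?(depths, threshold: 0, window: 3)
  depths.each_cons(window).any? { |run| run.all? { |d| d > threshold } }
end

# Hypothetical sample lines for illustration only.
sample = [
  "heroku[router]: GET example.com/ dyno=web.1 queue=0 status=200",
  "app[web.1]: Completed 200 OK",
  "heroku[router]: GET example.com/a dyno=web.1 queue=4 status=200",
  "heroku[router]: GET example.com/b dyno=web.1 queue=6 status=200",
  "heroku[router]: GET example.com/c dyno=web.1 queue=9 status=200"
]

depths = queue_depths(sample)
puts "queue depths: #{depths.inspect}"
puts "consider adding a dyno" if sustained_queue?(depths)
```

A single spike does not trigger the check; only a run of queued requests does, which matches the "staying up" condition above.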
If you're seeing long queue-wait times in the New Relic dashboard and there are no other good explanations (e.g. high variability in response times, use of the Thin web server instead of Unicorn, etc.), that's generally a good indication that requests are waiting in queue to be processed by a dyno.
This might also belong on serverfault. It's kind of a combo between server config and code (I think)
Here's my setup:
Rails 2.3.5 app running on jruby 1.3.1
Service Oriented backend over JMS with activeMQ 5.3 and mule 2.2.1
Tomcat 5.5 with opts: "-Xmx1536m -Xms256m -XX:MaxPermSize=256m -XX:+CMSClassUnloadingEnabled"
Java jdk 1.5.0_19
Debian Etch 4.0
Running top, every time I click a link on my site, I see my java process's CPU usage spike. If it's a small page, it's sometimes just 10% usage, but on a more complicated page my CPU goes up to 44% (never above, not sure why). In this case, a request can take minutes while my server's load average steadily climbs to 8 or greater. This is just from clicking one link that loads a few requests from some services, nothing too complicated. The java process's memory hovers around 20% most of the time.
If I leave it for a bit, load average goes back down to nothing. Clicking a few more links, climbs back up.
I'm running a small amazon instance for the rails frontend and a large instance for all the services.
Now, this is obviously unacceptable. A single user can spike the load average to 8, and with two people using the site, it stays at that load average for as long as we are using it. I'm wondering what I can do to inspect what's going on. I'm at a complete loss as to how I can debug this. (It doesn't happen locally when I run the Rails app through JRuby, outside the Tomcat container.)
Can someone enlighten me as to how I might inspect my JRuby app to find out how it could possibly be using up such huge resources?
Note, I noticed this a little bit before, seemingly at random, but now, after upgrading from Rails 2.2.2 to 2.3.5 I'm seeing it ALL THE TIME and it makes the site completely unusable.
Any tips on where to look are greatly appreciated. I don't even know where to start.
Make sure that there is no unexpected communication between Tomcat and something else. I would first check that:
The ActiveMQ broker doesn't communicate with other brokers in your network. By default, the AMQ broker starts in OpenWire auto-discovery mode.
JGroups, or multicast in general, does not communicate with something in your network.
This unnecessary load may result from the processing of the messages coming from another application.