We are using HireFire to manage multiple DelayedJob queues, with each queue tied to a different Procfile entry; the entries vary by application.
For 95% of the day, some of the dyno types in the formation are scaled to 0. (Basically, overnight batch jobs start and submit to one queue or another; Performance-M or L dynos are scaled to 1, 2, or 3 to handle the jobs, then scale back to zero when the queues are empty.)
The CLI ps command retrieves information on the running processes, but any dyno types scaled to zero do not appear in its output.
Does anyone know of a way of retrieving dyno formation info even when the dynos are not running?
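One avenue that I believe works is the Heroku Platform API's formation list endpoint, which reports every process type with its quantity and size even when the quantity is 0. A minimal sketch (the app name is a placeholder, and the token is assumed to live in HEROKU_API_KEY):

```python
# Sketch: list the full dyno formation via the Heroku Platform API.
# Process types scaled to zero still appear, with "quantity": 0.
import os
import requests

resp = requests.get(
    "https://api.heroku.com/apps/my-app/formation",  # hypothetical app name
    headers={
        "Accept": "application/vnd.heroku+json; version=3",
        "Authorization": f"Bearer {os.environ['HEROKU_API_KEY']}",
    },
)
resp.raise_for_status()
for proc in resp.json():
    print(proc["type"], proc["quantity"], proc["size"])
```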
Related
I'm interested in assigning different workloads to each worker of a particular type in Heroku. The workload is continuous rather than discrete, so a work queue is not appropriate. I could coordinate work through a database or ZooKeeper, but those bring complexity and reliability issues of their own.
I know that Heroku dynos are assigned names like worker.1, but I'm curious how these names are assigned and whether I can rely on them to have certain properties. I'm only interested in the behavior of worker dynos, since web preboot probably changes the behavior for web dynos.
Specifically,
Is it possible for two dynos to have the same name/number at the same time? (I assume not for worker processes, since the previous one should be shut down or have failed before a new one is started.)
If the ps:scale for my worker type is n, can I rely on dyno names being worker.1, worker.2, ..., worker.n, or would I sometimes get numbers outside (1, n)? If numbers outside (1, n) are possible, under what circumstances might that happen?
I'm hoping to create a configuration that maps worker numbers to work assignments, so I would be relying on worker numbers being exactly 1 to n.
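For what it's worth, the mapping I have in mind would lean on the documented DYNO environment variable, plus the very 1..n numbering assumption this question is about. A sketch:

```python
# Sketch: derive a work assignment from the DYNO env var (e.g. "worker.2").
# This assumes worker numbers really are exactly 1..n, which is precisely
# the property being asked about above.
import os

ASSIGNMENTS = ["shard-a", "shard-b", "shard-c"]  # hypothetical workloads

dyno = os.environ["DYNO"]          # e.g. "worker.2"
index = int(dyno.split(".")[1])    # 1-based worker number
print(f"{dyno} handles {ASSIGNMENTS[index - 1]}")
```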
I am looking for the best approach to handle the following scenario:
I have multiple edge devices publishing sensor data to a RabbitMQ broker. The broker will see an overall workload of ~500 messages per second. Then there is a Python worker dyno that consumes one sensor reading at a time, applies a filter to it (which can take 5-15 ms), and publishes the result to another topic.
Of course one worker is not enough to serve all requests, so I need proper scaling. I use a queue to make sure each sensor reading is consumed only once!
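For reference, a minimal sketch of that consume-filter-publish loop using pika, with a prefetch of 1 and manual acks so a reading only leaves the queue once it has been processed. The connection URL, queue/exchange names, and filter body are placeholders:

```python
# Sketch of the worker loop: consume one reading, filter it, publish the
# result, then ack. Names and the filter are illustrative placeholders.
import pika

def apply_filter(reading: bytes) -> bytes:
    return reading  # stand-in for the 5-15 ms filtering step

params = pika.URLParameters("amqp://guest:guest@localhost:5672/%2f")
connection = pika.BlockingConnection(params)
channel = connection.channel()
channel.basic_qos(prefetch_count=1)  # hand each worker one reading at a time

def on_message(ch, method, properties, body):
    result = apply_filter(body)
    ch.basic_publish(exchange="results", routing_key="filtered", body=result)
    ch.basic_ack(delivery_tag=method.delivery_tag)  # ack only after publishing

channel.basic_consume(queue="sensor-readings", on_message_callback=on_message)
channel.start_consuming()
```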
My questions are:
Do I scale horizontally and just start as many dynos as necessary to handle all requests in the RabbitMQ queue? Seems simple but more expensive.
Or would it be better to have fewer dynos but more threads running on each dyno, using e.g. Celery?
Or is there a load balancer that consumes 1 item out of the queue and schedules a dyno dynamically?
Something totally different?
Options 1 or 2 are your best bets.
I don't think option 3 exists without tying directly into the Heroku API and writing a ton of code for yourself... and that is overkill for your needs, IMO.
Between 1 and 2, the choice depends on whether you want to grow your ability to handle more messages without redeploying your code.
Option 1 is generally my preference because I can just add a new dyno instance and be done. It takes 10 seconds.
Option 2 might work if you don't mind adjusting your code and redeploying. It adds extra time and effort for the tradeoff of cost.
But at some point, option 2 will need to turn into option 1 anyway, because a single dyno can only do so much work to begin with. You will run into thread limits on dynos, and then you'll be scaling out with dynos.
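To make option 3 concrete, here is roughly what "tying into the Heroku API" would look like: a sketch that scales the worker formation to match queue depth via the Platform API's formation endpoint. The app name, token handling, and the 100-messages-per-dyno ratio are all invented for illustration:

```python
# Sketch: a crude queue-depth autoscaler built on the Heroku Platform API.
import os
import requests

APP = "my-app"  # hypothetical app name
HEADERS = {
    "Accept": "application/vnd.heroku+json; version=3",
    "Authorization": f"Bearer {os.environ['HEROKU_API_KEY']}",
}

def scale_workers(queue_depth: int, per_dyno: int = 100, max_dynos: int = 10) -> None:
    # Keep at least one worker; otherwise ceil(queue_depth / per_dyno).
    quantity = min(max_dynos, max(1, -(-queue_depth // per_dyno)))
    resp = requests.patch(
        f"https://api.heroku.com/apps/{APP}/formation/worker",
        headers=HEADERS,
        json={"quantity": quantity},
    )
    resp.raise_for_status()
```

You would still have to decide where the queue-depth numbers come from (e.g. RabbitMQ's management API) and how often to poll, which is exactly the "ton of code" referred to above.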
It seems that with GuvScale you can scale the workers consuming messages from RabbitMQ.
Heroku describes their dynos here, and the chart lists the amount of memory and the amount of compute resources each one has. Nowhere do I see the definition of a "Compute".
When I run this command on the performance-l dynos it tells me it has 8 cores.
grep -c processor /proc/cpuinfo
I don't see how this relates to the 46x Compute that's on the chart. It seems like an arbitrary number to me and I don't understand exactly what it is.
Heroku's compute units are just Amazon's compute units (because Heroku runs on top of AWS).
One compute unit on AWS is defined as the compute power of a 1.0-1.2 GHz 2007 server CPU.
Keep in mind though: these units are typically pretty variable depending on how many other active dynos are on the same underlying EC2 host.
I am new to Sidekiq. My requirement is that there can be as many high-priority jobs as there are users logged into the system. Let's say each user is expecting a notification as soon as his job is processed.
I have one Sidekiq daemon running with a concurrency of 50, so I can have at most 50 jobs processing at a time? I have read that the wiki says we should have multiple Sidekiq processes running.
What is the upper limit on the number of Sidekiq processes to run?
How will I be able to match the number of users logged in with the number of concurrent workers?
Is there a technology stack I can use to launch these workers? Something like Unicorn to have a pool of workers? Can I even use Unicorn with Sidekiq?
What is the upper limit on the number of Sidekiq processes to run?
You will want at most one Sidekiq process per processor core. If you have a dual-core processor, then two Sidekiq processes. However, if your server is also doing other work such as running a web server, you will want to leave some cores available for that.
How will I be able to match the number of users logged in with the number of concurrent workers?
With Sidekiq, you pre-emptively create your threads. You essentially have a thread pool of X idle threads which are ready to deploy at any moment should a huge surge of jobs come in. You will need to create as many threads as the maximum number of jobs you expect at any one time. However, going over 50 threads per core is not a good idea for performance reasons: the time spent switching between a huge number of threads significantly cuts into the CPU time available for the threads to do actual work.
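For illustration, the thread pool size is just configuration. Assuming the classic config/sidekiq.yml format, a sketch with a hypothetical high-priority queue weighted above the default:

```yaml
# config/sidekiq.yml -- concurrency and queue names are illustrative
:concurrency: 50
:queues:
  - [notifications, 3]  # hypothetical high-priority queue, checked 3x as often
  - [default, 1]
```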
Is there a technology stack I can use to launch these workers? Something like Unicorn to have a pool of workers? Can I even use Unicorn with Sidekiq?
You can't use Unicorn for this. You need a process supervisor to handle starting/restarting of Sidekiq. Their wiki recommends Upstart or systemd, but I've found that Supervisor works incredibly well and is really easy to set up.
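As a sketch, a minimal Supervisor entry for one Sidekiq process looks like this (paths, environment, and timings are illustrative; run one such block per core if you want multiple processes):

```ini
; Illustrative Supervisor program block for a single Sidekiq process.
[program:sidekiq]
command=bundle exec sidekiq -e production -C config/sidekiq.yml
directory=/var/www/app/current
autostart=true
autorestart=true
stopsignal=TERM        ; Sidekiq shuts down cleanly on TERM
stopwaitsecs=30        ; give in-flight jobs time to finish
stdout_logfile=/var/log/sidekiq.log
redirect_stderr=true
```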
I am currently trying to understand why some requests in my Python Heroku app take >30 seconds, even simple requests that do absolutely nothing.
One of the things I've done is look into the load average on my dynos. I did three things:
1) Look at the Heroku logs. Once in a while, Heroku prints the dyno load there. Here are examples:
Mar 16 11:44:50 d.0b1adf0a-0597-4f5c-8901-dfe7cda9bce0 heroku[web.2] Dyno load average (1m): 11.900
Mar 16 11:45:11 d.0b1adf0a-0597-4f5c-8901-dfe7cda9bce0 heroku[web.2] Dyno load average (1m): 8.386
Mar 16 11:45:32 d.0b1adf0a-0597-4f5c-8901-dfe7cda9bce0 heroku[web.2] Dyno load average (1m): 6.798
Mar 16 11:45:53 d.0b1adf0a-0597-4f5c-8901-dfe7cda9bce0 heroku[web.2] Dyno load average (1m): 8.031
2) Run "heroku run uptime" several times, each time hitting a different machine (verified by running "hostname"). Here is sample output from just now:
13:22:09 up 3 days, 13:57, 0 users, load average: 15.33, 20.55, 22.51
3) Measure the load average on the machines my dynos live on, using psutil to send metrics to Graphite. The graphs confirm numbers anywhere between 5 and 20.
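A sketch of that kind of collector, assuming psutil's getloadavg and Graphite's plaintext protocol (host and metric path are placeholders):

```python
# Sketch: sample the host load average and push it to Graphite's
# plaintext port (2003). Endpoint and metric name are placeholders.
import socket
import time

import psutil

GRAPHITE = ("graphite.example.com", 2003)  # hypothetical endpoint

one_min, _five, _fifteen = psutil.getloadavg()
line = f"dynos.web2.load1 {one_min} {int(time.time())}\n"
with socket.create_connection(GRAPHITE) as sock:
    sock.sendall(line.encode())
```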
I am not sure whether this explains simple requests taking very long or not, but can anyone say why the load average numbers on Heroku are so high?
Heroku sub-virtualizes hosts into the guest 'Dyno' you are using via LXC. When you run 'uptime' you are seeing the whole host's uptime, NOT your container's, and as pointed out by @jon-mountjoy, you get a new LXC container, not one of your running dynos, when you do this.
https://devcenter.heroku.com/articles/dynos#isolation-and-security
Heroku's dyno load calculation also differs from the traditional UNIX/Linux load calculation.
The Heroku load average reflects the number of CPU tasks that are in the ready queue (i.e. waiting to be processed). The dyno manager takes the count of runnable tasks for each dyno roughly every 20 seconds and computes an exponentially damped moving average over the counts from the previous 30 minutes:
avg = avg * exp(-20 / period) + count_of_runnable_tasks * (1 - exp(-20 / period))
where period is the 1-, 5-, or 15-minute window (in seconds), count_of_runnable_tasks is the number of tasks in the queue at a given sample, and avg is the exponential load average computed in the previous iteration.
https://devcenter.heroku.com/articles/log-runtime-metrics#understanding-load-averages
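In code, that damped average works out to something like this toy sketch (the sampled task counts are invented):

```python
# Toy reconstruction of the exponentially damped moving average described
# above: runnable-task counts sampled every 20 s, folded into a 1-minute
# average. Sample values are made up.
from math import exp

def damped_avg(avg: float, count: int, period: int = 60, interval: int = 20) -> float:
    decay = exp(-interval / period)
    return avg * decay + count * (1 - decay)

avg = 0.0
for count in [12, 9, 7, 8]:  # runnable tasks at successive 20 s samples
    avg = damped_avg(avg, count)
    print(round(avg, 3))
```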
The difference between Heroku's load average and Linux's is that Linux also includes processes in uninterruptible sleep states (usually waiting for disk activity), which can lead to markedly different results if many processes remain blocked in I/O due to a busy or stalled I/O system.
On CPU-bound dynos I would presume this doesn't make much difference. On an I/O-bound dyno, the load averages reported by Heroku would be much lower than what you would see if you could get a true uptime inside the LXC container.
You can also have your running dynos send periodic load messages by enabling log-runtime-metrics.
Perhaps it's expected dyno idling?
PS. I suspect there's no point running heroku run uptime - that will run it in a new one-off dyno every time.