Can Heroku's dyno cycling be adjusted?

I would like to increase the rate at which my background workers cycle. Maybe have them automatically restart every 12 hours instead of every 24. Is this a config option on Heroku?
Thanks

Not sure why you would want to do this, but the only reasonable way to achieve it is to have each of your background dynos kill itself after N hours, or after N messages processed. The amount of hacking needed for this depends on your environment.
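If you go that route, the self-termination can be as simple as tracking uptime and a message counter in the worker loop. A minimal Python sketch, relying on the fact that Heroku restarts a worker dyno whose process exits; MAX_HOURS, MAX_MESSAGES, and process_next_message are hypothetical placeholders for your own code:

    import sys
    import time

    MAX_HOURS = 12        # cycle the dyno after this many hours...
    MAX_MESSAGES = 10000  # ...or after this many messages, whichever comes first

    def process_next_message():
        """Placeholder for your real job-processing step."""
        time.sleep(0.1)

    started = time.monotonic()
    processed = 0

    while True:
        process_next_message()
        processed += 1
        uptime_hours = (time.monotonic() - started) / 3600
        if uptime_hours >= MAX_HOURS or processed >= MAX_MESSAGES:
            # Exit cleanly; the platform restarts the exited worker process,
            # which gives you the shorter cycling interval.
            sys.exit(0)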

Related

Heroku, RabbitMQ and many workers. What is the best architecture?

I am looking for the best approach to handle the following scenario:
I have multiple edge devices publishing sensor data to a RabbitMQ broker. The broker will experience an overall workload of ~500 messages per second. Then there is a Python worker dyno that consumes one sensor reading at a time, applies a filter to it (which can take 5-15 ms), and publishes the result to another topic.
Of course one worker is not enough to serve all requests, so I need proper scaling. I use a queue to make sure each sensor reading is consumed only once!
My questions are:
Do I scale horizontally and just start as many dynos as necessary to handle all requests in the RabbitMQ queue? Seems simple but more expensive.
Or would it be better to have fewer dynos but more threads running on each dyno, using e.g. Celery?
Or is there a load balancer that consumes 1 item out of the queue and schedules a dyno dynamically?
Something totally different?
Options 1 and 2 are your best bets.
I don't think option 3 exists without tying directly into the Heroku API and writing a ton of code yourself... that is overkill for your needs, IMO.
Between 1 and 2, the choice depends on whether you want to grow your capacity to handle more messages without re-deploying your code.
Option 1 is generally my preference because I can just add a new dyno instance and be done. It takes 10 seconds.
Option 2 might work if you don't mind adjusting your code and redeploying. It adds extra time and effort in exchange for lower cost.
But at some point option 2 will need to turn into option 1 anyway, because there is only so much work a single dyno can do. You will run into thread limits on a dyno, and then you'll be scaling out with dynos.
It seems that with GuvScale you can automatically scale the workers consuming messages from RabbitMQ.
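For option 1, each dyno can run the same consumer process; setting a per-consumer prefetch of 1 lets RabbitMQ distribute one message at a time across however many dynos you start. A minimal sketch using pika, where the queue names, the apply_filter step, and the CLOUDAMQP_URL config var are assumptions for illustration:

    import os
    import pika

    def apply_filter(reading: bytes) -> bytes:
        """Placeholder for the 5-15 ms filtering step."""
        return reading

    params = pika.URLParameters(os.environ.get(
        "CLOUDAMQP_URL", "amqp://guest:guest@localhost:5672/%2F"))
    connection = pika.BlockingConnection(params)
    channel = connection.channel()
    channel.queue_declare(queue="sensor_readings", durable=True)
    channel.queue_declare(queue="filtered_readings", durable=True)

    # Hand each consumer only one unacked message at a time, so work
    # spreads evenly across all dynos consuming from the same queue.
    channel.basic_qos(prefetch_count=1)

    def handle(ch, method, properties, body):
        result = apply_filter(body)
        ch.basic_publish(exchange="", routing_key="filtered_readings", body=result)
        ch.basic_ack(delivery_tag=method.delivery_tag)  # ack only after publishing

    channel.basic_consume(queue="sensor_readings", on_message_callback=handle)
    channel.start_consuming()

Because the queue (not a fanout) hands each message to exactly one consumer, adding a dyno is all it takes to add capacity, which is what makes option 1 so cheap operationally.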

What is the upper limit on the number of Sidekiq workers we can have?

I am new to Sidekiq. My requirement is that there can be as many high-priority jobs as there are users logged into the system. Let's say each user expects a notification as soon as his job is processed.
I have one Sidekiq daemon running with a concurrency of 50, so I can have only 50 jobs processing at a time? I have read that the wiki says we should have multiple Sidekiq processes running.
What is the upper limit on the number of Sidekiq processes to run?
How will I be able to match the number of users logged in with the number of concurrent workers?
Is there a technology stack I can use to launch these workers? Something like Unicorn to have a pool of workers? Can I even use Unicorn with Sidekiq?
What is the upper limit on the number of Sidekiq processes to run?
You will want at most one Sidekiq process per processor core. If you have a dual-core processor, then two Sidekiq processes. However, if your server is also doing other work, such as running a web server, you will want to leave some cores available for that.
How will I be able to match the number of users logged in with the number of concurrent workers?
With Sidekiq, you create your threads pre-emptively. You essentially have a thread pool of X idle threads that are ready to pick up work the moment a surge of jobs comes in. You will need as many threads as the maximum number of jobs you expect to be processing at any one time. However, going over 50 threads per core is not a good idea for performance reasons: the time spent switching between a huge number of threads significantly cuts into the CPU time available for the threads to do actual work.
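Concurrency is set per Sidekiq process, either with the -c flag or in the config file. A minimal sketch of a config/sidekiq.yml; the queue names and weights are just examples:

    # config/sidekiq.yml
    :concurrency: 50
    :queues:
      - [high_priority, 3]   # checked 3x as often as default
      - [default, 1]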
Is there a technology stack I can use to launch these workers? Something like Unicorn to have a pool of workers? Can I even use Unicorn with Sidekiq?
You can't use Unicorn for this. You need a process supervisor to handle starting/restarting Sidekiq. The wiki recommends Upstart or systemd, but I've found that Supervisor works incredibly well and is really easy to set up.
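For reference, a Supervisor program definition for Sidekiq can be as small as the sketch below; the paths, environment, and concurrency value are assumptions you'd adapt to your own deployment:

    ; /etc/supervisor/conf.d/sidekiq.conf
    [program:sidekiq]
    command=bundle exec sidekiq -e production -c 50
    directory=/var/www/myapp/current
    autostart=true
    autorestart=true
    stopwaitsecs=30              ; let in-flight jobs finish before a hard kill
    stdout_logfile=/var/log/sidekiq.log
    redirect_stderr=true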

Finish Sidekiq queues much quicker

I have reached a point where it takes too long for a queue to finish, because new jobs keep being added to it.
What are the best options to overcome this problem?
I already use 50 processes, but I noticed that if I start more, it takes longer for jobs to finish.
My setup:
nginx,
unicorn,
ruby-on-rails 4,
postgresql
Thank you
You need to measure where you are constrained by resources.
If you're seeing things slow down as you add more workers, you're likely blocked by your database server. Have you upgraded your Redis server to handle this amount of load? Where are you storing the scraped data? Can that system handle the increased write load?
If you were bound only by CPU or I/O, you would see throughput scale linearly as you add more workers. Since things instead slow down when you scale out, you should measure where the problem is. I'd recommend instrumenting your worker processes with New Relic and measuring where the time is being spent.
My guess would be that your Redis instance can't handle the load to manage the work queue with 50 worker processes.
EDIT
Based on your comment, it sounds like you're entirely I/O-bound doing web scraping. In that case, you should increase the concurrency of each Sidekiq worker using the -c option to spawn more threads. Having more threads will allow you to continue processing scraping jobs even while some threads are blocked on network I/O.
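For example, you might start each worker with a higher thread count like this (the queue name is hypothetical; -c sets the thread count and -q selects the queue):

    bundle exec sidekiq -c 100 -q scraping

Raising threads instead of processes works here because threads waiting on network responses cost almost no CPU, so one process can keep many scrapes in flight at once.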

Heroku web dynos often slowing down but never getting more than 20% load?

A couple of times a day we see the response time of any one of our Heroku web dynos increase tremendously. We have been analyzing this with little success so far.
One strange thing we see, however, is the following. Look at this "Instances running" graph from New Relic:
You can see that we've played around with the number of web dynos, but the majority of the time we have had 2 dynos with 4 Unicorn processes each. Yet these instances never seem to get "full load". How should we interpret this? Does it just mean that at any time the sum of CPU usage of all instances never exceeds approx. 20%? If so, it seems we're really underutilizing our dynos. What can we optimize here?
For clarity: the average memory usage is constant between 110MB and 120MB, so that does not seem to be the bottleneck.
Have you seen http://news.rapgenius.com/James-somers-herokus-ugly-secret-lyrics ?
Heroku has worked on addressing this, but you'll need to talk to them for further recommendations.

What's the minimum time counted for billing in Heroku?

EC2 instance hours are calculated hour by hour. If you just start and then stop an instance, it is still counted as one hour.
How does Heroku handle this? By minute or by hour?
Let's assume my app usage exceeds the 750 free dyno-hour limit.
Heroku prorates to the second. A dyno costs $0.05 per hour, so if you go over 750 hours you will be charged at $0.05 per hour, i.e. $0.000833333 per minute. For example, 760 dyno-hours in a month would mean 10 billable hours, or $0.50. In fact, pretty much all add-ons follow the same billing model.
You can read about billing and charges at https://devcenter.heroku.com/articles/usage-and-billing#cost
I will say, though, that the previous answer is more accurate for a web dyno than for a worker dyno. Heroku's automated sleep cycle keeps an idle web dyno from running too long, say, for more than an hour. A free web dyno must sleep at least six hours per day for it not to incur charges. As long as you set the scaling to 1 for your web dyno and it sleeps, it should be free.
That said, when you add your first worker dyno, those same automations aren't applied: it won't be put to sleep after idling for an hour. This means that unless you manage it, you'll likely be charged $34.50 per worker dyno per month. I wouldn't exactly call this lying to the customer, but most people start off with that first free dyno, get comfortable with it, and then innocently assume the next dyno will behave the same way. It won't, and you'll likely end up paying more than you bargained for: $414 per year for a dyno. Compare this with Amazon's t2.micro cost of $150 per year for one instance, or $75 per year at a 50% duty cycle.
As they say, "the devil is in the details". Heroku might be cheap for vanity websites, but it gets costly once you have a database and a worker dyno (without any other scaling whatsoever).