According to the Heroku documentation, when session affinity is turned on and the number of dynos (nodes) scales up, the existing traffic is redistributed evenly across the new dynos.
This has the effect that clients of the 'old' dynos are assigned to dynos they have not communicated with before. That is unavoidable when scaling down, but should not be necessary when scaling up.
Is there a way to prevent the Heroku load balancer from assigning existing sessions to the new dynos, and instead keep them with the original ones?
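For context, if I'm reading the docs right, session affinity is the app feature enabled with:
heroku features:enable http-session-affinity -a <AppName>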
From Heroku preboot docs:
When you make releases with preboot, Heroku switches the routing from the old dyno to the new dyno at one point. It is possible to have a very short time overlap where the requests go to both dynos during the switchover.
Instead of stopping the existing set of web dynos before starting the new ones, preboot ensures that the new web dynos are started (and receive traffic) before the existing ones are terminated.
I couldn't find anything stating whether the new dynos need to be healthy (serving successful requests) before the rollout completes, or what happens otherwise.
Documentation: https://devcenter.heroku.com/articles/preboot
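For reference, preboot is enabled per app with:
heroku features:enable preboot -a <AppName>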
For anyone who has used Heroku (and perhaps anyone else who has deployed to a PaaS before and has experience):
I'm confused about what Heroku means by "dynos", how dynos handle memory, and how users scale. I read that they define dynos as "app containers", which means that the memory/file system of dyno1 can't be accessed by dyno2. Makes sense in theory.
The containers used at Heroku are called “dynos.” Dynos are isolated, virtualized Linux containers that are designed to execute code based on a user-specified command. (https://www.heroku.com/dynos)
Also, users can define how many dynos, or "app containers", are instantiated, if I understand correctly, through commands like heroku ps:scale web=1, and so on.
I recently created a web app (a Flask/gunicorn app, if that even matters) in which I declare a variable that keeps track of how many users have visited a certain route (I know, not the best approach, but that's beside the point). In local testing it appeared to work properly, even for multiple clients.
When I deployed to Heroku with only a single web dyno (heroku ps:scale web=1), I found this was not the case: the variable appeared to have multiple instances that updated independently. I understand that memory isn't shared between different dynos, but I have only one dyno running the server, so I thought there should be only a single instance of this variable/web app. Is the dyno running my server on a single process or multiple processes? If it's multiple, how can I limit it?
Note: this web app does save files on disk, and on each API request I check whether the file exists. Because it does, this tells me the requests are hitting the same dyno.
Perhaps someone can enlighten me? I'm a beginner to deployment, but willing to learn/understand more!
Is the dyno running my server on a single process or multiple processes?
Yes, probably:
Gunicorn forks multiple system processes within each dyno to allow a Python app to support multiple concurrent requests without requiring them to be thread-safe. In Gunicorn terminology, these are referred to as worker processes (not to be confused with Heroku worker processes, which run in their own dynos).
We recommend setting a configuration variable for this setting. Gunicorn automatically honors the WEB_CONCURRENCY environment variable, if set.
heroku config:set WEB_CONCURRENCY=3
The WEB_CONCURRENCY environment variable is automatically set by Heroku, based on the processes’ Dyno size. This feature is intended to be a sane starting point for your application. We recommend knowing the memory requirements of your processes and setting this configuration variable accordingly.
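For concreteness, here's a sketch (my own illustration, not from the docs) of a gunicorn.conf.py that honors WEB_CONCURRENCY, with a fallback guess based on core count:

# gunicorn.conf.py -- illustrative sketch; tune workers to your app's memory footprint
import multiprocessing
import os

# Use Heroku's WEB_CONCURRENCY if set; otherwise fall back to a core-based guess
workers = int(os.environ.get("WEB_CONCURRENCY", multiprocessing.cpu_count() * 2 + 1))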
The solution isn't to limit your processes, but to fix your application. Global variables shouldn't be used to store data across processes. Instead, store data in a database or in-memory data store.
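For example, here's a minimal sketch of the visit counter done this way, assuming the redis-py package and a REDIS_URL config var (such as the one a Redis add-on sets); the route and key names are made up:

import os

import redis
from flask import Flask

app = Flask(__name__)
store = redis.Redis.from_url(os.environ.get("REDIS_URL", "redis://localhost:6379"))

@app.route("/visits")
def visits():
    # INCR is atomic, so concurrent workers (and dynos) can't lose updates
    count = store.incr("visit_count")
    return f"Visited {count} times"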
Note: this web app does save files on disk, and on each API request I check whether the file exists. Because it does, this tells me the requests are hitting the same dyno.
If you're just trying to check which dyno you're on, fine. But you probably don't want to be saving actual data to the dyno's filesystem because it is ephemeral. You'll lose all changes made to the filesystem whenever your dyno restarts. This happens frequently (at least once per day).
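Incidentally, if identifying the dyno is all you're after, Heroku exposes the dyno name in the DYNO environment variable, which is simpler than writing marker files:

import os
print(os.environ.get("DYNO"))  # e.g. "web.1" on Heroku; unset when running locally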
I have deployed my Node app on the Heroku free web dyno plan. I want to know how many free dyno hours are remaining and how many have been used, so for that I am using:
heroku ps -a <AppName>
After running the above command I got something like the output below:
As per the result everything is clear, but what does web (Free), shown in green, mean? Someone please let me know; any help would be appreciated.
Thanks!
It means your app is running on a single web dyno of the free dyno type.
Dyno configurations
Every dyno belongs to one of the three following configurations:
Web: Web dynos are dynos of the “web” process type that is defined in your Procfile. Only web dynos receive HTTP traffic from the routers.
Worker: Worker dynos can be of any process type declared in your Procfile, other than “web”. Worker dynos are typically used for background jobs, queueing systems, and timed jobs. You can have multiple kinds of worker dynos in your application. For example, one for urgent jobs and another for long-running jobs. For more information, see Worker Dynos, Background Jobs and Queueing.
One-off: One-off dynos are temporary dynos that can run detached, or with their input/output attached to your local terminal. They’re loaded with your latest release. They can be used to handle administrative tasks, such as database migrations and console sessions. They can also be used to run occasional background work, as with Heroku Scheduler. For more information, see One-Off Dynos.
Once a web or worker dyno is started, the dyno formation of your app will change (the number of running dynos of each process type) - and subject to dyno lifecycle, Heroku will continue to maintain that dyno formation until you change it. One-off dynos, on the other hand, are only expected to run a short-lived command and then exit, not affecting your dyno formation.
Dyno Types
Heroku provides a number of different dyno types each with a set of unique properties and performance characteristics. Free, Hobby, Standard and Performance dynos are available in the Common Runtime to all Heroku customers. Private Dynos only run in Private Spaces and are available in Heroku Enterprise.
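For example, you change your dyno formation with ps:scale (illustrative numbers, assuming your Procfile declares both process types):
heroku ps:scale web=2 worker=1 -a <AppName>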
I am using New Relic Standard and Rails 3 on Heroku to build a web site, but I'm not sure which indicators shown in New Relic I should keep an eye on to know when to scale up the web dynos.
Say indicator A reaches level X; at that point I should add one dyno to bring it back down.
Thank you!
Primarily you want to be looking at your logs, and at the queue attribute on the heroku[router] log lines - if this starts going up (and, importantly, staying up) then requests are being queued faster than they can be processed.
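To watch this in practice, you can tail your logs and filter for the router lines:
heroku logs --tail -a <AppName> | grep "heroku\[router\]"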
If you're seeing long queue-wait times in the New Relic dashboard and there are no other good explanations (i.e. high variability in response times, use of Thin web server instead of Unicorn, etc.), that's generally a good indication requests are waiting in queue to be processed by a dyno.
I'm hoping the community can clarify something for me, and that others can benefit.
My understanding is that gunicorn worker processes are essentially virtual replicas of Heroku web dynos. In other words, Gunicorn's worker processes should not be confused with Heroku's worker processes (e.g. Django Celery Tasks).
This is because Gunicorn worker processes are focused on handling web requests (basically throttling up the performance of the Heroku Web Dyno) while Heroku Worker Dynos specialize in Remote API calls, etc that are long-running background tasks.
I have a simple Django app that makes decent use of Remote APIs and I want to optimize the resource balance. I am also querying a PostgreSQL database on most requests.
I know that this is very much an oversimplification, but am I thinking about things correctly?
Some relevant info:
https://devcenter.heroku.com/articles/process-model
https://devcenter.heroku.com/articles/background-jobs-queueing
https://devcenter.heroku.com/articles/django#running-a-worker
http://gunicorn.org/configure.html#workers
http://v3.mike.tig.as/blog/2012/02/13/deploying-django-on-heroku/
https://docs.djangoproject.com/en/dev/howto/deployment/wsgi/gunicorn/
Other Quasi-Related Helpful SO Questions for those researching this topic:
Troubleshooting Site Slowness on a Nginx + Gunicorn + Django Stack
Performance degradation for Django with Gunicorn deployed into Heroku
Configuring gunicorn for Django on Heroku
To provide an answer and prevent people from having to search through the comments: a dyno is like an entire computer. Using the Procfile, you give each of your dynos one command to run, and it cranks away on that command, re-running it periodically to refresh it and re-running it when it crashes. As you can imagine, it's rather wasteful to dedicate an entire computer to running a single-threaded web server, and that's where Gunicorn comes in.
The Gunicorn master process does nothing but act as a proxy server, spawning a given number of copies of your application (workers) and distributing HTTP requests amongst them. It takes advantage of the fact that each dyno actually has multiple cores. As someone mentioned, the number of workers you should choose depends on how much memory your app needs to run.
Contrary to what Bob Spryn said in the last comment, there are other ways of exploiting this opportunity for parallelism to run separate servers on the same dyno. The easiest way is to make a separate sub-procfile and run the all-Python Foreman equivalent, Honcho, from your main Procfile, following these directions. Essentially, in this case your single dyno command is a program that manages multiple single commands. It's kind of like being granted one wish from a genie, and making that wish be for 4 more wishes.
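For illustration (the file name, app module, and worker command here are made up), your main Procfile would hold the single command:

web: honcho -f Procfile.web start

and Procfile.web would list everything that actually runs inside that one dyno:

web: gunicorn myapp.wsgi
worker: celery -A myapp worker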
The advantage is that you get to take full advantage of your dynos' capacity. The disadvantage is that you lose the ability to scale individual parts of your app independently when they share a dyno. When you scale the dyno, it scales everything you've multiplexed onto it, which may not be what you want. You will probably have to use diagnostics to decide when a service should be moved onto its own dedicated dyno.