Using Heroku's PaaS, it is understood that changing config variables creates a new release, and that the preboot feature handles typical releases by starting new dynos before switching request handling over to them. What is unclear from Heroku's documentation on deployments with preboot is how preboot behaves when only the config/environment variables change.
Can someone confirm that the behaviour when changing config vars is identical to that of a normal release?
Upon re-reading the documentation, it does appear that the dyno preboot process applies when updating config vars, so there is no gap in processing requests.
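As an illustration, with preboot enabled a config change goes through the same release/preboot cycle as a code deploy, so the new dynos start with the new value before traffic is switched over (the app name and variable below are placeholders):
heroku features:enable preboot -a <app-name>
heroku config:set SOME_FLAG=enabled -a <app-name>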
Related
For anyone who has used Heroku (and perhaps anyone else who has deployed to a PaaS before and has experience):
I'm confused about what Heroku means by "dynos", how dynos handle memory, and how users scale them. I read that Heroku defines dynos as "app containers", which means that the memory/filesystem of dyno1 can't be accessed by dyno2. Makes sense in theory.
The containers used at Heroku are called “dynos.” Dynos are isolated, virtualized Linux containers that are designed to execute code based on a user-specified command. (https://www.heroku.com/dynos)
Also, users can define how many dynos, or "app containers", are instantiated, if I understand correctly, through commands like heroku ps:scale web=1 and so on.
I recently created a web app (a Flask/gunicorn app, if that even matters) in which I declare a variable that keeps track of how many users have visited a certain route (I know, not the best approach, but that's beside the point). In local testing it appeared to work properly, even with multiple clients.
When I deployed to Heroku, with only a single web dyno (heroku ps:scale web=1), I found this was not the case: the variable appeared to have multiple instances that updated independently. I understand that memory isn't shared between different dynos, but I have only one dyno running the server, so I thought there should be only a single instance of this variable/web app. Is the dyno running my server on single/multiple processes? If so, how can I limit it?
Note, this web app does save files to disk, and on each API request I check whether the file exists. Because it does, this tells me that the requests are being served by the same dyno.
Perhaps someone can enlighten me? I'm a beginner to deployment, but willing to learn/understand more!
Is the dyno running my server on single/multiple processes?
Yes, probably:
Gunicorn forks multiple system processes within each dyno to allow a Python app to support multiple concurrent requests without requiring them to be thread-safe. In Gunicorn terminology, these are referred to as worker processes (not to be confused with Heroku worker processes, which run in their own dynos).
We recommend setting a configuration variable for this setting. Gunicorn automatically honors the WEB_CONCURRENCY environment variable, if set.
heroku config:set WEB_CONCURRENCY=3
The WEB_CONCURRENCY environment variable is automatically set by Heroku, based on the processes’ Dyno size. This feature is intended to be a sane starting point for your application. We recommend knowing the memory requirements of your processes and setting this configuration variable accordingly.
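To make this concrete, a typical Procfile entry for such an app looks something like the line below (the module and app object names are illustrative). With WEB_CONCURRENCY=3, Gunicorn forks three worker processes, each holding its own copy of any module-level variables, which is why a counter kept in a global diverges between requests:
web: gunicorn app:app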
The solution isn't to limit your processes, but to fix your application. Global variables shouldn't be used to store data across processes. Instead, store data in a database or in-memory data store.
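For example, the visit counter from the question could live in Redis instead of a module-level variable. A minimal sketch, assuming a Redis add-on that exposes a REDIS_URL config var (the route and key names are illustrative):

import os

import redis
from flask import Flask

app = Flask(__name__)
# Connect to the Redis instance provided by the add-on; fall back to a
# local Redis when REDIS_URL isn't set.
store = redis.from_url(os.environ.get("REDIS_URL", "redis://localhost:6379"))

@app.route("/tracked")
def tracked():
    # INCR is atomic, so the count stays correct across all Gunicorn worker
    # processes and across multiple dynos.
    count = store.incr("tracked_route_visits")
    return f"This route has been visited {count} times"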
Note, this web app does save files to disk, and on each API request I check whether the file exists. Because it does, this tells me that the requests are being served by the same dyno.
If you're just trying to check which dyno you're on, fine. But you probably don't want to be saving actual data to the dyno's filesystem because it is ephemeral. You'll lose all changes made to the filesystem whenever your dyno restarts. This happens frequently (at least once per day).
I currently have one hobby dyno and I'd like to upgrade it to Standard 1x because of the preboot feature laid out here: https://devcenter.heroku.com/articles/preboot
Instead of stopping the existing set of web dynos before starting the new ones, preboot ensures that the new web dynos are started (and receive traffic) before the existing ones are terminated. This can contribute to zero downtime deployments.
The wording is confusing because it sounds like I must have more than one dyno for it to work: an old one and a new one. Is this true? Or can I do zero-downtime deploys with just one Standard dyno?
This also works with one dyno, since the second dyno is then handled by Heroku in the background. We use it heavily for all kinds of applications.
The article already states most of the important details.
If I make a change to the plugin's Unsecure Configuration Settings (does Unsecure vs. Secure matter here?), will that force all instances of the plugin to get the newest settings, or does it take a while for the configuration settings to propagate?
The change does have to propagate to each front end web server, so I have definitely seen very short delays before, but I'm talking seconds. The vast majority of the time, as soon as I hit Update Step and then initiate whatever action in the UI, I can see that the plugin ran with the updated configuration value.
I'm less sure about delays when it comes to plugins running in the async service. Meaning, if 30 async plugins/workflows are queued up and you then make your config change, I'm not sure whether those queued-up async jobs will use the new value or the old one.
One easy way to investigate this would be for your plugin to write to the trace log and then set the trace log level in system settings to all. Plugin-trace log records show what configuration values the plugin ran with.
My Question:
If I see these lines in heroku logs, does that imply that preboot is disabled? If not, why not?
2015-02-04T14:48:00.674205+00:00 heroku[web.1]: State changed from up to starting
2015-02-04T14:48:00.720515+00:00 heroku[web.2]: State changed from up to starting
My understanding is that preboot should fire up brand new dynos, get them ready to serve requests, start routing requests to them, then shut down the old dynos. Nowhere in that process would I imagine dynos changing from up to starting.
The Background:
I'm working on a deploy script that automatically toggles preboot depending on whether any database changes will be made. In testing the script, I'm watching the logs hoping to determine whether preboot is actually being used when it should. I see preboot turning on in the console output of my script:
Enabling preboot for <snip>... done
Yet in the logs I am seeing what I pasted at the top. I'm trying to reconcile these facts.
One way to verify this is to watch the dyno ID number change. You can see the dyno ID by adding log-runtime-metrics to your app.
source=web.1 dyno=heroku.2808254.d97d0ea7-cf3d-411b-b453-d2943a50b456 sample#load_avg_1m=2.46 sample#load_avg_5m=1.06 sample#load_avg_15m=0.99
You can watch for that "dyno" value to change once the new dynos are accepting requests.
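If you don't have it enabled yet, log-runtime-metrics is a Heroku Labs feature and takes effect after a restart (the app name is a placeholder):
heroku labs:enable log-runtime-metrics -a <app-name>
heroku restart -a <app-name>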
Is there a way to identify the heroku dyno name (e.g. web.1, web.2) from within the application? I'd like to be able to generate a unique request id (e.g. to track requests between web and worker dynos for consolidated logging of the entire request stack) and it seems to me that the dyno identifier would make a decent starting point.
If this can't be done, does anyone have a fallback recommendation?
Recently this issue has been addressed by the Heroku team.
The Dyno Manager adds a DYNO environment variable that holds the identifier of your dyno, e.g. web.1, web.2, foo.1, etc. However, the variable is still experimental and subject to change or removal.
I needed that value (actually the instance index, like 1, 2, etc.) to initialize a flake ID generator at instance startup, and this variable worked perfectly fine for me.
You can read more about it under Local environment variables.
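As an illustration, extracting the numeric instance index from DYNO to seed an ID generator might look like this (a sketch; the local fallback value is my own assumption):

import os

# DYNO looks like "web.1", "worker.2", etc.; fall back to a dummy value for
# local runs where the variable isn't set.
dyno_name = os.environ.get("DYNO", "local.0")
instance_index = int(dyno_name.rsplit(".", 1)[1])  # e.g. 1 for "web.1"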
I asked this question of Heroku support, and since there are others here who have asked similar questions to mine I figured I should share it. Heroku staff member JD replied with the following:
No, it's not possible to see this information from inside the dyno. We've reviewed this feature request before and have chosen not to implement it, as this would introduce a Heroku-specific variable which we aim to avoid in our stack. As such, we don't have plans to implement this feature.
You can generate / add to your environment a unique identifier (e.g. a UUID) on dyno boot to accomplish a similar result, and you can correlate this to your app's dynos by printing it to your logs at that time. If you ever need to find it later, you can check your logs for that line (of course, you'll need to drain your logs using Papertrail, Loggly, etc., or to your own server).
Unfortunately for my scenario, a UUID is too long (if I wanted such a large piece of data, I would just use a UUID to track things in the first place). But it's still good to have an official answer.
Heroku has a $DYNO environment variable, however there are some big caveats attached to it:
"The $DYNO variable is experimental and subject to change or removal." So they may take it away at any point.
"$DYNO is not guaranteed to be unique within an app." This is the more problematic one, especially if you're looking to implement something like Snowflake IDs.
For the problem you're attempting to solve, the router request ID may be more appropriate. Heroku passes a unique ID to every web request via the X-Request-ID header. You can pass that to the worker and have both the web and worker instance log the request ID anytime they log information for a particular request/bit of work. That will allow you to correlate incidents in the logs.
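A rough sketch of that pattern in Flask: read the header the router sets, log it, and include it in whatever payload you hand to the worker (the queueing call is a placeholder):

from flask import Flask, request

app = Flask(__name__)

@app.route("/work", methods=["POST"])
def work():
    # Heroku's router sets a unique X-Request-ID header on every request.
    request_id = request.headers.get("X-Request-ID", "unknown")
    app.logger.info("received job, request_id=%s", request_id)
    # enqueue_job stands in for whatever queue you use (RQ, Celery, ...);
    # the worker would log the same request_id when it processes the job.
    # enqueue_job(payload=request.get_json(), request_id=request_id)
    return {"queued": True, "request_id": request_id}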
This may not exactly answer the question, but you could have a different line in your Procfile for each worker process (using a ps:scale of 1 for each). You could then pass in the worker number as an environment variable from the Procfile.
Two lines from an example Procfile might look like this:
worker_1: env WORKER_NUMBER=1 node worker
worker_2: env WORKER_NUMBER=2 node worker
The foreman package that heroku local uses seems to have changed the ENV variable name again (heroku/7.54.0). You can now get the worker name via $FOREMAN_WORKER_NAME when running locally. It has the same value $DYNO will have when running on Heroku (web.1, web.2, etc.).
The foreman gem still uses $PS, so to access the dyno name and have it work both on Heroku and in development (when using foreman), you can check $PS first and then $DYNO. To handle the case of a local console, check whether Rails::Console is defined:
dyno_name = ENV['PS'] || ENV['DYNO'] || (defined?(Rails::Console) ? "console" : "")
It's dangerous to use the DYNO environment variable because its value is not guaranteed to be unique. That means you can have two dynos running at the same time that briefly have the same DYNO variable value. The safe way to do this is to enable dyno metadata and then use the HEROKU_DYNO_ID environment variable. That will better let you generate unique request ids. See: https://devcenter.heroku.com/articles/dyno-metadata
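As a sketch of what that looks like in practice: dyno metadata is a Heroku Labs feature, and once it's enabled (and the app restarts) the dyno's unique ID is exposed in the environment (the app name is a placeholder):
heroku labs:enable runtime-dyno-metadata -a <app-name>
Then, in the application:

import os

# Unique per dyno instance once the runtime-dyno-metadata feature is enabled.
dyno_id = os.environ.get("HEROKU_DYNO_ID")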