I have a Play 2.x app up and running on Heroku with a single web dyno.
On startup, an Akka actor is triggered which itself schedules future jobs (e.g. sending push notifications).
object Global extends GlobalSettings {
  override def onStart(app: Application) {
    val actor = Akka.system.actorOf(Props[SomeActor])
    Akka.system.scheduler.scheduleOnce(0 seconds, actor, None)
  }
}
This works fine with one web dyno but I am curious to know what happens if I turn up the number of web dynos.
Will onStart be executed twice with two web dynos?
It would be great if Global really worked globally and onStart were executed only once, regardless of the number of web dynos. If not, the dynos have to somehow agree on which one is responsible for doing the job.
Did anybody run into a similar issue?
If you run two web dynos, your global will be executed twice. Global is global to the process. When you scale your web process to two dynos, you are running two processes. You have a couple of options:
Use a different process (aka a singleton process) to run your global. The nice thing about Play is that you can have multiple GlobalSettings implementations. When you start your process, you specify the global you want to use with -Dapplication.global=YourSecondGlobal. In your Procfile, then, you would have:
singleton: target/start -Dhttp.port=${PORT} ${JAVA_OPTS} -Dapplication.global=YourSecondGlobal
Start your web processes and singleton process and make sure singleton is scaled to 1.
Use a distributed semaphore to obtain a lock. Each process will then race to obtain the lock -- the one that wins will proceed and the others will fail. If you're using Postgres (as many people do on Heroku), an advisory lock is a good choice.
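A rough sketch of the advisory-lock option, written in Python rather than Scala so it can be run on its own. It assumes a DB-API connection to Postgres (for example from psycopg2); the function name try_become_leader and the lock key 4242 are made up for illustration. pg_try_advisory_lock is the non-blocking Postgres function, so losers return immediately instead of waiting.

```python
def try_become_leader(conn, lock_key=4242):
    """Return True if this process won the advisory lock, False otherwise.

    `conn` is assumed to be a DB-API connection to Postgres. The lock is
    session-level: it stays held until the connection is closed, so the
    winner must keep `conn` open for as long as it runs the scheduled job.
    """
    cur = conn.cursor()
    try:
        # Non-blocking: returns true for exactly one session per key.
        cur.execute("SELECT pg_try_advisory_lock(%s)", (lock_key,))
        (won,) = cur.fetchone()
        return bool(won)
    finally:
        cur.close()


# Possible usage at startup (DATABASE_URL and start_scheduler are assumptions):
#   import os, psycopg2
#   conn = psycopg2.connect(os.environ["DATABASE_URL"])
#   if try_become_leader(conn):
#       start_scheduler()
```

If the winning dyno dies, its connection closes and the lock is released, so another dyno would need to retry periodically to take over.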
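You can also get the dyno name at runtime:
String dyno = System.getenv("DYNO");
so a check like this may also work:
// Constant-first equals avoids a NullPointerException when DYNO is unset (e.g. locally).
if ("web.1".equals(dyno)) {
    // start the scheduled job only on this dyno
}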
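For comparison, the same idea as a small Python sketch. The helper name is_primary_dyno is hypothetical, and treating web.1 as the "primary" is only a convention this check relies on, not something the platform guarantees.

```python
import os


def is_primary_dyno(env=os.environ, primary="web.1"):
    """Return True only when the DYNO variable names the chosen dyno.

    Outside Heroku (for example on a laptop) DYNO is usually unset, in
    which case this returns False instead of raising.
    """
    return env.get("DYNO") == primary


if is_primary_dyno():
    pass  # start the scheduled job here
```

This is simpler than a lock, but nothing runs the job if web.1 is down, and there may be brief overlap while a dyno is being replaced.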
Related
I am currently using a Hobby dyno to run my web process -- frontend website as well as an API. To execute jobs, I need to run a background process that processes these jobs. Do I need to spin up another dyno that specifically runs background workers? I ask this because I see that Standard 1X/2x dynos include "unlimited background workers" which makes me think that multiple process types can run on a single dyno. It looks like I can run 2 hobby dynos -- one for web, one for workers or upgrade to one of the standard dynos...is this correct?
I believe that yes, if you want a dedicated worker, it needs to be a separate dyno. You can, however, run the background jobs in-process inside the web dyno, which saves you from needing a separate binary and dyno.
For anyone who has used Heroku (and perhaps anyone else who has deployed to a PaaS before and has experience):
I'm confused on what Heroku means by "dynos", how dynos handle memory, and how users scale. I read that they define dynos as "app containers", which means that the memory/file system of dyno1 can't be accessed by dyno2. Makes sense in theory.
The containers used at Heroku are called “dynos.” Dynos are isolated, virtualized Linux containers that are designed to execute code based on a user-specified command. (https://www.heroku.com/dynos)
Also, users can define how many dynos, or "app containers", are instantiated, if I understand correctly, through commands like heroku ps:scale web=1, etc.
I recently created a webapp (a Flask/gunicorn app, if that even matters), where I declare a variable that keeps track of how many users visited a certain route (I know, not the best approach, but irrelevant anyways). In local testing, it appeared to be working properly (even for multiple clients)
When I deployed to Heroku, with only a single web dyno (heroku ps:scale web=1), I found this was not the case, and that the variable appeared to have multiple instances and updated differently. I understand that memory isn't shared between different dynos, but I have only one dyno which runs the server. So I thought that there should only be a single instance of this variable/web app? Is the dyno running my server on single/multiple processes? If so, how can I limit it?
Note, this web app does save files on disk, and through each API request, I check to see if the file does exist. Because it does, this tells me that I am requesting from the same dyno.
Perhaps someone can enlighten me? I'm a beginner to deployment, but willing to learn/understand more!
Is the dyno running my server on single/multiple processes?
Yes, probably:
Gunicorn forks multiple system processes within each dyno to allow a Python app to support multiple concurrent requests without requiring them to be thread-safe. In Gunicorn terminology, these are referred to as worker processes (not to be confused with Heroku worker processes, which run in their own dynos).
We recommend setting a configuration variable for this setting. Gunicorn automatically honors the WEB_CONCURRENCY environment variable, if set.
heroku config:set WEB_CONCURRENCY=3
The WEB_CONCURRENCY environment variable is automatically set by Heroku, based on the processes’ Dyno size. This feature is intended to be a sane starting point for your application. We recommend knowing the memory requirements of your processes and setting this configuration variable accordingly.
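To make the worker count explicit, one option is a gunicorn config file. This is only a sketch: the file name gunicorn.conf.py and the helper worker_count are my own choices, and the fallback of 1 worker is an assumption, not a Heroku or Gunicorn default you should rely on.

```python
# gunicorn.conf.py (pass it with: gunicorn -c gunicorn.conf.py app:app)
import os


def worker_count(env=os.environ, default=1):
    """Read WEB_CONCURRENCY, falling back to `default` if unset or invalid."""
    raw = env.get("WEB_CONCURRENCY", "")
    try:
        n = int(raw)
    except ValueError:
        return default
    return n if n >= 1 else default


# `workers` is the Gunicorn setting for the number of worker processes.
workers = worker_count()
```

With heroku config:set WEB_CONCURRENCY=1 this gives a single worker process, which answers the "how can I limit it" part of the question, though it does not make in-memory state survive a dyno restart.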
The solution isn't to limit your processes, but to fix your application. Global variables shouldn't be used to store data across processes. Instead, store data in a database or in-memory data store.
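A minimal sketch of what that could look like for the visit counter. It assumes a store object with an incr() method, which is the call redis-py's client exposes; InMemoryStore and record_visit are hypothetical names, and the in-memory class is only there so the example runs without a Redis server.

```python
class InMemoryStore:
    """Local stand-in with the same incr() call shape as a Redis client.

    It lives in one process only, so it has the same limitation as a
    global variable; use it for local testing, not on a multi-worker dyno.
    """

    def __init__(self):
        self._data = {}

    def incr(self, name, amount=1):
        self._data[name] = self._data.get(name, 0) + amount
        return self._data[name]


def record_visit(store, route):
    """Bump and return the visit count for `route`.

    The increment happens inside the store, so with a real Redis client
    every Gunicorn worker (and every dyno) sees the same counter.
    """
    return store.incr(f"visits:{route}")


# Possible production wiring (REDIS_URL is an assumption about your add-on):
#   import os, redis
#   store = redis.Redis.from_url(os.environ["REDIS_URL"])
```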
Note, this web app does save files on disk, and through each API request, I check to see if the file does exist. Because it does, this tells me that I am requesting from the same dyno.
If you're just trying to check which dyno you're on, fine. But you probably don't want to be saving actual data to the dyno's filesystem because it is ephemeral. You'll lose all changes made to the filesystem whenever your dyno restarts. This happens frequently (at least once per day).
I subscribed to a Hobby plan on Heroku.
The details of the plan specify that it allows up to 10 Process Types.
So I developed an app with the following Procfile:
backend-dev: node ./backend-dev/backend.js
backend-prod: node ./backend-prod/backend.js
Which represents 2 Process Types, right?
But when I run it with:
heroku ps:scale backend-dev=1
heroku ps:scale backend-prod=1
I end up with two Hobby Dynos...
As the plan also specifies 7€/month/Dyno I am billed 14€/month.
So my questions are:
What is the difference between Process Types and Dynos?
Can I run 2 Process Types within a single Dyno?
Can I run for instance 1 free Dyno (for backend-dev) and 1 Hobby Dyno (for backend-prod)?
Consider a simple web application with a background worker, so it has a web process and a worker process. When such an app receives a lot of web traffic but processes very few background jobs, you can increase the number of dynos for the web process while keeping only one dyno for the worker process. It is also possible to use a different dyno size per process type: instead of adding more dynos, you can use a performance-l dyno for the web process and a standard-1x dyno for the worker process. In other words, Process Types describe different processes that work together within one application. They are not supposed to be different applications, as in your case.
No. Each dyno runs a single Process Type, although one Process Type can run on multiple dynos.
Technically you can run one process on a free dyno and another on a hobby dyno, but it won't work in your case. Once you upgrade to professional dynos, all processes must run on professional dynos.
Your Procfile is wrong as well. You must have a Process Type named web to receive web traffic. If you start your current setup, you will be running two processes, but they will never receive any web requests. As described in the Heroku docs, only the web process can receive web traffic, and you can only have one such process type per app. So to run two versions of your app, you need to create two different Heroku applications. Ideally you should make your app configurable via environment variables so that you can deploy the same code to both apps.
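The question uses Node, but the environment-variable idea is the same in any language. Here is a small Python sketch of it. APP_ENV is a made-up variable name you would set yourself on each Heroku app (for example with heroku config:set); PORT is the variable Heroku provides to the web process. The defaults are assumptions for local runs.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    env_name: str
    port: int
    debug: bool


def load_config(env=os.environ):
    """Build the app configuration from environment variables only."""
    env_name = env.get("APP_ENV", "development")
    return Config(
        env_name=env_name,
        port=int(env.get("PORT", "5000")),
        # Anything that is not explicitly production runs in debug mode.
        debug=env_name != "production",
    )
```

The dev app and the prod app then run identical code and differ only in their config vars.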
Very similar question to Is it feasible to run multiple processeses on a Heroku dyno?, or Running Heroku background tasks with only 1 web dyno and 0 worker dynos except I'm talking about a Ruby on Rails app.
Context:
I understand that it's encouraged to separate worker and web dynos... but I'm still testing and don't want to pay the expense. Especially because with my app, all the web requests pretty much happen either in the AM or in the PM, and during the whole middle of the day (and also middle of the night), literally nothing is happening.
I'd like the web dyno to run two types of background processing on the "downtime":
A recurring, long-running task every day (mailings)
An asynchronous, long-running task that is triggered when a user performs a certain action (it's a mailer)
I've done quite a bit of reading on this, but this is my first time doing anything asynchronous, so I wanted to ask the community a couple of questions just to ensure what I'm trying to do is feasible.
Questions
How do I do activity #1 for free?
To put it bluntly: given my context above, if I use Heroku's Scheduler add-on, it runs a one-off dyno that I'll be charged for, because I currently use New Relic to ping my web dyno constantly so that it never sleeps, which means my one web dyno uses up my free dyno hours. Is there another way of doing this with the one web dyno, which won't be processing any requests in the middle of the night? Alternatively, is there a way to tell New Relic to stop pinging at certain times, which would let me spin up a one-off dyno and still stay within my free dyno hours?
For activity #2, I'm thinking of using Delayed Job, but how do I tell it to wait until the end of user 1's session and then run the mailer for user 1, pause again if user 2 sends a request, and then, when user 2 is finished, pick up where it left off on user 1's mailer before doing user 2's mailer, and so forth? I think the root of the challenge is that, from what I've read, Delayed Job needs to be started with a script, but I'm not going to be at my computer starting a script all the time. How do I make the start (and the queueing described above) happen automatically?
I would love even just some pointers on which methods to use, what to consider, etc.
I'm going to check out a nifty gem https://github.com/brandonhilkert/sucker_punch to do this. According to the author, it was written specifically to use Heroku's single dyno for hobby websites that have no need to spin up another dyno. It basically creates another thread.
FYI, there is also an add-on gem that allows sucker_punch to do recurring tasks: https://github.com/facto/fist_of_fury
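sucker_punch itself is a Ruby gem, so this is not its API. As a language-neutral illustration of the underlying idea (run the slow job on a background thread inside the web process), here is a Python sketch using the standard library's ThreadPoolExecutor. send_mail and enqueue_mail are hypothetical names, and send_mail is a placeholder for the real mailer.

```python
from concurrent.futures import ThreadPoolExecutor

# One background thread is enough for occasional mail jobs.
_executor = ThreadPoolExecutor(max_workers=1)


def send_mail(recipient):
    """Placeholder for the slow mailer call."""
    return f"sent to {recipient}"


def enqueue_mail(recipient):
    """Queue the mailer on the background thread and return immediately.

    The web request does not wait for the mail to be sent. The returned
    Future can be ignored or inspected later.
    """
    return _executor.submit(send_mail, recipient)
```

The trade-off with any in-process queue is that pending jobs live only in memory, so they are lost if the dyno restarts before they run.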
If I have an app on Heroku that consists of one worker and one or no web dynos, will it run? I'm unsure if the absent or idling web dynos will cause the worker dyno not to run.
Heroku doesn't just run web dynos. In fact, it makes no assumptions at all with regard to the processes you're running. There's absolutely nothing wrong with launching a single worker process.
This is actually a common scenario for me when deploying single cron-like tasks to Heroku. I've written about it here: http://blog.y3xz.com/blog/2012/11/16/deploying-periodical-tasks-on-heroku/
If you are looking for cron-like tasks for simple jobs (like I am), now you have another alternative: Heroku Scheduler. It is easy to configure in a dashboard.
Advantages:
No need to choose and learn a new scheduler library. Configure it in seconds.
Same way for different platforms: Python, Ruby, etc.
Saves dyno hours for Free plan users, because only the actual working time counts. Some scheduler libraries (like Rufus Scheduler) keep a process running between launches so that they do not rely on cron.
Disadvantage:
Limited options. You can only choose among "Daily"/"Hourly"/"Every 10 minutes".
Conclusion: Best for basic use.