Is there a hard limit on how long Azure role startup can take? - windows

Suppose I include a rather long-running startup task in my Azure role, one that runs for up to several minutes. What happens if the startup task runs "too long"?
I'm currently testing on the Compute Emulator and observe the following.
I have a 450-megabyte .zip file together with Info-ZIP unzip. The startup task unzips the archive. The deployment starts and I watch Task Manager. Numerous service processes start, then unzip.exe is run. After about two minutes all those processes stop and then start anew, and unzip.exe starts again.
So it looks like a deployment is allowed to run for about two minutes, then is forcefully reset and started again.
Is this the expected behavior? Does it persist in the real cloud? Are there any hard limits on how long a role startup can take? How do I address this situation other than by moving the unpacking into RoleEntryPoint.OnStart()?

I had the same question, so I tried an experiment. I ran a Startup Task with taskType="simple", so that it would block the Roles from beginning to execute, and let it run for 50 hours. The Fabric Controller did not complain and the portal did not show any error. The task finished its long "do nothing" loop after the 50 hours were up and exited, and my Web Role started up fine.
So my empirical test says Startup Tasks can take a long time! At least 50 hours.

This should inform the load balancer that your process is still busy:
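A minimal sketch of one way to do this from role code, assuming the classic `Microsoft.WindowsAzure.ServiceRuntime` API: handle the `RoleEnvironment.StatusCheck` event and call `SetBusy()` while initialization is still in progress. The `WebRole` class name, the `initializing` flag, and the `MarkReady` helper are illustrative names, not part of the original post.

```csharp
using Microsoft.WindowsAzure.ServiceRuntime;

public class WebRole : RoleEntryPoint
{
    // Illustrative flag: true while the long-running work is still going on.
    private static volatile bool initializing = true;

    public override bool OnStart()
    {
        // StatusCheck is raised periodically; SetBusy() reports the instance
        // as busy for that check, so it should stay out of rotation.
        RoleEnvironment.StatusCheck += (sender, e) =>
        {
            if (initializing)
            {
                e.SetBusy();
            }
        };

        return base.OnStart();
    }

    // Hypothetical helper: call this once the long-running work is done.
    public static void MarkReady()
    {
        initializing = false;
    }
}
```

Once `MarkReady()` has been called, the handler stops calling `SetBusy()`, and later status checks should report the instance as ready again.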

I have run startup tasks that take a pretty long time (think 20-30 mins) and the role simply sits in a 'Busy' state. I don't think there is a hard limit on how long the role will stay in that state, as long as the Startup task is still executing and has not exited with a non-zero return code. (A related gotcha for most first-time startup task creators is a task that pops a prompt and therefore never exits.) The FC is technically still running just fine, so there would be no reason to 'recover' the role (i.e. heartbeats are still going).
The dev emulator just notices when the role hasn't started and warns you. If you click the 'keep waiting' option, it will continue to run the Startup task to completion. The cloud, of course, does not warn you.
Never tried a task that ran super long, so there might be a very long limit. I seem to recall 3 hrs was a magic number in some timeout cases like role recycles, but I have never tried...

There are some heartbeats that the Azure Fabric Agent will do against the role. If these are not acknowledged (say, because of a long-running blocking process), this could cause the role to be flagged as unavailable.
You might try putting your startup process into a background thread that runs independently. This should help keep the role from being recycled while the process is starting up. Just keep in mind you may need to make some adjustments if you get requests before the role has fully started up. There's also a way (that I can't seem to recall ATM) to flag the role and take it out of the load balancer temporarily while your process completes.
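The background-thread idea might look roughly like the sketch below. It is written under assumptions rather than taken from a tested deployment: it uses the classic `Microsoft.WindowsAzure.ServiceRuntime` API, `PrepareEnvironment` is a hypothetical placeholder for the long-running work (such as unpacking an archive), and the `ready` flag is an illustrative name. Reporting the instance as busy until the work finishes is one way to handle requests arriving too early.

```csharp
using System.Threading;
using Microsoft.WindowsAzure.ServiceRuntime;

public class WorkerRole : RoleEntryPoint
{
    // Illustrative flag: set once the background preparation has finished.
    private static volatile bool ready;

    public override bool OnStart()
    {
        // Report the instance as busy on each status check until ready.
        RoleEnvironment.StatusCheck += (sender, e) =>
        {
            if (!ready)
            {
                e.SetBusy();
            }
        };

        // Run the slow work on a background thread so OnStart can return.
        var worker = new Thread(() =>
        {
            PrepareEnvironment();
            ready = true;
        });
        worker.IsBackground = true;
        worker.Start();

        return base.OnStart();
    }

    // Hypothetical placeholder for the long-running preparation step.
    private static void PrepareEnvironment()
    {
        // e.g. unpack the archive here
    }
}
```

In this sketch `OnStart` returns promptly, so the role host is never blocked, while the status check handler keeps the instance marked as busy until `PrepareEnvironment` completes.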


