How do I hard stop an Azure role? - windows

Here's my scenario: my Azure web role does a lot of work in OnStart() and produces a huge debug trace that is uploaded to Blob Storage.
Now OnStart() hangs for whatever reason, and when I look into Blob Storage I see that the trace has not been updated for several minutes. So I decide the role is beyond repair and I want to shut it down immediately so that I can update the role with another package and start it again.
The problem is that when I hit "Stop" in the Management Portal it takes up to ten minutes to stop the role - I guess it tries to convince the role to stop gracefully and waits for several minutes.
Can I somehow make the role stop immediately without letting it stop gracefully?

I wonder if deleting the deployment (that's presumably what you're going to do after stopping it?) is faster, but I'm not sure. As far as I know, there's only one kind of "stop," so no, I don't think there's a way to force a faster stop.

Have a look at the Windows Azure Platform PowerShell Cmdlets.
It should give you at least the same functionality and probably more control over the actions. You could also request the current status as it is not always reflected immediately in the Silverlight portal.

Related

Stateful application in Azure

The issue I have is that I'm using a third-party DLL for a very expensive operation; it's not serializable, and it takes a minute to spin up each time. It's needed on each call of a WCF service, and I can't keep it in memory (recycling) or in a cache (unserializable).
I was wondering what alternatives (if any) there are? I was originally thinking about using a Worker Role, but then I read that they are recycled too. Then I considered a Windows service, but I'm hoping there is something better suited.
I'd like to think I'm not the only one with this issue, and that someone else has already solved this issue! :)
Why are you unable to use Worker Roles or Web Roles to keep the data generated by your process in memory? Neither of the two roles "flushes" its memory on a frequent basis. True, it is not guaranteed that reboots never happen, but they are very rare. Checking whether your stateful data is empty and repopulating it when it is shouldn't be a big deal, and the logic would work the same way on any server, whether it is a Cloud Service or a dedicated VM.
Edit: Web roles and worker roles do not restart on any known cycle. However, by default IIS does recycle on a schedule. This timer can be changed or disabled via a startup script.
Furthermore, no such recycling happens in worker roles. So, if you're running a worker role, the object will stay in memory as long as you don't recycle the server yourself or a rare Windows update happens.
HTH
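The "check whether the state is empty, then repopulate" idea above can be sketched as a lazily built, in-memory holder. This is only an illustration in Python rather than the .NET code a real role would use; `LazyResource` and `slow_factory` are hypothetical names, and `slow_factory` merely stands in for the one-minute third-party start-up.

```python
import threading
import time


class LazyResource:
    """Keeps one expensive object in memory and rebuilds it only when missing."""

    def __init__(self, factory):
        self._factory = factory          # hypothetical: wraps the slow third-party setup
        self._instance = None
        self._lock = threading.Lock()

    def get(self):
        # Fast path: the object survived since the last process start or recycle.
        if self._instance is not None:
            return self._instance
        with self._lock:
            # Re-check inside the lock so only one caller pays the start-up cost.
            if self._instance is None:
                self._instance = self._factory()
            return self._instance


build_count = 0


def slow_factory():
    """Stand-in for the expensive initialisation (assumption, not the real DLL)."""
    global build_count
    build_count += 1
    time.sleep(0.01)
    return object()


engine = LazyResource(slow_factory)
first = engine.get()    # builds the object
second = engine.get()   # reuses the object already in memory
```

After a recycle or reboot the holder starts out empty again, so the first call simply rebuilds the object and later calls reuse it.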

Azure co-located cache starting too long

I am working on an Azure solution (Azure SDK 2.1) with one web role (2 instances) and one worker role (2 instances). Both are using co-located (in-role) caching. The problem is that the cache service on the worker role instances takes way too long to start - for several minutes every call to the cache returns only DataCacheExceptions saying that the cache is temporarily unavailable, etc.
From your experience, is this normal? I think that cache service should be part of the "provisioned" environment, and should be already ready when Run method is called.
Is there anything I can do to handle this? Maybe some "event" to know when the cache is ready? A way to tell the Azure fabric to run my worker code only when the cache is ready, etc.?
We have run into the same issue, and most people have suggested a 2-3 minute sleep in your Run method as a workaround.
I am pretty sure there is an item on the Microsoft Azure backlog for this, and they don't really have a workaround other than waiting to use caching until it is ready.
I know this is not the most elegant solution, but in my talks with different people at Microsoft this has been their suggestion for the time being.
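As a variation on the fixed sleep, the wait can be written as a probe-and-retry loop that returns as soon as the cache answers. The sketch below is in Python purely for illustration; `CacheUnavailable` stands in for DataCacheException, and `wait_for_cache`, `probe`, and the timing values are assumptions, not part of the Azure SDK.

```python
import time


class CacheUnavailable(Exception):
    """Stand-in for the DataCacheException raised while the cache warms up."""


def wait_for_cache(probe, timeout=300.0, delay=1.0, max_delay=15.0, sleep=time.sleep):
    """Call probe() until it stops raising CacheUnavailable.

    Returns the number of attempts made. Re-raises the last error once the
    next sleep would push the total time slept past `timeout` seconds.
    Only time spent sleeping is counted, not time spent inside probe().
    """
    waited = 0.0
    attempts = 0
    while True:
        attempts += 1
        try:
            probe()                      # e.g. a cheap get/put against the cache
            return attempts
        except CacheUnavailable:
            if waited + delay > timeout:
                raise
            sleep(delay)
            waited += delay
            delay = min(delay * 2, max_delay)   # back off, but cap the wait


# Usage with a fake probe that fails three times before succeeding.
calls = {"n": 0}


def fake_probe():
    calls["n"] += 1
    if calls["n"] < 4:
        raise CacheUnavailable("cache is temporarily unavailable")


attempts = wait_for_cache(fake_probe, sleep=lambda seconds: None)
```

The role would call something like this at the top of its Run method and start real work only after it returns.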

What exactly happens when I change number of Azure role instances?

I observe the following weird behavior. I have an Azure web role which is deployed on the live Azure cloud. Now I click "Configure" in the Azure Management Portal and change the number of instances - the portal shows some "activity". Now I open the browser, navigate to the URL assigned to my deployment, and start refreshing the page about once every two seconds. The page reloads fine many times and then for some time it stops reloading - the request is rejected; then after something like half a minute the requests are handled normally again.
What is happening? Is the web server temporarily stopped? How do I change number of instances so that HTTP requests to the role are handled at all times?
When you change the configuration file, your current instance might be restarted. This might be why your website didn't respond for about 30 seconds.
Please have a look at http://msdn.microsoft.com/en-us/library/microsoft.windowsazure.serviceruntime.roleenvironment.changing.aspx and check whether it's because of the role restarting.
What you are doing is manual. Have you looked at the SDK for autoscaling Azure?
http://channel9.msdn.com/posts/Autoscaling-Windows-Azure-applications
Check out the demo at the 18 minute mark. It doesn't answer your question directly, but it's a much more configurable/dynamic way of scaling Azure.
Azure updates your roles one update domain at a time, so in theory you should see no downtime when updating the config (provided you have at least two instances). However, if you refresh the browser every couple of seconds, it's possible that your requests always go to the same instance due to keep-alive.
It would be interesting to know what the behavior is if you disable keep-alives for your web role. Note that this will have a performance impact, so you'll probably want to re-enable keep-alives after the exercise.
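A related experiment that does not touch the role's configuration is to probe from a client that opens a fresh TCP connection for every request, so no keep-alive connection can pin the requests to one instance. This is a different technique from disabling keep-alives on the server, and the sketch below is only a rough test harness; `probe` is a hypothetical helper and the host and port are whatever your deployment uses.

```python
import http.client


def probe(host, port, path="/", timeout=5.0):
    """Issue one GET on a brand-new connection; return the status or None."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        # "Connection: close" asks the server not to keep the socket open.
        conn.request("GET", path, headers={"Connection": "close"})
        response = conn.getresponse()
        response.read()
        return response.status
    except (OSError, http.client.HTTPException):
        # Refused, reset, or timed-out requests are reported as None.
        return None
    finally:
        conn.close()
```

Calling `probe` in a loop every couple of seconds while the instance count changes would show whether the failures persist once each request uses its own connection.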

What process API do I need to hook to track services?

I need to track to a log when a service or application in Windows is started, stopped, and whether it exits successfully or with an error code.
I understand that many services do not log their own start and stop times, or if they exit correctly, so it seems the way to go would have to be inserting a hook into the API that will catch when services/applications request a process space and relinquish it.
My question is what function do I need to hook in order to accomplish this, and is it even possible? I need it to work on Windows XP and 7, both 64-bit.
I think your best bet is to use a device driver. See PsSetCreateProcessNotifyRoutine.
Windows Vista has NotifyServiceStatusChange(), but only for single services. On earlier versions, it's not possible other than polling for changes or watching the event log.
If you're looking for a user-space solution, EnumProcesses() will return a current list. But it won't signal you when something changes; you'd have to poll it continually and act on the differences.
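The poll-and-diff approach can be sketched as follows. This is an illustration in Python, not Win32 code: `take_snapshot` is a hypothetical callable that would wrap EnumProcesses() in a real tool, and here it is fed fake process ID lists.

```python
import time


def diff_snapshots(previous, current):
    """Return (started, stopped) process IDs between two snapshots."""
    prev_ids, curr_ids = set(previous), set(current)
    return sorted(curr_ids - prev_ids), sorted(prev_ids - curr_ids)


def watch(take_snapshot, rounds, on_change, interval=0.0, sleep=time.sleep):
    """Poll take_snapshot() `rounds` times and report each difference."""
    previous = take_snapshot()
    for _ in range(rounds):
        sleep(interval)
        current = take_snapshot()
        started, stopped = diff_snapshots(previous, current)
        if started or stopped:
            on_change(started, stopped)
        previous = current


# Usage with fake snapshots: PID 300 starts, then PID 100 exits.
snapshots = iter([[4, 100, 200], [4, 100, 200, 300], [4, 200, 300]])
events = []
watch(lambda: next(snapshots), rounds=2,
      on_change=lambda started, stopped: events.append((started, stopped)))
```

A process that starts and exits between two polls never appears in any snapshot, which is the main weakness of polling compared with a kernel notification.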
If you're watching for a specific application or set of applications, consider assigning them to Job Objects, which are all about allowing you to place limits on processes and manage them externally. I think you could even associate Explorer with a job object, then all tasks launched by the user would be associated with your job object automatically. Something to look into, perhaps.

Does Azure role need to clean up local resources before terminating?

Suppose my Azure role is notified that it will be terminated soon and technically can clean up local resources after itself (temporary files for example). Should it do so?
I'm not asking about whether someone will see my leftover temporary files - just how my role can be a polite good Azure citizen.
Does it make sense for the role to clean up local resources or should it just leave everything as is?
Like Stuart said, there's no reason to do any local storage cleanup. You either leave it for yourself to use in the future (which is not guaranteed), or you have the local storage cleaned up automatically after your role instance shuts down.
What you do want to do during shutdown is release blob leases, close open sessions, shut down database connections, etc. You won't have this opportunity if the Guest OS (or Host OS) crashes, but you always want to handle graceful shutdowns when possible.
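One way to picture this kind of shutdown handling is a worker that registers a release action each time it takes an external resource and runs them all when told to stop. The sketch is in Python for illustration only; `Worker`, `acquire`, and `on_stop` are hypothetical names, and the log entries stand in for real lease, session, and connection calls.

```python
from contextlib import ExitStack


class Worker:
    """Registers a release action for every external resource it takes."""

    def __init__(self):
        self._cleanup = ExitStack()
        self.log = []

    def acquire(self, name):
        # Pretend to take the resource, then remember how to give it back.
        self.log.append(f"acquire {name}")
        self._cleanup.callback(self.log.append, f"release {name}")

    def on_stop(self):
        # ExitStack runs the callbacks in reverse order of registration.
        self._cleanup.close()


worker = Worker()
worker.acquire("blob lease")
worker.acquire("database connection")
worker.on_stop()
# worker.log now ends with the database connection released before the blob lease
```

Nothing here deletes local temporary files; only resources held outside the instance are released.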
I can't think of any good reason why you should clean up things like temporary files during this shutdown.
Instead, I just use the notification as a graceful way to shut down - hopefully avoiding leaving any jobs "half-finished".
For the issue of temporary files in particular, the LocalStorage feature has a "Clean on Role Recycle" property - you should probably set that to true.
