I configure the recovery for Windows services to restart with a one minute delay after failures. But I have never gotten it to actually restart the service (even with the most blatant errors).
I do get a message in the EventViewer:
The description for Event ID ( 1 ) in Source ( MyApp.exe ) cannot be found. The local computer may not have the necessary registry information or message DLL files to display messages from a remote computer. You may be able to use the /AUXSOURCE= flag to retrieve this description; see Help and Support for details. The following information is part of the event: Access violation at address 00429874 in module 'MyApp.exe'. Write of address 00456704.
Is there something else I have to do? Is there something in my code (I use Delphi) which needs to be set to enable this?
Service Recovery is intended to handle the case where a service crashes - so if you go to taskmgr and right click "end process" on your service process, the recovery logic should kick in. I don't believe that the service recovery logic kicks in if your service exits gracefully (even if it exits with an error).
Also the eventvwr message indicates that your application called the ReportEvent API specifying event ID 1. But you haven't registered your event messages with the event viewer so it can't convert event ID 1 into a meaningful text string.
Service Recovery only works for unexpected exit like (exit(-1)) call.
For all the way we use to stop the service in usual way will not works for recovery.
If you want to stop service and still wants recovery to work, call exit(-1) and you will see error message as "service stopped with unexpected error" , and then your service will restart as recovery setting is.
The Service Control Manager will attempt to restart your service if you've set it up to be restarted by the SCM. This is detailed here in the documentation for the SERVICE_FAILURE_ACTIONS structure.
A service is considered failed when it
terminates without reporting a status of SERVICE_STOPPED to the
service controller.
This can be fine tuned by setting the SERVICE_FAILURE_ACTIONS_FLAG structure's fFailureActionsOnNonCrashFailures flag, see here). You can set this setting from the Services applet by checking the "Enable actions for stops with errors" checkbox on the recovery tab.
If this member is TRUE and the service has configured failure actions, the failure actions are queued if the service process terminates without reporting a status of SERVICE_STOPPED or if it enters the SERVICE_STOPPED state but the dwWin32ExitCode member of the SERVICE_STATUS structure is not ERROR_SUCCESS (0).
If this member is FALSE and the service has configured failure actions, the failure actions are queued only if the service terminates without reporting a status of SERVICE_STOPPED.
So, depending on how you have structured your service, how you have configured your failure actions AND what you do when you have your 'fatal error' it may be enough to call ExitProcess() or exit() and return a non zero value. However, it's probably safest to ensure that your service exits without the code that's dealing with the SCM telling the SCM that your service has reached the SERVICE_STOPPED state. This ensures that your failure actions ALWAYS happen...
If you 'kill' service from task manager - forgot for recovery logic. In background task manager 'kills' process by 'stop service'. and as yuo can guess - this is not service failure. This forced me to kill it really with Visual Studio. In task manager right click on service process. Select debug.
In Visual studio select Debug-> Terminate All.
And now you have simulated service fail. In this case recovery logic works fine.
Related
In a Windows service, my Service Control Handler receives a SERVICE_CONTROL_STOP command. I would like to determine the reason for this command; specifically, I need to know whether the STOP was requested because a depended-upon service ("master") is stopping or because of any other reason. The reason is, if my service stopped because the user requested a stop or because Windows is shutting down or any other similar reason, I don't need to do anything, but if my service is stopping because master is stopping, I need to make sure I restart my service when master restarts.
Unfortunately, I don't really see any source of this information - RegisterServiceCtrlHandlerEx will let me provide a handler which can get some details behind the control event, but there doesn't seem to be a notification that I can use. But maybe there's some other way, e.g. getting the info through the Session Manager or something.
In a Windows service, my Service Control Handler receives a SERVICE_CONTROL_STOP command. I would like to determine the reason for this command
Sorry, but the SCM does not provide that information to services.
specifically, I need to know whether the STOP was requested because a depended-upon service ("master") is stopping or because of any other reason.
There is no way for your service to determine that.
The reason is, if my service stopped because the user requested a stop or because Windows is shutting down or any other similar reason, I don't need to do anything
Detecting Windows shutting down is easy - your service can request to receive SERVICE_CONTROL_PRESHUTDOWN and SERVICE_CONTROL_SHUTDOWN events. For any other stop reason, it will only receive SERVICE_CONTROL_STOP with no explanation as to why.
if my service is stopping because master is stopping, I need to make sure I restart my service when master restarts.
There are two possible ways to handle that:
run a separate process that monitors the status of "master", either by regularly polling QueryServiceStatus() or by using NotifyServiceStatusChange(), and have it start your service when it detects "master" stop and restart.
if "master" logs events in the system log via an ETW provider, you can use ChangeServiceConfig2(SERVICE_CONFIG_TRIGGER_INFO) to register a trigger action that starts your service when a particular event is logged.
Unfortunately, I don't really see any source of this information - RegisterServiceCtrlHandlerEx will let me provide a handler which can get some details behind the control event, but there doesn't seem to be a notification that I can use.
Correct, because there isn't one.
We have a service running on a Windows Server 2003 machine. This service watches a particular folder on an FTP server, and when files appear there, it invokes one of a few different executables to process them.
I've been asked to find a way for staff to be alerted in some way when this service hangs or stops.
Can anyone suggest anything with just this much information? If not, what else would you need to know?
Seems we could write ANOTHER service to watch THIS service, but then there's a chance THAT one would stop ... so we haven't resolved anything.
About the only thing that I know if is writing another application or service that monitors if that service is running; something like that shouldn't have any unexpected behavior and stop, hopefully.
Another thing to do is go to the service in Windows, go to its properties, and then go to recovery options. From here, you can set the behavior of a service if it is to fail. The options in Windows 7 are to restart the service or computer, or run a program. This program could send some sort of notification. However, I don't know if any or all of these options exist in Server 2003. This would also not likely work if the service were to just hang, but a service watching it probably wouldn't either.
Also, if you have the source code, you can override some of the service-related methods such as OnStop() (for C#) to send a notification, but I don't believe this works with a failure.
My personal choice would be to set the recovery options just to restart the service on failure, unless it repeatedly fails, which there is also an option for. But just do what you think will work best for you; there isn't really a fail-safe method to do it.
UPDATE:
I did check, and Server 2003 does indeed have the same recovery options in the service manager. As the guys said above, you can deal with that, but it is only in C++ from what I have seen; there is also a command prompt way to do it:
sc failure [servicename] reset= 0 actions= restart/60000
I found that command here and you can look at it more in its MSDN documentation. You could call this command from C# or other languages if you are not using C++, or use it directly from the command prompt if you do not have the source code.
Use ChangeServiceConfig2() to define Failure Actions for your service. You could use that to invoke an external command to issue the alert (or do pretty much anything else you want) if the service terminates unexpectedly.
The SCM (the component which handles services) has built-in auto-restart logic that you can take advantage of to restart your service, as necessary. Additionally and/or alternatively, you can configure 'custom actions' to be associated with a failure of the service - the custom action can include launching a program of your own, which could then log the failure, and perhaps manually restart your service.
You can read more about such custom actions on MSDN by looking at the documentation of the structure used to configure such actions: SERVICE_FAILURE_ACTIONS. Once you fill that structure, you notify the SCM by calling the ChangeServiceConfig2 function.
Please don't ask "well, what happens if my failure handler program crashes" :)
I'm trying to deploy a patch to a service I created and replace the service file.
For that reason I need to stop the service so the file will be released.
I'm using sc \\remote stop svcname, then I query the service using sc \\remote query svcname until I see that it's state is STOPPED.
At this point the service file should be unlocked, and to be on the safe side I also delete the service using sc \\remote delete svcname.
Still, it doesn't seem to release the file and any deletion or change attempt fails.
I know one solution might be polling the file repeatedly, but I want to avoid this method.
Any suggestions?
Windows don't ensure the process providing the service terminates when the service is stopped (the process may provide more than one service). It just considers the service stopped when it handles the message sent to it.
So if the service process has a bug and does not properly release resources, they may still be locked. I would probably wait a little and than simply terminate the process.
There is also a tool from Microsoft called handle.exe (this is command-line version, they also have a GUI-one) that can list which processes hold the file open. It should be possible to get the same information programmatically, but I am not sure of the exact calls to make (and you need administrator privileges; you have to give them to the tool too). That way you can check whether the file is open, by which process and wait for it to terminate or force-terminate it if you didn't know which one it is.
sc query state= all works as expected from the command line.
From within another Service, sc query state= all doesn't print anything to that sub-process' stdout (captured by the parent, of course).
Is there a permission/privilege that the Service needs in order to list/start/stop the other servies?
A little background: I am making a service that periodically restarts some misbehaving services.
Well, for one don't do that, at least not in a blocking manner. In order for your own service to respond to the SCM (Service Control Manager) in order to return its status, the service has to be able to execute its dispatcher code. This means that if you call this program and wait for it to exit you'll wait indefinitely. One way to mitigate this would be to put this into a separate thread so it's not blocking your dispatching and your service will continue to talk to the SCM.
Alternatively (and probably better) you could use the EnumServicesStatusEx function to talk to the SCM and inquire about the statuses of other services yourself. The function itself doesn't mention anything about being blocking, so you'd have to figure out yourself whether it is and then use a thread again to prevent your service from stopping to talk to the SCM.
One last note: if those misbehaving services are yours, you should more likely fix the respective code. I've had a share of legacy code and had one misbehaving service which got its own helper application as "fault action" (can be configured in service configuration as SERVICE_CONFIG_FAILURE_ACTIONS) that would go about and restart the service whenever it crashed. Once I took that code over, figured out the cause and fixed it, the service was stable again and that application isn't really needed anymore.
I have created a service in Windows and set enteries in Registry so that the service automatically starts on log on.
Now the problem is that in Task manager->Services field, my service's status is Running for only 2-3 minutes after log-on.
After this time my service status turns to Stopped, and it never again switches to running.
It also doesnot do its designated work.
I want to know that what changes in Registry or the Properties of the service can be made to make sure that service is always running.
Chances are that you are getting an unhandled exception which is shutting down the process.
You need to add logging to your windows service - something that will write all exceptions to the event log is a fairly common thing to do.
This will allow you to see why the service is stopping. At this point you will hopefully have enough information to fix the coding error.
I want to know that what changes in Registry or the Properties of the service can be made to make sure that service is always running.
Well, assuming the issue is with your service, there is no configuration that will help.