Phantom Pending Reboot causing SCCM Updates to fail - sccm

Has anyone else encountered this problem:
Every month I apply Windows updates to servers using SCCM Software Update Groups. Some servers are considered lower priority, so I push the updates as required and expect them to install, with the server rebooting if necessary, during its assigned maintenance window. Instead I find that some of the updates are failing. From experience, this is because the system is waiting for a reboot. I would expect SCCM to know that there is a pending reboot and to reboot the server during the maintenance window to finish applying the updates, but it does not. It seems as though these are "pending reboots" that SCCM cannot detect.
As a result, this requires manual intervention each month on a dozen or more servers that have to be manually rebooted in the middle of the night so as to not interrupt production.
One of the biggest culprits is the monthly Malicious Software Removal Tool. It always seems to fail to apply, then works after a reboot.

The computer restart settings can be configured in the "Client Settings" node of the console. Whether you use the default client settings or custom client settings, make sure that the restart temporary notification interval and the final countdown interval are both shorter than the shortest maintenance window applied to the computer (the defaults are 90 and 15 minutes). This matters for any deployment that requires the client to complete a reboot.
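The rule above can be sketched as a small check. This is only an illustration in Python, not anything SCCM provides: the function name and parameters are hypothetical, and the window lengths are assumed to be known already, in minutes.

```python
def restart_intervals_fit(notification_min, countdown_min, window_lengths_min):
    """Return True if both restart intervals are shorter than the shortest
    maintenance window (all values in minutes).

    Hypothetical helper for illustration; it does not query SCCM.
    """
    if not window_lengths_min:
        raise ValueError("at least one maintenance window length is required")
    shortest = min(window_lengths_min)
    return notification_min < shortest and countdown_min < shortest


if __name__ == "__main__":
    # Default intervals (90 and 15 minutes) against two example windows.
    print(restart_intervals_fit(90, 15, [240, 120]))
    # The same defaults against a 60-minute window: the 90-minute interval is too long.
    print(restart_intervals_fit(90, 15, [60]))
```

With the default intervals, any maintenance window of 90 minutes or less fails this check, which is one thing to rule out before looking for a harder-to-detect pending reboot.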
Additionally, you can examine the following client-side logs:
Update deployment logs: UpdatesDeployment.log, WindowsUpdate.log
Reboot and maintenance window logs: RebootCoordinator.log, ServiceWindowManager.log
More details about how to track the Update deployment process in ConfigMgr can be found here.

Related

Server 2012 MS Update Causes Boot Loop - How to Stop or Diagnose?

I am running an EC2 instance with Server 2012. Windows is set to install updates automatically. I am not sure which update is causing it, but it was installed after December 1. When this update reboots the server, the server goes into a boot loop where it comes back up for 20 seconds, then goes back down again. I can't RDP to it given the short window. Restoring the root drive back to the December 1 version works, but with auto-updates enabled, I am afraid it will go down again. I am disabling auto updates for now.
Here are screenshots of "Check for Updates" run against the 12/1 drive version. There are 2 "important" updates, which I assume are the only ones that are auto-installed, and 3 optional.
Important: https://imgur.com/a/Nw8cmDY
Optional: https://imgur.com/a/jK9P3V5
I don't think any of these have a known issue of causing the server to be stuck in a boot loop. If you know which updates were installed at that date, then why not remove those updates and see if it corrects your issue?
What events are triggering in the event viewer when the machine goes down?

How can I remotely detect if a Windows server is ready for login

In order to build an automated deployment pipeline, I need to be able to clone and deploy Windows Server virtual machines, sysprep them, and then perform various customisation tasks on them.
Some steps, such as sysprep, require a reboot, so I currently simply wait for the reboot to happen, and once the machine comes online again I can execute the customisation on it.
The problem with this is that sysprep performs various actions after the reboot, and as far as I can tell, everything on the machine becomes available during the time when "preparing Windows" is still showing on the machine. This means I can use PowerShell Remoting to start changing things, but I don't want to do my customisations which could reboot the machine, while the first-boot stuff is still happening.
How can I remotely detect that a machine is "fully" booted, or at least past any deployment stages so it's in a state ready to log into? Is there some service that only starts when the login is available? Maybe a registry key to indicate that the boot process has completed fully?
This loop does the trick pretty well for me.
# Poll the administrative share until it becomes reachable
while (-not (Test-Path \\machineName\c$)) {
    Start-Sleep -Seconds 1
    Write-Host "Waiting..."
}
From what I've experienced, the file system is accessible a couple seconds after the login screen appears, so I'd assume the system is "fully" booted.
Make a loop and wait for the ImageState value under the following registry key to become IMAGE_STATE_COMPLETE:
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Setup\State
IMAGE_STATE_COMPLETE indicates that sysprep has fully completed after the reboot.
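As a rough sketch of that loop in Python: the polling function below takes any callable that returns the current state, and `read_image_state` shows one possible reader using the standard `winreg` module. The function names, the timeout and the interval are assumptions made for illustration, and the reader assumes the Remote Registry service is reachable on the target and that the script runs on Windows with sufficient rights.

```python
import time

SETUP_STATE_KEY = r"SOFTWARE\Microsoft\Windows\CurrentVersion\Setup\State"


def read_image_state(machine_name):
    """Read ImageState from a remote machine's registry (Windows only)."""
    import winreg  # standard library, but only present on Windows

    hive = winreg.ConnectRegistry(rf"\\{machine_name}", winreg.HKEY_LOCAL_MACHINE)
    try:
        with winreg.OpenKey(hive, SETUP_STATE_KEY) as key:
            value, _value_type = winreg.QueryValueEx(key, "ImageState")
            return value
    finally:
        winreg.CloseKey(hive)


def wait_for_image_state(read_state, timeout_s=1800, interval_s=5,
                         sleep=time.sleep, clock=time.monotonic):
    """Poll read_state() until it returns IMAGE_STATE_COMPLETE.

    Returns True on success, False if timeout_s elapses first.
    Connection errors (OSError) are treated as "not ready yet".
    """
    deadline = clock() + timeout_s
    while True:
        try:
            if read_state() == "IMAGE_STATE_COMPLETE":
                return True
        except OSError:
            pass  # machine still rebooting or registry not reachable yet
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

A call might look like `wait_for_image_state(lambda: read_image_state("machineName"))`. The timeout is there so a machine that never finishes sysprep does not stall the pipeline forever.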

Remove records from Deployment Status

We're running a chain of updates and installs via a Task Sequence, on SCCM 2012 R2 with a mix of Windows 7 and Windows 10 labs. Some of the stations reported success in one deployment I made, others are reporting success in a deployment my coworker made (we were both testing at the same time and didn't communicate well enough).
While the machines have been removed from the deployment, Deployment Status still reports them in one deployment or the other. We'd like to remove the stations that errored in one group, as they're reporting as Success in the other.
Is there any way to remove records from a Deployment Status for a collection in Monitoring?
I've been unsuccessful in my searches, won't be surprised if the answer is no, and sorry if this is a duplicate, I searched before asking.
It is possible to clear out status messages from SCCM for a given period, but not for specific systems that I know of. SCCM probably saves these in the database, so you could try to delete them from there.

DNN install always hangs

I must have done over 100 installs of DNN (dnnsoftware.com) now, and I have never seen the "wizard" complete correctly. It always hangs at 28%. Upgrades usually hang at the 53% level.
This appears to be a user interface fault, because if I wait long enough - how long? - I can navigate to the site and all is working perfectly. If I don't wait long enough, some part of the installation process restarts and gives me a site without a superuser account.
The installation process does not appear to have been adequately tested. Anyone know how to get the "wizard" to complete correctly?
I haven't experienced this high a failure rate with the installation or upgrade.
In order to diagnose the problem, do an upgrade. Before applying the upgrade, set the AutoUpgrade key in the web.config to "false". Then initiate the upgrade by going to:
[baseurl]/install/install.aspx?mode=upgrade.
Instead of the simple progress bar, you will get a detailed report of the extensions and items being installed or upgraded.
Before installing, make sure your file permissions are set appropriately on the website folder and you follow all of the steps as per the installation guidelines.
Also, if it turns out the issue is timeouts during the install due to a slow server, try increasing the execution timeout in the <httpRuntime> node of the web.config. The default executionTimeout is "1200", which is 20 minutes. If your upgrade or installation is taking longer, increase the value.
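If you would rather script the two web.config changes mentioned above than edit the file by hand, something like the following Python sketch could work. The function names are made up for illustration, and it assumes AutoUpgrade is an `<add key="AutoUpgrade" ... />` entry under `<appSettings>` and that `<httpRuntime>` sits under `<system.web>`; check your own web.config before relying on it. Note that ElementTree drops XML comments when it rewrites a document, so keep a backup of the original file.

```python
import xml.etree.ElementTree as ET


def set_auto_upgrade(web_config_xml, enabled):
    """Return the config text with the AutoUpgrade appSetting set to true/false."""
    root = ET.fromstring(web_config_xml)
    node = root.find("./appSettings/add[@key='AutoUpgrade']")
    if node is None:
        raise ValueError("no AutoUpgrade key found under <appSettings>")
    node.set("value", "true" if enabled else "false")
    return ET.tostring(root, encoding="unicode")


def set_execution_timeout(web_config_xml, seconds):
    """Return the config text with httpRuntime executionTimeout set (in seconds)."""
    root = ET.fromstring(web_config_xml)
    node = root.find("./system.web/httpRuntime")
    if node is None:
        raise ValueError("no <httpRuntime> node found under <system.web>")
    node.set("executionTimeout", str(seconds))
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Tiny stand-in for a real web.config, for demonstration only.
    sample = (
        "<configuration>"
        "<appSettings><add key='AutoUpgrade' value='true' /></appSettings>"
        "<system.web><httpRuntime executionTimeout='1200' /></system.web>"
        "</configuration>"
    )
    updated = set_execution_timeout(set_auto_upgrade(sample, False), 3600)
    print(updated)
```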

Problems with Windows EC2 snapshots

I am getting serious access problems every single time I take a Windows snapshot from the EC2 console. After taking the snapshot, neither the original machine nor the snapshot image is available. By this I mean that there is no RDP, HTTP or HTTPS connectivity, all of which were accessible ports before the snapshot. There is nothing to explain why this error occurs, as the sys logs are either blank or seem to show a successful snapshot and machine launch. Note that I have also had scenarios where I reboot the machine and again NOTHING is available.
What am I doing wrong? These are the steps I take.
1) Launch a default Win2k8 with IIS7 image. This is my machine: ami-c5e40dac
2) Install .NET 4.0
3) Activate the database (turn on the service).
4) Install my application and the database. This includes an HTTPS certificate (I think I read somewhere that Windows has a restart problem if a cert is in the machine store - WTF?)
5) Take a snapshot or reboot --- Bang, everything is dead!
Anyone come across such problems?
I had a similar problem earlier this week; it turned out that my instance was just taking an age to boot (1 hour +).
Is it possible that you had some pending windows updates that wanted to run on startup?
