ansible wait until output is received from remote host - ansible

I want to automate patching my servers, and I am using the following playbook:
- name: Patch Upgrade
  block:
    - name: Patch upgrade process
      ansible.netcommon.cli_command:
        command: patch install {{ node_patch }} patches_repository
        check_all: True
        prompt:
          - "[yes] ?"
          - "[yes] ?"
        answer:
          - 'yes'
          - 'yes'
      register: result
      until: result.stdout.find("The system is going down for reboot NOW") != -1
During patching the output is similar to this:
ISE/admin#patch install ise-patchbundle-10.1.0.0-Ptach3-19110111.SPA.x86_64.tar.gz FTP_repository
% Warning: Patch installs only on this node. Install with Primary Administration node GUI to install on all nodes in deployment. Continue? (yes/no) [yes] ? yes
Save the current ADE-OS run configuration? (yes/no) [yes] ? yes
Generating configuration...
Saved the ADE-OS run Configuration to startup successfully
Initiating Application Patch installation...
Getting bundle to local machine...
Unbundling Application Package...
Verifying Application Signature...
patch successfully installed
% This application Install or Upgrade requires reboot, rebooting now...
Broadcast message from root@ISE (pts/1) (Fri Feb 14 01:06:21 2020):
Trying to stop processes gracefully. Reload lasts approximately 3 mins
Broadcast message from root@ISE (pts/1) (Fri Feb 14 01:06:21 2020):
Trying to stop processes gracefully. Reload takes approximately 3 mins
Broadcast message from root@ISE (pts/1) (Fri Feb 14 01:06:41 2020):
The system is going down for reboot NOW
Broadcast message from root@ISE (pts/1) (Fri Feb 14 01:06:41 2020):
The system is going down for reboot NOW
Each line is sent one after the other, with no specific wait time between them. The prompts are handled without issues as the patching starts. I want the upgrade task to keep running until the line "The system is going down for reboot NOW" is received; then it should proceed to another task that waits for the host to come back up.
Unfortunately it's not working as I am getting this instead:
fatal: [serv-1]: FAILED! =>
msg: 'The conditional check ''result.stdout.find("The system is going down for reboot NOW") != -1'' failed. The error was: error while evaluating conditional (result.stdout.find("The system is going down for reboot NOW") != -1): ''dict object'' has no attribute ''stdout'''
How can I fix this?

The broadcast notification "The system is going down for reboot NOW" is triggered and owned by journald (syslog on older systems), not by the patch command, so it is not reported in the command's result. The error occurs while the host is restarting, because at that point result contains stderr instead of stdout.
One option is to monitor the journal (or syslog) to detect that the restart is happening, but there are some caveats:
Those broadcast messages can be disabled or routed to an output other than the log or the console; here is an example of how that can be done.
Not all patches trigger a reboot; in that case the "reboot" message never appears, so waiting for it never succeeds.
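The failure in the question comes from reading result.stdout when the registered result has no such key. As a rough illustration in Python (not Ansible itself), a check that tolerates a missing stdout could look like the sketch below. The dict shape and the name saw_reboot_marker are assumptions made for this example; in a playbook the analogous guard would be something like applying the default filter to result.stdout before searching it.

```python
REBOOT_MARKER = "The system is going down for reboot NOW"


def saw_reboot_marker(result: dict) -> bool:
    """Return True if the reboot marker appears in the captured output.

    `result` stands in for an Ansible registered variable; its shape here
    is hypothetical and only mirrors the keys mentioned in the thread.
    """
    # .get() avoids the "has no attribute 'stdout'" style failure when the
    # connection dropped and no stdout was recorded at all.
    stdout = result.get("stdout") or ""
    return REBOOT_MARKER in stdout


if __name__ == "__main__":
    print(saw_reboot_marker({"stdout": "...\nThe system is going down for reboot NOW\n"}))
    print(saw_reboot_marker({"msg": "connection closed"}))
```

This only removes the attribute error; it does not address the caveats above about the message possibly never arriving.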

Related

Jenkins Agent can't reconnect

I've been attempting to set up a Windows agent for our Jenkins on Linux to use, mostly using the instructions from https://www.gdcorner.com/2019/12/30/JenkinsHomeLab-P3-WindowsAgents.html
Only mostly, as those (like EVERY OTHER SET I've found) don't match the current Jenkins screens.
I got to the point where I ran the curl/java -jar commands on the Agent machine in an Administrator Command window.
It worked, and connected to Jenkins, showed up as running. Though this means it was still running as a normal program in a command window.
While trying to figure out what was meant by "Click the Launch agent from browser." (the instructions show a small window that should then appear, which lets you install the agent as a service), I closed the Command window.
I never figured out how to "click the launch agent from browser", so I opened a new admin command window and re-ran the commands. They no longer work. I get this error every time (note that "WinDInstaller" is really "Win%2DInstaller"):
Sep 28, 2022 10:15:40 AM org.jenkinsci.remoting.engine.WorkDirManager initializeWorkDir
INFO: Using C:\Enterprise\Agent\remoting as a remoting work directory
Sep 28, 2022 10:15:40 AM org.jenkinsci.remoting.engine.WorkDirManager setupLogging
INFO: Both error and output logs will be printed to C:\Enterprise\Agent\remoting
Failed to obtain http://metrixbld1:8080/computer/WinDInstaller/jenkins-agent.jnlp?encrypt=true
java.io.IOException: Failed to load http://metrixbld1:8080/computer/WinDInstaller/jenkins-agent.jnlp?encrypt=true: 404 Not Found
at hudson.remoting.Launcher.parseJnlpArguments(Launcher.java:514)
at hudson.remoting.Launcher.run(Launcher.java:346)
at hudson.remoting.Launcher.main(Launcher.java:297)
Waiting 10 seconds before retry
It seems like it should be possible to get an Agent running on windows as a Service - but so far I've not found a set of instructions that work.
EDIT: Since that agent definition no longer worked, I created a new one and had it just copy the settings from the old one.
Then ran the new java/agent command - and it worked.
Apparently you can only launch a new agent once. If it ever stops you have to re-create it????
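The agent name in the failing URL is percent-encoded. A small Python sketch using only the standard library shows what Win%2DInstaller decodes to. The helper name decode_agent_name is made up for this example, and whether the encoding has anything to do with the 404 is not established in this thread.

```python
from urllib.parse import unquote


def decode_agent_name(encoded: str) -> str:
    """Decode a percent-encoded path segment, such as an agent name in a URL."""
    return unquote(encoded)


if __name__ == "__main__":
    # %2D is the percent-encoding of the hyphen character.
    print(decode_agent_name("Win%2DInstaller"))
```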

Time isn't synchronized with server, causing cert error when cloning

I'm setting up a honeypot for my boss, and I'm having trouble getting the time to synchronize with my workstation's time. The reason I want this: before following the steps at the link below, I had the NOOBS Raspbian OS installed, which had the same problem with cloning, but after running sudo apt-get install ntp I was able to clone the files with no issues. Because the link below calls for the "Raspbian Stretch Lite OS", I had to redo the process, and now I can't get the time to sync anymore.
https://github.com/DShield-ISC/dshield
So when I attempt to do the following command in the steps:
git clone https://github.com/DShield-ISC/dshield.git
fatal: unable to access 'https://github.com/DShield-ISC/dshield.git/': server certificate verification failed. CAfile: /etc/ssl/certs/ca-certificates.crt CRLfile: none
I've tried the following methods with no luck:
sudo /etc/init.d/ntp stop
sudo raspi-config (setting timezone)
sudo /etc/init.d/ntp start
the timedatectl settings are as follows:
Local time: Mon 2016-02-04 12:04:52 PST
Universal time: Mon 2016-02-04 20:04:52 UTC
RTC time: n/a
Time zone: America/Los_Angeles (PST, -0800)
Network time on: yes
NTP synchronized: yes
RTC in local TZ: no
I've also tried:
sudo ntpd -q -g
I've noticed that this command produces a ton of output and never finishes; if this is vital, I can re-run it and report what comes back.
Yes, I've set the time as close as possible to the actual clock before attempting any of these. Regardless, after every reboot it is always a minute or some seconds off. I'm assuming that's because it isn't synchronized even though it states that it is.
The cert error was due to my being HARDWIRED into my Raspberry Pi versus using Wi-Fi. After going into sudo raspi-config and setting up a Wi-Fi connection, I was able to clone the GitHub repo successfully.
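For context on why the clock matters at all: certificate verification compares the local time against the certificate's validity period, so a clock that is far off can produce a verification failure on its own. The sketch below illustrates that comparison in Python. The validity window is hypothetical and chosen only for illustration; the clock value is the one shown by timedatectl above. In this thread the reported fix was the network change, not the clock.

```python
from datetime import datetime, timezone


def clock_within_validity(now: datetime, not_before: datetime, not_after: datetime) -> bool:
    """Return True if `now` falls inside the certificate validity window."""
    return not_before <= now <= not_after


if __name__ == "__main__":
    # Universal time reported by timedatectl in the question.
    pi_clock = datetime(2016, 2, 4, 20, 4, 52, tzinfo=timezone.utc)
    # Hypothetical validity window, not taken from any real certificate.
    not_before = datetime(2018, 1, 1, tzinfo=timezone.utc)
    not_after = datetime(2020, 1, 1, tzinfo=timezone.utc)
    print(clock_within_validity(pi_clock, not_before, not_after))
```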

Chef reboot resource causes chef run to time out before Windows server is able to start back up after update and resume recipe

I am trying to create a domain on a Windows 2012 R2 server, and it requires a reboot before the recipe can proceed:
reboot "reboot server" do
  reason "init::chef - continue provisioning after reboot"
  action :reboot_now
end
I receive the following error, which indicates a timeout; it occurs before the OS comes back up after the update:
Failed to complete #converge action: [WinRM::WinRMAuthorizationError] on default-windows2012r2
Does anyone out there know how to make Chef continue to run after the OS is back up? I hear that :reboot_now is supposed to do the trick... ^^^ but as you can see, it isn't :)
P.S. this also causes windows to update... Goal: get chef to resume after the update is complete and the server is back up
Update: The server actually seems to be rebooting twice and exiting the chef run on the second reboot. If I remove the ONE reboot resource block that I have then it does not reboot at all (that makes no sense to me)... here is output from the chef run:
Chef Client finished, 2/25 resources updated in 19 seconds
[2018-10-29T08:04:11-07:00] WARN: Rebooting server at a recipe's request. Details: {:delay_mins=>0, :reason=>"init::chef - continue provisioning after reboot", :timestamp=>2018-10-29 08:04:11 -0700, :requested_by=>"reboot server"}
Running handlers:
[2018-10-29T08:04:11-07:00] ERROR: Running exception handlers
Running handlers complete
[2018-10-29T08:04:11-07:00] ERROR: Exception handlers complete
Chef Client failed. 2 resources updated in 20 seconds
[2018-10-29T08:04:11-07:00] FATAL: Stacktrace dumped to C:/Users/vagrant/AppData/Local/Temp/kitchen/cache/chef-stacktrace.out
[2018-10-29T08:04:11-07:00] FATAL: Please provide the contents of the stacktrace.out file if you file a bug report
[2018-10-29T08:04:11-07:00] FATAL: Chef::Exceptions::Reboot: Rebooting server at a recipe's request. Details: {:delay_mins=>0, :reason=>"init::chef - continue provisioning after reboot", :timestamp=>2018-10-29 08:04:11 -0700, :requested_by=>"reboot server"}
^^^ that repeats twice ^^^
Update #2: I even commented out every line except for the reboot block and am experiencing the same issue... this is ridiculous and I'm confident that it isn't my code that is the problem (considering that all I am using now is a reboot command).
Update #3: I generated an entirely new cookbook and called it "reboot"... it contains the following code:
reboot 'app_requires_reboot' do
  action :request_reboot
  reason 'Need to reboot when the run completes successfully.'
end
And unfortunately, it too reboots the Windows server twice... Here are the logs:
Recipe: reboot::default
* reboot[app_requires_reboot] action request_reboot[2018-10-29T10:21:41-07:00] WARN: Reboot requested:'app_requires_reboot'
- request a system reboot to occur if the run succeeds
Running handlers:
Running handlers complete
Chef Client finished, 1/1 resources updated in 03 seconds
[2018-10-29T10:21:41-07:00] WARN: Rebooting server at a recipe's request. Details: {:delay_mins=>0, :reason=>"Need to reboot when the run completes successfully.", :timestamp=>2018-10-29 10:21:41 -0700, :requested_by=>"app_requires_reboot"}
Running handlers:
[2018-10-29T10:21:41-07:00] ERROR: Running exception handlers
Running handlers complete
[2018-10-29T10:21:41-07:00] ERROR: Exception handlers complete
Chef Client failed. 1 resources updated in 03 seconds
[2018-10-29T10:21:41-07:00] FATAL: Stacktrace dumped to C:/Users/vagrant/AppData/Local/Temp/kitchen/cache/chef-stacktrace.out
[2018-10-29T10:21:41-07:00] FATAL: Please provide the contents of the stacktrace.out file if you file a bug report
[2018-10-29T10:21:41-07:00] FATAL: Chef::Exceptions::Reboot: Rebooting server at a recipe's request. Details: {:delay_mins=>0, :reason=>"Need to reboot when the run completes successfully.", :timestamp=>2018-10-29 10:21:41 -0700, :requested_by=>"app_requires_reboot"}
seems like an issue with chef now... this is bad... who has ever successfully rebooted windows with Chef before? and why does a single reboot block, reboot the server twice?
Update number 4 will be after I have thrown my computer out the window
The issue was resolved with the following logic:
reboot "reboot server" do
  reason "init::chef - continue provisioning after reboot"
  action :nothing
  only_if { reboot_pending? }
end
Adding the only_if guard allows the recipe to skip that step if the OS does not detect that there is a Windows update pending.
I had forgotten that Windows actually tracks whether a system update/reboot is required.
As part of my powershell_script block, I included the following: notifies :reboot_now, 'reboot[reboot server]', :immediately

RabbitMQ fails on Error: unable to connect to node rabbit@TPAJ05421843: nodedown

On a Windows 7 Enterprise machine, I made a fresh install of Erlang 17.4 and RabbitMQ 3.4.3 x64. The installation was successful and uneventful.
I have not yet tried to create my first queue or exchange, but I already see trouble. This problem is similar to another SO post, but that other post appears to involve clustering, which I don't have. Furthermore, that other poster can circumvent his issue by restarting the RabbitMQ service; that approach does not work for me.
My "nodedown" problem is evident at the RabbitMQ command prompt:
C:\Program Files (x86)\RabbitMQ Server\rabbitmq_server-3.4.3\sbin>rabbitmqctl status
Status of node rabbit@TPAJ05421843 ...
Error: unable to connect to node rabbit@TPAJ05421843: nodedown
DIAGNOSTICS
attempted to contact: [rabbit@TPAJ05421843]
rabbit@TPAJ05421843:
* connected to epmd (port 4369) on TPAJ05421843
* epmd reports: node 'rabbit' not running at all
other nodes on TPAJ05421843: ['RabbitMQ']
* suggestion: start the node
current node details:
- node name: 'rabbitmqctl-19884@TPAJ05421843'
- home dir: H:\
- cookie hash: PD4QQCYrf0TME9vIko3Xuw==
Based on the above, I chose to check the status of the node explicitly named 'RabbitMQ'. I get this:
C:\Program Files (x86)\RabbitMQ Server\rabbitmq_server-3.4.3\sbin>rabbitmqctl -n RabbitMQ status
Status of node 'RabbitMQ@TPAJ05421843' ...
Error: unable to connect to node 'RabbitMQ@TPAJ05421843': nodedown
DIAGNOSTICS
attempted to contact: ['RabbitMQ@TPAJ05421843']
RabbitMQ@TPAJ05421843:
* connected to epmd (port 4369) on TPAJ05421843
* epmd reports node 'RabbitMQ' running on port 59301
* TCP connection succeeded but Erlang distribution failed
* suggestion: hostname mismatch?
* suggestion: is the cookie set correctly?
current node details:
- node name: 'rabbitmqctl-23076@TPAJ05421843'
- home dir: H:\
- cookie hash: PD4QQCYrf0TME9vIko3Xuw==
Ok, this is barely better since at least it acknowledges 'RabbitMQ' running on port 59301. But what the heck could it mean that "Erlang distribution failed"?
When I try to research this topic, I found articles saying "be sure you have matched cookies." Based on that I found this article, which claims the "cookie mismatch" does not pertain to me, because I have not created (nor intend to create) a RabbitMQ cluster.
What should I do?
I had this same problem today. There were no cookie or firewall problems and windows reported that the service was running successfully. This is what finally fixed it:
Run RabbitMQ sbin command prompt as administrator.
Run "rabbitmq-service remove"
Run "rabbitmq-service install"
For some reason the service set up by the installer did not configure several registry entries. Running this set them correctly and allowed the service to run.
One thing I noticed was that before I did this, there was no description of the service in the Windows Services view. After installing with the rabbitmq-service command, the description was visible. This might be a quick indicator if you are having the same problem.
As @eddyP commented, I had two different Erlang cookie files:
A server cookie file, located at $env:WINDIR\system32\config\systemprofile\.erlang.cookie (prior to Erlang 20.2 it was located at $env:WINDIR\.erlang.cookie).
A client cookie file, located at $env:USERPROFILE\.erlang.cookie.
Copying the server cookie file over the client one, so that both files were the same, fixed the problem for me.
For further details, see "How Nodes (and CLI tools) Authenticate to Each Other: the Erlang Cookie".
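A quick way to confirm whether the two cookie files actually differ is to compare their contents. The following Python sketch does a byte-for-byte comparison; the function name cookies_match is made up for this example, the paths are the ones named in this answer, and the WINDIR fallback value is an assumption for machines where that variable is not set.

```python
import os
from pathlib import Path


def cookies_match(server_cookie: Path, client_cookie: Path) -> bool:
    """Return True when both cookie files exist and hold identical bytes."""
    if not (server_cookie.is_file() and client_cookie.is_file()):
        return False
    return server_cookie.read_bytes() == client_cookie.read_bytes()


if __name__ == "__main__":
    # Locations as described above; the fallback for WINDIR is an assumption.
    windir = os.environ.get("WINDIR", r"C:\Windows")
    server = Path(windir) / "system32" / "config" / "systemprofile" / ".erlang.cookie"
    client = Path.home() / ".erlang.cookie"
    print(cookies_match(server, client))
```

Reading the server cookie may require an elevated prompt, since it lives under the system profile.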
From RabbitMQ Command Prompt sbin (run as administrator) execute this command:
rabbitmq-server restart
On Windows, for some reason it helps to delete all folders in c:\Users\xxx\AppData\Roaming\RabbitMQ\db\ (xxx is your username),
then follow @Jerdev's answer and:
start RabbitMQ: net start rabbitmq
check the RabbitMQ service: rabbitmqctl status
The same question on the RabbitMQ mailing list: https://groups.google.com/forum/#!topic/rabbitmq-users/0s1ExFhl4hM.
The Erlang cookie is used by rabbitmqctl as well as by server nodes, so it may need to be placed in the correct location.
See "Installing as a non-administrator user leaves .erlang.cookie in the wrong place" on Windows quirks.
I resolved my problem in Windows 10 by doing this:
Execute RabbitMQ Command Prompt (sbin dir) as administrator.
Execute "rabbitmq-service remove" in (RabbitMQ Command Prompt).
Execute %AppData% in Run Dialog Box of Windows.
Delete all files in RabbitMQ folder.
Execute "rabbitmq-service install" in (RabbitMQ Command Prompt).
Execute "rabbitmqctl start_app" in (RabbitMQ Command Prompt).
If you come here looking for a linux answer for the same error message, try
sudo service rabbitmq-server start
(which is not a blocking command)
Just do the following:
Uninstall rabbitmq and erlang.
Delete the rabbitmq folder in your AppData (if you don't know the AppData location, just type echo %AppData% in the command prompt).
Then install erlang first and then rabbitmq.
After installing, enable the management plugin using below command:
rabbitmq-plugins enable rabbitmq_management
For me the cookies didn't match, as other comments describe, but the cookie was in a different path. For those having the same issue as me, look in C:\Windows\System32\config\systemprofile
That is happening because RabbitMQ was not installed correctly on Windows (and this error is misleading!). To solve it, do the following:
1. type "cmd" in Cortana search, or in "Run" for older versions of Windows
2. right-click on it and choose "Run as Administrator"
3. go to RabbitMQ's sbin folder (cd "C:\Program Files\RabbitMQ Server\rabbitmq_server-3.7.4\sbin")
4. run: rabbitmq-service remove
5. run: rabbitmq-service install
Now you can run:
6. rabbitmq-plugins enable rabbitmq_management
7. rabbitmq-service start
8. and, finally, run: start http://localhost:15672
9. log on as user "guest" with password: "guest" and that's it. Happy Rabbiting!
I had skipped restarting Windows and then deleting the old version of Erlang (which I had uninstalled before restarting).
Somehow the fresh RabbitMQ installation was still referring to the old, uninstalled version, which caused the mismatch. The clue was that the Windows service entry pointed RabbitMQ at the old Erlang version.
This is how I resolved the error in my Windows 8 system:
Check for a syntax error in the rabbitmq.config file placed in the AppData folder for Windows.
How to check if there is any syntax error?
You can run rabbitmq-server restart from sbin folder in:
Program Files/RabbitMQ/rabbitmq_server_x.x/sbin/.
Replace the content of the rabbitmq.config with rabbitmq.config.example.
You may find the rabbitmq.config.example in:
Program Files/RabbitMQ/rabbitmq_server_x.x/etc/
Warning, you will lose the configuration you have saved previously with rabbitmq.
After changing the files, just hit
rabbitmq-server restart
in the sbin folder mentioned above.

Apache & OS X FileVault 2

Over the weekend I enabled FileVault 2 on OS 10.8.4.
When I fired up MAMP PRO just now, I receive the following error:
Apache wasn't able to start. Please check log for more information.
When I look at my log, the last line is:
[Fri Jun 21 18:57:52 2013] [notice] caught SIGTERM, shutting down
I have tried stopping and restarting Apache.
If I run: sudo apachectl start, and then try to start MAMP, I receive a different error message:
The built in Apache is active which can cause a port conflict with at least one of your virtual hosts.
It's recommended either to choose a port different than 80 or to stop the built in Apache.
Enabling FileVault is the only thing I can think of since Friday that could have potentially affected Apache. But I'm not sure how to debug what exactly the problem is?
I disabled FileVault, did a system reboot, and everything is back to normal. So apparently there are some issues with running Mac Apache and FileVault. Not much has been said about it online. Hopefully this thread can serve some purpose and we can figure out a solution.
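One way to narrow down the second MAMP message is to check whether something is already listening on port 80 before starting MAMP. The Python sketch below tries a TCP connection to the port; the function name port_in_use and the choice of 127.0.0.1 are assumptions for this example, and it only tells you that a listener exists, not which program owns it.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # Port 80 is the port named in the MAMP warning.
    print(port_in_use(80))
```

If this reports a listener while MAMP is stopped, the built-in Apache started by sudo apachectl start is a likely candidate.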
