Jenkins Agent can't reconnect - windows

I've been attempting to setup a Windows agent for our Jenkins on Linux to use. Mostly using the instructions from https://www.gdcorner.com/2019/12/30/JenkinsHomeLab-P3-WindowsAgents.html
Only mostly, as those (like EVERY OTHER SET I've found) don't match the current Jenkins screens.)
I got to the point where I ran the curl/java -jar commands on the Agent machine in an Administrator Command window.
It worked, and connected to Jenkins, showed up as running. Though this means it was still running as a normal program in a command window.
While attempting to figure out what was meant by "Click the Launch agent from browser. " - which showed a small window that should then be displayed - which would allow you to install it as a service, I closed the Command window.
I never figured out how to "click the launch agent from browser", so I opened a new admin command window and re-ran the commands. They no longer work. It gets this error every time: (note that "WinDInstaller" is really "Win%2DInstaller")
Sep 28, 2022 10:15:40 AM org.jenkinsci.remoting.engine.WorkDirManager initializeWorkDir
INFO: Using C:\Enterprise\Agent\remoting as a remoting work directory
Sep 28, 2022 10:15:40 AM org.jenkinsci.remoting.engine.WorkDirManager setupLogging
INFO: Both error and output logs will be printed to C:\Enterprise\Agent\remoting
Failed to obtain http://metrixbld1:8080/computer/WinDInstaller/jenkins-agent.jnlp?encrypt=true
java.io.IOException: Failed to load http://metrixbld1:8080/computer/WinDInstaller/jenkins-agent.jnlp?encrypt=true: 404 Not Found
at hudson.remoting.Launcher.parseJnlpArguments(Launcher.java:514)
at hudson.remoting.Launcher.run(Launcher.java:346)
at hudson.remoting.Launcher.main(Launcher.java:297)
Waiting 10 seconds before retry
It seems like it should be possible to get an Agent running on windows as a Service - but so far I've not found a set of instructions that work.
EDIT: Since that agent definition no longer worked, I created a new one and had it just copy the settings from the old one.
Then ran the new java/agent command - and it worked.
Apparently you can only launch a new agent once. If it ever stops you have to re-create it????

Related

Kibana/Elasticsearch/logstash install issue

I have an issue installing Kibana/elasticsearch/logstash.
I followed the instructions for the download and the install of each. From the Admin CMD line, I ran the elasticsearch.bat for elasticsearch. The script ran, but just hung and never fully completed. However, when testing on my localhost 9200, it showed ES installed correctly. I did the same with Kibana and had the same result. The script never fully completed, however, I was able to open the web app on port 5601. I also completed the install for logstash – the script never fully completed. When I manually exited each script by exiting the cmd window, I lost connection to the localhost. "This site can't be reached."
Has anyone had this issue or know what might be happening here?

Docker "hello-world" is giving "process cannot access the file because it is being used by another process"

We have just uninstalled Docker Community Edition and installed Enterprise Edition on a Windows Server 2016 System as per these steps.
On reaching the last step which is to test a hello-world container we are receiving the following error:
C:\Program Files\Docker\docker.exe: failed to register layer: re-exec
error: exit status 1: output: ProcessUtilityVMImage
\?\C:\ProgramData\Docker\windowsfilter\e345ad40cc8f7d073f62501b7445d42d677889c04b2c6fe0963ea6e092b52f95\UtilityVM:
The process cannot access the file because it is being used by another
process.
We are seeing lots of examples on SO related to other types of applications giving this error but not Docker.
How might we fix this?
This turned out to be an issue related to Symantec Endpoint Protection(SEP) clashing with Docker.
SEP needs to be upgraded to version 14 RU1 and several files need execution privileges.
Symantec posted a fix here: https://support.symantec.com/en_US/article.TECH246815.html which we have tried and worked.

Jenkins doesn't recognize slave being down and thus does not allow for it to reconnect

We have a Jenkins instance running on Ubuntu that has several slaves in different systems. One of them is a Windows 7 host, having jenkins slave instance configured as a service.
We have a problem that when that machine is rebooted, master Jenkins doesn't realize it's gone. It looks to be just fine in the nodes view. Then, when a build is issued that is supposed to use that slave it gets stuck. If that is stopped, the next build fails immediately
Caused by: java.util.concurrent.TimeoutException: Ping started at 1457016721684 hasn't completed by 1457016961684
... 2 more
[EnvInject] - [ERROR] - SEVERE ERROR occurs: channel is already closed
When the slave has started up and it tries to connect back to master, connection is refused, and in the logs there is an error saying connection with that name already exists:
Server didn't accept the handshake: xxx is already connected to this master. Rejecting this connection.
There is issue JENKINS-5055 which claims a fix was committed allowing the same JNLP slave to reconnect without getting rejected, apparently this commit, and according to changelog, it was introduced in version 1.396 (2011/02/02). We are however using version 1.639 and seeing this. Somebody else seems to be seeing it as well. By looking at current codebase, I see where the error is coming from, but don't see the fix done in Jenkins-5055.
Any ideas on resolving this?
Edit: also asked on jenkins user mailing list, but no responses.
We faced the same issue. Used https://wiki.jenkins-ci.org/display/JENKINS/slave-status as workaround
Reinstalling the slave on a Windows Server 2012 R2 machine shows no signs of this behavior, so it seems that either there was a mistake done during installation steps or this is something caused by using a workstation Windows version.
Regardless, here were the steps to get it working, assuming a brand new installation of Windows, with no network connectivity, and master instance using a self-signed certificate:
Install JRE on the machine. If you have 64-bit operating system, install both 32-bit and 64-bit, otherwise go with 32-bit. Download link here
Install .NET 3.5 on the machine. This is needed by the Jenkins service. You can follow the steps outlined by my other answer for this.
Install Jenkins using Windows installer (.zipped) to C:\Jenkins. It can be downloaded from here.
Check your installation is responding by navigating to http://localhost:8080 . In case of trouble, check for logs in the jenkins folder. If there is a port conflict, edit jenkins.xml and change the httpPort to something else.
From the Windows computer, navigate to your master jenkins and configure a new node there.
Start a slave agent using Java Launch Agent in configure -> node screen (you need to be still using your Windows slave computer)
You should see a visible window opening. From there, select File -> Install as a service. (details and screenshots) If you experience an error without proper explanation, confirm .NET 3.5 is properly installed. If you see "WMI.WmiException: AccessDenied", save the jnlp file locally and start it from administrator prompt or otherwise with elevated privileges (details).
From the Administrative tools -> Services, stop and disable the Jenkins service, and stop Jenkins Slave Agent but leave it on Automatic so it will start up when starting up the computer.
This is only relevant if you're using a self-signed or otherwise problematic certificate:
download the previously mentioned Java Launch Agent file (.jnlp file) again and save it to C:\jenkins
open c:\jenkins\jenkins-slave.xml to your editor
change it to refer to your local .jnlp file by changing jnlp url parameter (file:/C:/jenkins/jenkins-slave.jnlp)
add -noCertificateCheck to parameters
replace the -secret parameter with -auth "user:pass", since otherwise automatic url get parameters will be added which will mess finding the .jnlp file
Start the Jenkins Slave Agent service again
For problems with jenkins slave service, check out jenkins-slave.err.log. For Windows Server 2012 R2, you can get the functionality of tail by using Get-Content .\jenkins-slave.err.log -Wait -Tail 10 in Powershell prompt. For older versions of Powershell, leave out -Tail 10.

Headless testing display error

Background
I am running a set of selenium tests using a Maven and Jenkins with Testng. I had them working fine headlessly up until a week ago. Jenkins sits on the server accessible with port 8080. The tests also run fine through eclipse.
Software Versions
I have read lots about Firefox being incomparable with selenium so here is a list of software and versions that I am using.
Firefox: 39
Maven: 3.3.3
Java: 1.7.0_79
Selenium: 2.46 & 2.47(currently 2.47)
Jenkins: 1.622
Xvnc: 1.3.9
ubuntu 14
Error
After I run the tests and the fail I check the console through Jenkins. The error I am getting makes me think it's a problem with Xvnc and firefox but I can't pin point it. I get a NotConnectedException. The firefox console error has changed a few times here is a list of different errors the console has shown me.
Error: cannot open display: :87
firefox: Fatal IO error 11 (Resource temporarily unavailable) on X server :46.
firefox: Fatal IO error 2 (No such file or directory) on X server :78.
Research
I've been on bugzilla but cant find a conclusive answer to the problem.
I've also looked around SO but found no fixes.
Conclution
From what I have gathered it is something to do with Xvnc, Could running
sudo apt-get update
make changes to how Xvnc operates? I have updated the packages some time last week but our testers didn't check Jenkins properly when adding new tests and as such I've wasted and entire day trying to pin point when and what the problem is.
Question
What would cause Jenkins to return errors like this, how can I fix them and how can I prevent something doing this again?
EDIT 1
The display variable seems to be the issue, upon typing the command
echo $DISPLAY
There is no response just an empty line.
EDIT 2
running the command
export DISPLAY=:0.10
no gives the result
:0.10
when I echo $DISPLAY
I think the DISPLAY varibale is not functioning as expected and hence firefox is unable to connect to it. To know more about the $DISPLAY refer this link https://askubuntu.com/questions/432255/what-is-display-environment-variable
Try to run this command on the slave node where the job runs, this should give you the required setting for the tests to connect and run.
nohup /usr/bin/Xvfb :2 –screen 0 1024x768x24 > /dev/null 2>&1 &

SonarQube installation failing to start service

I'm installing sonarqube on Windows Server 2012.
I have followed the following steps:
Downloaded sonarqube4.4 and extracted to C:\Sonarqube
Downloaded Java JDK 1.7.0_60 and jre 1.7.0_67 as well as jre7
Installed Windows SDK 7 and .NET Framework 4
Navigated to C:\sonar\bin\windows x86-64 and ran StartSonar.bat as an administrator, this ran ok with no output and Ihad to hot ctrl- Z to break
I then ran \windows-x86-64\InstallNTService.bat as an administrator and I am seeing the sonarQube services was launched, but failed to start.
Not sure what the problem is.
I believe you first ran \windows-x86-64\InstallNTService.bat successfully and then StartSonar.bat unsuccessfully (the inverse order of what you describe).
You probably have [this problem]: http://qualilogy.com/fr/wp-content/uploads/sites/2/2013/09/Sonar_ServiceLaunchError2.jpg
Windows could not start the Sonar service on Local Computer.
Error 1067: The process terminated unexpectedly.
In that case, the solution is to change the user/rights to launch the Sonar service: https://qualilogy.com/en/migrate-sonarqube-tomcat-to-windows-service/
Go to the Services window, find the Sonar service, and open the Properties windows to change the user it logs on as to one with sufficient permissions.
I was able to solve this problem by creating a new folder named “Temp” in C:\Windows\System32\config\systemprofile\AppData\Local\
The Log-File will show only
--> Wrapper Started as Service
Cleaning or creating >temp directory C:\Program Files (x86)\SonarQube\sonarqube\temp
<-- Wrapper Stopped
The SonarQube service was launched, but failed to start.
After a long search, I came up to this site http://zen-and-art-of-programming.blogspot.de/2013/03/installing-and-running-sonar-source.html.
Solution:
Navigate to C:\Windows\system32\config\systemprofile\AppData\Local\ and create the directory Temp
2: Set the user rights to full access
3: Run the StartNTService.bat

Resources