Failed to start SSH Agent: java.lang.RuntimeException: Cannot parse ssh-agent output: ''

System configs:
TeamCity Master: Windows Server 2012 R2 Standard
TeamCity Agent: Windows Server 2022 Datacenter
Infrastructure is in AWS with the correct security groups and open ports needed (9090, 22, 443, etc)
I keep getting this error. I have checked that ssh-agent is on the PATH. The keys are added to the TeamCity master (private key) and to GitHub (public key; cloning succeeds). Windows Firewall also has exceptions for these ports.

It seems TeamCity is incompatible with the Windows OpenSSH variant. I suggest you uninstall it (via PowerShell or Windows Features) and install Git for Windows instead, which is bundled with all the SSH tools.
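The error text suggests that the build agent reads the shell snippet ssh-agent prints on startup and received an empty string. The Python sketch below is not TeamCity's code. It only illustrates why empty output cannot be parsed, assuming the agent is expected to print the usual `SSH_AUTH_SOCK=...; export SSH_AUTH_SOCK;` lines. The function name and the sample socket path are made up for illustration.

```python
import re

# Pattern for the Bourne-shell lines a Unix-style ssh-agent prints on start, e.g.
#   SSH_AUTH_SOCK=/tmp/ssh-abc/agent.123; export SSH_AUTH_SOCK;
#   SSH_AGENT_PID=124; export SSH_AGENT_PID;
_VAR_LINE = re.compile(r"^(SSH_AUTH_SOCK|SSH_AGENT_PID)=([^;]+);", re.MULTILINE)


def parse_ssh_agent_output(output: str) -> dict:
    """Hypothetical helper: extract the agent variables or raise ValueError."""
    found = dict(_VAR_LINE.findall(output))
    if "SSH_AUTH_SOCK" not in found:
        # Mirrors the wording of the error in the question; an assumption,
        # not the real TeamCity implementation.
        raise ValueError(f"Cannot parse ssh-agent output: {output!r}")
    return found


if __name__ == "__main__":
    sample = (
        "SSH_AUTH_SOCK=/tmp/ssh-abc/agent.123; export SSH_AUTH_SOCK;\n"
        "SSH_AGENT_PID=124; export SSH_AGENT_PID;\n"
        "echo Agent pid 124;\n"
    )
    print(parse_ssh_agent_output(sample))
    try:
        parse_ssh_agent_output("")  # what an agent that prints nothing yields
    except ValueError as exc:
        print(exc)
```

Running the ssh-agent that is first on the agent machine's PATH by hand and checking whether it prints these lines is a quick way to see which variant TeamCity is picking up.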

Related

Can I develop with VS Code in containers on a remote host running Windows/WSL2?

Original Post
I have a Windows workstation with WSL2 and Docker installed that I am able to use for container based development in VS Code. I would like to be able to develop inside the containers on this system remotely. I am able to SSH directly into the WSL2 environment on the workstation and am able to start the docker daemon without logging directly into Windows by creating a Task to start the daemon automatically as described here: https://stackoverflow.com/a/59467740/10692741
However when I try to access Docker on the remote machine by following this guide: https://code.visualstudio.com/docs/remote/containers-advanced#_developing-inside-a-container-on-a-remote-docker-host, I get the following error:
error during connect: Get http://docker/v1.24/version: net/http: HTTP/1.x transport connection broken: malformed HTTP status code "\x00c\x00o\x00m\x00m\x00a\x00n\x00d\x00"
I have also tried connecting via a SSH tunnel as outlined here: https://code.visualstudio.com/docs/remote/troubleshooting#_using-an-ssh-tunnel-to-connect-to-a-remote-docker-host and am unable to connect to Docker as well.
Has anyone had success with a setup like this? Or is this not supported due to limitations with Docker on Windows, WSL2, and/or Windows OpenSSH implementation?
Update: 2021-01-21
When I SSH into the Windows machine remotely, I am able to see the docker containers in the VS Code extension. I am able to start them, stop them, and enter into them with the shell. However, when I try to attach VS Code I get same error shown above.
Things that may have possibly affected this over the past couple days:
Adding SSH keys on my local machine to the ssh-agent via ssh-add /my/key
Exposing Docker daemon on tcp://localhost:2375 without TLS on the remote Windows machine
Also, I want to note that I've tried using Windows, Mac, and Linux as the local machine. With Mac and Linux I am able to open a remote session into the Windows machine. From the Windows local machine I am able to SSH into the remote Windows machine, but for some reason I cannot open a remote connection in VS Code.
Ok, I was able to get this working using the port/socket forwarding technique. For sake of clarity, I'll use:
local development workstation, local workstation, or just workstation to indicate the computer from which we wish to use VSCode to access Docker containers on ...
the remote Docker host, remote, or just Docker host
Sanity check -- Do you have Docker Desktop installed on both systems? On the local development workstation, you can skip the WSL2 integration, but you'll at least need the client tools, since the VSCode extension uses them.
Steps I took:
I already had Docker with WSL2 integration set up on my main system (which for the purposes of this exercise, became my remote Docker host), along with VSCode, so I knew everything was working there. It sounds like that was your starting point as well.
On another system on the same network (accessed with RDP to make it simple), I already had VSCode installed as well, with the Remote Development Extension Pack. I also have WSL on that system, but only a v1 instance there. Not that WSL on the workstation should be a factor at all for the purposes of this exercise.
I installed Docker Desktop for Windows on that local development workstation.
I also installed the Docker extension for VSCode, since I didn't yet have it on the local development workstation.
On the workstation, I was not yet set up to SSH from PowerShell into my WSL Ubuntu distro on the remote. From PowerShell on the workstation, I generated an ECDSA key (per this and other documents) and added the public key to my authorized_keys on the remote.
On the workstation, I started the OpenSSH Authentication Agent service and added the newly created key to the agent (in PowerShell) with ssh-add ~\.ssh\id_ecdsa.
I logged out of the workstation and back in so that the path changes were picked up for the Docker desktop install.
I was then able to ssh from Powershell on the local to Ubuntu/WSL on the remote with the port forwarding. Since I'm using the Windows 10 OpenSSH server as a jumphost to my WSL SSH servers, my command looked slightly different (with a -o "ProxyCommand ... mainly), but overall the structure is the same as the one listed in the "SSH Tunnel" doc you linked in your question.
On the remote (manually, not through any integration from the local), I did a basic docker run -it --rm ubuntu and left it open.
On the local, from PowerShell, I set the DOCKER_HOST environment variable via [System.Environment]::SetEnvironmentVariable("DOCKER_HOST","tcp://localhost:23750").
I was then able to see the remote container using docker ps on the local. I could also docker exec -it containername bash into it remotely.
Of course, the above two steps aren't needed in the long term for VSCode, they were just part of my process to make sure everything was up and running (since, as you might expect, I did have several points at which I failed during this process).
So with that working, it was a simple matter in VSCode to change the Docker extension's DOCKER_HOST setting to tcp://localhost:23750. And voila, I could see all images on the remote as well as attach to them from VSCode.
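Before pointing VSCode at the tunnel, it can help to confirm that something is actually listening on the forwarded port. The following is a minimal Python sketch under the assumption that DOCKER_HOST has the tcp://host:port form used above. The function names are illustrative, and the check only proves that a TCP connection is accepted, not that the Docker daemon is on the other end.

```python
import socket
from urllib.parse import urlparse


def parse_docker_host(docker_host: str) -> tuple:
    """Split a tcp:// DOCKER_HOST value into (hostname, port)."""
    parsed = urlparse(docker_host)
    if parsed.scheme != "tcp" or not parsed.hostname or parsed.port is None:
        raise ValueError(f"expected tcp://host:port, got {docker_host!r}")
    return parsed.hostname, parsed.port


def tunnel_is_open(docker_host: str, timeout: float = 2.0) -> bool:
    """Return True if something accepts TCP connections at DOCKER_HOST."""
    host, port = parse_docker_host(docker_host)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 23750 is the forwarded port used in the steps above.
    print(tunnel_is_open("tcp://localhost:23750"))
```

If this prints False while the SSH session with port forwarding is open, the problem is in the tunnel rather than in the Docker extension settings.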
Other thing(s) to check
I'll add to this list if we find additional reasons why it might not be working, but for now:
You mention that you are starting the Docker Desktop daemon automatically at startup via a scheduled task, but you don't mention anything about the WSL2 instance. However, since you are able to ssh into it, I assume you have a way to bring it up as well? My experience has been that, unless the owning user is logged in, WSL terminates any instances after a few seconds, even if a service is running. There's a workaround, I believe, that I can dust off if this is a problem.

how to restart teamcity server

I am a beginner with TeamCity. Our TeamCity 9 server stopped working after I installed Gradle. I suspect it was a problem with a port or something like that. I removed Gradle, but TeamCity still didn't work, so I tried to restart the TeamCity server. We have two TeamCity agents. I stopped the agents with:
sudo ./runAll.sh stop
and I stopped the server with sudo ./shutdown.sh
After that I started the server again with ./startup.sh and the agents with
sudo ./runAll.sh start
Now when I enter the URL in the browser I get either a connection timeout or connection refused. But when I enter the URL with the explicit IP address, like 10.31.24.18:8111, then I am getting
My questions:
1- How can I restart TeamCity and the agents so that I get the same agents and projects in the TeamCity UI as before the restart? Or, if I create an Administrator account now, will I have to reconfigure all projects, or will my projects from before the restart still be there?
2- Why does the URL with the IP address work while the URL with the server's domain name does not?
You can restart TeamCity right from the UI: Administration > Diagnostics > Server Restart. You will need to have server admin permissions for that.
Or, using the command line:
cd /opt/teamcity/bin
(sudo) ./teamcity-server.sh stop
(sudo) ./teamcity-server.sh start
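If the restart needs to be scripted, the two commands can be wrapped so that the script also waits for the web port to come back. This is only a sketch: the install path is the one from the answer, port 8111 is taken from the URL in the question, and the helper names are made up. The `__main__` section is a dry run that prints the commands instead of executing them.

```python
import socket
import subprocess
import time

TEAMCITY_BIN = "/opt/teamcity/bin"  # path from the answer above; adjust to your install


def restart_commands(use_sudo: bool = False) -> list:
    """Build the stop/start command lines in the order they should run."""
    prefix = ["sudo"] if use_sudo else []
    return [prefix + ["./teamcity-server.sh", action] for action in ("stop", "start")]


def wait_for_port(host: str, port: int, timeout: float = 120.0) -> bool:
    """Poll until host:port accepts a TCP connection or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=2.0):
                return True
        except OSError:
            time.sleep(1.0)
    return False


def restart_teamcity(host: str = "localhost", port: int = 8111,
                     use_sudo: bool = False) -> bool:
    """Run stop then start, then wait for the web port to come back."""
    for command in restart_commands(use_sudo):
        subprocess.run(command, cwd=TEAMCITY_BIN, check=True)
    return wait_for_port(host, port)


if __name__ == "__main__":
    # Dry run: show what would be executed without touching the server.
    for command in restart_commands(use_sudo=True):
        print(" ".join(command))
```

An open port only shows that the server process is listening again. TeamCity may still be initializing, so the UI can take a little longer to become usable.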

Jenkins doesn't recognize slave being down and thus does not allow for it to reconnect

We have a Jenkins instance running on Ubuntu that has several slaves in different systems. One of them is a Windows 7 host, having jenkins slave instance configured as a service.
We have a problem: when that machine is rebooted, the Jenkins master doesn't realize it's gone. It looks just fine in the nodes view. Then, when a build is issued that is supposed to use that slave, it gets stuck. If that build is stopped, the next build fails immediately:
Caused by: java.util.concurrent.TimeoutException: Ping started at 1457016721684 hasn't completed by 1457016961684
... 2 more
[EnvInject] - [ERROR] - SEVERE ERROR occurs: channel is already closed
When the slave has started up and it tries to connect back to master, connection is refused, and in the logs there is an error saying connection with that name already exists:
Server didn't accept the handshake: xxx is already connected to this master. Rejecting this connection.
There is issue JENKINS-5055, which claims a fix was committed allowing the same JNLP slave to reconnect without getting rejected (apparently this commit). According to the changelog, it was introduced in version 1.396 (2011/02/02). However, we are using version 1.639 and still seeing this. Somebody else seems to be seeing it as well. Looking at the current codebase, I see where the error is coming from, but don't see the fix done in JENKINS-5055.
Any ideas on resolving this?
Edit: also asked on jenkins user mailing list, but no responses.
We faced the same issue. Used https://wiki.jenkins-ci.org/display/JENKINS/slave-status as workaround
Reinstalling the slave on a Windows Server 2012 R2 machine shows no signs of this behavior, so it seems that either there was a mistake done during installation steps or this is something caused by using a workstation Windows version.
Regardless, here were the steps to get it working, assuming a brand new installation of Windows, with no network connectivity, and master instance using a self-signed certificate:
Install JRE on the machine. If you have 64-bit operating system, install both 32-bit and 64-bit, otherwise go with 32-bit. Download link here
Install .NET 3.5 on the machine. This is needed by the Jenkins service. You can follow the steps outlined by my other answer for this.
Install Jenkins using Windows installer (.zipped) to C:\Jenkins. It can be downloaded from here.
Check your installation is responding by navigating to http://localhost:8080 . In case of trouble, check for logs in the jenkins folder. If there is a port conflict, edit jenkins.xml and change the httpPort to something else.
From the Windows computer, navigate to your master jenkins and configure a new node there.
Start a slave agent using Java Launch Agent in configure -> node screen (you need to be still using your Windows slave computer)
You should see a visible window opening. From there, select File -> Install as a service. (details and screenshots) If you experience an error without proper explanation, confirm .NET 3.5 is properly installed. If you see "WMI.WmiException: AccessDenied", save the jnlp file locally and start it from administrator prompt or otherwise with elevated privileges (details).
From the Administrative tools -> Services, stop and disable the Jenkins service, and stop Jenkins Slave Agent but leave it on Automatic so it will start up when starting up the computer.
This is only relevant if you're using a self-signed or otherwise problematic certificate:
download the previously mentioned Java Launch Agent file (.jnlp file) again and save it to C:\jenkins
open C:\jenkins\jenkins-slave.xml in your editor
change it to refer to your local .jnlp file by changing jnlp url parameter (file:/C:/jenkins/jenkins-slave.jnlp)
add -noCertificateCheck to parameters
replace the -secret parameter with -auth "user:pass", since otherwise automatic URL GET parameters will be added, which will break finding the .jnlp file
Start the Jenkins Slave Agent service again
For problems with jenkins slave service, check out jenkins-slave.err.log. For Windows Server 2012 R2, you can get the functionality of tail by using Get-Content .\jenkins-slave.err.log -Wait -Tail 10 in Powershell prompt. For older versions of Powershell, leave out -Tail 10.
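Where PowerShell's -Tail option is not available, the same "last N lines, then follow" behaviour can be approximated with a few lines of Python. This is a rough sketch rather than a replacement for Get-Content: the function names are illustrative, and it assumes the log is a text file that is only ever appended to.

```python
import time
from collections import deque


def last_lines(path: str, count: int = 10) -> list:
    """Return the last `count` lines of a text file, similar to -Tail."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        return [line.rstrip("\n") for line in deque(handle, maxlen=count)]


def follow(path: str, count: int = 10, poll_seconds: float = 1.0):
    """Yield the last `count` lines, then new lines as they are appended (like -Wait)."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        # The deque keeps only the newest `count` lines while reading to the end.
        for line in deque(handle, maxlen=count):
            yield line.rstrip("\n")
        while True:
            line = handle.readline()
            if line:
                yield line.rstrip("\n")
            else:
                time.sleep(poll_seconds)  # nothing new yet; wait and retry
```

For example, `for line in follow("jenkins-slave.err.log"): print(line)` prints the last ten lines and then keeps printing new ones until interrupted with Ctrl+C.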

Perforce installation in network folder for Windows

I would like to install the Windows version of Perforce in a network location so that users can call p4 via:
\\somewhere\p4.exe -p server:1666 -c some_client_name sync
where "somewhere" is consistently mapped on all Windows machines. I tried to do this by installing locally, then copying p4.exe to \\somewhere.
On the computer where I installed locally, \\somewhere\p4.exe works just fine. But when I switch to another machine and try to run
\\somewhere\p4.exe -p server:1666 info
I get the following error:
Perforce client error
Connect to server failed; check $P4PORT.
TCP connect to server:1666 failed.
A non-recoverable error occurred during a database lookup.
What does this error mean? I couldn't find any information in the documentation; I suspect I might need another file besides p4.exe. Indeed, when I install Perforce locally on the other machine, using the local p4.exe works, but \\somewhere\p4.exe still does not.
Any pointers?
Thanks!
You shouldn't need any other files besides P4.exe.
The TCP connection error is probably because that other machine isn't able to translate "server" into an IP address.
Try using some of the Windows command line tools to diagnose this, as in:
nslookup server
or
ping server
Also, try changing your test to run:
\\somewhere\p4.exe -p NNN.NNN.NNN.NNN:1666 info
where the "NNN.NNN.NNN.NNN" is the IP address of your server machine.
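The same name-resolution check can be scripted, which is handy when it has to be repeated on several machines. The Python sketch below assumes the placeholder host name "server" and port 1666 from the question. `resolve_host` is an illustrative helper, not part of any Perforce tooling.

```python
import socket


def resolve_host(name: str, port: int = 1666) -> list:
    """Return the sorted IP addresses `name` resolves to, or [] if lookup fails."""
    try:
        infos = socket.getaddrinfo(name, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        # Same situation nslookup would report: the name cannot be translated.
        return []
    # Each entry's sockaddr starts with the address string; a set drops duplicates.
    return sorted({info[4][0] for info in infos})
```

Calling `resolve_host("server")` on the machine where p4.exe fails and getting an empty list would confirm that the name, rather than p4.exe, is the problem. Passing the IP address should return that same address.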

How to debug Jenkins error message "could not find a suitable ssh-agent provider"?

I'm using Jenkins on Win7 and I've installed Tomcat for the ssh-agent plugin. I can clone my GitLab project over SSH from Git Bash.
But if I build the project by Jenkins, it always says :
[ssh-agent] Using credentials IliptonChen(APRTest)
[ssh-agent] Looking for ssh-agent implementation...
[ssh-agent] FATAL: Could not find a suitable ssh-agent provider
FATAL:[ssh-agent] Unable to start agent
The full output text is here
Did I do anything wrong?
Check the version of your ssh-agent used by Jenkins.
This bug (for linux, but could apply to Windows too) reports (10 days ago, January 2014) this very same error message:
"JENKINS-20276: Native Library Error after upgrading ssh-agent from 1.3 to 1.4".
Downgrading to 1.3 resolves the issue.
Update 2019, five years later: as commented, this should be fixed now.
ssh-agent.exe is part of a Git for Windows distribution
D:\git\git>where ssh-agent.exe
D:\prgs\gits\current\usr\bin\ssh-agent.exe
(provided path/to/git/usr/bin is first in the %PATH% used by Jenkins)
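To see which ssh-agent a process would pick up, the PATH lookup can be reproduced in Python. This is a sketch with illustrative helper names. It inspects the PATH of the process that runs it, so it only reflects what Jenkins sees if it is run in the same environment as the Jenkins service or build step.

```python
import os
import shutil


def find_ssh_agent(path=None):
    """Return the first ssh-agent found on `path` (default: this process's PATH), or None."""
    # On Windows, shutil.which also tries the PATHEXT extensions such as .exe.
    return shutil.which("ssh-agent", path=path)


def path_entries() -> list:
    """List the PATH directories in the order they are searched."""
    return [entry for entry in os.environ.get("PATH", "").split(os.pathsep) if entry]


if __name__ == "__main__":
    print("ssh-agent resolved to:", find_ssh_agent())
    for index, entry in enumerate(path_entries(), start=1):
        print(index, entry)
```

If the resolved path is not the one under the Git installation, another ssh-agent earlier on the PATH is taking precedence.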
Assuming you've installed Windows Git on Windows slave, it comes with ssh-agent binary (e.g. C:\Program Files\Git\usr\bin). Try adding its path to system variable PATH.
Otherwise untick SSH Agent and choose the credentials by selecting Credentials from dropdown in Source Code Management section.
Another way is to generate personal API token (OAuth) for that GitHub user and include that along with your repository address, e.g.
git clone https://4UTHT0KEN@github.com/foo/bar
For windows, the plugin still requires Tomcat to be installed in both master and slave.
I got this error because I was using an Ubuntu image for the agent, which doesn't have SSH installed.
agent {
docker { image 'ubuntu:focal' }
}
... so the solution was as simple as installing SSH as part of the pipeline:
steps {
sh "apt-get update && apt-get install ssh -y"
// rest of your steps here...
}
In my case, the error was accompanied by an error about disk space depletion:
[ssh-agent] FATAL: Could not find a suitable ssh-agent provider
[ssh-agent] Diagnostic report
[ssh-agent] * Exec ssh-agent (binary ssh-agent on a remote machine)
[ssh-agent] hudson.AbortException: Failed to run ssh-agent: mkdtemp: private socket dir: No space left on device
So I ran docker system prune -a, which fixed it.
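Since the mkdtemp failure points at a full disk, a quick free-space check on the build machine can rule this cause in or out before pruning anything. This is a small Python sketch. The helper name is made up, and checking the system temp directory is an assumption about where the agent tries to create its socket directory.

```python
import shutil
import tempfile


def free_megabytes(path=None) -> float:
    """Free space in MiB on the filesystem holding `path` (default: the temp dir)."""
    target = path or tempfile.gettempdir()
    return shutil.disk_usage(target).free / (1024 * 1024)


if __name__ == "__main__":
    print(f"{free_megabytes():.0f} MiB free in {tempfile.gettempdir()}")
```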

Resources