SSH timeout error on Azure DevOps CD pipeline - amazon-ec2

I am getting a timeout error when trying to deploy to a VM instance hosted on AWS. Manually, I can log in using
ssh -i myKeyFile.pem myuser@IP
Once I have accessed the remote machine, I can execute some Docker commands and everything works fine. But now that I need to automate that in the CD pipeline, I am getting the following error:
2020-06-02T21:37:12.6877276Z Trying to establish an SSH connection to ***@IP:port
2020-06-02T21:38:52.4629461Z ##[error]Failed to connect to remote machine. Verify the SSH service connection details. Error: Error: Timed out while waiting for handshake.
2020-06-02T21:38:52.4685976Z ##[section]Finishing: Run shell commands on remote machine
The steps I followed to make the SSH connection are:
I created an SSH service connection in the project settings in Azure DevOps
I created the CD pipeline
I added an SSH task with the following parameters (see the sketch below)
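In YAML, the task boils down to something like this (a sketch; the service connection name and command are placeholders for my actual values):

- task: SSH@0
  inputs:
    sshEndpoint: 'my-aws-vm'   # the SSH service connection created in step 1
    runOptions: 'inline'
    inline: 'docker ps'        # the commands to run on the remote machine
    readyTimeout: '20000'      # handshake timeout in milliseconds (the default)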
When I manually trigger it to test whether it works, the release starts fine, but after roughly 1 minute 43 seconds I get the error:
Then, when I review the logs, it is the same error I pasted at the beginning:
[error]Failed to connect to remote machine. Verify the SSH service connection details. Error: Error: Timed out while waiting for handshake
I've increased the handshake timeout setting from the default (20000) to 90000, but no luck.
Has anyone faced this problem before?

It seems there is an ongoing issue with the default agent pools from Azure DevOps. A lot of people have reported this, and the Azure DevOps team is working on it at the time this post is being written (I couldn't find the post where all of that is detailed; I will add it later).
The workaround is to create a self-hosted agent.
After it has been created, you will need to re-create your CD pipeline using the new self-hosted agent.
The rest of the SSH task configuration depends on your needs. But if you want to test that the SSH connection works, just print something:
echo "I'm connected"
After this, your CD pipeline should be working fine.
More details on how to create the self-hosted agent on Windows. There are also links for Linux and macOS.
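For reference, configuring the agent on Linux boils down to something like this (the organization URL, PAT, and pool name are placeholders; on Windows the equivalent script is config.cmd):

# after downloading and extracting the agent package:
./config.sh --url https://dev.azure.com/yourorg --auth pat --token <PAT> --pool Default
./run.sh    # or register it as a service with ./svc.sh install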

I had a similar issue with a VM in Azure. It turned out I had set the security group to only allow SSH in from my local network, and Azure DevOps agents obviously run in a Microsoft network, so they were coming from a different IP address range. The solution was to open up SSH to all source IP addresses. You can get the list of IP address ranges DevOps agents use, but they appear to change every week, which isn't very helpful.
See https://learn.microsoft.com/en-us/azure/devops/organizations/security/allow-list-ip-url?view=azure-devops#microsoft-hosted-agents
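For the EC2 instance in the question, the equivalent fix is opening port 22 in the instance's security group. A sketch using the AWS CLI (the group ID is a placeholder; consider narrowing the CIDR once it works):

aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 --protocol tcp --port 22 --cidr 0.0.0.0/0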


Cannot pull image from AWS ECR repository using docker with VirtualBox or Colima

The Situation
As you may all know, Docker has changed its license for Docker Desktop to restrict free usage to certain use cases.
As a result, I have resorted to alternatives such as Colima and VirtualBox as a means to continue using the Docker CLI while respecting Docker's new terms.
While this works fine for pulling images from Docker Hub, I've noticed that I can no longer pull images from my company's AWS ECR repo, due to an unknown certificate authority error.
My understanding of how Docker runs is limited, but the gist I got from this Stack Overflow post is that the Docker CLI acts as the client through which the developer sends commands to the Docker daemon, which runs on a virtual machine. So this issue is most likely related to the VM that the Docker daemon is running on.
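One quick, generic way to see which daemon endpoint the CLI is talking to (not something from the Colima or VirtualBox docs, just a standard Docker command) is:

docker context ls    # lists the configured daemon endpoints and marks the active one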
The Error Message
Pulling from myrepo/myapp
5ad559c5ae16: Pulling fs layer
d7a7f7e76287: Pulling fs layer
3eb3e996f0d7: Pulling fs layer
d8f3fbab0eaf: Waiting
d310dd0da683: Waiting
6f542466a6be: Waiting
8851a2099770: Waiting
f1dd90cdff4b: Waiting
4a852bd6c6f1: Waiting
538106d55e7d: Waiting
dbc972867db8: Waiting
2bc8828e78a2: Waiting
1a653b47f557: Waiting
877c2f613a70: Waiting
09eac264496b: Waiting
66dd8ce5c695: Waiting
ccde39d6cfef: Waiting
4351b359c9e4: Waiting
52e095209afc: Waiting
c6ad9f161855: Waiting
233f3e28c5a3: Waiting
error pulling image configuration: Get https://prod-ca-central-1-starport-layer-bucket.s3.ca-central-1.amazonaws.com/<a-very-long-hash>/<another-very-long-hash>?X-Amz-Security-Token=<AWS-security-token>&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20220210T215140Z&X-Amz-SignedHeaders=host&X-Amz-Expires=3600&X-Amz-Credential=<my-credential>&X-Amz-Signature=<amazon-signature>: x509: certificate signed by unknown authority
My hypothesis for why I'm getting this error message
This is purely a guess. Please feel free to correct me.
I know that with Docker Desktop I do not get this certificate error, and my guess is that with the HyperKit integration the VM can run via localhost, which allows the Docker daemon to tap into macOS's trusted certificate authority certs.
The problem arises because the VM image I've obtained from the internet no longer has access to those trusted certs.
What I've tried
Ensured I've logged into ECR using the AWS CLI: aws ecr get-login-password --region ca-central-1 | docker login --username AWS --password-stdin <my-aws-account-id>.dkr.ecr.ca-central-1.amazonaws.com
Reinstalled both Colima and the VirtualBox hypervisor
Isolated the issue by experimenting solely on the VirtualBox setup
I noticed that the folder /etc/docker is present on the VM. From Docker's documentation, the default directory for Docker's certificates is /etc/docker/certs.d, which I noticed is absent in my virtual machine installation.
I think I'm close to a solution, but I'm quite new to how certificates work, and I'm not sure where I can obtain the certificates I need to put in that path to test.
Does anyone know how this can be done?
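From what I've read, installing a missing CA into the VM's system trust store would look roughly like this (assuming a Debian-based VM and that I can obtain the CA certificate as ca.pem; I haven't verified this):

sudo cp ca.pem /usr/local/share/ca-certificates/extra-ca.crt   # the file name is arbitrary
sudo update-ca-certificates                                    # rebuild the system trust store
sudo systemctl restart docker                                  # restart the daemon so it picks up the change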
I ran into the same issue; I did this and it worked:
Remove the line "credsStore": "xxx" from ~/.docker/config.json.
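In other words, if your ~/.docker/config.json looks something like this (the helper name after "credsStore" varies by setup):

{
  "auths": { ... },
  "credsStore": "desktop"
}

delete the "credsStore" line, so Docker falls back to keeping credentials in the file itself, and log in to ECR again.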

Deploy API REST IBM Hyperledger Composer Blockchain

I'm developing a POC on IBM Hyperledger Blockchain. I have a business network developed and deployed in IBM Cloud. I can generate a working local REST API, but I cannot make it work in the cloud, on the deployed IP.
I'm following this guide:
https://ibm-blockchain.github.io/interacting/
You just have to execute the following command:
./create/create_composer-rest-server.sh --business-network-card MY_BIZNET_CARD_NAME
But it doesn't deploy anything, and I get the following output (more related to Kubernetes than blockchain):
Preparing yaml file for create composer-rest-server
Creating composer-rest-server pod
Running: kubectl create -f /Users/sm/jsblock/ibm-container-service/cs-offerings/scripts/../kube-configs/composer-rest-server.yaml
The connection to the server localhost:8080 was refused - did you specify the right host or port?
the server doesn't have a resource type "svc"
Creating composer-rest-server service
Running: kubectl create -f /Users/sm/jsblock/ibm-container-service/cs-offerings/scripts/../kube-configs/composer-rest-server-services-free.yaml
The connection to the server localhost:8080 was refused - did you specify the right host or port?
Composer rest server created successfully
Any ideas? Thanks very much.
You need to ensure you have a correct kube config set up. Step 10 in https://ibm-blockchain.github.io/setup/ provides the details for setting up KUBECONFIG; the error suggests that it is either not configured or not configured correctly.
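A quick way to check (the path is a placeholder for wherever your cluster's config file lives):

export KUBECONFIG=/path/to/kube-config.yml
kubectl get nodes    # should list the cluster nodes instead of refusing the connection on localhost:8080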
The document you refer to https://ibm-blockchain.github.io/interacting/ is being updated and should be available soon.
When you run the command ./create/create_composer-rest-server.sh --business-network-card MY_BIZNET_CARD_NAME, the card name should be the Network Admin card for the network you deployed, NOT the PeerAdmin card, so it will be something like ./create/create_composer-rest-server.sh --business-network-card admin@perishable-network
Looks like it's an issue of access control. You should make sure again that you are running with the Local Admin configuration; it will help you to run queries.

Composer-Rest-Server not connecting

I am testing a business network I created. I ran composer-rest-server and all worked fine, then shut the server down as suggested in the developer guide. I then used yo hyperledger-composer to create the skeleton of the Angular app. The Angular app is now showing in the local browser; however, composer-rest-server is not.
Expected Behavior:
composer-rest-server should start on localhost:3000, and the Angular app should start as well
Actual Behavior:
I get this message:
Discovering types from business network definition ...
Connection fails: Error: Error trying to ping. Error: Error trying to query chaincode. Error: Connect Failed
It will be retried for the next request.
Exception: Error: Error trying to ping. Error: Error trying to query chaincode. Error: Connect Failed
Error: Error trying to ping. Error: Error trying to query chaincode. Error: Connect Failed
at _checkRuntimeVersions.then.catch (/home/node/.nvm/versions/node/v6.11.2/lib/node_modules/composer-rest-server/node_modules/composer-connector-hlfv1/lib/hlfconnection.js:696:34)
Your Environment
composer-cli#0.11.3
generator-hyperledger-composer#0.11.3
composer-rest-server#0.11.3
Docker version 17.06.0-ce, build 02c1d87
docker-compose version 1.13.0, build 1719ceb
The Problem
If you kill your Fabric instance using ./stopFabric.sh after starting it with ./startFabric.sh, then all the containers that were a part of the business network are killed as well, so you need to reinstall the .bna and start the network again. (The development flow provided is purposely volatile for rapid development.)
The Solution
1.) Type docker ps to see all of your running containers. You should see none if you are getting that error because your peer is not responding to pings
2.) Open a separate terminal, navigate to where you have fabric-dev-servers, and run ./startFabric.sh. This will start all the containers, such as your network Certificate Authority, the peer, the orderer, etc.
3.) Return to your project in another terminal. Do Steps 1 & 2 from the developer tutorial (you likely won't need to do Step 3, since you likely already imported the network administrator identity while going through the tutorial)
4.) Run composer network ping --card admin@tutorial-network. The ping should go through.
5.) Run docker ps. You should see 4 containers running
6.) Run composer-rest-server and follow the steps from the tutorial.
7.) Run cd tutorial-network-app to switch to where your angular application is (or wherever you generated it with the yo command)
8.) Navigate to http://localhost:3000 and everything should work. (The whole sequence is sketched below.)
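Put together, the recovery sequence boils down to something like this (the card and directory names follow the tutorial; adjust them to yours):

docker ps                                      # expect no containers while the error occurs
cd ~/fabric-dev-servers && ./startFabric.sh    # bring Fabric back up
# redeploy the .bna per Steps 1 and 2 of the developer tutorial, then:
composer network ping --card admin@tutorial-network
docker ps                                      # should now show 4 containers
composer-rest-server                           # answer the prompts as in the tutorial
cd tutorial-network-app && npm start           # in another terminal: serve the Angular app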
Any other questions or problems just reply here and I can help.
The expected behaviour is that the REST server is already running (the generator uses LoopBack to spin up a REST server; that's why you shut down the previous REST server). It's described here https://hyperledger.github.io/composer/unstable/tutorials/developer-guide.html under 'Generate your Skeleton Web Application'.
After you have created the application (following completion of the yo hyperledger-composer questions and after providing the answers), you run your application using npm start from within the generated application directory. Your app is accessible at http://localhost:4200.

Starting Bitbucket Server in Ansible

I'm using Vagrant and Ansible to create my Bitbucket Server on Ubuntu 15.10. The server setup is complete and working, but I have to manually run the start-webapp.sh script to start the server each time I reprovision it.
I have the following task in my Bitbucket role in Ansible. When I increase the verbosity, I can see that I get a positive response from the server saying it will be running at http://localhost/, but when I go to the URL the server isn't up. If I then SSH in to the server and run the script myself, I get the exact same response, and after running the script I can see the startup webpage.
- name: Start the Bitbucket Server
  become: yes
  shell: /bitbucket-server/atlassian-bitbucket-4.7.1/bin/start-webapp.sh
Any advice on how to fix this would be great.
Thanks,
Sam
It's probably better to change that to an init script and use the service module to start it. For example, see this role for installing Bitbucket.
Otherwise, you're subject to HUP and other issues from running processes under an ephemeral session.
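A minimal sketch of that approach (it assumes an init script or systemd unit named bitbucket has already been installed; the name is a placeholder):

- name: Ensure Bitbucket Server is running
  become: yes
  service:
    name: bitbucket
    state: started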

Perforce installation in network folder for Windows

I would like to install the Windows version of Perforce in a network location so that users can call p4 via:
\\somewhere\p4.exe -p server:1666 -c some_client_name sync
where "somewhere" is consistently mapped on all Windows machines. I tried to do this by installing locally, then copying p4.exe to \\somewhere.
On the computer where I installed locally, \\somewhere\p4.exe works just fine. But when I switch to another machine and try to run
\\somewhere\p4.exe -p server:1666 info
I get the following error:
Perforce client error
Connect to server failed; check $P4PORT.
TCP connect to server:1666 failed.
A non-recoverable error occurred during a database lookup.
What does this error mean? I couldn't find any information in the documentation; I suspect I might need another file besides p4.exe. Indeed, when I install Perforce locally on the other machine, using the local p4.exe works, but \\somewhere\p4.exe still does not.
Any pointers?
Thanks!
You shouldn't need any other files besides P4.exe.
The TCP connection error is probably because that other machine isn't able to translate "server" into an IP address.
Try using some of the Windows command line tools to diagnose this, as in:
nslookup server
or
ping server
Also, try changing your test to run:
\\somewhere\p4.exe -p NNN.NNN.NNN.NNN:1666 info
where the "NNN.NNN.NNN.NNN" is the IP address of your server machine.
