Intermittent "Context Canceled" error on Terraform Plan - aws-lambda

I'm running a terraform plan through AWS CodeBuild. terraform init works consistently, but terraform plan intermittently throws this error:
Error: rpc error: code = Canceled desc = context canceled
terraform version: 0.12.31
aws provider version: ~> 3.55.0
The intention here was to change the Python Lambda runtimes from 3.6 to 3.9. To support that, I upgraded Terraform from 0.12.21 to 0.12.31 and the AWS provider from ~> 2 to ~> 3.55.0 (3.55.0 is the first provider version that supports the python3.9 runtime).
I've seen that others have had similar issues, and what seemed to work for them was trying a different version of the provider. I want to keep these versions of Terraform and the AWS provider because upgrading further would have consequences elsewhere in our pipeline. Does anyone have any ideas on how I can move forward?
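For reference, the relevant pieces of the configuration look roughly like this (the function name, handler, role, and region are placeholders, not the real resources):

provider "aws" {
  region  = "us-east-1"     # placeholder region
  version = "~> 3.55.0"     # pinned here because 3.55.0 is the first release with python3.9 support
}

resource "aws_lambda_function" "example" {
  function_name = "example"                # hypothetical function
  filename      = "lambda.zip"
  handler       = "app.handler"
  runtime       = "python3.9"              # previously "python3.6"
  role          = aws_iam_role.lambda.arn  # hypothetical role
}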

I've finally solved it. I increased the compute resources allocated to CodeBuild by following this page: https://docs.aws.amazon.com/codebuild/latest/userguide/change-project-cli.html
All my builds are running through successfully now, so it looks like it was just a resource constraint all along.
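For anyone making the same change from the CLI, it comes down to overriding computeType in the project's environment. The project name and image below are just examples, and note that --environment replaces the whole environment block, so keep your existing type and image:

aws codebuild update-project \
  --name my-terraform-plan-project \
  --environment type=LINUX_CONTAINER,image=aws/codebuild/standard:5.0,computeType=BUILD_GENERAL1_LARGE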

Related

Error syncing pod on starting Beam - Dataflow pipeline from docker

We are consistently getting an error when starting our Beam Golang SDK pipeline (driver program) from a Docker image, even though the same program works when started from a local machine or a VM instance. We are using the Dataflow runner for our pipeline and Kubernetes to deploy.
LOCAL SETUP:
We have the GOOGLE_APPLICATION_CREDENTIALS variable set to the service account for our GCP cluster. When running the job locally, it gets submitted to Dataflow and completes successfully.
DOCKER SETUP:
The build image used is FROM golang:1.14-alpine. When we package the same program with a Dockerfile and try to run it, it fails with this error:
User program exited: fork/exec /bin/worker: no such file or directory
On checking Stackdriver logs for more details, we see this:
Error syncing pod 00014c7112b5049966a4242e323b7850 ("dataflow-go-job-1-1611314272307727-01220317-27at-harness-jv3l_default(00014c7112b5049966a4242e323b7850)"), skipping: failed to "StartContainer" for "sdk" with CrashLoopBackOff: "back-off 2m40s restarting failed container=sdk pod=dataflow-go-job-1-1611314272307727-01220317-27at-harness-jv3l_default(00014c7112b5049966a4242e323b7850)"
We found a reference to this error in the Dataflow common errors doc, but it is too generic to figure out what is failing. After multiple retries, we were able to rule out any permission- or access-related issues with the pods. We were not sure what else could be the problem.
After multiple attempts, we decided to start the job manually from a new Debian 10 based VM instance, and it worked. This made us realize that we were using an Alpine-based Golang image in Docker, which may not have all the dependencies required to start the job.
On the golang Docker Hub page we found golang:1.14-buster, where buster is the codename for Debian 10. Using that image for the Docker build solved the issue. Self-answering here to help anyone else facing the same problem.
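In practice the fix was just the base image in the Dockerfile; a minimal sketch, with the build command being illustrative:

FROM golang:1.14-buster            # was golang:1.14-alpine; buster is the Debian 10 codename
WORKDIR /app
COPY . .
RUN go build -o /bin/worker .      # hypothetical build command producing the worker binary
ENTRYPOINT ["/bin/worker"]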

terraform won't download plugins without sudo

I am trying to run my Terraform code, starting with terraform init, but I am running into issues.
As you can see below, the command has no issues when I run it with sudo, but it fails without it. I am using macOS Mojave and Terraform 0.12, and I have checked the folder permissions, which look fine.
Once I run sudo terraform init, the other commands don't need sudo.
Initializing the backend...
Initializing provider plugins...
- Checking for available provider plugins...
Registry service unreachable.
This may indicate a network issue, or an issue with the requested Terraform Registry.
Registry service unreachable.
This may indicate a network issue, or an issue with the requested Terraform Registry.
Error: registry service is unreachable, check https://status.hashicorp.com/ for status updates
Error: registry service is unreachable, check https://status.hashicorp.com/ for status updates
C02Z1BCSLVCG:blue-deployment shakyas$ sudo terraform init
Password:
Initializing the backend...
Initializing provider plugins...
- Checking for available provider plugins...
- Downloading plugin for provider "aws" (hashicorp/aws) 2.42.0...
- Downloading plugin for provider "template" (hashicorp/template) 2.1.2...
The following providers do not have any version constraints in configuration,
so the latest version was installed.
To prevent automatic upgrades to new major versions that may contain breaking
changes, it is recommended to add version = "..." constraints to the
corresponding provider blocks in configuration, with the constraint strings
suggested below.
* provider.aws: version = "~> 2.42"
* provider.template: version = "~> 2.1"
Terraform has been successfully initialized!
You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.
If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
I had the same issue, and I resolved it by removing a lot of certificates from my macOS Keychain.
It sounds weird and I still don't understand why it works, but it has worked for other people too: https://discuss.hashicorp.com/t/error-when-running-terraform-init/3135/6
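If the certificate trick does not apply, it is also worth checking whether the plugin directories were left owned by root by an earlier sudo run (paths below assume the Terraform 0.12 defaults):

ls -la ~/.terraform.d .terraform
sudo chown -R "$(whoami)" ~/.terraform.d .terraform   # only if they turn out to be root-owned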

Terraform behind a password-protected proxy

I have set the proxy on the command line as follows:
set HTTP_PROXY=http://user:passowrd#host.com:8080
set HTTPS_PROXY=https://user:passowrd#host.com:8080
set HTTP_USER=myuser
set HTTP_PASSWORD=mypwd
Furthermore, I have set the environment variables HTTP_PROXY, HTTPS_PROXY, HTTP_USER, and HTTP_PASSWORD.
Somehow I am still getting the following error:
>terraform init
Initializing the backend...
Initializing provider plugins...
- Checking for available provider plugins...
Registry service unreachable.
This may indicate a network issue, or an issue with the requested Terraform Registry.
Error: registry service is unreachable, check https://status.hashicorp.com/ for status updates
Please note that https://status.hashicorp.com/ is accessible from behind the proxy, but I am not sure which URL/service API terraform init is actually trying to reach.
Working for me with proxy:
C:\Users\xxxx\Desktop\VMWare_Scripts\Terraform>set HTTP_PROXY=http://xxxx:8080
C:\Users\xxxx\Desktop\VMWare_Scripts\Terraform>terraform init
Initializing the backend...
Initializing provider plugins...
Finding latest version of hashicorp/vsphere...
Installing hashicorp/vsphere v1.24.1...
Installed hashicorp/vsphere v1.24.1 (signed by HashiCorp)
The following providers do not have any version constraints in configuration,
so the latest version was installed.
To prevent automatic upgrades to new major versions that may contain breaking
changes, we recommend adding version constraints in a required_providers block
in your configuration, with the constraint strings suggested below.
hashicorp/vsphere: version = "~> 1.24.1"
Terraform has been successfully initialized!
You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.
If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
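As a follow-up: if the proxy does require credentials, they usually need to go into the proxy URL itself (with special characters percent-encoded) rather than into separate HTTP_USER / HTTP_PASSWORD variables, which Terraform does not appear to read. The host and credentials below are placeholders:

set HTTP_PROXY=http://myuser:mypwd@host.com:8080
set HTTPS_PROXY=http://myuser:mypwd@host.com:8080
terraform init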

Windows Docker API issue for using GitLab-runner

While setting up a Windows CI pipeline with GitLab, I went through the numerous issues related to the Windows gitlab-runner Docker executor, which uses an old API version (1.18) that Docker no longer accepts.
The issue results in the following error messages when GitLab CI tries to connect to the runner:
Running with gitlab-runner 11.2.0 (35e8515d) on Windows VS2017 x64 0825d1d7
Using Docker executor with image buildtools2017 ...
ERROR: Preparation failed: Error response from daemon: client version 1.18 is too old. Minimum supported API version is 1.24, please upgrade your client to a newer version (executor_docker.go:1148:0s)
The 'buildtools2017' Docker image referred to is Microsoft's "official" one.
The image seems to work and be valid for the current (experimental) Docker version I'm using (18.06.1-ce-win74), as well as for the stable version.
The issue has been described throughout the GitLab wiki. Andrew Leech (?) went so far as to fork and modify the runner so that it would connect properly, and kindly provided his scripts and comments in a blog post. This seems to give some results:
C:\gitlab-runner>gitlab-runner.exe -v
Version: 10.8.0~beta.551.g67a6ccc7
Git revision: 67a6ccc7
Git branch: windows-container-executor
GO version: go1.9.4
Built: 2018-07-30T08:57:44+00:00
OS/Arch: windows/amd64
The GitLab wiki states that they're waiting until a more stable solution can be released. It has now been over a year of broken Windows Docker runners.
Andrew's blog post (which links to his gitlab-runner.exe) actually describes a different workaround that uses the PowerShell runner, which then starts a Docker instance. All the token info is exposed, I'm not sure how to set it up, and it also seems to rely on an external image with older build tools.
It seems the Docker runner now connects, but if I understand correctly, the gitlab-runner Docker executor does not agree on the 'build directory' that is used. The first GitLab CI script line in my repo is just an echo command, so the error is not about the CI script content, but I'm not sure what it IS about. If anyone with Docker fu knows what is going on, that would really help me.
Using Docker executor with image buildtools2017 ...
ERROR: Preparation failed: build directory needs to be absolute and non-root path
Cheers,
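For what it's worth, the build directory the executor complains about can be pinned explicitly in the runner's config.toml; something along these lines (the path is only an example of an absolute, non-root location, and the executor name depends on the runner build in use):

[[runners]]
  name = "Windows VS2017 x64"
  executor = "docker-windows"      # or "docker", depending on the forked runner build
  builds_dir = "C:\\builds"        # absolute, non-root path inside the container
  [runners.docker]
    image = "buildtools2017"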

AWS SDK S3 example giving network error

I am trying to learn to use the AWS SDK for Ruby, and I started by running the sample programs on S3. But the S3 example gives me this error:
/var/lib/gems/1.9.1/gems/aws-sdk-1.8.0/lib/aws/core/client.rb:318:in 'return_or_raise':AWS::Core::Client::NetworkError (AWS::Core::Client::NetworkError)
from /var/lib/gems/1.9.1/gems/aws-sdk-1.8.0/lib/aws/core/client.rb:419:in 'client_request'
from (eval):3:in 'put_object'
from /var/lib/gems/1.9.1/gems/aws-sdk-1.8.0/lib/aws/s3/s3_object.rb:1655:in 'write_with_put_object'
from /var/lib/gems/1.9.1/gems/aws-sdk-1.8.0/lib/aws/s3/s3_object.rb:600:in 'write'
from s3/upload_file.rb:31:in 'main'
I have already checked with s3cmd, and it works fine.
I am using Ruby 1.9.1, and the AWS SDK version is 1.8.0.
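For context, judging from the stack trace, the sample's upload boils down to the v1 aws-sdk calls sketched below (bucket and file names are placeholders); the NetworkError is raised from the underlying PUT request rather than from anything in the script itself:

require 'aws-sdk'    # aws-sdk v1 (1.8.0)
require 'pathname'

s3  = AWS::S3.new    # credentials are picked up from AWS.config / the environment
obj = s3.buckets['my-bucket'].objects['my-file.txt']   # placeholder bucket and key
obj.write(Pathname.new('my-file.txt'))                 # S3Object#write -> put_object, where NetworkError surfaces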
