Hope everyone is keeping safe!
I have a Ruby on Rails application hosted on AWS Elastic Beanstalk. I use a CloudFormation template to apply any stack updates, e.g. a Ruby version bump or a Linux platform upgrade.
I was trying to upgrade the Linux platform to 2.11.7, Ruby to 2.6.6, and Elasticsearch to 7.4.
I made these changes in the CloudFormation YAML template and then ran the aws cloudformation update-stack command to apply them.
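The update was run with something along these lines (the stack name, template file, and parameter below are placeholders, not my exact values):
# stack name, template file, and the RubyVersion parameter are placeholders
aws cloudformation update-stack \
  --stack-name my-beanstalk-stack \
  --template-body file://beanstalk-environment.yml \
  --parameters ParameterKey=RubyVersion,ParameterValue=2.6.6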
While the update was still in progress, I accidentally clicked Rebuild Environment in the AWS web console. As a result, all the previously configured settings (SQS, load balancer, etc.) were replaced by new ones.
Now, whenever I try to execute the update-stack command, it fails with the errors below:
2020-06-09 15:25:44 UTC+0530 WARN Environment health has transitioned from Info to Degraded. Command failed on all instances. Incorrect application version found on all instances. Expected version "code-pipeline-xxxxxxxxxx" (deployment 2377). Application update failed 40 seconds ago and took 79 seconds.
2020-06-09 15:25:03 UTC+0530 INFO The environment was reverted to the previous configuration setting.
2020-06-09 15:24:44 UTC+0530 INFO Environment health has transitioned from Ok to Info. Application update in progress on 1 instance. 0 out of 1 instance completed (running for 39 seconds).
2020-06-09 15:24:30 UTC+0530 ERROR During an aborted deployment, some instances may have deployed the new application version. To ensure all instances are running the same version, re-deploy the appropriate application version.
2020-06-09 15:24:30 UTC+0530 ERROR Failed to deploy application.
2020-06-09 15:24:30 UTC+0530 ERROR Unsuccessful command execution on instance id(s) 'i-xxxxxxxxxx'. Aborting the operation.
2020-06-09 15:24:30 UTC+0530 INFO Command execution completed on all instances. Summary: [Successful: 0, Failed: 1].
2020-06-09 15:24:30 UTC+0530 ERROR [Instance: i-xxxxxxxxxx] Command failed on instance. Return code: 18 Output: (TRUNCATED)...g: the running version of Bundler (1.16.0) is older than the version that created the lockfile (1.17.3). We suggest you upgrade to the latest version of Bundler by running `gem install bundler`. Your Ruby version is 2.6.6, but your Gemfile specified 2.6.5. Hook /opt/elasticbeanstalk/hooks/appdeploy/pre/10_bundle_install.sh failed. For more detail, check /var/log/eb-activity.log using console or EB CLI.
2020-06-09 15:24:19 UTC+0530 INFO Deploying new version to instance(s).
2020-06-09 15:23:45 UTC+0530 INFO Updating environment developWeb's configuration settings.
2020-06-09 15:23:36 UTC+0530 INFO Environment update is starting.
I can confirm that the platform has Ruby 2.6.6 set. I am not sure where it is picking up the old Ruby version from.
Is there any way I can fix this, or forcefully apply the template changes?
Any help on this would be highly appreciated.
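One way to narrow down where the old version comes from is to check, on the instance itself, which Ruby the deployed bundle pins. A rough sketch; the paths assume the standard Elastic Beanstalk layout:
# on the EC2 instance, assuming the standard EB application directory
grep -n "^ruby" /var/app/current/Gemfile
ruby -v
# the failing deploy hook logs here, per the error message above
tail -n 100 /var/log/eb-activity.log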
[UPDATE]: When I try to connect to Elasticsearch from the Rails console, I get:
Faraday::ConnectionFailed: Failed to open TCP connection to old-elasticsearch-host-name.es.amazonaws.com:80 (Hostname not known: old-elasticsearch-host-name.es.amazonaws.com)
from /opt/rubies/ruby-2.6.6/lib/ruby/2.6.0/net/http.rb:949:in `rescue in block in connect'
Caused by SocketError: Failed to open TCP connection to old-elasticsearch-host-name.es.amazonaws.com:80 (Hostname not known: old-elasticsearch-host-name.es.amazonaws.com)
from /opt/rubies/ruby-2.6.6/lib/ruby/2.6.0/net/http.rb:949:in `rescue in block in connect'
Caused by Resolv::ResolvError: no address for old-elasticsearch-host-name.es.amazonaws.com
from /opt/rubies/ruby-2.6.6/lib/ruby/2.6.0/resolv.rb:94:in `getaddress'
The new Elasticsearch instance has a different URL, but the app is still picking up the old URL from the ELASTICSEARCH_HOST environment variable.
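A direct way to repoint it while the stack issue gets sorted out would be to override the variable on the environment itself; a sketch, assuming the standard application-environment option namespace (the environment name comes from the event log above, the endpoint value is a placeholder):
# environment name from the event log; the endpoint value is a placeholder
aws elasticbeanstalk update-environment \
  --environment-name developWeb \
  --option-settings Namespace=aws:elasticbeanstalk:application:environment,OptionName=ELASTICSEARCH_HOST,Value=https://new-elasticsearch-host-name.es.amazonaws.com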
This was a config issue.
Whenever I ran the aws cloudformation update-stack command, it pulled the source-code zip from S3, and the Gemfile inside that zip still had the Ruby version set to 2.6.5.
So I uploaded a fresh copy of the source code (with the Gemfile pointing at Ruby 2.6.6) and then executed the update-stack command again, and it worked.
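Roughly, the fix boiled down to the following (the bucket, file, and stack names are placeholders for whatever the template actually points at):
# confirm the Gemfile pins the Ruby version the platform now runs (ruby '2.6.6')
grep -n "^ruby" Gemfile
# rebuild the source bundle and replace the object the template references
zip -r app-source.zip . -x "*.git*"
aws s3 cp app-source.zip s3://my-deploy-bucket/app-source.zip
# then re-run the stack update
aws cloudformation update-stack --stack-name my-beanstalk-stack --template-body file://beanstalk-environment.yml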
Related
We are consistently getting an error when starting our Beam Go SDK pipeline (driver program) from a Docker image, although the same program works when started from a local machine or VM instance. We are using the Dataflow runner for our pipeline and Kubernetes to deploy.
LOCAL SETUP:
We have the GOOGLE_APPLICATION_CREDENTIALS variable set to the service account for our GCP cluster. When we run the job locally, it gets submitted to Dataflow and completes successfully.
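For context, the local invocation is roughly the following (the key path, project, region, and bucket are placeholders; the flags are the usual Beam Go pipeline options and may differ slightly per SDK version):
# placeholders: key path, project, region, and staging bucket
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
go run ./main.go \
  --runner=dataflow \
  --project=my-gcp-project \
  --region=us-central1 \
  --staging_location=gs://my-bucket/staging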
DOCKER SETUP:
The build image used is FROM golang:1.14-alpine. When we package the same program with a Dockerfile and try to run it, it fails with the error:
User program exited: fork/exec /bin/worker: no such file or directory
On checking Stackdriver logs for more details, we see this:
Error syncing pod 00014c7112b5049966a4242e323b7850 ("dataflow-go-job-1-1611314272307727-
01220317-27at-harness-jv3l_default(00014c7112b5049966a4242e323b7850)"),
skipping: failed to "StartContainer" for "sdk" with CrashLoopBackOff:
"back-off 2m40s restarting failed container=sdk pod=dataflow-go-job-1-
1611314272307727-01220317-27at-harness-jv3l_default(00014c7112b5049966a4242e323b7850)"
We found a reference to this error in the Dataflow common errors doc, but it is too generic to figure out what is failing. After multiple retries, we were able to rule out any permission/access issues on the pods. We are not sure what else could be the problem here.
After multiple attempts, we decided to start the job manually from a new Debian 10 based VM instance, and it worked. This brought to our notice that we were using an Alpine-based golang image in Docker, which may not have all the dependencies required to start the job.
On the golang Docker Hub page, we found golang:1.14-buster, where buster is the codename for Debian 10. Using that image for the Docker build solved the issue. Self-answering here to help anyone else facing the same problem.
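In practice the change was a one-line swap of the base image followed by a rebuild; something like the following (the image tag is a placeholder):
# switch the base image from Alpine to Debian 10 (buster), then rebuild
sed -i 's|FROM golang:1.14-alpine|FROM golang:1.14-buster|' Dockerfile
docker build -t my-dataflow-driver .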
I deployed my application on AWS Elastic Beanstalk. Initially I deployed it directly through the AWS console: after configuring everything, I zipped my code and uploaded it in the console, and at that time it worked perfectly.
But now, when I try to deploy with the CLI using the eb deploy command, it shows an error:
Creating application version archive "app-xxxxxxxxx".
Uploading: [##################################################] 100% Done...
2020-03-14 18:51:49 INFO Environment update is starting.
2020-03-14 18:51:55 INFO Deploying new version to instance(s).
2020-03-14 18:52:22 ERROR [Instance: i-xxxxxxxxx] Command failed on instance. Return code: 255 Output: (TRUNCATED)...ar/src/Composer/DependencyResolver/GenericRule.php on line 36
Fatal error: Out of memory (allocated 809508864) (tried to allocate 8192 bytes) in phar:///opt/elasticbeanstalk/support/composer.phar/src/Composer/DependencyResolver/GenericRule.php on line 36.
Hook /opt/elasticbeanstalk/hooks/appdeploy/pre/10_composer_install.sh failed. For more detail, check /var/log/eb-activity.log using console or EB CLI.
2020-03-14 18:52:22 INFO Command execution completed on all instances. Summary: [Successful: 0, Failed: 1].
2020-03-14 18:52:22 ERROR Unsuccessful command execution on instance id(s) 'i-xxxxxxxxx'. Aborting the operation.
2020-03-14 18:52:23 ERROR Failed to deploy application.
Based on what I found on the internet, I tried the following:
1) I increased the memory limit in php.ini, but it still doesn't work.
2) I created an .ebextensions folder and added some configuration, but that doesn't work either.
My guess is that when I initially deployed manually, the zip included the vendor folder, whereas the CLI does not upload the vendor folder and runs composer install on the instance instead.
So I think that is why I'm facing this issue.
Please let me know if there is anything else I need to do.
This is a configuration inside the Elastic Beanstalk environment.
To increase the memory limit:
Access the Elastic Beanstalk section
Open your environment
Go to Configuration, then Software
Find the Memory limit field. The default value is 1024M.
Update the value to what you want
Apply the change
Redeploy your application
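The same setting can also be changed from the CLI; a sketch, assuming the PHP platform's php.ini option namespace (replace the environment name and value with your own):
# environment name and value are placeholders
aws elasticbeanstalk update-environment \
  --environment-name my-php-env \
  --option-settings Namespace=aws:elasticbeanstalk:container:php:phpini,OptionName=memory_limit,Value=2048M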
When deploying with Elastic Beanstalk, you should check the environment configuration: the memory_limit value can be set in the Software tab under Configuration.
Follow these steps:
Open the Elastic Beanstalk console
Open your environment
Go to Configuration, then the Software tab
Find the Memory limit field.
Change the value to what you want
Redeploy
I have an issue where my most recent CodeDeploy deployment failed due to a timeout, and I am now unable to connect to or load my EC2 instance. It didn't roll back to the previous working version, and since I can't connect I can't take a deeper look at the CodeDeploy logs. Looking at the log tail, there isn't anything glaring in the error output, and I don't think it is related to the timeout, but I'm not sure what in my shell script could be causing it. Any ideas about what might be wrong in my setup? Should I check for an existing node_modules folder before running sudo npm install?
Error:
Error Code: ScriptTimedOut
Script Name:scripts/npm-install.sh
Message: Script at specified location: scripts/npm-install.sh failed to complete in 300 seconds
Log Tail:
LifecycleEvent - AfterInstall
Script - scripts/npm-install.sh
[stderr]npm WARN deprecated lodash-node#3.10.2: This package is discontinued. Use lodash#^4.0.0.
[stderr]npm WARN deprecated sendgrid#4.10.0: Please see v6.X+ at https://www.npmjs.com/org/sendgrid
[stderr]npm WARN deprecated nodemailer#2.7.2: All versions below 4.0.1 of Nodemailer are deprecated. See https://nodemailer.com/status/
[stderr]npm WARN deprecated mailcomposer#4.0.1: This project is unmaintained
[stderr]npm WARN deprecated socks#1.1.9: If using 2.x branch, please upgrade to at least 2.1.6 to avoid a serious bug with socket data flow and an import issue introduced in 2.1.0
[stderr]npm WARN deprecated buildmail#4.0.1: This project is unmaintained
[stderr]npm WARN deprecated sendgrid#1.9.2: Please see v6.X+ at https://www.npmjs.com/org/sendgrid
[stderr]npm WARN deprecated mailparser#0.6.2: This project is unmaintained
[stderr]npm WARN deprecated mimelib#0.3.1: This project is unmaintained
[stderr]
Here is my appspec.yml:
version: 0.0
os: linux
files:
  - source: /
    destination: /var/www/app/
hooks:
  AfterInstall:
    - location: scripts/npm-install.sh
      runas: ec2-user
      timeout: 300
  ApplicationStart:
    - location: scripts/npm-start.sh
      runas: ec2-user
      timeout: 60
npm-install.sh:
#!/bin/bash
source /home/ec2-user/.bash_profile
cd /var/www/app
sudo npm install
npm-start.sh:
#!/bin/bash
source /home/ec2-user/.bash_profile
cd /var/www/app
npm start
As you probably know, the 300 second timeout is configured in your appspec.yml file. I'm assuming that npm install should finish in less than 300 seconds.
With the information provided, it's impossible to know what went wrong. Per your concern about the node_modules directory not existing, it won't matter because npm install will create that directory if it doesn't already exist. If possible, I would strongly suggest upgrading the deprecated packages. It's possible NPM is not happy about that.
As for your instance, there is nothing in the scripts that you've provided that should kill your instance. If you can't connect to your EC2 instance, you'll probably want to recycle it and get a new one. If you have something actually running on the EC2 instance you can't connect to, you could create a new CodeDeploy deployment group with the new instance, deploy the code and see if the instance remains healthy. This could be an intermittent oddity, and it would be more valuable to troubleshoot if the issue becomes repeatable.
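If the install itself is the slow part, a slightly more defensive version of the AfterInstall hook might look like this (a sketch, not a drop-in replacement; it assumes a reasonably recent npm and a committed package-lock.json). Raising the hook's timeout value in appspec.yml is the other obvious lever:
#!/bin/bash
source /home/ec2-user/.bash_profile   # same environment setup as the original hook
set -exo pipefail                     # fail fast and echo every command into the deployment log
cd /var/www/app
# npm ci is usually faster and more reproducible than npm install when a lockfile exists
npm ci --production 2>&1 | tee /tmp/npm-install.log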
As of last night, all our new Docker deployments started failing because the latest version of Docker (docker-1.3.2-1.0.amzn1.x86_64) in the Amazon repo fails to start up.
Steps to reproduce are:
## Launch instance with default amazon AMI
yum install docker-1.3.2-1.0.amzn1.x86_64
service docker restart
### Get the following error in /var/log/docker
2014/11/26 05:14:16 docker daemon: 1.3.2 c78088f/1.3.2; execdriver: native; graphdriver:
[8f6d7cfb] +job serveapi(unix:///var/run/docker.sock)
[info] Listening for HTTP on unix (/var/run/docker.sock)
docker: relocation error: docker: symbol dm_task_get_info_with_deferred_remove,
version Base not defined in file libdevmapper.so.1.02 with link time reference
If I downgrade back to docker-1.3.1-1.0.amzn1.x86_64 everything seems to be fine.
Is the AWS package actually broken, or is it just our setup?
Is there a workaround other than downgrading?
Yes, it is broken for me too.
Downgrading has been the only solution so far.
I hit the same error on a CentOS VM provisioned at my workplace; a yum update resolved it.
I suspect a broken build went out and has since been fixed.
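For reference, the two workarounds described above come down to something like this on Amazon Linux (the package version is taken from the question; device-mapper-libs as the source of libdevmapper is an assumption):
# option 1: pin the previous Docker build
sudo yum downgrade docker-1.3.1-1.0.amzn1.x86_64
# option 2: pull in a newer libdevmapper (assumed to come from device-mapper-libs) and restart
sudo yum update device-mapper-libs
sudo service docker restart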
I am using Play 1.2.4 and deploying to Heroku. When I deployed most recently, I had a mistake in my latest db evolution (it was trying to add a column that was already there). It failed and needed to be resolved, so I ran the heroku run "play evolutions:resolve" command.
I have also tried running heroku restart and then the above command, but that didn't work either.
The error I get when I run the heroku run "play evolutions:resolve" command is:
Picked up JAVA_TOOL_OPTIONS: -Djava.net.preferIPv4Stack=true -Djava.rmi.server.useCodebaseOnly=true
Exception in thread "main" java.lang.NullPointerException
at play.db.Evolutions.main(Evolutions.java:54)
How can I fix the production environment on Heroku?
It turns out I needed to add the --%prod flag.
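In other words, the resolve command needs to be pointed at the production framework ID; roughly:
# run evolutions:resolve against the prod framework id
heroku run "play evolutions:resolve --%prod"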