Apache Whirr on EC2 with custom AMI - amazon-ec2

I am trying to launch a cluster of custom AMI images. The AMI is just the Ubuntu 12.04 server image from Amazon's free tier selection with Java installed (eventually I want to create an AMI with numpy and scipy). In fact, I created that image by launching the Ubuntu 12.04 instance with Whirr and noop as the role. Then I installed Java, and in the AWS console selected Create Image (EBS AMI). I am using the same Whirr recipe script I used to launch the original Ubuntu server, with only the image-id changed.
Whirr launches the image, and it shows up in the console. Then it tries to run the InitScript for noop and nothing happens. After 10 minutes it throws an exception caused by the script running for too long. whirr.log contains the record:
error acquiring SFTPClient() (out of retries - max 7): Invalid packet: indicated length 1349281121 too large
I saw this error mentioned in one of the tutorials; the suggested solution was to add the line
whirr.bootstrap-user=ec2-user
to let jclouds know the username. I know this is the correct username, and it was used by default anyway. After adding the line, whirr.log shows an authentication error: a problem with the public key.
Finally, when I use 'ubuntu' as user, the error is
Dying because - java.net.SocketTimeoutException: Read timed out
Here's the file I use to launch the cluster:
whirr.cluster-name=pineapple
whirr.instance-templates=1 noop
whirr.provider=aws-ec2
whirr.identity=${env:AWS_ACCESS_KEY_ID}
whirr.credential=${env:AWS_SECRET_ACCESS_KEY}
whirr.private-key-file=${sys:user.home}/.ssh/id_rsa
whirr.public-key-file=${sys:user.home}/.ssh/id_rsa.pub
whirr.env.repo=cdh4
whirr.hardware-id=t1.micro
whirr.image-id=us-east-1/ami-224cda4b
whirr.image-location=us-east-1b

Posting the full exception log would help diagnose the problem.
Also, setting the following may solve your issue:
whirr.cluster-user=<Clu>
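For reference, both user-related properties live in the same recipe file as the rest; a sketch with 'ubuntu' as a placeholder value (match it to whatever login user your custom AMI actually ships with):

```
# User that jclouds logs in as during bootstrap (placeholder value):
whirr.bootstrap-user=ubuntu
# User that Whirr sets up for cluster access:
whirr.cluster-user=ubuntu
```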

"There are no OCR keys; creating a new key encrypted with given password" Crashes when running Chainlink node

I am setting up a chainlink node in AWS ec2 + AWS RDS (PostgreSQL) and have followed every step in the documentation (https://docs.chain.link/docs/running-a-chainlink-node/).
Everything runs smoothly until the OCR keys creation step. Once it gets here, it shows "There are no OCR keys; creating a new key encrypted with given password". This is supposed to happen but the docker container exits right after (see image below).
[Image: output after OCR keys creation]
I have tried the following:
Checking whether there is a problem with the specific PostgreSQL table these keys are stored in (public.encrypted_ocr_key_bundles), which gets populated if this step succeeds. Nothing here so far.
Using a different version of the Chainlink docker image (see Chainlink Docker hub). I am currently using version 0.10.0. No success either, even if using latest ones.
Using AWS Cloudformation to "let AWS + Chainlink" take care of this, but even so I have encountered similar problems, so no success.
I have thought about populating the OCR table manually with a query, but I am far from having proper OCR key generation knowledge/script in hand so I do not like this option.
Does anybody know what else to try/where the problem could be?
Thanks a lot in advance!
UPDATE: It was a simple memory problem. The AWS micro instance (1 GB RAM) was running out of memory when the OCR keys were generated. I only got a log of the error after switching to a newer version of the Chainlink docker image. In conclusion: migrate to a bigger instance. Should've thought of that, but learning never stops!
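If it helps anyone else, a hypothetical pre-flight check along these lines would have caught the problem early (the function name and the 1024 MB threshold are my own assumptions, based on the 1 GB instance running out of memory here):

```shell
# Refuse to start the node when available memory looks too low for
# OCR key generation (threshold of 1024 MB is an assumption).
check_memory() {
  # $1: available memory in MB
  if [ "$1" -lt 1024 ]; then
    echo "insufficient"
  else
    echo "ok"
  fi
}

# On a live host the real value can be read from /proc/meminfo, e.g.:
#   avail_mb=$(awk '/MemAvailable/ {print int($2/1024)}' /proc/meminfo)
check_memory 512    # prints "insufficient" for a 512 MB reading
```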

How can I allow a private insecure registry to be used inside a minikube node?

I know there are about a thousand answers to various permutations of this question but none of the fifteen or so that I've tried have worked.
I'm running on Mac OS Sierra and using Minikube 0.17.1 as well as kubectl 1.5.3.
We run our own private Docker registry that is insecure as it is not open to the internet. (This is not my choice or in my control so it's not open for discussion). This is my first foray into Kubernetes and actually container orchestration altogether. I also have a very intermediate level of knowledge about Docker in general so I'm drowning in terminology/platform soup here.
When I execute
kubectl run perf-ui --image=X.X.X.X/performance/perf-ui:master
I see
image pull failed for X.X.X.X/performance/perf-ui:master, this may be because there are no credentials on this request. details: (Error response from daemon: Get https://X.X.X.X/v1/_ping: dial tcp X.X.X.X:443: getsockopt: connection refused)
We have an Ubuntu box that accesses the same registry (not using Kubernetes, just Docker) that works just fine. This is likely due to
DOCKER_OPTS="--insecure-registry X.X.X.X"
being in /etc/default/docker.
I made a similar change using the UI of Docker for Mac, though I don't know where this change persisted in a config file. After this change a docker pull worked on my laptop! Again, this is just using Docker, not Kubernetes. The interesting part is that I got the same "connection refused" error (as it tries to access via HTTPS) on my Mac as I get in the Minikube VM, and after the change the pull worked. I feel like I'm on to something there.
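For what it's worth, the setting exposed in the Docker for Mac UI corresponds to the insecure-registries key of Docker's daemon.json format. A minimal sketch of that file, using the registry address from later in this question (on Linux the real file lives at /etc/docker/daemon.json; here we write to the current directory just to show the shape):

```shell
# Write a minimal daemon.json declaring one insecure registry.
cat > daemon.json <<'EOF'
{
  "insecure-registries": ["10.129.100.3"]
}
EOF
```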
After sshing into minikube (the VM created by minikube start) using
minikube ssh
I added the following content to /var/lib/boot2docker/profile
export EXTRA_ARGS="$EXTRA_ARGS --insecure-registry 10.129.100.3"
export DOCKER_OPTS="$DOCKER_OPTS --insecure-registry 10.129.100.3"
As you can infer, nothing has worked. I know I've tried other things but they too have failed.
I know this isn't the most comprehensive explanation but I've been digging into this for the past 4 hours.
The bottom line is docker pulls work from our Ubuntu box with the config file setup correctly and from my Mac with the setting configured properly.
How can I enable the same setting in my "Linux 2.6" VM that was created by Minikube?
If someone knows the answer I would be forever grateful.
Thank you in advance!
Thank you to Janos for your alternative solution. I'm confident that is the right choice for some use cases.
It turns out that what I needed was a good night sleep and the following command to start Minikube in the first place:
minikube start --insecure-registry="X.X.X.X"
@intelfx says that adding a port won't be necessary. I'm inclined to believe them, but if your registry is on a non-standard port, just keep it in mind in case things still aren't working.
In the end it was, in fact, a matter of telling Docker to use an insecure registry but it was not clear how to tell this to Docker when I was not controlling it directly.
I know that seems simple but after you've tried a hundred things you're almost hallucinating so you're not in a great state to make rational decisions. I'm sorry for the dumb post but I'm willing to bet this will help at least one person one day, which makes it worth it.
Thanks SO!
The flag --insecure-registry doesn't work on an existing cluster on macOS. You need to do minikube delete; it's not enough to just stop the cluster with minikube stop.
I spent plenty of time to figure this out and then I found this comment at https://github.com/kubernetes/minikube/issues/604:
the --insecure-registry flag is ignored if the machine already existed (even if it is stopped). You must first minikube delete if you want new flags to be respected.
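Putting the accepted answer and this caveat together, the full reset sequence looks like this (registry address redacted as in the question):

```
minikube stop
minikube delete
minikube start --insecure-registry="X.X.X.X"
```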
You can use kube-registry-proxy (it needs some configuration), from:
https://github.com/kubernetes/kubernetes/blob/master/cluster/saltbase/salt/kube-registry-proxy/kube-registry-proxy.yaml
Then you can refer to localhost:5050 as your registry. The trick is that localhost is allowed as an insecure registry by default.

Rubber stalling while executing `bundle:install'

A rubber deployment as per the quick start instructions, using the latest 3.1.0 version, reaches the stage of fetching and installing the gems (the last one loaded is pg) on an m1.small instance. I see no mention of therubyracer in the scroll of gems...
The process successfully completes deploy:setup, rubber:collectd:bootstrap, deploy:setup, deploy:update_code, but upon deploy:finalize_update the callback triggered is bundle:install.
Invariably, the process stalls at this point. The /etc/hosts file does refer to the proper configuration (52.25.135.252 production.foo.com ec2-52-25-135-252.us-west-2.compute.amazonaws.com ip-172-[...]).
One oddity is that when trying to ssh into the instance
ssh -i aws-eb production.foo.com
or via the ec-2 user
ssh -i aws-eb ec2-user@ec2-52-25-135-252.us-west-2.compute.amazonaws.com
access is denied:
Permission denied (publickey).
for a key that I was using with Elastic Beanstalk until a few days ago and had inserted into the config/rubber/rubber.yml file.
I will attempt with a new key pair, but how can a key now be deemed public and unusable?
update
Setting up a new keypair does not alter the behaviour. The process is stuck at the same point, and I cannot ssh into the instance. production.foo.com does properly return what is configured up to this point: the nginx-on-Ubuntu welcome page.
As far as I can tell, having iterated about 10 times over this, the instance's memory is the issue.
The smallest instance that has not choked at this point is image_type: m3.medium. AMIs per instance type can be found here
The automatic suggestion of an m1.small in the vulcanization of the application is optimistic in my opinion.
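If you hit the same stall, the relevant change is a single line in config/rubber/rubber.yml (m3.medium is simply the smallest type that did not choke here; anything larger should also work):

```
# config/rubber/rubber.yml (fragment)
image_type: m3.medium
```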

AWS EC2: Instance from my own Windows AMI is not reachable

I am a Windows user and wanted to launch a spot instance from my own EBS-backed Windows AMI. For this I followed these steps:
I had my own on-demand instance with specific settings
Using the AWS console I used the "Create Image (EBS AMI)" option. It worked and the AMI was created successfully.
Then, using this new AMI, I launched a spot medium instance; it was created well and is now running with status checks passed.
After waiting an hour or more, I tried to connect to it using the Windows 7 RDC client, but it is not reachable, with the client's standard error that either the computer is not reachable or not powered on.
I have tried to achieve this goal and have created/deleted many volumes, instances, and snapshots, but I am still unsuccessful. Does anybody have a solution to this problem?
Thanks
Basically what's happening is that the existing administrator password (and other user authentication information) for Windows is only valid in the original instance, and can't be used on the new "hardware" that you're launching the AMI on (even though it's all virtualized).
This is why RDP connections will fail to newly launched instances, as will any attempts to retrieve the administrator password. Unfortunately you have no choice but to shut down the new instances you've been trying to connect to because you won't be able to do anything with them.
For various reasons the Windows administrator password cannot be preserved when running a snapshot of the operating system on different hardware (even virtualized hardware) - this is a big part of the reason why technologies like Active Directory exist, so that user authentication information is portable between machines and networks.
It sounds like you did all the steps necessary to get this working except for one - you never took any steps to cause a new password to be assigned to your newly-launched instances based on the original AMI you created.
To fix this, BEFORE turning your instance into a custom AMI that can be used to launch new instances, you need to (in the original instance) run the Ec2ConfigService Settings tool (found in the start menu when remoted into the original instance using RDP), and enable the option to generate a new password on next reboot. Save this setting change.
Then when you do create an AMI from the original instance, and use that AMI to launch new instances, they will each boot up to your custom Windows image but will choose their own random administrator password.
At this point you can go to your ec2 dashboard and retrieve the newly-generated password for the new instance based on the old AMI, and you'll also be able to download the RDP file used to connect to it.
One additional note is that Amazon warns that it can take upwards of 30 minutes for the retrieval of an administrator password after launching a new instance, however in my previous experience I've never had to wait more than a few minutes to be able to get it.
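If you prefer the CLI over the dashboard, the same retrieval can be sketched with the AWS CLI (the instance ID and key path below are placeholders):

```
aws ec2 get-password-data \
  --instance-id i-0123456789abcdef0 \
  --priv-launch-key /path/to/keypair.pem
```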
Your problem is most likely that the security group you used to launch the AMI does not have RDP (TCP port #3389) enabled.
When you launch the windows AMI for the first time, AWS will populate the quicklaunch with this port enabled. However, when you launch the subsequent AMI, you will have to ensure that this port is open for your security group.
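With the AWS CLI, opening RDP on the group can be sketched as follows (the group ID and CIDR are placeholders; ideally restrict the CIDR to your own IP rather than opening it wide):

```
aws ec2 authorize-security-group-ingress \
  --group-id sg-0123456789abcdef0 \
  --protocol tcp \
  --port 3389 \
  --cidr 203.0.113.0/24
```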

Cannot log in to custom AMI

I am trying to launch an instance that is found here...
https://aws.amazon.com/amis/aws-tools
The instance is launched, but when I try to log in, I get the following message:
ssh -i oct9.pem root@ec2-50-16-125-42.compute-1.amazonaws.com
Permission denied (publickey,gssapi-keyex,gssapi-with-mic).
If I launch a new instance using the built-in wizard, it works as expected with the same .pem key.
This AMI was working as expected until recently; I have used it before for a few instances. I would like to use this one because it has several utilities pre-installed.
When you produce a new image from a running instance, you end up getting locked out of the running instance. I'm not sure why, but you can then re-launch a new instance from the image you just created.
It's unclear whether or not this is the issue you're running into, though.
