Vagrant up and its non-deterministic issues

OK, I've started to use Vagrant because a client uses it. Getting all the pieces together and making them work was truly painful... and now I'm facing a different kind of issue.
When I run vagrant up, I get one of three different results:
VM doesn't boot properly
Timed out while waiting for the machine to boot. This means that
Vagrant was unable to communicate with the guest machine within
the configured ("config.vm.boot_timeout" value) time period.
If you look above, you should be able to see the error(s) that
Vagrant had when attempting to connect to the machine. These errors
are usually good hints as to what may be wrong.
If you're using a custom box, make sure that networking is properly
working and you're able to connect to the machine. It is a common
problem that networking isn't setup properly in these boxes.
Verify that authentication configurations are also setup properly,
as well.
If the box appears to be booting properly, you may want to increase
the timeout ("config.vm.boot_timeout") value.
MySQL dies
default: ERROR 2006 (HY000) at line 1118: MySQL server has gone away
or
default: ERROR 1034 (HY000) at line 3720: Index for table 'ci_sessions' is corrupt; try to repair it
Works properly
I have to run vagrant up and vagrant destroy back to back until the machine decides to boot and build properly. It's weird and I have no explanation for it; bringing up the environment is a gamble, and it costs me a lot of time before it works.
If someone knows anything about Vagrant or this strange issue, any help would be much appreciated.
PS:
My Vagrantfile configures the VirtualBox provider with
config.vm.provider "virtualbox" do |v|
  v.memory = 4096
  v.cpus = 3
end
The databases being loaded are huge (one is 5 GB), and I tried to include a sed command to raise max_allowed_packet, without success.
sudo sed -i "s#[client-server]#[client-server]\n\n[mysqld]\nmax_allowed_packet=256M#" /etc/my.cnf
I got this error:
sed: -e expression #1, char 33: unterminated `s' command
It runs fine when I execute it inside vagrant ssh, but I'm not sure where to put it in my Vagrantfile.

I would speculate that a lot is going on in the VM during startup that either eats up resources or simply takes too long. You already point out that you give the VM only 4GB of RAM, but the database you are trying to load is 5GB, so the VM is likely thrashing the swap partition. That MySQL dies also suggests that you are not giving the VM the resources it requires.
Even if vagrant can't establish an SSH connection and times out, can you ssh into the box after a while (check "vagrant ssh-config")?
Increase the RAM, and investigate (perhaps you can ask whoever packaged the box) what kind of services this VM launches as part of the boot process. Monitor the VM with whatever facilities the guest OS provides; you should be able to see what's going on in the VirtualBox GUI as well. Press whatever keys the guest OS uses for that purpose so that you don't just see some opaque boot screen.
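One way to check the swap theory from inside the guest (for example after vagrant ssh) is to look at how much swap is actually in use. This is only a minimal sketch, assuming a Linux guest that exposes /proc/meminfo; the function name swap_used_kb is made up for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print the amount of swap currently in use, in kB.
# Reads a meminfo-style file (default: /proc/meminfo on a Linux guest).
swap_used_kb() {
  local meminfo="${1:-/proc/meminfo}"
  awk '/^SwapTotal:/ { total = $2 }
       /^SwapFree:/  { free = $2 }
       END           { print total - free }' "$meminfo"
}

# Only report when the file exists, so the script is harmless elsewhere.
if [ -r /proc/meminfo ]; then
  echo "swap in use: $(swap_used_kb) kB"
fi
```

A value that keeps growing while the dump is loading would support the thrashing theory. Running vmstat 1 in the guest shows the same thing live in its si/so columns.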
As for your sed command, you can use a shell provisioner for that: https://www.vagrantup.com/docs/provisioning/shell; make sure you escape things properly when passing the command line via the inline attribute.
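To make the escaping advice concrete, here is a sketch of the sed edit as a standalone script that the shell provisioner could run through its path option instead of inline. Some assumptions: the guest has GNU sed (for -i and for \n in the replacement), the file name provision-mysql.sh and the function name are invented for illustration, and the guess about the error is only a guess. A plausible cause of "unterminated `s' command" is that \n inside a double-quoted Ruby string becomes a real newline before sed ever sees it; keeping the command in its own file sidesteps that. The brackets are also escaped here, because an unescaped [client-server] is a bracket expression rather than the literal section header.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: add a [mysqld] section that sets max_allowed_packet
# right after the literal "[client-server]" header of a my.cnf-style file.
add_max_allowed_packet() {
  local cnf="$1"
  # Do nothing if the setting is already there, so re-provisioning is safe.
  if grep -q '^max_allowed_packet=' "$cnf"; then
    return 0
  fi
  # \[ and \] match the header literally; single quotes keep the \n
  # sequences intact until GNU sed turns them into newlines.
  sed -i 's#^\[client-server\]$#[client-server]\n\n[mysqld]\nmax_allowed_packet=256M#' "$cnf"
}

# Run only when a config path is passed, e.g. by the Vagrant provisioner.
if [ "$#" -gt 0 ]; then
  add_max_allowed_packet "$1"
fi
```

In the Vagrantfile this could be wired up with something like config.vm.provision "shell", path: "provision-mysql.sh", args: ["/etc/my.cnf"]. The shell provisioner runs as root by default, so sudo is not needed, and MySQL has to be restarted afterwards for the new value to take effect.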

Related

How can I allow a private insecure registry to be used inside a minikube node?

I know there are about a thousand answers to various permutations of this question but none of the fifteen or so that I've tried have worked.
I'm running on macOS Sierra and using Minikube 0.17.1 as well as kubectl 1.5.3.
We run our own private Docker registry that is insecure as it is not open to the internet. (This is not my choice or in my control so it's not open for discussion). This is my first foray into Kubernetes and actually container orchestration altogether. I also have a very intermediate level of knowledge about Docker in general so I'm drowning in terminology/platform soup here.
When I execute
kubectl run perf-ui --image=X.X.X.X/performance/perf-ui:master
I see
image pull failed for X.X.X.X/performance/perf-ui:master, this may be because there are no credentials on this request. details: (Error response from daemon: Get https://X.X.X.X/v1/_ping: dial tcp X.X.X.X:443: getsockopt: connection refused)
We have an Ubuntu box that accesses the same registry (not using Kubernetes, just Docker) that works just fine. This is likely due to
DOCKER_OPTS="--insecure-registry X.X.X.X"
being in /etc/default/docker.
I made a similar change using the UI of Docker for Mac. I don't know which config file this change was persisted to. After this change a docker pull worked on my laptop!!! Again, this is just using Docker, not Kubernetes. The interesting part is that I got the same "connection refused" error (as it tries to access via HTTPS) on my Mac as I get in the Minikube VM, and after the change the pull worked. I feel like I'm on to something there.
After sshing into minikube (the VM created by minikube start) using
minikube ssh
I added the following content to /var/lib/boot2docker/profile
export EXTRA_ARGS="$EXTRA_ARGS --insecure-registry 10.129.100.3"
export DOCKER_OPTS="$DOCKER_OPTS --insecure-registry 10.129.100.3"
As you can infer, nothing has worked. I know I've tried other things but they too have failed.
I know this isn't the most comprehensive explanation but I've been digging into this for the past 4 hours.
The bottom line is docker pulls work from our Ubuntu box with the config file setup correctly and from my Mac with the setting configured properly.
How can I enable the same setting in my "Linux 2.6" VM that was created by Minikube?
If someone knows the answer I would be forever grateful.
Thank you in advance!
Thank you to Janos for your alternative solution. I'm confident that is the right choice for some use cases.
It turns out that what I needed was a good night's sleep and the following command to start Minikube in the first place:
minikube start --insecure-registry="X.X.X.X"
@intelfx says that adding a port won't be necessary. I'm inclined to believe them, but if your registry is on a non-standard port, just keep it in mind in case things still aren't working.
In the end it was, in fact, a matter of telling Docker to use an insecure registry but it was not clear how to tell this to Docker when I was not controlling it directly.
I know that seems simple but after you've tried a hundred things you're almost hallucinating so you're not in a great state to make rational decisions. I'm sorry for the dumb post but I'm willing to bet this will help at least one person one day, which makes it worth it.
Thanks SO!
The --insecure-registry flag doesn't work on an existing cluster on macOS. You need to run minikube delete; it's not enough just to stop the cluster with minikube stop.
I spent plenty of time figuring this out, and then I found this comment at https://github.com/kubernetes/minikube/issues/604:
the --insecure-registry flag is ignored if the
machine already existed (even if it is stopped). You must first
minikube delete if you want new flags to be respected.
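Putting the two answers together, the sequence is delete first, then start with the flag. The sketch below only illustrates that order. The DRY_RUN wrapper and the REGISTRY variable are assumptions added here so the script can be inspected safely; they are not minikube features. Only minikube delete and minikube start --insecure-registry come from the thread, and X.X.X.X is still a placeholder for your registry address.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Placeholder address; substitute your own registry (assumption).
REGISTRY="${REGISTRY:-X.X.X.X}"

# Illustrative wrapper: with DRY_RUN=1 (the default here) commands are
# printed instead of executed. Set DRY_RUN=0 to really run them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

recreate_minikube() {
  # The existing machine ignores new flags, so it has to go first.
  run minikube delete
  run minikube start "--insecure-registry=${REGISTRY}"
}

recreate_minikube
```

Keep in mind that minikube delete throws away the existing cluster and everything deployed to it, so anything you care about has to be redeployed afterwards.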
You can use kube-registry-proxy from (needs some configuration):
https://github.com/kubernetes/kubernetes/blob/master/cluster/saltbase/salt/kube-registry-proxy/kube-registry-proxy.yaml
Then you can refer to localhost:5050 as your registry. The trick is that localhost is allowed as an insecure registry by default.

How does Vagrant get information about the VM?

I have been using Vagrant for some time now and have come across "bugs" that I would like to understand in more depth.
The situation I encountered was this: when provisioning a base box that I had created, the /etc/exports file was loaded with the wrong IP. The file was given the IP the base box had while I was creating it. Once I packaged the base box, every vagrant up and provision produced an error, because Vagrant used the IP the base box had, while the new box I was spinning up had a different (random) IP that I wanted Vagrant to use.
So my question is: where does Vagrant or VirtualBox hold that data about the base box? Why does it sometimes "stick" like that and sometimes not? And once I understand where that data is stored, is there any more data held there that I can manipulate?

Where does Cloudera Manager get the --hostname value for Impala commands?

I am working through the process for activating Kerberos on the Cloudera quick-start VM. The VM begins life with hostname = "quickstart.cloudera", but I had to change it to get it into our local DNS consistently. After changing the name I was able to get everything except Impala to come up. The manager is passing it --hostname=quickstart.cloudera even though everything else in the whole system knows the new name. I don't strictly have to have Impala running for the tests I need to run, but it's driving me nuts. Any clues?
I'm looking at a relatively fresh install of CM 5.3 with Impala 2.1 and I don't see the hostname being passed to the catalog server via cmdline flags.
Unless you're explicitly setting the hostname parameter in CM via a safety-valve configuration (I assume you're not doing this), Impala gets the hostname to use by calling the gethostname() stdlib function (see the gethostname() man page). The log output snippet you showed is somewhat deceiving, because when the flag isn't set, Impala actually sets that value itself and it shows up as if it had been passed by the user.
Anyway, you should check that the hostname is properly changed on the box, which may depend on your OS. A few things to try: check that the hostname command returns the name you expect and that you've restarted the OS.
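The hostname check can be scripted so it is easy to repeat after each change or reboot. This is a minimal sketch; the function name hostname_matches is made up, and the expected name is whatever you renamed the VM to. It relies on the hostname command reporting the same name that gethostname() returns, which should hold on a typical Linux box but is worth verifying on yours.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: succeed only if the system hostname equals $1.
hostname_matches() {
  local expected="$1"
  local actual
  # Fall back to uname -n on systems without a hostname command.
  actual="$(hostname 2>/dev/null || uname -n)"
  if [ "$actual" = "$expected" ]; then
    echo "ok: hostname is $actual"
  else
    echo "mismatch: hostname is $actual, expected $expected" >&2
    return 1
  fi
}

# Run the check only when an expected name is passed on the command line.
if [ "$#" -gt 0 ]; then
  hostname_matches "$1"
fi
```

If this still reports quickstart.cloudera after a reboot, the rename did not persist at the OS level, and Impala would be expected to keep picking up the old name.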

Load data onto an ec2 instance with no associated key-pair (generated by NotebookCloud)

I'm trying to run iPython notebook in an Amazon ec2 instance (I'm using the free tier, if that makes any difference), using NotebookCloud (https://notebookcloud.appspot.com/) to handle the iPython notebook interface. However, the code I want to run in the notebook needs access to a variety of datafiles and supplemental python files. When NotebookCloud generates a new ec2 instance, it doesn't assign a key-pair to it, and I can't find a way to make it do so. As far as I can tell from other questions, there's no way to SSH into an instance if it doesn't have an associated key-pair. Is there still some sneaky way to get data onto the instance though?
Okay, I figured it out. I put the data on an EBS volume and attached it to the instance. Since iPython lets you send commands directly to the operating system by prefacing them with "!", it was then possible to mount the volume on the instance as specified here: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-using-volumes.html.
Earlier I also tried just enabling x-forwarding and installing a browser on the instance and running ipython notebook in that browser. This, however, proved to be painfully slow, although that might have just been because I was using a micro instance, which I have since decided is not remotely large enough for my task.
I recommend using ssh with port forwarding instead of putting the IPython Notebook on the internet or using X forwarding (e.g. ssh -i private_key user@ip_address -L 8889:localhost:8888).
Point your browser to http://localhost:8889 for your remote IPython Notebook

Installed Zone Alarm on Amazon EC2 Windows Instance and cannot access now. How do I fix this?

I messed this up.
I installed ZoneMinder and now I cannot connect to my VPS via Remote Desktop; it has probably blocked the connections. I didn't know it would start blocking right away instead of letting me configure it first.
How can I solve this?
Note: My answer is under the assumption this is a Windows instance due to the use of 'Remote Desktop', even though ZoneMinder is primarily Linux-based.
Short answer is you probably can't and will likely be forced to terminate the instance.
But at the very least you can take a snapshot of the hard drive (EBS volume) attached to the machine, so you don't lose any data or configuration settings.
Without network connectivity your server can't be accessed at all, and unless you've installed other services on the machine that are still accessible (e.g. ssh, telnet) that could be used to reverse the firewall settings, you can't make any changes.
I would attempt the following in this order (although they're longshots):
Restart your instance using the AWS Console (maybe the firewall won't be enabled by default on reboot and you'll be able to connect).
If this doesn't work (which it shouldn't), you're going to need to stop your crippled instance, detach the volume, spin up another ec2 instance running Windows, and attach the old volume to the new instance.
Here's the procedure with screenshots of the exact steps, except your specific steps to disable the new firewall will be different.
After this is done, you need to find instructions on manually uninstalling your new firewall.
Take a snapshot of the EBS volume attached to it to preserve your data (essentially the C: drive); it appears on the EC2 console page under the 'volumes' menu item. This way, at least, you don't lose any data.
Start another Windows EC2 instance, and attach the EBS volume from the old one to this one. RDP into the new instance and attempt to manually uninstall the firewall.
At a minimum at this point you should be able to recover your files and service settings very easily into the new instance, which is the approach I would expect you to have more success with.
