Vagrant box broken after sleep/shutdown - vagrant

This has happened several times now, the scenario is as follows: I create/provision a vagrant box with puppet. I work on it for some time, a couple of days, sometimes a week. At the end of my day I either close the lid on my MacBook (putting it to sleep), or I shut it down. At a certain point, vagrant up gives an error:
[default] Mounting shared folders...
[default] -- v-root: /vagrant
The following SSH command responded with a non-zero exit status.
Vagrant assumes that this means the command failed!
mkdir -p /vagrant
Provisioning, reloading, and halt/up all fail at this point. I have to destroy and rebuild the box, which takes some time and has become very annoying.
I found this post, which describes this problem and states that an NTP service should fix it. So I've added that to my Puppet config, but the problem still occurs.
I also found a similar issue on GitHub, which has been fixed, but I'm running a different OS than the one described there, so it's not the same issue. I did post my problem there, but have had no response so far.
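For reference, the suggestion in that post amounts to keeping the guest clock in sync after the host sleeps; a minimal sketch of forcing a one-off resync by hand (assuming ntpdate is installed in the guest) would be:
vagrant ssh -c "sudo ntpdate -u pool.ntp.org"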
The debug log is saved as a gist: https://gist.github.com/pkruithof/5116426
Does anyone know what this problem might be, and how I can fix it?
UPDATE
I think this is fixed somewhere along the road in Vagrant, because I haven't had this issue in about 6 months now. Therefore I'm closing this question.

sudo vim /etc/NetworkManager/NetworkManager.conf
Add the following lines:
[keyfile]
unmanaged-devices=interface-name:vboxnet0
Then do vagrant reload.
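If the change doesn't take effect right away, a hedged follow-up (assuming a systemd-based host) is to restart NetworkManager and confirm that vboxnet0 is now listed as unmanaged before reloading the box:
sudo systemctl restart NetworkManager
nmcli device status
vagrant reload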

Related

Vagrant Windows 10 'hangs' on vagrant up

I've been having a problem with Vagrant (1.8.1, using VirtualBox 5.0.20) on Windows 10.
When I follow the getting started tutorial https://www.vagrantup.com/docs/getting-started/, after I have typed vagrant up my console is stuck on:
==> default: Waiting for machine to boot. This may take a few minutes...
default: SSH address: 127.0.0.1:2200
default: SSH username: vagrant
default: SSH auth method: private key
It does not continue. I can see the VM boot inside VirtualBox, and I can use the VirtualBox GUI to log in with the default credentials, so the VM itself is working.
According to https://www.vagrantup.com/docs/virtualbox/common-issues.html
I should run VirtualBox as admin and do vagrant up from a cmd.exe with admin rights, but when I do that I get the message:
There was an error while executing `VBoxManage`, a CLI used by Vagrant
for controlling VirtualBox. The command and stderr is shown below.
Command: ["modifyvm", "1b9d4f9b-04d8-48bf-8d16-d3aed99d341b", "--natpf1", "delete", "ssh"]
Stderr: VBoxManage.exe: error: Code E_FAIL (0x80004005) - Unspecified error (extended info not available)
VBoxManage.exe: error: Context: "LockMachine(a->session, LockType_Write)" at line 493 of file VBoxManageModifyVM.cpp
This seems different from the hundreds of posts all around the net, like this one:
https://github.com/Varying-Vagrant-Vagrants/VVV/issues/375
since I am not getting anything after the output listed above; it just sits there, and after about 10 minutes it comes up with the message:
Timed out while waiting for the machine to boot. This means that
Vagrant was unable to communicate with the guest machine within
the configured ("config.vm.boot_timeout" value) time period.
If you look above, you should be able to see the error(s) that
Vagrant had when attempting to connect to the machine. These errors
are usually good hints as to what may be wrong.
If you're using a custom box, make sure that networking is properly
working and you're able to connect to the machine. It is a common
problem that networking isn't setup properly in these boxes.
Verify that authentication configurations are also setup properly,
as well.
If the box appears to be booting properly, you may want to increase
the timeout ("config.vm.boot_timeout") value.
I've also read Vagrant stuck in "Waiting for VM to Boot" but it did not help me.
Is there anything else I am missing here?
In my case, vagrant up was hanging on 'Syncing VM folder', on Windows 7 with Vagrant 1.9.3 and VirtualBox 5.1.18. It turned out that Vagrant requires PowerShell >= 3.0.
I downloaded it from https://www.google.ca/search?q=powershell+3.0+download&ie=utf-8&oe=utf-8&client=firefox-b&gfe_rd=cr&ei=x0fdWLfsBubQXu2OorAD, and it worked fine afterwards.
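To check which PowerShell version is already installed before upgrading, you can run this from cmd.exe ($PSVersionTable exists in PowerShell 2.0 and later):
powershell -Command "$PSVersionTable.PSVersion"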
Try to turn off the VM from VirtualBox or from the command line:
C:\Progra~1\Oracle\VirtualBox\VBoxManage.exe controlvm default poweroff
then restart the VM from vagrant.
In case you get an error when powering off the VM, force the shutdown
C:\Progra~1\Oracle\VirtualBox\VBoxManage.exe startvm default --type emergencystop
Then vagrant up should work nicely.
I actually already found my problem. It was a .dll from some adware scanner that was preventing the VirtualBox VM from starting. Unfortunately I lost the link to the forum topic which helped me solve this.
What I did was open the logs from the VM in VirtualBox and have a read through. At some point, a line indicating an error appeared with a .dll name, which was the culprit. I deleted the offending .dll files from my PC and it was fixed.
If I find the link to the topic explaining exactly which .dll it was, I will post it here. I'm not at the machine that I fixed the problem on right now, so I can't access my search history.
Hope it will work for you as it worked for me.
I'm still investigating why, but as a solution it works.
In our case, when we typed "vagrant up" in cmd (inside the Vagrant image directory), it opened the VirtualBox VM and got stuck on "default: SSH auth method: private key", as mentioned in the question.
We fixed it with these steps:
Open VirtualBox manually (besides the one already opened by vagrant up).
Run the VM that was added to the list (by vagrant up).
Open CMD.
Type "vagrant ssh".
And it will work.
Hope it helped, best regards.

vagrant up stuck on mount nfs

When I attempt to initiate 'vagrant up' the script executes as normal until it gets to the last line, where NFS shared drives are mounted.
I have tried deleting the exports file in /etc/ followed by an nfsd restart and vagrant destroy / vagrant up, but to no avail.
After a considerable amount of time, the console outputs the following [certain details redacted]:
==> default: Mounting NFS shared folders...
The following SSH command responded with a non-zero exit status. Vagrant assumes that this means the command failed!
mount -o 'nolock,vers=3,udp,noatime' XXX.XXX.XX.X:'/Users/dhatton/Google Drive/moodle-doodle/site' /var/www/site
Stdout from the command:
Stderr from the command:
mount.nfs: Connection timed out
UPDATE
The above problem was encountered when using a VPN into the office network. Upon logging in on-site without the VPN, everything works again.
For macOS Monterey 12.1 with VirtualBox 6.1.30 and Vagrant 2.2.19/2.2.18:
Create a vbox folder in /etc.
Create a file inside /etc/vbox named networks.conf.
Add the following inside networks.conf (the same steps are sketched as shell commands below):
* 0.0.0.0/0 ::/0
Note: if you get the IP address range error, add your IP range here too.
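A minimal sketch of those steps from a terminal on the macOS host (assuming an administrator account; mkdir, tee and sudo are standard tools):
sudo mkdir -p /etc/vbox
echo '* 0.0.0.0/0 ::/0' | sudo tee /etc/vbox/networks.conf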
I had a similar issue. I searched a lot and tried the following solutions:
Check the /etc/exports and /etc/hosts files; if there are invalid entries, remove them.
Check that your firewall is not blocking access.
Restart the NFS service (a macOS example is sketched after this list).
Install the vagrant-vbguest plugin: vagrant plugin install vagrant-vbguest
Do vagrant reload --provision.
Reboot your PC.
Reinstall Vagrant.
For me, reinstalling Vagrant worked.
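A minimal sketch of the exports/NFS checks above on a macOS host (assuming the stock nfsd and showmount tools; the Linux equivalents differ):
cat /etc/exports              # look for stale or malformed entries
showmount -e localhost        # confirm what is actually exported
sudo nfsd restart             # restart the NFS service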
I've run across this before and the problem turned out to be related to my company's VPN. If I tried running vagrant up while connected to the VPN, it would hang on mounting NFS, but if I disconnected from the VPN and tried again it worked. Once it was running I could reconnect to the VPN. It probably goes back to it needing a stable connection.
Assuming you are trying to mount from guest to host (the host being OS X?), try mounting to a different path. You might be encountering issues with the space in 'Google Drive'.
Vagrant downloads binaries from its cloud while configuring a VM, so a stable internet connection is needed. In fact, an internet connection is necessary for using most HashiCorp products.

how to unlock a vagrant machine while it is being provisioned

Our vagrant box takes ~1 hour to provision, so when vagrant up is run for the first time, at the very end of the provisioning process I would like to package the box into an image in a local folder so it can be used as a base box the next time it needs to be rebuilt. I'm using the vagrant-triggers plugin to place the code right at the end of the :up process.
Relevant (shortened) Vagrantfile:
pre_built_box_file_name = 'image.vagrant'
pre_built_box_path = 'file://' + File.join(Dir.pwd, pre_built_box_file_name)
pre_built_box_exists = File.file?(pre_built_box_path)

Vagrant.configure(2) do |config|
  config.vm.box = 'ubuntu/trusty64'
  config.vm.box_url = pre_built_box_path if pre_built_box_exists

  config.trigger.after :up do
    if not pre_built_box_exists
      system("echo 'Building gett vagrant image for re-use...'; vagrant halt; vagrant package --output #{pre_built_box_file_name}; vagrant up;")
    end
  end
end
The problem is that vagrant locks the machine while the current (vagrant up) process is running:
An action 'halt' was attempted on the machine 'gett',
but another process is already executing an action on the machine.
Vagrant locks each machine for access by only one process at a time.
Please wait until the other Vagrant process finishes modifying this
machine, then try again.
I understand the dangers of two processes provisioning or modifying the machine at any given time, but this is a special case where I'm certain the provisioning has completed.
How can I manually "unlock" vagrant machine during provisioning so I can run vagrant halt; vagrant package; vagrant up; from within config.trigger.after :up?
Or is there at least a way to start vagrant up without locking the machine?
vagrant
This issue has been fixed in GH #3664 (2015). If this is still happening, it's probably related to plugins (such as AWS), so try without plugins.
vagrant-aws
If you're using AWS, then follow this bug/feature report: #428 - Unable to ssh into instance during provisioning, which is currently pending.
However there is a pull request which fixes the issue:
Allow status and ssh to run without a lock #457
So apply the fix manually, or wait until it's fixed in the next release.
If you get this error for machines which aren't valid, try running the vagrant global-status --prune command.
Definitely a bit more of a hack than a solution, but I'd rather a hack than nothing.
I ran into this issue and nothing that was suggested here was working for me. Even though this is 6 years old, it's what came up on a Google search (along with precious little else), so I thought I'd share what solved it for me in case anyone else lands here.
My Setup
I'm using Vagrant with the ansible-local provisioner on a local VirtualBox VM, which provisions remote AWS EC2 instances (i.e. ansible-local runs on the VirtualBox instance, Vagrant provisions the VirtualBox instance, and Ansible handles the cloud). This setup is largely because my host OS is Windows and it's a little easier to take Microsoft out of the equation on this one.
My Mistake
I ran an Ansible shell task with a command that doesn't terminate without user input (and did not run it with & to put it in the background).
My Frustration
Even in the Linux subsystem, trying ps aux | grep ruby or ps aux | grep vagrant was unhelpful because the PID would change every time. There's probably a reason for this, likely something to do with how the subsystem works, but I don't know what it is.
My Solution
Just kill the AWS EC2 instances manually, in the console or with the CLI, pick your flavor. The terminal where you were running vagrant provision or vagrant up should then finally complete and spit out the summary output, even if you Ctrl+C'd out of the command.
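For example, with the AWS CLI (a hedged sketch; the instance ID is a placeholder for whatever the stuck run created):
aws ec2 terminate-instances --instance-ids i-0123456789abcdef0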
Hoping this helps someone!

VM has become 'inaccessible' - Vagrant no longer working

For some reason this morning when I run 'vagrant up' I get the following error (this has worked absolutely fine for over a year)
Your VM has become "inaccessible". Unfortunately, this is a critical error with VirtualBox that Vagrant can not cleanly recover from. Please open VirtualBox and clear out your inaccessible virtual machines or find a way to fix them.
I could try removing my existing .vagrant folder and doing a vagrant up but that will take forever on our very slow internet speeds - can anyone suggest how to fix this quickly?
This works for me:
In my "C:\Users\{user}\VirtualBox VMs\{vm-id}" folder there are two files:
{vm-id}.vbox-prev
{vm-id}.vbox-tmp
Renaming "{vm-id}.vbox-tmp" to "{vm-id}.vbox" solved my problem, and I can call "vagrant up" again.
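For example, from a Windows command prompt, using the same placeholders as above:
ren "C:\Users\{user}\VirtualBox VMs\{vm-id}\{vm-id}.vbox-tmp" "{vm-id}.vbox"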
You can simply delete the .vagrant folder from your project folder and run vagrant up again.
This worked for me
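On a Unix-like host that amounts to the following (note that this discards Vagrant's local state for the project, so the machine will be re-imported or re-created):
rm -rf .vagrant
vagrant up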
After some digging through the debug output, I discovered that even though the actual VM is intact (I can load and run it from the VirtualBox GUI app), somewhere in its guts VirtualBox flagged this VM as "inaccessible". Vagrant, rightly believing what it's told, spits out the error message.
After looking at VBoxManage's help, I found that one of its commands, list vms, unsurprisingly lists all of the VMs registered with VirtualBox:
$ /cygdrive/c/Program\ Files/Oracle/VirtualBox/VBoxManage.exe list vms
"precise64" {3613de48-6295-4a91-81fd-36e936beda4b}
"<inaccessible>" {2568227e-e73d-4056-978e-9ae8596493d9}
"<inaccessible>" {0fb42965-61cb-4388-89c4-de572d4ea7fc}
"<inaccessible>" {c65b1456-5771-4617-a6fb-869dffebeddd}
"<inaccessible>" {9709d3d5-ce4d-42b9-ad5e-07726823fd02}
One of those VMs flagged as inaccessible is my lost VM! Time to fix VBoxManage's wagon by unregistering the inaccessible VM, then re-registering it with the correct name:
Open the configuration file for your lost VM. Mine was saved to C:\cygwin\home\Philip\VirtualBox VMs\rails-vm-v2\rails-vm-v2.vbox
Find and copy the value of the uuid attribute of the Machine node. Mine was 9709d3d5-ce4d-42b9-ad5e-07726823fd02.
In a Windows command prompt (or Cygwin terminal), unregister the VM with the unregistervm command, using the [uuid] value from step 2:
$ "C:\Program Files\Oracle\VirtualBox\VBoxManage.exe" unregistervm [uuid]
Now register the VM using the registervm command, with the path to the VM configuration file:
$ "C:\Program Files\Oracle\VirtualBox\VBoxManage.exe" registervm "C:\cygwin\home\Philip\VirtualBox VMs\rails-vm-v2\rails-vm-v2.vbox"
Now you should be able to start the VM as expected.
Source :
http://www.psteiner.com/2013/04/vagrant-how-to-fix-vm-inaccessible-error.html
Nothing here worked for me.
I deleted (or renamed, see the first comment) all files from
C:\Users\[YourNameHere]\.VirtualBox
Then ran vagrant again:
vagrant up
Now it's up.
VirtualBox Manager will likely give you a bit more useful information, for example in my case it reported that the .vbox file did not exist.
After taking a look, the problem was indeed that the file didn't exist - something had renamed it to x.vbox-tmp (shutting the PC down with the VM still running, maybe?).
I copied the x.vbox-prev file to x.vbox and tried booting the VM again and everything worked fine.
Find the one which is inaccessible with one of the following commands:
$ vagrant global-status
or:
$ VBoxManage list vms
Then note the GUID and remove it from the VirtualBox.xml file (OS X: ~/Library/VirtualBox/VirtualBox.xml, Windows: %HOME%/.VirtualBox).
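To locate the stale entry before editing (a minimal sketch for OS X; substitute the GUID you noted for the placeholder):
grep -n 'PASTE-GUID-HERE' ~/Library/VirtualBox/VirtualBox.xml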
Alternatively, remove the .vagrant folder from the folder where your VM is and start from scratch (vagrant up).
See also: Cannot Delete "Inaccessible" virtual machines from Virtualbox GUI at VirtualBox
If someone happens to delete your VM from the VirtualBox VMs folder manually, your VM will also become inaccessible. In that case you will not be able to get the machine back, but Vagrant will still show the VM in its list. To remove it completely from the list, go to
\.vagrant.d\data\machine-index
and open the index file. Delete the reference to the inaccessible machine. The next time you run the command below, it will no longer show the inaccessible machine.
vagrant global-status --prune
My problem was the same, but the fix was quite different... my VMs are stored on a network drive, accessible by NFS share. The remote drive had failed to come up after a reboot, so the VMs weren't accessible.
It took me a while to realise the reason, and in the meantime I hunted all over SO without finding a solution.
Then I realised, facepalmed, mounted the paths, and it all worked.
So in a nutshell, it was a path issue.
I felt I should include it here in case it helps someone in the same boat.
Using the command line, you can remove all inaccessible boxes with a one-liner:
VBoxManage list vms |grep inaccessible |cut -d "{" -f2 |cut -d "}" -f1 |xargs -L1 VBoxManage unregistervm
See https://phz.fi/?p=8422
I had to rename [vm-id].vbox-tmp (in the VirtualBox VMs folder) to [vm-id].vbox. After that, without deleting the .vagrant folder, I could run vagrant up and it worked very well.
On Linux the following will unregister the machines:
VBoxManage list vms
VBoxManage unregistervm <inaccessible machine UID>
After that you may want to restart VB services:
sudo /sbin/vboxconfig
Deleting the .vagrant folder may also help, but then you have to rebuild the machines.
I also had this problem.
After I changed VirtualBox's machine directory and restarted macOS, VirtualBox reported all VMs as inaccessible.
My solution:
Move the VirtualBox VMs back to the default directory.
Remove all inaccessible VMs from the GUI, then register the VMs from the default path and run them again,
or
vagrant up

Flag/Mark Vagrant Machine As Provisioned

I don't want to run provisioning on my Vagrant (VirtualBox) machine. I want to run vagrant up and have the machine marked as "provisioned", even though the machine is not actually provisioned yet. I just want to mark it as "provisioned".
Is this possible? Perhaps there is some file I can edit in .vagrant?
It seems Vagrant looks for the existence of a file:
.vagrant/machines/[machine-name]/[provider]/action_provision
However, it seems that there is more logic to that in
[vagrant-install-path]/lib/vagrant/action/builtin/provision.rb
You can start investigating from there to see what exactly Vagrant needs in order to consider a machine provisioned.
I personally didn't have time to look more into it since I fixed my issue with a workaround :).
I started the machine with vagrant up so that the chef_solo provisioner started running, then hit CTRL+C twice (so that Chef says "exiting without cleanup"); this helped get the VM marked as provisioned, so that it could be started without the --no-provision flag.
Hoping this is of some help.
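For reference, the sentinel file mentioned in the first answer can be inspected like this (a minimal sketch assuming a machine named "default" on the virtualbox provider; the exact contents vary between Vagrant versions):
ls .vagrant/machines/default/virtualbox/action_provision
cat .vagrant/machines/default/virtualbox/action_provision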
