Docker and file sharing on OS X

Ok. I am playing around with different tools to prepare a dev environment. Docker is a nice option. I created the whole dev environment in Docker and can build a project in it.
The source code for this project lives outside the Docker container (on the host). This way you can use your IDE to edit it and use Docker just to build it.
However, there is one problem:
a) Docker on OS X uses a VM (a VirtualBox VM)
b) File sharing is reasonably slow (way slower than file I/O on the host)
c) The project has something like a gazillion files (which exacerbates problems (a) and (b)).
If I move the source code into the Docker container, I will have the same problem in the IDE (it will have to access the shared files, and it will be slow).
I heard about some workarounds to make it fast. However, I can't seem to find any information on this subject.
Update 1
I used the Docker file-sharing feature (meaning I run):
docker run -P -i -v <VMDIR>:<DOCKERDIR> -t <imageName> /bin/bash
However, sharing between the VM and Docker isn't a problem; it's fast.
The bottleneck is sharing between the host and the VM.

The workaround I use is not to use boot2docker but instead to have a Vagrant VM provisioned with Docker. There is no such big penalty for mounting folders host -> Vagrant -> Docker.
On the downside, I have to pre-map folders to Vagrant (basically my whole work directory) and pre-expose a range of ports from the Vagrant box to the host so I can access the Docker services directly from there.
On the plus side, when I want to clean up unused Docker garbage (images, volumes, etc.), I simply destroy the Vagrant VM and re-create it again :)
Elaboration
Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
  config.vm.box = "trusty-docker"
  config.vm.box_url = "https://oss-binaries.phusionpassenger.com/vagrant/boxes/latest/ubuntu-14.04-amd64-vbox.box"
  config.vm.provision "docker"

  # By default we'll claim ports 9080-9090 on the host system
  for i in 9080..9090
    config.vm.network :forwarded_port, guest: i, host: i
  end

  # NB: this folder mapping will not have the boot2docker issue of slow sync
  config.vm.synced_folder "~/work", "/home/vagrant/work"
end
Having that:
host$ vagrant up && vagrant ssh
vagrant$ docker run -it --rm -v $(pwd)/work:/work ubuntu:12.04 find /work

This is unfortunately a typical problem that Windows and OS X users are currently struggling with, and one that cannot be solved trivially, especially in the case of Windows users. The main culprit is VirtualBox's vboxfs, which is used for file sharing and which, despite being incredibly useful, results in poor filesystem I/O.
There are numerous situations in which developing the project sources inside the guest VM is brought to a crawl, the main two being the scores of third-party sources introduced by package managers and Git repositories with a sizable history.
The obvious approach is to move as much of the project-related files as possible out of vboxfs and somewhere else into the guest. For instance, symlinking the package manager directory into the project's vboxfs tree, with something like:
mkdir /var/cache/node_modules && ln -s /var/cache/node_modules /myproject/node_modules
This alone improved the startup time from ~28 seconds down to ~4 seconds for a Node.js application with a few dozen dependencies running on my SSD.
Unfortunately, this is not applicable to managing Git repositories, short of splatting/truncating your history and committing to data loss, unless the Git repository itself is provisioned within the guest. That forces you to have two repositories: one to clone the environment for inflating the guest and another containing the actual sources, and consolidating the two worlds becomes an absolute pain.
The best way to approach the situation is to either:
drop vboxfs in favor of a shared transport mechanism that results in better I/O in the guest, such as the Network File System (NFS). Unfortunately, for Windows users, the only way to get NFS service support is to run the Enterprise edition of Windows (which I believe will still be true for Windows 10).
revert to mounting raw disk partitions into the guest, noting the related risks of giving your hypervisor raw disk access
If your developer audience is wholly composed of Linux and OS X users, option 1 might be viable: create a Vagrant machine, configure NFS shares between your host and guest, and profit. If you do have Windows users then, short of buying them an enterprise license, it would be best to simply ask them to repartition their disks and work inside a guest VM.
I personally use a Windows host and have a 64 GB partition on my SSD that I mount directly into my Arch Linux guest and operate from there. I also switched to GPT and UEFI and have an option to boot directly into Arch Linux in case I want to circumvent the overhead of the virtualized hardware, giving me the best of both worlds with little compromise.

Two steps can improve your performance quite well:
Switch to NFS. You can use this script to do this.
Switch your VBox NIC driver to PCnet-FAST III. You can do this on your default machine by running:
VBoxManage modifyvm default --nictype1 Am79C973
VBoxManage modifyvm default --nictype2 Am79C973

I run a simple watch script that kicks off an rsync inside my container(s) from the shared source-code volume to a container-only volume whenever anything changes. My entry point only reads from the container-only volume, so that avoids the performance issues, and rsync works really well, especially if you set it up correctly to exclude things like .git folders and libraries that don't change frequently.
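A minimal sketch of such a watch script, assuming rsync and inotify-tools are installed in the container; the paths and exclude list here are illustrative, not from the original setup:

```shell
# Sync the slow shared volume into a fast container-only volume,
# then re-sync on every filesystem change event.
SRC=/shared-src    # hypothetical: the host-shared (slow) volume
DST=/app-src       # hypothetical: the container-only (fast) volume

rsync -a --delete --exclude '.git' --exclude 'node_modules' "$SRC/" "$DST/"
while inotifywait -r -e modify,create,delete,move "$SRC"; do
  rsync -a --delete --exclude '.git' --exclude 'node_modules' "$SRC/" "$DST/"
done
```

The entry point then reads only from the fast volume, so the slow shared mount is touched by nothing but rsync.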

A couple of pieces of info which I found:
I started to use Vagrant to work with VirtualBox and install Docker in it. It gives more flexibility.
The default sharing in Vagrant VirtualBox provisioning is very slow.
NFS sharing is much faster. However, it can still be reasonably slow (especially if your build process creates files which need to be written back to this share).
Vagrant 1.5+ has an rsync option (it uses rsync to copy files from host to VM). It's faster because it doesn't have to write back any changes.
This rsync option can auto-sync (to continuously keep the folders in sync).
This rsync option consumes a lot of CPU and people came up with a gem to overcome it (https://github.com/smerrill/vagrant-gatling-rsync)
So, Vagrant + VirtualBox + Rsync shared folder + auto rsync + vagrant gatling looks like a good option for my case (still researching it).
I tried vagrant gatling. However, it results in non-deterministic behavior. I never know whether the new files were copied into the VM or not. It wouldn't be a problem if it took 1 second; however, it may take 20 seconds, which is too much (a user can switch windows and start a build before the new files are synced).
Now, I am thinking about some way to copy over ONLY the files which changed. I am still in the research phase. The idea would be to use FSEvents to listen for file changes and send over only the changed files. It looks like there are some tools around which do that.
BTW, Gatling internally uses FSEvents. The only problem is that it triggers a full rsync (which goes and starts comparing dates/times and sizes for 40k files).
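For example, fswatch (an FSEvents wrapper on OS X) can be paired with rsync so that a sync is triggered only when something actually changes; the host, port, and paths below are hypothetical:

```shell
# -o batches FSEvents into one line per change burst; each line
# triggers a single rsync over the Vagrant box's SSH port.
fswatch -o ~/work | while read -r _; do
  rsync -az --delete -e 'ssh -p 2222' ~/work/ vagrant@localhost:/home/vagrant/work/
done
```

Note that this still runs a full rsync scan per change burst; a tool that forwards the changed paths themselves would avoid even that.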

Related

Can I create a volume on my windows hyperv docker installation

I need some straight answers about this, as the current Docker info and general web info mixes Hyper-V and VMware info up.
I have installed Docker on my Windows 10 Pro machine. I do not have VMware/VirtualBox installed; I don't need them since I have Hyper-V. I can use Docker on a Linux Ubuntu box fairly well and I (think!) I understand volumes. Sorry for the background...
I am developing a Node app and I simply want to have a volume within my Linux container mapped to a local directory on my Windows machine. This should be simple, but every time I run my (let's say Alpine Linux) container with '-v /c/Users:/somedata', the /somedata directory within the Linux container is empty.
I just don't get this functionality on Windows. If you have a decent link I would be very grateful, as I have been going over the Docker info for two days and I feel I am nowhere!
If volumes are not supported between Windows and Linux because of the OS differences, would the answer be to use COPY within a Dockerfile, and simply copy my dev files into the container being created?
MANY MANY THANKS IN ADVANCE!
I have been able to get a link to take place, but I don't really know what the rules are, yet.
(I am using Docker for windows 1.12.0-beta21 (build: 5971) )
You have to share the drive(s) in your docker settings (this may require logging in)
The one or two times I've gotten it to work, I used
-v //d/vms/mysql:/var/lib/mysql
(where I have a folder D:\vms\mysql)
(note the "//d" to indicate the drive letter)
I am trying to reproduce this with a different setup, though, and I am not having any luck. Hopefully the next release will make this even easier for us!

Is Vagrant Provision supposed to wipe out all your data

I just ran vagrant provision in a futile attempt at getting my customized synced_folders directive to work, and now my whole guest box is wiped out.
Is this normal? I don't see any references in the Vagrant docs to this behavior.
As per the doc:
Provisioners in Vagrant allow you to automatically install software, alter configurations, and more on the machine as part of the vagrant up process.
The only thing I have in my provisioning config is shell installation commands. Nothing about wiping anything out.
I do have app.vm.provision for puppet that sets fqdn, user name and box name (along with the normal module_path, manifests_path and manifests_file). Maybe this caused things to be reset?
The Answer
Is Vagrant Provision supposed to wipe out all your data?
No. Vagrant should never harm your "data" (i.e., websites, code, etc.).
...now my whole guest box is wiped out. Is this normal?
Yes. Your Vagrant environment (in other words, the guest operating system created in a virtual environment by Vagrant) is volatile, and you should be able to destroy and recreate it at will without having any impact on your working files (because those should be kept in your local, or host, file system).
Explanation
On Vagrant's website, the very first thing they tell you is this:
Create and configure lightweight, reproducible, and portable development environments.
Your development environment allows you to work. You work on your data, in your development work environment. When you are done with your "development work environment," you should be able to delete it freely without affecting your data in the least.
Further, you should be able to send a collaborating developer your Vagrantfile so that they can create the exact same development environment you used to create your data (i.e., write your program, build your website, and so forth). Then, when you provide them with your code, they can use it in an environment identical to the one that your code was created in without having to reconfigure their own setup.
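In day-to-day terms, that disposable-environment workflow is just:

```shell
vagrant destroy -f   # throw the guest VM away; your working files stay on the host
vagrant up           # recreate the identical environment from the Vagrantfile
```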
For more details about how your data files (code, working files, etc.) are kept safely in your computer while making them accessible to your guest system created by Vagrant, see my answer to this question.
So what appears to have happened is that when I set up a synced folder, it wiped out everything because there was nothing in that synced folder on my host machine. Unless there is a way to recover the lost data, there should be an unmistakable WARNING in their docs that this can happen.
I set up the synced_folder to be my whole home directory. When I created a new machine, I cloned the one project I had saved and decided to just sync my individual projects instead of my whole user directory this time. When I reloaded, the project directory was empty since it was empty on my host machine.
So I guess: make sure the directories on your host machine are already set up with the data before configuring your Vagrantfile with synced_folder information.

Using Vagrant with npm-linked dependencies

I'm evaluating a change in development process toward Vagrant, but I frequently develop interdependent, not-yet-released Node modules that are wired together with npm link.
Since Vagrant doesn't have all the source files shared on the guest machine, the symlinks npm link creates are no longer sufficient as a means of developing these modules in sync with one another. For one, there doesn't seem to be any way to get npm link to create hard links. For two, sharing the symlink destinations across the board a la the following won't scale:
config.vm.synced_folder "/usr/local/share/npm/lib/node_modules", "/usr/lib/node_modules"
Now, the question. Is any of the above incorrect (e.g. npm support for hard links exists, and I missed it)? What processes have people used to develop interrelated, private Node modules with testing accomplished via Vagrant?
EDIT: Ultimately, I'm hoping for a solution that will work on both Mac & Windows. Also, for the record, I don't intend to intimate how hard linking a Node module would work; I'm just trying to leverage Vagrant to improve this not-uncommon workflow.
Idea: instead of using the VM sync feature, use a sharing service in the VM to make the files accessible from the host OS.
For example, if your VM runs Linux and the host OS is Windows, you could start up samba and configure it to share the relevant directories. Then have the host OS map the samba share.
If the host OS is Mac, you could use something like macfuse to mount a directory over SSH to the VM.
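For instance, on a Mac host you could mount a guest directory over SSH with sshfs (via FUSE); the guest address and paths here are made up for illustration:

```shell
# Mount the VM's global node_modules directory onto the host over SSH.
mkdir -p ~/vm-node_modules
sshfs vagrant@192.168.50.4:/usr/lib/node_modules ~/vm-node_modules
# ...work on the files from the host IDE...
umount ~/vm-node_modules   # unmount when done (fusermount -u on Linux)
```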
Good luck!

Is it possible to have a makefile step in a host OS execute a shell command inside a guest OS running in a VM?

I'm currently developing apps for the inPulse watch (if you're a geek, check out www.GetInPulse.com) and am compiling for the watch while on a Mac. But deploying the app to the device takes several minutes. They do however offer a simulator, but that only runs under Linux so I installed Ubuntu in a VM, which works great.
What I'm hoping is to stay completely on the Mac side, except be able to execute a build step or shell script that can 'call into' the VM and launch a shell script there which kicks up the simulator. That way I can just add 'sim' as a step in my makefile back on the 'mac' side.
Currently, I'm mousing back and forth too damn much and I have terminals open all over the place in both the host and the guest OSes. I'm just trying to clean that up, and cross-machine scripting seems like it would work in theory. I just don't know whether crossing machine boundaries like that is even a valid thing.
The host OS doesn't know what a “shell” is inside the guest. A shell is an OS-dependent concept, and while the host OS technically knows everything that's going on in the guest, its only contact is by observing the guest memory and the instructions it runs, which is altogether the wrong level of abstraction here.
The most natural way to run shell commands from one OS to another is to use a remote shell facility over a network link; in practice, that means SSH. You need a network link between the two machines, and once you have that, it doesn't matter that one is a VM running inside the other. There probably is a network link already between the two machines; in case there isn't, make sure you activate a bridged network or a host-only network or whatever your VM technology offers.
Install an SSH client on the host (there's probably one already) and an SSH server on the guest (the openssh-server package). Then set up public-key authentication between the two machines so you don't need to type a password all the time.
You'll get shell access on the guest. If you need to manipulate GUI applications, you'll need to work a little more than that: forwarding the DISPLAY variable over SSH may help, or perhaps see “How can I run Firefox on Linux headlessly (i.e. without requiring libgtk-x11-2.0.so.0)?”.
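With passwordless SSH in place, the makefile step becomes a one-liner; the host alias and script path below are hypothetical:

```shell
# In the Mac-side makefile:
#   sim:
#       ssh ubuntu-vm '~/inpulse/run-simulator.sh'
# i.e., the build step just runs a command like:
ssh ubuntu-vm 'DISPLAY=:0 ~/inpulse/run-simulator.sh'
```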

Creating a virtual machine image as a continuous integration artifact?

I'm currently working on a server-side product which is a bit complex to deploy on a new server, which makes it an ideal candidate for testing out in a VM. We are already using Hudson as our CI system, and I would really like to be able to deploy a virtual machine image with the latest and greatest software as a build artifact.
So, how does one go about doing this exactly? What VM software is recommended for this purpose? How much scripting needs to be done to accomplish this? Are there any issues in particular when using Windows 2003 Server as the OS here?
Sorry to deny anyone an accepted answer here, but based on further research (thanks to your answers!), I've found a better solution and wanted to summarize what I've found.
First, both VirtualBox and VMWare Server are great products, and since both are free, each is worth evaluating. We've decided to go with VMWare Server, since it is a more established product and we can get support for it should we need it. This is especially important since we are also considering distributing our software to clients as a VM instead of a special server installation, assuming that the overhead from the VMWare Player is not too high. Also, there is a VMWare scripting interface called VIX which one can use to install files directly to the VM without needing to install SSH or SFTP, which is a big advantage.
So our solution is basically as follows... first we create a "vanilla" VM image with OS, nothing else, and check it into the repository. Then, we write a script which acts as our installer, putting the artifacts created by Hudson on the VM. This script should have interfaces to copy files directly, over SFTP, and through VIX. This will allow us to continue distributing software directly on the target machine, or through a VM of our choice. This resulting image is then compressed and distributed as an artifact of the CI server.
Regardless of the VM software (I can recommend VirtualBox, too) I think you are looking at the following scenario:
Build is done
CI launches virtual machine (or it is always running)
CI uses scp/sftp to upload build into VM over the network
CI uses the ssh (if available on target OS running in VM) or other remote command execution facility to trigger installation in the VM environment
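The upload-and-trigger steps above might look like this in the CI job's shell (the hostname, user, and paths are placeholders):

```shell
# Copy the build artifact into the running VM and kick off installation.
scp build/output/app.tar.gz ci@build-vm:/tmp/
ssh ci@build-vm 'sudo /opt/ci/install.sh /tmp/app.tar.gz'
```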
VMWare Server is free and a very stable product. It also gives you the ability to create snapshots of a VM slice and roll back to a previous version of your virtual machine when needed. It will run fine on Win 2003.
In terms of provisioning new VM slices for your builds, you can simply copy and paste the folder that contains the VMWare files, change the SID and IP of the new VM, and you have a new machine. It takes 15 minutes, depending on the size of your VM slice. No scripting required.
If you use VirtualBox, you'll want to look into running it headless, since it'll be on your server. Normally, VirtualBox runs as a desktop app, but it's possible to start VMs from the commandline and access the virtual machine over RDP.
VBoxManage startvm "Windows 2003 Server" -type vrdp
We are using Jenkins + Vagrant + Chef for this scenario.
So you can do the following process:
Version control your VM environment using vagrant provisioning scripts (Chef or Puppet)
Build your system using Jenkins/Hudson
Run your Vagrant script to fetch the last stable release from CI output
Save the VM state to reuse in future.
Reference:
vagrantup.com
I'd recommend VirtualBox. It is free and has a well-defined programming interface, although I haven't personally used it in automated build situations.
Choosing VMWare is currently NOT a bad choice.
However, just as VMWare gives support for VMWare Server, Sun gives support for VirtualBox.
You can also accomplish this task using VMWare Studio, which is also free.
The basic workflow is this:
1. Create an XML file that describes your virtual machine
2. Use studio to create the shell.
3. Use VMWare server to provision the virtual machine.
