Can a Docker layer be "bypassed" on build?

Let's suppose I have a Dockerfile like this:
FROM debian:stretch
RUN apt update
RUN apt install -y wget
RUN wget https://stackoverflow.com/
# I know the wget is useless. It's just an example :)
CMD ["echo", "hello-world"]
I want to add a new RUN statement above the wget statement. After this change, when I rebuild, Docker will re-run every command from my modification downwards, so the wget will be executed again. The problem is that the wget command takes a long time to finish, because in my real Dockerfile the downloaded file is very big.
The question is: can Docker be "tweaked" somewhere so that the build avoids executing the wget layer again? If I have already built that layer, can it be reused even after changing a statement above it?
Thank you.

AFAIK this is not possible, as Docker only reuses the layers up until your change and rebuilds everything from there on out.
This is because each new layer is built on top of the previously built layers (so your RUN wget layer is built and cached on top of the layers from FROM through RUN apt install -y wget). If you insert another RUN instruction above the RUN wget instruction, the environment that RUN wget sees has changed, so it needs to be executed again.
I don't think there's a way to fiddle with this manually so that Docker reuses a layer built on a "different" environment, and I wouldn't recommend it either.
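The practical consequence is that ordering matters: cache invalidation only propagates downwards, so if the new step doesn't have to run before the download, place it below the expensive instruction. A minimal sketch of that ordering (the echo step here stands in for your new instruction):
FROM debian:stretch
RUN apt update
RUN apt install -y wget
# the expensive download stays above any new instructions, so its cached layer is reused
RUN wget https://stackoverflow.com/
# instructions added below the download only invalidate layers from here on
RUN echo "new step"
CMD ["echo", "hello-world"]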

Using docker-compose, or the -v flag with docker run, you can mount a volume that persists between runs. Change your wget to a script that only downloads the file when it is absent.
That won't cache the layer, but it will make that step faster.
You may need to modify the folder where you store that file depending on the rest of your script and how your environment is set up.
I’m using compose for volume mounting here: https://github.com/jaydorsey/ghgvcR/blob/master/docker-compose.yml
Look at the bin/download-files.sh file in that repo for a bash example.
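A minimal sketch of such a conditional download script; the script name, the target path /data/bigfile.tar.gz (assumed to be on the mounted volume), and the URL are placeholders for your own:
#!/bin/sh
# download-if-missing.sh: skip the expensive download when the file
# already exists on the mounted volume
TARGET=/data/bigfile.tar.gz
if [ ! -f "$TARGET" ]; then
  wget -O "$TARGET" https://example.com/bigfile.tar.gz
fi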

Related

Unable to run cygwin in Windows Docker Container

I've been working with Docker for Windows, attempting to create a Windows Container that can run cygwin as the shell within the container itself. I haven't had any luck getting this going yet. Here's the Dockerfile that I've been messing with.
# escape=`
FROM microsoft/windowsservercore
SHELL ["powershell", "-command"]
RUN Invoke-WebRequest https://chocolatey.org/install.ps1 -UseBasicParsing | Invoke-Expression
RUN choco install cygwin -y
RUN refreshenv
RUN [Environment]::SetEnvironmentVariable('Path', $env:Path + ';C:\tools\cygwin\bin', [EnvironmentVariableTarget]::Machine)
I've tried setting the ENTRYPOINT and CMD to try and get into cygwin, but neither seems to do anything. I've also attached to the container with docker run -it and fired off the cygwin command to get into the shell, but it doesn't appear to do anything. I don't get an error, it just returns to the command prompt as if nothing happened.
Is it possible to run another shell in the Windows Container, or am I just doing something incorrectly?
Thanks!
You don't "attach" to a container with docker run: you start a container with it.
In your case, as seen here, docker run -it is the right approach.
You can try using c:\cygwin\bin\bash as an entry point, as seen in this issue.
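In Dockerfile form that would look something like the line below; note that the Chocolatey package in the question installs Cygwin under C:\tools\cygwin, so adjust the path to your layout (this is a sketch, not a verified setup):
ENTRYPOINT ["C:\\tools\\cygwin\\bin\\bash.exe", "--login"]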
As commented in issue 32330:
Don't get me wrong, cygwin should work in Docker Windows containers.
But, it's also a little paradoxical that containers were painstakingly wrought into Windows, modeled on containers on Linux, only for people to then want to run Linux-utils in these newly minted Docker Windows containers...
That same issue is still unresolved, with new cases seen in May and June 2018:
We have an environment that compiles with Visual Studio but still we want to use git and some very useful commands taken from linux.
Also we use off-the-shelf utilities (e.g. git-repo) that use linux commands (e.g. curl, grep, ...)
Some builds require Cygwin, like ICU (a cross-platform Unicode based globalization library), and worse: our builds require building it from source.
You can see an example of a crash in MSYS2-packages issue 1239:
Step 5/5 : RUN "C:\\msys64\\usr\\bin\\ls.exe"
---> Running in 5d7867a1f8da
The command 'cmd /S /C "C:\\msys64\\usr\\bin\\ls.exe"' returned a non-zero code: 3221225794
This can be used to get more information on the crash:
PS C:\msys64\usr\bin>
Get-EventLog -Index 28,29,30 -LogName "Application" | Format-List -Property *
The workaround was:
PS > xcopy /S C:\Git C:\Git_Copy
PS > C:\Git_Copy\usr\bin\sh.exe --version > v.txt
PS > type v.txt
As mentioned in that thread, the output gets lost somewhere in the container, hence redirecting it to a text file.
After playing with it for a long time, my findings were the following:
If your Cygwin utilities are crashing your container, you need to use process isolation. See https://learn.microsoft.com/en-us/virtualization/windowscontainers/deploy-containers/version-compatibility for the requirements (essentially you need Windows Server 2016 and a build-matching Docker image). I spent some time trying to understand why Hyper-V isolation doesn't work, and so far I haven't come to any conclusion;
If your Cygwin utilities apparently do nothing - but they don't crash the container - you need to remove the -t flag (the -i flag is still fine) or, alternatively, play with stdout redirection. There seems to be an issue with how MSYS2 deals with some pseudo-ttys. You can verify that programs still run by redirecting stdout to a file (e.g. whoami won't output anything when run without stdout redirection, but whoami > out.txt will write the expected result to a file). It might be possible to fix this by replacing the pseudo-tty, but I didn't try it. I suspect the problem is an invalid handle somewhere inside the MSYS2 libs - other console apps can print to the terminal - but I didn't verify this.
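A sketch of an invocation combining the two findings above (process isolation, -i without -t); the image name my-cygwin-image and the paths are assumptions:
docker run -i --isolation=process my-cygwin-image C:\tools\cygwin\bin\bash.exe -c "whoami > /tmp/out.txt && cat /tmp/out.txt"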
Hope it helps to all of you having the same problem.
I was able to get a preinstalled (copied from the host) copy of Cygwin to work in a nanoserver-based container with these two steps:
Using Żubrówka's recommendation for no -t in the docker run cmd-line (when running docker interactively)
Copying the host's (Windows Server 2016) kernel32.dll to the container's c:\windows\system32
I found several versions of kernel32.dll on my system, and used the one from c:\windows\system32 with md5 hash d8948a7af764f7153b3e396ad44992ff
This also made a large variety of other executables work. Note that without a tty, using the container is even more cumbersome, and the bash shell doesn't render the prompt. However, scripts (via Jenkins, in my case) that rely on cygwin components work fine.
If that doesn't help, try this guide; it helped me a lot. If your windows application (other than cygwin) is legitimately missing DLLs, the instructions in this guide can help. It never occurred to me that SysInternals' procmon.exe can be run on the host and still report events from the container!

Getting chef-client 11.14.6 for later MacOSX versions

I have inherited a cheffed OSX machine running chef-client 11.14.6. I am trying to lay my hands on the installer for 11.14.6, but it seems that Chef has pulled it from the downloads site ( https://downloads.chef.io/chef-client/mac/ ).
Does anyone know anything about this, or know where I can get an "archived" version?
Much appreciated.
I don't see any copies in any repos, so it's probably lost to the mists of time by now. You should be able to build a new one using this commit from omnibus-chef: https://github.com/chef/omnibus-chef/tree/6d5001c588edacc98f6045e22c70195200111660
Yes. From my research, and the research of others, it seems it has been removed.
However, we managed to get it working (I can't take the credit - it was one of my colleagues :) ). We had another machine with the correct version on it, so we grabbed it from there and tarred it up, using root as the base and grabbing /opt/chef, along the lines of the sketch below.
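On the donor machine, the packaging step would look something like this (a sketch; run as root, with the filename matching your installed version):
cd / && tar czvf /tmp/opt.chef-11.14.6.tar.gz opt/chef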
Once the tarball (e.g. opt.chef-11.14.6.tar.gz) is transferred to the new machine, these were the steps used:
install chef-client v11.10.4 using:
(echo "version=11.10.4"; curl -L https://www.opscode.com/chef/install.sh) | sudo bash
verify your chef-client version is currently reported as 11.10.4 with "chef-client -v"
extract the tarball as root into the root filesystem using:
cd / && tar xvfz /tmp/opt.chef-11.14.6.tar.gz
verify your chef-client version is now reported as 11.14.6 with "chef-client -v"
run your knife bootstrap command like normal, but don't include the --bootstrap-version parameter; it'll detect that chef-client is already installed and use the one you installed manually.
I did not try rebuilding it.

What are the reasons not to use many RUN commands in a Dockerfile?

I've seen several Dockerfiles, and I have the feeling that people try to avoid using multiple RUN commands. But why?
So is there any reason (other than the repetitive text in this example) to prefer
RUN gem install \
    jekyll \
    github-pages
over
RUN gem install jekyll
RUN gem install github-pages
Each execution of a RUN command creates a temporary container from the last resulting image, executes your commands, and saves the result as a new layer. Minimizing RUN commands not only reduces the overhead from these intermediate containers, but can also dramatically shrink the size of the resulting image.
If, for example, you have two RUN commands, one that downloads a gig of data and a second that deletes that gig of data, your resulting image will exceed a gig even though the data is not visible in the running container.
Therefore, when doing large downloads of cached files to do an install or build of an app and you cleanup that build environment when finished, it's a good practice to do that as a single step so the deleted files never make it into any part of the image.
One last reason is for the cache. If you need to pull a new version of an app from a package repository, you also want to update your info on that remote repository (e.g. apt-get update) before doing an install to pull the latest version. If you separate the apt-get update from the apt-get install, the update command may be cached from an old build and the install will attempt to pull old or non-existent files.
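A minimal Dockerfile sketch combining both points: the update and install share one layer, so the package index is never stale relative to the install, and the cleanup happens in the same RUN, so the deleted files never land in any layer:
FROM debian:stretch
RUN apt-get update \
 && apt-get install -y --no-install-recommends wget \
 && rm -rf /var/lib/apt/lists/*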

Dockerfile: how to create an ubuntu 14.04 image

Yesterday I was asked how to make a Docker image with a Dockerfile.
This time I want to add a question.
I want to build a Docker image based on Ubuntu 14.04 that has postgresql-9.3.10 installed, installs Java JDK 6, copies a file (to a specific location), and creates a user on the image.
Can I combine several Dockerfiles as needed for the image (the Dockerfile for postgresql, the one for Java, the file copy, and the user creation, all as one Dockerfile)?
Example: I made one Dockerfile, "ubuntu", which contains the commands below, from the top line down:
# Create dockerfile
# get OS ubuntu to images
FROM ubuntu: 14:04
# !! further commands to be added from the following links (i.e. the commands in the Dockerfile at each link):
# command on dockerfile postgresql-9.3
https://github.com/docker-library/postgres/blob/ed23320582f4ec5b0e5e35c99d98966dacbc6ed8/9.3/Dockerfile
# command on dockerfile java
https://github.com/docker-library/java/blob/master/openjdk-6-jdk/Dockerfile
# create a user on images ubuntu
RUN adduser myuser
# copy file/directory on images ubuntu
COPY /home/myuser/test /home/userimagedockerubuntu/test
# ?
CMD ["ubuntu:14.04"]
Please help me
No, you cannot combine multiple Dockerfiles.
The best practice is to:
Start from an image that already includes what you need, like this postgresql image, which is already based on ubuntu.
That means that if your Dockerfile starts with:
FROM orchardup/postgresql
You would be building an image which already contains ubuntu and postgresql.
Then COPY or RUN what you need in your Dockerfile, like this for openjdk-6:
RUN \
    apt-get update && \
    apt-get install -y openjdk-6-jdk && \
    rm -rf /var/lib/apt/lists/*
ENV JAVA_HOME /usr/lib/jvm/java-6-openjdk-amd64
Finally, your default command should run the service you want:
# Set the default command to run when starting the container
CMD ["/usr/lib/postgresql/9.3/bin/postgres", "-D", "/var/lib/postgresql/9.3/main", "-c", "config_file=/etc/postgresql/9.3/main/postgresql.conf"]
But since the Dockerfile of orchardup/postgresql already contains a CMD, you don't even have to specify one: you will inherit from the CMD defined in your base image.
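Putting it all together, a sketch of a single Dockerfile covering the requirements from the question (the COPY source must live inside your build context, and the adduser flags are one plausible non-interactive choice):
FROM orchardup/postgresql
# install the JDK and clean up the apt lists in the same layer
RUN apt-get update && \
    apt-get install -y openjdk-6-jdk && \
    rm -rf /var/lib/apt/lists/*
ENV JAVA_HOME /usr/lib/jvm/java-6-openjdk-amd64
# create the user non-interactively
RUN adduser --disabled-password --gecos "" myuser
# copy "test" from the build context into the image
COPY test /home/myuser/test
# no CMD needed: it is inherited from the base image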
I think nesting multiple Dockerfiles is not possible due to the layer system. You may however outsource tasks into shell scripts and run those in your Dockerfile.
In your Dockerfile please fix the base image:
FROM ubuntu:14.04
Further, your CMD is invalid. You may want to execute a shell with CMD ["bash"] so that you have something to work with.
I would suggest you start with the Dockerfile documentation, as you have clearly missed it; it contains answers to all of your questions, and even to questions you haven't thought to ask yet.

TensorFlow docker dev workflow on mac

There is an official guide on how to install TensorFlow with Docker, but it doesn't say much about actually developing in it.
From what I understand, developing with Docker is quite a big challenge in general, not to mention the deeper technical complications of using it for TensorFlow, mostly due to GPUs. So there is a lot of stuff to do after pulling the docker image...
Does anyone have a step by step guide on how to get development going here?
You could mount a local directory to the docker container so that you can still use your preferred editor in osx. Here's a command to start the container with a mounted directory and run a command:
docker run --name tensorflow --rm -v /Users/me/Code/web/tensorflow_dev:/tensorflow_dev b.gcr.io/tensorflow/tensorflow /bin/sh -c 'cd /tensorflow_dev && python mnist.py'
-v will mount the local directory and the -c will run the specified command. So your flow might look like:
Edit python script in your favorite editor
Run the above command to execute your script
However, I actually use pycharm so that I can place breakpoints and run the python script interactively within the editor.
Hope this helps.
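Alternatively, to avoid paying the container startup cost on every run, you can keep an interactive shell open in the mounted directory; a sketch using the same image, with the local path being whatever your own project directory is:
docker run -it --rm \
    -v /Users/me/Code/web/tensorflow_dev:/tensorflow_dev \
    -w /tensorflow_dev \
    b.gcr.io/tensorflow/tensorflow /bin/bash
# then, inside the container, re-run after each edit:
# python mnist.py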