Dockerfile: Bash Script executing flawlessly with ENTRYPOINT, but not with RUN - bash

Project
I am building a docker-compose file for a simple devops stack. One of the tools is Helix Core by Perforce. I am trying to build an Ubuntu Dockerfile, that will install Helix Core and then run it. I have already written a bash script install.sh that when put like this
FROM ubuntu:20.04
COPY ./install.sh /install.sh
ENTRYPOINT["/bin/bash", "/install.sh"]
will work flawlessly.
Breaking Change
The problem is that I need the script to run as a setup step and not every time the container is started. So I tried the following:
FROM ubuntu:20.04
COPY ./install.sh /install.sh
RUN chmod +x /install.sh
SHELL ["/bin/bash", "-c"]
RUN /install.sh
ENTRYPOINT [ "p4d" ]
Problem
Now firstly I do not get any descriptive output in the console. The only thing I get is the default building output.
...
=> CACHED [2/4] COPY ./install.sh /install.sh 0.0s
=> CACHED [3/4] RUN chmod +x /install.sh 0.0s
=> CACHED [4/4] RUN /install.sh 0.0s
=> exporting to image
...
Secondly the script does not seem to execute or fails immediately. (It should take much longer than it does.) Here is the script, it does work in the first Dockerfile, just not in the second.
#!/bin/bash
service_name="${service_name:="master"}"
p4root="${p4root:="/opt/perforce/servers/$service_name"}"
unicode_mode="${unicode_mode:=0}"
case_sensitive="${case_sensitive:=0}"
p4port="${p4port:="1666"}"
super_user_login="${super_user_login:="super"}"
if [ -z "$super_user_password" ]
then
echo "Install aborted!"
echo "Please set 'super_user_password' via environment variable!"
exit
fi
echo "Installing Helix Core..."
echo "Updating Ubuntu..."
apt-get update -y
echo "Installing utilities..."
apt-get install ca-certificates wget gpg curl -y
echo "Downloading public key..."
curl https://package.perforce.com/perforce.pubkey > perforce.pubkey
echo "Adding public key..."
gpg init
gpg -n --import --import-options import-show perforce.pubkey
rm perforce.pubkey
echo "Adding perforce packaging key to keyring..."
wget -qO - https://package.perforce.com/perforce.pubkey | apt-key add -
echo "Adding perforce repository to APT configuration..."
echo "deb http://package.perforce.com/apt/ubuntu focal release" > /etc/apt/sources.list.d/perforce.list
echo "Updating Ubuntu..."
apt-get update -y
echo "Installing..."
apt-get install helix-p4d -y
echo "Install complete! Writing config file..."
/opt/perforce/sbin/configure-helix-p4d.sh $service_name -n -p $p4port -r $p4root -u $super_user_login -P $super_user_password
p4 admin stop
Extra Information
The Dockerfile is being built with docker-compose, but I have already tried out docker build with no success.
My Thoughts
It is my understanding that the only relevant difference between RUN and ENTRYPOINT is, that RUN only executes one time in the lifecycle of a container (in the build phase), while ENTRYPOINT defines the executable that will be started together with the container every time it is started. So I assume the environment that script is called in is the same.
Any ideas to why this behavior occurs and how to fix it are appreciated.

So thanks to #DavidMaze I took another look at something I overlooked before.
You can get output from the build command using --progress (see: Why is docker build not showing any output from commands?)
Problem 1 solved!
From there I found:
#6 [4/4] RUN /install.sh
#6 sha256:26be9fa6818fd9e3a0fb64e95a83b816de790902b1f54da1c123ff19888873ff
#6 0.344 Install aborted!
#6 0.344 Please set 'super_user_password' via environment variable!
#6 DONE 0.4s
Problem 2 identified!
Now... the actual problem is that:
Environment variables and arguments are two different things.
You want environment variables for the execution environment and arguments for the build environment. This webpage explained it to me.
My mistake was trying to use environment variables for the build environment.
Modified dockerfile:
FROM ubuntu:20.04
COPY ./install.sh /install.sh
SHELL ["/bin/bash", "-c"]
RUN chmod +x /install.sh
RUN ["/install.sh", "$super_user_password"]
ENTRYPOINT [ "p4d" ]
and I had to add this line to my script:
super_user_password="$1"
And finally... it works!

Related

How to properly run entrypoint bash script on docker?

I would like to build a docker image for dumping large SQL Server tables into S3 using the bcp tool by combining this docker and this script. Ideally I could pass table, database, user, password and s3 path as arguments for the docker run command.
The script looks like
#!/bin/bash
TABLE_NAME=$1
DATABASE=$2
USER=$3
PASSWORD=$4
S3_PATH=$5
# read sqlserver...
# write to s3...
# .....
And the Dockerfile is:
# SQL Server Command Line Tools
FROM ubuntu:16.04
LABEL maintainer="SQL Server Engineering Team"
# apt-get and system utilities
RUN apt-get update && apt-get install -y \
curl apt-transport-https debconf-utils \
&& rm -rf /var/lib/apt/lists/*# SQL Server Command Line Tools
FROM ubuntu:16.04
LABEL maintainer="SQL Server Engineering Team"
# apt-get and system utilities
RUN apt-get update && apt-get install -y \
curl apt-transport-https debconf-utils \
&& rm -rf /var/lib/apt/lists/*
# adding custom MS repository
RUN curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
RUN curl https://packages.microsoft.com/config/ubuntu/16.04/prod.list > /etc/apt/sources.list.d/mssql-release.list
# install SQL Server drivers and tools
RUN apt-get update && ACCEPT_EULA=Y apt-get install -y msodbcsql mssql-tools awscli
RUN echo 'export PATH="$PATH:/opt/mssql-tools/bin"' >> ~/.bashrc
RUN /bin/bash -c "source ~/.bashrc"
ADD ./sql2sss.sh /opt/mssql-tools/bin/sql2sss.sh
RUN chmod +x /opt/mssql-tools/bin/sql2sss.sh
RUN apt-get -y install locales
RUN locale-gen en_US.UTF-8
RUN update-locale LANG=en_US.UTF-8
ENTRYPOINT ["/opt/mssql-tools/bin/sql2sss.sh", "DB.dbo.TABLE", "SQLSERVERDB", "USER", "PASSWORD", "S3PATH"]
If I replae the entrypoint for CMD /bin/bash and run the image with -it, I can manually run the sql2sss.sh and it works properly, reading and writing to s3. However if I try to use the entrypoint as shown yelds bcp: command not found.
I also noticed if I use CMD /bin/sh in iterative mode it will produce the same error. Am I missing some configuration in order for the entrypoint to run the script properly?
Have you tried
ENV PATH="/opt/mssql-tools/bin:${PATH}"
Instead of exporting the bashrc?
As David Maze pointed out docker doesn't read dot files
Basically add your env definitions in the ENV primitive

Docker gives `no such file or directory` on `docker run <image_id>`

So I've just created my very first docker image (woohoo) and was able to run it on the original host system where it was created (Ubuntu 20.04 Desktop PC). The image was executed using docker run -it <image_id>. The expected command (defined in CMD which is just a bash script) was run, and the expected output was seen. I assumed this meant I successfully created my very first docker image and so I pushed this to Docker Hub.
Docker Hub
GitHub repo with original docker-compose.yml and Dockerfile
Here's the Dockerfile:
FROM ubuntu:20.04
# Required for Debian interaction
# (https://stackoverflow.com/questions/62299928/r-installation-in-docker-gets-stuck-in-geographic-area)
ENV DEBIAN_FRONTEND noninteractive
WORKDIR /home/benchmarking-programming-languages
# Install pre-requisites
# Versions at time of writing:
# gcc -- version (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
# make -- GNU Make 4.2.1
# curl -- 7.68.0
RUN apt update && apt install make build-essential curl wget tar -y
# Install `column`
RUN wget https://mirrors.edge.kernel.org/pub/linux/utils/util-linux/v2.35/util-linux-2.35-rc1.tar.gz
RUN tar xfz util-linux-2.35-rc1.tar.gz
WORKDIR /home/benchmarking-programming-languages/util-linux-2.35-rc1
RUN ./configure
RUN make column
RUN cp .libs/column /bin/
WORKDIR /home/benchmarking-programming-languages
RUN rm -rf util-linux-2.35-rc1*
RUN apt install python3 pip -y
RUN ln -s /usr/bin/python3 /usr/bin/python
RUN apt install default-jdk-headless -y
RUN apt install rustc -y
# Install GoLang
RUN wget https://go.dev/dl/go1.17.8.linux-amd64.tar.gz
RUN rm -rf /usr/local/go && tar -C /usr/local -xzf go1.17.8.linux-amd64.tar.gz
ENV PATH="/usr/local/go/bin:${PATH}"
# Install Haxe and Haxelib
RUN wget https://github.com/HaxeFoundation/haxe/releases/download/4.2.5/haxe-4.2.5-linux64.tar.gz
RUN tar xfz haxe-4.2.5-linux64.tar.gz
RUN ln -s /home/benchmarking-programming-languages/haxe_20220306074705_e5eec31/haxe /usr/bin/haxe
RUN ln -s /home/benchmarking-programming-languages/haxe_20220306074705_e5eec31/haxelib /usr/bin/haxelib
# # Install Neko (Haxe VM)
# RUN add-apt-repository ppa:haxe/snapshots -y
# RUN apt update
# RUN apt install neko -y
RUN if ! test -d /home/benchmarking-programming-languages; then mkdir /home/benchmarking-programming-languages && echo "Created directory /home/benchmarking-programming-languages."; fi
COPY . /home/benchmarking-programming-languages
RUN pip install -r /home/benchmarking-programming-languages/requirements_dev.txt
CMD [ "/home/benchmarking-programming-languages/benchmark.sh -v" ]
However, upon pulling the same image on my Windows 10 machine (same machine as above just dual booted) and a Windows 11 laptop using both the Docker Desktop application and the command line (docker pull mariosyian/benchmarking-programming-languages followed by docker run -it <image_id>). Both which give me the following error
Error invoking remote method 'docker-run-container': Error: (HTTP code 400) unexpected - failed to create shim: OCI runtime create failed: container_linux.go:380: starting container process caused: exec: "/home/benchmarking-programming-languages/benchmark.sh -v": stat /home/benchmarking-programming-languages/benchmark.sh -v: no such file or directory: unknown
Despite this, running the image as a container with a shell (docker run -it <image_id> sh), I am successfully able to, not only see the file, but execute it with no errors! Can someone suggest a reason for why the error happens in the first place, and how to fix it?
In your Dockerfile you have specified the CMD as
CMD [ "/home/benchmarking-programming-languages/benchmark.sh -v" ]
This uses the JSON syntax of the CMD instruction, i.e. is an array of strings where the first string is the executable and each following string is a parameter to that executable.
Since you only have a single string specified docker tries to invoke the executable /home/benchmarking-programming-languages/benchmark.sh -v - i.e. a file named "benchmark.sh -v", containing a space in its name and ending with -v. But what you actually intended to do was to invoke the benchmark.sh script with the -v parameter.
You can do this by correctly specifying the parameter(s) as separate strings:
CMD ["/home/benchmarking-programming-languages/benchmark.sh", "-v"]
or by using the shell syntax:
CMD /home/benchmarking-programming-languages/benchmark.sh -v

I have made a dockerfile and I was going to run it on AWS ECS but I cant as it requires -t

Here is my docker run and the docker file is there a reason why it requires -t and isnt working on ECS thanks for any help. I dont
understand what -t does so if someone could also help with that thanks.
This is just a basic docker that connects to my rds and uses wordpress. I dont have any plugins and shapely is the theme i'm using .
command docker run -t --name wordpress -d -p 80:80 dockcore/wordpress
FROM ubuntu
#pt-get clean all
RUN apt-get -y update
RUN DEBIAN_FRONTEND=noninteractive apt-get -y install unzip wget mysql-client mysql-server apache2 libapache2-mod-php7.0 pwgen python-setuptools vim-tiny php7.0-mysql php7.0-lda
RUN rm -fr /var/cashe/*files neeeded
ADD wordpress.conf /etc/apache2/sites-enabled/000-default.conf
# Wordpress install
RUN wget -P /var/www/html/ https://wordpress.org/latest.zip
RUN unzip /var/www/html/latest.zip -d /var/www/html/
RUN rm -fr /var/www/html/latest.zip
# Copy he wp config file
RUN cp /var/www/html/wordpress/wp-config-sample.php /var/www/html/wordpress/wp-config.php
# Expose web port
EXPOSE 80
# wp config for database
RUN sed -ie 's/database_name_here/wordpress/g' /var/www/html/wordpress/wp-config.php
RUN sed -ie 's/username_here/root/g' /var/www/html/wordpress/wp-config.php
RUN sed -ie 's/password_here/password/g' /var/www/html/wordpress/wp-config.php
RUN sed -ie 's/localhost/wordpressrds.xxxxxxxxxxxxxx.ap-southeast-2.rds.amazonaws.com:3306/g' /var/www/html/wordpress/wp-config.php
RUN rm -fr /var/www/html/wordpress/wp-content/themes/*
RUN rm -fr /var/www/html/wordpress/wp-content/plugins/*
ADD /shapely /var/www/html/wordpress/wp-content/themes/
# Start apache on boot
RUN echo "service apache2 start" >> ~/.bashrc
I see a couple problems. First of all your container should never require -t in order to run unless it is a temporary container that you plan to interact with using a shell. Background containers shouldn't require an interactive TTY interface, they just run in the background autonomously.
Second in your docker file I see a lot of RUN statements which are basically the build time commands for setting up the initial state of the container, but you don't have any CMD statement.
You need a CMD which is the process to actually kick off and start in the container when you try to run the container. RUN statements only execute once during the initial docker build, and then the results of those run statements are saved into the container image. When you run a docker container it has the initial state that was setup by the RUN statements, and then the CMD statement kicks off a running process in the container.
So it looks like that last RUN in your Dockerfile should be a CMD since the Apache server is the long running process that you want to run with the container state that you previously setup using all those RUN statements.
Another thing you should do is chain many of those consecutive RUN statements into one. Docker creates a separate layer for each RUN command, where each layer is kind of like a Git commit of the state of the container. So it is very wasteful to have so many RUN statements because it makes way too many container layers. You can do something like this ot chain RUN statements together instead to make a smaller, more efficient container:
RUN apt-get -y update && \
DEBIAN_FRONTEND=noninteractive apt-get -y install unzip wget mysql-client mysql-server apache2 libapache2-mod-php7.0 pwgen python-setuptools vim-tiny php7.0-mysql php7.0-lda && \
rm -fr /var/cashe/*files neeeded
I recommend reading through this guide from Docker that covers best practices for writing a Dockerfile: https://docs.docker.com/engine/userguide/eng-image/dockerfile_best-practices/#cmd

Docker refusing to run bash

I have the following docker setup:
python27.Dockerfile
FROM python:2.7
COPY ./entrypoint.sh /entrypoint.sh
RUN mkdir /src
RUN apt-get update && apt-get install -y bash libmysqlclient-dev python-pip build-essential && pip install virtualenv
ENTRYPOINT ["/entrypoint.sh"]
EXPOSE 8000
WORKDIR /src
CMD source /src/env/bin/activate && python /src/manage.py runserver
entrypoint.sh
#!/bin/bash
# some code here...
# some code here...
# some code here...
exec "$#"
Whenever I try to run my docker container I get python27 | /bin/sh: 1: source: not found.
I understand that the error comes from the fact that the command is run with sh instead of bash, but I can't understand why is that happening, given the fact that I have the correct shebang at the top of my entrypoint.
Any ideas why is that happening and how can I fix it?
The problem is that for CMD you're using the shell form that uses /bin/sh, and the /src/env/bin/activate likely contains a "source" command, which isn't available on POSIX /bin/sh (the equivalent builtin would be just .).
You must use the exec form for CMD using brackets:
CMD ["/bin/bash", "-c", "source /src/env/bin/activate && python /src/manage.py runserver"]
More details in:
https://docs.docker.com/engine/reference/builder/#run
https://docs.docker.com/engine/reference/builder/#cmd
https://docs.docker.com/engine/reference/builder/#entrypoint

How can I inspect the file system of a failed `docker build`?

I'm trying to build a new Docker image for our development process, using cpanm to install a bunch of Perl modules as a base image for various projects.
While developing the Dockerfile, cpanm returns a failure code because some of the modules did not install cleanly.
I'm fairly sure I need to get apt to install some more things.
Where can I find the /.cpanm/work directory quoted in the output, in order to inspect the logs? In the general case, how can I inspect the file system of a failed docker build command?
After running a find I discovered
/var/lib/docker/aufs/diff/3afa404e[...]/.cpanm
Is this reliable, or am I better off building a "bare" container and running stuff manually until I have all the things I need?
Everytime docker successfully executes a RUN command from a Dockerfile, a new layer in the image filesystem is committed. Conveniently you can use those layers ids as images to start a new container.
Take the following Dockerfile:
FROM busybox
RUN echo 'foo' > /tmp/foo.txt
RUN echo 'bar' >> /tmp/foo.txt
and build it:
$ docker build -t so-26220957 .
Sending build context to Docker daemon 47.62 kB
Step 1/3 : FROM busybox
---> 00f017a8c2a6
Step 2/3 : RUN echo 'foo' > /tmp/foo.txt
---> Running in 4dbd01ebf27f
---> 044e1532c690
Removing intermediate container 4dbd01ebf27f
Step 3/3 : RUN echo 'bar' >> /tmp/foo.txt
---> Running in 74d81cb9d2b1
---> 5bd8172529c1
Removing intermediate container 74d81cb9d2b1
Successfully built 5bd8172529c1
You can now start a new container from 00f017a8c2a6, 044e1532c690 and 5bd8172529c1:
$ docker run --rm 00f017a8c2a6 cat /tmp/foo.txt
cat: /tmp/foo.txt: No such file or directory
$ docker run --rm 044e1532c690 cat /tmp/foo.txt
foo
$ docker run --rm 5bd8172529c1 cat /tmp/foo.txt
foo
bar
of course you might want to start a shell to explore the filesystem and try out commands:
$ docker run --rm -it 044e1532c690 sh
/ # ls -l /tmp
total 4
-rw-r--r-- 1 root root 4 Mar 9 19:09 foo.txt
/ # cat /tmp/foo.txt
foo
When one of the Dockerfile command fails, what you need to do is to look for the id of the preceding layer and run a shell in a container created from that id:
docker run --rm -it <id_last_working_layer> bash -il
Once in the container:
try the command that failed, and reproduce the issue
then fix the command and test it
finally update your Dockerfile with the fixed command
If you really need to experiment in the actual layer that failed instead of working from the last working layer, see Drew's answer.
The top answer works in the case that you want to examine the state immediately prior to the failed command.
However, the question asks how to examine the state of the failed container itself. In my situation, the failed command is a build that takes several hours, so rewinding prior to the failed command and running it again takes a long time and is not very helpful.
The solution here is to find the container that failed:
$ docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
6934ada98de6 42e0228751b3 "/bin/sh -c './utils/" 24 minutes ago Exited (1) About a minute ago sleepy_bell
Commit it to an image:
$ docker commit 6934ada98de6
sha256:7015687976a478e0e94b60fa496d319cdf4ec847bcd612aecf869a72336e6b83
And then run the image [if necessary, running bash]:
$ docker run -it 7015687976a4 [bash -il]
Now you are actually looking at the state of the build at the time that it failed, instead of at the time before running the command that caused the failure.
Update for newer docker versions 20.10 onwards
Linux or macOS
DOCKER_BUILDKIT=0 docker build ...
Windows
# Command line
set DOCKER_BUILDKIT=0 docker build ...
# PowerShell
$env:DOCKER_BUILDKIT=0
Use
DOCKER_BUILDKIT=0 docker build ...
to get the intermediate container hashes as known from older versions.
On newer versions, Buildkit is activated per default. It is recommended to only use it for debugging purposes. Build Kit can make your build faster.
For reference:
Buildkit doesn't support intermediate container hashes: https://github.com/moby/buildkit/issues/1053
Thanks to #David Callanan and #MegaCookie for their inputs.
Docker caches the entire filesystem state after each successful RUN line.
Knowing that:
to examine the latest state before your failing RUN command, comment it out in the Dockerfile (as well as any and all subsequent RUN commands), then run docker build and docker run again.
to examine the state after the failing RUN command, simply add || true to it to force it to succeed; then proceed like above (keep any and all subsequent RUN commands commented out, run docker build and docker run)
Tada, no need to mess with Docker internals or layer IDs, and as a bonus Docker automatically minimizes the amount of work that needs to be re-done.
Currently with the latest docker-desktop, there isn't a way to opt out
of the new Buildkit, which doesn't support debugging yet (follow the
latest updates on this on this GitHub Thread:
https://github.com/moby/buildkit/issues/1472).
Find out at which line in your Dockerfile it is failing.
Add to the top of your Dockerfile: FROM xxx as debug
Add an additional target: FROM xxx as next just one line before the failing command (as you don't want to build that part). Example:
FROM xxx as debug
RUN echo "working command"
FROM xxx as next
RUN echoo "failing command"
Run docker build -f Dockerfile --target debug --tag debug .
Then you can debug the container with: docker run -it debug /bin/sh
You can quit the shell by pressing CTRL P + CTRL Q
If you want to use docker compose build instead of docker build it's possible by adding target: debug in your docker-compose.yml under build.
Then start the container by docker compose run xxxYourServiceNamexxx and use either:
The second top answer to find out how to run a shell inside the container.
Or add ENTRYPOINT /bin/sh before the FROM xxx as next line in your Dockerfile.
Debugging build step failures is indeed very annoying.
The best solution I have found is to make sure that each step that does real work succeeds, and adding a check after those that fails. That way you get a committed layer that contains the outputs of the failed step that you can inspect.
A Dockerfile, with an example after the # Run DB2 silent installer line:
#
# DB2 10.5 Client Dockerfile (Part 1)
#
# Requires
# - DB2 10.5 Client for 64bit Linux ibm_data_server_runtime_client_linuxx64_v10.5.tar.gz
# - Response file for DB2 10.5 Client for 64bit Linux db2rtcl_nr.rsp
#
#
# Using Ubuntu 14.04 base image as the starting point.
FROM ubuntu:14.04
MAINTAINER David Carew <carew#us.ibm.com>
# DB2 prereqs (also installing sharutils package as we use the utility uuencode to generate password - all others are required for the DB2 Client)
RUN dpkg --add-architecture i386 && apt-get update && apt-get install -y sharutils binutils libstdc++6:i386 libpam0g:i386 && ln -s /lib/i386-linux-gnu/libpam.so.0 /lib/libpam.so.0
RUN apt-get install -y libxml2
# Create user db2clnt
# Generate strong random password and allow sudo to root w/o password
#
RUN \
adduser --quiet --disabled-password -shell /bin/bash -home /home/db2clnt --gecos "DB2 Client" db2clnt && \
echo db2clnt:`dd if=/dev/urandom bs=16 count=1 2>/dev/null | uuencode -| head -n 2 | grep -v begin | cut -b 2-10` | chgpasswd && \
adduser db2clnt sudo && \
echo '%sudo ALL=(ALL) NOPASSWD:ALL' >> /etc/sudoers
# Install DB2
RUN mkdir /install
# Copy DB2 tarball - ADD command will expand it automatically
ADD v10.5fp9_linuxx64_rtcl.tar.gz /install/
# Copy response file
COPY db2rtcl_nr.rsp /install/
# Run DB2 silent installer
RUN mkdir /logs
RUN (/install/rtcl/db2setup -t /logs/trace -l /logs/log -u /install/db2rtcl_nr.rsp && touch /install/done) || /bin/true
RUN test -f /install/done || (echo ERROR-------; echo install failed, see files in container /logs directory of the last container layer; echo run docker run '<last image id>' /bin/cat /logs/trace; echo ----------)
RUN test -f /install/done
# Clean up unwanted files
RUN rm -fr /install/rtcl
# Login as db2clnt user
CMD su - db2clnt
In my case, I have to have:
DOCKER_BUILDKIT=1 docker build ...
and as mentioned by Jannis Schönleber in his answer, there is currently no debug available in this case (i.e. no intermediate images/containers get created).
What I've found I could do is use the following option:
... --progress=plain ...
and then add various RUN ... or additional lines on existing RUN ... to debug specific commands. This gives you what to me feels like full access (at least if your build is relatively fast).
For example, you could check a variable like so:
RUN echo "Variable NAME = [$NAME]"
If you're wondering whether a file is installed properly, you do:
RUN find /
etc.
In my situation, I had to debug a docker build of a Go application with a private repository and it was quite difficult to do that debugging. I've other details on that here.
If you are using docker-compose to build docker images try to add DOCKER_BUILDKIT=0 before the command to see the last successful layer id
DOCKER_BUILDKIT=0 docker-compose ...
This will temporarily disable DOCKER_BUILDKIT for the command only.
Having the last layer id you can connect to it using the command from the top answer
docker run --rm -it LAST_LAYER_ID sh
my solution would be to see what step failed in the docker file, RUN bundle install in my case,
and change it to
RUN bundle install || cat <path to the file containing the error>
This has the double effect of printing out the reason for the failure, AND this intermediate step is not figured as a failed one by docker build. so it's not deleted, and can be inspected via:
docker run --rm -it <id_last_working_layer> bash -il
in there you can even re run your failed command and test it live.
What I would do is comment out the Dockerfile below and including the offending line. Then you can run the container and run the docker commands by hand, and look at the logs in the usual way. E.g. if the Dockerfile is
RUN foo
RUN bar
RUN baz
and it's dying at bar I would do
RUN foo
# RUN bar
# RUN baz
Then
$ docker build -t foo .
$ docker run -it foo bash
container# bar
...grep logs...
Still using BuildKit, as in Alexis Wilke's answer, you can use ktock/buildg.
See "Interactive debugger for Dockerfile" from Kohei Tokunaga
buildg is a tool to interactively debug Dockerfile based on BuildKit.
Source-level inspection
Breakpoints and step execution
Interactive shell on a step with your own debugigng tools
Based on BuildKit (needs unmerged patches)
Supports rootless
Example:
$ buildg.sh debug --image=ubuntu:22.04 /tmp/ctx
WARN[2022-05-09T01:40:21Z] using host network as the default
#1 [internal] load .dockerignore
#1 transferring context: 2B done
#1 DONE 0.1s
#2 [internal] load build definition from Dockerfile
#2 transferring dockerfile: 195B done
#2 DONE 0.1s
#3 [internal] load metadata for docker.io/library/busybox:latest
#3 DONE 3.0s
#4 [build1 1/2] FROM docker.io/library/busybox#sha256:d2b53584f580310186df7a2055ce3ff83cc0df6caacf1e3489bff8cf5d0af5d8
#4 resolve docker.io/library/busybox#sha256:d2b53584f580310186df7a2055ce3ff83cc0df6caacf1e3489bff8cf5d0af5d8 0.0s done
#4 sha256:50e8d59317eb665383b2ef4d9434aeaa394dcd6f54b96bb7810fdde583e9c2d1 772.81kB / 772.81kB 0.2s done
Filename: "Dockerfile"
2| RUN echo hello > /hello
3|
4| FROM busybox AS build2
=> 5| RUN echo hi > /hi
6|
7| FROM scratch
8| COPY --from=build1 /hello /
>>> break 2
>>> breakpoints
[0]: line 2
>>> continue
#4 extracting sha256:50e8d59317eb665383b2ef4d9434aeaa394dcd6f54b96bb7810fdde583e9c2d1 0.0s done
#4 DONE 0.3s
...

Resources