Hyperledger Fabric chaincode not updated - go

I am trying to follow along this sample Hyperledger Fabric code: https://github.com/hyperledger/education/tree/master/LFS171x/fabric-material
Initially I replaced chaincode/tuna-app/tuna-chaincode.go with my go file chaincode/tuna-app/test.go. test.go had changes just in what we initialize in the ledger through its initLedger function call. It worked fine, with no changes required in tuna-app/.startFabric.sh.
Now when I try to change the ledger again through its initLedger function call, nothing happens. Even if I comment out the function itself, the ledger still shows the old content.
How do I update my chaincode with visible changes?
startFabric.sh contains the following code:
set -e
# don't rewrite paths for Windows Git Bash users
export MSYS_NO_PATHCONV=1
starttime=$(date +%s)
if [ ! -d ~/.hfc-key-store/ ]; then
mkdir ~/.hfc-key-store/
fi
# launch network; create channel and join peer to channel
cd ../basic-network
./start.sh
# Now launch the CLI container in order to install, instantiate chaincode
# and prime the ledger with our 10 tuna catches
docker-compose -f ./docker-compose.yml up -d cli
docker exec -e "CORE_PEER_LOCALMSPID=Org1MSP" -e "CORE_PEER_MSPCONFIGPATH=/opt/gopath/src/github.com/hyperledger/fabric/peer/crypto/peerOrganizations/org1.example.com/users/Admin@org1.example.com/msp" cli peer chaincode install -n tuna-app -v 1.0 -p github.com/test-app
docker exec -e "CORE_PEER_LOCALMSPID=Org1MSP" -e "CORE_PEER_MSPCONFIGPATH=/opt/gopath/src/github.com/hyperledger/fabric/peer/crypto/peerOrganizations/org1.example.com/users/Admin@org1.example.com/msp" cli peer chaincode instantiate -o orderer.example.com:7050 -C mychannel -n tuna-app -v 1.0 -c '{"Args":[""]}' -P "OR ('Org1MSP.member','Org2MSP.member')"
sleep 10
docker exec -e "CORE_PEER_LOCALMSPID=Org1MSP" -e "CORE_PEER_MSPCONFIGPATH=/opt/gopath/src/github.com/hyperledger/fabric/peer/crypto/peerOrganizations/org1.example.com/users/Admin@org1.example.com/msp" cli peer chaincode invoke -o orderer.example.com:7050 -C mychannel -n tuna-app -c '{"function":"initLedger","Args":[""]}'
printf "\nTotal execution time : $(($(date +%s) - starttime)) secs ...\n\n"
printf "\nStart with the registerAdmin.js, then registerUser.js, then server.js\n\n"
I tried by adding the following line after peer chaincode instantiate :
docker exec -e "CORE_PEER_LOCALMSPID=Org1MSP" -e "CORE_PEER_MSPCONFIGPATH=/opt/gopath/src/github.com/hyperledger/fabric/peer/crypto/peerOrganizations/org1.example.com/users/Admin@org1.example.com/msp" cli peer chaincode upgrade -o orderer.example.com:7050 -C mychannel -n tuna-app -c '{"function":"initLedger","Args":[""]}'
But it gives the following error:
Error: Chaincode version is not provided for upgrade
When I change upgrade statement to:
docker exec -e "CORE_PEER_LOCALMSPID=Org1MSP" -e "CORE_PEER_MSPCONFIGPATH=/opt/gopath/src/github.com/hyperledger/fabric/peer/crypto/peerOrganizations/org1.example.com/users/Admin@org1.example.com/msp" cli peer chaincode upgrade -o orderer.example.com:7050 -C mychannel -n tuna-app -v 1.0 -c '{"function":"initLedger","Args":[""]}'
Error changes to:
Error: Error endorsing chaincode: rpc error: code = Unknown desc = chaincode error (status: 500, message: version already exists for chaincode with name 'tuna-app')

To make the chaincode changes take effect, I took the following steps:
1. Stop all the containers:
docker stop $(docker ps -aq)
2. Delete all the containers:
docker rm -f $(docker ps -aq)
3. Run docker images and find the chaincode image. It is listed among the other Hyperledger images and looks like this:
REPOSITORY: dev-peer0.org1.example.com-tuna-app-1.0-b58eb592ed6ced10f52cc063bda0c303a4272089a3f9a99000d921f94b9bae9b
TAG: latest
IMAGE ID: 0919d7c15f0a
CREATED: 3 minutes ago
SIZE: 172MB
4. Delete it using its image ID:
docker rmi 0919d7c15f0a
5. Run the fabric again using ./startFabric.sh, then npm install, node registerAdmin.js, node registerUser.js and node server.js. It should work.
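The cleanup steps above could be scripted roughly as follows. This is a minimal sketch and not part of the sample repository: the function names are hypothetical, and it assumes chaincode images follow the dev-peer... naming shown in the docker images output. Like the manual steps, it stops and removes every container on the host.

```shell
#!/bin/bash
# Hypothetical helper: keep only repository names that look like
# chaincode images built by a peer (they start with "dev-peer").
filter_chaincode_images() {
  grep '^dev-peer' || true
}

# Sketch of the manual cleanup: stop and remove ALL containers on this
# host, then remove the chaincode images so they are rebuilt on next start.
clean_chaincode() {
  containers="$(docker ps -aq)"
  if [ -n "$containers" ]; then
    # Unquoted on purpose so each container ID becomes its own argument.
    docker stop $containers
    docker rm -f $containers
  fi
  docker images --format '{{.Repository}}' | filter_chaincode_images |
    while read -r image; do
      docker rmi "$image"
    done
}
```

Calling clean_chaincode before ./startFabric.sh should leave no stale chaincode image behind, so the peer has to build the image again from the current source.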

I would guess you already have version 1.0 installed, and that's why it's complaining that it already exists. Try it with 1.1 or 2.0, for example by using -v 2.0 instead of -v 1.0.

It gets pretty tricky once you miss the sequence.
As far as I know, deploying chaincode in HLF is a two-step process.
Step 1. Transfer the source code to the peer (each chaincode gets a chaincode ID that is roughly a function of its name, path and version). The package is signed with your keys and sent to all the peers you have chosen as targets. (This step is called installation.)
Step 2. The source code is compiled with all the vendored libraries (I'm talking about Go chaincode here, and assuming it is the same for the other languages). A Docker image is built and a container is created from that binary. (This part is known as instantiation, and it becomes an upgrade if the chaincode was already instantiated earlier.)
Step 1 requires the chaincode to be unique. If you have already installed it once and want to install it again, make sure you change the version number, for example from abc-1.0 to abc-2.0. That gets you past the installation step.
Once installation succeeds, the remaining question is whether to instantiate or to upgrade. If this chaincode has already been instantiated and run, the right step is to upgrade.
The alternative is what you did: clean up and start fresh. That works fine for development but not for production, because your data goes "poof" with the cleanup.
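The install-then-upgrade sequence could look roughly like this for the tuna-app chaincode from the question. It is a sketch under assumptions: next_version, peer_exec and upgrade_chaincode are hypothetical helper names, the names, paths and policy are copied from the question's startFabric.sh, and the version is assumed to be a plain MAJOR.MINOR pair of decimal numbers.

```shell
#!/bin/bash
# Admin identity path, taken from the question's startFabric.sh.
MSP_PATH="/opt/gopath/src/github.com/hyperledger/fabric/peer/crypto/peerOrganizations/org1.example.com/users/Admin@org1.example.com/msp"

# Hypothetical helper: bump the minor part of a MAJOR.MINOR version,
# e.g. 1.0 -> 1.1. Assumes both parts are plain decimal numbers.
next_version() {
  major="${1%%.*}"
  minor="${1##*.}"
  echo "${major}.$((minor + 1))"
}

# Run a peer command inside the cli container as the Org1 admin.
peer_exec() {
  docker exec -e "CORE_PEER_LOCALMSPID=Org1MSP" \
    -e "CORE_PEER_MSPCONFIGPATH=${MSP_PATH}" cli peer "$@"
}

# Install the changed source under a new version, then upgrade the
# chaincode that is already instantiated on the channel.
upgrade_chaincode() {
  new_version="$(next_version "$1")"
  peer_exec chaincode install -n tuna-app -v "$new_version" -p github.com/test-app
  peer_exec chaincode upgrade -o orderer.example.com:7050 -C mychannel \
    -n tuna-app -v "$new_version" -c '{"Args":[""]}' \
    -P "OR ('Org1MSP.member','Org2MSP.member')"
}
```

With 1.0 currently instantiated, upgrade_chaincode 1.0 would install and upgrade to 1.1 without wiping the ledger; the initLedger invoke from the script can then be run again if the seed data changed.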

When developing chaincode on Hyperledger Fabric, there are two options.
1: Remove the chaincode Docker image before testing each change.
For example, if the installed chaincode is named mycc:
#remove containers
docker rm -f $(docker ps -aq)
#remove the chaincode image
docker rmi <ID or name of the mycc-0 chaincode image>
2: Install the updated chaincode under a new name. If mycc is currently running, install it as mycc1 and use mycc1 for your transactions from then on.
For example:
#A chaincode named mycc is already installed.
#The following command installs the same (updated) chaincode under the name
#mycc1
docker exec cli peer chaincode install -n mycc1 -v 0 -p github.com/sacc
Note: you now need to instantiate, invoke and query the chaincode under the name mycc1.

Docker: How to ADD a service via ENV variables?

I have built a Docker Cron Environment to run Cronjobs based on alseambusher/crontab-ui using alpine:3.15.3 & it works great.
For it to work I have had to install a number of things via the Dockerfile, editing it & adding python so it could run a python script, perl for another service, openssl so I could use a Self-signed certificate, etc.
As it stands the Container is a lot bigger, which is fine, but if I am to share the container others won't necessarily want or need the services I have added & will likely need other that I haven't.
I would like to be able to add a command in the ENV of a Docker Compose to add services at startup without having to do a full build each time. I'm sure it would be simpler to add build:>args: & have it rebuild the container each startup, but my goal is to have it add to an image only the services that each user needs & declares in the Docker-Compose with no need to have the files for the build on the system.
I know this will mean a longer startup depending on the services, I'm okay with that.
I know it's normal to run cron on the host & have it call into containers, but cron on Windows WSL has to be manually started every time the WSL starts & is easy to forget about & can't really be automated aside from on startup, & I'd like to do this entirely inside Docker.
How can I add an ENV like SERVICE_INSTALL to have it run in BASH (which is already added in the Dockerfile & present at /bin/bash) at container startup?
Ideally I'd like to be able to add multiple SERVICE_INSTALL lines if at all possible.
Example:
SERVICE_INSTALL1='apk add --update --no-cache python3 && ln -sf python3 /usr/bin/python'
SERVICE_INSTALL2='python3 -m ensurepip'
SERVICE_INSTALL3='apk add --no-cache perl perl-html-parser perl-http-cookies perl-lwp-useragent-determined perl-json perl-json-xs'
Or, if nothing else:
SERVICE_INSTALL=apk add --update --no-cache python3 && ln -sf python3 /usr/bin/python && perl perl-html-parser perl-http-cookies perl-lwp-useragent-determined perl-json perl-json-xs && && wget && curl && nodejs && npm
but then that leaves the problem of installing things through pip or npm.
I have tried adding a command: to the Docker-Compose but every variation I have tried does not work. I'm also concerned with this method as from my understanding a command: replaces the startup script in the container, not adds to it, so that is not ideal, regardless, it doesn't seem like an install command: is possible anyway
I have tried: (Each as a single command: not together)
command:
- BASH apk --update add openssl
- /bin/bash apk --update add openssl
- BASH RUN apk --update add openssl
- /bin/bash RUN apk --update add openssl
- sh apk --update add openssl
- /bin/sh apk --update add openssl
- apk --update add openssl
Each ends with a message along the lines of Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: exec: "/bin/bash run apk --update add openssl": stat /bin/bash run apk --update add openssl: no such file or directory: unknown
UPDATE: I discovered a few things trying to get this to work
for command: to work there needs to not be any - before it
anything, even on multiple lines, is considered a single command essentially as though they were all on the same line & have to be separated with an &&
it will repeat the command or show the error of it failing to execute the command & not continue to next until it is completed.
for example the command mkdir -p /test leaves no logs, but the container never actually starts. While portainer says it's running trying to bash into it gives a is restarting, wait until the container is running message
mkdir "-p /test" repeats this message
mkdir: unrecognized option:
BusyBox v1.34.1 (2022-02-02 18:21:20 UTC) multi-call binary.
Usage: mkdir [-m MODE] [-p] DIRECTORY...
Create DIRECTORY
-m MODE Mode
-p No error if exists; make parent directories as needed
3 times 3-4 seconds apart, then 7 seconds, then 8 seconds, then 15 seconds, 27 seconds, 53 seconds, then hits a minute & continues to grow a few seconds each try.
It also returns the same wait for the container to be running message when trying to bash in
mkdir -p "/test" seems to be the correct formatting, it appears to work but leaves no logs & when attempting to bash in it connects, shows the terminal, then exits, attempting to reconnect shows the same container is restarting message, likely because the container stopped once the command was finished & is set to restart: always. commenting out the restart command the container exits.
mkdir -p "/test" followed by a new line with supervisord -c /etc/supervisord.conf (the default start command) has mkdir reporting mkdir: unrecognized option: c
adding "supervisord -c /etc/supervisord.conf" leaves no logs & a restarting container.
reversing the order, with supervisord -c /etc/supervisord.conf 1st has supervisord reporting the error Error: positional arguments are not supported: ['mkdir', '-p', '/test'] For help, use /usr/bin/supervisord -h
bash -c "supervisord -c /etc/supervisord.conf with a new line & && mkdir -p /test with a new line & && mkdir -p /test2" runs with a working container, but no directories created
reversing the order seems to work & creates the directories, with a running container
command:
bash -c "mkdir -p /test
&& mkdir -p /test2
&& supervisord -c /etc/supervisord.conf"
Which indicates that it will run them in order, but only proceeds to the next after the one finishes.
a test confirmed that the same can be done with other dependencies so long as the initial startup is last. I'd rather have the container start 1st, then install the dependencies while it is running as they are not required for the container itself to run, but rather are added for use in the cronjobs that will be running on a schedule, so if the container starts & the dependencies cannot be used for the 1st 2, 3, even 5 or 10 minutes that might only affect their 1st attempt if it happens to be in that time.
This is alright, I now understand better how the command: option works, but it still requires users to know & properly include the default start command. The command: options are also a lot more particular & easy to get wrong, while ENV variables are something every docker user knows, has experience with, & is simpler to implement
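One way the ENV-driven install described above might work is a small wrapper script used as the container's entrypoint: it runs every SERVICE_INSTALL* variable, then hands over to the default start command. This is only a sketch under assumptions: run_service_installs is a hypothetical name, the wrapper would have to be baked into the image once, each variable's value must be a single-line command, and variables run in plain sorted order (so SERVICE_INSTALL10 sorts before SERVICE_INSTALL2).

```shell
#!/bin/bash
# Hypothetical entrypoint wrapper: run each SERVICE_INSTALL* variable as a
# shell command, in sorted name order, before the main process starts.
run_service_installs() {
  names="$(env | sed -n 's/^\(SERVICE_INSTALL[A-Za-z0-9_]*\)=.*/\1/p' | sort)"
  for name in $names; do
    # Look up the value of the variable whose name is stored in $name.
    eval "cmd=\${$name}"
    echo "Running $name: $cmd" >&2
    # A failed install is reported but does not stop the container.
    sh -c "$cmd" || echo "$name failed" >&2
  done
}

# At the end of the wrapper, install and then hand over to the default
# start command quoted in the question:
#   run_service_installs
#   exec supervisord -c /etc/supervisord.conf
```

Users would then only declare SERVICE_INSTALL1, SERVICE_INSTALL2 and so on under environment: in their Docker Compose file. To let the container come up first and install while it runs, the wrapper could call run_service_installs & in the background before the exec line.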

Docker MinIO entrypoint

I have this project which was initially set up on Mac, I'm on Windows, it's a Docker project which runs Node, Kafka and a few other containers, one of them being MinIO. Everything works as intended except MinIO, I get the following error:
createbuckets_1 | /bin/sh: nc: command not found
Docker-compose code:
createbuckets:
  image: minio/mc
  networks:
    - localnet
  depends_on:
    - minio
  entrypoint: >
    /bin/sh -c "
    while ! nc -zv minio 9000; do echo 'Wait minio to startup...' && sleep 0.1; done; sleep 5;
    /usr/bin/mc config host add myminio http://minio:9000 X X;
    /usr/bin/mc rm -r --force myminio/cronify/details;
    /usr/bin/mc mb myminio/cronify/details;
    /usr/bin/mc policy set download myminio/cronify/details;
    exit 0;"
The X placeholders are where the credentials go.
I have been trying to find a fix for weeks.
I have also tried to change the entrypoint from /bin/sh -c to /bin/bash -c or #!/bin/bash -c or #!/bin/sh -c, I get the same error except ".../bin/bash: nc: command not found".
Dockerfile contains:
FROM confluentinc/cp-kafka-connect
I am not entirely sure what you are asking here, but if you are asking about the error message itself, it is telling you that nc is not installed (minimal container images usually do not include it). I am also not clear on which container minio is running in. Assuming it is pulled from minio/minio, it will have curl installed, and you can use the health check endpoint instead of nc: https://docs.min.io/minio/baremetal/monitoring/healthcheck-probe.html#minio-healthcheck-api. If it is not a minio container, you just need to make sure it has curl installed (or nc, if for some reason you are set on using that).
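The wait loop from the question could be rewritten around the health check along these lines. A minimal sketch: wait_for is a hypothetical helper, the /minio/health/live path is the liveness probe from the MinIO health check documentation linked above, and it assumes curl is actually available in the container that runs the loop, which should be verified for the image in use.

```shell
#!/bin/sh
# Hypothetical helper: retry a command up to N times, sleeping DELAY
# seconds between attempts. Returns 0 on the first success, 1 otherwise.
wait_for() {
  attempts="$1"
  delay="$2"
  shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Intended use inside the entrypoint (assumes curl exists in the image):
#   wait_for 100 0.1 curl -sf http://minio:9000/minio/health/live
```

Unlike the original while loop, this gives up after a bounded number of attempts, so a MinIO container that never starts shows up as a failure instead of an endless wait.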

Hyperledger Composer Error Identity has not been registered once issued after restart

I am using Hyperledger Composer 0.16.0 and I want to persist data to a database so that it can still be used after a restart, so I am using loopback-connector-mongodb.
Context
I have been following this tutorial and I am able to complete it.
I set up Fabric with the following steps:
cd ${HOME}/fabric-tools/
./stopFabric.sh
./teardownFabric.sh
./downloadFabric.sh
./startFabric.sh
cd ${HOME}/tmt/Profile/
composer card create -p connection.json -u PeerAdmin -c Admin@org1.example.com-cert.pem -k 114aab0e76bf0c78308f89efc4b8c9423e31568da0c340ca187a9b17aa9a4457_sk -r PeerAdmin -r ChannelAdmin
composer card import -f PeerAdmin@fabric-network.card
composer runtime install -c PeerAdmin@fabric-network -n dam-network
cd ../dam-network/
# added model.cto file below
composer archive create -t dir -n .
composer network start -c PeerAdmin@fabric-network -a dam-network@0.0.1.bna -A admin -S adminpw
composer card import -f admin@dam-network.card
composer network ping -c admin@dam-network
chmod -R 777 ${HOME}/.composer
## onetime setup using npm install -g loopback-connector-mongodb
docker run -d --name mongo --network composer_default -p 27017:27017 mongo
cd ${HOME}/tmt/docker
docker build -t myorg/my-composer-rest-server .
#Which is attached below
source envvars.txt
docker run \
-d \
-e COMPOSER_CARD=${COMPOSER_CARD} \
-e COMPOSER_NAMESPACES=${COMPOSER_NAMESPACES} \
-e COMPOSER_AUTHENTICATION=${COMPOSER_AUTHENTICATION} \
-e COMPOSER_MULTIUSER=${COMPOSER_MULTIUSER} \
-e COMPOSER_PROVIDERS="${COMPOSER_PROVIDERS}" \
-e COMPOSER_DATASOURCES="${COMPOSER_DATASOURCES}" \
-v ~/.composer:/home/composer/.composer \
--name rest \
--network composer_default \
-p 3000:3000 \
myorg/my-composer-rest-server
I issue a new identity to an existing participant and create a business card for this identity with the following commands:
composer participant add -c admin@dam-network -d ' {"$class": "com.asset.tmt.User","userId": "tmtadmin","email": "tmtadmin@gmail.com","firstName": "TMT","lastName": "Admin","userGroup": "peerAdmin"} '
composer identity issue -u tmtadmin -a com.asset.tmt.User#tmtadmin -c admin@dam-network
composer card import -f tmtadmin@dam-network.card
Then, I import that business card via POST /wallet/import and I am able to call different REST API operations. After that, I stop the composer-rest-server and after a few minutes I start the composer-rest-server again with the commands 
cd ${HOME}/fabric-tools/
./startFabric.sh
docker start mongo rest
The above commands do not work, so I kill the rest container and run it again with the commands below. Correct me if I am wrong.
docker stop rest
docker rm rest
docker run \
-d \
-e COMPOSER_CARD=${COMPOSER_CARD} \
-e COMPOSER_NAMESPACES=${COMPOSER_NAMESPACES} \
-e COMPOSER_AUTHENTICATION=${COMPOSER_AUTHENTICATION} \
-e COMPOSER_MULTIUSER=${COMPOSER_MULTIUSER} \
-e COMPOSER_PROVIDERS="${COMPOSER_PROVIDERS}" \
-e COMPOSER_DATASOURCES="${COMPOSER_DATASOURCES}" \
-v ~/.composer:/home/composer/.composer \
--name rest \
--network composer_default \
-p 3000:3000 \
myorg/my-composer-rest-server
Then, I authenticate to the REST API using the configured authentication mechanism (in my case the passport-github strategy). If I try to call a REST API operation, it throws an "A business network card has not been specified" error. I then import the previous business card via POST /wallet/import and get a No Content response, which is supposed to be correct.
Finally, when I try to call another REST API operation I get the following error:
{
"error": {
"statusCode": 500,
"name": "Error",
"message": "Error trying login and get user Context. Error: Error trying to enroll user or load channel configuration. Error: Enrollment failed with errors [[{\"code\":400,\"message\":\"Authorization failure\"}]]",
"stack": "Error: Error trying login and get user Context. Error: Error trying to enroll user or load channel configuration. Error: Enrollment failed with errors [[{\"code\":400,\"message\":\"Authorization failure\"}]]\n at client.getUserContext.then.then.catch (/home/composer/.npm-global/lib/node_modules/composer-rest-server/node_modules/composer-connector-hlfv1/lib/hlfconnection.js:305:34)\n at <anonymous>\n at process._tickDomainCallback (internal/process/next_tick.js:228:7)"
}
}
Expected Behavior
This should work even after restart
Actual Behavior
This is the main issue, I don't know why my identity is not being recognized by the REST API if I used it previously to call some operations.
Your Environment
* Version used: 0.16.0
* Environment name and version (e.g. Chrome 39, node.js 5.4): chrome latest and node.js 8.9.1
* Operating System and version (desktop or mobile): Ubuntu desktop
My envvars.txt
COMPOSER_CARD=admin@dam-network
COMPOSER_NAMESPACES=never
COMPOSER_AUTHENTICATION=true
COMPOSER_MULTIUSER=true
COMPOSER_PROVIDERS='{
"github": {
"provider": "github",
"module": "passport-github",
"clientID": "xxxxxxxxxxxxx",
"clientSecret": "xxxxxxxxxxxxxxxxxxxxx",
"authPath": "/auth/github",
"callbackURL": "/auth/github/callback",
"successRedirect": "/",
"failureRedirect": "/"
}
}'
COMPOSER_DATASOURCES='{
"db": {
"name": "db",
"connector": "mongodb",
"host": "10.142.0.10"
}
}'
model.cto
/**
* Model Definitions
*/
namespace com.asset.tmt
participant User identified by userId {
o String userId
o String email
o String firstName
o String lastName
o String userGroup
}
asset Asset identified by assetId {
o String assetId
o String name
o String creationDate
o String expiryDate
}
transaction ChangeAssetValue {
o String expiryDate
o String assetId
o String userId
}
update:
After following what @R Thatcher suggested, when I issue docker-compose start, it starts the Fabric network but not the business network that was deployed earlier.
tmt@blockchain:~/tmt/dam-network$ docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
8a6833bd7d3a myorg/my-composer-rest-server "pm2-docker compos..." 17 hours ago Exited (0) 10 hours ago rest
9bffab63a048 mongo "docker-entrypoint..." 17 hours ago Exited (0) 10 hours ago mongo
5bafb4dd5662 dev-peer0.org1.example.com-dam-network-0.16.0-4a77c4c8eabde9e440464f91b1655a48c6c5e0dac908e36a7b437034152bf141 "chaincode -peer.a..." 17 hours ago Exited (0) 4 minutes ago dev-peer0.org1.example.com-dam-network-0.16.0
4bfc67f13811 hyperledger/fabric-peer:x86_64-1.0.4 "peer node start -..." 17 hours ago Up 6 minutes 0.0.0.0:7051->7051/tcp, 0.0.0.0:7053->7053/tcp peer0.org1.example.com
762a42bc0eb7 hyperledger/fabric-orderer:x86_64-1.0.4 "orderer" 17 hours ago Up 6 minutes 0.0.0.0:7050->7050/tcp orderer.example.com
49c925a8cc43 hyperledger/fabric-couchdb:x86_64-1.0.4 "tini -- /docker-e..." 17 hours ago Up 6 minutes 4369/tcp, 9100/tcp, 0.0.0.0:5984->5984/tcp couchdb
cee51891308f hyperledger/fabric-ca:x86_64-1.0.4 "sh -c 'fabric-ca-..." 17 hours ago Up 6 minutes 0.0.0.0:7054->7054/tcp ca.org1.example.com
What is the correct way to bring it up?
1)When I try to start network by issuing below command
tmt@blockchain:~/tmt/dam-network$ composer network start -c PeerAdmin@fabric-network -a dam-network@0.0.1.bna -A admin -S adminpw
Starting business network from archive: dam-network@0.0.1.bna
Business network definition:
Identifier: dam-network@0.0.1
Description: Blockchain dam integration
Processing these Network Admins:
userName: admin
✖ Starting business network definition. This may take a minute...
Error: Error trying to instantiate composer runtime. Error: No valid responses from any peers.
Response from attempted peer comms was an error: Error: chaincode error (status: 500, message: chaincode exists dam-network)
Command failed
2) When I try to start docker container manually by issuing docker start container I still see it is not up.
The startFabric.sh does more than just start the Fabric - it actually removes your Containers and recreates new Containers from the Docker Images. The impact of this is that you lose all your data and your Business Network from the Fabric.
If you want to stop and start your Fabric after you have created it you need to change to the directory where the docker-compose.yml file is (in my case /home/rob/fabric-tools/fabric-scripts/hlfv1/composer)
Run docker-compose stop to stop the Fabric Containers and docker-compose start to restart where you left off. It is necessary to be in the correct folder before using the docker-compose command.

How can I run a docker container and commit the changes once a script completes?

I want to set up a cron job to run a set of commands inside a docker container and then commit the changes to the docker image. I'm able to run the container as a daemon and get the container ID using this command:
CONTAINER_ID=$(sudo docker run -d my-image /bin/sh -c "sleep 10")
but I'm having trouble with the second part--committing the changes to the image once the sleep 10 command completes. Is there a way for me to tell when the docker container is about to be killed and run another command before it is?
EDIT: As an alternative, is there a way to trigger ctrl-p-q via a shell script in the container to leave the container running but return to the host?
There are two ways to persist container data:
Docker volumes
Docker commit
The steps below use docker commit:
a) create container from ubuntu image and run a bash terminal.
$ docker run -i -t ubuntu:14.04 /bin/bash
b) Inside the terminal install curl
# apt-get update
# apt-get install curl
c) Exit the container terminal
# exit
d) Take a note of your container id by executing following command :
$ docker ps -a
e) save container as new image
$ docker commit <container_id> new_image_name:tag_name(optional)
f) verify that you can see your new image with curl installed.
$ docker images
$ docker run -it new_image_name:tag_name bash
# which curl
/usr/bin/curl
Run it in the foreground, not as a daemon. When the container exits, control returns to the script that launched it, which can then commit and push the image.
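If the container has to stay detached as in the question, docker wait offers a way to block until it exits before committing. A rough sketch, where run_and_commit and the image names in the usage line are hypothetical:

```shell
#!/bin/bash
# Hypothetical helper: start a detached container, block until it exits,
# and commit it to TARGET only if the command inside succeeded.
# Usage: run_and_commit IMAGE TARGET COMMAND [ARGS...]
run_and_commit() {
  image="$1"
  target="$2"
  shift 2
  container_id="$(docker run -d "$image" "$@")" || return 1
  # docker wait blocks until the container stops and prints its exit code.
  exit_code="$(docker wait "$container_id")" || return 1
  if [ "$exit_code" -ne 0 ]; then
    echo "container exited with $exit_code; not committing" >&2
    return 1
  fi
  docker commit "$container_id" "$target"
}

# Example for a cron job:
#   run_and_commit my-image my-image:latest /bin/sh -c "sleep 10"
```

Because the commit happens after the container has stopped, there is no need to intercept the moment before it is killed.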
I didn't find any of these answers satisfying, as my goal was to 1) launch a container, 2) run a setup script, and 3) capture/store the state after setup, so I can instantly run various scripts against that state later. And all in a local, automated, continuous integration environment (e.g. scripted and non-interactive).
Here's what I came up with (and I run this in Travis-CI install section) for setting up my test environment:
#!/bin/bash
# Run a docker with the env boot script
docker run ubuntu:14.04 /path/to/env_setup_script.sh
# Get the container ID of the last run docker (above)
export CONTAINER_ID=`docker ps -lq`
# Commit the container state (returns an image_id with sha256: prefix cut off)
# and write the IMAGE_ID to disk at ~/.docker_image_id
(docker commit $CONTAINER_ID | cut -c8-) > ~/.docker_image_id
Note that my base image was ubuntu:14.04 but yours could be any image you want.
With that setup, now I can run any number of scripts (e.g. unit tests) against this snapshot (for Travis, these are in my script section). e.g.:
docker run `cat ~/.docker_image_id` /path/to/unit_test_1.sh
docker run `cat ~/.docker_image_id` /path/to/unit_test_2.sh
Try this if you want to automatically commit every running container. You can put it in a cron job or similar:
#!/bin/bash
for i in `docker ps|tail -n +2|awk '{print $1}'`; do docker commit -m "commit new change" $i; done

How can I inspect the file system of a failed `docker build`?

I'm trying to build a new Docker image for our development process, using cpanm to install a bunch of Perl modules as a base image for various projects.
While developing the Dockerfile, cpanm returns a failure code because some of the modules did not install cleanly.
I'm fairly sure I need to get apt to install some more things.
Where can I find the /.cpanm/work directory quoted in the output, in order to inspect the logs? In the general case, how can I inspect the file system of a failed docker build command?
After running a find I discovered
/var/lib/docker/aufs/diff/3afa404e[...]/.cpanm
Is this reliable, or am I better off building a "bare" container and running stuff manually until I have all the things I need?
Every time Docker successfully executes a RUN command from a Dockerfile, a new layer in the image filesystem is committed. Conveniently, you can use those layer IDs as images to start a new container.
Take the following Dockerfile:
FROM busybox
RUN echo 'foo' > /tmp/foo.txt
RUN echo 'bar' >> /tmp/foo.txt
and build it:
$ docker build -t so-26220957 .
Sending build context to Docker daemon 47.62 kB
Step 1/3 : FROM busybox
---> 00f017a8c2a6
Step 2/3 : RUN echo 'foo' > /tmp/foo.txt
---> Running in 4dbd01ebf27f
---> 044e1532c690
Removing intermediate container 4dbd01ebf27f
Step 3/3 : RUN echo 'bar' >> /tmp/foo.txt
---> Running in 74d81cb9d2b1
---> 5bd8172529c1
Removing intermediate container 74d81cb9d2b1
Successfully built 5bd8172529c1
You can now start a new container from 00f017a8c2a6, 044e1532c690 and 5bd8172529c1:
$ docker run --rm 00f017a8c2a6 cat /tmp/foo.txt
cat: /tmp/foo.txt: No such file or directory
$ docker run --rm 044e1532c690 cat /tmp/foo.txt
foo
$ docker run --rm 5bd8172529c1 cat /tmp/foo.txt
foo
bar
Of course, you might want to start a shell to explore the filesystem and try out commands:
$ docker run --rm -it 044e1532c690 sh
/ # ls -l /tmp
total 4
-rw-r--r-- 1 root root 4 Mar 9 19:09 foo.txt
/ # cat /tmp/foo.txt
foo
When one of the Dockerfile commands fails, look for the ID of the preceding layer and run a shell in a container created from that ID:
docker run --rm -it <id_last_working_layer> bash -il
Once in the container:
try the command that failed, and reproduce the issue
then fix the command and test it
finally update your Dockerfile with the fixed command
If you really need to experiment in the actual layer that failed instead of working from the last working layer, see Drew's answer.
The top answer works in the case that you want to examine the state immediately prior to the failed command.
However, the question asks how to examine the state of the failed container itself. In my situation, the failed command is a build that takes several hours, so rewinding prior to the failed command and running it again takes a long time and is not very helpful.
The solution here is to find the container that failed:
$ docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
6934ada98de6 42e0228751b3 "/bin/sh -c './utils/" 24 minutes ago Exited (1) About a minute ago sleepy_bell
Commit it to an image:
$ docker commit 6934ada98de6
sha256:7015687976a478e0e94b60fa496d319cdf4ec847bcd612aecf869a72336e6b83
And then run the image [if necessary, running bash]:
$ docker run -it 7015687976a4 [bash -il]
Now you are actually looking at the state of the build at the time that it failed, instead of at the time before running the command that caused the failure.
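Those steps can be combined into a small helper. A minimal sketch: snapshot_last_container is a hypothetical name, it assumes the failed build step is the most recently created container (docker ps -lq), and it only applies when the builder leaves that container behind.

```shell
#!/bin/bash
# Hypothetical helper: commit the most recently created container to a
# temporary image and print that image's ID without the "sha256:" prefix.
snapshot_last_container() {
  container_id="$(docker ps -lq)"
  if [ -z "$container_id" ]; then
    echo "no container found" >&2
    return 1
  fi
  image_id="$(docker commit "$container_id")" || return 1
  echo "${image_id#sha256:}"
}

# Then open a shell in the snapshot, for example:
#   docker run --rm -it "$(snapshot_last_container)" sh
```

The snapshot image is untagged, so it is worth removing with docker rmi once the inspection is done.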
Update for newer Docker versions (20.10 onwards)
On newer versions, BuildKit is enabled by default. To get the intermediate container hashes known from older versions, disable it for the build:
Linux or macOS
DOCKER_BUILDKIT=0 docker build ...
Windows
# Command line
set DOCKER_BUILDKIT=0
docker build ...
# PowerShell
$env:DOCKER_BUILDKIT=0
docker build ...
Disabling BuildKit is recommended only for debugging purposes, since BuildKit can make your builds faster.
For reference:
Buildkit doesn't support intermediate container hashes: https://github.com/moby/buildkit/issues/1053
Thanks to @David Callanan and @MegaCookie for their inputs.
Docker caches the entire filesystem state after each successful RUN line.
Knowing that:
to examine the latest state before your failing RUN command, comment it out in the Dockerfile (as well as any and all subsequent RUN commands), then run docker build and docker run again.
to examine the state after the failing RUN command, simply add || true to it to force it to succeed; then proceed like above (keep any and all subsequent RUN commands commented out, run docker build and docker run)
Tada, no need to mess with Docker internals or layer IDs, and as a bonus Docker automatically minimizes the amount of work that needs to be re-done.
Currently with the latest docker-desktop, there isn't a way to opt out
of the new Buildkit, which doesn't support debugging yet (follow the
latest updates on this on this GitHub Thread:
https://github.com/moby/buildkit/issues/1472).
Find out at which line in your Dockerfile it is failing.
Add to the top of your Dockerfile: FROM xxx as debug
Add an additional target: FROM xxx as next just one line before the failing command (as you don't want to build that part). Example:
FROM xxx as debug
RUN echo "working command"
FROM xxx as next
RUN echoo "failing command"
Run docker build -f Dockerfile --target debug --tag debug .
Then you can debug the container with: docker run -it debug /bin/sh
You can quit the shell by pressing CTRL P + CTRL Q
If you want to use docker compose build instead of docker build it's possible by adding target: debug in your docker-compose.yml under build.
Then start the container by docker compose run xxxYourServiceNamexxx and use either:
The second top answer to find out how to run a shell inside the container.
Or add ENTRYPOINT /bin/sh before the FROM xxx as next line in your Dockerfile.
Debugging build step failures is indeed very annoying.
The best solution I have found is to make sure that each step that does real work always succeeds, and to add a check after the ones that can fail. That way you get a committed layer containing the output of the failed step, which you can inspect.
A Dockerfile, with an example after the # Run DB2 silent installer line:
#
# DB2 10.5 Client Dockerfile (Part 1)
#
# Requires
# - DB2 10.5 Client for 64bit Linux ibm_data_server_runtime_client_linuxx64_v10.5.tar.gz
# - Response file for DB2 10.5 Client for 64bit Linux db2rtcl_nr.rsp
#
#
# Using Ubuntu 14.04 base image as the starting point.
FROM ubuntu:14.04
MAINTAINER David Carew <carew@us.ibm.com>
# DB2 prereqs (also installing sharutils package as we use the utility uuencode to generate password - all others are required for the DB2 Client)
RUN dpkg --add-architecture i386 && apt-get update && apt-get install -y sharutils binutils libstdc++6:i386 libpam0g:i386 && ln -s /lib/i386-linux-gnu/libpam.so.0 /lib/libpam.so.0
RUN apt-get install -y libxml2
# Create user db2clnt
# Generate strong random password and allow sudo to root w/o password
#
RUN \
adduser --quiet --disabled-password -shell /bin/bash -home /home/db2clnt --gecos "DB2 Client" db2clnt && \
echo db2clnt:`dd if=/dev/urandom bs=16 count=1 2>/dev/null | uuencode -| head -n 2 | grep -v begin | cut -b 2-10` | chgpasswd && \
adduser db2clnt sudo && \
echo '%sudo ALL=(ALL) NOPASSWD:ALL' >> /etc/sudoers
# Install DB2
RUN mkdir /install
# Copy DB2 tarball - ADD command will expand it automatically
ADD v10.5fp9_linuxx64_rtcl.tar.gz /install/
# Copy response file
COPY db2rtcl_nr.rsp /install/
# Run DB2 silent installer
RUN mkdir /logs
RUN (/install/rtcl/db2setup -t /logs/trace -l /logs/log -u /install/db2rtcl_nr.rsp && touch /install/done) || /bin/true
RUN test -f /install/done || (echo ERROR-------; echo install failed, see files in container /logs directory of the last container layer; echo run docker run '<last image id>' /bin/cat /logs/trace; echo ----------)
RUN test -f /install/done
# Clean up unwanted files
RUN rm -fr /install/rtcl
# Login as db2clnt user
CMD su - db2clnt
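The marker-file pattern used after the # Run DB2 silent installer line can be tried out in plain shell. This is only a sketch: fake_installer and the temporary directory are made up for illustration and stand in for db2setup and /install.

```shell
# Scratch directory standing in for /install and /logs (hypothetical).
workdir=$(mktemp -d)

# Hypothetical installer that writes a trace file and then fails.
fake_installer() {
    echo "installer trace: missing prerequisite" > "$workdir/trace"
    return 1
}

# Step 1: always succeeds, so its outputs would be committed as a layer.
# The marker file is only created when the installer itself succeeds.
(fake_installer && touch "$workdir/done") || true
install_step_status=$?

# Step 2: a separate check that fails when the marker file is missing.
check_status=0
test -f "$workdir/done" || check_status=$?

echo "install step=$install_step_status check=$check_status"
```

The first step reports success even though the installer failed, so the trace file survives, and the failure only surfaces in the later check.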
In my case, I have to have:
DOCKER_BUILDKIT=1 docker build ...
and as mentioned by Jannis Schönleber in his answer, there is currently no debug available in this case (i.e. no intermediate images/containers get created).
What I've found I could do is use the following option:
... --progress=plain ...
and then add various RUN ... or additional lines on existing RUN ... to debug specific commands. This gives you what to me feels like full access (at least if your build is relatively fast).
For example, you could check a variable like so:
RUN echo "Variable NAME = [$NAME]"
If you're wondering whether a file is installed properly, you do:
RUN find /
etc.
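As a rough sketch of the kind of commands worth putting in those extra RUN lines, the following runs in a plain shell. NAME, probe_dir and app.conf are hypothetical. The brackets make an empty or whitespace-only value visible, and limiting find to one directory keeps the --progress=plain output readable.

```shell
# Hypothetical variable that turned out to be empty during the build.
NAME=""
line=$(echo "Variable NAME = [$NAME]")
echo "$line"

# Hypothetical directory and file standing in for an install location.
probe_dir=$(mktemp -d)
touch "$probe_dir/app.conf"

# Searching one directory for one name is easier to read than find /.
found=$(find "$probe_dir" -type f -name 'app.conf')
echo "$found"
```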
In my situation, I had to debug a docker build of a Go application with a private repository and it was quite difficult to do that debugging. I've other details on that here.
If you are using docker-compose to build Docker images, try adding DOCKER_BUILDKIT=0 before the command to see the ID of the last successful layer:
DOCKER_BUILDKIT=0 docker-compose ...
This will temporarily disable DOCKER_BUILDKIT for the command only.
Having the last layer id you can connect to it using the command from the top answer
docker run --rm -it LAST_LAYER_ID sh
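The "for the command only" part is ordinary shell behaviour rather than anything Docker-specific. A small sketch, using sh -c as a stand-in for docker-compose so that it runs anywhere:

```shell
# Start from a clean slate so the result does not depend on your environment.
unset DOCKER_BUILDKIT

# A VAR=value prefix applies only to the command that follows it.
seen_by_child=$(DOCKER_BUILDKIT=0 sh -c 'echo "${DOCKER_BUILDKIT:-unset}"')

# Afterwards the variable is still unset in the current shell.
seen_afterwards=${DOCKER_BUILDKIT:-unset}

echo "child saw: $seen_by_child, shell afterwards: $seen_afterwards"
```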
My solution would be to see which step failed in the Dockerfile (RUN bundle install in my case) and change it to:
RUN bundle install || cat <path to the file containing the error>
This has a double effect: it prints the reason for the failure, and docker build no longer treats this intermediate step as failed, so its layer is not deleted and can be inspected via:
docker run --rm -it <id_last_working_layer> bash -il
In there you can even re-run your failed command and test it live.
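A minimal shell sketch of the || cat idea, outside Docker. Here failing_install and the temporary log file are hypothetical stand-ins for bundle install and the file it writes its error to.

```shell
# Temporary file standing in for the error log (hypothetical path).
log=$(mktemp)

# Hypothetical install command that logs its error to the file and fails.
failing_install() {
    echo "could not resolve dependency" > "$log"
    return 1
}

# The log is printed only when the command fails, and the overall
# status is that of cat, so the step as a whole succeeds.
output=$(failing_install || cat "$log")
status=$?

echo "status=$status output=$output"
```

Because the step now succeeds, the build carries on past it, so this is only suitable as a temporary debugging change.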
What I would do is comment out everything in the Dockerfile from the offending line onward. Then you can run the container, run the remaining commands by hand, and look at the logs in the usual way. E.g. if the Dockerfile is
RUN foo
RUN bar
RUN baz
and it's dying at bar I would do
RUN foo
# RUN bar
# RUN baz
Then
$ docker build -t foo .
$ docker run -it foo bash
container# bar
...grep logs...
Still using BuildKit, as in Alexis Wilke's answer, you can use ktock/buildg.
See "Interactive debugger for Dockerfile" from Kohei Tokunaga
buildg is a tool to interactively debug Dockerfile based on BuildKit.
Source-level inspection
Breakpoints and step execution
Interactive shell on a step with your own debugging tools
Based on BuildKit (needs unmerged patches)
Supports rootless
Example:
$ buildg.sh debug --image=ubuntu:22.04 /tmp/ctx
WARN[2022-05-09T01:40:21Z] using host network as the default
#1 [internal] load .dockerignore
#1 transferring context: 2B done
#1 DONE 0.1s
#2 [internal] load build definition from Dockerfile
#2 transferring dockerfile: 195B done
#2 DONE 0.1s
#3 [internal] load metadata for docker.io/library/busybox:latest
#3 DONE 3.0s
#4 [build1 1/2] FROM docker.io/library/busybox@sha256:d2b53584f580310186df7a2055ce3ff83cc0df6caacf1e3489bff8cf5d0af5d8
#4 resolve docker.io/library/busybox@sha256:d2b53584f580310186df7a2055ce3ff83cc0df6caacf1e3489bff8cf5d0af5d8 0.0s done
#4 sha256:50e8d59317eb665383b2ef4d9434aeaa394dcd6f54b96bb7810fdde583e9c2d1 772.81kB / 772.81kB 0.2s done
Filename: "Dockerfile"
2| RUN echo hello > /hello
3|
4| FROM busybox AS build2
=> 5| RUN echo hi > /hi
6|
7| FROM scratch
8| COPY --from=build1 /hello /
>>> break 2
>>> breakpoints
[0]: line 2
>>> continue
#4 extracting sha256:50e8d59317eb665383b2ef4d9434aeaa394dcd6f54b96bb7810fdde583e9c2d1 0.0s done
#4 DONE 0.3s
...

Resources