I execute some tests in GitHub Actions using Testcontainers.
Testcontainers pulls the images which are used in my tests. Unfortunately the images are pulled again at every build.
How can I cache the images in GitHub Actions?
There's no official support from GitHub Actions (yet) for caching pulled Docker images (see this and this issue).
What you can do is pull the Docker images, save them as .tar archives and store them in a folder for the GitHub Actions cache action to pick up.
A sample workflow can look like the following:
build-java-project:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v2
    - run: mkdir -p ~/image-cache
    - id: image-cache
      uses: actions/cache@v1
      with:
        path: ~/image-cache
        key: image-cache-${{ runner.os }}
    - if: steps.image-cache.outputs.cache-hit != 'true'
      run: |
        docker pull postgres:13
        docker save -o ~/image-cache/postgres.tar postgres:13
    - if: steps.image-cache.outputs.cache-hit == 'true'
      run: docker load -i ~/image-cache/postgres.tar
    - name: 'Run tests'
      run: ./mvnw verify
This is a little noisy, though, and you'd need to adjust the pipeline every time your tests depend on a new Docker image. Also be aware of cache invalidation: if you plan to use a :latest tag, the solution above won't pick up changes to the image.
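One way to keep the cache key in sync with the images you depend on is to derive it from a file that lists them - a sketch, assuming a hypothetical docker-images.txt checked into the repo (this doesn't help with a moving :latest tag, but it does invalidate the cache whenever the image list changes):

    - id: image-cache
      uses: actions/cache@v1
      with:
        path: ~/image-cache
        key: image-cache-${{ runner.os }}-${{ hashFiles('docker-images.txt') }}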
The current GitHub Actions cache size limit is 10 GB, which should be enough for a mid-size project relying on 5-10 Docker images for testing.
There's also the Docker GitHub Cache API, but I'm not sure how well this integrates with Testcontainers.
Related
Hello, I am using Kedro (a pipeline tool) and want to use GitHub Actions to trigger a Kedro command (kedro run) whenever I push to my GitHub repo.
Since I have all the data in my local repo, I thought it would make sense to run the kedro command on my local machine.
So my question is: is there a way to trigger a local action using GitHub Actions? Using self-hosted runners, perhaps?
You can run the Kedro pipeline directly on a GitHub-provided runner using the steps below. I've adapted a script I used previously to run kedro lint on every push.
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Set up Python 3.7.9
        uses: actions/setup-python@v2
        with:
          python-version: 3.7.9
      - uses: actions/cache@v2
        with:
          path: ${{ env.pythonLocation }}
          key: ${{ env.pythonLocation }}-${{ hashFiles('src/requirements.txt') }}
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r src/requirements.txt
      - name: Run Kedro Pipeline
        run: |
          kedro run
That said, I'm also wondering why you would need to run it on every push. Given that the compute provided by GitHub Actions is fairly resource constrained, it might not be the best place to run the pipeline. You would also need to keep your data within your repo for this approach to work.
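If you do want the pipeline to run next to your local data instead, a self-hosted runner registered on your own machine is the usual answer to your original question; the only change to the workflow above would be the runs-on target (a sketch, assuming you have registered a runner with the default self-hosted label):

jobs:
  build:
    runs-on: self-hosted  # job is picked up by the runner you registered locally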
Hi @magical_unicorn, I think @avan-sh's answer is correct - what I would also add is that we encourage you not to commit data to VCS / Git. There are some technical limitations such as file size, but more importantly it's not great security practice when working with teams.
Whilst not necessary on all projects, it might be good practice to explore using some sort of cloud storage for your data, decoupling it from the version control system.
I'm working on a Spring Boot application that should be packaged into an OCI container using Cloud Native Buildpacks / Paketo.io. I build it with GitHub Actions, where my workflow build.yml looks like this:
name: build
on: [push]
jobs:
  build-with-paketo-push-2-dockerhub:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Login to DockerHub Container Registry
        run: echo $DOCKER_HUB_TOKEN | docker login -u jonashackt --password-stdin
        env:
          DOCKER_HUB_TOKEN: ${{ secrets.DOCKER_HUB_TOKEN }}
      - name: Install pack CLI via apt. See https://buildpacks.io/docs/tools/pack/#pack-cli
        run: |
          sudo add-apt-repository ppa:cncf-buildpacks/pack-cli
          sudo apt-get update
          sudo apt-get install pack-cli
      - name: Build app with pack CLI
        run: pack build spring-boot-buildpack --path . --builder paketobuildpacks/builder:base
      - name: Tag & publish to Docker Hub
        run: |
          docker tag spring-boot-buildpack jonashackt/spring-boot-buildpack:latest
          docker push jonashackt/spring-boot-buildpack:latest
Now the Build app with pack CLI step takes quite a long time, since it always downloads the Paketo builder Docker image and then does a full fresh build. This means downloading the JDK and every single Maven dependency. Is there a way to cache a Paketo build on GitHub Actions?
Caching the Docker images on GitHub Actions might be an option - which doesn't seem to be that easy. Another option would be to leverage the Docker official docker/build-push-action Action, which is able to cache the buildx-cache. But I didn't get the combination of pack CLI and buildx-caching to work (see this build for example).
Finally I stumbled upon the general Cloud Native Buildpacks approach on how to cache in the docs:
Cache Images are a way to preserve build optimizing layers across
different host machines. These images can improve performance when
using pack in ephemeral environments such as CI/CD pipelines.
I found this concept quite nice, since it uses a second cache image, which gets published to a container registry of your choice. This image is simply used for all Paketo pack CLI builds on every machine where you append the --cache-image parameter - be it your local desktop or any CI/CD platform (like GitHub Actions).
In order to use the --cache-image parameter, we also have to use the --publish flag (since the cache image needs to get published to your container registry!). This also means we need to log into the container registry before we're able to run our pack CLI command. Using Docker Hub, this looks something like:
echo $DOCKER_HUB_TOKEN | docker login -u YourUserNameHere --password-stdin
Also the Paketo builder image must be a trusted one. As the docs state:
By default, any builder suggested by pack builder suggest is
considered trusted.
Since I use a suggested builder, I don't have to do anything here. If you want to use another builder that isn't trusted by default, you need to run a pack config trusted-builders add your/builder-to-trust:bionic command before the final pack CLI command.
Here's the cache-enabled pack CLI command for building a Spring Boot app and using Docker Hub as the container registry:
pack build index.docker.io/yourApplicationImageNameHere:latest \
  --builder paketobuildpacks/builder:base \
  --path . \
  --cache-image index.docker.io/yourCacheImageNameHere:latest \
  --publish
Finally the GitHub Action workflow to build and publish the example Spring Boot app https://github.com/jonashackt/spring-boot-buildpack looks like this:
name: build
on: [push]
jobs:
  build-with-paketo-push-2-dockerhub:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Login to DockerHub Container Registry
        run: echo $DOCKER_HUB_TOKEN | docker login -u jonashackt --password-stdin
        env:
          DOCKER_HUB_TOKEN: ${{ secrets.DOCKER_HUB_TOKEN }}
      - name: Install pack CLI via the official buildpack Action https://github.com/buildpacks/github-actions#setup-pack-cli-action
        uses: buildpacks/github-actions/setup-pack@v4.1.0
      - name: Build app with pack CLI using Buildpack Cache image (see https://buildpacks.io/docs/app-developer-guide/using-cache-image/) & publish to Docker Hub
        run: |
          pack build index.docker.io/jonashackt/spring-boot-buildpack:latest \
            --builder paketobuildpacks/builder:base \
            --path . \
            --cache-image index.docker.io/jonashackt/spring-boot-buildpack-paketo-cache-image:latest \
            --publish
Note that by using the pack CLI's --publish flag, we also don't need an extra Tag & publish to Docker Hub step anymore, since pack CLI already does that for us.
The situation is this:
I'm running Cypress tests in GitLab CI (launched by vue-cli). To speed up the execution, I built a Docker image that contains the necessary dependencies.
How can I cache node_modules from the prebuilt image to use it in the test job?
Currently I'm using an awful (but working) solution:
testsE2e:
  image: path/to/prebuiltImg
  stage: tests
  script:
    - ln -s /node_modules/ /builds/path/to/prebuiltImg/node_modules
    - yarn test:e2e
    - yarn test:e2e:report
But I think there must be a cleaner way using the GitLab CI cache.
I've been testing:
cacheE2eDeps:
  image: path/to/prebuiltImg
  stage: dependencies
  cache:
    key: e2eDeps
    paths:
      - node_modules/
  script:
    - find / -name node_modules  # check that node_modules files are there
    - echo "Caching e2e test dependencies"

testsE2e:
  image: path/to/prebuiltImg
  stage: tests
  cache:
    key: e2eDeps
  script:
    - yarn test:e2e
    - yarn test:e2e:report
But the job cacheE2eDeps displays a "WARNING: node_modules/: no matching files" error.
How can I do this successfully? The GitLab documentation doesn't really talk about caching from a prebuilt image...
The Dockerfile used to build the image:
FROM cypress/browsers:node13.8.0-chrome81-ff75
COPY . .
RUN yarn install
There is no documentation for caching data from prebuilt images, because it's simply not done. The dependencies are already available in the image, so why cache them in the first place? It would only lead to unnecessary data duplication.
Also, you seem to operate under the impression that the cache should be used to share data between jobs, but its primary use case is sharing data between different runs of the same job. Sharing data between jobs should be done using artifacts.
In your case you can use cache instead of prebuilt image, like so:
variables:
  CYPRESS_CACHE_FOLDER: "$CI_PROJECT_DIR/cache/Cypress"

testsE2e:
  image: cypress/browsers:node13.8.0-chrome81-ff75
  stage: tests
  cache:
    key: "e2eDeps"
    paths:
      - node_modules/
      - cache/Cypress/
  script:
    - yarn install
    - yarn test:e2e
    - yarn test:e2e:report
The first time the above job is run, it’ll install dependencies from scratch, but the next time it’ll fetch them from the runner cache. The caveat is that unless all runners that run this job share cache, each time you run it on a new runner it’ll install the dependencies from scratch.
Here’s the documentation about using yarn with GitLab CI.
Edit:
To elaborate on using cache vs artifacts - artifacts are meant both for storing job output (e.g. to manually download it later) and for passing the results of one job to jobs in subsequent stages, while the cache is meant to speed up job execution by preserving files that the job would otherwise need to download from the internet. See the GitLab documentation for details.
The contents of the node_modules directory obviously fall into the second category.
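To illustrate the artifacts side of that distinction, here is a minimal sketch (hypothetical job and path names, not taken from your pipeline) where a build job hands its output to a job in a later stage:

buildApp:
  stage: build
  script:
    - yarn install
    - yarn build
  artifacts:
    paths:
      - dist/          # build output made available to jobs in later stages
    expire_in: 1 week

testsE2e:
  stage: tests
  script:
    - yarn test:e2e    # artifacts from buildApp are downloaded automatically here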
I'm trying to set up a workflow in CircleCI for my React project.
What I want to achieve is to get a job to build the stuff and another one to deploy the master branch to Firebase hosting.
This is what I have so far after several configurations:
witmy: &witmy
  docker:
    - image: circleci/node:7.10

version: 2
jobs:
  build:
    <<: *witmy
    steps:
      - checkout
      - restore_cache:
          keys:
            - v1-dependencies-{{ checksum "package.json" }}
            - v1-dependencies-
      - run: yarn install
      - save_cache:
          paths:
            - node_modules
          key: v1-dependencies-{{ checksum "package.json" }}
      - run:
          name: Build app in production mode
          command: |
            yarn build
      - persist_to_workspace:
          root: .
  deploy:
    <<: *witmy
    steps:
      - attach_workspace:
          at: .
      - run:
          name: Deploy Master to Firebase
          command: ./node_modules/.bin/firebase deploy --token=MY_TOKEN

workflows:
  version: 2
  build-and-deploy:
    jobs:
      - build
      - deploy:
          requires:
            - build
          filters:
            branches:
              only: master
The build job always succeeds, but the deploy job fails with this error:
#!/bin/bash -eo pipefail
./node_modules/.bin/firebase deploy --token=MYTOKEN
/bin/bash: ./node_modules/.bin/firebase: No such file or directory
Exited with code 1
So, what I understand is that the deploy job is not running in the same place the build was, right?
I'm not sure how to fix that. I've read some examples they provide and tried several things, but it doesn't work. I've also read the documentation but I think it's not very clear how to configure everything... maybe I'm too dumb.
I hope you guys can help me out on this one.
Cheers!!
EDITED TO ADD MY CURRENT CONFIG USING WORKSPACES
I've added Workspaces... but still I'm not able to get it working, after a loooot of tries I'm getting this error:
Persisting to Workspace
The specified paths did not match any files in /home/circleci/project
And also it's a real pain to commit and push to CircleCI every single change to the config file when I want to test it... :/
Thanks!
disclaimer: I'm a CircleCI Developer Advocate
Each job runs in its own Docker container (or VM). So the problem here is that nothing in node_modules exists in your deploy job. There are two ways to solve this:
Install Firebase and anything else you might need, on the fly, just like you do in the build job.
Utilize CircleCI Workspaces to carry over your node_modules directory from the build job to the deploy job.
In my opinion, option 2 is likely your best bet because it's more efficient.
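Here's a minimal sketch of option 2, adjusted from your config (assumptions: your production build lands in build/ and the firebase CLI is installed into node_modules by yarn install; the "specified paths did not match any files" error typically means persist_to_workspace wasn't given paths that actually exist in the job's working directory):

  build:
    <<: *witmy
    steps:
      - checkout
      - run: yarn install
      - run: yarn build
      - persist_to_workspace:
          root: .
          paths:
            - node_modules   # carries the firebase CLI over to the deploy job
            - build          # production build output
  deploy:
    <<: *witmy
    steps:
      - attach_workspace:
          at: .
      - run:
          name: Deploy Master to Firebase
          command: ./node_modules/.bin/firebase deploy --token=MY_TOKEN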
I wish to use the GitLab Container Registry to temporarily store my newly built Docker image; in order to have Docker work (i.e. docker login, docker build, docker push), I applied the docker-in-docker executor; then, from the GitLab Pipelines error messages, I realized I need to place a Dockerfile at the project root:
$ docker build --pull -t $CONTAINER_TEST_IMAGE .
unable to prepare context: unable to evaluate symlinks in Dockerfile path: lstat /builds/xxxxx.com/laravel/Dockerfile: no such file or directory
My Dockerfile includes centos:7, php, nodejs, composer and sass installations. I observe that after each commit, the GitLab runner goes through the Dockerfile once more and installs all of them from the beginning, which makes the whole build stage very slow and odd - why should I have to install so many things again for deployment when I just want to amend one word in my project?
In my imagination, it would be nice if I could pre-build a Docker image from a Dockerfile that contains the installations mentioned above plus Docker itself (so that docker login, docker build and docker push work), store it on the GitLab runner server, and after each commit reuse this image to build the new image that gets pushed to the GitLab Container Registry.
However, I face two problems:
1) Even if I include the Docker installation in the pre-built Docker image, I cannot systemctl start docker, due to some D-Bus problem:
Failed to get D-Bus connection: Operation not permitted
Moreover, some articles also mention that a Docker container should not run background services.
2) When I use dind, it requires a Dockerfile at the project root; with the pre-built Docker image, I actually have no use for this Dockerfile at the project root; hence, is dind the wrong option?
Actually, what is the proper way to push a Laravel project image to the GitLab Container Registry? (Where should those npm install and composer install commands go?)
image: docker:latest
services:
  - docker:dind
stages:
  - build
  - test
  - deploy
variables:
  CONTAINER_TEST_IMAGE: xxxx
  CONTAINER_RELEASE_IMAGE: yyyy
before_script:
  - docker login -u xxx -p yyy registry.gitlab.com
build:
  stage: build
  script:
    - npm install here?
    - docker build --pull -t $CONTAINER_TEST_IMAGE .
    - docker push $CONTAINER_TEST_IMAGE
There are many questions in your post. I would like to address them as follows:
You can pre-build a Docker image and then use it in your .gitlab-ci.yml file. This can be used to add your specific dependencies.
image: my-custom-image
services:
  - docker:dind
It's important to add the service to the configuration.
Your problem about trying to run the Docker service inside .gitlab-ci.yml: you actually don't need to do that. GitLab exposes the Docker engine to the executor (either via unix:///var/run/docker.sock or tcp://localhost:2375/). Note that if the runners are executed in a Kubernetes environment, you need to specify DOCKER_HOST as follows:
variables:
  DOCKER_HOST: tcp://localhost:2375/
Your question about where to place npm install is more fundamentally a question about how Docker images are built. In short, npm install should be placed in the Dockerfile. For a longer description, see this, for example.
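A minimal sketch of what that Dockerfile could look like (base image, versions and paths are illustrative, not taken from your project; copying the dependency manifests before the rest of the source lets Docker cache the npm install and composer install layers, so a one-word change in your code doesn't re-install everything):

FROM php:7.4-cli

# Node.js, npm and Composer for the frontend build and PHP dependencies (illustrative)
RUN apt-get update && apt-get install -y curl nodejs npm \
    && curl -sS https://getcomposer.org/installer | php -- --install-dir=/usr/local/bin --filename=composer

WORKDIR /app

# Copy only the dependency manifests first so these layers stay cached between builds
COPY package.json package-lock.json ./
RUN npm install

COPY composer.json composer.lock ./
RUN composer install --no-scripts --no-autoloader

# Now copy the application source; code changes don't invalidate the layers above
COPY . .
RUN composer dump-autoload --optimize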
Some references:
https://docs.gitlab.com/ee/ci/docker/using_docker_build.html#use-docker-in-docker-executor
https://nodejs.org/en/docs/guides/nodejs-docker-webapp/