I'm using this script for my GitLab CI build stage (only the relevant part is shown):
cache:
  key: "$CI_BUILD_REF"
  paths:
    - bin/
    - build/

build:
  image: <my_build_image>
  stage: build
  script:
    - "make PLATFORM='x86_64-linux-gnu' BUILD='release' JOBS=8 all"
  only:
    - master
    - tags
    - merge-requests
  artifacts:
    untracked: true
    paths:
      - bin/x86_64-linux-gnu/release
I thought that if I added the bin and build directories to the cache, make wouldn't rebuild the whole project every time (just as it behaves locally), but it seems the CI runner overwrites my src directory on every run, so the files' timestamps are updated too and make thinks every file has changed. I considered adding the src directory to the cache as well, but it's tracked in the repository and I'm not sure that is correct. So, what is the best way to rebuild a GitLab CI project using previously built binaries?
I see you are using $CI_BUILD_REF as a cache key; although this variable is deprecated, it seems to work and provides the commit's SHA1.
Is that really what you intended, to create separate caches per commit (not even per branch)?
So for any new commit there wouldn't be a cache anyway?
I'd probably even use a static cache key in order to maximize caching (while using minimal cache storage), or maybe per branch.
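For example, a per-branch cache would look like this (a sketch; $CI_COMMIT_REF_SLUG is the non-deprecated per-branch variable):

cache:
  key: "$CI_COMMIT_REF_SLUG"   # one cache per branch; use a fixed string for a single shared cache
  paths:
    - bin/
    - build/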
Maybe also the Git checkouts and/or branch switches touch the source files too often.
I have implemented a similar strategy in one of my projects, but there I have a distinct "cached" folder into which I rsync the files from the checkout.
The shared runners on GitLab.com do seem to leave the file modification times intact when using a cache, and even in the main checkout.
I've put up a sample project with a CI job that demonstrates this at https://gitlab.com/hannibal218bc/test-build-cache-xtimes/-/jobs/1022404894 :
the job stats the directory's contents
creates a cached directory if it does not yet exist
copies the README.md file
"compiles" the file to a README.built file.
As you can see in the output, the modification timestamp of the README.built is the runtime from the previous job:
$ cd cached
$ stat README.* || true
File: README.built
Size: 146 Blocks: 16 IO Block: 4096 regular file
Device: 809h/2057d Inode: 2101510 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2021-02-10 23:06:13.000000000 +0000
Modify: 2021-02-10 23:02:39.000000000 +0000 <<< timestamp from previous job
Change: 2021-02-10 23:06:13.000000000 +0000
Related
I have a GitLab stage that I only want to run if there are changes to some file. I have tried using only and if: changes..., however those do not work as I want.
I want this stage to be triggered only if the current commit (no matter whether it is a merge request or pushed to main) contains changes to some file.
My stage:
cache frontend:
  stage: cache
  cache: &frontend_cache
    key: frontend-packages
    paths:
      - frontend/node_modules/
      - frontend/.yarn/
  script:
    - cd frontend
    - yarn install --immutable
  only:
    changes:
      - frontend/yarn.lock
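For reference, here is a sketch of the equivalent rules: changes form (one of the variants I tried):

cache frontend:
  stage: cache
  cache: *frontend_cache   # reuses the &frontend_cache anchor defined above
  script:
    - cd frontend
    - yarn install --immutable
  rules:
    - changes:
        - frontend/yarn.lock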
I have a GitLab job that downloads a bunch of dependencies and stuffs them in a cache (if necessary); then I have a bunch of jobs that use that cache. I notice that at the end of the consuming jobs, they spend a bunch of time creating a new cache, even though they made no changes to it.
Is it possible to have them act only as consumers? Read-only?
cache:
  paths:
    - assets/

configure:
  stage: .pre
  script:
    - conda env update --prefix ./assets/env/base -f ./environment.yml;
    - source activate ./assets/env/base
    - bash ./download.sh

parse1:
  stage: build
  script:
    - source activate ./assets/env/base;
    - ./build.sh -b test -s 2
  artifacts:
    paths:
      - build

parse2:
  stage: build
  script:
    - source activate ./assets/env/base;
    - ./build.sh -b test -s 2
  artifacts:
    paths:
      - build
The very detailed .gitlab-ci.yml documentation includes a reference to a cache setting called policy. GitLab caches have the concept of push (aka write) and pull (aka read). By default it is set to pull-push (read at the beginning and write at the end).
If you know the job does not alter the cached files, you can skip the upload step by setting policy: pull in the job specification. Typically, this would be twinned with an ordinary cache job at an earlier stage to ensure the cache is updated from time to time:
.gitlab-ci.yml > cache:policy
Which pretty much describes this situation: the job configure updates the cache, and the parse jobs do not alter the cache.
In the consuming jobs, add:
cache:
  paths:
    - assets/
  policy: pull
For clarity, it probably wouldn't hurt to make that explicit in the global setting:
cache:
  paths:
    - assets/
  policy: pull-push
TL;DR: override the cache with no paths element.
You probably have to add a key element to your global cache configuration too; I have actually never used a cache without a key element.
See the cache documentation here
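As I read it, a sketch of that override in one of the consuming jobs would be (the key name is hypothetical and must match the producing job's):

parse1:
  stage: build
  cache:
    key: assets-cache   # hypothetical; must match the key used when the cache was written
    # no paths element here, so the job archives nothing at the end
  script:
    - source activate ./assets/env/base;
    - ./build.sh -b test -s 2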
I'm using CircleCI (version: 2.1) for continuous deployment, caching the installed dependencies. From the save_cache documentation:
Generates and stores a cache of a file or directory of files such as dependencies or source code in our object storage. Later jobs can restore this cache.
Current scenario:
See the simplified caching step below in .circleci/config.yml file:
steps:
  - node/with-cache:
      steps:
        - checkout
        - run: npm install
        - save_cache:
            key: dependencies
            paths:
              - node_modules
The problem comes once a new package is added to the project, and thus the package.json file changes. At the same time, CircleCI shows this message for the Saving Cache step:
Skipping cache generation, cache already exists for key: dependencies
Found one created at 2020-05-23 19:29:29 +0000 UTC
Then, when the cache is restored, the build step obviously does not find the newly added package:
./src/index.tsx
Cannot find module: 'package-name'. Make sure this package is installed.
Questions:
Is there any way to check for package.json changes in the pipeline? Ideally I would install the dependencies only in those cases, so the cache can be purged and updated.
Maybe I did not see something in the documentation. Any help is appreciated, thank you!
The problem is that the cache key you used is "dependencies", a plain string. This key never changes, so you will always use the same exact cache.
You need to use a cache key that changes, preferably based on a checksum of package-lock.json. Please read the section on cache keys in the CircleCI docs for more information: https://circleci.com/docs/2.0/caching/#using-keys-and-templates
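A sketch of that (assuming npm's package-lock.json as the lock file):

steps:
  - checkout
  - restore_cache:
      keys:
        - dependencies-v1-{{ checksum "package-lock.json" }}
        - dependencies-v1-   # fallback: the most recent cache with this prefix
  - run: npm install
  - save_cache:
      key: dependencies-v1-{{ checksum "package-lock.json" }}
      paths:
        - node_modules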
Problem:
I am trying to deploy a function with this step in a second level compilation (second-level-compilation.yaml)
- name: 'gcr.io/cloud-builders/gcloud'
  args: ['beta', 'functions',
         'deploy', '${_FUNCTION_NAME}',
         '--source', 'path/to/function',
         '--runtime', 'go111',
         '--region', '${_GCP_CLOUD_FUNCTION_REGION}',
         '--entry-point', '${_ENTRYPOINT}',
         '--env-vars-file', '${_FUNCTION_PATH}/.env.${_DEPLOY_ENV}.yaml',
         '--trigger-topic', '${_TRIGGER_TOPIC_NAME}',
         '--timeout', '${_FUNCTION_TIMEOUT}',
         '--service-account', '${_SERVICE_ACCOUNT}']
I get this error from Cloud Build using the Console.
Step #1: Step #11: ERROR: (gcloud.beta.functions.deploy) Error creating a ZIP archive with the source code for directory path/to/function: ZIP does not support timestamps before 1980
Here is the global flow:
The following step is in a first-level compilation (first-level-compilation.yaml), which is triggered by Cloud Build from a GitHub repository (via the Cloud Build GitHub application):
- name: 'gcr.io/cloud-builders/gcloud'
  entrypoint: 'bash'
  args: ['-c', 'launch-second-level-compilation.sh ${_MY_VAR}']
The script launch-second-level-compilation.sh performs specific operations based on ${_MY_VAR} and then launches a second-level compilation, passing a lot of substitution variables with "gcloud builds submit --config=second-level-compilation.yaml --substitutions=_FUNCTION_NAME=val,_GCP_CLOUD_FUNCTION_REGION=val,....."
Then the second-level-compilation.yaml described at the beginning of this question is executed, using the substitution values generated and passed through the launch-second-level-compilation.sh script.
The main idea here is to have a generic first-level-compilation.yaml in charge of calling a second-level compilation with specific dynamically generated substitutions.
Attempts / Investigations
As described in the issue Cloud Container Builder, ZIP does not support timestamps before 1980, I tried to ls the files in the /workspace directory, but none of the files at the /workspace root have a strange date.
I changed path/to/function from a relative path to /workspace/path/to/function, but with no success; unsurprisingly, since it ends up being the same directory.
Please make sure you don't have folders without files. For example:
|--dir
|  |--subdir1
|  |  |--file1
|  |--subdir2
|     |--file2
In this example, dir doesn't directly contain any file, only subdirectories. During local deployment the gcloud SDK puts dir into the tarball without copying the last-modified field, so it is set to 1 Jan 1970, which causes problems with ZIP (which does not support timestamps before 1980).
As a possible workaround, just make sure every directory contains at least one file.
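One way to do that (a sketch; the .gitkeep file name is just a convention) is an extra build step that drops a placeholder file into every empty directory before the deploy step:

- name: 'gcr.io/cloud-builders/gcloud'
  entrypoint: 'bash'
  args: ['-c', 'find path/to/function -type d -empty -exec touch {}/.gitkeep \;']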
I'm configuring a GitLab CI pipeline where I have 2 jobs in the install stage pulling dependencies into cached locations. A job in a different stage then tries to access these locations, but only one of them seems to exist.
I've built the CI according to the Python example provided by GitLab, which can be found here.
My .gitlab-ci.yml file looks like this.
---
cache:
  paths:
    - foo-1
    - foo-2

stages:
  - install
  - test

install_foo-1_dependencies:
  stage: install
  script:
    - pull foo-1 dependencies

install_foo-2_dependencies:
  stage: install
  script:
    - pull foo-2 dependencies
  tags:
    - ansible-f5-runner

test_dependencies:
  stage: test
  script:
    - ls foo-1
    - ls foo-2
The output of install_foo-1_dependencies and install_foo-2_dependencies clearly shows the cache being created. However, when you look at the output of test_dependencies, it seems only the foo-1 cache exists.
install_foo-1_dependencies output:
Fetching changes...
Removing foo-1/
Checking cache for default-5...
Successfully extracted cache
Creating cache default-5...
....
foo-1: found 1000 matching files
Created cache
install_foo-2_dependencies output:
Fetching changes...
Removing installed-roles/
Checking cache for default-5...
Successfully extracted cache
Creating cache default-5...
....
foo-2: found 1000 matching files
Created cache
Output for test_dependencies
Fetching changes...
Removing foo-1/
Checking cache for default-5...
....
Successfully extracted cache
$ ls foo-1
files
$ ls foo-2
ls: cannot access foo-2: No such file or directory
You need to ensure the same runner is used for each stage of this pipeline. From the docs:
Tip: Using the same Runner for your pipeline is the most simple and efficient way to cache files in one stage or pipeline, and pass this cache to subsequent stages or pipelines in a guaranteed manner.
It's not apparent from your .gitlab-ci.yml file that you're ensuring the same runner picks up each stage. Again from these docs, to ensure that one runner is used, you should use one or a mix of the following:
Tag your Runners and use the tag on jobs that share their cache (see the sketch after this list).
Use sticky Runners that will be only available to a particular project.
Use a key that fits your workflow (e.g., different caches on each branch).
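For example, a sketch of the first option, reusing the tag that already appears on install_foo-2_dependencies:

install_foo-1_dependencies:
  stage: install
  tags:
    - ansible-f5-runner   # same tag on every job that shares the cache
  script:
    - pull foo-1 dependencies

test_dependencies:
  stage: test
  tags:
    - ansible-f5-runner
  script:
    - ls foo-1
    - ls foo-2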