Between 1 and 2 minutes of my AWS CodeBuilds are spent downloading dependencies from Maven Central.
Short of building a pre-provisioned Docker container, is there any way to cache these between builds?
CodeBuild now provides a cache feature you can use to pre-load your dependencies.
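For the Maven case in the question, a minimal sketch of a buildspec.yml that uses this cache feature could look like the following (assuming the build runs as root so the local repository sits in /root/.m2, and that an S3 cache has been enabled on the CodeBuild project):

version: 0.2

phases:
  build:
    commands:
      # dependencies resolved here land in the cached local repository
      - mvn -B verify

cache:
  paths:
    # cache the local Maven repository between builds
    - '/root/.m2/**/*'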
Unsigned's answer is good but a tad outdated. As of February 2019, CodeBuild allows both caching in an S3 bucket and caching locally on the build host. You can now specify caching at three different layers of a build:
Docker Layer Caching
Git Layer Caching (cache the source from the last build and only fetch the git diff on the next build)
Custom Caching - specified within the cache: portion of your buildspec.yml file. Personally, I cache my node_modules/ here (see the sketch below) and then cache at the Git layer.
Source: https://aws.amazon.com/blogs/devops/improve-build-performance-and-save-time-using-local-caching-in-aws-codebuild/
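As a rough sketch of that custom-cache layer (the phases and commands here are placeholders; local caching itself is switched on in the CodeBuild project settings, not in the buildspec):

version: 0.2

phases:
  install:
    commands:
      # node_modules/ populated here is restored on the next build
      - npm ci

cache:
  paths:
    - 'node_modules/**/*'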
I am using Yarn 3 with the Plug'n'Play (PnP) feature for zero-installs of the dependencies in my development environment, which works well. However, when I run a production build, the .yarn/cache folder also includes dev dependencies, which increases the overall size of the Docker image. Is there a way to include only the dependencies needed for production in the .yarn/cache folder for the production build?
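One possible approach (a sketch, not from this thread: it assumes the workspace-tools plugin and a GitLab-CI-style job, and the job name and image are placeholders) is to refetch only production dependencies into .yarn/cache just before the image is assembled:

prune-prod-dependencies:
  image: node:18
  script:
    # workspace-tools provides the "yarn workspaces focus" command
    - yarn plugin import workspace-tools
    # re-resolve the install with devDependencies excluded,
    # leaving only production packages in .yarn/cache
    - yarn workspaces focus --all --production
  artifacts:
    paths:
      - .yarn/cache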
OWASP Dependency-Check is a great way of automating vulnerability discovery in our projects, but when running it as part of a per-project CI pipeline it adds 3-4 minutes just to download the NVD database.
How can we cache this DB when running it with Maven/Gradle on a CI pipeline?
After a bit of research I found the way!
Basically, the files containing the NVD database are named nvdcve-1.1-[YYYY].json.gz (e.g. nvdcve-1.1-2022.json.gz) and are later added to a Lucene index.
When running Dependency-Check with the Gradle plugin the files are created in:
$GRADLE_USER_HOME/.gradle/dependency-check-data/7.0/nvdcache/
When running it with Maven they are created in:
$MAVEN_HOME/.m2/repository/org/owasp/dependency-check-data/7.0/nvdcache/
So to cache this DB on GitLab CI you just have to add the following to your .gitlab-ci.yml (Gradle):
before_script:
  - export GRADLE_USER_HOME=`pwd`/.gradle

cache:
  key: "$CI_PROJECT_NAME"
  paths:
    - .gradle/dependency-check-data
The first CI job run will create the cache, and subsequent runs (from the same or other pipelines) will fetch it!
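A Maven equivalent could follow the same idea (a sketch: it assumes redirecting the local repository into the project directory with -Dmaven.repo.local so GitLab can cache it, and the job name is a placeholder):

dependency-check:
  script:
    # keep the local repository inside the project dir so the
    # dependency-check data ends up in a cacheable path
    - mvn -Dmaven.repo.local=.m2/repository org.owasp:dependency-check-maven:check
  cache:
    key: "$CI_PROJECT_NAME"
    paths:
      - .m2/repository/org/owasp/dependency-check-data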
I have .terraform/modules folder generated by terraform itself.
It's where terraform keeps modules by default and I'm fine with that.
When running the terraform init command, if the .terraform folder is gone, Terraform will try to pull the modules again. I would like to avoid that step by telling it to use a pre-populated modules folder from a different location - essentially a shared cache folder for Terraform across our CI/CD pipelines: pull only if a new version of a module is specified, otherwise use the cache.
NOTE:
We don't run anything locally on Jenkins; every `Stage` uses ephemeral Docker container agents to run all the `Steps` and keep Jenkins clean, otherwise I would use a local workspace cache for all of this.
Is there a way to do that?
Thank you
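A sketch of what such pre-seeding could look like (this is not an official Terraform cache mechanism; the shared path, job name, and generic CI-YAML shape are all assumptions) is to copy a pre-populated .terraform/modules directory into the workspace before terraform init, which then reuses it instead of re-downloading:

terraform-plan:
  script:
    # /shared/terraform-cache is an assumed volume mounted into the agent
    - mkdir -p .terraform
    - cp -r /shared/terraform-cache/modules .terraform/ || true
    # init reuses the pre-seeded modules (including modules.json) if they
    # still match the configuration, otherwise it pulls them again
    - terraform init -input=false
    - terraform plan -input=false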
I have the following setup configured and working:
gitlab-ci, which uses docker-machine runner and uploads cache to S3
maven build with configured caching
caching correctly loads and uploads on each job
But the problem is that every time I run mvn install, something in the local Maven repository changes (I assume it updates POM metadata), and the GitLab runner keeps uploading a new version of the cache on every single build.
It is still faster and more reliable to use this "busted" cache than to download the dependencies from the internet every time, but the upload can take a long time and I would like to shave off this extra time.
How can I modify my build to force Maven to generate a cacheable local repository?
Simplified version of my .gitlab-ci.yml:
variables:
  # we have a custom java+maven image, that uses this ENV variable,
  # to auto-configure path where to put the local maven repository
  MAVEN_LOCAL_REPOSITORY: $CI_PROJECT_DIR/.cache/maven

job-build:
  stage: build
  image: internal-gitlab/java/maven:3.6-jdk8-alpine
  script:
    - mvn -B clean package
  cache:
    key: backend-dependencies
    paths:
      - .cache/
You have a constant as your cache key. Maybe a more fine-grained cache key would help - for example, keying the cache on your dependency definitions, as sketched below.
See the link here
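A sketch using GitLab's cache:key:files (so the key, and therefore the cache, changes only when pom.xml changes; the paths are taken from the question):

cache:
  key:
    files:
      - pom.xml
  paths:
    - .cache/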
In short - prepare your own Maven image with the required dependencies and use it instead of internal-gitlab/java/maven:3.6-jdk8-alpine.
Some details:
First of all, you need to create a Maven Docker image in which all (or most of) the dependencies required for your project are present. Publish it to your registry (GitLab has one) and use it instead of internal-gitlab/java/maven:3.6-jdk8-alpine.
To create such an image I usually add a manually triggered CI job (see the sketch below). You trigger it once at the initial stage and again whenever the project dependencies are heavily modified.
Working sample can be found here:
https://gitlab.com/alexej.vlasov/syncer/blob/master/.gitlab-ci.yml
- this project uses the prepared image and also has a job to prepare this image.
https://gitlab.com/alexej.vlasov/maven/blob/master/Dockerfile
- a Dockerfile that runs Maven and downloads the dependencies once.
The pros:
don't need to download dependencies each time - they are inside a Docker image (and Docker layers are cached on the runners)
don't need to upload artifacts when the job is finished
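For reference, such a manually triggered image-build job might look roughly like this (a sketch: the stage, job name, image tags and Dockerfile are assumptions; the registry variables are GitLab's predefined ones):

build-maven-image:
  stage: prepare
  image: docker:24
  services:
    - docker:24-dind
  variables:
    DOCKER_TLS_CERTDIR: "/certs"
  when: manual
  script:
    - docker login -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD" "$CI_REGISTRY"
    # the Dockerfile runs Maven once to bake the dependencies into a layer
    - docker build -t "$CI_REGISTRY_IMAGE/maven-deps:latest" .
    - docker push "$CI_REGISTRY_IMAGE/maven-deps:latest"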
Since travis-ci.org doesn't support bitbucket.org, I need another CI service which supports it and allows managing the build commands in a file in the VCS (like .travis.yml in Travis).
My rather frustrating research results so far:
semaphoreci.com: projects which are forks aren't listed even after refreshing the project list
app.shippable.com: signing up with both github.com and bitbucket.org doesn't work
codeship.com: doesn't support running commands as the root user (https://codeship.com/documentation/faq/root-level-access/)
www.snap-ci.com: no support for bitbucket.org (http://www.slant.co/topics/186/~hosted-continuous-integration-services)
I don't get why people would not want to share the CI service's build commands in the VCS - the chances of good collaboration without such a feature seem small to me. Even if one adds a script file to the VCS, it still needs to be set up in the CI service, which appears to be an unnecessary step.
A few months ago Bitbucket launched Pipelines. Quoting from the link:
Continuous delivery is now seamlessly integrated into your Bitbucket Cloud repositories.
You may use it on free plans, but next year they will reduce the build minutes for free plans from 500 minutes to 50 minutes, as announced in this link.
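The build commands live in a bitbucket-pipelines.yml checked into the repository; a minimal sketch (the image and commands here are placeholders, and maven is one of the predefined caches) looks like:

image: maven:3.6-jdk-8

pipelines:
  default:
    - step:
        caches:
          - maven
        script:
          - mvn -B clean verify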
Also, CircleCI supports Bitbucket. It has a free plan with 1500 build minutes and can be triggered by a commit or tag in Bitbucket. https://circleci.com/
The company that owns Bitbucket also has a CI product called Bamboo, though most CI services should work with any Git host that provides a webhook.
According to this blog, it is possible to use Travis-CI for Bitbucket:
Clone github repository:
git clone https://github.com/{github_user}/{github_repository}
cd {github_repository}
Add the Bitbucket repository as a submodule:
git submodule add https://bitbucket.org/{bitbucket_user}/{bitbucket_repository}
Add .travis.yml to root dir:
git:
  submodules: false

before_install:
  - echo -e "machine bitbucket.org\n login $BITBUCKET_USER_NAME\n password $BITBUCKET_USER_PASSWORD" >~/.netrc
  - git submodule update --init --recursive
$BITBUCKET_USER_NAME is the Bitbucket username
$BITBUCKET_USER_PASSWORD is an app password
Open https://travis-ci.org/{github_user}/{github_repository}
A Semaphore CI user can add a fork of a project to their Semaphore account by following these steps on the documentation page. Semaphore also builds fork pull requests, and those builds are visible.
There is also (now) an option to use GitLab as a CI/CD server for a repository hosted on Bitbucket.
See the documentation here: on the GitLab site