Pass data between GitLab pipelines - continuous-integration

I need to pass the folders generated in one pipeline to the next pipeline in Gitlab CI. What are the possible ways?
Is it possible through just Artifacts?
Can we only achieve it through cache?
If by Cache, is there any expiry that we can set in cache?
My actual question was (but no answers so far):
Carry artifacts of GitLab Pages between pipelines/jobs

There is a simple distinction:
Cache is shared between multiple runs of the same job in different pipelines, but only on the same runner (unless you have configured shared cache storage)
Artifacts are used to pass files between different jobs within a single pipeline
Jobs may specify an artifacts:expire_in keyword to control the lifespan of their artifacts (see https://docs.gitlab.com/ee/ci/yaml/README.html#artifactsexpire_in ).
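For illustration, a minimal .gitlab-ci.yml sketch showing both mechanisms side by side (the job names, paths and commands are made up; a Maven project is assumed):

build:
  stage: build
  script:
    - mvn package
  cache:
    key: "$CI_COMMIT_REF_SLUG"   # reused by later pipelines, on the same runner
    paths:
      - .m2/repository/
  artifacts:
    paths:
      - target/                  # passed to later jobs in this pipeline
    expire_in: 1 week            # artifacts are deleted after this period

deploy:
  stage: deploy
  script:
    - ./deploy.sh target/        # the build job's artifacts are available here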

Related

How to disable GitLab build caching for a job

In GitLab there seem to be some sort of build cache.
For example, I have a job which builds and tags a Docker image. The job succeeds and the build log looks normal, but the image isn't actually created on the runner. The same thing happens with files: I write to a file, and it doesn't exist after the job finishes. I suspect the build uses some kind of cache, as it executes so fast in these scenarios.
This behavior seems to manifest most often with detached pipelines, tag pipelines, and in general when pipelines point to the same commit via different refs.
How do I disable job caching and force the side effects to happen?
According to GitLab's documentation on caching you can try:
job_name:
  cache: []
  ...
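For context, this is roughly how that would look next to a globally configured cache (the global cache, image name and scripts here are only placeholders):

# global cache, used by every job unless the job overrides it
cache:
  paths:
    - node_modules/

build:
  script:
    - npm ci
    - npm run build

job_name:
  cache: []                      # overrides the global cache, so this job neither downloads nor uploads a cache
  script:
    - docker build -t my-image:latest .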

How to prevent GitLab CI/CD from deleting the whole build

I'm currently having a frustrating issue.
I have GitLab CI set up on a VPS server, and it works completely fine; my pipelines run without a problem.
The issue comes after having to re-run a pipeline. Each time, GitLab deletes the whole folder where the build is and builds it again to deploy it. My problem is that I have an "uploads" folder that stores all the content uploaded by users, and each time I re-run a pipeline everything in this folder gets deleted, and I obviously need this content, because it is the purpose of the app.
I have tried the GitLab CI cache with no luck. I have also tried putting the data in a new folder that isn't in the repository; it gets deleted too.
Running my first job looks like so:
[Screenshot of the job log]
As you can see, there are a lot of lines that say "Removing ..."
In order to persist a folder with local files across CI pipelines, the best approach is to use Docker data persistence: you can delete everything from the last build while keeping the local files used by your application between builds, and you still retain the ability to start from scratch every time you start a new pipeline.
Bind-mount volumes
Volumes managed by Docker
GitLab's CI/CD Documentation provides a short briefing on how to persist storage between jobs when using Docker to build your applications.
I'd also like to point out that if you're using GitLab Runner through SSH, they explicitly state that caching between builds is not supported with this executor. Even when using the standard Shell executor, they highly discourage saving data to the Builds folder, so it can be argued that the best-practice approach is to use a bind-mount volume on your host and isolate the application from the user-uploaded data.
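As a rough sketch of that approach (all paths and the service name are assumptions, not taken from the question), a Compose file could bind-mount the uploads directory from the host, so redeployments rebuild the application without ever touching the uploaded content:

version: "3.8"
services:
  app:
    build: .
    ports:
      - "8080:8080"
    volumes:
      # host path that lives outside the CI build directory, so redeploys
      # rebuild the application but never touch user-uploaded content
      - /srv/myapp/uploads:/var/www/app/uploads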

Converting DSL jobs into a pipeline in Jenkins

I'm trying to migrate old-fashioned Jenkins DSL jobs (in Groovy) to the new declarative pipeline form.
Since I'm very new to the pipeline and I could not find any answer to my noob problem, I'll firstly describe my scenario here:
Suppose I have 3 DSL jobs: one to build and store the generated artifact in a repository like Artifactory, another to tag the master branch, and the last one to deploy to prod. All jobs use the same Git repository.
The building job is usually run many times during development. It can be triggered manually or as a response to events in the Git repo, e.g. merge requests and pushes.
For simplicity, let's assume the tagging job only needs to tag the master branch in the repo. This will only be run once in a while, manually, when we are pretty sure the master branch will go to prod.
Artifact gets deployed using a third job, also manually.
So here are my questions:
As I understand it, we can only have one Jenkinsfile per branch in the repo, so how can I configure such a setup using a pipeline defined in only one Jenkinsfile?
How can I manually trigger only the tagging job (meaning compile/test/generate the artifact without uploading, and then, if everything tests OK, tag the version)?
In this situation, would it be easier for me to implement just the building job in the pipeline and keep the others as DSL scripts?
Many thanks for any suggestions!

How to cache job results only for running pipeline?

I wrote a pipeline to build my Java application with Maven. I have feature branches and a master branch in my Git repository, so I have to separate the Maven goals package and deploy. Therefore I created two jobs in my pipeline. The last job needs the job results from the first job.
I know that I have to cache the job results, but I don't want to
expose the job results to the GitLab UI
expose them to the next run of the pipeline
I tried the following solutions without success.
Using cache
I followed How to deploy Maven projects to Artifactory with GitLab CI/CD:
Caching the .m2/repository folder (where all the Maven files are stored), and the target folder (where our application will be created), is useful for speeding up the process by running all Maven phases in a sequential order, therefore, executing mvn test will automatically run mvn compile if necessary.
but this solution shares job results between pipelines, see Cache dependencies in GitLab CI/CD:
If caching is enabled, it’s shared between pipelines and jobs at the project level by default, starting from GitLab 9.0. Caches are not shared across projects.
and also it should not be used for caching in the same pipeline, see Cache vs artifacts:
Don’t use caching for passing artifacts between stages, as it is designed to store runtime dependencies needed to compile the project:
cache: For storing project dependencies
Caches are used to speed up runs of a given job in subsequent pipelines, by storing downloaded dependencies so that they don’t have to be fetched from the internet again (like npm packages, Go vendor packages, etc.) While the cache could be configured to pass intermediate build results between stages, this should be done with artifacts instead.
artifacts: Use for stage results that will be passed between stages.
Artifacts are files generated by a job which are stored and uploaded, and can then be fetched and used by jobs in later stages of the same pipeline. This data will not be available in different pipelines, but is available to be downloaded from the UI.
Using artifacts
This solution is exposing the job results to the GitLab UI, see artifacts:
The artifacts will be sent to GitLab after the job finishes and will be available for download in the GitLab UI.
and there is no way to expire the artifacts as soon as the pipeline finishes, see artifacts:expire_in:
The value of expire_in is an elapsed time in seconds, unless a unit is provided.
Is there any way to cache job results only for the running pipeline?
There is no way to send build artifacts between jobs in GitLab that only keeps them as long as the pipeline is running. This is how GitLab has designed their CI solution.
The recommended way to send build artifacts between jobs in GitLab is to use artifacts. This feature always uploads the files to the GitLab instance, which they call the coordinator in this case. These files are available through the GitLab UI, as you write. For most cases this is a complete waste of space, but in rare cases it is very useful, as you can download the artifacts and check why your pipeline broke.
The artifacts are available for download by project members who are at least Reporters, but can be viewed by everybody if public pipelines are enabled. You can read more about permissions here.
To not fill up your hard disk or quotas, you should use expire_in. You could set it to just a few hours if you really don't want to waste space. I would not recommend this though, because if a job that depends on these artifacts fails and you retry it after the artifacts have expired, you will have to restart the whole pipeline. I usually set this to one week for intermediate build artifacts, as that often fits my needs.
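Applied to the Maven package/deploy split from the question, that could look roughly like this (the one-week expiry, script names and the master-only rule are assumptions):

stages:
  - build
  - deploy

package:
  stage: build
  script:
    - mvn package
  artifacts:
    paths:
      - target/*.jar
    expire_in: 1 week          # long enough that the deploy job can still be retried

deploy:
  stage: deploy
  script:
    - ./deploy.sh target/*.jar
  dependencies:
    - package                  # fetch only the package job's artifacts
  only:
    - master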
If you want to use caches for keeping build artifacts, maybe because your build artifacts are huge and you need to optimize it, it should be possible to use CI_PIPELINE_ID as the key of the cache (I haven't tested this):
cache:
  key: ${CI_PIPELINE_ID}
The files in the cache should be stored where your runner is installed. If you make sure that all jobs that need these build artifacts are executed by runners that have access to this cache, it should work.
You could also try some of the other predefined environment variables as the key of your cache.
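A possible shape for that, untested as noted above (job names, paths and the deploy script are placeholders; both jobs must run on runners that share the same cache storage):

package:
  stage: build
  script:
    - mvn package
  cache:
    key: ${CI_PIPELINE_ID}
    paths:
      - target/
    policy: push               # this job only uploads the cache

deploy:
  stage: deploy
  script:
    - ./deploy.sh target/
  cache:
    key: ${CI_PIPELINE_ID}
    paths:
      - target/
    policy: pull               # this job only downloads the cache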

How can I write to a Shared Resource of Custom Values in TeamCity?

TeamCity has a feature called "Shared Resources". This allows you to configure a set of custom values (i.e., URLs) which can be read by the builds that lock them, in order to properly share data and avoid resource conflicts. However, in my situation, the custom values (i.e., environment names) are not known yet. There is a separate build which will be able to create the environments. How can I have this build "write" into the custom values of the TeamCity Shared Resources so that the dependent builds can retrieve the most up-to-date values?
Currently, parameters are not supported in Shared Resources values in TeamCity. Here is the related request, https://youtrack.jetbrains.com/issue/TW-39174, which you can vote for.
