I would like to verify that an rpm is available from Nexus 3 after it is uploaded.
When an rpm is uploaded to Nexus 3, the following events happen (looking at the logs):
Scheduling rebuild of yum metadata to start in 60 seconds
Rebuilding yum metadata for repository rpm
...
Finished rebuilding yum metadata for repository rpm
This takes a while. In my CI pipeline I would like to check periodically until the artifact is available to be installed.
The pipeline builds the rpm, uploads it to Nexus 3, and then checks every 10 seconds whether the rpm is available. To check the availability of the rpm I'm running the following command:
yum clean all && yum --disablerepo="*" --enablerepo="the-repo-I-care-about" list --showduplicates | grep <name_of_artifact> | grep <expected_version>
The /etc/yum.conf contains:
cachedir=/var/cache/yum/$basearch/$releasever
keepcache=0
debuglevel=2
logfile=/var/log/yum.log
exactarch=1
obsoletes=1
gpgcheck=1
plugins=1
installonly_limit=5
distroverpkg=centos-release
http_caching=none
The /etc/yum.repos.d/repo-i-care-about.repo contains:
[repo-i-care-about]
name=Repo I care about
enabled=1
gpgcheck=0
baseurl=https://somewhere.com
metadata_expire=5
mirrorlist_expire=5
http_caching=none
The problem I'm experiencing is that the list response seems to return stale information.
The metadata rebuild takes about 70 seconds (the initial 60-second delay is configurable; I will tweak it eventually), and I'm checking every 10 seconds. The response from the yum repo sometimes looks cached: when that happens, if I perform the same search on another box with the same repo settings I get the expected artefact version.
The fact that the other machine gets the expected result on the first attempt with the same list command, while the machine that checks every 10 seconds never seems to receive the expected result (even several minutes after the artefact is available on the other box), makes me think that the response gets cached.
I would like to avoid waiting 90 seconds or so before making the first list request (to make sure that the very first time I perform the list command the artefact is most likely ready and I don't cache a stale result), especially because the initial delay before the metadata rebuild is scheduled might change (from 60 seconds we might move to a lower value).
The flakiness of this check improved after I added http_caching=none to yum.conf and to the repo definition, but that still didn't make the problem go away reliably.
Are there any other caching-related settings I'm supposed to configure in order to get more reliable results from the list command? At this point I really don't care how long the list command takes, as long as its output does not contain stale information.
Looks like deleting the /var/cache/yum/* folders is making the check more reliable. Still, it feels like I'm missing some settings to achieve what I need in a neater way.
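For reference, here is a minimal sketch of the kind of polling loop I have in mind (the repo id, package name and version are placeholders); it drops the cached metadata for just this repo before every attempt:
#!/usr/bin/env bash
REPO="repo-i-care-about"   # placeholder repo id
PKG="name_of_artifact"     # placeholder package name
VER="1.2.3"                # placeholder expected version
for attempt in $(seq 1 30); do
  # clear only this repo's cached metadata, plus its on-disk cache dir (cachedir from yum.conf)
  yum --disablerepo="*" --enablerepo="$REPO" clean metadata
  rm -rf /var/cache/yum/*/*/"$REPO"
  if yum --disablerepo="*" --enablerepo="$REPO" list --showduplicates 2>/dev/null \
       | grep "$PKG" | grep -q "$VER"; then
    echo "Found $PKG-$VER in $REPO"
    exit 0
  fi
  sleep 10
done
echo "Timed out waiting for $PKG-$VER" >&2
exit 1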
Related
I have a Gitlab CI/CD pipeline with multiple jobs running in parallel, each job executes mvn test package.
Because there are a lot of dependencies, I'm using GitLab's caching feature to store the .m2 folder:
cache:
  key: "$CI_PROJECT_NAME"
  paths:
    - .m2/repository
I'm using the CI_PROJECT_NAME as I want the cache to be available to all jobs in all branches.
It mostly works: in many jobs I see the build succeed, followed by a message that the cache was either created or is already up to date:
Creating cache my-project-name...
.m2/repository: found 10142 matching files and directories
Archive is up to date!
Created cache
But in some jobs, Maven suddenly fails:
355804 [ERROR] Failed to execute goal net.alchim31.maven:scala-maven-plugin:4.5.2:compile (default) on project spark: wrap: scala.reflect.internal.FatalError: Error accessing /builds/Kxs9HrJp/4/analytics/my-project-name/.m2/repository/org/apache/spark/spark-catalyst_2.12/3.1.1/spark-catalyst_2.12-3.1.1.jar: zip END header not found -> [Help 1]
It seems that the cache was somehow corrupted. If I execute the same job again, it now consistently fails. If I clear the runner cache through the UI, the same job runs successfully again until it fails for another file at some point.
I have a feeling that the concurrent runs are the problem, but I don't know why.
Each job downloads the current state of the cache at the beginning.
Even if it's not up to date, maven will simply download the missing libraries.
If two or more jobs try to update / upload the cache "at the same time", it's OK for the last one to win and overwrite the others' cache.
Any idea what's happening here?
I think it's related to concurrent workers (a read happening at the same time as a write, perhaps). If you had received the error only once, I would suspect a connection problem between the runner and the cache location, but since you're seeing it more often, the problem is probably concurrency.
Try changing your key to something more specific, such as the branch or commit hash, and try again:
cache:
  key: $CI_COMMIT_REF_SLUG
  paths:
    - .m2/repository
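Going one step further, a sketch that also scopes the cache per job and pins Maven's local repository to the cached path (the key layout and the MAVEN_OPTS line are assumptions about your setup):
variables:
  MAVEN_OPTS: "-Dmaven.repo.local=.m2/repository"   # keep Maven's local repo inside the project dir so it can be cached

cache:
  key: "$CI_JOB_NAME-$CI_COMMIT_REF_SLUG"   # one cache per job and branch, so parallel jobs stop overwriting each other
  paths:
    - .m2/repository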
Or use a distributed cache location like S3 with versioning enabled.
Hello, I just upgraded my website to Jekyll 4.0.0 and it's starting to take a long time to compile, up to 10 minutes sometimes. But when I use the incremental build locally it's able to produce the compiled version in a few seconds. So I tried to cache all the Jekyll-related cache folders I could find. I'm using CircleCI; this is my config.yml:
- save_cache:
    key: site-cache-260320
    paths:
      - _site
      - .jekyll-cache
      - .jekyll-metadata
      - .sass-cache
This restores the cache folders to the repo when the CircleCI job starts. But it doesn't seem like they get reused in the compilation process. The job always takes almost 10 minutes to compile.
Am I missing a cache folder? Is there a Jekyll option I need to use? If I could get my website build/deploys down to a few seconds that would be life changing. Thanks!
The CircleCI documentation on caching also mentions:
CircleCI restores caches in the order of keys listed in the restore_cache step. Each cache key is namespaced to the project, and retrieval is prefix-matched. The cache will be restored from the first matching key
steps:
  - restore_cache:
      keys:
        - site-cache-260320
So make sure to configure a restore_cache step to go along with your save_cache step, and keep an eye on the size of the cache.
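A sketch of what the paired steps could look like (the job layout and the --incremental flag are assumptions about your build command):
steps:
  - checkout
  - restore_cache:
      keys:
        - site-cache-260320   # must match (or prefix-match) the key used by save_cache
  - run: bundle exec jekyll build --incremental
  - save_cache:
      key: site-cache-260320
      paths:
        - _site
        - .jekyll-cache
        - .jekyll-metadata
        - .sass-cache
Note that CircleCI caches are immutable: with a fixed key the save_cache step becomes a no-op once the cache exists, so bump the key (as your date suffix suggests) whenever you want a fresh archive.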
I recently encountered a GitLab pipeline issue where my node_modules weren't being updated with newer versions of a library (particularly my own internal fork of a project, which uses the git+url syntax). I suspect that, since the git+url doesn't have a version number in it, it's tricky to hash the package file and detect that there is a change...
My workaround was to try putting a $date entry in the cache key of my .gitlab-ci.yml file, so that the cache is discarded every 24 hours. However, there is no CI variable listed that contains a date, and it doesn't seem that you can access OS variables everywhere in the YAML file. Is there a neat trick I can use?
I tried:
cache:
  key: "$(date +%F)" # or see: https://gitlab.msu.edu/help/ci/variables/README.md
  paths:
    - node_modules

before_script:
  - echo Gitlab job started $(date)
This doesn't seem to work: I think it just outputs the key string verbatim, although note that the echo command in the script does expand $(date).
Anyone have any neat ideas? For now, I am just putting in a manual string and will add a digit when I want the cache to be blown away (although it is a bit error-prone).
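For what it's worth, a sketch of that manual-string approach (CACHE_VERSION is a made-up variable name): ordinary CI variables do get expanded in the cache key, it's only shell substitution like $(date) that doesn't, so a value you bump by hand (or override in a scheduled pipeline) does the job:
variables:
  CACHE_VERSION: "v1"   # bump this by hand, or override it in a scheduled pipeline, to bust the cache

cache:
  key: "$CI_COMMIT_REF_SLUG-$CACHE_VERSION"
  paths:
    - node_modules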
At this time there is no way to set the cache expiration time for CI jobs. If the cache is using too much disk space and you're using the Docker executor, you can explore a tool such as https://gitlab.com/gitlab-org/gitlab-runner-docker-cleanup which will keep X amount of disk space free on the runner at any given time by expiring older cache.
I'm amazed at how well Docker's layer caching works, but I'm also wondering how it determines whether it can use a cached layer or not.
Let's take these build steps for example:
Step 4 : RUN npm install -g node-gyp
---> Using cache
---> 3fc59f47f6aa
Step 5 : WORKDIR /src
---> Using cache
---> 5c6956ba5856
Step 6 : COPY package.json .
---> d82099966d6a
Removing intermediate container eb7ecb8d3ec7
Step 7 : RUN npm install
---> Running in b960cf0fdd0a
For example, how does it know that it can use the cached layer for npm install -g node-gyp but has to create a fresh layer for npm install?
The build cache process is explained fairly thoroughly in the Best practices for writing Dockerfiles: Leverage build cache section.
Starting with a parent image that is already in the cache, the next instruction is compared against all child images derived from that base image to see if one of them was built using the exact same instruction. If not, the cache is invalidated.
In most cases, simply comparing the instruction in the Dockerfile with one of the child images is sufficient. However, certain instructions require more examination and explanation.
For the ADD and COPY instructions, the contents of the file(s) in the image are examined and a checksum is calculated for each file. The last-modified and last-accessed times of the file(s) are not considered in these checksums. During the cache lookup, the checksum is compared against the checksum in the existing images. If anything has changed in the file(s), such as the contents and metadata, then the cache is invalidated.
Aside from the ADD and COPY commands, cache checking does not look at the files in the container to determine a cache match. For example, when processing a RUN apt-get -y update command the files updated in the container are not examined to determine if a cache hit exists. In that case just the command string itself is used to find a match.
Once the cache is invalidated, all subsequent Dockerfile commands generate new images and the cache is not used.
You will run into situations where OS packages, npm packages or a Git repo are updated to newer versions (say a ~2.3 semver range in package.json), but because your Dockerfile or package.json hasn't changed, Docker will continue using the cache.
It's possible to programmatically generate a Dockerfile that busts the cache by modifying lines based on smarter checks (e.g. retrieving the latest Git branch shasum from a repo to use in the clone instruction). You can also periodically run the build with --no-cache=true to force updates.
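For example, a one-off build that ignores the cache entirely (the image name is a placeholder):
docker build --no-cache -t my-app:latest .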
It's because your package.json file has been modified; see the Removing intermediate container line in your build output.
That's also usually the reason why package-manager (vendor/3rd-party) info files are COPY'ed first during docker build. After that you run the package-manager installation, and then you add the rest of your application, i.e. src.
If you've no changes to your libs, these steps are served from the build cache.
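A minimal sketch of that ordering for the npm case above (the base image and manifest file names are assumptions):
FROM node:18-alpine                        # assumed base image
WORKDIR /src
COPY package.json package-lock.json ./     # cached by file checksum: only changes to these invalidate the next layer
RUN npm install                            # re-runs only when the manifests above have changed
COPY . .                                   # copying the app source no longer busts the npm install layer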
The selfupdate command is incredibly slow in MacPorts. It seems to be stuck for ages on this step:
---> Updating the ports tree
Synchronizing local ports tree from http://distfiles.macports.org/ports.tar.gz
I believe it is taking a long time just to download the ports.tar.gz file (on the order of 9-22 kbps). I've downloaded the file myself with the axel downloader (at 100-300 kbps). How can I make selfupdate use that local copy so that it can work offline, at least for the ports.tar.gz file? Is this even possible?
Nailed it! Got the solution to the problem. All I had to do was put the local path in the /opt/local/etc/macports/sources.conf file in place of the old one, just like this:
#rsync://rsync.macports.org/release/tarballs/ports.tar [default]
#https://www.macports.org/files/ports.tar.gz [default]
file:////Users/Ebe/Downloads/Axel/ports.tar.gz
I commented out the existing entries just in case. (For reference, the second line in the conf file came from another post.)
I then executed sudo port selfupdate and the process completed without any delay.
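In other words, the workflow looks roughly like this (the paths and the connection count are just examples):
# download the ports tree tarball with axel, using several connections
axel -n 10 -o ~/Downloads/Axel/ports.tar.gz http://distfiles.macports.org/ports.tar.gz
# point sources.conf at the local file (as shown above), then run
sudo port selfupdate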