How does Travis CI cache Gradle dependencies? - caching

In Travis documentation about caching dependencies, it mentions:
The cache’s purpose is to make installing language-specific dependencies easy and fast, so everything related to tools like Bundler, pip, Composer, npm, Gradle, Maven, is what should go into the cache.
Large files that are quick to install but slow to download do not benefit from caching, as they take as long to download from the cache as from the original source.
I am using Gradle in my Java project.
It seems that what Gradle caches are .jar files, which should fall into the "quick to install" category.
So my question is: why does Travis recommend caching Gradle dependencies if .jar files are quick to install but slow to download?
Where do the benefits (in terms of shorter build time) come from?

It's a good question. I'm not sure about the exact benefit of using the cache, because I never measured S3 download times, but it's probably faster.
At the end of the linked page they explain:
If you store archives larger than a few hundred megabytes in the cache, it’s unlikely that you’ll see a significant speed improvement.
It seems that they consider it faster to cache a lot of small files than to download them independently.
Gradle dependencies fit in this category: they are quick to install and FAST to download, because each individual file is small.
They don't recommend using the cache for files that are quick to install but SLOW to download, like the 1 GB Android system images.
In my opinion, they say this because caching such files eats into their S3 quotas (I have no idea about the terms of this service) for a negligible benefit to you in this case.
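For reference, the cache configuration Travis documents for Gradle projects looks roughly like this .travis.yml fragment (a sketch based on their docs; the before_cache cleanup removes lock files that would otherwise make the cache look changed on every build):
before_cache:
  - rm -f  $HOME/.gradle/caches/modules-2/modules-2.lock
  - rm -fr $HOME/.gradle/caches/*/plugin-resolution/
cache:
  directories:
    - $HOME/.gradle/caches/
    - $HOME/.gradle/wrapper/
With this in place the build pulls one archive of ~/.gradle from the Travis cache instead of contacting every repository for each individual .jar.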

Related

Is there a way to download all the maven dependencies separately and then import them locally

I am suffering from slow Maven downloads.
I use the default configuration for Spring Boot starter project.
Group: com.example
Artifact: Demo
STS takes too much time to sync the content. Is there a way to speed this up? For example, could I download everything the project needs separately and then import it locally?
If you look carefully at the screenshot, you'll notice that it says the Maven download operation is blocked by a user operation.
The comments saying the build is slower under an IDE (particularly Eclipse) are wrong in the sense that they are a kind of mental shortcut. They are based on observation (it may indeed take longer to reach the end result), but that does not mean the build/download itself is slower. The thing is that Eclipse performs far more operations than the build alone, and sometimes those end up waiting on one another (as your screenshot clearly indicates).
With that in mind, if you run your build on the command line it may complete a lot faster, as it most likely will not compete for resources with other tasks. But keep in mind that this will leave Eclipse out of sync with what is actually on the file system. Eventually Eclipse will figure that out and try to sync. Sometimes it may not, and you will have to trigger the sync manually. In both cases, depending on the size and number of the projects and the number and complexity of changes made, it may take significant time to sync.
To summarize, what you are experiencing is not "slow Maven downloads" but multiple tasks competing for resources and waiting on each other. There is no point in pre-downloading all dependencies, as this is not a recurring operation: Maven ONLY downloads missing dependencies. Once they are in the local repository, it will not try to download them again (unless you force it to).
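That said, if you really do want to pre-fetch everything (for example, to be able to build later without network access), Maven has a plugin goal for exactly that; a minimal sketch:
# resolve and download every dependency and plugin the project declares
mvn dependency:go-offline
# later builds can then run in offline mode, using only the local repository
mvn -o clean package
This only pays off if the local repository (~/.m2/repository) is preserved between builds; on a throwaway CI agent it needs to be combined with caching that directory.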

NPM caching similar to a local Maven cache

Gradle's dependency management system stores downloaded artifacts in a local on-disk cache, much like Maven's local repository. When a build requests the same dependency again, it is simply retrieved from the cache, avoiding any network transfer of the artifact.
I'm trying to replicate this behavior with NPM for building JavaScript projects. I was expecting NPM to support a global node_modules cache, but installing a package "globally" in NPM means something different: the package's executables are added to the PATH so it can be used as a CLI tool.
Reading the documentation for npm install, the standard behavior is to install packages into a local node_modules directory. But this means many duplicated packages on the system, wasting valuable disk space. It also poses a problem for clean production builds, since ideally node_modules should be blown away each time.
Does NPM support something like Gradle's dependency caching? The documentation on the NPM cache doesn't make it any clearer how it is meant to be used. What's more, it's not obvious whether a caching strategy with NPM is safe across multiple parallel builds.
This seems like such a basic requirement for busy CI environments that it must have been solved before. I found the npm-cache tool which seems to offer this support, but it would be much better if caching was supported natively in npm itself.
Thanks!
IMHO it is a pity that the makers did not learn from tools like Maven that were already around. If you are doing microservices, have many apps on your machine, and perhaps also multiple branches or a local Jenkins, you will have each dependency on disk N*M times, which is an extraordinary waste of disk space and performance. So be aware that Java and .NET/C# are mature ecosystems, while the JavaScript ecosystem is still in its childhood, with lots of flaws and rough edges. But JavaScript is evolving fast, so let's hope for the best. Feel free to discuss your pain with the npm makers (https://github.com/npm/npm/issues/).
However, a partial cure is to move away from npm and switch to Yarn: http://yarnpkg.com/
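For what it's worth, part of what Yarn improves here is that it keeps a single global cache shared by all projects; a sketch of pointing that cache at a persistent directory on a CI agent (the path is just an example):
# print the location of Yarn's shared package cache
yarn cache dir
# point the cache at a persistent directory on the build agent (example path)
yarn config set cache-folder /var/ci/yarn-cache
Installs in every project on that machine then reuse the same downloaded tarballs instead of fetching them from the registry again.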
NPM cache already comes bundled with NPM out of the box (listed under CLI commands). Its main purpose is to avoid transferring the same package over the network over and over.
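As a rough illustration, on a CI machine that cache can be pointed at a persistent directory so it survives between builds (the path is just an example, nothing tool-specific):
# point npm's package cache at a persistent directory (example path)
npm config set cache /var/ci/npm-cache --global
# npm consults this cache before going to the network for package data
npm install
Newer npm versions also offer npm ci --prefer-offline for reproducible, cache-first installs from a lockfile. Either way this only avoids re-downloading tarballs; it does not stop each project from keeping its own copy in node_modules.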
Regarding the duplicate packages issue, as of npm v3 there has been an effort in terms of finding ways to deduplicate dependencies. But it still does not work exactly like Gradle since it is still possible to end up with duplicates of the same package in your node_modules folder.
Per NPM documentation:
Your node_modules directory structure, and therefore your dependency tree, is dependent on install order
Although a fresh npm install from the same package.json always produces the same dependency tree:
The npm install command, when used exclusively to install packages from a package.json, will always produce the same tree. This is because install order from a package.json is always alphabetical. Same install order means that you will get the same tree.
So at least there is a way to get consistent dependency trees, albeit with no guarantee that it will be the most efficient one. At least those differences do not interfere with the correct functioning of NPM.
Hope that helps.

What is the best way to save composer dependencies across multiple builds

I am currently using the Atlassian Bamboo build server (cloud-based, using AWS) and have an initial task that simply does a composer install.
This single task can take quite a bit of time, which is a pain when developers have committed multiple times, giving the build server four builds that all download dependencies (these do not run in parallel).
I wish to speed this process up but cannot figure out a way to save the dependencies to a common location for use across multiple builds while still allowing the application to run as intended (Laravel).
Answer
Remove composer.lock from your .gitignore
Explanation
When you run composer install for the first time, Composer has to check all of your dependencies (and their dependencies, etc.) for compatibility. Running through the whole dependency tree is quite expensive, which is why it takes so long.
After figuring out all of your dependencies, Composer then writes the exact versions it uses into the composer.lock file, so that subsequent composer install commands will not have to spend that much time running through the whole graph.
If you commit your composer.lock file, it will come along to your Bamboo server, and the composer install command will be way faster.
Committing composer.lock is a best practice regardless. To quote the docs:
Commit your application's composer.lock (along with composer.json) into version control.
This is important because the install command checks if a lock file is present, and if it is, it downloads the versions specified there (regardless of what composer.json says).
This means that anyone who sets up the project will download the exact same version of the dependencies. Your CI server, production machines, other developers in your team, everything and everyone runs on the same dependencies, which mitigates the potential for bugs affecting only some parts of the deployments.
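Beyond committing composer.lock, build agents commonly also keep Composer's own download cache between builds; a sketch of what that can look like in the Bamboo task's script (paths are examples):
# keep Composer's package cache on a path that persists across builds (example path)
export COMPOSER_CACHE_DIR=/var/bamboo/composer-cache
# install exactly the versions pinned in composer.lock, preferring cached dist archives
composer install --prefer-dist --no-interaction --no-progress
With the lock file committed and the cache directory preserved, most of the install step becomes unzipping archives that were already downloaded.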

Speed up Adobe CQ5 Maven build

I need a solution to speed up the Maven build process.
The project is based on Adobe CQ5, otherwise known as AEM, and I have more than 10 modules in my project, where the build happens linearly.
Currently the build process takes more than 10 minutes to compile.
Is there any specific tool or other way to speed up the process?
Thanks
I have one suggestion: if you have 10 modules, then make a separate profile for each module and build only the part you are changing and modifying; there is no need to deploy all 10 modules each time unnecessarily. It will not speed up the Maven build itself, but it can help you save time. It is a workaround, but it will be helpful.
Try mvn -T 4 clean install (builds with 4 threads).
It's a multi-threaded mode for running Maven and is faster. Apache documentation here.
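Maven's parallel mode also accepts a per-core multiplier, which adapts to whatever agent runs the build; for example:
# fixed number of threads
mvn -T 4 clean install
# or scale with the machine: one thread per available CPU core
mvn -T 1C clean install
Modules that do not depend on each other are built concurrently, so the gain depends on how inter-dependent your 10 modules are.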
To add to the other answers:
1) Give more memory resources to the server where the AEM instance is installed; content creation involves a lot of disk access, so using an SSD is a must.
2) Having a clean AEM instance helps to speed up the process. As you may know, the AEM repository grows because of revisions, so each time you deploy, the repository size grows and it becomes slower. If it's a production environment, use maintenance tools like revision clean up and compaction.
revision clean up
how to maintain repository
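For the revision clean up mentioned above, offline compaction on a TarMK instance is commonly done with the oak-run tool; a sketch, assuming the instance is stopped first (the jar version and repository path are examples):
# run offline revision cleanup / compaction against the segment store (example paths)
java -jar oak-run-1.6.1.jar compact /opt/aem/crx-quickstart/repository/segmentstore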
As far as I know, there is no such mechanism to speed this up.
It is better to build a particular module; it will deploy faster than waiting for all 10 modules to build.
Thanks
You can try using the suggested answers to build modules in parallel. It should speed up the build in theory.
But really there is no magic answer. You have to find the bottleneck in your build, it could be the number of dependencies, it could be a specific slow plugin, it could be hardware related, and it could be something else.
Try this: https://github.com/lptr/maven-time-tracker
It may help you find the bottleneck.
I would like to answer this question, knowing it was posted a really long time back.
Currently I am using AEM 6.3, and to recompile and deploy core-module changes there is a simple Maven trick.
The command tells Maven to:
run 1 thread per core
compile just the core module from the list of projects
and, of course, zip the package and send it to the running AEM instance.
In my observation, this reduced the turnaround time by a huge margin.
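The exact command is not reproduced above, but an invocation matching that description would look something like the following (the profile name is an assumption on my part; the AEM Maven archetype typically generates autoInstallBundle/autoInstallPackage profiles for deploying to a running instance):
# one thread per core, build only the core module, deploy it to the running AEM instance
mvn clean install -T 1C -pl core -PautoInstallBundle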

What backend does Jenkins (Hudson) use for archiving build artifacts?

I've read about the disadvantages (especially this one) of using SVN to store build artifacts (large binary files). Hudson was suggested as an alternative.
How does Hudson handle these files?
Edit: My project is not Java-based.
Hudson can create/keep an archive of build artifacts, and provides a nice browser view for inspecting them.
You need to enable Archive the Artifacts in the job definition.
Hudson basically uses flat-file storage. You can find those files within Hudson in the jobs/builds/ folders. I'm not sure I'd say, "Use Hudson as an alternative to checking files into source control", but using something as an alternative is a decent idea if it provides:
an authoritative place to store versioned binaries
access control
checksums for tamper resistance
release meta-data (environment information; approval level)
retention periods
I'm not sure how well Hudson scores on those marks, but I think it does at least some of that. SVN is non-terrible as a solution there as well, but really struggles with retention periods (old builds tend to eat disk space like crazy) and isn't terribly well optimized for large binaries - most SCM systems are optimized for smallish text files.
I stole the list above from this presentation: http://www.anthillpro.com/html/resources/webinars/Role_of_Binary_repositories_in_Software_Configuration_Management.html (registration required)
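To make the flat-file storage concrete: once Archive the Artifacts is enabled, the archived files end up under the job's build directory on the master, typically something like this (job name and build number are placeholders; older Hudson installs use HUDSON_HOME instead of JENKINS_HOME):
# archived artifacts for build #42 of a job, under the default home-directory layout
ls $JENKINS_HOME/jobs/my-job/builds/42/archive/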
We use Jenkins for our builds, but we also store the artifacts from the builds. Like Eric said above, Hudson/Jenkins store artifacts using flat-file storage, organized per build.
Some things I have noticed from use (in response to Eric's criteria above about an alternative to source control for binaries):
Each build stores its own artifacts, so you do have versioning of sorts.
You can use the fingerprinting option when archiving. This will allow you to differentiate between versions and also check for corruption.
Retention periods are completely up to you. We keep artifacts forever.
FYI, our projects are not Java either (they are C/C++) and our artifacts are tar.gz/zip files and documents.
It may or may not be the best way to store binaries, but it is definitely decent as long as you have regular backups (weekly in our case) and your disk is fault tolerant.
