I'm using docker-compose to orchestrate containers for multiple separate projects. Each project has its own set of containers and does not relate to the other projects.
For example:
/my-projects/project-1/docker-compose.yml
/my-projects/project-2/docker-compose.yml
/my-projects/project-3/docker-compose.yml
These projects are similar, however, in that they are all PHP projects that use webpack for front-end assets, and thus share the same package managers: Composer and Yarn.
I was wondering, in the interest of performance, if it would be possible to mount a shared volume outside the directory root of all the projects for package manager caches?
For example:
/my-projects/caches/composer
/my-projects/caches/npm
/my-projects/project-1/docker-compose.yml
/my-projects/project-2/docker-compose.yml
/my-projects/project-3/docker-compose.yml
Here, /my-projects/caches/composer and /my-projects/caches/npm would be mounted inside the relevant containers of each project. In case it's not clear: only one project would be spun up at a time.
At the moment, if two projects share the same dependency, each downloads and caches it individually. A more performant approach (in terms of build times) would be to mount a common volume and point the package managers' caches there, so that when "Project A" downloads an update to a dependency, "Project B" can load it from the cache.
You can simply mount the same directories as bind mounts on each of the containers that require them. Absolute paths are fine; one of the examples in the docs even uses an absolute path as a bind mount.
However, volumes are not available during image build (docker-compose build), which is where commands like composer install, npm install or yarn install should be run in any case.
If you are running these commands at container runtime, though, nothing stops you from mounting these cache directories into each container.
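As a rough sketch (the service name and image are hypothetical, the cache paths are just a choice), a compose file for one project could bind-mount the shared cache directories and point the package managers at them through their standard environment variables (COMPOSER_CACHE_DIR and YARN_CACHE_FOLDER):

```yaml
# docker-compose.yml for project-1 (hypothetical service/image names)
version: "3"
services:
  app:
    image: my-php-app                            # assumed application image
    volumes:
      - ../caches/composer:/tmp/cache/composer   # shared Composer cache
      - ../caches/npm:/tmp/cache/npm             # shared Yarn cache
    environment:
      COMPOSER_CACHE_DIR: /tmp/cache/composer    # tell Composer where its cache lives
      YARN_CACHE_FOLDER: /tmp/cache/npm          # tell Yarn where its cache lives
```

The same volumes/environment stanza would be repeated in each project's compose file, all pointing at the one shared cache directory on the host.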
I'm working on a large project using yarn workspaces. I know that yarn workspaces essentially does two things:
It automates the symlinking we used to do manually when we wanted to share private packages.
It hoists shared packages to the top-level node_modules to avoid duplication.
However, I have noticed that my packages still contain code in their own node_modules and I'm not sure why. When I make a sample monorepo app and install, say, lodash in one of the packages, it goes straight to the root node_modules.
Why and when does yarn decide to install a package inside a package's own node_modules?
I found the answer on Yarn's Discord: yarn will always hoist a package unless doing so would conflict with another version of that package.
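For example (a hypothetical layout): if package-a and package-b both depend on lodash@^4 but package-c pins lodash@^3, the version 4 copy gets hoisted and the conflicting version 3 copy stays nested:

```text
node_modules/
  lodash/              # v4.x, hoisted, shared by package-a and package-b
packages/
  package-a/
  package-b/
  package-c/
    node_modules/
      lodash/          # v3.x, nested because it conflicts with the hoisted copy
```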
I have a golang web app, and I need to deploy it. I am trying to figure out the best practice for running a golang app in production. The way I am doing it now is very simple:
just upload the built binary to production, without actually having the source code on prod at all.
However, I found an issue: my app reads a config/<local/prod>.yml config file from the source tree. If I just upload the binary without the source code, the app can't run because it is missing its config. So I wonder what the best practice is here.
I thought about a couple of solutions:
Upload the source code and the binary, or build from source on the server.
Only upload the binary and the config file.
Move the yml config into environment variables; but I think with this solution the code will be less structured, because if you have lots of configuration, env variables become hard to manage.
Thanks in advance.
Good practice for deployment is to have a reproducible build process that runs in a clean room (e.g. a Docker image) and produces the artifacts to deploy (binaries, configs, assets); ideally it also runs some tests that prove nothing was broken since the last time.
It is a good idea to package the service - the binary and all the files it needs (configs, auxiliary files such as systemd, nginx or logrotate configs, etc.) - into some sort of package, be it a package native to your target environment's Linux distribution (DPKG, RPM), a virtual machine image, a Docker image, etc. That way you (or whoever is tasked with deployment) won't forget any files. Once you have a package, you can easily verify it and deploy it to the production environment using the native tools for that packaging format (apt, yum, docker...).
For configuration and other files, I recommend making the software read them from well-known locations, or at least providing an option to pass the paths as command-line arguments. If you deploy to Linux, I recommend following the FHS (tl;dr: configuration in /etc/yourapp/, binaries in /usr/bin/).
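A minimal sketch of that idea in Go (the flag name and default path are assumptions, not anything from the question):

```go
package main

import (
	"flag"
	"log"
	"os"
)

func main() {
	// Default to a well-known location, but allow overriding it on the command line.
	configPath := flag.String("config", "/etc/yourapp/config.yml", "path to the configuration file")
	flag.Parse()

	data, err := os.ReadFile(*configPath)
	if err != nil {
		log.Fatalf("reading config %s: %v", *configPath, err)
	}
	// ... parse data (e.g. with a YAML library) and start the service ...
	_ = data
}
```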
It is not recommended to build the software from source in the production environment, as the build requires tools that are otherwise unnecessary there (e.g. go, git, dependencies, etc.). Installing and running these means more maintenance and potential security and performance risks. Generally you want to keep your production environment as minimal as required to run the application.
I think the most common deployment strategy for an app is to comply with the twelve-factor app methodology.
So, in this case, if your YAML file is the configuration file, it would be better to put the configuration into environment variables (ENV vars). That way, when you deploy your app in a container, it is easier to configure the running instance from the ENV vars than to copy a config file into the container.
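A minimal sketch of reading configuration from ENV vars in Go (the variable names here are made up for illustration):

```go
package main

import (
	"log"
	"os"
)

// getenv returns the value of the environment variable key, or fallback if it is unset.
func getenv(key, fallback string) string {
	if v, ok := os.LookupEnv(key); ok {
		return v
	}
	return fallback
}

func main() {
	listenAddr := getenv("APP_LISTEN_ADDR", ":8080") // hypothetical variable names
	dbDSN := getenv("APP_DB_DSN", "")
	if dbDSN == "" {
		log.Fatal("APP_DB_DSN must be set")
	}
	log.Printf("starting on %s", listenAddr)
	// ... start the HTTP server here ...
}
```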
However, when writing system software, it is better to comply with the filesystem hierarchy defined by the OS you are using. If you are using a Unix-like system, you can read about the hierarchy by typing man hier in the terminal. Usually I install the compiled binary into /usr/local/bin and put the configuration inside /usr/local/etc.
For deployment to production, I create a simple Makefile that does the building and installation. If the deployment environment is a bare-metal server or a VM, I commonly use Ansible to do the deployment with ansible-playbook. The playbook fetches the source code from the repository, then builds, compiles, and installs the software by running the make command.
If the app will be deployed in containers, I suggest creating an image and using multi-stage builds, so that the source code and the tools needed to build the binary do not end up in the production environment and the image stays smaller. But, as I mentioned before, it is common practice to read the app configuration from ENV vars instead of a config file. If the app has a lot of things to configure, the file could be copied into the image at build time.
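A rough sketch of such a multi-stage build for a Go app (the image tags and paths are assumptions):

```dockerfile
# Build stage: contains the Go toolchain and the source code.
FROM golang:1.21 AS build
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /out/app .

# Final stage: only the static binary, no toolchain or sources.
FROM alpine:3.19
COPY --from=build /out/app /usr/local/bin/app
ENTRYPOINT ["/usr/local/bin/app"]
```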
While we wait for the proposal cmd/go: support embedding static assets (files) in binaries to be implemented (see the current proposal), you can use one of the static-asset embedding tools listed in that proposal.
The idea is to include your static file in your executable.
That way, you can distribute your program without being dependent on your sources.
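(That proposal has since landed as the go:embed directive in Go 1.16. A minimal sketch, assuming a config/prod.yml file sits next to the source at build time:)

```go
package main

import (
	_ "embed"
	"fmt"
)

// The file is compiled into the binary at build time,
// so nothing needs to be shipped alongside the executable.
//go:embed config/prod.yml
var prodConfig []byte

func main() {
	fmt.Printf("embedded %d bytes of config\n", len(prodConfig))
	// ... parse prodConfig and start the app ...
}
```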
I have a Laravel application with a Dockerfile and a docker-compose.yml which I use for local development. It currently does some volume sharing so that code updates are reflected immediately. The docker-compose also spins up containers for MySQL, Redis, etc.
However, in preparation for deploying my container to production (ECS), I wonder how best to structure my Dockerfile.
Essentially, on production, there are several other steps I would need to do that would not be done locally in the Dockerfile:
install dependencies
modify permissions
download a production env file
My first solution was to have a build script which essentially takes the codebase, copies it to an empty sub-folder, runs the above three commands in that folder, and then runs docker build. This way, the Dockerfile doesn't need to change between dev and production and I can include the extra steps before the build process.
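In other words, roughly a hypothetical build script like this (all paths, permissions and names here are made up for illustration):

```sh
#!/bin/sh
set -e

# Copy the codebase into a clean build folder (hypothetical layout).
rm -rf build && mkdir build
cp -R src/. build/

cd build
composer install --no-dev                   # install dependencies
chmod -R 775 storage bootstrap/cache        # modify permissions (example only)
cp /secure/location/.env.production .env    # fetch the production env file (placeholder)
cd ..

# Build the image with the prepared folder as the context.
docker build -t my-app:latest build
```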
However, the drawback is that the first three commands don't get included in the Docker image layering. So even if my dependencies haven't changed in the last 100 builds, they will still all be downloaded from scratch each time, which is fairly time-consuming.
Another option would be to have multiple Dockerfiles, but that doesn't seem very DRY.
Is there a preferred or standardized approach for handling this sort of situation?
I am currently using the Atlassian Bamboo build server (cloud-based, using AWS) and have an initial task that simply does a composer install.
This single task can take quite a bit of time, which is a pain when developers have committed multiple times, leaving the build server with four builds that each download all the dependencies (the builds do not run in parallel).
I wish to speed this process up but cannot figure out a way to save the dependencies to a common location for use across multiple builds, while still allowing the application (Laravel) to run as intended.
Answer
Remove composer.lock from your .gitignore
Explanation
When you run composer install for the first time, composer has to check all of your dependencies (and their dependencies, etc.) for compatibility. Running through the whole dependency tree is quite expensive, which is why it takes so long.
After figuring out all of your dependencies, composer then writes the exact versions it uses into the composer.lock file so that subsequent composer install commands will not have to spend that much time running through the whole graph.
If you commit your composer.lock file, it will come along to your Bamboo server, and the composer install command will be way faster.
Committing composer.lock is a best practice regardless. To quote the docs:
Commit your application's composer.lock (along with composer.json) into version control.
This is important because the install command checks if a lock file is present, and if it is, it downloads the versions specified there (regardless of what composer.json says).
This means that anyone who sets up the project will download the exact same version of the dependencies. Your CI server, production machines, other developers in your team, everything and everyone runs on the same dependencies, which mitigates the potential for bugs affecting only some parts of the deployments.
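So on the build server the workflow is roughly (a sketch):

```sh
# With composer.lock committed, this resolves nothing new; it just fetches
# the exact versions recorded in the lock file.
composer install --no-interaction --prefer-dist

# Only run this locally, and only when you deliberately want to upgrade
# dependencies; it re-resolves the whole graph and rewrites composer.lock.
composer update
```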
What is the best way to use Docker in a Continuous Delivery pipeline?
Should the build artefact be a Docker Image rather than a Jar/War? If so, how would that work - I'm struggling to work out how to seamlessly use Docker in development (on laptop) and then have the CI server use the same base image to build the artefact with.
Well, there are of course multiple best practices and many approaches to this. One approach that I have found successful is the following:
Separate the deployable code (jars/wars etc.) from the Docker containers in separate VCS repos (we used two different Git repos in my latest project). This means that the Docker images you deploy your code on are built in a separate step - a "docker build", so to say. Here you can e.g. build the Docker images for your database, application server, redis cache or similar. When a Dockerfile or similar has changed in your VCS, Jenkins (or whatever you use) can trigger the build of the Docker images. These images should be tagged and pushed to some registry (be it Docker Hub or a local registry).
The deployable code (jars/wars etc) should be built as usual using Jenkins and commit-hooks. In one of my projects we actually ran Jenkins in a Docker container as described here.
All Docker containers that use dynamic data (such as the storage for a database, the war files for a Tomcat/Jetty, or configuration files that are part of the code base) should mount these files as data volumes or data volume containers.
The test servers, or whatever steps are part of your pipeline, should be set up according to a spec that is known to your build server. We used a descriptor that connected the newly built tag from the code base to the tag on the Docker containers. The Jenkins pipeline plugin could then run a script that first moved the build artifacts to the host server, pulled the correct Docker images from the local registry, and then started all processes using data volume containers (we used Fig for managing the Docker lifecycle).
With this approach, we were also able to run our local processes (databases etc) as Docker containers. These containers were of course based on the same images as the ones in production and could also be developed on the dev machines. The only real difference between local dev environment and the production environment was the operating system. The dev machines typically ran Mac OS X/Boot2Docker and prod ran on Linux.
Yes, you should shift from jar/war files to images as your build artefacts. The process #wassgren describes should work well; however, I found the following to fit our use case better, especially for development:
1- Make a base image. It looks like you're a Java shop, so as an example, let's pretend your base image is FROM ubuntu:14.04 and installs the JDK and some of the more common libs. Let's call it myjava.
2- During development, use fig to bring up your container(s) locally and mount your dev code to the right spot. Fig uses the myjava base image and doesn't care about the Dockerfile.
3- When you build the project for deployment it uses the Dockerfile, and somewhere it does a COPY of the code/build artefacts to the right place. That image then gets tagged appropriately and deployed.
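As a sketch (the image and artifact names are only examples), the dev-time fig.yml and the deploy-time Dockerfile might look like:

```yaml
# fig.yml - local development: mount the build output into the base image
app:
  image: myjava
  volumes:
    - ./target:/app           # dev code/artefacts mounted, no image rebuild needed
  command: java -jar /app/myapp.jar
  ports:
    - "8080:8080"
```

```dockerfile
# Dockerfile - deployment: bake the artefact into the image instead of mounting it
FROM myjava
COPY target/myapp.jar /app/myapp.jar
CMD ["java", "-jar", "/app/myapp.jar"]
```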
Simple steps to follow:
1) Install Jenkins in a container.
2) Install the framework tool in the same container (I used SBT).
3) Create a project in Jenkins with the necessary plugins to pull data from GitLab and build all jars into a compressed archive (say build.tgz).
4) This build.tgz can be moved anywhere and triggered, but it must have its environment dependencies satisfied (for example, it may require MySQL).
5) Now we create a base environment image (in our case, with MySQL installed).
6) With every build triggered, we run a Dockerfile on the server which combines build.tgz with the environment image (a rough sketch follows this list).
7) Now we have our build.tgz with its environment satisfied. This image should be pushed to the registry; it is our final image. It is portable and can be deployed anywhere.
8) This final image can be run in a container with the necessary mount points to collect outputs, and a script (script.sh) can be triggered via the ENTRYPOINT command in the Dockerfile.
9) This script.sh lives inside the (base) image and is configured to do whatever our purpose requires.
10) Before bringing this container up, you need to stop the previously running container.
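A rough sketch of the Dockerfile from step 6 (the image name and paths are assumptions; script.sh is assumed to already ship inside the base image at /opt/app/script.sh, per step 9):

```dockerfile
# Base environment image from step 5 (hypothetical name, e.g. with MySQL pre-installed).
FROM my-env-image
# Add the build artefact produced by Jenkins and unpack it.
COPY build.tgz /opt/app/
RUN tar -xzf /opt/app/build.tgz -C /opt/app && rm /opt/app/build.tgz
# script.sh comes with the base image and knows how to start the service.
ENTRYPOINT ["/opt/app/script.sh"]
```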
Important notes:
An image is created every time you build, and it is stored in the registry.
Thus it can be reused. This image can be pushed to the main production server and triggered there.
This helps to maintain a clean environment every time, because we always start from the base image.
You can also create a stable CD pipeline with Bamboo and Docker. Bamboo Docker integrations come in both a build agent form and as a bundled task within the application. You might find this article helpful: http://blogs.atlassian.com/2015/06/docker-containers-bamboo-winning-continuous-delivery/
You can use the task to build a Docker image that you can use in another environment or deploy your application to a container.
Good luck!