Gitlab.com CI for Private Projects - continuous-integration

Summary: We have a concern based on one piece of documentation regarding Gitlab.com CI for private projects
Note: This is in reference to Gitlab.com (and not a self-hosted gitlab)
Concern: We came across this link, https://docs.gitlab.com/ee/ci/runners/#be-careful-with-sensitive-information
My Interpretation: its not advisable to build private projects in Default Gitlab CI Runners
Is the interpretation valid? and to what extent a concern?
What do you think will be the best practice for this?
Question:
Is it fine to use Gitlab.com Shared Runners for CI in Private Projects?
Our Solution: IF and only IF we need an alternative (POC for this was successfully implemented)
We created a EC2 Instance (a private box)
Installed Gitlab Runner to the box
Connected EC2 Instance to Gitlab
Disabled Shared Runner from Project setting
On CI run, it successfully sends the request to our EC2 instance
https://gitlab.com/gitlab-org/gitlab/-/issues/215677

Short:
My interpretation was wrong. Gitlab.com is fully safe. Quoted Doc is NOT for this use-case.
Read Response from Gitlab.com: https://gitlab.com/gitlab-org/gitlab-runner/-/issues/25468#note_333854812
Quote of response:
The Shared Runners on GitLab.com are isolated VM's that are provisioned for each CI job and removed after job execution. This is documented here.
The documentation that you reference is actually referring to the situation where as a user you are now setting up and managing your own Runners. This is actually what you have done in the Our Solution section. So the security concern is that on your EC2 instance, if the Runner is configured to use the Shell executor for example, then any user in your organization that can execute CI jobs on the Runner on that EC2 instance is now able to execute a script which has full access to the filesystem on the EC2 instance.
So this is why on GitLab.com we always create new isolated VM's for each job.

Related

How should I deploy the result of the CI/CD pipeline on my production server

I am having this GitLab CI/CD which builds then tests and pushes my projects container to GitLab container register successfully. But now I am wondering how I can do the deployment stage automated too. currently, I am doing it manually and after each successful pipeline, I SSH to my server and run several commands to pull the latest images from the GitLab.com container registry and then run them. But I would like to make this step automated as well. Yet, I don't know how?
Actually I have seen some examples of opening an ssh session from CI/CD pipeline but it doesn't feel secure enough. So I was wondering is there a better way for this or I have to actually do this.
Not that I am using gitlab.com so the GitLab server is not installed on my machine and I can't share assets between them directly
There are many ways to achieve this, depending on your setup, other requirements, scale etc.
I'll just give you two options.
I. Kubernetes
create cluster (ie control plane) somewhere
add your cluster to GitLab (now GitLab can even create cluster for you in AWS and GCP, check this page)
attach your target machine as a worker node to the cluster
create Kubernetes YAML files \ Helm chart for your application and deploy via usual ways, e.g. kubectl apply -f ... or helm install ..., or rely on Auto DevOps to do this step for you
This is quite complex but sort of "right" way of doing things.
II. Private GitLab runner
go to Settings > CI/CD > Runners of your GitLab project or group
obtain the registration token
install your own GitLab runner right on the target machine, and register it on the GitLab server using registration token, see example
give runner some specific tag
use that tag in your .gitlab-ci.yml file, see documentation
then deployment process is just a local process of docker pull... and docker run ... for your image
This is a lot simpler, but is a "wrong" way, as you are mixing CI\CD infrastructure with target environment.

What are the best gitlab-ci.yml CI/CD practices and runners configs?

This is a bit theoretical, but I'll try to explain my setup as much as I can:
1 server (instance) with a self-hosted gitlab
1 server (instance) for development
1 server (instance) for production
Let's say in my gitlab I have a ReactJs project and I configured my gitlab-ci.yml as following:
job deploy_dev Upon pushing to dev branch, the updates will be copied with rsync to /var/www/html/${CI_PROJECT_NAME} (As a deployment to dev server)
The runner that picks up the job deploy_dev is a shared runner installed on that same dev server that I deploy to and it picks up jobs with the tag reactjs
The question is:
If I want to deploy to production what is the best practice should I follow?
I managed to come up with a couple of options that I thought of but I don't know which one is the best practice (if any). Here is what I came up with:
Modify gitlab-ci.yml adding a job deploy_prod with the same tag reactjs but the script should rsync with the production server's /var/www/html/${CI_PROJECT_NAME} using SSH?
Set up another runner on production server and let it pick up the jobs with tags reactjs-prod and modify gitlab-ci.yml to have deploy_prod with the tag reactjs-prod?
You have a better way other than the 2 mentioned above?
Last question (related):
Where is the best place to install my runners? Is what I'm doing (Having my runners on my dev server) actually ok?
Please if you can explain to me the best way (that you would choose) with the reasons as in pros or cons I would be very grateful.
The best practice is to separate your CI/CD infrastructure from the infrastructure where you host your apps.
This is done to minimize the number of variables which can lead to problems with either your applications or your runners.
Consider the following scenarios when you have a runner on the same machine where you host your application: (The below scenarios can happen even if the runner and app are running in separate Docker containers. The underlying machine is still a single point of failure.)
The runner executes a CPU/RAM heavy job and takes up most of the resources on the machine. Your application starts experiencing performance problems.
The Gitlab runner crashes and puts the host machine in an inoperable state. (docker panic or whatever).
Your production app stops functioning.
Your app brakes the host machine (Doesn't matter how. It can happen), your CI/CD stops working and you can not deploy a fix to production.
Consider having a separate runner machine (or machines. Gitlab runner can scale horizontally), that is used to run your deployment jobs to both dev and production servers.
I agree with #cecunami's answer.
As an example, in our Org we have a dedicated VM only for the runner, which is explicitly monitored by our teams.
Since first creating the machine, the CPU, RAM and storage demand has grown massively, thus why the infrastructure is to be separated.

Deploy code from gitlab on ec2 WITHOUT.gitlab-ci.yml file

I am using gitlab as repository and want to push my code on ec2 whenever any commit is done on gitlab. The gitlab CD/CI documentation states that I have to add a file .gitlab-ci.yml at the root directory of my repo. This is actually a problem for me because, I want project repo to have only code and not any configuration related info like build and deploy etc. Also when anybody clones the repo, they would have access to location where my code is pushed/deployed on ec2. Is there any work around for this problem ?
You'll need to use a gitlab-ci.yml filke to deploy your application. The file provides instructions and a pipeline "infrastructure" which, if properly configured, will build, test and automatically deploy your code.
If you are worried about leaking credentials, you should use the built-in instance variables to mask your important bits, like a "$SERVERNAME" or "$DB_PASSWORD" for instance.
Lastly, you can use the power of gitignore, in order to not publish all of your credentials or sensitive bits to your projects' servers or instances.

How to reuse puppet SSL certificates

I would like have a setup where my ec2 instances are getting terminated sometimes and new nodes comes up with the same host name .My puppetserver supposed to have the old certificates with them and instantly push the configs via the required modules.
Is this a viable solution?In this case do I need to keep the ssl certs of clients and push them to the ec2instnces via user-data ? what would be the best alternatives?
I can't really think of too many arguments against certificate re-use, only that puppet cert clean "$certname" on the CA master is so simple that I can't really think of a reason to re-use certificate names.
That said, I'd recommend building some kind of pipeline that includes certificate cleaning upon destruction of your ec2 instances. This is fairly easy to do in a terraform workflow with the AWS Terraform Provider and local-exec provisioner. For each aws_instance you create, you'd also create a local-exec resource, and specify the local-execonly executes at destruction time: when = "destroy".
If you're re-using hostnames for convenience, maybe it would be wise to instead rely on dns to point to the new hosts instead of relying on the hostnames themselves and stop worrying about puppet cert clean.

Deployment with Ansible from Gitlab CI, dealing with passwords

I'm trying to achieve an "password-free" deployment workflow using Gitlab CI and Ansible.
Some steps do require a password (I'm already using SSH Keys whenever I can) so I've stored those password inside an Ansible Vault. Next, I would just need to provide the Vault password when running the playbook.
But how could I integrate this nicely with Gitlab CI?
May I register a gitlab-ci job (or jobs are suitable for builds only?), which just runs the playbook providing the vault password somehow?! Can this be achieved without a password laying around in plain text?!
Also, I would be really happy if someone can point me some material that shows how we can deploy builds using Ansible. As you can notice, I've definitively found nothing about that.
You can set an environment variable in the GitLab CI which would hold the Ansible Vault password. In my example i called it $ANSIBLE_VAULT_PASSWORD
Here is the example for .gitlab-ci.yml:
deploy:
only:
- master
script:
- echo $ANSIBLE_VAULT_PASSWORD > .vault_password.txt
- ansible-playbook -i ansible/staging.yml --vault-password-file .vault_password.txt
Hope this trick helps you out.
I'm not super familiar with gitlab ci, or ansible vault for that matter, but one strategy that I prefer for this kind of situation is to create a single, isolated, secure, and durable place where your password or other secrets exist. A private s3 bucket, for example. Then, give your build box or build cluster explicit access to that secure place. Of course, you'll then want to make sure your build box or cluster are also locked down, such as within a vpc that isn't publicly accessible and can only be accessed via vpn or other very secure means.
The idea is to give the machines that need your password explicit knowledge of where to get it AND seperately the permission & access they need to get it. The former does not have to be a secret (so it can exist in source control) but the latter is virtually impossible to attain without compromising the entire system, at which point you're already boned anyway.
So, more specifically, the machine that runs ansible could be inside the secure cluster. It knows how to access the password. It has permission to do so. So, it can simply get the pw, store as a variable, and use it to access secure resources as it runs. You'll want to be careful not to leak the password in the process (like piping ansible logs with the pw to somewhere outside the cluster, or even anywhere perhaps). If you want to kick off the ansible script from outside the cluster, then you would vpn in to run the ansible playbook on the remote machine.

Resources