Store Step Function and Lambda in the same CodeCommit repo - aws-lambda

I am creating a Step Function that orchestrates a Lambda. The Lambda has 4 simple endpoints.
I would like to keep the Lambda and the Step Function in the same git repo. Regarding this:
Is this good practice, or should they be in separate repos on CodeCommit?
If it is good practice, what is the best way to manage the deploy pipeline for two related projects in the same repo?

It is best practice to keep the application code and the necessary infrastructure code in the same repo.
Creating infrastructure as code:
There are many options; a couple of widely used ones are:
CloudFormation
AWS CDK
In general, writing CDK code is easier than raw CloudFormation, and writing Step Functions in CDK is far easier.
So, my recommendation is to write CDK code to create the Lambda functions and Step Functions.
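As a rough illustration, a minimal CDK v2 stack (TypeScript) that keeps the Lambda and the Step Function together in one repo could look like the following sketch; all names and paths are made up:
import { Stack, StackProps, Duration } from "aws-cdk-lib";
import { Construct } from "constructs";
import * as lambda from "aws-cdk-lib/aws-lambda";
import * as sfn from "aws-cdk-lib/aws-stepfunctions";
import * as tasks from "aws-cdk-lib/aws-stepfunctions-tasks";

export class OrchestrationStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    // Lambda handler code lives in the same repo, e.g. under ./lambda
    const worker = new lambda.Function(this, "WorkerFunction", {
      runtime: lambda.Runtime.NODEJS_18_X,
      handler: "index.handler",
      code: lambda.Code.fromAsset("lambda"),
    });

    // A single-task state machine that invokes the Lambda
    const invokeWorker = new tasks.LambdaInvoke(this, "InvokeWorker", {
      lambdaFunction: worker,
      outputPath: "$.Payload",
    });

    new sfn.StateMachine(this, "Orchestrator", {
      definition: invokeWorker,
      timeout: Duration.minutes(5),
    });
  }
}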
Build & Deploy:
We can use AWS CodeBuild to build artifacts and AWS CodePipeline to orchestrate.

After more research, another option I found was the Serverless Framework. I think its definitions are simpler and it's also easier to test offline. Credit to the accepted answer, as it's also a useful resource.

Related

CDK: Deploy microservice endpoints (lambdas) individually

I am writing IaC with CDK for a microservice that will use API Gateway's REST API.
I have one stack with all the Lambdas and the REST API in one place, and I can deploy everything just fine.
Now the problem is that when two people are working on different endpoints, I would want them to be able to deploy just the endpoint (Lambda) they are working on. Currently, when they deploy, CDK deploys all the endpoints from their repo, overwriting changes someone else might have deployed from their branch.
I would happily share some code but I am not really sure what to share. I think I could use some help with how to structure the code into stacks to achieve what I need.
You have one API Gateway shared across two different endpoints from two different repos.
There are a couple of ways I can think of:
Option 1: we need 4 stacks.
Gateway stack: API Gateway and root endpoints.
Endpoint1 stack: Lambda and necessary routes.
Endpoint2 stack: Lambda and necessary routes.
Gateway Deploy stack: Deploy the stage.
Each time a lambda function is changed, deploy its own stack and the deploy stack.
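As an example of the endpoint stacks, each one can import the shared REST API by the IDs exported from the gateway stack. This is only a sketch and all names are illustrative:
import { Stack, StackProps } from "aws-cdk-lib";
import { Construct } from "constructs";
import * as apigateway from "aws-cdk-lib/aws-apigateway";
import * as lambda from "aws-cdk-lib/aws-lambda";

interface EndpointStackProps extends StackProps {
  restApiId: string;      // exported by the gateway stack
  rootResourceId: string; // exported by the gateway stack
}

export class Endpoint1Stack extends Stack {
  constructor(scope: Construct, id: string, props: EndpointStackProps) {
    super(scope, id, props);

    // Import the shared REST API created by the gateway stack
    const api = apigateway.RestApi.fromRestApiAttributes(this, "SharedApi", {
      restApiId: props.restApiId,
      rootResourceId: props.rootResourceId,
    });

    const handler = new lambda.Function(this, "Endpoint1Fn", {
      runtime: lambda.Runtime.NODEJS_18_X,
      handler: "index.handler",
      code: lambda.Code.fromAsset("lambda/endpoint1"),
    });

    // Only the routes owned by this endpoint live in this stack
    api.root
      .addResource("endpoint1")
      .addMethod("GET", new apigateway.LambdaIntegration(handler));
  }
}
Because changes to an imported API are not deployed to a stage automatically, the separate gateway deploy stack in the list above is what pushes a fresh deployment after an endpoint stack changes.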
Option 2: we just need 1 stack but deploy the Lambdas separately.
A single CDK project deploys everything. The only thing to keep in mind is that the artifacts for the Lambda functions should be taken from an S3 bucket location.
Within each Lambda's individual pipeline, copy the artifact to the same S3 location referenced by the Lambda in CDK, then trigger an update to the Lambda with an AWS CLI update-function-configuration call, for example a simple update of the description with a timestamp or an environment variable, just to trigger a new deployment.
This way we can seamlessly deploy either the full CDK pipeline or an individual Lambda pipeline.
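For option 2, the key detail is that the CDK definition points at a fixed S3 location rather than a local asset, roughly like the following sketch (the bucket and key names are made up):
import { Stack, StackProps } from "aws-cdk-lib";
import { Construct } from "constructs";
import * as lambda from "aws-cdk-lib/aws-lambda";
import * as s3 from "aws-cdk-lib/aws-s3";

export class EndpointsStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    // The per-Lambda pipelines copy their build output to this fixed location;
    // CDK only references it and never rebuilds the function code itself.
    const artifacts = s3.Bucket.fromBucketName(this, "Artifacts", "my-lambda-artifacts");

    new lambda.Function(this, "Endpoint1Fn", {
      runtime: lambda.Runtime.NODEJS_18_X,
      handler: "index.handler",
      code: lambda.Code.fromBucket(artifacts, "endpoint1/function.zip"),
    });
  }
}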
You have 2 options to solve this problem without much work.
The first is to use code to identify who is deploying the stack. If developer 1 is deploying the stack, then set some environment variable or parameter on the stack; based on that value, the CDK code should compile only one of the endpoint repos (see the sketch below).
The second option is to not build the repos as part of the (CDK) deployment. Use continuous delivery (or anything else) to build the repo code separately, and have CDK only deploy the results.
Based on the context of your project, either strategy should work fine for you. Or share more context if it's not covered so far.
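For the first option, a minimal sketch of the CDK app entry point using a context value to decide which endpoint stack to synthesize; the context key and the stack classes are hypothetical:
import { App } from "aws-cdk-lib";
// Endpoint1Stack / Endpoint2Stack are assumed to be defined in the endpoint repos
import { Endpoint1Stack } from "./endpoint1-stack";
import { Endpoint2Stack } from "./endpoint2-stack";

const app = new App();

// cdk deploy -c endpoint=endpoint1  -> synthesizes only Endpoint1Stack
// cdk deploy                        -> synthesizes everything
const target = app.node.tryGetContext("endpoint");

if (!target || target === "endpoint1") {
  new Endpoint1Stack(app, "Endpoint1Stack");
}
if (!target || target === "endpoint2") {
  new Endpoint2Stack(app, "Endpoint2Stack");
}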
Thanks for your input guys. I have taken the following approach which works for me:
const testLambda = new TestLambda(app, "dev-test-lambda", {
  ...backendProps.dev,
  dynamoDbTable: docStoreDev.store,
});

const restApiDev = new RestApiStack(app, "dev-RestApi", {
  ...backendProps.dev,
  hostedZone: hostedZones.test,
  testFunction: testLambda.endpointFunction,
});
Now if a developer just wants to deploy their Lambda, they deploy only that Lambda's stack, which won't deploy anything else. Since the RestApiStack requires the Lambda as a dependency, deploying that stack will also deploy all the other Lambdas at the same time.
Another idea is for the developer to deploy the pipeline with their code branch name, so they can have a fully fledged environment without worrying about overriding another developer's Lambdas.
Once they're done with the feature, they just merge their code into the main branch and destroy their own pipeline.
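A sketch of that branch-based naming, reusing the stacks from the snippet above and assuming the branch name is exposed to the CDK app via an environment variable such as BRANCH:
// Namespace each developer's stacks by branch so branches don't overwrite each other
const branch = process.env.BRANCH ?? "dev";

const testLambda = new TestLambda(app, `${branch}-test-lambda`, {
  ...backendProps.dev,
  dynamoDbTable: docStoreDev.store,
});

new RestApiStack(app, `${branch}-RestApi`, {
  ...backendProps.dev,
  hostedZone: hostedZones.test,
  testFunction: testLambda.endpointFunction,
});
Once the feature branch is merged, running cdk destroy on those branch-named stacks cleans the temporary environment up again.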
It's a common approach :-)

Rundeck to run AWS Lambda function

Does anyone have any information on how to run AWS Lambda scripts from Rundeck? I was looking into doing this to have a central place where certain users can log into Rundeck and run the scripts that are relevant to them, as not everyone has AWS access.
I found this: https://www.slideshare.net/tetutaro/lambda-and-rundeck-58884982
But I was hoping there might be something more official somewhere and in English :)
A good way to integrate with Lambda is to use the AWS CLI on the Rundeck server and call functions using a script step or command step in your workflow. Take a look at this.
Also, and similar to this answer, another good way to interact with Lambda is to access it via its API (you have two options: using the HTTP Workflow Step plugin or via a script step in your workflow).
Finally, maybe it's a good opportunity to develop a custom plugin focused on AWS Lambda.
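As a concrete example for the first suggestion, a script step could simply shell out to aws lambda invoke; the equivalent call with the AWS SDK for JavaScript looks roughly like this sketch (the function name is hypothetical, and credentials are assumed to be configured on the Rundeck node):
import { LambdaClient, InvokeCommand } from "@aws-sdk/client-lambda";

// Synchronously invoke the function and print its response payload
const client = new LambdaClient({ region: "us-east-1" });

const result = await client.send(
  new InvokeCommand({
    FunctionName: "my-maintenance-task", // hypothetical function name
    Payload: Buffer.from(JSON.stringify({ dryRun: true })),
  })
);

console.log(new TextDecoder().decode(result.Payload));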

AWS Codebuild: Monorepo and multiple builds?

I have an AWS CodePipeline which uses CodeBuild as the build step and deploys Lambda functions. This pipeline is triggered on any commit to the development branch, which houses multiple Lambda functions. Right now, since all these Lambdas use the same pipeline, they have the same build job as well.
The problem is: what happens if one of my Lambdas has a different requirement in the build step (say, installing a library)? Is there any way to trigger a different build job for a specific Lambda? I am guessing this delves into the age-old issue of CodePipeline being unable to deal with monorepos, but any suggestions are welcome.
You could integrate change detection for your Lambda functions. The only thing you need is to check out the source separately in the job so that you have the .git folder (see: https://forums.aws.amazon.com/thread.jspa?threadID=251732).
Afterwards you can easily check with git which Lambda function actually changed and run your pre-build commands based on the result.
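A rough sketch of that check, assuming the functions live under a lambdas/<name>/ folder layout and the checkout has enough history for a diff against the previous commit:
import { execSync } from "child_process";

// Files touched by the commit that triggered the build
const changedFiles = execSync("git diff --name-only HEAD~1 HEAD")
  .toString()
  .trim()
  .split("\n");

// Map changed files to the Lambda folders they belong to
const changedLambdas = new Set(
  changedFiles
    .filter((file) => file.startsWith("lambdas/"))
    .map((file) => file.split("/")[1])
);

// Run the per-function pre-build commands only where something changed
for (const name of changedLambdas) {
  execSync("npm ci && npm run build", { cwd: `lambdas/${name}`, stdio: "inherit" });
}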

Deploying Single Lambda Function From CI/CD pipeline

I am working on an infrastructure and trying to figure out how to deploy just a single Lambda from a CI/CD pipeline.
Let's say you have 20 Lambdas in a repo and you made a change to one single Lambda; instead of deploying all of them, I just want to deploy the changed one to cut down the deployment time.
I've got an idea of checking the diff from git, figuring out which ones changed, and deploying only that part of the functionality, but it surely doesn't seem like the right way to do it. I believe there is a more proper way.
I am using Terraform for now (moving to the Serverless Framework), and I know that Terraform and the Serverless Framework hold state in an S3 bucket. However, in my case, when I run it through pipelines, even though there is a Terraform state and there is no change to the state, it still deploys the whole thing as far as I've realised (I might be wrong). I just want to clear my mind and see how people do this with their pipelines.
Since you seem to be asking about both Terraform and Serverless Framework here, I'm assuming you're looking for a general answer rather than specifically how this would be solved with a particular tool.
One way to solve this problem is to decouple your build process from your deploy process by adding a version selection mechanism in between. This just means that somewhere in your system you have a value that can be written by your build process and read by your deploy process which indicates what is the "current" artifact for each of your Lambda functions.
When your build process completes successfully, it can write the information about the artifact it built into the appropriate location, and then trigger your deployment process. Your deployment process will then read the artifact information and use it to decide what to deploy.
If you have made no changes to the current artifact metadata for a particular function then the deploy process can see that and not do anything. If a particular artifact is flawed in some way and you only notice once it's deployed, you can potentially set the artifact metadata back to the previous one and re-run the deployment process to roll back. If you choose a data store that retains historical versions, you'll also have a log of changes to the current artifact which might be useful to understand circumstances that lead to an incident.
Without getting into specifics it's hard to say more about this. For Terraform in particular, the artifact metadata store ought to be something that Terraform can read using a data source. To show a real example I'm going to just arbitrarily choose AWS SSM Parameter Store as a location for that artifact metadata store:
data "aws_ssm_parameter" "foo" {
name = "FooFunctionArtifact"
}
locals {
# For this example, we'll assume that the stored parameter is a JSON
# string shaped like this:
# {
# "s3_bucket": "awesomecorp-app-artifacts"
# "s3_key": "/awesomeapp/v1.2.0/function.zip"
# }
foo_artifact = jsondecode(data.aws_ssm_parameter.foo)
}
resource "aws_lambda_function" "foo" {
function_name = "foo"
s3_bucket = local.foo_artifact.s3_bucket
s3_key = local.foo_artifact.s3_key
# etc, etc
}
The technical details of this will vary a lot depending on your technology choices. If you don't use Terraform then you'll either use a feature similar to data sources in your other tool or you'd write some wrapper glue code that can itself retrieve the necessary information and pass it into the tool as an argument.
The main thing, regardless of technology choices, is that there is an explicit record somewhere of what is the latest artifact for each function, which is updated by your build step and read by your deploy step. This pattern can apply to other artifact types too, such as AMIs for EC2, docker images, etc.
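On the build side, the step that records a newly built artifact can be very small. Here is a sketch using the AWS SDK for JavaScript, with the parameter name matching the Terraform example above and the bucket/key made up:
import { SSMClient, PutParameterCommand } from "@aws-sdk/client-ssm";

// Record which artifact the build just uploaded so the deploy step can read it
const ssm = new SSMClient({ region: "us-east-1" });

await ssm.send(
  new PutParameterCommand({
    Name: "FooFunctionArtifact",
    Type: "String",
    Overwrite: true,
    Value: JSON.stringify({
      s3_bucket: "awesomecorp-app-artifacts",    // bucket the build uploaded to
      s3_key: "/awesomeapp/v1.2.1/function.zip", // key of the new build
    }),
  })
);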
It seems you have added the tags terraform, serverless-framework (I'll call it sls), and aws-lambda, so all of them work for you.
Terraform - Terraform itself will take care of the differences and of which Lambda needs to be updated. But it is not Lambda-friendly if you need to install related packages.
Serverless Framework (sls) - it is good for managing Lambda functions, but as a side effect it has to be managed together with API Gateway. I am not sure if the sls team has fixed this issue or not; it needs confirmation.
sls will take care of installing related packages.
The bad part is that sls can't diff the resources to be deployed against what is planned.
CloudFormation - that's AWS's own Infrastructure as Code (IaC) tool for managing AWS resources; you should be fine using it to manage the Lambda resource. You will get the same issue as with Terraform: you have to install the related packages before deploying the stack.
The bad part is that cfn (CloudFormation) doesn't have a diff feature either; furthermore, it doesn't have proper tooling around its AWS CLI commands, so you have to use something else, such as shell scripting, Ansible, or even Terraform, to manage CloudFormation template updates.
AWS CDK - the newest way is aws-cdk. It does have a diff feature (cdk diff) which is well suited to your current job, but it is a very new project and a lot of features are still waiting to be developed.
You can weigh these against your team's skill set. Always choose the tools that you and your team are most confident with.

What’s the best way to deploy multiple lambda functions from a single github repo onto AWS?

I have a single repository on GitHub that hosts my Lambda functions. I would like to be able to deploy the new versions whenever new logic is pushed to master.
I did a lot of research and found a few different approaches, but nothing really clear. I would like to know what others feel would be the best way to go about this, and maybe some detail (if possible) on how that pipeline is set up.
Thanks
You can set up a CI/CD pipeline using CircleCI with its GitHub integration (it is an online service, so you don't need to maintain anything yourself, like a Jenkins server, for example).
Upon every commit to your repository, a CircleCI build will be triggered. Once the build process is over, you can run sls deploy or sam deploy, use Terraform, or even create a script to upload the .zip file from your GitHub repo to an S3 bucket and then, within your script, invoke the create-function command. There's an example of how to deploy serverless applications using CircleCI along with the Serverless Framework here.
Other options include TravisCI, AWS CodeDeploy, or even maintaining your own CI/CD server. The same logic applies to all of these tools, though: commit -> build -> deploy (using whichever tool you've chosen).
EDIT: After #Matt's answer, it clicked that the OP never mentioned the Serverless Framework (I, somehow, thought he was already using it, so I pointed the OP to tutorials using the Serverless Framework). I then decided to update my answer with a few other options for serverless deployment.
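To illustrate the script route, creating the function from an uploaded .zip with the AWS SDK for JavaScript could look like this sketch (every name and ARN below is made up; for subsequent deployments you would update the existing function's code instead of creating it again):
import { LambdaClient, CreateFunctionCommand } from "@aws-sdk/client-lambda";

// Assumes the build has already uploaded function.zip to the artifact bucket
const lambda = new LambdaClient({ region: "us-east-1" });

await lambda.send(
  new CreateFunctionCommand({
    FunctionName: "hello-world",                             // hypothetical name
    Runtime: "nodejs18.x",
    Handler: "index.handler",
    Role: "arn:aws:iam::123456789012:role/lambda-exec-role", // hypothetical execution role
    Code: {
      S3Bucket: "my-artifact-bucket",
      S3Key: "hello-world/function.zip",
    },
  })
);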
I know that this isn't exactly what you asked for, but I use the Serverless Framework (https://serverless.com) for deployment and I love it. I don't do my deployments when I push to my repo. Instead I push to my repo after I've deployed. I like this flow because a deployment can fail due to so many things, and pushing to GitHub is much less likely to fail. This way, I prevent pushing code that failed to deploy to my master branch.
I don't know if you're familiar with the framework, but it is super simple. The website describes the simple steps to create and deploy a function like this:
# Step 1. Install serverless globally
$ npm install serverless -g

# Step 2. Create a serverless function
$ serverless create --template hello-world

# Step 3. Deploy to cloud provider
$ serverless deploy

# Your function is deployed!
$ http://xyz.amazonaws.com/hello-world
There are also a number of plugins you can use to integrate easily with custom domains on API Gateway, prune older versions of Lambda functions that might be filling up your limits, etc.
Overall, I've found it to be the easiest way to manage and deploy my lambdas. Hope it helps!
Given that you're using AWS Lambda, you may want to consider CodePipeline to automate your release process. SAM (https://docs.aws.amazon.com/lambda/latest/dg/serverless_app.html) may also be interesting.
I too had the same problem. I wanted to manage 12 Lambdas with 1 git repository. I solved it by introducing Travis CI. Travis CI saved time and is really useful in many ways. We can check the logs whenever we want, and you can share the logs with anyone by sharing the URL. The sample documentation of all the steps can be found here. You can go through it. 👍
