How to send Lambda logs to StackDriver instead of CloudWatch? - aws-lambda

I am considering sending my logs to Stackdriver instead of CloudWatch, but the docs seem to only describe how to do it with EC2. What about Lambda? I would prefer to send logs directly to Stackdriver rather than have Stackdriver read from CloudWatch, so that I can remove the CloudWatch costs entirely.

Stackdriver supports the metric types from AWS Lambda listed in this article.
To use these metrics in charting or alerting, your Google Cloud Platform project or AWS account must be associated with a Workspace.
After you have a Workspace, you can add more GCP projects and AWS accounts to it using the Adding monitored projects instructions.
If you plan to monitor more than just your host project, then the best practice is to use a new, empty GCP project to host the Workspace and then to add the projects and AWS accounts you want to monitor to your Workspace. This lets you choose a useful name for your host project and Workspace, and gives you a little more flexibility in moving monitored projects between Workspaces. For example, a single Workspace W might monitor GCP projects A and B and AWS account D.
Monitoring creates this AWS connector project when you add an AWS account to a Workspace. The connector project has a name beginning with AWS Link, and it has the same parent organization as the Workspace. To get the name and details about your AWS connector projects, go to the Inspecting Workspace section.
In the GCP Console, AWS connector projects appear as regular GCP projects. Don't use connector projects for any other purpose, and don't delete them while your Workspace is still connected to your AWS account.
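The setup above covers Lambda metrics in Stackdriver Monitoring. For application logs specifically, one option that avoids routing through CloudWatch is to write entries to Stackdriver (Cloud) Logging directly from the handler with the google-cloud-logging client. A minimal sketch, assuming you bundle a GCP service-account keyfile with the deployment package; the keyfile path and log name are placeholders, and note that the Lambda service itself will still emit its START/END/REPORT lines to CloudWatch:

```python
# Hedged sketch: ship application logs straight to Stackdriver (Cloud) Logging
# from a Lambda handler. The keyfile path and log name are placeholders.
from google.cloud import logging as gcp_logging

# Create the client outside the handler so it is reused across warm invocations.
client = gcp_logging.Client.from_service_account_json("gcp-service-account.json")
logger = client.logger("lambda-app-log")

def handler(event, context):
    # log_struct writes a structured (jsonPayload) entry to Cloud Logging.
    logger.log_struct({
        "message": "handled event",
        "aws_request_id": context.aws_request_id,
        "function_name": context.function_name,
    })
    return {"statusCode": 200}
```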

Related

Deploying jaeger on AWS ECS with Elasticsearch

How should I go about deploying Jaeger on AWS ECS with Elasticsearch as the backend? Is it a good idea to use the Jaeger all-in-one image, or should I use separate images?
While I didn't find any official Jaeger reference on this, I think the Jaeger all-in-one image is not intended for production use. It makes one container a single point of failure, so it is better to use separate containers for each Jaeger component (if one is down for some reason, the others can continue to operate).
I have recently written a blog post about hosting Jaeger on AWS with the AWS Elasticsearch (OpenSearch) service. While it uses the all-in-one image, it is still useful for getting a general idea of how to go about this.
Just to generally outline the process (described in detail in the post):
Create an AWS Elasticsearch cluster
Create an ECS cluster (running on EC2)
Create an ECS task definition, configured with the Jaeger all-in-one image and the Elasticsearch URL from step 1
Create an ECS service that runs the created task definition
Make sure the security groups on your EC2 instance allow access to the Jaeger ports as described here
Send spans to your Jaeger endpoint via the OpenTelemetry SDK (see the sketch after this list)
View your spans via the hosted Jaeger UI (your-ec2-url:16686)
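For the span-sending step above, a minimal sketch using the OpenTelemetry Python SDK with its Jaeger Thrift exporter; the hostname and service name are placeholders, and 14268 is the all-in-one collector's HTTP port:

```python
# Hedged sketch: send spans from the OpenTelemetry SDK to the Jaeger all-in-one
# collector endpoint. Hostname and service name are placeholders.
from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.jaeger.thrift import JaegerExporter

provider = TracerProvider(resource=Resource.create({"service.name": "demo-service"}))
provider.add_span_processor(
    BatchSpanProcessor(
        JaegerExporter(collector_endpoint="http://your-ec2-url:14268/api/traces")
    )
)
trace.set_tracer_provider(provider)

tracer = trace.get_tracer(__name__)
with tracer.start_as_current_span("demo-span"):
    pass  # the traced work goes here; the span is exported when the block exits
```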
The all-in-one image is a useful tool in development for testing your work locally.
For production deployment it is very limiting: to handle a potentially large volume of traffic you will want to scale parts of your infrastructure independently.
I would recommend deploying multiple jaeger-collectors, configured to write to the ES cluster. Then you can run a jaeger-agent as a sidecar next to each app or service that emits telemetry; the agents can be configured to forward to any of a list of collectors, which adds some extra resilience (see the sketch below).
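As a rough illustration of the sidecar idea, here is a hedged boto3 sketch of an ECS task definition that runs an application container next to a jaeger-agent container forwarding to a collector. The family, images, memory sizes, and collector address are placeholders, not a prescribed setup:

```python
# Hedged sketch: register an ECS task definition with a jaeger-agent sidecar.
# Family, images, memory sizes, and the collector address are placeholders.
import boto3

ecs = boto3.client("ecs")

ecs.register_task_definition(
    family="my-service-with-jaeger-agent",
    networkMode="bridge",
    requiresCompatibilities=["EC2"],
    containerDefinitions=[
        {
            "name": "app",
            "image": "my-registry/my-service:latest",  # placeholder image
            "memory": 512,
            "essential": True,
            # With bridge networking the app reaches the sidecar via a link.
            "links": ["jaeger-agent"],
            "environment": [
                {"name": "OTEL_EXPORTER_JAEGER_AGENT_HOST", "value": "jaeger-agent"},
            ],
        },
        {
            "name": "jaeger-agent",
            "image": "jaegertracing/jaeger-agent:1.21",
            "memory": 128,
            "essential": False,
            "command": ["--reporter.grpc.host-port=your-collector-host:14250"],  # placeholder
            "portMappings": [
                {"containerPort": 6831, "protocol": "udp"},  # compact Thrift spans
            ],
        },
    ],
)
```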

Can a Databricks cluster be shared across workspaces?

My ultimate goal is to differentiate/manage costs on Databricks (Azure) for different teams/projects.
I was wondering whether I could use workspaces to achieve this.
I read the passage below; it sounds like a workspace can access a cluster, but it does not say whether multiple workspaces can access the same cluster.
A Databricks workspace is an environment for accessing all of your Databricks assets. The workspace organizes objects (notebooks, libraries, and experiments) into folders, and provides access to data and computational resources such as clusters and jobs.
In other words, can I create a cluster and somehow ensure it can only be accessed by a certain project, team, or workspace?
To manage who can access a particular cluster, you can make use of cluster access control. With cluster access control, you can determine what users can do on the cluster, e.g. attach to it, restart it, or fully manage it. You can do this at the user level but also at the user-group level. Note that you have to be on the Azure Databricks Premium plan to make use of cluster access control.
You also mentioned that your ultimate goal is to differentiate/manage costs on Azure Databricks. For this you can make use of tags. You can tag workspaces, clusters, and pools, and these tags are then propagated to the cost analysis reports in the Azure portal (see here).
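Both of these are also exposed through the Databricks REST API, so they can be scripted per team or project. A minimal, hedged sketch; the workspace URL, token, cluster ID, group name, cluster spec, and tag values below are all placeholders, not verified settings:

```python
# Hedged sketch: restrict who can attach to a cluster via the Permissions API,
# and create a cluster with custom tags for cost attribution.
# The host, token, cluster ID, group name, and cluster spec are placeholders.
import requests

HOST = "https://adb-1234567890123456.7.azuredatabricks.net"   # placeholder workspace URL
HEADERS = {"Authorization": "Bearer <personal-access-token>"}  # placeholder token

# 1) Give one group attach-only rights on an existing cluster (Premium plan required).
requests.put(
    f"{HOST}/api/2.0/permissions/clusters/<cluster-id>",  # placeholder cluster ID
    headers=HEADERS,
    json={
        "access_control_list": [
            {"group_name": "team-alpha", "permission_level": "CAN_ATTACH_TO"},
        ]
    },
)

# 2) Create a cluster tagged per team/project; the tags show up in Azure cost analysis.
requests.post(
    f"{HOST}/api/2.0/clusters/create",
    headers=HEADERS,
    json={
        "cluster_name": "team-alpha-etl",
        "spark_version": "11.3.x-scala2.12",
        "node_type_id": "Standard_DS3_v2",
        "num_workers": 2,
        "custom_tags": {"team": "alpha", "project": "etl"},
    },
)
```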

Set up Terraform Stackdriver alerts based on a GCP bucket

I am trying to set up Stackdriver alerting policies through Terraform, based on Cloud Storage bucket conditions.
Whenever a file lands in the GCS bucket, it should trigger an email notification to us (not using SendGrid).
For now, I have this email notification working through the GCP console via Stackdriver, but I am trying to do the same with Terraform.
Any guidance is really appreciated. Thank you
Figured it out via terraform import of the Google monitoring policies. Everything is now managed through Terraform, and notifications sent through Stackdriver on bucket changes are hooked up as well.
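For reference, Terraform's google_monitoring_alert_policy and google_monitoring_notification_channel resources sit on top of the Cloud Monitoring API, so the imported policy corresponds to objects like the ones in this hedged Python sketch (the project ID, bucket name, notification channel, metric, and threshold are assumptions for illustration, not the exact imported policy):

```python
# Hedged sketch: create an alert policy on a GCS bucket metric with the Cloud
# Monitoring API (the same objects Terraform's google_monitoring_alert_policy
# manages). Project ID, bucket name, and channel name are placeholders.
from google.cloud import monitoring_v3

project_name = "projects/my-project-id"  # placeholder
client = monitoring_v3.AlertPolicyServiceClient()

policy = monitoring_v3.AlertPolicy(
    display_name="Objects present in bucket",
    combiner=monitoring_v3.AlertPolicy.ConditionCombinerType.OR,
    conditions=[
        monitoring_v3.AlertPolicy.Condition(
            display_name="object_count above 0",
            condition_threshold=monitoring_v3.AlertPolicy.Condition.MetricThreshold(
                filter=(
                    'metric.type="storage.googleapis.com/storage/object_count" '
                    'AND resource.type="gcs_bucket" '
                    'AND resource.label.bucket_name="my-bucket"'  # placeholder bucket
                ),
                comparison=monitoring_v3.ComparisonType.COMPARISON_GT,
                threshold_value=0,
                duration={"seconds": 300},
            ),
        )
    ],
    # An email notification channel created beforehand, referenced by name.
    notification_channels=["projects/my-project-id/notificationChannels/123"],
)

created = client.create_alert_policy(name=project_name, alert_policy=policy)
print(created.name)
```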

AWS Lambda Deployment via CodePipeline

I would like to deploy my Lambda functions using AWS CodePipeline. However, when I follow the AWS CodePipeline creation wizard, I can't figure out which option to choose at the Beta stage, because both AWS CodeDeploy and Elastic Beanstalk deal only with EC2 instances. There is a lack of step-by-step tutorials on creating a pipeline for Lambda and API Gateway deployments. How can I skip the Beta stage without choosing one of them, or which one should I choose for my serverless architecture's deployments?
There are no direct integrations for Lambda/API Gateway -> CodePipeline at the moment. You could certainly do something with Jenkins as @arjabbar suggested. Thanks for the feedback, we'll take this on our backlog.
CloudFormation is available in CodePipeline now. This allows you to target cloudformation templates as Actions in the CodePipeline.
Here's an overview (the implementation was moved to a private repository after I changed positions):
https://aws.amazon.com/blogs/compute/continuous-deployment-for-serverless-applications/
In this pipeline we deploy a staging Lambda function, test its functionality, and then deploy the production Lambda function.
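To make the CloudFormation action concrete, here is a hedged sketch of the staging deploy stage as it would appear in a boto3 create_pipeline/update_pipeline call; the stack name, artifact name, template path, and role ARN are placeholders, not the blog post's exact pipeline:

```python
# Hedged sketch: a CloudFormation deploy action for a Lambda stack, shaped as
# one entry in the "stages" list of boto3's codepipeline.create_pipeline /
# update_pipeline call. Stack name, artifact, template path, and role ARN are
# placeholders.
deploy_staging_stage = {
    "name": "DeployStaging",
    "actions": [
        {
            "name": "DeployLambdaStack",
            "actionTypeId": {
                "category": "Deploy",
                "owner": "AWS",
                "provider": "CloudFormation",
                "version": "1",
            },
            "configuration": {
                "ActionMode": "CREATE_UPDATE",
                "StackName": "my-lambda-staging",                        # placeholder
                "TemplatePath": "BuildOutput::packaged-template.yaml",   # placeholder
                "Capabilities": "CAPABILITY_IAM",
                "RoleArn": "arn:aws:iam::123456789012:role/cfn-deploy",  # placeholder
            },
            "inputArtifacts": [{"name": "BuildOutput"}],
            "runOrder": 1,
        }
    ],
}
```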

Connect a Hadoop cluster to multiple Google Cloud Storage buckets in multiple Google projects

Is it possible to connect my Hadoop cluster to multiple Google Cloud projects at once?
I can easily use any Google Cloud Storage bucket in a single Google project via the Google Cloud Storage connector, as explained in this thread: Migrating 50TB data from local Hadoop cluster to Google Cloud Storage. But I can't find any documentation or example of how to connect to two or more Google Cloud projects from a single map-reduce job. Do you have any suggestions or tricks?
Thanks a lot.
Indeed, it is possible to connect your cluster to buckets from multiple different projects at once. Ultimately, if you're following the instructions for using a service-account keyfile, the GCS requests are performed on behalf of that service account, which can be treated more or less like any other user. You can either add the service account email (your-service-account-email@developer.gserviceaccount.com) as a member of all the different cloud projects owning the buckets you want to process, using the permissions section of cloud.google.com/console, or you can grant that service account access at the GCS level like any other user.
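A hedged sketch of what this looks like from the cluster side, assuming the job is driven via PySpark (the property names follow the GCS connector documentation; the keyfile path and bucket names are placeholders). The same google.cloud.auth.* properties can equally be set in core-site.xml for plain map-reduce jobs:

```python
# Hedged sketch: point the GCS connector at one service-account keyfile and read
# buckets owned by two different projects that both granted that account access.
# The keyfile path and bucket names are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("multi-project-gcs").getOrCreate()
hconf = spark.sparkContext._jsc.hadoopConfiguration()

hconf.set("fs.gs.impl", "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem")
hconf.set("google.cloud.auth.service.account.enable", "true")
hconf.set("google.cloud.auth.service.account.json.keyfile", "/path/to/key.json")

# Buckets in any project that granted this service account read access are usable:
df_a = spark.read.text("gs://bucket-in-project-a/input/")
df_b = spark.read.text("gs://bucket-in-project-b/input/")
print(df_a.count(), df_b.count())
```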
