FUNCTION_REGION env variable in Node.js is different from the one GCP sets automatically for logs - google-cloud-logging

I programmatically write the logs from the function using such code:
import {Logging} from '@google-cloud/logging';
const logging = new Logging();
const log = logging.log('log-name');
const metadata = {
  type: 'cloud_function',
  labels: {
    function_name: process.env.FUNCTION_NAME,
    project: process.env.GCLOUD_PROJECT,
    region: process.env.FUNCTION_REGION
  },
};
log.write(
  log.entry(metadata, "some message")
);
Later, in Logs Explorer, I see my log message with labels.region set to us1, whereas the standard logs that GCP adds, e.g. "Function execution started", contain the value us-central1.
Shouldn't they be the same? Maybe I missed something, or if this was done intentionally, what is the reason behind it?

process.env.FUNCTION_REGION is only provided automatically in the Node.js 8 runtime. In newer runtimes it was deprecated. More info in the documentation.
If your function requires one of the environment variables from an older runtime, you can set the variable when deploying your function.
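For example, something along these lines at deploy time would make the variable available again (function name and region here are placeholders, not taken from the question):
gcloud functions deploy my-function --set-env-vars FUNCTION_REGION=us-central1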


CDK: How do I configure `userData` with values known only at deploy time?

I am using CDK v2, with Typescript.
I want my bastion machine to log stuff to Cloudwatch.
The specific LogGroup I want it to write to is also created via CDK (so that I can customise the retention).
How can I customise the userData script with knowledge about other AWS resources, which are also created by CDK - so I can't know their names?
My CDK stuff is being deployed via a CDK pipeline.
Here is my CDK script:
export class StoBastion extends cdk.Stack {
  constructor(scope: Construct, id: string, props: cdk.StackProps){
    super(scope, id, props);

    // actual name: DemoStage-StoBastion-StoBastionLogGroup5EEB3DE8-AdkaWy0ELoeF
    const logGroup = new LogGroup(this, "StoBastionLogGroup", {
      retention: RetentionDays.TWO_WEEKS,
    });

    let initScriptPath = 'lib/script/bastion-linux2-asg-provision.sh';
    const userDataText = readFileSync(initScriptPath, 'utf8');

    const autoScalingGroup = new AutoScalingGroup(this, 'StoAsg', {
      ...
      userData: UserData.custom(userDataText),
    })
  }
}
And the shell script I want to use as the userData for the instance:
#!/bin/sh
### cloudwatch ###
# This goes as early as possible in the script, so we can see what's going
# on from Cloudwatch ASAP.
echo " >>bastion>> installing cloudwatch package $(date)"
yum install -y awslogs
echo " >>bastion>> configuring cloudwatch - ${TF_APP_LOG_GROUP} $(date)"
## overwrite awscli.conf ##
cat > /etc/awslogs/awscli.conf <<EOL
[plugins]
cwlogs = cwlogs
[default]
region = ${TF_APP_REGION}
EOL
## end of overwrite awscli.conf ##
## overwrite awslogs.conf ##
cat > /etc/awslogs/awslogs.conf <<EOL
[general]
state_file = /var/lib/awslogs/agent-state
[cloudinit]
datetime_format = %b %d %H:%M:%S
file = /var/log/cloud-init-output.log
buffer_duration = 5000
log_group_name = ${TF_APP_LOG_GROUP}
log_stream_name = linux2-cloud-init-output-{instance_id}
initial_position = start_of_file
EOL
## end of overwrite awslogs.conf ##
echo " >>bastion>> start awslogs daemon $(date)"
systemctl start awslogsd
echo " >>bastion>> make sure awslogs starts up on boot"
systemctl enable awslogsd.service
### end cloudwatch ###
I want to somehow replace the variable references in the userData script like ${TF_APP_LOG_GROUP} with values populated at CDK deploy time so they have the correct values.
I'm doing cloudwatch stuff at the moment, but there will be other stuff I need to do like this, so this question isn't about cloudwatch - it's about "how can I configure my userData with values known only at CDK deploy time"?
You can do this with conventional string formatting tools, as if the log group were a regular string:
const userDataText = readFileSync(
  initScriptPath,
  'utf8'
).replaceAll(
  '${TF_APP_LOG_GROUP}',
  logGroup.logGroupName
);
What this will do behind the scenes is replace all occurrences of ${TF_APP_LOG_GROUP} in the text string with a token (a special string that looks something like ${TOKEN[LogGroup.Name.1234]}), which CloudFormation will in turn replace with the actual value during deployment.
For reference: https://docs.aws.amazon.com/cdk/v2/guide/tokens.html
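If the script also references ${TF_APP_REGION}, you can chain a second replacement the same way; a minimal sketch, assuming the code runs inside the stack class so this.region is available:
const userDataText = readFileSync(initScriptPath, 'utf8')
  // both placeholders become CDK tokens that CloudFormation resolves at deploy time
  .replaceAll('${TF_APP_LOG_GROUP}', logGroup.logGroupName)
  .replaceAll('${TF_APP_REGION}', this.region);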
This was all incorrect. Leaving it for posterity purposes, but it is not valid as it was the wrong direction.
Wherever you are synthing your template - be that in a SimpleSynth Pipelines step or in CodeBuild if using a standard CodePipeline/CodeBuild - you can include context variables in the synth.
The command cdk deploy can be followed by any context variables you want: cdk deploy -c VariableName=value. If your bash script returns its answers to the shell, you can store them as shell variables and use them in the cdk deploy.
You can then reference these variables within the actual stacks with
const bucket_name = this.node.tryGetContext('VariableName');
See this documentation or this SO post for more information.

Receiving error in AWS Secrets manager awscli for: Version "AWSCURRENT" not found when deployed via Terraform

Overview
Create an aws_secretsmanager_secret
Create an aws_secretsmanager_secret_version
Store a uniquely generated string as the above version
Use a local-exec provisioner to store the actual secured string using bash
Reference that string using the Secrets Manager resource in, for example, an RDS instance deployment.
Objective
Keep all plain-text strings out of remote state residing in an S3 bucket
Use AWS Secrets Manager to store these strings
Set once, retrieve by calling the resource in Terraform
Problem
Error: Secrets Manager Secret
"arn:aws:secretsmanager:us-east-1:82374283744:secret:Example-rds-secret-fff42b69-30c1-df50-8e5c-f512464a4a11-pJvC5U"
Version "AWSCURRENT" not found
when running terraform apply
Question
Why isn't it moving the AWSCURRENT version automatically? Am I missing something? Is my bash command wrong? The value is not written to the secret_version, but it does reference it correctly.
Here is the part of main.tf that actually performs the command:
provisioner "local-exec" {
command = "bash -c 'RDSSECRET=$(openssl rand -base64 16); aws secretsmanager put-secret-value --secret-id ${data.aws_secretsmanager_secret.secretsmanager-name.arn} --secret-string $RDSSECRET --version-stages AWSCURRENT --region ${var.aws_region} --profile ${var.aws-profile}'"
}
Code
main.tf
data "aws_secretsmanager_secret_version" "rds-secret" {
secret_id = aws_secretsmanager_secret.rds-secret.id
}
data "aws_secretsmanager_secret" "secretsmanager-name" {
arn = aws_secretsmanager_secret.rds-secret.arn
}
resource "random_password" "db_password" {
length = 56
special = true
min_special = 5
override_special = "!#$%^&*()-_=+[]{}<>:?"
keepers = {
pass_version = 1
}
}
resource "random_uuid" "secret-uuid" { }
resource "aws_secretsmanager_secret" "rds-secret" {
name = "DAL-${var.environment}-rds-secret-${random_uuid.secret-uuid.result}"
}
resource "aws_secretsmanager_secret_version" "rds-secret-version" {
secret_id = aws_secretsmanager_secret.rds-secret.id
secret_string = random_password.db_password.result
provisioner "local-exec" {
command = "bash -c 'RDSSECRET=$(openssl rand -base64 16); aws secretsmanager put-secret-value --secret-id ${data.aws_secretsmanager_secret.secretsmanager-name.arn} --secret-string $RDSSECRET --region ${var.aws_region} --profile ${var.aws-profile}'"
}
}
variables.tf
variable "aws-profile" {
description = "Local AWS Profile Name "
type = "string"
}
variable "aws_region" {
description = "aws region"
default="us-east-1"
type = "string"
}
variable "environment" {}
terraform.tfvars
aws_region="us-east-1"
aws-profile="Example-Environment"
environment="dev"
The error likely isn't occurring in your provisioner execution per se, because if you remove the provisioner block the error still occurs on apply--but confusingly only the first time after a destroy.
Removing the data "aws_secretsmanager_secret_version" "rds-secret" block as well "resolves" the error completely.
I'm guessing there is some sort of config delay issue here...but adding a 20 second delay provisioner to the aws_secretsmanager_secret.rds-secret resource block didn't help.
And the value from the data block can be successfully output on subsequent apply runs, so maybe it's not just timing.
Even if you resolve the above more basic issue, it's likely your provisioner will still be confusing things by modifying a resource that Terraform is trying to manage in the same run. I'm not sure there's a way to get around that except perhaps by splitting into two separate operations.
Update:
It turns out that on the first run the data sources are read before the aws_secretsmanager_secret_version resource is created. Just adding depends_on = [aws_secretsmanager_secret_version.rds-secret-version] to the data "aws_secretsmanager_secret_version" block resolves this fully and makes the interpolation for your provisioner work as well. I haven't tested the actual provisioner.
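A minimal sketch of that change, using the names from the question's config:
data "aws_secretsmanager_secret_version" "rds-secret" {
  secret_id = aws_secretsmanager_secret.rds-secret.id

  # force this read to wait until the secret version has actually been created
  depends_on = [aws_secretsmanager_secret_version.rds-secret-version]
}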
Also you may need to consider this (which I take to not always apply to 0.13):
NOTE: In Terraform 0.12 and earlier, due to the data resource behavior of deferring the read until the apply phase when depending on values that are not yet known, using depends_on with data resources will force the read to always be deferred to the apply phase, and therefore a configuration that uses depends_on with a data resource can never converge. Due to this behavior, we do not recommend using depends_on with data resources.

terraform: lambda invocation during destroy

I have lambda invocation in our terraform-built environment:
data "aws_lambda_invocation" "this" {
count = var.invocation == "true" ? 1 : 0
function_name = aws_lambda_function.this.function_name
input = <<JSON
{
"Name": "Invocation"
}
JSON
}
The problem: the function is invoked not only during creation ("apply") but also during deletion ("destroy"). How can I invoke it during creation only? I thought about checking environment variables in the lambda (perhaps TF adds the name of the process there or something like that), but I hope there's a better way.
It's worth checking if you can use the -var 'lambda_xxx=execute' option when running the terraform command to control whether the lambda code should be executed or not (see the terraform docs).
Using that variable lambda_xxx, passed in via the command line, you can check in the Terraform code whether you want to run the lambda code or not.
The code below creates a WAF rule only if the count is 1:
resource "aws_waf_rule" "wafrule" {
depends_on = ["aws_waf_ipset.ipset"]
name = "${var.environment}-WAFRule"
metric_name = "${replace(var.environment, "-", "")}WAFRule"
count = "${var.is_waf_enabled == "true" ? 1 : 0}"
predicates {
data_id = "${aws_waf_ipset.ipset.id}"
negated = false
type = "IPMatch"
}
}
The variable is declared in the variables.tf file:
variable "is_waf_enabled" {
type = "string"
default = "false"
description = "String value to indicate if WAF/API KEY is turned on or off (true/any_value)"
}
When you run the command, any value other than true is considered false, as we are just checking for the string true.
Similarly you can do this for your lambda.
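For example, applied to the question's invocation variable, the runs could be driven from the command line like this:
# run the invocation as part of the apply
terraform apply -var 'invocation=true'

# skip the invocation when tearing the environment down
terraform destroy -var 'invocation=false'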
There are better alternative solutions for this problem now, which weren't available at the time the question was asked.
Lambda Invocation Resource in AWS provider: https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/lambda_invocation
Lambda Based Resource in LambdaBased provider: https://registry.terraform.io/providers/thetradedesk/lambdabased/latest/docs/resources/lambdabased_resource
With the disclaimer that I'm the developer of the latter one: If the underlying problem is managing a resource through lambda functions, the lambda based resource has some good features tailored specifically to accomplish that with the obvious drawback of adding another provider dependency.
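For the first of those, a minimal sketch of the aws_lambda_invocation resource, mirroring the input from the question's data source:
resource "aws_lambda_invocation" "this" {
  function_name = aws_lambda_function.this.function_name

  # invoked when the resource is created, not on destroy
  input = jsonencode({
    Name = "Invocation"
  })
}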

Cloudwatch to Elasticsearch parse/tokenize log event before push to ES

Appreciate your help in advance.
In my scenario, CloudWatch multiline logs need to be shipped to the Elasticsearch service.
ECS --awslog--> CloudWatch --using lambda--> ES Domain
(Basic flow, though I'm very open to changing how data is shipped from CW to ES.)
I was able to solve the multi-line issue using multi_line_start_pattern, BUT
the main issue I am experiencing now is that my logs are in ODL format (the following format):
[yyyy-mm-ddThh:mm:ss.SSS-Z][ProductName-Version][Log Level]
[Message ID][LoggerName][Key Value Pairs][[
Message]]
and I would like to parse and tokenize log events before storing them in ES (vs. the complete log line).
For example:
[2018-05-31T11:08:49.148-0400] [glassfish 4.1] [INFO] [] [] [tid: _ThreadID=43 _ThreadName=Thread-8] [timeMillis: 1527692929148] [levelValue: 800] [[
[] INFO : (DummyApplicationFunctionJPADAO) EntityManagerFactory located under resource lookup name [null], resource name=AuthorizationPU]]
This needs to be parsed and tokenized using the format:
timestamp            2018-05-31T11:08:49.148-0400
ProductName-Version  glassfish 4.1
LogLevel             INFO
MessageID
LoggerName
KeyValuePairs        tid: _ThreadID=43 _ThreadName=Thread-8
Message              [] INFO : (DummyApplicationFunctionJPADAO) EntityManagerFactory located under resource lookup name [null], resource name=AuthorizationPU
In the above, the key-value pairs repeat and are variable; for simplicity I can store them all as one long string.
From what I've gathered about CloudWatch, Subscription Filter Pattern regex support seems very limited, and I'm really not sure how to fit the above pattern. For the lambda function that pushes the data to ES, I have not seen AWS docs or examples that use a lambda as the means to parse and push to ES.
I will appreciate it if someone can advise what/where the best option is to parse CW logs before they get into ES: Subscription Filter Pattern vs. in the lambda function, or any other way.
Thank you.
From what I can see, your best bet is what you're suggesting: a CloudWatch-log-triggered lambda that reformats the logged data into your preferred ES format and then posts it into ES.
You'll need to subscribe this lambda to your CloudWatch logs. You can do this on the lambda console, or the cloudwatch console (https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/Subscriptions.html).
The lambda's event payload will be { "awslogs": { "data": "encoded-logs" } }, where encoded-logs is a Base64 encoding of gzipped JSON.
For example, the sample event (https://docs.aws.amazon.com/lambda/latest/dg/eventsources.html#eventsources-cloudwatch-logs) can be decoded in node using:
const zlib = require('zlib');
const data = event.awslogs.data;
const gzipped = Buffer.from(data, 'base64');
const json = zlib.gunzipSync(gzipped);
const logs = JSON.parse(json);
console.log(logs);
/*
{ messageType: 'DATA_MESSAGE',
owner: '123456789123',
logGroup: 'testLogGroup',
logStream: 'testLogStream',
subscriptionFilters: [ 'testFilter' ],
logEvents:
[ { id: 'eventId1',
timestamp: 1440442987000,
message: '[ERROR] First test message' },
{ id: 'eventId2',
timestamp: 1440442987001,
message: '[ERROR] Second test message' } ] }
*/
From what you've outlined, you'll want to extract the logEvents array and parse it into an array of strings. I'm happy to give some help on this too if you need it (but I'll need to know what language you're writing your lambda in - there are libraries for tokenizing ODL - so hopefully it's not too hard).
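If you end up tokenizing the ODL records yourself in a Node lambda, a rough sketch (assuming the bracketed-header format shown in the question; not a battle-tested parser) could look like:
// Split an ODL record into its bracketed header fields and the [[ ... ]] message body.
function parseOdl(record) {
  const msgMatch = record.match(/\[\[([\s\S]*)\]\]\s*$/);
  const message = msgMatch ? msgMatch[1].trim() : '';
  const header = msgMatch ? record.slice(0, msgMatch.index) : record;
  // every [ ... ] group in the header becomes one field, in order
  const fields = [...header.matchAll(/\[([^\]]*)\]/g)].map(m => m[1]);
  const [timestamp, productVersion, logLevel, messageId, loggerName, ...rest] = fields;
  return {
    timestamp,
    productVersion,
    logLevel,
    messageId,
    loggerName,
    keyValuePairs: rest.join(' '), // keep the variable key/value groups as one string, as you suggested
    message,
  };
}
Each object returned this way can then be indexed as a separate ES document.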
At this point you can then POST these new records directly into your AWS ES Domain. Somewhat cryptically, the S3-to-ES guide gives a good outline of how to do this in python: https://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/es-aws-integrations.html#es-aws-integrations-s3-lambda-es
You can find a full example for a lambda that does all this (by someone else) here: https://github.com/blueimp/aws-lambda/tree/master/cloudwatch-logs-to-elastic-cloud

elasticsearch can't resolve environment variables in elasticsearch.yml

I have the following two settings in my elasticsearch.yml file. They are the only ones that pull from environment variables.
cloud.aws.access_key: ${AWS_ACCESS_KEY_ID}
cloud.aws.secret_key: ${AWS_SECRET_KEY}
When I restart elasticsearch to load these from the environment, I get an error that it can't resolve them. I've tested it, and it won't resolve either one, so this error applies to both (it just fails on the bottom one first).
- IllegalArgumentException[Could not resolve placeholder 'AWS_SECRET_KEY']
java.lang.IllegalArgumentException: Could not resolve placeholder 'AWS_SECRET_KEY'
at org.elasticsearch.common.property.PropertyPlaceholder.parseStringValue(PropertyPlaceholder.java:124)
at org.elasticsearch.common.property.PropertyPlaceholder.replacePlaceholders(PropertyPlaceholder.java:81)
at org.elasticsearch.common.settings.ImmutableSettings$Builder.replacePropertyPlaceholders(ImmutableSettings.java:1060)
at org.elasticsearch.node.internal.InternalSettingsPreparer.prepareSettings(InternalSettingsPreparer.java:101)
at org.elasticsearch.bootstrap.Bootstrap.initialSettings(Bootstrap.java:106)
at org.elasticsearch.bootstrap.Bootstrap.main(Bootstrap.java:177)
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:32)
I did some investigating through elasticsearch's code repository on github and discovered this bit of code that pulls from the environment variables.
ImmutableSettings.java#resolvePlaceholder from elasticsearch#github
Namely, the lines inside that function that should be pulling from the environment variables are these:
Code from resolvePlaceholder that pulls out environment variables
However, after resolvePlaceholder is run from inside function PropertyPlaceholder#parseStringValue, the System.getenv call must be returning null as that is the only way for that error to be thrown.
I wrote a simple test program that is essentially a copy of ImmutableSettings.java#resolvePlaceholder to test that System.getenv was pulling out the environment variables correctly on my system. This in fact returns the values I expect.
public class Cool {
    public static void main(String[] args) {
        System.out.println(resolvePlaceholder(args[0]));
    }

    public static String resolvePlaceholder(String placeholderName) {
        if (placeholderName.startsWith("env.")) {
            // explicit env var prefix
            System.out.println("1: placeholderName.startsWith(\"env.\")");
            return System.getenv(placeholderName.substring("env.".length()));
        }
        String value = System.getProperty(placeholderName);
        if (value != null) {
            System.out.println("2: System.getProperty");
            return value;
        }
        value = System.getenv(placeholderName);
        if (value != null) {
            System.out.println("3: System.getenv");
            return value;
        }
        return "Map should've had it";
    }
}
When run, this is the output, showing we are getting the set environment variables (keys hidden for obvious reasons):
[ec2-user@ip-172-31-34-195 ~]$ java Cool AWS_SECRET_KEY
3: System.getenv
XXXXXXXXXXXXXXXXXX
[ec2-user@ip-172-31-34-195 ~]$ java Cool AWS_ACCESS_KEY_ID
3: System.getenv
XXXXXXXXXXXXXXXXXX
What is it about elasticsearch that isn't able to parse my environment variables from elasticsearch.yml? I've done quite a bit of digging at this point but I'm sure there is a simple solution around the corner. Any help would be very much appreciated.
I figured out the issue.
Because I am running elasticsearch as a Linux service rather than as a shell application, it has access to no environment variables except for a very select few.
I added the following line to the end of /etc/sysconfig/elasticsearch to load the environment variables I wanted available to the program:
. /path/to/environment/variables
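For example, the sourced file might simply export the variables (values here are placeholders):
export AWS_ACCESS_KEY_ID=<your access key>
export AWS_SECRET_KEY=<your secret key>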
