I am creating an AWS EMR cluster and I have a bootstrap action to change spark-defaults.conf.
The cluster keeps getting terminated with:
can't read /etc/spark/conf/spark-defaults.conf: No such file or directory
However, if I skip the bootstrap action and check on the server, the file does exist, so I assume the order of operations is not correct. I am using Spark 1.6.1 provided by EMR 4.5, so it should be installed by default.
Any clues?
Thanks!
You should not change Spark configurations in a bootstrap action. Instead, specify any changes you have to spark-defaults in a special JSON file that you add when launching the cluster. If you use the CLI to launch, the command should look something like this:
aws --profile MY_PROFILE emr create-cluster \
--release-label emr-4.6.0 \
--applications Name=Spark Name=Ganglia Name=Zeppelin-Sandbox \
--name "Name of my cluster" \
--configurations file:///path/to/my/emr-configuration.json \
...
--bootstrap-actions ....
--steps ...
In the emr-configuration.json file you then set your changes to spark-defaults. An example could be:
[
{
"Classification": "capacity-scheduler",
"Properties": {
"yarn.scheduler.capacity.resource-calculator": "org.apache.hadoop.yarn.util.resource.DominantResourceCalculator"
}
},
{
"Classification": "spark",
"Properties": {
"maximizeResourceAllocation": "true"
}
},
{
"Classification": "spark-defaults",
"Properties": {
"spark.dynamicAllocation.enabled": "true",
"spark.executor.cores":"7"
}
}
]
The best way to achieve this is to use a Steps definition, in a CloudFormation template for example, since steps run on your master node, which holds the spark-defaults.conf file.
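If you go the step route, a hedged sketch of adding such a step from the CLI could look like the following; command-runner.jar executes the command on the master node (the cluster id and the appended property are placeholders, not from the question):
cat > steps.json <<'EOF'
[
  {
    "Type": "CUSTOM_JAR",
    "Name": "Append spark-defaults property",
    "ActionOnFailure": "CONTINUE",
    "Jar": "command-runner.jar",
    "Args": ["bash", "-c",
             "echo 'spark.executor.cores 7' | sudo tee -a /etc/spark/conf/spark-defaults.conf"]
  }
]
EOF
aws emr add-steps --cluster-id j-XXXXXXXXXXXXX --steps file://steps.json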
I'm trying to implement a reverse-proxy service using Traefik v1 (1.7) with ECS one-off tasks as backends, as described in this SO question. Routing should be dynamic: requests to the /user/1234/* path should go to the ECS task running with the appropriate docker labels:
docker_labels = {
  "traefik.frontend.rule" = "Path:/user/1234"
  "traefik.backend"       = "trax1"
  "traefik.enable"        = "true"
}
So far this setup works fine, but I need to create one ECS task definition per running task, because the docker labels are a property of the ECS TaskDefinition, not of the ECS task itself. Is it possible to create only one TaskDefinition and pass the Traefik rules in ECS task tags, within the task's key/value properties?
This would require some modification of the Traefik source code. Are there any other options or ways this could be implemented that I've missed, like API Gateway or Lambda@Edge? I have no experience with those technologies, so real-world examples are more than welcome.
Solved by using the Traefik REST API provider. The external component that runs the one-off tasks can discover the task's internal IP and update the Traefik configuration on the fly, pairing the traefik.frontend.rule = "Path:/user/1234" rule with the task's internal IP:port values in the backends section.
It should first GET the current Traefik configuration from the /api/providers/rest endpoint, remove or add the corresponding part (when a task is stopped or started), and then update the Traefik configuration with a PUT to the same endpoint; a curl sketch follows the example configuration below.
{
"backends": {
"backend-serv1": {
"servers": {
"server-service-serv-test1-serv-test-4ca02d28c79b": {
"url": "http://172.16.0.5:32793"
}
}
},
"backend-serv2": {
"servers": {
"server-service-serv-test2-serv-test-279c0ba1959b": {
"url": "http://172.16.0.5:32792"
}
}
}
},
"frontends": {
"frontend-serv1": {
"entryPoints": [
"http"
],
"backend": "backend-serv1",
"routes": {
"route-frontend-serv1": {
"rule": "Path:/user/1234"
}
}
},
"frontend-serv2": {
"entryPoints": [
"http"
],
"backend": "backend-serv2",
"routes": {
"route-frontend-serv2": {
"rule": "Path:/user/5678"
}
}
}
}
}
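Roughly, that GET/modify/PUT cycle could look like the sketch below (the Traefik API host/port and the temporary file name are assumptions, not part of the original answer):
# Fetch the current dynamic configuration from the REST provider
curl -s http://traefik.internal:8080/api/providers/rest > config.json
# ... add or remove the frontend/backend pair for the started/stopped task in config.json ...
# Push the updated configuration back
curl -s -X PUT -H "Content-Type: application/json" -d @config.json http://traefik.internal:8080/api/providers/rest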
I am picking an ECS-optimised instance (ami-05958d7635caa4d04) in the data plane of ECS in the ca-central-1 region.
AWS Systems Manager Agent (SSM Agent) is Amazon software that can be installed and configured on an Amazon EC2 instance, an on-premises server, or a virtual machine (VM). SSM Agent makes it possible for Systems Manager to update, manage, and configure these resources.
In my scenario, launching an ECS task on the ECS-optimised instance (ami-05958d7635caa4d04) causes a resource:memory error. More on this error here. Monitoring ECS -> cluster -> service -> events will not work for me, because CloudFormation rolls back the cluster.
My existing ECS optimised instance is launched as shown below:
"EC2Instance":{
"Type": "AWS::EC2::Instance",
"Properties":{
"ImageId": "ami-05958d7635caa4d04",
"InstanceType": "t2.micro",
"SubnetId": { "Ref": "SubnetId"},
"KeyName": { "Ref": "KeyName"},
"SecurityGroupIds": [ { "Ref": "EC2InstanceSecurityGroup"} ],
"IamInstanceProfile": { "Ref" : "EC2InstanceProfile"},
"UserData":{
"Fn::Base64": { "Fn::Join": ["", [
"#!/bin/bash\n",
"echo ECS_CLUSTER=", { "Ref": "EcsCluster" }, " >> /etc/ecs/ecs.config\n",
"groupadd -g 1000 jenkins\n",
"useradd -u 1000 -g jenkins jenkins\n",
"mkdir -p /ecs/jenkins_home\n",
"chown -R jenkins:jenkins /ecs/jenkins_home\n"
] ] }
},
"Tags": [ { "Key": "Name", "Value": { "Fn::Join": ["", [ { "Ref": "AWS::StackName"}, "-instance" ] ]} }]
}
}
1) Is installation of the SSM agent required on the ECS instance (ami-05958d7635caa4d04) to retrieve such CloudWatch events (resource:memory) with an aws.ssm CloudWatch event rule filter? Or does an aws.ec2 CloudWatch event rule filter suffice?
2) If yes, do I need to explicitly install the SSM agent on the ECS instance (ami-05958d7635caa4d04) through CloudFormation?
You don't need to install the SSM agent to monitor something such as memory usage of your instance (whether it is a container instance or not). This is the domain of CloudWatch, not SSM.
All you need to install is the unified CloudWatch agent and configure it accordingly. This is where SSM can help, but it is not necessary and you can install the agent manually (or via a script if you want).
If you decide to use SSM then you will need to install it explicitly. It comes preinstalled on some OSes, but not on the Amazon ECS-Optimized AMI - more about this.
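For example, a minimal sketch of installing and starting the unified CloudWatch agent on an Amazon Linux based ECS instance; the package name and paths are the usual ones for Amazon Linux 2, but treat them as assumptions for your AMI, and the agent configuration file itself is not shown:
# Install the unified CloudWatch agent from the Amazon Linux repositories
sudo yum install -y amazon-cloudwatch-agent
# Load a configuration file you provide (path is an assumption) and start the agent;
# the config can also be fetched from SSM Parameter Store with -c ssm:<parameter-name>
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl \
  -a fetch-config -m ec2 \
  -c file:/opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json -s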
I have 4 shell scripts which I embedded in Java code and converted into a jar. I also have an AWS Lambda function which brings up the EMR cluster. From the Lambda function, I should run the generated jar (java -jar /home/hadoop/aws.jar) using steps. I have bootstrap actions that set a few environment variables when the cluster is brought up. So ideally, after the cluster is up, it should run the java -jar command that was specified in the steps values of the JSON events.
But the problem is that EMR terminates, failing on the step that runs the jar command. Is there any other way to run the java -jar command from Lambda using steps?
"Steps":[
{
"Name": "Setup hadoop debugging",
"ActionOnFailure": "TERMINATE_CLUSTER",
"HadoopJarStep": {
"Jar": "command-runner.jar",
"Args": [
"state-pusher-script"
]
}
},
{
"Name": "Execute Step JAR",
"ActionOnFailure": "TERMINATE_CLUSTER",
"HadoopJarStep": {
"Jar": "command-runner.jar",
"Args":[
"java -jar /home/hadoop/lib/aws-add-step-emr-0.0.1-SNAPSHOT-shaded.jar"
]
}
}
],
"BootstrapActions":[
{
"Name": "Custom action",
"ScriptBootstrapAction": {
"Path": "s3://aws-east-1/bootstrap/init.sh"
}
}]
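As a hedged aside: command-runner.jar generally expects the command and each of its arguments as separate Args entries rather than one space-separated string, which is a common cause of failures in steps like the one above. Passed from the CLI, the same step would typically look like this (the cluster id is a placeholder; the jar path is the question's own):
aws emr add-steps \
  --cluster-id j-XXXXXXXXXXXXX \
  --steps 'Type=CUSTOM_JAR,Name="Execute Step JAR",ActionOnFailure=TERMINATE_CLUSTER,Jar=command-runner.jar,Args=[java,-jar,/home/hadoop/lib/aws-add-step-emr-0.0.1-SNAPSHOT-shaded.jar]'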
I am using CloudFormation to create my Lambda function, with the code in an S3 bucket with versioning enabled.
"MYLAMBDA": {
"Type": "AWS::Lambda::Function",
"Properties": {
"FunctionName": {
"Fn::Sub": "My-Lambda-${StageName}"
},
"Code": {
"S3Bucket": {
"Fn::Sub": "${S3BucketName}"
},
"S3Key": {
"Fn::Sub": "${artifact}.zip"
},
"S3ObjectVersion": "1e8Oasedk6sDZu6y01tioj8X._tAl3N"
},
"Handler": "streams.lambda_handler",
"Runtime": "python3.6",
"Timeout": "300",
"MemorySize": "512",
"Role": {
"Fn::GetAtt": [
"LambdaExecutionRole",
"Arn"
]
}
}
}
The Lambda function gets created successfully. When I copy a new artifact zip file to the S3 bucket, a new version of the file gets created with a new "S3ObjectVersion" string. But the Lambda function code is still using the older version.
The AWS CloudFormation documentation clearly says the following:
To update a Lambda function whose source code is in an Amazon S3 bucket, you must trigger an update by updating the S3Bucket, S3Key, or S3ObjectVersion property. Updating the source code alone doesn't update the function.
Is there an additional trigger event I need to create to get the code updated?
In case anyone is running into a similar issue, I figured out a way in my case. I use Terraform + Jenkins to create my Lambda functions through an S3 bucket. In the beginning I could create the functions, but they wouldn't update once created. I verified that my zip files in S3 were updated. It took me some time to figure out that I needed to make one of the following two changes.
Solution 1: Give the object a new key when uploading the new zip file. In my Terraform I add the git commit id as part of the S3 key.
resource "aws_s3_bucket_object" "lambda-abc-package" {
bucket = "${aws_s3_bucket.abc-bucket.id}"
key = "${var.lambda_ecs_task_runner_bucket_key}_${var.git_commit_id}.zip"
source = "../${var.lambda_ecs_task_runner_bucket_key}.zip"
}
Solution 2: Add source_code_hash in the Lambda resource.
resource "aws_lambda_function" "abc-ecs-task-runner" {
s3_bucket = "${var.bucket_name}"
s3_key = "${aws_s3_bucket_object.lambda-ecstaskrunner-package.key}"
function_name = "abc-ecs-task-runner"
role = "${aws_iam_role.AbcEcsTaskRunnerRole.arn}"
handler = "index.handler"
memory_size = "128"
runtime = "nodejs6.10"
timeout = "300"
source_code_hash = "${base64sha256(file("../${var.lambda_ecs_task_runner_bucket_key}.zip"))}"
}
Either one should work. Also, when checking the Lambda code, refreshing the URL in the browser won't work; you need to go back to Functions and open that function again.
Hope this helps.
I also faced the same issue. My code was in Archive.zip in an S3 bucket, and when I uploaded a new Archive.zip, Lambda was not responding according to the new code.
The solution was to paste the link to the S3 location of Archive.zip in the Lambda function's code section again and save it.
How did I figure out that Lambda was not taking the new code?
Go to your Lambda function --> Actions --> Export Function --> Download Deployment Package and check whether the code is actually the code that you've recently uploaded to S3.
You have to update the S3ObjectVersion value to the new version ID in your CloudFormation template itself.
Then you have to update your CloudFormation stack with the new template.
You can do this either in the CloudFormation console or via the AWS CLI.
From the AWS CLI you can do an update-function-code call, as this post mentions: https://nono.ma/update-aws-lambda-function-code
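A hedged sketch of that CLI call; the function name, bucket, key, and version id below are placeholders standing in for the template's values:
aws lambda update-function-code \
  --function-name My-Lambda-dev \
  --s3-bucket my-artifact-bucket \
  --s3-key artifact.zip \
  --s3-object-version NEW_VERSION_ID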
I am deploying a package that contains a deploy.ps1 file. As you already know, Octopus runs this script on deployment by default; I want to prevent that from happening and run a custom script instead.
If you have a requirement like this, then it's better to move the PowerShell that starts the services to a separate step and then tag the Tentacles you want that script to run on.
In your deployment step for the service, set the start mode to "Manual"
Then have a step that starts the service, and scope that script to the environments / servers that you want to auto start
The code for the step template I use here is
{
"Id": "ActionTemplates-1",
"Name": "Enable and start service",
"Description": null,
"ActionType": "Octopus.Script",
"Version": 8,
"Properties": {
"Octopus.Action.Package.NuGetFeedId": "feeds-builtin",
"Octopus.Action.Script.Syntax": "PowerShell",
"Octopus.Action.Script.ScriptSource": "Inline",
"Octopus.Action.RunOnServer": "false",
"Octopus.Action.Script.ScriptBody": "$serviceName = $OctopusParameters[\"ServiceName\"]\n\nwrite-host \"the service is: \" $serviceName\n\n& \"sc.exe\" config $serviceName start= delayed-auto\n& \"sc.exe\" start $serviceName\n\n"
},
"Parameters": [
{
"Name": "ServiceName",
"Label": "Service Name",
"HelpText": null,
"DefaultValue": null,
"DisplaySettings": {
"Octopus.ControlType": "SingleLineText"
}
}
],
"$Meta": {
"ExportedAt": "2016-10-10T10:21:21.980Z",
"OctopusVersion": "3.3.2",
"Type": "ActionTemplate"
}
}
You may want to modify the step template as it will set the service to "Automatic - Delayed" and then start the service.
Are you able to move the script to a sub folder?
These scripts must be located in the root of your package
http://docs.octopusdeploy.com/display/OD/Custom+scripts
Alternatively - don't include your deploy.ps1 script in the deployment package if it should never be deployed.