Shell scripts scheduler - shell

Basically, I need to run a set of custom shell scripts on EC2 instances to provision some software. Is there a workflow manager like Oozie or Airflow, with API access, that can schedule them? I am asking for alternatives to Oozie and Airflow because those are schedulers for Hadoop environments, and my environment is not Hadoop. I can ensure SSH access from the source machine that runs the workflow manager to the EC2 instances where I want to install the software. Are there any such open source workflow schedulers?

I would recommend using Cadence Workflow for your use case. There are multiple provisioning solutions built on top of it, for example the Banzai Cloud Pipeline Platform.
See the presentation that goes over the Cadence programming model.
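Whichever workflow manager is chosen, the step the question describes (running a local shell script on an instance over SSH) can be sketched as below. This is not Cadence code, only a minimal illustration in plain Python; the function names, the default user `ec2-user`, and the idea of streaming the script to `bash -s` are assumptions made for the example.

```python
import subprocess
from pathlib import Path
from typing import Dict, List


def build_ssh_command(host: str, user: str = "ec2-user") -> List[str]:
    """Return the argv that runs a script read from stdin on a remote host.

    `bash -s` makes the remote shell read the script from standard input,
    so the script file only has to exist on the source machine.
    BatchMode=yes makes ssh fail instead of prompting for a password.
    """
    return ["ssh", "-o", "BatchMode=yes", f"{user}@{host}", "bash", "-s"]


def run_script(host: str, script: Path, user: str = "ec2-user") -> int:
    """Stream a local script to the remote host and return its exit code."""
    with script.open("rb") as handle:
        result = subprocess.run(build_ssh_command(host, user), stdin=handle)
    return result.returncode


def provision(hosts: List[str], scripts: List[Path]) -> Dict[str, bool]:
    """Run the scripts in order on each host; stop a host at its first failure."""
    outcome = {}
    for host in hosts:
        # all() short-circuits, so later scripts are skipped after a failure.
        outcome[host] = all(run_script(host, s) == 0 for s in scripts)
    return outcome
```

A workflow manager would typically wrap a call like `run_script` in one of its own task or activity types and add retries, scheduling, and an API on top.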

Related

How to run Ansible play-book command from remote server

I need to install and configure every new system that starts under auto scaling in AWS according to its requirements. For example, if it is an app server, it should get Node.js and the respective Git code for deployment, all using Ansible.
How does Ansible identify that a new system has come up and needs all of this configuration?
Here is a guide from the Ansible docs on how to handle autoscaling with Ansible: https://docs.ansible.com/ansible/latest/scenario_guides/guide_aws.html#autoscaling-with-ansible-pull
The problem with this approach is that the whole provisioning process has to run on startup. This takes a lot of time and is error-prone.
A common solution is to build a custom AMI with all the infrastructure needed for your service and only deploy your current code to this machine.
A good tool for building custom AMIs is Packer. A guide for AWS is available here: https://www.packer.io/docs/builders/amazon.html
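For the ansible-pull route from the linked guide, each new instance runs a command at boot that fetches a playbook repository and applies it locally. A minimal sketch of building that command follows; the repository URL is a placeholder and the helper name is hypothetical, while `--url`, `--checkout`, and the `local.yml` fallback playbook are standard `ansible-pull` behaviour.

```python
import shlex
from typing import Optional


def ansible_pull_command(repo_url: str, playbook: str = "local.yml",
                         branch: Optional[str] = None) -> str:
    """Build the shell command a freshly started instance would run."""
    argv = ["ansible-pull", "--url", repo_url]
    if branch is not None:
        # Check out a specific branch, tag, or commit of the repository.
        argv += ["--checkout", branch]
    argv.append(playbook)
    # Quote each part so the result is safe to paste into a boot script.
    return " ".join(shlex.quote(part) for part in argv)


# Placeholder URL, for illustration only.
print(ansible_pull_command("https://example.com/infra.git"))
```

The same command can also be run while baking an AMI, so that only the code deployment remains for boot time.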

How to schedule jar file as event on aws EC2

I wrote a Talend ETL job and packaged it into an executable JAR file.
I want to schedule this JAR file to run once every week.
I am using Amazon AWS. Can it achieve what I want? If yes, what steps should I follow? If not, what are the alternatives?
If you are using Linux, I recommend the traditional cron service provided by Unix systems.
Steps
Edit /etc/crontab
Write a schedule entry and a shell script to kick off the Talend job
Please be careful with the timezone setting.
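The steps above can be sketched as follows. This is only an illustration: the paths `/opt/etl/talend_job.jar` and `/opt/etl/run_talend_job.sh`, the choice of Sunday at 03:00, and the helper names are assumptions. Entries in /etc/crontab carry a user field between the schedule and the command, and cron evaluates the schedule in the machine's local timezone.

```python
def wrapper_script(jar_path: str) -> str:
    """Return a small shell script that launches the executable JAR."""
    return f"#!/bin/sh\nexec java -jar {jar_path}\n"


def weekly_crontab_line(command: str, user: str = "root",
                        weekday: int = 0, hour: int = 3,
                        minute: int = 0) -> str:
    """Build an /etc/crontab line that runs `command` once a week.

    Field order: minute hour day-of-month month day-of-week user command.
    weekday uses cron numbering, where 0 (and 7) mean Sunday.
    """
    if not 0 <= minute <= 59:
        raise ValueError("minute must be in 0..59")
    if not 0 <= hour <= 23:
        raise ValueError("hour must be in 0..23")
    if not 0 <= weekday <= 7:
        raise ValueError("weekday must be in 0..7")
    return f"{minute} {hour} * * {weekday} {user} {command}"


# Hypothetical paths, for illustration only.
print(wrapper_script("/opt/etl/talend_job.jar"))
print(weekly_crontab_line("/opt/etl/run_talend_job.sh"))
```

The wrapper script has to be made executable (for example with `chmod +x`) before cron can run it.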
Alternatives
Wikipedia - List of job scheduler software
I would personally recommend putting your JAR in a Docker container and running it on an EC2 instance on AWS. A cron job will work, but what happens if you are not signed in when your task is scheduled to execute? If you put the JAR in the cloud, it will always execute.

Configuring AWS cluster using automation script

We are looking for an automation script that we can tell how many master and data nodes we need and that would then configure a cluster, probably taking the credentials from a properties file.
Currently our approach is to log in to the console and configure the Hadoop cluster manually. It would be great if there were an automated way to do this.
I've seen this done very nicely using Foreman, Chef, and Ambari Blueprints. Foreman was used to provision the VMs, Chef scripts were used to install Ambari, configure the Ambari blueprint, and to create the cluster using the Blueprint.

Running a script on an AWS server

I have a script that I need to run once a day that requires a lot of memory. I would like to run it on a dedicated amazon box.
Is there some automated way to build a box, download all the required software (like Ruby), and then run my script? After the script has run, I would like to shut down the box.
The two options I can think of are:
Hacking EMR to do this (my script would be a mapper against an empty directory).
Chef - this seemed like too much for one simple script.
You can accomplish setting up a new EC2 instance on startup using the official Ubuntu AMIs, the official Amazon Linux AMIs, and any other AMI that supports the concept of a user-data script.
Create a script (bash, Perl, Python, whatever) that starts with #!
Pass this script as the user-data when running the EC2 instance.
The script will automatically be run as root on the first boot.
Here's the article where I introduced the concept of a user-data script:
Automate EC2 Instance Setup with user-data Scripts
http://alestic.com/2009/06/ec2-user-data-scripts
Your user-data script can install the required software, configure it, install your work script, and set up a cron job that runs the work script once a day.
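A minimal sketch of generating such a user-data script is shown below. It assumes an Ubuntu AMI (hence `apt-get`), and the install path, cron file name, 04:00 schedule, and function name are hypothetical choices made for the example rather than anything prescribed by the article.

```python
from typing import List


def build_user_data(packages: List[str], work_script: str,
                    install_path: str = "/usr/local/bin/daily-job") -> str:
    """Return a bash user-data script that sets up a daily job.

    The script installs packages, writes the work script to disk,
    and registers a cron entry that runs it once a day at 04:00.
    """
    lines = ["#!/bin/bash", "set -e"]
    if packages:
        lines.append("apt-get update")
        lines.append("apt-get install -y " + " ".join(packages))
    lines += [
        # Quoted heredoc delimiter: the work script is written verbatim.
        f"cat > {install_path} <<'WORK_SCRIPT_EOF'",
        work_script.rstrip("\n"),
        "WORK_SCRIPT_EOF",
        f"chmod +x {install_path}",
        # Files in /etc/cron.d use the crontab format with a user field.
        f"echo '0 4 * * * root {install_path}' > /etc/cron.d/daily-job",
    ]
    return "\n".join(lines) + "\n"


print(build_user_data(["ruby"], "#!/usr/bin/env ruby\nputs 'working'\n"))
```

The resulting text would then be supplied as the instance's user-data at launch, for example through the `UserData` parameter of boto3's `run_instances` or the `--user-data` option of `aws ec2 run-instances`.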
ENHANCEMENT:
If the script doesn't take a long time to run (e.g., under an hour or a few), then you don't even have to keep a dedicated instance running 24 hours a day. You can instead use an approach that lets AWS start an instance for you on a regular schedule.
Here's an article I wrote that provides details on this approach with sample commands:
Running EC2 Instances on a Recurring Schedule with Auto Scaling
http://alestic.com/2011/11/ec2-schedule-instance
The general approach is to use Auto Scaling to start an instance with your user-data script on a regular schedule. Your job will terminate the instance when it has completed. The key is to suspend Auto Scaling's normal desire to restart instances that terminate, so that you don't pay for a running instance until the next time your job starts.

How to sync my EC2 instance when autoscaling

When autoscaling the EC2 instances for my application, what is the best way to keep every instance in sync?
For example, there are custom settings and application files like the ones below:
Apache httpd.conf
php.ini
PHP source for my application
To get autoscaling working, all of these must be configured identically on every EC2 instance, and I want to know the best practice for keeping them in sync.
You could use a private AMI which contains scripts that install software or check out the code from SVN, etc. The second possibility is to use a deployment framework like Chef or Puppet.
The way this works with Amazon EC2 is that you can pass user-data to each instance, generally a script of some sort that runs commands, e.g. for bootstrapping. As far as I can see, CreateLaunchConfiguration allows you to define that as well.
If running this yourself is too much of an obstacle, I'd recommend a service like:
Scalarium
RightScale
Scalr (also open source)
They all offer some form of scaling.
HTH

Resources