Recommendation for an EC2 Instance Stack

Our development servers are on Amazon's EC2.
We would ideally like the following:
PHP 5.3.x
Oracle Drivers
mHash
mCrypt
Apache
Does anyone have a recommendation on a good place to get a stack that would meet most of those needs with a minimum of additional configuration?

For updated base AMIs you can check the Canonical Ubuntu AMIs listed on Eric Hammond's site: www.alestic.com.
The Amazon Linux AMI has also been released if you prefer a CentOS-style distro: http://bit.ly/a5fcz3
For security reasons, I suggest you build your own LAMP stack.
Of course, there are many existing LAMP AMIs you can find.

We are bootstrapping our instances using Chef and a service called Scalarium. Depending on what we set up and configure, it takes up to 8 minutes for an instance to become operational.
Feel free to check out my Chef recipes, specifically the one for php-fpm:
http://github.com/till/easybib-cookbooks/tree/master/php-fpm/
I'm also working on a custom php5 Debian package to speed up the PHP installation.

Related

Common APIs to launch EC2 and OpenStack instances

At work we use Amazon Linux EC2 instances for production purposes. For our internal dev setup we use OpenStack CentOS instances.
I want to build a common CLI or expose REST APIs to start and stop instances on both of these clouds (I already have machine images). I understand I can use any of the common SDKs (I plan to use Go) and build this.
Recently, I came across this. I am just wondering if such a thing is already available, or does the above repo mean something else? There are also some other articles that mention EC2 support for OpenStack; I am not sure if that means the same as what I want to achieve.
Nova already has some compatibility with EC2 command-line clients. What you have linked to expands on that to include some network functions (VPC, etc.), and OpenStack Heat is compatible with some AWS CloudFormation templates.
Have you looked at euca2ools? That client was developed by the Eucalyptus cloud project and is compatible with both AWS and Nova's EC2 API.
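For the EC2 half, the start/stop calls are thin wrappers over the API whichever SDK you choose; here is a minimal sketch using Python and boto3 (the same two calls exist in aws-sdk-go for the planned Go version). The instance ID and region are placeholders.

# start_stop_ec2.py - minimal EC2 start/stop sketch using boto3.
# The instance ID and region are placeholders; credentials come from the
# usual boto3 sources (env vars, ~/.aws/credentials, or an instance profile).
import boto3

INSTANCE_ID = "i-0123456789abcdef0"  # placeholder

ec2 = boto3.client("ec2", region_name="us-east-1")

def start(instance_id):
    """Start a stopped instance and wait until it is running."""
    ec2.start_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])

def stop(instance_id):
    """Stop a running instance and wait until it is stopped."""
    ec2.stop_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

if __name__ == "__main__":
    start(INSTANCE_ID)

On the OpenStack side the same two verbs are available through novaclient or the OpenStack SDK (or, as noted above, through Nova's EC2-compatible endpoint), so a wrapper CLI really only has to dispatch these two operations per cloud.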

In a Vagrant/Ansible setup, who is responsible for starting servers (Node.js, Rails)?

Our infrastructure is getting pretty complex with many moving pieces, so I'm setting up Vagrant with Ansible to spin up development environments.
My question is: who (Vagrant, Ansible, or another tool) should be responsible for starting services such as
rails s (for starting the Rails server)
nginx
Node.js (for a separate API)
I think the answer you're looking for is Ansible (or another configuration management tool).
Vagrant is capable of running scripts and starting services, but once you add a configuration management tool, that tool should do exactly that; starting and managing services is part of its job.
You want the same application configuration regardless of the machine you're spinning up (ESXi, Amazon EC2, Vagrant, whatever), and the best way to do that is outside of Vagrant.

Continuous deployment & AWS autoscaling using Ansible (+ Docker?)

My organization's website is a Django app running on front-end web servers plus a few background processing servers in AWS.
We're currently using Ansible for both:
system configuration (from a bare OS image)
frequent manually-triggered code deployments.
The same Ansible playbook is able to provision either a local Vagrant dev VM, or a production EC2 instance from scratch.
We now want to implement autoscaling in EC2, and that requires some changes towards a "treat servers as cattle, not pets" philosophy.
The first prerequisite was to move from a statically managed Ansible inventory to a dynamic, EC2 API-based one; that's done.
The next big question is how to deploy in this new world where throwaway instances come up and down in the middle of the night. The options I can think of are:
Bake a new fully-deployed AMI for each deploy, create a new AS launch config and update the AS group with that. Sounds very, very cumbersome, but also very reliable because of the clean-slate approach, and it will ensure that any system changes the code requires will be in place. Also, no additional steps are needed on instance boot, so it's up and running more quickly.
Use a base AMI that doesn't change very often, automatically get the latest app code from git upon bootup, start the web server. Once it's up, just do manual deploys as needed, like before. But what if the new code depends on a change in the system config (new package, permissions, etc.)? Looks like you have to start taking care of dependencies between code versions and system/AMI versions, whereas the "just do a full Ansible run" approach was more integrated and more reliable. Is it more than just a potential headache in practice?
Use Docker? I have a strong hunch it can be useful, but I'm not sure yet how it would fit our picture. We're a relatively self-contained Django front-end app with just RabbitMQ + memcache as services, which we're never going to run on the same host anyway. So what benefits are there in building a Docker image using Ansible that contains system packages + the latest code, rather than having Ansible just do it directly on an EC2 instance?
How do you do it? Any insights / best practices?
Thanks!
This question is very opinion-based. But just to give you my take, I would go with prebaking the AMIs with Ansible and then use CloudFormation to deploy your stacks with Auto Scaling, monitoring, and your pre-baked AMIs. The advantage of this is that if most of the application stack is pre-baked into the AMI, scaling up will happen faster.
Docker is another approach, but in my opinion it adds an extra layer to your application that you may not need if you are already using EC2. Docker can be really useful if, say, you want to containerize applications on a single server: maybe you have some spare capacity on a server, and Docker will let you run that extra application on the same server without interfering with the existing ones.
Having said that, some people find Docker useful not for squeezing more out of a single server but because it lets you pre-bake your applications into containers. When you deploy a new version or new code, all you have to do is copy/replicate those Docker containers across your servers, then stop the old container versions and start the new ones.
My two cents.
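As a rough illustration of the prebaked-AMI route, once Ansible has produced the new image, pointing the Auto Scaling group at it comes down to two API calls. A hedged sketch using boto3; the AMI ID, launch configuration name, group name, and instance details are all placeholders (CloudFormation can express the same thing declaratively):

# roll_new_ami.py - sketch: point an existing Auto Scaling group at a freshly baked AMI.
# All names and IDs below are placeholders.
import boto3

autoscaling = boto3.client("autoscaling", region_name="us-east-1")

NEW_AMI = "ami-0123456789abcdef0"   # output of the Ansible bake step
LC_NAME = "web-lc-v42"              # a new launch configuration per deploy
ASG_NAME = "web-asg"

# 1. Create a launch configuration that boots from the new AMI.
autoscaling.create_launch_configuration(
    LaunchConfigurationName=LC_NAME,
    ImageId=NEW_AMI,
    InstanceType="m3.medium",
    KeyName="deploy-key",
    SecurityGroups=["web-sg"],
)

# 2. Point the Auto Scaling group at it; instances launched from now on use the new AMI.
autoscaling.update_auto_scaling_group(
    AutoScalingGroupName=ASG_NAME,
    LaunchConfigurationName=LC_NAME,
)

# 3. Optionally cycle the old instances so the whole fleet converges on the new AMI.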
A hybrid solution may give you the desired result. Store the head Docker image in S3 and prebake the AMI with a simple fetch-and-run script that executes on start (or pass the script to a stock AMI via user data). Handle version control by moving the head image to your latest stable version; you could probably also implement test stacks of new versions by making the fetch script smart enough to identify which Docker image version to fetch based on instance tags, which are configurable at instance launch.
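A minimal version of that fetch-and-run step could look like the sketch below, written in Python on top of the Docker CLI; the bucket, key, and image tag are hypothetical, and the "head" object is simply whatever you repoint at your latest stable image on each release:

# fetch_and_run.py - run at boot (from an init script or user data).
# Downloads the current "head" image from S3, loads it, and starts the container.
# The bucket, key, and image tag are hypothetical placeholders.
import subprocess
import boto3

BUCKET = "my-docker-releases"
KEY = "app/head.tar"   # repointed to the latest stable image on each release

s3 = boto3.client("s3")
s3.download_file(BUCKET, KEY, "/tmp/app.tar")

# Load the image into the local Docker daemon and start it.
subprocess.check_call(["docker", "load", "-i", "/tmp/app.tar"])
subprocess.check_call([
    "docker", "run", "-d", "--restart=always",
    "-p", "80:8000",
    "myorg/app:latest",
])

Choosing a different key based on an instance tag (for test stacks) is a small extension: read the instance ID from the metadata service and call describe_tags before deciding which object to fetch.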
You can also use AWS CodeDeploy with Auto Scaling and your build server. We use the CodeDeploy plugin for Jenkins.
This setup allows you to:
perform your build in Jenkins
upload the build to an S3 bucket
deploy, one by one, to all the EC2 instances that are part of the assigned AWS Auto Scaling group.
All that with a push of a button!
Here is the AWS tutorial: Deploy an Application to an Auto Scaling Group Using AWS CodeDeploy
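For reference, the deployment the plugin triggers at the end of the pipeline boils down to a single CodeDeploy API call; here is a hedged boto3 sketch with placeholder application, group, bucket, and key names:

# trigger_codedeploy.py - kick off a CodeDeploy deployment of a bundle already in S3.
# The application, deployment group, bucket, and key names are placeholders.
import boto3

codedeploy = boto3.client("codedeploy", region_name="us-east-1")

response = codedeploy.create_deployment(
    applicationName="my-app",
    deploymentGroupName="my-app-asg-group",
    revision={
        "revisionType": "S3",
        "s3Location": {
            "bucket": "my-build-artifacts",
            "key": "my-app/build-123.zip",
            "bundleType": "zip",
        },
    },
)
print("Deployment started:", response["deploymentId"])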

Is there an Amazon community AMI for Hadoop/HBase?

I would like to test out Hadoop & HBase in Amazon EC2, but I am not sure how complicated it is. Is there a stable community AMI that has Hadoop & HBase installed? I am thinking of something like the Bioconductor AMI.
Thank you.
I highly recommend using Amazon's Elastic MapReduce service, especially if you already have an AWS/EC2 account. The reasons are:
EMR comes with a working Hadoop/HBase cluster "out of the box" - you don't need to tune anything to get Hadoop/HBase working. It Just Works(TM).
Amazon EC2's networking is quite different from what you are likely used to. It has, AFAIK, a 1-to-1 NAT where the node sees its own private IP address but connects to the outside world on a public IP. When you are manually building a cluster, this causes problems - even with software like Apache Whirr or BigTop that targets EC2 specifically.
An AMI alone is not likely to help you get a Hadoop or HBase cluster up and running - if you want to run a Hadoop/HBase cluster, you will likely have to spend time tweaking the networking settings etc.
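If you do go the EMR route, bringing up a small cluster with HBase preinstalled is a single API call. Here is a hedged sketch using boto3; the release label, instance types and count, and key name are placeholders, and it assumes the default EMR IAM roles already exist:

# launch_emr_hbase.py - sketch: start a small EMR cluster with Hadoop + HBase.
# The release label, instance types/count, and key name are placeholders;
# assumes the default EMR roles (EMR_DefaultRole / EMR_EC2_DefaultRole) exist.
import boto3

emr = boto3.client("emr", region_name="us-east-1")

response = emr.run_job_flow(
    Name="hbase-test",
    ReleaseLabel="emr-5.30.0",
    Applications=[{"Name": "Hadoop"}, {"Name": "HBase"}],
    Instances={
        "MasterInstanceType": "m4.large",
        "SlaveInstanceType": "m4.large",
        "InstanceCount": 3,
        "Ec2KeyName": "my-key",
        "KeepJobFlowAliveWhenNoSteps": True,  # keep the cluster running for interactive use
    },
    JobFlowRole="EMR_EC2_DefaultRole",
    ServiceRole="EMR_DefaultRole",
)
print("Cluster id:", response["JobFlowId"])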
To my knowledge there isn't, but you should be able to deploy easily on EC2 using Apache Whirr, which is a very good alternative.
Here is a good tutorial for doing this with Whirr; as the tutorial says, you should be able to get up and running in minutes!
The key is creating a recipe like this:
whirr.cluster-name=hbase
whirr.instance-templates=1 zk+nn+jt+hbase-master,5 dn+tt+hbase-regionserver
whirr.provider=ec2
whirr.identity=${env:AWS_ACCESS_KEY_ID}
whirr.credential=${env:AWS_SECRET_ACCESS_KEY}
whirr.hardware-id=c1.xlarge
whirr.image-id=us-east-1/ami-da0cf8b3
whirr.location-id=us-east-1
You will then be able to launch your cluster with:
bin/whirr launch-cluster --config hbase-ec2.properties

Usage of RightScale init scripts in its EC2 CentOS 5.4 AMI

I was searching for an EBS-backed CentOS 5.4 AMI among the community AMIs and eventually found a RightScale AMI (I think they call it RightImage).
Now I have created an instance using that AMI, but I found there is some RightScale stuff inside, which makes me worry about whether it is safe to use. These are the files I found in that AMI:
/etc/init.d/rightimage
/etc/init.d/rightlink
/etc/init.d/rightscale
/home/ec2
/home/s3sync
(there may be other files I haven't found yet)
I know I can look into the scripts and folders to see what they do, but since a lot of users here recommend the RightScale CentOS AMI on EC2, I'm hoping some gurus here already know what those scripts and folders do and can advise me:
i) whether it is safe to delete them (I'm mostly concerned about whether my data on the server will be safe using this AMI)
ii) whether there are any installed apps in the RightScale AMI that should be deleted
And if you know of another free EC2 CentOS AMI that is secure and solid, please suggest it as well. Thanks!
In order for RightScale to properly manage instances in EC2, they use a Ruby-based daemon called RightLink as the communication channel between their core platform and each instance that is launched. The init scripts you saw are required for the instance to configure itself to the point where it can be managed by RightScale properly.
/etc/init.d/rightimage is the first script that is run. Essentially it just determines the OS, arch version, and installs the correct RightLink package from the S3 bucket. Afterwards it kicks off the /opt/rightscale/bin/post_install.sh script which uses the OS init control tools to register the startup scripts to be invoked on future boots of the OS; this ensures that RightLink will always be started.
/etc/init.d/rightscale is the next script that is run. It initializes RightScale-specific (but not RightLink-specific) system state. It is responsible for caching launch settings (aka userdata) and metadata in /var/spool and installing any available patches to the RightLink agent.
/etc/init.d/rightlink is the final script that is run. It configures and enrolls the RightLink agent idempotently. If configuration and enrollment succeed, RightLink starts the sandboxed monit, which starts the persistent agent process. If you're not launching the AMI through the RightScale platform, it will never properly enroll because the platform isn't expecting it; as such, RightScale will have no communication with the instance at all.
Removing all three of these from the image shouldn't in any way harm its overall stability, and from a security standpoint they shouldn't cause any problems if left in place either.
If you have any further specific questions about it I'd suggest hopping on their forums at https://forums.rightscale.com/
You could also try #rightscale on freenode.
