Sphinx search setup on load balanced app servers - amazon-ec2

I would like to use Sphinx search on our site, which is hosted on an auto-scaled, load-balanced server farm with 1 LB, 2 DB, 2 APP, and 1 memcached server.
Given that Sphinx will be searching a site with over a million posts (a forum), are any of these ideas a recommended way to set up Sphinx search?
a: Set up an extra server (or put it on the memcached instance) and have the app servers pull results from that server.
b: Set up Sphinx search on the app servers and find a way to replicate the index.
c: Whatever other idea you can think of?

A) Try putting it on a separate server first; if it does not take a lot of resources, you can move it to the memcached server.
B) Replicating the indexes would effectively mean rsyncing them. If so, you would need to restart the search daemon after every sync, so I would not suggest it.
I would go with A.
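For option A, a minimal sketch of the searchd side of the config (the private IP 10.10.1.20 and the file paths are made up for illustration):
searchd
{
    listen    = 10.10.1.20:9312    # accept connections from the app servers over the private network
    log       = /var/log/sphinxsearch/searchd.log
    query_log = /var/log/sphinxsearch/query.log
    pid_file  = /var/run/sphinxsearch/searchd.pid
}
On each app server, the Sphinx client (for example PHP's SphinxClient::SetServer) would then point at 10.10.1.20:9312 instead of a local daemon.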

Related

Laravel Application & Load balancer

I have hosted my application on AWS, with a load balancer running in front of two instances served by Nginx and PHP 7.0-FPM. Let's say my application downloads a file and stores it locally so that the contents can be served to customers. With an Auto Scaling group configured for two instances:
1) If my session begins on instance-1, where the file gets downloaded, and then suddenly switches over to instance-2, will I get the same content?
Or
2) If a session is created on a single instance, will the same instance be used until I log out of my application?
Any help is much appreciated!!
For a load-balanced website with more than one instance, it is highly recommended that you store cache and sessions in one place rather than spread across instances. For this, you can install memcached and configure all of your instances to point to the single server that stores it all.
SESSION_DRIVER=memcached
CACHE_DRIVER=memcached
MEMCACHED_HOST=127.0.0.1 #on your memcache server, point to localhost
MEMCACHED_HOST=10.10.1.10 #on other instances, point to memcache server
MEMCACHED_PORT=11211
For file and image uploads, use AWS S3 or a dedicated storage server (e.g. over FTP), so that all the servers can access the files directly and in the same way. Easiest and most efficient :)
If you store them locally, your servers won't be synced with the same content, and your users will end up with 404s.
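A rough sketch of the S3 side in Laravel's .env (variable names as in recent default config/filesystems.php; older versions use slightly different AWS key names, newer ones call the first key FILESYSTEM_DISK; the region and bucket here are placeholders):
FILESYSTEM_DRIVER=s3
AWS_ACCESS_KEY_ID=your-key
AWS_SECRET_ACCESS_KEY=your-secret
AWS_DEFAULT_REGION=eu-west-1
AWS_BUCKET=your-app-uploads
Uploads then go through Storage::disk('s3'), so every instance behind the load balancer reads and writes the same files.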

Development and production on the same vps server?

I'm currently using a VPS plan at vpsdime.com as my development server. I move around a lot and use different computers, so I didn't want to develop locally.
Soon, I'll be able to launch my web app (approx. 5-10 users to start with). Should I simply install my production app on the same VPS, or would you advise getting another server? Why?
You can safely use the same server. Just make sure everything is separated per environment:
Different Redis database
Different MySQL database
Different Elasticsearch server
Different location to store session data
Different caching location
Different queues (Redis/Beanstalk, ...)
Different AWS bucket
Different ... you get the gist.
It should be straightforward to set up different vhosts with Apache or Nginx.
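For example, a bare-bones sketch of two Nginx server blocks, one per environment (domains and paths are hypothetical, and PHP handling is omitted):
server {
    listen 80;
    server_name myapp.example.com;        # production
    root /var/www/myapp-prod/public;
}
server {
    listen 80;
    server_name dev.myapp.example.com;    # development
    root /var/www/myapp-dev/public;
}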

Running multiple elasticsearch instances

I need to set up 2 Elasticsearch instances:
one for Kibana logs (my separate application will throw logs at it)
one for search for my production application
My plan is to create separate folders with Elasticsearch in each. They don't talk to each other, which means they are separate databases, and if one goes down, the other still runs. Is this a good solution, or should I use a single Elasticsearch folder with multiple elasticsearch.yml configuration files? What is the best practice for multiple Elasticsearch instances?
The best practice is to NOT run two Elasticsearch instances on the SAME server.
Your production search will probably need a lot of RAM to work fast and stay responsive. You don't want your logging system interfering with that.
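If you do end up running both on one box anyway, a rough sketch of keeping them fully separate (flags as in Elasticsearch 5+; ports, paths, and names are made up):
./bin/elasticsearch -d -Ecluster.name=app-search -Ehttp.port=9200 -Epath.data=/var/lib/es-search -Epath.logs=/var/log/es-search
./bin/elasticsearch -d -Ecluster.name=kibana-logs -Ehttp.port=9201 -Epath.data=/var/lib/es-logs -Epath.logs=/var/log/es-logs
Separate data paths, ports, and cluster names keep them from joining each other, but they still compete for RAM and disk, which is the reason to prefer separate servers.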

How do I run my application code (PHP) across my various Amazon EC2 instances?

I've been trying to get to grips with Amazon's AWS services for a client. As is evidenced by the very n00bish question(s) I'm about to ask, I'm having a little trouble wrapping my head around some very basic things:
a) I've played around with a few instances and managed to get LAMP working just fine. The problem I'm having is that the code I place in /var/www doesn't seem to be shared across those machines. What do I have to do to achieve this? I was thinking of a shared EBS volume and changing Apache's document root?
b) Furthermore, what is the best way to upload code and assets to an EBS/S3 volume? Should I set up an instance to handle FTP to the aforementioned shared volume?
c) Finally, I have a basic plan for the setup that I wanted to run by someone who actually knows what they are talking about:
DNS pointing to the load balancer (AWS Elastic Load Balancing).
Load Balancer managing multiple AWS EC2 instances.
EC2 instances sharing code from a single EBS store.
An RDS instance to handle database queries.
CloudFront to serve assets directly to the user.
Thanks,
Rich.
Edit: my solution, for anyone who comes across this on Google.
Please note that my setup is not finished yet, and the bash scripts I'm providing in this explanation are probably not very good: even though I'm very comfortable with the command line, I have no experience of scripting in bash. However, it should at least show you how my setup works in theory.
All AMIs are Ubuntu Maverick i386 from Alestic.
I have two AMI snapshots:
Master
  Users:
    git - Very limited access; runs git-shell, so it can't be used for a regular SSH session, but it hosts a Git repository which can be pushed to or pulled from.
    ubuntu - Default SSH account, used to administer the server and deploy code.
  Services:
    Simple Git repository hosting via SSH.
    Apache and PHP; databases are hosted on Amazon RDS.
Slave
  Services:
    Apache and PHP; databases are hosted on Amazon RDS.
Right now (this will change), this is how I deploy code to my servers:
1. Merge changes into the master branch on my local machine.
2. Stop all slave instances.
3. Use Git to push the master branch to the master server.
4. Log in as the ubuntu user via SSH on the master server and run a script which does the following:
   - Exports (git archive) the code from the local repository to a folder.
   - Compresses the folder and uploads a backup of the code to S3, with a timestamp attached to the file name.
   - Replaces the code in /var/www/ with the exported folder and gives it the appropriate permissions.
   - Removes the exported folder from the home directory but leaves the compressed file containing the latest code intact.
5. Start all slave instances. On startup they run a script that does the following:
   - Apache does not start until it's triggered.
   - Uses scp (secure copy) to copy the latest compressed code from the master to /tmp/www.
   - Extracts the code, replaces /var/www/, and gives it the appropriate permissions.
   - Starts Apache.
I would provide code examples, but they are very incomplete and I need more time. I also want all my assets (css/js/img) to be automatically pushed to S3 so they can be distributed to clients via CloudFront.
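A very rough sketch of the master-side deploy script described in step 4, assuming a bare repository at /home/git/app.git and an S3 bucket called my-code-backups (both hypothetical):
#!/bin/bash
set -e
STAMP=$(date +%Y%m%d%H%M%S)
# Export the master branch from the bare repository into a working folder.
mkdir -p ~/deploy
git --git-dir=/home/git/app.git archive --format=tar master | tar -xf - -C ~/deploy
# Compress it and keep a timestamped backup on S3.
tar -czf ~/code-$STAMP.tar.gz -C ~/deploy .
s3cmd put ~/code-$STAMP.tar.gz s3://my-code-backups/
# Replace the live document root and fix permissions.
sudo rsync -a --delete ~/deploy/ /var/www/
sudo chown -R www-data:www-data /var/www/
# Clean up the export but keep the latest archive for the slaves to scp down.
rm -rf ~/deploy
cp ~/code-$STAMP.tar.gz ~/latest-code.tar.gz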
EBS is like a hard drive you can attach to one instance, basically a 1:1 mapping. S3 is the only shared storage in AWS; otherwise you will need to set up an NFS server or similar.
What you can do is put all your PHP files on S3 and then sync them down to a new instance when you start it.
I would recommend bundling a custom AMI with everything you need installed (Apache, PHP, etc.) and setting up a cron job to sync the PHP files from S3 to your document root. Your workflow would be: upload files to S3, and let the server's cron job sync them down.
The rest of your setup seems pretty standard.
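A sketch of that cron idea (bucket and paths are made up; s3cmd is used here, and the newer AWS CLI's aws s3 sync works the same way):
# /etc/cron.d/code-sync - pull the latest PHP files from S3 every 5 minutes
*/5 * * * * root s3cmd sync --delete-removed s3://my-app-code/ /var/www/html/
Note that the cron.d format includes the user field (root) between the schedule and the command.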

EC2 database server failover strategy

I am planning to deploy my web app to EC2. I have several webserver instances. I have 1 primary database instance. I have 1 failover database instance. I need a strategy to redirect the webservers to the failover database instance IP when the primary database instance fails.
I was hoping I could use an Elastic IP in my connection strings, but the webservers are not able to access/ping the Elastic IP. I have several brute-force ideas to solve the problem; however, I am trying to find the most elegant solution possible.
I am using all .Net and SQL Server. My connection strings are encrypted.
Does anybody have a strategy for failing over a database instance in EC2 using some form of automation or DNS configuration?
Please let me know.
http://alestic.com/2009/06/ec2-elastic-ip-internal tells you how to use the Elastic IP's public DNS name.
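The short version: from inside EC2, the Elastic IP's public DNS name resolves to the instance's internal address, so you can put that hostname (rather than the IP) in your connection strings. A quick check from one of the webservers (the hostname below is made up):
dig +short ec2-203-0-113-10.compute-1.amazonaws.com
# returns the 10.x.x.x private address of the database instance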
Haven't used EC2 but surely you need to either:
(a) put your front end into some custom maintenance mode that you define while you switch the IP over, and have the front end perform the steps required to manage potential data-integrity and data-loss issues (from the previous server going down and the new server coming up) as it enters and leaves that maintenance mode;
OR, for a zero-downtime system:
(b) design the system at the object/relational and transaction levels from the ground up to support zero-downtime failover. It's not something you can bolt on quickly to just any application.
(c) use some database support for automatic failover. I am not aware whether SQL Server failover support suitable for your application exists or is appropriate here. I suggest adding a "sql-server" tag to the question to reach the right audience.
If Elastic IPs don't work (which sounds odd, to say the least; shouldn't you talk to EC2 about that?), you may have to be able to tell your front end which new database IP to use at the same time as telling it to go from maintenance mode to normal mode.
If you're willing to shell out a bit of extra money, take a look at RightScale's tools; they've built custom server images and supporting tools that handle database failover (among many other things). This link explains how to do it with MySQL, so it will hopefully show you some principles even though it doesn't use SQL Server.
I always thought there was this possibility in the connection string.
This is taken (but not yet tested) from How to add Failover Partner to a connection string in VB.NET:
If you connect with ADO.NET or the SQL Native Client to a database that is being mirrored, your application can take advantage of the driver's ability to automatically redirect connections when a database mirroring failover occurs. You must specify the initial principal server and database in the connection string, as well as the failover partner server.
Data Source=myServerAddress;Failover Partner=myMirrorServerAddress;
Initial Catalog=myDataBase;Integrated Security=True;
There are of course many other ways to write the connection string using database mirroring; this is just one example pointing out the failover functionality. You can combine this with the other connection string options available.
To broaden gareth's answer, cloud management software usually solves this type of problem. RightScale is one option, but you can also try enStratus or Scalr (disclaimer: I work at Scalr). These tools provide failover solutions like:
Backups: you can schedule automated snapshots of the EBS volume containing the data
Fault-tolerant database: in the event of failure, a slave is promoted to master, and the mounted storage is switched over if the failed master and the new master are in the same AZ, or a snapshot of the volume is taken otherwise.
If you want to build your own solution, you could replicate the process detailed below that we use at Scalr:
1. Is there a slave in the same AZ? If so, promote it, switch the EBS volumes (which are limited to a single AZ), switch any Elastic IP you might have, and reconfigure replication on the remaining slaves (see the command-level sketch below).
2. If not, is there a slave fully replicated in another AZ? If so, promote it, then do the above.
3. If there is no slave in the same AZ and no slave fully replicated in another AZ, create a snapshot from the master's volume, use that snapshot to create a new volume in an AZ where a slave is running, and then do the above.
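A command-level sketch of the same-AZ case from step 1 (volume, instance, and allocation IDs are made up, and the MySQL slave promotion itself is omitted):
# Detach the data volume from the failed master and attach it to the promoted slave.
aws ec2 detach-volume --volume-id vol-0123456789abcdef0
aws ec2 attach-volume --volume-id vol-0123456789abcdef0 --instance-id i-0abc123def4567890 --device /dev/sdf
# Move the Elastic IP so the app servers keep connecting to the same address.
aws ec2 associate-address --instance-id i-0abc123def4567890 --allocation-id eipalloc-0123456789abcdef0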