I am using Cloud Foundry and I have deployed my Spring Boot application there. Whenever an update/upgrade happens on Cloud Foundry, my application gets restarted and some requests fail to reach it, because the application takes a while to come back up.
Is there any way in CF to keep some instances of the application running during an upgrade/restart so that they can continue to process requests?
I also want to know whether CF provides service from different locations/regions. For example, suppose my application is deployed on two CF containers in different regions. When an update/upgrade is available, it could be applied to CF in one region while the CF service in the other region stays available, so some application instances keep serving requests, and vice versa.
-Thank you.
What you're describing is the intended behavior of CF.
If you have two or more instances of your application, they should never both go down at the same time: one will be taken down and, after it has restarted successfully, the other will be taken down and restarted.
If your operator has configured multiple availability zones for the foundation that you've targeted, then application instances will be distributed across those AZs to help facilitate HA and best possible availability.
If you're not seeing this behavior then you should take a look at the following as these items can affect uptime of your apps:
Do you have more than one application instance? If you only have one application instance, then you can expect to see some small windows of downtime when updates are applied to the foundation and under other scenarios. This happens because at times Diego will need to evict applications running on a Diego Cell. It makes an attempt to start your app on another Cell before stopping the current instance, but there are no guarantees around this. Thus you can end up with some downtime if, for example, your app is slow to start or your app does not have a good health check configured (e.g. it passes the health check before the app is really up).
Did your operator set up multiple AZs? As a developer, you cannot really tell; this is abstracted away, so you would need to ask your platform operations team whether there is more than one and, if so, how many. For best possible uptime, have at least as many app instances as you have AZs (a minimal example is sketched below).
The other thing often overlooked: does your application depend on any services? If so, it is also possible that you will see downtime when those services are being updated. It all depends on the services you are using and whether management and upgrades of those services involve downtime. You may be able to tell if this is the case by looking more closely at your application logs when requests fail, to see if there are connection failures or similar errors. You might also be able to tell by looking at the plan defined in the CF Marketplace; often the description will say if there are stipulations regarding the plan, like whether or not it is clustered or HA.
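As a minimal sketch of the developer-side pieces above (the app name and health check endpoint are assumptions; Spring Boot's /actuator/health endpoint requires Actuator on the classpath), the manifest.yml might look like:

    applications:
    - name: my-app
      instances: 3                               # run several instances; ideally at least as many as there are AZs
      health-check-type: http                    # report healthy only once the app actually answers HTTP
      health-check-http-endpoint: /actuator/health

You can also change the instance count on a running app with cf scale my-app -i 3.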
UPDATE
One other thing which can cause downtime:
If your operator has the "max in flight" value too high for the number of Diego Cells this can also cause downtime. Essentially, "max in flight" dictates how many Diego Cells will be taken out of service during an upgrade. If this value is too high, you can run into a situation where there is not enough capacity in the remaining Cells to host all of your applications. This ends up resulting in downtime for app instances as they cannot be rescheduled on another Cell in a timely manner. As a developer, I don't think this is something you can troubleshoot, you would need to work with your platform operators to investigate further.
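For context, this knob lives in the update block of the operator's BOSH deployment manifest, roughly like the following (the values here are illustrative only, not a recommendation):

    update:
      canaries: 1
      max_in_flight: 1                 # how many Diego Cells may be updated (and drained) at once
      canary_watch_time: 30000-300000
      update_watch_time: 30000-300000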
That is probably a theme here. If you are an app developer, you should be talking to your platform operations team to debug this.
Hope that helps!
For a pure Rack app running on a Heroku hobby dyno with a Heroku Postgres hobby dev add-on, how do you know how many workers & threads to configure Puma to have?
Based on this article it seems like you'd be safe running 2-4 processes within a single Heroku web dyno depending on your memory usage. For threads, I'd stick to the default (5) depending on the needs of your app.
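For reference, a minimal config/puma.rb along those lines might look like this (a sketch; the environment variables and their defaults are assumptions you would tune for your app):

    # config/puma.rb
    workers Integer(ENV.fetch("WEB_CONCURRENCY", 2))       # 2-4 processes, depending on memory usage
    threads_count = Integer(ENV.fetch("MAX_THREADS", 5))   # Puma's default of 5 threads
    threads threads_count, threads_count
    preload_app!                                            # copy-on-write memory savings across workers
    port ENV.fetch("PORT", 3000)                            # Heroku injects PORT

Start it with the usual web: bundle exec puma -C config/puma.rb line in your Procfile.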
I'd recommend tuning your app to use a particular config then keeping an eye on the Heroku logs for a few days to see if you get too many R14 errors. At that point, you know you've exhausted the dyno and should scale it back.
It depends greatly on how memory-hungry your application is. Given that it's a pure Rack app and most of the literature out there are for Rails apps - I'd imagine that your optimal values are higher.
The Librato add-on is really helpful here in letting you see your memory usage in near real-time, so you can quickly tweak and monitor how close you are to the 512MB limit. There's a free tier there, and it doesn't need any additional instrumentation either (I'm not affiliated with them in any way, but we do use their service!)
How would I go about setting up a backup on a VPS like Linode (using nginx/unicorn) for when Heroku has downtime?
Essentially very simply, but also with a whole world of hurt.
Simply create an instance of your application on said VPS (a minimal nginx-to-unicorn sketch follows at the end of this answer).
Then you need to ensure that you're able to flip your DNS from Heroku to said VPS without waiting for a TTL to expire, or have some way of letting the world know your application has moved.
Then figure out a reliable way of ensuring that the code in both environments is exactly the same, and works on both of the different server setups.
Then figure out how you can keep the data up to date in both environments so that when you do need to flip, the data will be the same in both environments.
Then you need to figure out a way to remind yourself to keep this secondary VPS up to date from a server management point of view. Software updates, security patches etc etc.
Then you need to figure out a way to be notified when Heroku is down, 24/7.
Then you need to hope that when Heroku is down, Linode isn't.
... or just accept that any host will go down, and it can cost a hell of a lot of money to ensure that your site doesn't. To be honest, it's probably better for you to look at some sort of hosting setup that allows redundancy and failover across several locations (which won't be cheap)
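For the VPS end of the first step, the usual nginx front end for unicorn is a small config along these lines (a sketch; paths, the socket location, and the server name are assumptions):

    upstream unicorn {
      # unicorn listening on a unix socket on the same box
      server unix:/var/www/myapp/shared/unicorn.sock fail_timeout=0;
    }

    server {
      listen 80;
      server_name example.com;
      root /var/www/myapp/current/public;   # serve static files directly

      location / {
        try_files $uri @unicorn;            # everything else goes to the app
      }

      location @unicorn {
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Host $http_host;
        proxy_redirect off;
        proxy_pass http://unicorn;
      }
    }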
There are third-party services which can keep (parts of) your site up if your server goes down - at least it appears to the user that your site is up, even though it isn't working properly behind the scenes. CloudFlare is one such service. It sits in front of your site/application and performs its magic (quite simply). It works with static/dynamic sites - and if your server goes offline, it can serve the static parts of your site. See http://support.cloudflare.com/kb/what-do-the-various-cloudflare-settings-do/what-does-enabling-cloudflare-offline-browsing-do
I was curious if anyone has experimented with auto-scaling the web or DB tier in EC2 or other cloud computing infrastructure? It seems theoretically possible, but I am curious what the practical limitations are/may be.
Thanks!
We are also starting to look at auto-scaling.
The first candidate approach is to use Amazon's ELB (Elastic Load Balancer) and CloudFront. However, our traffic is a web service. Callers frequently send the 100-Continue HTTP message, and ELB cannot understand that message. There's no word yet from Amazon on when that might be fixed. Also, there are a number of complaints in the Amazon forums about ELB not handling heavy load.
lighttpd 1.5 looks like a promising partial solution, in that it can detect when an instance is not functioning and transparently take it out of the rotation, and it can be dynamically reconfigured without restarting the load balancer.
There are a number of commercial solutions as well. We will probably have a look at RightScale.
This is more of a question than an answer, but I'm about to start experimenting with autoscaling myself (most likely using the Amazon CloudFront facilities) and am thinking that instance startup time will be a factor. I've noticed that a new EC2 instance can take from 5 to 20 minutes to start up, so it's not as if you can instantly add more capacity when your load increases; it seems like you would need one or more idle instances to be running and ready to pick up increased load.
Late addition:
Consider SimpleDB as well... this would eliminate the DB scaling side.
For autoscaling, we rolled our own scripts to monitor, launch, and provision servers, and yes, the whole process takes about 7 minutes. We do a little predictive analysis to guess when new servers will be needed, and then just tear them down if they aren't. Total cost: ~10 cents.
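To give a flavour of what such a home-grown script involves, here is a stripped-down sketch using today's boto3 SDK (the region, AMI ID, instance type, and CPU threshold are all assumptions; this is illustrative, not the poster's actual script):

    import boto3
    from datetime import datetime, timedelta, timezone

    REGION = "us-east-1"                      # assumed region
    AMI_ID = "ami-0123456789abcdef0"          # placeholder AMI of a pre-provisioned app server
    INSTANCE_TYPE = "t3.small"                # assumed instance type
    CPU_SCALE_UP = 70.0                       # add capacity above 70% average CPU

    ec2 = boto3.client("ec2", region_name=REGION)
    cloudwatch = boto3.client("cloudwatch", region_name=REGION)

    def average_cpu(instance_id, minutes=10):
        """Average CPUUtilization for one instance over the last `minutes` minutes."""
        end = datetime.now(timezone.utc)
        stats = cloudwatch.get_metric_statistics(
            Namespace="AWS/EC2",
            MetricName="CPUUtilization",
            Dimensions=[{"Name": "InstanceId", "Value": instance_id}],
            StartTime=end - timedelta(minutes=minutes),
            EndTime=end,
            Period=300,
            Statistics=["Average"],
        )
        points = stats["Datapoints"]
        return sum(p["Average"] for p in points) / len(points) if points else 0.0

    def launch_worker():
        """Bring one more application server online."""
        resp = ec2.run_instances(ImageId=AMI_ID, InstanceType=INSTANCE_TYPE,
                                 MinCount=1, MaxCount=1)
        return resp["Instances"][0]["InstanceId"]

    def terminate_worker(instance_id):
        """Tear a server back down once it is no longer needed."""
        ec2.terminate_instances(InstanceIds=[instance_id])

The monitoring loop and the predictive piece sit on top of these three calls.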
Also, Scalr looks promising as a commercial solution (haven't used it).
Chad
I'm looking for a cost-effective tool for managing a web app on EC2. RightScale seems to be the big dog and charges for it. Scalr looks like a more cost-effective solution, but it's hard to find any real customer experiences.
The key aspects I'm looking for are a load balancer (HTTP and HTTPS) and a way to automatically bring additional web server capacity online as load increases, as well as terminate the instances when load falls off.
From what I can tell, lots of people are rolling their own stuff here. We're trying to release an app and don't really want to have to fight too many heavy sysadmin battles. Given the importance of performance etc., I'd be grateful to hear advice and experiences from the field on this.
I am a Scalr user, a Scalr.net subscriber, and have become a Scalr enthusiast. I cannot possibly afford Rightscale.
Scalr can do what you ask.
Scalr has three images (each with 32/64 bit versions), plus a base (generic) image:
1) A load balancer image, running nginx. A highly available setup requires two of these. Scalr will manage your nameservice, and round robin between them. If one goes down, Scalr will remove it from DNS and bring up another instance. It is possible to run other load balancers, but nginx is the default.
2) Several application server images are available, running Apache/Tomcat/Rails. You set up your application here, be it PHP/Perl/Python/Java/Ruby/whatever. nginx routes requests between these instances, grouped by unique user (based on IP + browser). Scalr monitors these for upness too, and replaces broken instances.
3) A MySQL database image, with automatic master/slave replication. Just deploy your schema, and Scalr handles replication and replaces defunct servers. It will also backup your data periodically. Scalr's DNS provides master and slave hostnames, so you can have your app read from the slaves and write to the master.
All of these instance types will auto-scale based on load. You start with the base image closest to what you're doing, and then you customize them for your application. For instance, we deploy our Perl/Catalyst app on the apache server instances but we serve static content from the nginx front-end servers. We had to modify our application slightly to use read/write database handles.
All in all, it took about three weeks of working through bugs in Scalr to get our application to a reliable state where I am confident that it IS highly available with Scalr. Their support was phenomenal, so the bugs didn't bother me too much, and the system is really coming along. It is approaching serious reliability.
As a side note, the best feature of Scalr is the 'Synchronize to All' feature, which auto-bundles your AMI and re-deploys it on a new instance - all without a service interruption. This saves you the time of going through the lengthy EC2 image/AMI creation process, which can otherwise make very simple admin tasks take 20 minutes. You can use this whether you are scaling your server farm or not - it would be very handy even on a single instance.
I pay Scalr.net $50 a month to host the service for me because I think it saves me time and money. The bottom line so far is this: at my last gig, we had a systems guy working on our highly available Linux DB + app server setup for a year... and he failed to achieve the kind of reliability that I achieved in three weeks. The savings by using Scalr as compared to rolling my own are extreme.
All that being said, if I could afford Rightscale, I would be using Rightscale. But the up-front fee and $500 a month make that impossible. There has been talk of waiving the up-front fee in exchange for waiving the consulting that it includes, but the monthly service fee isn't going anywhere.
I should mention that at the moment, scalr.net's website is down, so if I wanted to manage any of my server farms (I don't have them up at the moment), I simply couldn't right now. It is not clear whether scaling is working for scalr.net subscribers right now or not. Which is to say... this is perhaps not a mature solution yet. This doesn't happen often; before tonight the only downtime I've experienced was in periods of a few minutes at a time. But yeah... it's down RIGHT NOW, so I must mention it :)
I would suggest a thorough reading of the support group at http://groups.google.com/group/scalr-discuss before making your decision. If you pick Scalr, be prepared to test your setup and work through any issues you have on the google group.
I will comment on your question, since giving a concrete answer is a little ambitious.
First, I see that you have haproxy in your tags. That is definitely the most proven load balancing software on EC2. The AWS forums have documentation and experience reports on the use of haproxy.
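For reference, a bare-bones haproxy front end for a pool of EC2 web servers looks something like this (a sketch; the backend addresses and health-check path are assumptions):

    defaults
      mode http
      timeout connect 5s
      timeout client  30s
      timeout server  30s

    frontend www
      bind *:80
      default_backend web_servers

    backend web_servers
      balance roundrobin
      option httpchk GET /health        # assumed health-check path on the app servers
      server web1 10.0.1.10:8080 check
      server web2 10.0.1.11:8080 check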
I am unable to give you an opinion on Scalr, but RightScale is going in the right direction. One of RightScale's most interesting roadmap features is that they aim to be a cloud management system for any cloud, not just Amazon's EC2. That makes them very promising when you need load balancing and upscaling on demand.
Also, you can sign up for a free developer account on RightScale and test some of their AMIs and free scripts; they are pretty impressive.
Well, this might sound like I work there or something, but I am just a cloud user with no connection to them, in case that crosses your mind.
I hope this helps, at least adds to the discussion.
Geo
Been on Scalr for about two months now and have slowly transitioned several production applications to the platform with good results. I strongly recommend them for quick turnaround/support and value. I would like to see them improve the availability of their platform.
All in all, a good fit for the original poster based on the simple use case presented.
Every service has a bad day. AWS services see downtime. However, there are still users running their apps on AWS.
I have a few farms on Scalr.net and, compared to Rightscale, I don't have to pay an arm and a leg.
Overall, the service is very reliable. And now, with the scripting engine, I can set up my own scripts to govern my instances.
With Regards
Hareem Haque
Both services (RightScale and Scalr) are great. The offer is not the same, and the price is not the same either, but they are both what I was looking for. Regarding our budget, Scalr fits my needs. I found the support through a Google group very strange at the beginning, but it is very fast and efficient.
Their solution is also open source (not bad), and they have a V2 on their roadmap with support for other providers.
Wait and see, but so far I'm very happy with it.
Deciding on the right choice may not be as cut and dried as everyone expects. I have met with and heard talks from Scalr about their platform, and have also listened to RightScale discuss theirs. If you have a simple SOA (App Server - Database Server - File Server), then either choice will be right for your company.
Ultimately, if you have created some custom middleware and you rely on known sockets or specific points for handshakes, you will need to consider load-balancing and auto-scaling what you can and fall back to your own solutions for what can't be managed with either of these services.
Some people say that automatic scaling won't solve the problem
I am looking into Scalr right now and although it all looks good, I decided to continue with my own scripting for the purpose of cloud management / scaling. I have 8 servers right now and am paying only the AWS fees. I use chef (self-hosted), nagios, and a lot of other tools. My databases are mysql and mongodb, load balancer is haproxy, app layer is rails. Until I need 100s of servers, I think I will just keep scriptin' ;-)