I'm trying to figure out how to scale a Windows Azure app that has some web roles and some worker roles.
The objective is to have some instances in a US datacenter and others in a European datacenter, so that users in both America and Europe get the best response time. My problem is replicating all my storage (for users in Europe who travel to America and vice versa), and also guarding against trouble in one datacenter.
So far, I understand that Traffic Manager can be used to route each user to the closest datacenter.
I know I can replicate data between databases with SQL Data Sync.
Blob storage can also be replicated using the Copy Blob API.
I understand the queues cannot be automatically replicated, but I don't have much of a problem with that.
My problem is that I cannot find a way to replicate table storage.
As a matter of fact I really don't know if this is the best strategy for my problem...
Thank you.
DX - you are right on with Traffic Manager and Data Sync; those are the best options for roles & SQL. BLOBs, however, are much easier: enable the CDN and your BLOBs are replicated across 24 data centers automatically. Read "Using CDN for Windows Azure" for how to set up the CDN from your primary storage account.
For table storage, I would handle this programmatically: keep a list of the table connections and then use a parallel foreach to insert into the different data centers.
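A minimal sketch of that pattern, using the current Azure Tables SDK for Java (which is newer than this answer); the writer class and the "Payload" property are made up for illustration, and real code would need retries or reconciliation for partial failures:

```java
import com.azure.data.tables.TableClient;
import com.azure.data.tables.TableClientBuilder;
import com.azure.data.tables.models.TableEntity;

import java.util.List;
import java.util.stream.Collectors;

public class MultiDcTableWriter {

    private final List<TableClient> clients;

    // One connection string per data center's storage account.
    public MultiDcTableWriter(List<String> connectionStrings, String tableName) {
        this.clients = connectionStrings.stream()
                .map(cs -> new TableClientBuilder()
                        .connectionString(cs)
                        .tableName(tableName)
                        .buildClient())
                .collect(Collectors.toList());
    }

    // Fan each insert out to every data center in parallel. There is no
    // compensation here: if one write fails, the accounts can drift apart.
    public void insert(String partitionKey, String rowKey, String payload) {
        clients.parallelStream().forEach(client ->
                client.createEntity(new TableEntity(partitionKey, rowKey)
                        .addProperty("Payload", payload)));
    }
}
```

The parallel fan-out keeps the write path simple, at the price of eventual divergence if one data center is down.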
We maintain a different Service Configuration file for each Data Center to simplify deployment.
I have built a Spring Boot/Angular web application that uses a MySQL database for storage. The web application's main purpose is to be a social-media-style website for gardeners. Alongside that, it has a couple of tools that let the user generate a personalized planting calendar based on the monthly average temperature curve of the region where the user lives. Alternatively, the user can generate a personalized planting calendar based on planting journals made by other users who live within a certain radius of the user generating the calendar. I am using Hibernate Search for this.
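For reference, the radius lookup is roughly like the sketch below (simplified; PlantingJournal and the "location" field stand in for my actual entity, whose coordinates are mapped as a Hibernate Search GeoPoint):

```java
import jakarta.persistence.Entity;
import jakarta.persistence.EntityManager;
import jakarta.persistence.Id;

import java.util.List;

import org.hibernate.search.engine.spatial.DistanceUnit;
import org.hibernate.search.engine.spatial.GeoPoint;
import org.hibernate.search.mapper.orm.Search;
import org.hibernate.search.mapper.orm.session.SearchSession;
import org.hibernate.search.mapper.pojo.mapping.definition.annotation.GenericField;
import org.hibernate.search.mapper.pojo.mapping.definition.annotation.Indexed;

@Entity
@Indexed
class PlantingJournal {
    @Id Long id;
    @GenericField GeoPoint location; // where the journal's garden is
}

class JournalFinder {
    // Journals within radiusKm of the requesting user's coordinates.
    static List<PlantingJournal> findNearby(EntityManager em,
                                            double lat, double lon, double radiusKm) {
        SearchSession session = Search.session(em);
        return session.search(PlantingJournal.class)
                .where(f -> f.spatial().within().field("location")
                        .circle(GeoPoint.of(lat, lon), radiusKm, DistanceUnit.KILOMETERS))
                .fetchHits(50);
    }
}
```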
I do not expect to get millions of visits in the first months after launching the web application, so my question is: what would be the best EC2 instance type to start out with? Could a t3.micro support an application like that for the first month or two? Also, how will I know when the current instance type can no longer handle the incoming traffic without lag, so that I need to upgrade to a bigger instance like t3.medium or t3.large?
Thank you
Whether the instance is suitable depends on many things. In my experience a micro instance is not enough for many use cases.
My suggestion is to start with a t3.small instance and gather metrics in CloudWatch for a few days to establish your baseline. Then decide whether it is enough.
If you are exhausting your resources you can always upgrade to a bigger instance. However, since your app runs on Java, I think a medium size is the minimum starting point.
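For example, a baseline alarm with the AWS SDK for Java v2 could look like the sketch below; the instance id and thresholds are placeholders. On t3 instances, also watch the CPUCreditBalance metric, since exhausted CPU credits cause exactly the kind of lag you describe.

```java
import software.amazon.awssdk.services.cloudwatch.CloudWatchClient;
import software.amazon.awssdk.services.cloudwatch.model.ComparisonOperator;
import software.amazon.awssdk.services.cloudwatch.model.Dimension;
import software.amazon.awssdk.services.cloudwatch.model.PutMetricAlarmRequest;
import software.amazon.awssdk.services.cloudwatch.model.Statistic;

public class BaselineAlarm {
    public static void main(String[] args) {
        try (CloudWatchClient cw = CloudWatchClient.create()) {
            // Alarm when average CPU stays above 70% for 30 minutes.
            cw.putMetricAlarm(PutMetricAlarmRequest.builder()
                    .alarmName("t3-small-high-cpu")
                    .namespace("AWS/EC2")
                    .metricName("CPUUtilization")
                    .dimensions(Dimension.builder()
                            .name("InstanceId")
                            .value("i-0123456789abcdef0") // placeholder instance id
                            .build())
                    .statistic(Statistic.AVERAGE)
                    .period(300)           // 5-minute datapoints
                    .evaluationPeriods(6)  // 6 datapoints = 30 minutes
                    .threshold(70.0)
                    .comparisonOperator(ComparisonOperator.GREATER_THAN_THRESHOLD)
                    .build());
        }
    }
}
```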
About the lag and other things: my first suggestion is to put CloudFront in front of the EC2 instance, at least for all your static content (better yet, put your static content on S3 and don't let EC2 serve it at all). Beyond that, I think the only option is to rely on some third-party performance tool external to AWS.
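Getting the static files into S3 is simple with the SDK; the bucket name and paths below are placeholders:

```java
import java.nio.file.Paths;

import software.amazon.awssdk.core.sync.RequestBody;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.PutObjectRequest;

public class StaticAssetUpload {
    public static void main(String[] args) {
        try (S3Client s3 = S3Client.create()) {
            // Point CloudFront at the bucket so EC2 never serves this file.
            s3.putObject(PutObjectRequest.builder()
                            .bucket("my-app-static-assets")
                            .key("css/site.css")
                            .contentType("text/css")
                            .build(),
                    RequestBody.fromFile(Paths.get("static/css/site.css")));
        }
    }
}
```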
By the way, I built the same kind of app on iOS many years ago, with a support website hosted on AWS. Now the app is gone and the website is unmaintained :-)
While developing an Azure application I got the famous error "Cache referred to does not exist", and after a while I found this solution: datacacheexception Cache referred to does not exist (in short: don't point multiple cache clusters at one storage account via ConfigStoreConnectionString).
Well, I have 3 roles using a co-located cache, plus testing and production environments. So I would have to create 6 "dummy" storage accounts just for cache configuration. That doesn't seem very nice to me.
So the question is: is there any way to point multiple cache clusters at one storage account? For example, by specifying different containers for them (they create one named "cacheclusterconfigs" by default)?
Thanks!
Given your setup, I would point each cloud service at its own storage account, which gives two per environment (one for each cloud service). There are alternatives: you could set up Server AppFabric Cache in an IaaS VM and expose it to both of your cloud services by placing them all within a single Azure Virtual Network. However, this will introduce latency to the connections as well as increase costs (from running the virtual network).
You can also use the same storage account for the cache that you use for diagnostics or for your cloud services' data; just be aware of any scalability limits, as the cache will generate some traffic (mainly from the addition of new items to the cache).
But unfortunately, to my knowledge there's no option currently to allow for two caches to share the same storage account.
The scenario is that I am rebuilding an application that is presently SQL and classic asp. However I want to update this a bit to leverage Azure Tables. I know that the Azure SDK has the Dev Fabric storage thing available and I guess it's an option to have that installed on all of my machines.
But I'm wondering if there is a less 'invasive' way to mimic Azure Tables. Do object DBs or document DBs provide a reasonable facsimile that could be used for early prototyping? Or is making the move from them to the Azure SDK tables just more headache than it's worth?
In my opinion you should skip the fake Azure tables completely. Even the MS development storage is not an exact match for how things will actually run in the cloud. You get 1M transactions for $1, 1GB of storage for $0.15, and $0.15 per GB in/out of the data centre. If you're just prototyping, live dangerously and spend $10.
If you're new to working with Azure tables and you try to use development storage or some other proxy, you'll burn more than that in time spent reworking your code to run against the real thing.
If you're just using tables, and not queues or blobs, $10 will go a long way: at $1 per million, it covers ten million transactions.
Each Azure "project" (which is like an Azure account) is initially limited to 5 hosted storage accounts. Let's say you have one storage account for your actual content, and one for diagnostics. Now let's say you want a dev and QA storage account for each of those, respectively, so you don't damage production data. You've now run out of your storage accounts (in the above scenario, you'd need 6 hosted accounts). You'd need to call Microsoft and ask for this limit to be increased...
If, instead, you used the Dev Fabric for your tables during development / local testing, you'll free up Azure storage accounts (the ones used for dev - you'd still want QA storage accounts to be in Azure, not Dev Fabric).
One scenario where you can't rely on Dev Fabric for storage: performance/load testing.
I have thought a lot recently about the different hosting types that are available out there. We can get pretty decent average latency from an EC2 instance in Europe (we're situated in Sweden) and the cost is pretty good. Obviously, the ability to scale instances up and down is amazing for us, since we're in a really expansive phase right now.
From a logical perspective, I also believe that Amazon can probably provide better availability and stability than most hosting companies on the market. That probably also outweighs giving up a phone number to dial whenever we wonder about something, which forces us to google things ourselves :)
So, what should we be concerned about if we were to run our web server on EC2? What are the pros and cons?
To clarify, we will run a pretty standard LAMP configuration, probably with memcached added.
Thanks
So, what should we be concerned about if we were to run our web server on EC2? What are the pros and cons?
The pros and cons of EC2 are somewhat dependent on your business. Below is a list of issues that I believe affect large organizations:
Separation of duties: Your existing company probably has separate networking and server operations teams. With EC2 it may be difficult to separate these concerns, i.e. the person defining your Security Groups (firewall) is probably the same person who can spin up servers.
Home access to your servers: Corporate environments are usually administered on-premise or through a Virtual Private Network (VPN) with two-factor authentication. Administrators with access to your EC2 control panel can likely make changes to your environment from home. Note further that your EC2 access keys/accounts may remain available to people who leave or get fired from your company, making home access an even bigger problem...
Difficulty in validating security: Some security controls may inadvertently become weak. Within your premises you can be 99% certain that all servers are behind a firewall that restricts admin access from outside. When you're in the cloud it's a lot more difficult to ensure such controls are in place for all your systems.
Appliances and specialized tools do not go in the cloud: This may impact your security posture. For example, you may have network intrusion detection appliances sitting in front of your on-premise servers, and you will not be able to move these into the cloud.
Legislation and regulations: I am not sure about the regulations in your country, but you should be aware of cross-border issues. For example, running European systems on American EC2 soil may open you up to Patriot Act regulations. If you're dealing with credit card numbers or personally identifiable information, you may also have various compliance issues to deal with if infrastructure is outside of your organization.
Organizational processes: Who has access to EC2 and what can they do? Can someone spin up an Extra Large machine and install their own software? (Side note: our company http://LabSlice.com actually adds policies to stop this from happening.) How do you back up and restore data? Will you end up duplicating processes within your company simply because you've got a separate cloud infrastructure?
Auditing challenges: Any auditing activities that you normally undertake may be complicated if data is in the cloud. A good example is PCI: can you actually always prove data is within your control if it's hosted outside of your environment somewhere in the ether?
Public/private connectivity is a challenge: Do you ever need to mix data between your public and private environments? It can become a challenge to send data between these two environments, and to do so securely.
Monitoring and logging: You will likely have central systems monitoring your internal environment and collecting logs from your servers. Will you be able to achieve the same monitoring and log collection for servers that run off-premise?
Penetration testing: Some companies run periodic penetration tests directly against their public infrastructure. I may be mistaken, but I think running pen tests against Amazon infrastructure is against their contract (which makes sense, as they would only see what looks like hacking activity against infrastructure they own).
I believe that EC2 is definitely a good idea for small/medium businesses. They are rarely encumbered by the above issues, and usually Amazon can offer better services than an SMB could achieve themselves. For large organizations EC2 can obviously raise some concerns and issues that are not easily dealt with.
Simon # http://blog.LabSlice.com
The main negative is that you are fully responsible for ALL server administration, such as security patches, the firewall, backups, and server configuration and optimization.
Amazon will not provide you with any OS-level or higher-level support.
If you would be FULLY comfortable running your own hardware, then it can be a great cost saving.
I work at a company that hosts with Amazon EC2; we are running one high-CPU instance and two small instances.
I won't say Amazon EC2 is good or bad, but here is a list of our experiences over time.
Reliability: bad. They have a lot of outages. Mostly only individual segments, but still...
Cost: expensive. It's cloud computing, not server hosting! A friend works at a company that does complex calculations which have to be finished by a fixed time every day, and the calculation time depends on the amount of data they get. They run some servers themselves, and when capacity gets scarce they kick in a bunch of EC2 instances.
That's the perfect use case, but if you run a server 24/7 anyway, you are better off with a dedicated root server.
A dedicated root server will also give you better performance; e.g. disk reads will be faster since it has a local disk.
Traffic is expensive too.
Support: good, fast, and flexible; that's definitely very OK.
We had a big product launch with a lot of press coverage, and there were problems with the reverse DNS for email sending. The Amazon guys got it all set up RIPE-conform and nice in no time.
Amazon's S3 hosting service is nice too, if you need it.
In Europe I would suggest going with a German hosting provider; they have very good connectivity as well.
For example:
http://www.hetzner.de/de/hosting/produkte_rootserver/eq4/
http://www.ovh.de/produkte/superplan_mini.xml
http://www.server4you.de/root-server/server-details.php?products=0
http://www.hosteurope.de/produkt/Dedicated-Server-Linux-L
http://www.klein-edv.de/rootserver.php
I have hosted with all of them and had good experiences. The best was definitely Hosteurope, but they are a bit more expensive.
I ran a CDN there with around 40 servers for two years and never experienced ANY outage on ANY of them.
Amazon has had 3 outages on our segments in the last two months.
One minus that forced me to move away from Amazon EC2:
spamhaus.org lists the whole Amazon EC2 block on its Policy Block List (PBL).
This means that all mail servers using spamhaus.org will reject your mail; you'll see "blocked using zen.dnsbl" in your /var/log/mail.info when sending email.
The server I run uses email to register users and reset passwords; this does not work any more.
Read more about it at Spamhaus: http://www.spamhaus.org/pbl/query/PBL361340
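If you want to test whether your own elastic IP is listed, a DNSBL lookup is just a DNS query; here is a minimal sketch in Java (IPv4 only; note that Spamhaus may not answer queries coming from large public resolvers):

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class DnsblCheck {
    // Check whether an IPv4 address is listed in Spamhaus ZEN: reverse the
    // octets and resolve <reversed>.zen.spamhaus.org. A successful resolution
    // (an answer in 127.0.0.0/8) means "listed"; NXDOMAIN means "not listed".
    public static boolean isListed(String ipv4) {
        String[] o = ipv4.split("\\.");
        String query = o[3] + "." + o[2] + "." + o[1] + "." + o[0] + ".zen.spamhaus.org";
        try {
            InetAddress.getByName(query);
            return true;   // resolved: the address is on the list
        } catch (UnknownHostException e) {
            return false;  // NXDOMAIN: not listed
        }
    }

    public static void main(String[] args) {
        // 127.0.0.2 is the standard DNSBL test address and is always listed.
        System.out.println(isListed("127.0.0.2"));
    }
}
```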
Summary: Need to send email? Do not use Amazon EC2.
The other con no one has mentioned:
With a stock EC2 server, if an instance goes down, it "goes away": any information on the local disk is gone, and gone forever. You have the added responsibility of ensuring that any information you want to survive a server restart is persisted off the EC2 instance (into S3, RDS, EBS, or some other off-server service).
I haven't tried Amazon EC2 in production, but I understand the appeal of it. My main issue with EC2 is that while it does provide a great and affordable way to move all the blinking lights in your server room to the cloud, they don't provide you with a higher level architecture to scale your application as demand increases. That is all left to you to figure out on your own.
This is not an issue for more experienced shops that can maintain all the needed infrastructure by themselves, but I think smaller shops are better served by something more along the lines of Microsoft's Azure or Google's AppEngine: Platforms that enforce constraints on your architecture in return for one-click scalability when you need it.
And I think the importance of quality support cannot be overstated. Look at the BitBucket blog: for a while, seemingly every other post was about the downtime they had and the long hours it took for Amazon to get back to them with a resolution to their issues.
Compare that to GitHub, which uses the Rackspace cloud hosting service. I don't use GitHub, but I understand that they also have their share of downtime. Yet none of that downtime seems to be attributed to slow customer support from Rackspace.
Two big pluses come to mind:
1) Cost - With Amazon EC2 you only pay for what you use and the prices are hard to beat. Being able to scale up quickly to meet demands and then later scale down and "return" the unneeded capacity is a huge win depending on your needs / use case.
2) Integration with other Amazon web services - this advantage is often overlooked. Having integration with Amazon SimpleDB or Amazon Relational Data Store means that your data can live separate from the computing power that EC2 provides. This is a huge win that sets EC2 apart from others.
Amazon's cloud monitoring service and its support plans are charged extra. The first is quite useful and you should consider it; the second is worth it too if your app is mission critical.
My startup is located in Europe where most of our current users are.
I'm looking for a host that will allow us to scale to the US and Asia without latency taking its toll on performance.
Does the cloud solve the distance = latency problem?
If not, Where would be the ideal hosting location for a growing startup?
Some data:
Asp.net 3.5
SQL 2005
Jquery (lots of Ajax)
MVC
Thanks
The Cloud is just an abstraction. It doesn't change the underlying physical nature of the servers running your code and hosting your data. If the systems storing your data are a long way from your users, there will be some latency, no matter how you access them.
Most Cloud providers allow you to choose where you want your data - for example, Amazon S3 lets you choose to store your data in either the US or Europe - but no provider is going to be able to magically store all your data in multiple locations simultaneously.
If you want the benefit of multiple data centres you'd have to allow simultaneous updates at each location, and there is no way to synchronise such updates without knowledge of the application's business logic, so you're going to have to write some code to do this.
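To make that concrete, here is a minimal sketch of one such policy, last-writer-wins, in plain Java; the UserProfile type and its fields are hypothetical, and a real system would also need transport to ship updates between locations. Last-writer-wins is only one choice: depending on the business logic you might instead merge fields or flag the conflict for a human.

```java
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class LastWriterWinsStore {

    // Hypothetical record type replicated between data centres.
    public record UserProfile(String userId, String displayName, Instant updatedAt) {}

    private final Map<String, UserProfile> local = new ConcurrentHashMap<>();

    public void applyLocal(UserProfile p) {
        local.put(p.userId(), p);
    }

    // Apply an update arriving from the other data centre: keep whichever
    // version was written last. The storage layer cannot make this call;
    // only the application knows that "latest timestamp wins" is acceptable.
    public void applyReplicated(UserProfile remote) {
        local.merge(remote.userId(), remote,
                (mine, theirs) -> theirs.updatedAt().isAfter(mine.updatedAt()) ? theirs : mine);
    }
}
```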
You'll still need to look at what each Cloud provider offers and work out how each can help solve your problems, but you're going to have to do some of the work yourself.
What you're looking for is CDN (Content Delivery Network) hosting. With a CDN, your content is cached on various POPs located across the continents; if a request comes from India, the cached copy stored on an Indian POP is served, and the same goes for US, EU, and other clients.
The technology is still in an early phase of development, and there are two types of CDN: PUSH and PULL. PUSH means content is pushed to the POPs immediately whenever there is a change on the master server; PULL means the POP servers pull content from the master server at a regular interval, usually every 12 to 24 hours.
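As a toy illustration of the PULL model, here is how an edge node might serve cached content and re-fetch from the master server once its copy goes stale; the class and the TTL are made up for illustration:

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Toy model of a PULL CDN edge node: content is fetched from the
// master server on demand and re-fetched once the TTL expires.
public class PullEdgeCache {

    private record Cached(byte[] body, Instant fetchedAt) {}

    private final Map<String, Cached> cache = new ConcurrentHashMap<>();
    private final Function<String, byte[]> originFetch; // call to the master server
    private final Duration ttl;

    public PullEdgeCache(Function<String, byte[]> originFetch, Duration ttl) {
        this.originFetch = originFetch;
        this.ttl = ttl;
    }

    public byte[] get(String path) {
        Cached c = cache.get(path);
        if (c == null || c.fetchedAt().plus(ttl).isBefore(Instant.now())) {
            // Stale or missing: pull a fresh copy from the origin.
            c = new Cached(originFetch.apply(path), Instant.now());
            cache.put(path, c);
        }
        return c.body();
    }
}
```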
If your site is database-driven and frequently updated, a PUSH-based CDN will be the right choice.