Azure cache configs - multiple on one storage account?

While developing an Azure application I got the famous error "Cache referred to does not exist", and after a while I found this solution: "datacacheexception Cache referred to does not exist" (in short: don't point multiple cache clusters at one storage account via ConfigStoreConnectionString).
Well, I have 3 roles using co-located cache, plus testing and production environments. So I would have to create 6 "dummy" storage accounts just for cache configuration. This doesn't seem very nice to me.
So the question is: is there any way to point multiple cache clusters at one storage account? For example, by specifying different containers for them (they create one named "cacheclusterconfigs" by default) or something similar?
Thanks!

Given your setup, I would point each cloud service at its own storage account. That gives two per environment (one for each cloud service). There are other alternatives: you could set up Server AppFabric cache in an IaaS VM and expose it to both of your cloud services by placing them all within a single Azure Virtual Network. However, this will add latency to the connections and increase costs (from running the virtual network).
You can also put the storage account for cache as the same one used by diagnostics or the data storage for your cloud services, just be aware of any scalability limits as the cache will generate some traffic (mainly from the addition of new items to the cache).
But unfortunately, to my knowledge there's no option currently to allow for two caches to share the same storage account.
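Since the constraint is one cache cluster per storage account, a small check run before deployment can catch an accidental overlap. This is only a sketch: the role, environment, and account names are hypothetical, and it inspects a plain dictionary you maintain yourself rather than reading any real Azure configuration.

```python
from collections import defaultdict


def find_shared_accounts(cluster_to_account):
    """Return the storage accounts referenced by more than one cache cluster."""
    by_account = defaultdict(list)
    for cluster, account in cluster_to_account.items():
        by_account[account].append(cluster)
    # Keep only accounts that are used by two or more clusters.
    return {
        account: sorted(clusters)
        for account, clusters in by_account.items()
        if len(clusters) > 1
    }


# Hypothetical mapping of (role, environment) to the storage account
# named in that cluster's ConfigStoreConnectionString.
config = {
    ("web", "test"): "cachewebtest",
    ("worker", "test"): "cacheworkertest",
    ("api", "test"): "cachewebtest",  # mistake: reuses the web role's account
}

conflicts = find_shared_accounts(config)
for account, clusters in conflicts.items():
    print(f"{account} is shared by {clusters}")
```

An empty result means every cluster has its own account.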

Related

Can you mount more than one ADLS2 instance in Databricks?

What is the best practice for setting up DEV/TEST/PROD environments for the data lakehouse / Delta Lake architecture? Do you have a separate ADLS2 instance for each of DEV, TEST, and PROD, or do you host all three in one ADLS2 instance? Can you even mount more than one ADLS2 instance in Databricks?
It's better to have separate storage accounts for each environment. There are multiple reasons for that:
It's simpler to control access to the data. Usually only service accounts have access to the data in the prod environment, while dev and test environments have other requirements.
You avoid dev/test affecting prod and vice versa. Each storage account has limits on the number of API calls and on bandwidth. This matters especially for load testing, which could affect all environments if they share a storage account.
You can mount as many storage accounts as you want, but you need to understand that everyone in the workspace will have access to the data in that mount, and that access depends on the permissions of the service principal or shared access signature used for the mount. A more secure approach is to use credentials scoped to specific cluster(s).

Transferring between IaaS providers with only raw images of instances and volumes

My system runs on a cloud environment provided by an academic research computing organization. Problem is, it's unreliable, service is bad, and it can be slow.
So I'm considering switching to a different cloud provider, but I want to know whether it is possible to create a new compute instance and volume from just snapshots (raw images) of those instances and volumes? I asked DigitalOcean and they said this isn't possible (I'd have to create a new droplet and reinstall/transfer everything).
I've also emailed AWS but no response yet. If this is possible (just because it seems like the simplest route), are there any recommendations of cloud providers?
My system is running Ubuntu with Apache and MySQL. It hosts a WordPress website, a large database, and a series of Java tools. The instance snapshot is about 20 GB and the storage volume is 250 GB.
Thanks in advance!

How to create a Windows Azure application hosted in different datacenters

I'm trying to figure out how to scale a Windows Azure app, where there are some web roles and some worker roles.
The objective is to have some instances in a US datacenter and some others in a European datacenter, so that users in America and Europe get the best response time. My problem is replicating all my storage (for users in Europe who travel to America and vice versa, and also to cover trouble in one datacenter).
Until now, I understand that it's possible using Traffic Manager to let Azure know which datacenter is closer to the user.
I know I can replicate data between databases with SQL Data Sync.
Blob storage can also be replicated using the Copy Blob API.
I understand the queues cannot be automatically replicated but I don't have much problem with that.
My problem is that I cannot find a way to replicate table storage.
As a matter of fact I really don't know if this is the best strategy for my problem...
Thank you.
DX - you are right on with Traffic Manager and Data Sync. Those are the best options for roles and SQL. However, BLOBs are much easier: enable CDN and your BLOBs are replicated across 24 data centers automatically. Read Using CDN for Windows Azure for how to set up the CDN from your primary Storage account.
For table storage, I would handle this programmatically: keep a list of the Table connections and then use a parallel foreach to insert into the different data centers.
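A rough sketch of that fan-out pattern, written in Python rather than .NET and with no real Azure SDK calls: each "writer" here is a hypothetical callable standing in for an insert against one data center's table connection, and plain lists stand in for the tables so the example runs on its own.

```python
from concurrent.futures import ThreadPoolExecutor


def replicate_insert(entity, writers):
    """Insert `entity` through every writer in parallel.

    `writers` maps a region name to a callable that performs the insert.
    Returns {region: None on success, or the exception that was raised}.
    """
    def attempt(item):
        region, write = item
        try:
            write(entity)
            return region, None
        except Exception as exc:  # report the failure instead of aborting the others
            return region, exc

    with ThreadPoolExecutor(max_workers=max(1, len(writers))) as pool:
        return dict(pool.map(attempt, writers.items()))


# Stand-ins for the per-data-center table connections (assumption).
us_table, eu_table = [], []
writers = {"us": us_table.append, "eu": eu_table.append}

entity = {"PartitionKey": "users", "RowKey": "42", "Name": "Ada"}
results = replicate_insert(entity, writers)
print(results)
```

Returning the per-region outcome matters because one data center can fail while the other succeeds; the caller then has to decide whether to retry or queue the failed write, which this sketch leaves out.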
We maintain a different Service Configuration file for each Data Center to simplify deployment.

What are viable ways to develop an Azure app on multiple machines

The scenario is that I am rebuilding an application that is presently SQL and classic ASP. However, I want to update this a bit to leverage Azure Tables. I know that the Azure SDK has the Dev Fabric storage thing available, and I guess it's an option to have that installed on all of my machines.
But I'm wondering if there is a less 'invasive' way to mimic Azure Tables. Do object DBs or document DBs provide a reasonable facsimile that could be used for early prototyping? Or is making the move from them to the Azure SDK tables just more headache than it's worth?
In my opinion you should skip the fake Azure tables completely. Even the MS development storage is not an exact match to how things will actually run in the cloud. You get 1M transactions for $1, 1GB of storage for $0.15 and $0.15 per GB in/out of the data centre. If you're just prototyping, live dangerously and spend $10.
If you're new to working with Azure tables and you try to use development storage or some other proxy, you'll lose at least that much money in time spent reworking your code to work against the real thing.
If you're just using tables, and not queues or blobs, $10 will go a long way.
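To put numbers on that, here is a back-of-the-envelope estimate using only the prices quoted above ($1 per 1M transactions, $0.15 per GB stored, $0.15 per GB transferred). The workload figures are made up for illustration, and the prices are the ones from this answer, not current ones.

```python
# Prices as quoted in this answer (assumed unchanged for the estimate).
PRICE_PER_MILLION_TRANSACTIONS = 1.00
PRICE_PER_GB_STORED = 0.15
PRICE_PER_GB_TRANSFERRED = 0.15


def estimate_cost(transactions, stored_gb, transfer_gb):
    """Return the estimated bill in dollars for a prototyping workload."""
    return (
        transactions / 1_000_000 * PRICE_PER_MILLION_TRANSACTIONS
        + stored_gb * PRICE_PER_GB_STORED
        + transfer_gb * PRICE_PER_GB_TRANSFERRED
    )


# Hypothetical prototype: 5M transactions, 2 GB stored, 10 GB in/out.
cost = estimate_cost(transactions=5_000_000, stored_gb=2, transfer_gb=10)
print(f"${cost:.2f}")
```

Even that fairly heavy prototype comes in under the $10 mentioned above.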
Each Azure "project" (which is like an Azure account) is initially limited to 5 hosted storage accounts. Let's say you have one storage account for your actual content, and one for diagnostics. Now let's say you want a dev and QA storage account for each of those, respectively, so you don't damage production data. You've now run out of your storage accounts (in the above scenario, you'd need 6 hosted accounts). You'd need to call Microsoft and ask for this limit to be increased...
If, instead, you used the Dev Fabric for your tables during development / local testing, you'll free up Azure storage accounts (the ones used for dev - you'd still want QA storage accounts to be in Azure, not Dev Fabric).
One scenario where you can't rely on Dev Fabric for storage: performance/load testing.

Do I need Amazon's EC2, Cloudfront, RDS?

I want to publish a web site on Amazon's servers, that:
Runs CakePHP
Uses MySQL to store data
Lets users upload audio through Flash (currently using a hosted Flash Media Server), and listen to the files later
Do I need Amazon's EC2 for the website, RDS for the MySQL database, and CloudFront for the FMS? I'd really like a walkthrough of which services I should use.
Thanks.
First of all, you need the EC2 service in order to have a virtual machine where you can install Apache, PHP and your web application.
Then you also need a database server and data repository for the media files. The recommended way is exactly what you suggest: RDS for MySQL and CloudFront as the file repository.
Initially none of the above services (RDS, CloudFront and even EBS) were available. Developers had no way to use a MySQL database, because even if it was installed in an EC2 instance, the instance wasn't guaranteed to stay up and running, and if the instance was lost, the data was lost too. For this reason EBS was introduced. It created a mounted storage volume with guaranteed persistence that you could access from the EC2 instance. Theoretically you could install MySQL there and use it to store the Flash files. If you only want to serve files over HTTP, there is no problem using EBS.
CloudFront however has some advantages:
Users are automatically routed to the nearest edge location for high performance delivery of your content.
You can also use it to stream content through the RTMP protocol.
You don't have to worry about the size of the storage. With EBS you create a volume with a specific size, which could be a problem if you later find out that you need more storage. With CloudFront the files are stored in S3 and you do not need to worry about their size.
You do not waste web server capacity. If you use EBS, the files will be served by the server in EC2.
You could also use S3 on its own, but you wouldn't be able to use the RTMP protocol and you would need to manually create links to your files. Also, it wouldn't be possible to use your domain name for the files.
RDS also has some advantages over installing MySQL in EC2 with EBS:
Automated database backups
You can monitor your database with Amazon CloudWatch (free service)
You need EC2 to launch an instance and create your LAMP server. RDS is good if you don't want to manage the MySQL database yourself, but one limiting factor of RDS is that you can't have DB replication.
For persistent storage, you can use EBS or S3 for data files.
One thing not mentioned in any of these replies is the security that may (or may not) need to go around your file access. Cloud networks are good for publicly accessible data, but I haven't seen a cloud network yet that will provide a granular level of file access on a per-user basis. While you may be able to obfuscate the URLs used to access files so that it isn't easy to sequentially guess audio file IDs, that may not be enough if people are keeping private audio. Not saying don't do it, just make the decision with care.
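For the URL obfuscation part, one option is to name files with random tokens instead of sequential IDs. A minimal sketch using Python's standard `secrets` module follows; the `.mp3` suffix and the `audio/` prefix are hypothetical choices, and this only makes names hard to guess. It is not per-user access control.

```python
import secrets


def new_file_id(nbytes=16):
    """Return a random, URL-safe identifier that cannot be guessed sequentially."""
    return secrets.token_urlsafe(nbytes)


def storage_key_for_upload(prefix="audio/"):
    """Build a storage key for a new upload (prefix and extension are assumptions)."""
    return f"{prefix}{new_file_id()}.mp3"


key = storage_key_for_upload()
print(key)
```

Anyone who obtains the link can still fetch the file, so private audio would need a real authorization check in front of it.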
