Public-Private Cloud (Hybrid Cloud) - hadoop

Let's assume that I have a limited budget to build a small in-house private cloud. Now I want to be able to combine this private cloud with one of the public clouds (e.g. Amazon EC2). What options do I have?
More specifically I want to be able to do the following:
Use my private cloud primarily but if the request rate or size of datasets increased, transfer part of the load/data to EC2
Store my confidential data on the private cloud and move the more general data to EC2. Upon receiving a request, I want to be able to do some computation on the public data and then combine that with some computation on confidential data. But the confidentiality of the data must not be compromised.
I am looking into this for a project and would appreciate any idea/suggestions or related material.

It's a difficult question because there are so many different paths you can take to do this.
Anyway, what you're describing is a hybrid cloud:
First you have to build your private cloud. There are plenty of options for this: CloudStack, OpenStack, Eucalyptus, OpenNebula, etc. If you choose the open source route (I recommend it), you can read this analysis of the different open source cloud computing solutions:
http://bit.ly/QeGpqK
Once your own infrastructure is managed by your private cloud, you need a third-party provider like Amazon to deploy the "public side" of your infrastructure.
And now comes the hard part:
You have to build your own logic to scale your infrastructure out to your "public side", and this will actually be about 80% of the work: you have to plan when, what and where you will scale, identify what data you want to store on the public side, and so on.
Tools like Rackspace can help you a little bit:
http://www.rackspace.com/cloud/hybrid/
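As a rough illustration of the "when, what and where" decision described above, here is a minimal sketch of a burst policy: fill the private cloud first, send only the overflow to the public side, and never let confidential work leave the private cloud. The names `Placement` and `place_load`, and the idea of counting load in abstract "work units", are assumptions made for this example and not part of any cloud product.

```python
from dataclasses import dataclass


@dataclass
class Placement:
    private: int  # work units kept on the private cloud
    public: int   # work units burst to the public provider


def place_load(requests: int, private_capacity: int, confidential: int = 0) -> Placement:
    """Fill the private cloud first and burst only the overflow.

    Confidential work must always stay private, so it has to fit
    inside the private capacity (hypothetical policy for this sketch).
    """
    if confidential > requests:
        raise ValueError("confidential work cannot exceed total requests")
    if confidential > private_capacity:
        raise ValueError("confidential work exceeds private capacity")
    private = min(requests, private_capacity)
    return Placement(private=private, public=requests - private)


# Normal load stays in-house; a spike bursts the overflow to the public side.
normal = place_load(80, private_capacity=100)
spike = place_load(150, private_capacity=100, confidential=40)
```

A real implementation would replace the simple counters with measurements from your monitoring system and calls to your provider's API, which is where most of the work mentioned above goes.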

@arcade_fire provides an excellent overview of competing open source private cloud offerings. To this list, you could add Microsoft's SCVMM and VMware's vCloud. Depending on budget and workload, these proprietary offerings may also be of interest.
After choosing a cloud O/S, consider the following problems:
How do you make the public and private elements of the hybrid cloud transparent to your network?
How do you control the resulting Hybrid cloud?
The first issue is addressed by using what is referred to as a VPC (virtual private cloud). This term was introduced by AWS to describe a public cloud deployment that sits on a private network. These machines are joined to your private cloud via a VPN. One end of the VPN is in the cloud, and the other is in your data center. A Google search for "aws vpc architecture" will turn up a good explanation. I liked "EC2 to VPC: A transition worth doing".
The second issue involves choosing admin tools to manage your workload that support the APIs used to communicate with each of your public and private accounts. The archetypal example is RightScale, which supports a number of APIs, but there are others. @arcade_fire provides a link to Rackspace. Alternatively, you can find consultancies that can tailor a tool to the intended workload, e.g. ShapeBlue (CloudStack ecosystem).

If you are planning to have your own hybrid cloud, you have to write your own code that looks after scaling up and other provisioning tasks.
For the private cloud you can go for Eucalyptus or openQRM. If you are comfortable with Linux, you can use the open source KVM hypervisor that comes bundled with Linux; you can call its exposed management methods from your Java or PHP code to carry out provisioning and de-provisioning tasks, or simply use a management console for KVM.
Citrix Xen is also an option.
For the public cloud, EC2 is one option; besides that, you can use various other IaaS providers.
For high availability you can use the open source HAProxy to take care of your load balancing.
As you are dealing with data, you have the option of big data offerings such as MapReduce, Teradata, IBM Netezza and Cloudera. For graphs and other analysis you can use Splunk, and Apache Hadoop with Pig and Hive is always an option.
You have to write the scale-up code yourself, along with the integration of the private and public clouds. Amazon exposes its web services, which you can leverage.

There are professional vendors who offer this combined service of a (primarily) private cloud and a public cloud. It's called a hybrid cloud: you build your own private cloud to serve your project, and you draw on elements of public clouds to serve you even better when your data grows beyond what your own cloud can handle. I personally like Stratoscale; their Symphony product is good and, in my experience, serves all of a customer's needs, but obviously there are many others out there (they work with OpenStack as well).

The thing you're describing is undoubtedly a hybrid cloud deployment model. From my experience with our team, I can recommend giving a third-party tool a chance. Third-party services nowadays give you all the relevant freedom of action within the cloud environment, which basically means you have complete control over your cloud resources. Those services let you manage your on-premise private cloud along with AWS as if it were on-premise, which is a very advanced function. You might want to check them out; as for the costs, most of these tools offer a free trial.

Related

Azure Resource Manager: The Future of Cloud Services

I am currently working heavily in Azure. I am actually quite fond of ARM (Azure Resource Manager) right now and would love to keep using it. Right now, in the old portal, we have a lot of resources tied up as Cloud Services. Now, I know Cloud Services are available in the new portal, but it seems that Microsoft is moving away from the classic Cloud Service model. Can someone explain if this is true? If so, what will the new model look like? I already use resource groups to manage Websites (Web Apps), so I assume this is where Azure's future lies. Will we see the "deprecation" of Cloud Services down the line?
I am trying to understand if I need to begin re-structuring my Azure Infrastructure.
Any insight, explanation, or documentation is greatly appreciated.
So there are two things here: Cloud Services and the management of Cloud Services.
When you manage Cloud Services in the current portal, the underlying mechanism is Azure Service Management (ASM), whereas in the preview portal it is Azure Resource Manager (ARM). To me, ARM is the new way of managing your cloud resources in Azure (including Cloud Services).
I don't work for Microsoft, so I would not know whether Cloud Services themselves will be deprecated down the road, but one thing I think will happen is that ASM will be deprecated in favor of ARM. At some point, the only option left for managing your cloud resources will be Azure Resource Manager. One thing that makes me believe this is the presence of Classic resource providers (e.g. the Classic Storage Resource Provider, which lets you manage storage accounts created in the current portal via ASM from the preview portal, which works exclusively on ARM).
Personally I can't see a place for cloud services in the new ARM world of Azure. I have always found them a convoluted concept that simply added complexity to a deployment.
In the ARM view of deployments, servers are collected together in a VNet, and each server is attached to a NIC, which in turn can be connected to the internet. A security group then takes care of ingress/egress rules.
This is a much cleaner deployment method, as it puts connectivity configuration at the server layer instead of mapping them all through a higher layer of abstraction.
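To make the role of the security group concrete, here is a small sketch of how a set of ingress/egress rules might be evaluated for one server's NIC. It is a simplified model, not the Azure API: the `Rule` class and `is_allowed` function are hypothetical, rules are matched on direction and port only, the lowest priority number is evaluated first, and traffic with no matching rule is denied (a simplifying assumption for this example).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Rule:
    priority: int   # lower number is evaluated first
    direction: str  # "inbound" or "outbound"
    port: int
    allow: bool


def is_allowed(rules: List[Rule], direction: str, port: int) -> bool:
    """Return the decision of the first matching rule in priority order."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.direction == direction and rule.port == port:
            return rule.allow
    return False  # assumption: anything not explicitly matched is denied


# Example rule set attached at the server (NIC) level.
web_server_rules = [
    Rule(priority=100, direction="inbound", port=443, allow=True),
    Rule(priority=200, direction="inbound", port=22, allow=False),
]
https_ok = is_allowed(web_server_rules, "inbound", 443)
ssh_ok = is_allowed(web_server_rules, "inbound", 22)
```

The point of the sketch is the one made above: connectivity is configured next to the server rather than mapped through a higher layer of abstraction.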
I don't see the place of Cloud Services in ARM; however, after a quick search it seems that there is a plan to implement it.
Still no direction from the Azure Advisers group other than that, officially, they will not drop support for Cloud Services. I think they are close to giving us some kind of direction, but I can't say any more than that.
I asked a question about the future of Cloud Services on the recent Azure Compute AMA.
You can read the answers directly on Reddit for all details, below are a few interesting quotes (emphasis mine).
On ARM Integration for Cloud Services:
We are looking at ways to make the transition to ARM easier for Cloud Service customers- one of those options includes CS integration in ARM. This investigation is in the very early stages though, so if you are looking for a solution soon, check out VMSS/ACS/SF/Web Apps (meagan-msft)
And:
I think it's safe to say that if we make any significant investment in CS in the near future, it would be ARM integration, and as Meagan suggests, that's still in planning. Beyond that, there are no major feature improvements on the horizon. We believe the platform is pretty mature at this point. (seanmichaelmckenna)
So it doesn't look like any major innovations will hit Cloud Services soon, however:
Cloud Services are not going anywhere. In fact, many Microsoft services run on Cloud Services, so we heavily rely on them as well. They are fully supported, so feel free to continue to use them.
(meagan-msft)
For those who want to switch to a different Compute service, these recommendations were made:
However, if you would like to check out other services that are integrated with ARM today, we recommend checking out the following:
Web Apps for customers who want a fully managed platform and are building traditional web applications
Service Fabric for customers who want an opinionated application platform and managed infrastructure, but still need some control over the IAAS layer
VM Scale Sets for customers who need IaaS-level control with easy scaling, autoscale and load balancer integration
Azure Container service was also listed as a potential alternative.
Some things to consider (my understanding):
Service Fabric currently (2017) requires at least 5 VM instances, except for dev/test purposes. So probably only an option for larger services
VM Scale Sets is an IaaS offering, i.e. you have to manage OS updates etc. yourself. However, support for automatic OS updates is being worked on.

real time number crunching and storage on cloud

I have some hardware devices that send data that needs to be stored on a cloud server, and I also need to do some real-time processing on it.
The data they send needs to be preserved for months in custom binary files. The files for each device can grow up to 10GB over time.
There will be client programs (mobile / web) looking at the processed data in real time.
My preferred choice of language is C/C++/C#, since there is time-sensitive number crunching involved.
The goal is to write a scalable application that can monitor thousands of such devices on the cloud.
Do I have to write the code for running on the cloud up front (as I understand it, Azure / Amazon EC2)? Can I write a multi-threaded desktop application and migrate it to the cloud later?
I have used the Message Passing Interface (MPI) in the past for clusters. Can I still use MPI?
If I use the Microsoft Azure API, can I still host my software on the Amazon cloud?
For mobile devices to talk to the server, I understand that I need to have a web service running. How can I convert a desktop program written in C++ / C# to act as a web service talking to clients?
Are there any third-party frameworks or tools that can help me with my work?
With most cloud compute services you can deploy an off-the-shelf server and install your own software on it. So, yes, you can write and test your application locally, then migrate to the cloud once you get all the bugs worked out. Here are the available EC2 server configurations.
I have not tried MPI but you should be able to run just about anything you want on the servers in the cloud. However, Amazon does offer the Simple Queue Service which provides message passing in the cloud. Your software does not need to run in the cloud to use this service.
I have not used Azure. I doubt there are any restrictions regarding which external servers you use for storage and/or compute. However, keeping your cloud storage and compute resources within a single provider will reduce costs, improve performance and provide you with a unified management interface and billing system.
Web servers are fairly simple things. See this post. That took me about 10 seconds to find.
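To show how little is needed to put an existing computation behind a web service, here is a minimal sketch using only the Python standard library. Python is used purely for illustration (the same shape applies to a C# or C++ service), and the names `crunch`, `handle_query`, `CrunchHandler` and `serve`, the `v` query parameter and port 8080 are all assumptions made for this example.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse


def crunch(values):
    """Stand-in for the real number crunching: the mean of the samples."""
    return sum(values) / len(values)


def handle_query(query: str) -> dict:
    """Parse a 'v=1&v=2' style query string and return a JSON-ready result."""
    values = [float(v) for v in parse_qs(query).get("v", [])]
    if not values:
        return {"error": "no values supplied"}
    return {"count": len(values), "mean": crunch(values)}


class CrunchHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Run the computation on the query parameters and reply with JSON.
        body = json.dumps(handle_query(urlparse(self.path).query)).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080) -> None:
    """Blocking call; run this on the server to expose the service."""
    HTTPServer(("0.0.0.0", port), CrunchHandler).serve_forever()
```

Keeping the computation (`crunch`) separate from the HTTP handling means the same code can be tested on a desktop first and moved to a cloud server later without changes.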
There is plenty of third-party software out there. Figure out what you need in more detail and ask more specific questions.

Cloud Management for Amazon IaaS

I am planning to migrate a few products to the cloud, where they will serve as a platform for the developer community. In short, I am trying to act as a PaaS vendor for my products, which developers can consume for their build and development process.
The plan is as below:
I am trying to use Amazon IaaS ( S3, EC2) as the hardware.
I will require a cloud management software which can be installed somewhere on one of my local systems and can manage the Amazon cloud.
I will deploy all my products on the Amazon Cloud with the help of the Cloud Management Software.
I will develop and provide APIs to my end users(developer community) to use my service as a PaaS.
What I am trying to achieve is as follows:
Vendor independence in terms of IaaS. Let's say tomorrow I move to another IaaS provider.
Customer support for the cloud management software.
Ease of setup and use for the cloud management software.
Evaluation so far:
I tried looking at Eucalyptus and it sounds promising, but I have not yet been able to find out whether it supports the public cloud setup I require. I believe it is more of a private cloud setup.
If anyone can help me compare the other available options (for example RightScale, OpenStack, CloudStack, Nimbula, etc.), that would help me solve my issue.
There are several PaaS providers out there. There is a comparison here: Looking for PaaS providers recommendations
Disclaimer: I work for GigaSpaces, developing the Cloudify open-source PaaS stack.
Cloudify answers most of your requirements, especially vendor independence - it supports a large number of IaaS providers, including: EC2, HP, Rackspace, Azure and others.
Cloudify does require its management server to run in the same cloud as the applications it runs, so that it can collect monitoring information over private communications rather than over the internet. Why do you want to run your management server on-premise?

Migrate Azure Web Site to Azure Cloud Service

I have a project and I'm planning to start the web app as an Azure Web Site and then migrate it to an Azure Cloud Service (also called Hosted Service) if needed as a scaling strategy.
I made this decision because I read that Azure Web Sites are simpler and faster to develop with, needing almost no Azure-specific configuration or code. So starting fast and simple seems a good starting point for the project.
But would that be a good starting point for you?
Is migrating an Azure Web Site to an Azure Cloud Service the same as migrating a normal ASP.NET website to an Azure Cloud Service?
Would you start with an Azure Cloud Service right from the beginning? If yes, why?
Thanks for your time.
There are benefits to both deployment models; it will eventually come down to what you are trying to achieve and, ultimately, the success of your application.
Below I've outlined the pros and cons of each model to help ensure that you're making the right choice for your application's goals.
Windows Azure Web Sites
You have properly identified that Windows Azure Web Sites is a great starting point for an application, however you could also consider that Web Sites does offer enough scalability for many solutions.
Pros
10 Free sites during preview [Free for 12 months]
Easy Deployment (use Git, TFS, Web Deploy or FTP)
Quick Scalability (You can move to your own dedicated cluster [aka reserved standard])
Simple Development (Supports Classic ASP, ASP.NET, Node.js, Python & PHP)
Persistent Environment (most people are used to this)
Cons
No SSL Support on Custom Domains
In preview (currently no SLA)
Windows Azure Cloud Services
Cloud Services (formerly known as Hosted Services) is definitely the vision for the future of Web Applications. It is built with resiliency in mind to keep the cost of applications affordable by scaling to meet demand, and dial back capacity when your traffic slows.
Pros
Increased control over the cost of your application (if architected correctly)
Flexibility (You have full control over the environment)
SSL Support
Language Agnostic
Web Server Agnostic (although IIS is available by default)
Auto Management of Servers
Cons
Architecture should be carefully considered
Deployment time is slower (Slows development cycle)
Things to consider for Portability
The items above might have given you enough to plan the immediate future of the application and it is very likely that you might want to consider Cloud Services in the future (it fits a number of application scenarios better in the long run).
Here is a list of things to help portability from Web Sites to Cloud Services:
Start thinking Stateless
Windows Azure Web Sites is nice as it is a persistent environment, which means you are able to store things like session state and assets to the disk.
Although this is a good feature, it's best to start planning towards a stateless application, if your end goal is to be in Cloud Services. Here are a few things you can do to start thinking stateless:
Don't rely on Session State
If you need it, come up with a strategy to make it scale (Caching Service, SQL, or Storage)
Use the Storage Service
Assets such as Static HTML, css, javascript and images are better placed in Storage
Avoids additional bandwidth on your Web Site (potentially stay shared longer for lower cost)
Can be CDN Enabled, provides a better experience for International markets
Easier to update web assets when application is migrated to Cloud Services
Storing User content
If your application already stores to the Storage Service, there is one less code modification in the future when moving to cloud services.
Make it easy to discover patterns in your Data
The benefit of Cloud Services is that it enables you to reduce cost by scaling only what needs to be scaled. Start the process of identifying your scale units, i.e. how you partition your database or tables in Storage.
I read all the posts and all of them are very helpful.
In addition to those posts, I found some information on MSDN: Windows Azure Websites, Cloud Services, and VMs: When to use which?
With Windows Azure Websites you can:
Build highly scalable web sites on Windows Azure.
Quickly and easily deploy sites to a highly scalable cloud environment that allows you to start small and scale as needed.
Use the languages and open source applications of your choice then deploy with FTP, Git or TFS, and easily integrate Windows Azure services like SQL Database, Caching, CDN and Storage.
With Cloud Services you can:
Build or extend your enterprise applications on Windows Azure.
Create highly-available, scalable applications and services using a rich PaaS environment. Support advanced multi-tier scenarios, automated deployments and elastic scale. Deliver great SaaS solutions to customers anywhere around the world.
MSDN also summarizes the options:
And it compares some features of Web Sites and Cloud Services:
Azure is a great place to host your app, but there are some considerations you need to know about before you start migrating it.
Azure Websites and Hosted Services are really trivial to deploy. With Visual Studio you generate the package and simply upload it. Then you have a Development environment to check it. If it's OK for you, swap IPs. If it's not OK, upgrade again.
Your instances have some properties that could be annoying. For example, you cannot be sure about your IP. So if your app works with some provider that uses IP restriction, you will need to figure out how to proceed.
More considerations: your "server" could be reimaged at any moment. If you store something on the local disk, that file could go away at any moment.
Azure works very nicely if you have at least 2 instances for each website. Maybe your app is not prepared for that. The first step will be managing the sessions with AppFabric. It is really easy, just a change in your web.config. Be careful, because this session state doesn't work exactly like the "old one": you cannot store non-serializable objects (should be easy to adapt) or very large objects (more than 8MB).
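The two session restrictions just mentioned (values must be serializable and no larger than 8MB) can be checked before a value is stored. The sketch below uses Python's `pickle` as a stand-in for the .NET serializer, so it only illustrates the idea; the function name `can_store_in_session` and the constant `MAX_ITEM_BYTES` are made up for this example.

```python
import pickle

MAX_ITEM_BYTES = 8 * 1024 * 1024  # the 8MB limit mentioned above


def can_store_in_session(value) -> bool:
    """Return True if the value serializes and fits within the size limit."""
    try:
        payload = pickle.dumps(value)
    except Exception:
        # Not serializable, so it cannot live in out-of-process session state.
        return False
    return len(payload) <= MAX_ITEM_BYTES


small_dict_ok = can_store_in_session({"user": "alice", "cart": [1, 2, 3]})
callback_ok = can_store_in_session(lambda: 1)           # functions do not pickle
big_blob_ok = can_store_in_session(b"x" * (9 * 1024 * 1024))  # over the limit
```

Running a check like this in development is a cheap way to find the objects that need adapting before you move to more than one instance.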
If you are going to develop something from scratch, I suggest you start on Azure from the beginning. The reason is simple: it's really cheap to start, and you will not pay serious money until the app has lots of visits. It's also very cheap to set up a SQL Azure database and a storage account. Once you have everything in place, it's easy to add more instances or scale up.
Example:
Imagine you have an idea and you wish to show it to some possible investors.
You start by setting up a small SQL Azure database (1 GB): $9.99 monthly.
Then you build a site and run it on 2 extra small instances: $18.72 monthly.
Let's say you need 100 GB of space (images, backups, ...): $12.50 monthly.
At this point, you have everything in place to start your business for less than $50 monthly.
If your site is a success and the visits start to come, you swap your instances for small instances (it's really dangerous to run a production environment on extra small instances, because they do not have CPU reservation). Your instance cost then goes from $18.72 up to $57.60. Maybe you need more space for that SQL Azure database? And so on.
Prices calculated from here: http://www.windowsazure.com/en-us/pricing/calculator/?scenario=web .
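For reference, the arithmetic behind the example can be written out as follows. The figures are the monthly prices quoted in this answer (they are not current Azure prices), and the variable names are only for illustration.

```python
from decimal import Decimal

# Monthly prices quoted in the example above.
startup = {
    "SQL Azure database (1 GB)": Decimal("9.99"),
    "2 extra small instances": Decimal("18.72"),
    "100 GB storage": Decimal("12.50"),
}
startup_total = sum(startup.values(), Decimal("0"))

# After moving from extra small to small instances.
scaled = dict(startup)
del scaled["2 extra small instances"]
scaled["small instances"] = Decimal("57.60")
scaled_total = sum(scaled.values(), Decimal("0"))

print(f"Starting setup: ${startup_total} per month")  # $41.21
print(f"After scaling:  ${scaled_total} per month")   # $80.09
```

So the starting setup comes to $41.21 a month, and the same setup on small instances comes to $80.09 a month.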
Those are a few tips; there are a lot more. My advice is to start a trial account and play with it.
Final advice: it's very easy to solve everything just by purchasing more resources. Sometimes you need to refactor and optimize your code instead. If you simply add more resources each time you have a problem, you could end up with a huge bill and very messy code.
Hope it helps!
Another advantage of Windows Azure Cloud Services over Web Sites is that a cloud service can be added to an Azure Virtual Network. This can give it access to on-premises resources like databases. So if your requirements are such that you need the scalability offered by Azure but need to keep your data on-premises due to security restrictions, cloud services is a better choice.
Azure web sites cannot be part of an Azure virtual network. To access on-premises resources mechanisms such as Azure Service Bus Relay must be configured.
We had our web site running on PHP at a hosting provider and at some point decided to move it to Azure (where the main part of our service sits). We started with Azure Web Sites, which was great from a development point of view (mainly the integration with git). But after about a week of testing (when we decided to actually move the production web site), we found the following current limitations:
No SSL for custom domains
Custom domains are available only for reserved instances (no shared infrastructure)
No SLA
So we moved to a Hosted Service. The main problem for us was the lack of simple deployment (you need to build and upload a package of the whole web site). The solution we found was to use Dropbox: as a startup task for the role, we install the Dropbox service on the machine, which pulls the whole web site from Dropbox, which in turn holds an SVN checked-out folder, so site updates became very easy.

library/development platform on EC2/Rackspace/Eucalyptus/OpenStack

I am trying to build a cloud VM brokering service which can borrow compute power as VMs on demand from private/public cloud infrastructure. I have the following goals for my service.
Abstract the vendor-specific APIs into a library, which will give me the flexibility to choose VMs from any of the vendors (e.g. EC2, Rackspace) without affecting my service built on top of the library.
I should also have the flexibility to borrow VMs from a purely private cloud infrastructure built using stacks like OpenStack/Eucalyptus. Due to the huge upfront capex we will be using public clouds at first, but we plan to move to private cloud infrastructure. So from a design perspective we want to keep those details hidden from the brokering service.
My question is whether there are any open source/commercial libraries or cloud development platforms that can give me this functionality, on top of which I can just build my service without really bothering about vendor-specific details.
I came across RightScale and Scalr, but I am not clear whether they are tools or platforms. I need a platform on which I can develop, not just tools to monitor and auto-provision cloud deployments.
TIA.
For Python there's boto and libcloud.
For Java there's jclouds and also a port of libcloud (scroll a bit further down the page).
These are all open source libraries.
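The kind of abstraction these libraries provide can be pictured with a small hand-written sketch: the brokering service codes against one interface, and each provider gets its own driver behind it. This is not the API of boto, libcloud or jclouds; `VMDriver`, `InMemoryDriver` and `Broker` are hypothetical names, and the in-memory driver is a fake that stands in for a real EC2 or OpenStack adapter.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class VMDriver(ABC):
    """Hypothetical vendor-neutral interface the broker codes against."""

    @abstractmethod
    def create_vm(self, name: str, size: str) -> str: ...

    @abstractmethod
    def destroy_vm(self, vm_id: str) -> None: ...

    @abstractmethod
    def list_vms(self) -> List[str]: ...


class InMemoryDriver(VMDriver):
    """Fake backend standing in for a real provider-specific adapter."""

    def __init__(self, prefix: str):
        self.prefix = prefix
        self._vms: Dict[str, str] = {}
        self._next = 0

    def create_vm(self, name: str, size: str) -> str:
        self._next += 1
        vm_id = f"{self.prefix}-{self._next}"
        self._vms[vm_id] = name
        return vm_id

    def destroy_vm(self, vm_id: str) -> None:
        self._vms.pop(vm_id, None)

    def list_vms(self) -> List[str]:
        return sorted(self._vms)


class Broker:
    """Borrows VMs from whichever provider is asked for, via the interface."""

    def __init__(self, drivers: Dict[str, VMDriver]):
        self.drivers = drivers

    def borrow(self, provider: str, name: str, size: str = "small") -> str:
        return self.drivers[provider].create_vm(name, size)


broker = Broker({"public": InMemoryDriver("ec2"), "private": InMemoryDriver("openstack")})
vm_id = broker.borrow("public", "web-frontend")
```

Moving from a public to a private cloud later would then mean registering a different driver, with no change to the code that calls `Broker.borrow`.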
Yes, there is! It's a Ruby library called fog. It's the only library I have found which gives you a vendor-agnostic interface to various cloud providers.
For an OpenStack cloud (Rackspace, and maybe some others in future) you should consider using the following Python libraries:
novaclient - client library for OpenStack Compute API
nova-adminclient - client for administering OpenStack Nova
You will be able to write recipes to provision, control and play with your VMs in an OpenStack cloud.
Hope it helps. Let me know if you need any more help in this regard.

Resources