Not able to install Hadoop in Google Compute Engine - hadoop

When I try to "Deploy Apache Hadoop" from Google Compute Engine, I get the message "Deployment would exceed CPU quota for us-central1. Limit: 8. Decrease usage, select a zone in another region, or request a quota increase." I tried every zone, and it still does not work.

If you are using the GCE Free trial, you are limited to 8 concurrent CPU cores. This is true for all regions and zones so trying in a different zone will not solve this problem.
To run a larger deployment, you need to upgrade to a paid account. Alternatively, you can use Google Cloud Dataproc or bdutil to deploy a Hadoop cluster and choose a few smaller instance types so that the cluster uses no more than 8 CPU cores in total.
You can see how many cores are in each machine type to help you decide how to structure your cluster.
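As a rough sketch of that sizing check, the snippet below adds up the cores in a planned cluster and compares the total with the 8-core limit from the error message. The machine type names and core counts in the table are assumptions for illustration and should be verified against the machine types documentation; the function names are hypothetical.

```python
# Assumed core counts per machine type; verify against the machine types page.
MACHINE_CORES = {
    "n1-standard-1": 1,
    "n1-standard-2": 2,
    "n1-standard-4": 4,
}


def total_cores(cluster):
    """Sum the cores of a cluster given as {machine_type: instance_count}."""
    return sum(MACHINE_CORES[machine] * count for machine, count in cluster.items())


def fits_quota(cluster, quota=8):
    """Return True if the cluster stays within the CPU quota (8 in the error message)."""
    return total_cores(cluster) <= quota


# Example plan: one 2-core master and four 1-core workers (6 cores in total).
small_cluster = {"n1-standard-2": 1, "n1-standard-1": 4}
# Example plan: three 4-core machines (12 cores in total).
large_cluster = {"n1-standard-4": 3}
```

Here `fits_quota(small_cluster)` is true and `fits_quota(large_cluster)` is false, which mirrors the quota error in the question.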

Related

What if I choose the us-central1-a zone for my Google Compute Engine VM instance and the traffic calling the VM comes from Asia? (with respect to pricing and efficiency)

I am trying to create a Google Compute Engine VM instance to host my website. Most of the traffic to this website will come from Asia, so which region should I select for the VM instance?
How will the choice of region affect pricing and performance?
Have a look at the "Factors to consider when selecting regions" section of Best practices for Compute Engine regions selection:
Latency
The main factor to consider is the latency your user experiences.
However, this is a complex problem because user latency is affected by
multiple aspects, such as caching and load-balancing mechanisms.
In enterprise use cases, latency to on-premises systems or latency for
a certain subset of users or partners is more critical. For example,
choosing the closest region to your developers or on-premises database
services interconnected with Google Cloud might be the deciding
factor.
For example, you can browse some sites hosted in Asia and then compare the experience with sites hosted in the US: you'll notice a significant difference in response time caused by latency. The same applies to your site: hosted far from its users, it will be less responsive. You should set up your VM instance as close to your customers as possible.
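One way to make that comparison less subjective is to time a TCP connection from a machine near your users to candidate hosts. The sketch below is a minimal illustration, not a full benchmark: the function names are hypothetical, and it measures only connection setup time, not page load time.

```python
import socket
import time


def tcp_connect_ms(host, port=443, timeout=3.0):
    """Return the time in milliseconds needed to open a TCP connection."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass  # The connection is closed as soon as it has been established.
    return (time.perf_counter() - start) * 1000.0


def median_connect_ms(host, port=443, samples=5):
    """Take several measurements and return the median to smooth out noise."""
    times = sorted(tcp_connect_ms(host, port) for _ in range(samples))
    return times[len(times) // 2]
```

Calling `median_connect_ms` with the hostname of a site hosted in Asia and again with one hosted in the US, from a location close to your users, gives two numbers to compare.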
To estimate pricing, check the resources below:
Pricing
Google Cloud resource costs differ by region. The following resources
are available to estimate the price:
Compute Engine pricing
Pricing calculator
Google Cloud SKUs
Billing API
If you decide to deploy in multiple regions, be aware that there are
network egress charges for data synced between regions.
In addition, you can find a monthly cost estimate in the Create a new instance wizard: try different regions and compare the numbers.
If your customers are located in different regions, you can try Google Cloud CDN:
Cloud CDN (Content Delivery Network) uses Google's globally
distributed edge points of presence to cache HTTP(S) load balanced
content close to your users. Caching content at the edges of Google's
network provides faster delivery of content to your users while
reducing serving costs.

AWS EC2 pay per hour with 100% availability

I want to run a server for an application I have.
I'm a complete beginner with AWS, so bear with me.
There will be about 50 users (all from the same time zone) that will be accessing the server and I would like to have near 100% availability.
The application I have requires 2 processors and 2GB ram.
I could pay for a machine 24/7 or even only 18 hours a day, assuming I turn it off at night, but there will be some days when the server is not used at all.
I was wondering if the following is possible: when Amazon detects that someone is requesting something from my server, it turns it on in real time and then forwards the request to my server. After, say, 5 minutes of no activity, it turns my server off. This way I only pay for the hours when there is traffic.
Is this possible?
How have people solved similar problems?
No, this isn't possible. There is nothing built into AWS to detect traffic and start an EC2 server like you suggest. Plus, the startup time of an EC2 server is at least a couple of minutes, so those first incoming requests would have to wait a really long time.
You might want to look into running multiple small servers instead of a single larger server. AWS does have the ability to balance the load across multiple servers and add/remove servers from the pool based on traffic. You could have as few as one server running when there is no traffic, and have more servers automatically created as load increases. Look into the AWS Elastic Beanstalk service for this.
If you want to run a truly "serverless" environment where you only pay for compute cycles in milliseconds, instead of servers by the hour, you could look into using AWS Lambda. If you can architect your system to run on Lambda you are almost guaranteed to save costs, but it can be a real challenge to convert an existing system to this sort of architecture.
If you want to look outside AWS you might find something more along the lines of what you describe with Google App Engine. Heroku's free tier also works similarly to your description, but as soon as you outgrow the free tier you have to upgrade to always-running instances.

any alternatives to Amazon Windows Virtual Machine hosting?

Does anyone know of any competing hosting alternatives to Amazon Web Services for running very small instances of Windows virtual machines? I have used AWS for years but think it might be worthwhile to see if there are better alternatives.
In particular, my scenario is this: I have created a Windows virtual machine image with the applications and configuration I want, and I spin up VMs based on that image as needed on the AWS spot market. I can go weeks at a time without needing any virtual machines but then spin up 20 VMs for a few hours to do a particular job. I typically pay around .61 cents an hour per micro Windows VM running on AWS (keep in mind that the AWS spot market is way cheaper than reserved instances).
Does Microsoft Azure or any other service support a similar scenario? I don't mind paying a little more if the performance and such is better. However, it is absolutely critical that I can set things up so I only have to pay for VMs when I actually need them rather than keep paying for VMs that aren't in use.
Microsoft Azure has the capability you are looking for. You can upload your own images and then quickly deploy extra-small machines based on them. On Azure you can turn off the VMs through the Azure portal after you are finished with them, and you will not be charged. Make sure that you do it through the portal and not from within the Windows session, or you will continue to be billed.
Check out this link for pricing information:
http://azure.microsoft.com/en-us/pricing/details/virtual-machines/
You can follow these steps to upload your image to your Azure account:
http://azure.microsoft.com/en-us/documentation/articles/virtual-machines-create-upload-vhd-windows-server/
Also, you can scale up very easily in the Azure portal, so this might reduce your need to spin up multiple machines.

Set up CloudBees app to always have instances in different regions

In the past month I've seen my CloudBees app go down due to CloudBees issues with its providers. Yesterday AWS East had problems, and last summer this happened: http://blog.cloudbees.com/2012/07/cloudbees-postmortem-on-two-recent.html
To achieve higher availability, I am wondering whether it would be a viable solution, and one supported by CloudBees, to always have two instances running in different regions. Ideally, one of them would be in the EU.
Thanks.
You're right that, to be highly available, a cloud application on AWS must be multi-region. This has a serious impact on the app architecture, with master/backup, data replication, and similar issues to address.
We (CloudBees) don't provide an out-of-the-box solution to this complex issue, as it really depends on your requirements, data weight, update frequency, etc.
Deploying in the EU region is only available on CloudBees for "dedicated servers" (contact sales#cloudbees.com for details and pricing), but it could be an option for building such a multi-region HA application.

Compare site traffic to Dedicated machine time usage

If I have a website, hosted with a standard hosting company, and I would like to move it to a Dedicated machine, maybe EC2, is there a way to compare my current traffic to usage of a cloud machine?
Hosting companies give you plans measured in bandwidth and disk space, while EC2 is measured in usage time.
So I'm looking for a way to predict machine usage time from my current traffic data in order to evaluate costs.
Thanks!
I'm not sure you're understanding usage time correctly. For your website to exist on EC2, you'll need to create one or more instances depending on the architecture you use. This is the same as a dedicated hosting setup elsewhere except with cloud instances.
The difference lies in billing. Where a traditional hosting company charges you monthly, EC2 charges you per instance hour, that is, for every hour you have an instance running. For hosting a website, you'll have the server running 24/7, which equates to roughly 720 hours a month charged at a few cents per hour.
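The arithmetic behind that figure can be sketched as follows. The hourly rate used in the usage note is a placeholder rather than an actual EC2 price, and the function names are hypothetical.

```python
def instance_hours(hours_per_day=24, days=30, instances=1):
    """Billable instance hours in a billing period (24 x 30 = 720 by default)."""
    return hours_per_day * days * instances


def monthly_cost(hourly_rate, hours_per_day=24, days=30, instances=1):
    """Estimated cost: billable instance hours multiplied by the hourly rate."""
    return hourly_rate * instance_hours(hours_per_day, days, instances)
```

For example, `monthly_cost(0.05)` estimates the bill for one always-on instance at an assumed rate of 5 cents per hour, and passing `instances=2` shows how the estimate grows if testing reveals that the site needs a second instance.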
The key thing to work out is how many/what size instances you'll need to run your site at the equivalent performance you're seeing now, and that's only something you'll figure out with testing.
