Is it possible to rent CPU cycles?

I have an application that takes days to process data. Is there a service that would let me run my application on powerful computers?
I'm not running a website or a web service. This is taking lots and lots of data files, running them through a big custom application, and outputting a result.
It takes days on my PC and it's something that needs to be done every once in a while, but not continuously.
Cost isn't really an issue, in the sense that my company will pay for it, but of course it should be cheaper than buying a big-ass machine ourselves.

Have you considered Amazon EC2? You pay by the hour for what you use. No more, no less. You could even rent many servers at once to split the workload.
I'm not sure if that meets your requirement of "powerful computers", because they're just average servers, but at least it gives you a pay-as-you-go way to run the program off your own computer.

Amazon's EC2 Service is an excellent solution for your needs. You only pay for the time you use, and you can scale up to as many machines as you need.
From their information:
Elastic – Amazon EC2 enables you to increase or decrease capacity within minutes, not hours or days. You can commission one, hundreds or even thousands of server instances simultaneously. Of course, because this is all controlled with web service APIs, your application can automatically scale itself up and down depending on its needs.
Flexible – You have the choice of multiple instance types, operating systems, and software packages. Amazon EC2 allows you to select a configuration of memory, CPU, and instance storage that is optimal for your choice of operating system and application. For example, your choice of operating systems includes numerous Linux distributions, Microsoft Windows Server and OpenSolaris.
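Since all of this is driven by web service APIs, spinning workers up for a batch run and tearing them down afterwards can be scripted. A minimal sketch using the boto3 Python SDK (my choice of tool, not something the answer specifies; the AMI ID, region, and instance type are placeholders):
```python
# Hypothetical sketch only: the AMI ID, region and instance type below are
# placeholders, not recommendations.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Launch four identical workers; each would pull its share of the data files.
response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # placeholder AMI with your app installed
    InstanceType="c5.xlarge",          # placeholder compute-optimized type
    MinCount=4,
    MaxCount=4,
)
instance_ids = [i["InstanceId"] for i in response["Instances"]]
print("Launched:", instance_ids)

# ... run the batch job, collect results ...

# Terminate when done so you stop paying.
ec2.terminate_instances(InstanceIds=instance_ids)
```
Because the work is occasional rather than continuous, terminating the instances after each run is what makes the pay-as-you-go model cheaper than owning hardware.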

If your application is not parallel, you won't gain much by running it on a "big machine", unless the bottleneck is virtual-memory swapping. Even the Top500 supercomputers are not substantially faster than an ordinary PC for sequential workloads.
If your application can exploit parallelism, you could perhaps use your company's existing resources more efficiently than deploying it on a single PC. If you have a few dozen computers, you could set up a loosely coupled heterogeneous cluster (or local grid; the terminology changes with fashion).
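If the per-file work is independent, even a single multi-core machine gets you part of the way before you reach for a cluster. A minimal sketch, assuming Python, where process_file stands in for the asker's "big custom application" step:
```python
# Minimal sketch (assumptions: Python, independent per-file work).
# process_file stands in for the asker's "big custom application" step.
from multiprocessing import Pool
from pathlib import Path

def process_file(path):
    # Placeholder for the real computation on one data file.
    return path.name, len(path.read_bytes())

if __name__ == "__main__":
    files = sorted(Path("data").glob("*.dat"))   # placeholder input directory
    with Pool() as pool:                         # one worker per CPU core by default
        results = pool.map(process_file, files)  # files processed in parallel
    print(results)
```
The same decomposition, one independent chunk of data per worker, is what you would distribute across the nodes of a cluster or grid.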

I recommend CPUsage.
It is a "startup" in grid computing.
Its specialty is that any individual can join the grid with spare CPU cycles. That keeps grid management cheap, so the usage prices are also very low.
They provide an API; if you integrate it into your program, it will be able to run on their system.

Related

What is the difference between a distributed system and a clustered system?

Both are defined as a set of computers that work together and give end users the impression of a single computer running behind them.
So what is the difference here?
What is the difference between a car and a sports car?
A cluster is a single system, usually managed by a single company. Clusters normally have very low latency and consist of server-grade hardware. A distributed system can be anything. Some people would already call JavaScript on the client plus PHP code on the server, which together make up one system, a distributed system.
In general, when working with distributed systems you deal a lot with long latencies and unexpected failures (as in p2p systems). When building a cluster (or a big cluster, which may be called a supercomputer), you try to prevent those by using more robust hardware and better network interconnects (InfiniBand). Nevertheless, a cluster is still a distributed system. (A sports car still has 4 wheels and an engine.)

Requirements of Liferay Portal on JBoss

I tested Liferay on two different machines: one vserver with 1 GB of RAM and another with 3 GB of RAM. On the one with 1 GB, Liferay was very slow. On the second (3 GB of RAM) it runs quite well.
My testing environment has just one organization/community and only one user (me). Imagine I were building a portal for approximately 15 organizations and 400 users in total (30 users per organization). Would a server with 3 GB of RAM be enough to run reasonably fast?
This is a very important question for me because of the financial aspect. I don't want to spend 200 dollars per month on hosting. :-)
Thx.
It depends more on the number of concurrent users than on the total number of users in the system.
IMHO Liferay runs slowly on your 1 GB server because you most likely didn't tune it and ran with the default memory settings; this will most likely cause swapping to kick in, hence the poor performance.
Tip: download the performance whitepaper, and read and understand the scenarios in there. You can also easily do the initial rule-of-thumb measurements on a local computer and see how much memory the JVM needs in order to run smoothly. Especially in tight memory situations, you definitely want to fine-tune your JVM settings (for example the -Xms/-Xmx heap flags) to match your hardware.
You'll find rough numbers and orders of magnitude in the performance whitepaper. See what best matches your use cases.
Remember that the same argument applies to your database and any other components you happen to have. With what I assume your sizing requirements to be (from the few details you give), you should get Liferay to run on a server for well below $200/month.
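Purely as a back-of-the-envelope illustration of that sizing argument (the concurrency ratio and memory figures below are assumptions to be replaced with the whitepaper's numbers, not measured Liferay values):
```python
# Back-of-the-envelope only: the concurrency ratio and memory figures are
# assumptions to plug real whitepaper numbers into, not measured values.
total_users = 400
concurrency_ratio = 0.10                 # assume ~10% of users active at peak
concurrent_users = int(total_users * concurrency_ratio)   # -> 40

base_heap_mb = 1024                      # assumed JVM baseline for the portal
per_session_mb = 2                       # assumed overhead per active session
heap_estimate = base_heap_mb + concurrent_users * per_session_mb

print(f"~{concurrent_users} concurrent users, ~{heap_estimate} MB heap to start from")
```
With numbers in that ballpark, 400 registered users collapse to a few dozen concurrent sessions, which is why a modest server can be enough.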

Scaling Tigase XMPP server on Amazon EC2

Does anyone have experience running clustered Tigase XMPP servers on Amazon's EC2? Primarily I wish to know about anything non-obvious that might trip me up. (For example, running Ejabberd on EC2 can apparently cause issues due to Mnesia.)
Also, any general advice on installing and running Tigase on Ubuntu would be welcome.
Extra information:
The system I’m developing uses XMPP just to communicate (in near real-time) between a mobile app and the server(s).
The number of users will initially be small, but hopefully will grow, which is why the system needs to be scalable. Presumably for just a few thousand users you wouldn't need a cc1.4xlarge EC2 instance? (Otherwise this is going to be very expensive to run!)
I plan on using a MySQL database hosted in Amazon RDS for the XMPP server database.
I also plan on creating an external XMPP component written in Python, using SleekXMPP. This external component will do all the 'work' of the server, as the application I'm making is quite different from instant messaging. For this part, I have not worked out how to connect an external XMPP component written in Python to a Tigase server. The documentation seems to suggest that components are written specifically for Tigase, rather than for a generic XMPP server using XEP-0114 (Jabber Component Protocol) as I expected.
With this extra information, if you can think of anything else I should know about I’d be glad to know.
Thank you :)
I have lots of experience with this, and there are loads of non-obvious problems. For example, the only instance type that reliably runs an application like Tigase is cc1.4xlarge. Others cause problems with CPU availability, and it is a lottery whether you are lucky enough to run your service on a host that is not busy with other people's work.
You also need an instance with the highest possible I/O to make sure it can cope with the network traffic. The high I/O requirement applies especially to the database instance.
Not sure whether this is obvious, but there is a problem with hostnames on EC2: every time you start an instance, the hostname and IP address change. A Tigase cluster is quite sensitive to hostnames. There is a way to force/change the hostname for the instance, which might be a way around the problem; see the sketch below.
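A hedged sketch of that workaround, run at boot: NODE_NAME is a hypothetical stable name you would assign to each cluster node, and setting it requires root on Linux and Python 3.3+:
```python
# Hedged sketch: NODE_NAME is a hypothetical stable per-node name; overriding
# the hostname requires root on Linux and Python 3.3+ (socket.sethostname).
import socket
import urllib.request

NODE_NAME = "tigase-node-1.cluster.internal"   # assumed stable per-node name

# The EC2 instance metadata service reports the (changing) assigned hostname.
assigned = urllib.request.urlopen(
    "http://169.254.169.254/latest/meta-data/local-hostname", timeout=2
).read().decode()
print("EC2 assigned hostname:", assigned)

# Override it with the stable name the Tigase cluster config expects.
socket.sethostname(NODE_NAME)
print("Pinned hostname:", socket.gethostname())
```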
Of course, I am talking about a cluster for millions of online users and really high traffic, 100k XMPP packets per second or more. Generally, for a large installation it is far cheaper and more efficient to use dedicated servers.
Generally Tigase runs very well on Amazon EC2, but you really need the latest SVN code, as it has lots of optimizations added especially after tests in the cloud. If you provide some more details about your service, I may have more suggestions.
More comments:
When it comes to cost, a dedicated server is always the cheaper option for a constantly running service. Unless you plan to switch servers on and off on an hourly basis, I would recommend going for a dedicated service. Costs are lower and performance is far more predictable.
However, if you really want/need to stick with Amazon EC2, let me give you some concrete numbers. Below is a list of configurations and how many online users the cluster was able to reliably handle:
5*cc1.4xlarge - 1,700,000 online users
1*c1.xlarge - 118k online users
2*c1.xlarge - 127k online users
2*m2.4xlarge (with 5GB RAM for Tigase) - 236k online users
2*m2.4xlarge (with 20GB RAM for Tigase) - 315k online users
5*m2.4xlarge (with 60GB RAM for Tigase) - 400k online users
5*m2.4xlarge (with 60GB RAM for Tigase) - 312k online users
5*m2.4xlarge (with 60GB RAM for Tigase) - 327k online users
5*m2.4xlarge (with 60GB RAM for Tigase) - 280k online users
A few more comments:
Why does the amount of memory matter so much? Because CPU power is very unreliable and inconsistent on all but cc1.4xlarge instances. You have 8 virtual CPUs, but if you look at top you often see one CPU working and the rest idle. This insufficient CPU power causes internal queues in Tigase to grow; when the CPU power comes back, Tigase can process the waiting packets. The more memory Tigase has, the more packets can be queued and the better it copes with CPU shortfalls.
Why does 5*m2.4xlarge appear four times? Because I repeated the tests many times, on different days and at different times of day. As you can see, the system could handle a different load depending on the date and time. I guess this is because the Tigase instances shared CPU power with other tenants' services; when those were busy, Tigase was starved of CPU.
That said, I think an installation of up to 10k online users should be fine. However, other factors such as roster size matter greatly, as they affect traffic and load. Likewise, any other elements that generate significant traffic will put load on your system.
In any case, without some tests it is impossible to tell how your system really behaves or whether it can handle the load.
And on the last question, regarding the component:
Of course Tigase supports XEP-0114 and XEP-0225 for connecting external components, so components written in other languages should not be a problem. That said, I recommend using Tigase's API for writing components. They can be deployed either as internal Tigase components or as external components, and this is transparent to the developer; you do not have to worry about it at development time. This is part of the API and framework.
Also, you get all the goodies of the Tigase framework: scripting capabilities, monitoring, statistics, and much easier development, since you can easily deploy your code as an internal component for tests.
You really do not have to worry about any XMPP-specific details; you just fill in the body of the processPacket(...) method and that's it.
There should be enough online documentation for all of this on the Tigase website.
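For the asker's planned Python component, a minimal XEP-0114 sketch using SleekXMPP's ComponentXMPP might look like the following; the JID, secret, host, and port are placeholders that must match the external component configured on the Tigase side:
```python
# Sketch of the asker's planned external component via XEP-0114, using
# SleekXMPP's ComponentXMPP. JID, secret, host and port are placeholders.
from sleekxmpp.componentxmpp import ComponentXMPP

class WorkerComponent(ComponentXMPP):
    def __init__(self):
        super(WorkerComponent, self).__init__(
            "worker.example.com",   # component JID registered with Tigase
            "secret",               # shared secret
            "xmpp.example.com",     # Tigase server host
            5275,                   # external component port
        )
        self.add_event_handler("message", self.on_message)

    def on_message(self, msg):
        # Placeholder for the component's real 'work'.
        msg.reply("done: %s" % msg["body"]).send()

if __name__ == "__main__":
    xmpp = WorkerComponent()
    if xmpp.connect():
        xmpp.process(block=True)
```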
Also, I would suggest reading about Python's support for multi-threading and how it behaves under very high load. It has historically not been great.

How many users can a single small Windows Azure web role support?

We are creating a simple website, but with heavy user logins (about 25,000 concurrent users). How can I calculate the number of instances required to support this?
Load testing and performance testing are really the only way you're going to figure out the performance metrics and instance requirements of your app. You'll need to define "concurrent users" - does that mean 25,000 concurrent transactions, or does that simply mean 25,000 active sessions? And if the latter, how frequently does a user visit web pages (e.g. think-time between pages)? Then, there's all the other moving parts: databases, Azure storage, external web services, intra-role communication, etc. All these steps in your processing pipeline could be a bottleneck.
Don't forget SLA: Assuming you CAN support 25,000 concurrent sessions (not transactions per second), what's an acceptable round-trip time? Two seconds? Five?
When thinking about instance count, you also need to consider VM size in your equation. Depending again on your processing pipeline, you might need a medium or large VM to support specific memory requirements, for instance. You might get completely different results when testing different VM sizes.
You need to have a way of performing empirical tests that are repeatable and remove edge-case errors (for instance: running tests a minimum of 3 times to get an average; and methodically ramping up load in a well-defined way and observing results while under that load for a set amount of time to allow for the chaotic behavior of adding load to stabilize). This empirical testing includes well-crafted test plans (e.g. what pages the users will hit for given usage scenarios, including possible form data). And you'll need the proper tools for monitoring the systems under test to determine when a given load creates a "knee in the curve" (meaning you've hit a bottleneck and your performance plummets).
Final thought: Be sure your load-generation tool is not the bottleneck during the test! You might want to look into using Microsoft's load-testing solution with Visual Studio, or a cloud-based load-test solution such as Loadstorm (disclaimer: Loadstorm interviewed me about load/performance testing last year, but I don't work for them in any capacity).
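As one concrete way to script such a repeatable, ramped test, here is a minimal sketch using the open-source Locust tool (my own pick, not one of the tools named above); the paths, credentials, and think-times are placeholders from a hypothetical test plan:
```python
# Placeholder test plan: paths, credentials and think-times are illustrative.
from locust import HttpUser, task, between

class PortalUser(HttpUser):
    wait_time = between(5, 15)   # think-time between pages, in seconds

    @task(3)
    def browse(self):
        self.client.get("/")

    @task(1)
    def login(self):
        self.client.post("/login", data={"user": "test", "password": "test"})

# Run with: locust -f loadtest.py --host https://your-app.example.com
# then ramp users methodically and watch for the "knee in the curve".
```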
EDIT June 21, 2013 Announced at TechEd 2013, Team Foundation Service will offer cloud-based load-testing, with the Preview launching June 26, coincident with the //build conference. The announcement is here.
No one can answer this question without a lot more information... like what technology you're using to build the website, what happens on a page load, what backend storage is used (if any), etc. It could be that for each user who logs on, you compute a million digits of pi, or it could be that for each user you serve up static content from a cache.
The best advice I have is to test your application (either in the cloud or equivalent hardware) and see how it performs.
It all depends on the architecture design, persistence technology and number of read/write operations you are performing per second (average/peak).
I would recommend to look into CQRS-based architectures for this kind of application. It fits cloud computing environments and allows for elastic scaling.
I was recently at a Cloud Summit where there were a few case studies. The one that sticks in my mind is an exam app: it has a burst load of about 80,000 users over 2 hours, for which they spin up about 300 instances (roughly 270 users per instance).
Without knowing your load profile it's hard to add more value; just keep in mind that concurrent and continuous are not the same thing. Remember the Stack Overflow versus Digg debacle "http://twitter.com/#!/spolsky/status/27244766467"?

Is on-demand elasticity the only major feature of cloud computing that cannot be easily found with traditional hosting?

I am trying to compare cloud computing (on EC2) against traditional hosting on the following grounds to determine whether any of these features present unique benefits in the world of cloud computing versus more traditional hosting strategies:
Real-time monitoring
Server virtualization
Deployment automation
High performance computing
On-demand elasticity
As far as I can see: (1) monitoring is just as easy in both areas; (2) server virtualization is also present in both areas, thanks to server farms that let traditional hosts beef up resources at will, and of course the same applies in the cloud; (3) deployment can be equally automated in both areas, since the same tools can often be applied to both; (4) in the area of high-performance computing, maybe you get an extra boost from the cloud in theory, but I'm not so sure: you have to pay for that boost whether it's the cloud or not; (5) elasticity is the only real benefit I can see of moving to the cloud, since resources can be pumped up at the flick of a switch.
So my question is: is elasticity really the only item on this list that offers a real benefit over traditional hosting, or is my analysis flawed?
The main difference here is the cost model. While it's true you can gain all of the same benefits from your list with both Cloud Computing and traditional hosting, you pay up front for traditional hosting. You have to buy and maintain your own servers, while cloud computing allows you to pay a variable cost.
This is the reason cloud computing is so attractive for startup companies.
Not only do you have elasticity, but you have, in theory at least, a greater total amount of resources available than you could have with any static hosting solution.
Also, a side effect of elasticity is decreased electricity usage, which may or may not be a factor for you.
The company I work for is getting ready to move from self-hosting to a cloud provider (EC2). One thing I am greatly looking forward to is not having to worry about managing hardware. I don't need to worry about lead time for ordering parts. The need to have spare parts on-hand to cover unexpected hardware failures is gone. I don't need to worry about UPS or any power. We aren't big enough for cooling to be a concern... but now we never will have to worry about that either.
Depending on your own datacenter costs, a cloud computing platform can be much cheaper, as you don't need anybody to manage physical devices. Cloud services can provide bulk computing resources at likely a lower cost than you can provide if you bought the machines and hooked them up yourself.
Assuming your "traditional hosting" involves a single server, there is a very real benefit to high-performance computing in cloud / grid environments. Specifically, virtually unlimited performance, since you can have n cores working at the same time, whereas with a single server, you are limited by the maximum server capacity.
To put it more clearly, if the most powerful computer in the world is a 1000 - core system with 20 terabytes of RAM, then that's the most power you could have on a hosted server. However, a cloud consisting of 100 of these machines could do 100x the work in almost the same amount of time.
Additionally, it's generally less expensive (financially) to distribute work across multiple smaller machines than it is to get one powerful system capable of doing the same work.
And if you'd like to talk about disaster recovery: clouds can be geographically distributed, meaning that if a tornado rips your data center out of the ground, shreds the server into little pieces of metal and plastic, and embeds them in telephone poles, you experience only a slight dip in performance, because your other 99 servers are still operating.
Elasticity of computing, storage, and network capacity is just a feature, yet it brings a huge number of economic benefits to companies. For example, by implementing a cloud-bursting scenario, a small SaaS company could easily and cheaply handle traffic and usage spikes that might take an expensive hosted solution down.
Elasticity is only useful if you have a problem that can be solved horizontally. For example, take a web server serving a static site: if the load increases, you add more web servers serving the exact same content. On the other hand, even a simple blog site breaks under that scenario, because comments entered into one server's database are not reflected on the other machines.
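A toy illustration of that failure mode, with pure-Python stand-ins rather than real servers (shared_db plays the role of a shared database behind stateless web servers):
```python
# Toy illustration: shared_db stands in for a shared database; Server for a
# web server behind a round-robin load balancer. No real servers involved.
import itertools

shared_db = []

class Server:
    def __init__(self, use_shared):
        self.local_db = []
        self.db = shared_db if use_shared else self.local_db

    def post_comment(self, text):
        self.db.append(text)

    def read_comments(self):
        return list(self.db)

def demo(use_shared):
    servers = [Server(use_shared), Server(use_shared)]
    balancer = itertools.cycle(servers)       # round-robin "load balancer"
    next(balancer).post_comment("first!")     # write lands on server 1
    return next(balancer).read_comments()     # read is served by server 2

print(demo(use_shared=False))  # [] -- the comment is invisible on server 2
print(demo(use_shared=True))   # ['first!'] -- shared state scales horizontally
```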
Having the resources to scale is not the same thing as having the ability to scale. Cloud computing will not solve scalability issues in your application.
A good example of this is a video hosting site: using AWS to deliver the videos results in a disappointing experience, since EC2 cannot deliver the IOPS necessary for video. Throwing more machines at the problem won't fix how the data gets from disk to network. (Yes, I'm aware of the ridiculously expensive high-IOPS instances.)
