On memory intensive on-demand compute

On memory intensive on-demand compute - aws-lambda

We are considering using azure functions to run some compute on. However our computes could take up lots of memory, lets say more than 5GB.
As I understand there is no easy way to scale azure functions based on memory usage. Ie If you reach 15GB start a new instance (since you don't want it to run over the maximum memory of your instance)?
Is there a way around this limitation?
OR
Is there another technical alternative to azure functions that provide pay per use and allows rapid scaling on demand?

Without any further details about what you are actually trying to compute, it is very hard to give you any meaningful advice.
But there are a few things you could consider.
For example, if you are processing CSV files and you know you need 1GB of memory to process a file, then "spin up" a AWS Lambda per file. You could use serverless orchestration like AWS Step Function to coordinate your processing. This way you are effectively splitting up your compute intensive tasks.
Another option would be to automate "pay per use" yourself. You could use automation tools like Terraform to start a EC2 spot instance, run your computing task and after it finishes just shut down the EC2 instance. That is more or less pay as you go with a bit of operations overhead on your part.
There are also other services like AWS Fargate, which are marketed as "Serverless compute for containers", allowing you to run Docker containers in a pay per use manner. Fargate allows provisioning of up to 30GB of memory.
Another option would be to use services like ElastiCache or Memcached to "externalize" your memory.
Which of these options would be the best for you is hard to tell, because it depends on your constraints: do you need everything to be stored in the "same memory" or can it be split up etc, what are your data structures, can it be processed in chunks (to minimise memory usage), is latency important, how long does it take, how often do you need to run your tasks, etc...
Note: There are probably equivalent Azure services to the AWS services I talked about in this answer.

Related

Distributed calculation on Cloud Foundry with help of auto-scaling

I have some computation intensive and long-running task. It can easily be split into sub-tasks and also it would be kind of easy to aggregate the results later on. For example Map/Reduce would work well.
I have to solve this on Cloud Foundry and there I want to get advantage from autos-caling, that is creation of additional instances due to high CPU loads. Normally I use Spring boot for developing my cf apps.
Any ideas are welcome of how to divide&conquer in an elastic way on cf. It would be great to have as many instances created as cf would do, without needing to configure the amount of available application instances in the application. Also I need to trigger the creation of instances by loading the CPUs to provoke auto-scaling.

I have to solve this on Cloud Foundry
It sounds like you're on the right track here. The main thing is that you need to write your app so that it can coexist with multiple instances of itself (or perhaps break it into a primary node that coordinates work and multiple worker apps). However you architect the app, being able to scale up instances is critical. You can then simply cf scale to add or remove nodes and increase capacity.
If you wanted to get clever, you could set up a pipeline to run your jobs. Step one would be to scale up the worker nodes of your app, step two would be to schedule the work to run, step three would be to clean up and scale down your nodes.
I'm suggesting this because manual scaling is going to be the simplest path forward (please read on for why).
and there I want to get advantage from autos-caling, that is creation of additional instances due to high CPU loads.
As to autoscaling, I think it's possible but I also think it's making the problem more complicated than it needs to be. Auto scaling by CPU on Cloud Foundry is not as simple as it seems. The way Linux reports CPU usage, you can exceed 100%, it's 100% per CPU core. Pair this with the fact that you may not know how many CPU cores are on your Cells (like if you're using a public CF provider), the fact that the number of cores could change over time (if your provider changes hardware), and that makes it's difficult to know at what point you should scale your application.
If you must autoscale, I would suggest trying to autoscale on some other metric. What metrics are available, will depend on the autoscaler tool you are using. The best would be if you could have some custom metric, then you could use work queue length or something that's relevant to your application. If custom metrics are not supported, you could always hack together your own autoscaler that does work with metrics relevant to your application (you can scale up and down by adjusting the instance cound of your app using the CF API).
You might also be able to hack together a solution based on the metrics that your autoscaler does provide. For example, you could artificially inflate a metric that your autoscaler does support in proportion to the workload you need to process.
You could also just scale up when your work day starts and scale down at the end of the day. It's not dynamic, but it simple and it will get you some efficiency improvements.
Hope that helps!

Best practices for use of Neo4j on Google Compute Engine / Amazon EC2 instances

There is a very nice guide on optimizing linux machine for Neo4j. But this guide assumes the typical characteristics of a physical hard drive. I am running my Neo4j instances on Google CE and Amazon EC2. I am unable to find any document detailing an optimal setup for these virtual machines. What resources do I need in terms of memory (for heap or extended use) and disk speed / IOPS to get an optimal performance? I currently have a couple of million nodes and about ten million relationships (2 GBs) and the data size is increasing with imports.
On EC2 I used to rely on SSD scratch disks and then make regular backups to permanent disks. There is no such thing available on Compute Engines, and the write speeds don't seem very high to me, at least at normal disk sizes (because speed changes with size). Is there any way to get a reasonable performance on my import/index operations? Or maybe these operations have more to do with memory and compute capacities?
Any additional reading is welcome...

Use local disks whenever possible, SSDs are better than other, try provisioned ops on AWS.
EBS is not a good fit, it is slow and jittery.
No idea for compute engine though, you might want to use more RAM and try to load larger parts of the graph into memory then.
Additional reading: http://structr.org/blog/neo4j-performance-on-ext4
You still should check the other things mentioned in that blog post. Like Linux scheduler, write barriers etc.
Better to set those memory mapping settings manually. And for the 2nd level caches probably check out the enterprise version with the hpc cache.
See also this webinar: https://vimeo.com/46049647 on hw-sizing

AWS RDS Provisioned IOPS really worth it?

As I understand it, RDS Provisioned IOPS is quite expensive compared to standard I/O rate.
In Tokyo region, P-IOPS rate is 0.15$/GB, 0.12$/IOP for standard deployment. (Double the price for Multi-AZ deployment...)
For P-IOPS, the minimum required storage is 100GB, IOP is 1000.
Therefore, starting cost for P-IOPS is 135$ excluding instance pricing.
For my case, using P-IOPS costs about 100X more than using standard I/O rate.
This may be a very subjective question, but please give some opinion.
In the most optimized database for RDS P-IOPS, would the performance be worth the price?
or
The AWS site gives some insights on how P-IOPS can benefit the performance. Is there any actual benchmark?
SELF ANSWER
In addition to the answer that zeroSkillz wrote, I did some more research. However, please note that I am not an expert on reading database benchmarks. Also, the benchmark and the answer was based on EBS.
According to an article written by "Rodrigo Campos", the performance does actually improve significantly.
From 1000 IOPS to 2000 IOPS, the read/write(including random read/write) performance doubles. From what zeroSkillz said, the standard EBS block provices about 100 IOPS. Imagine the improvement on performance when 100 IOPS goes up to 1000 IOPS(which is the minimum IOPS for P-IOPS deployment).
Conclusion
According to the benchmark, the performance/price seems reasonable. For performance critical situations, I guess some people or companies should choose P-IOPS even when they are charged 100X more.
However, if I were a financial consultant in a small or medium business, I would just scale-up(as in CPU, memory) on my RDS instances gradually until the performance/price matches P-IOPS.

Ok. This is a bad question because it doesn't mention the size of the allocated storage or any other details of the setup. We use RDS and it has its pluses and minuses. First- you can't use an ephemeral storage device with RDS. You cant even access the storage device directly when using the RDS service.
That being said - the storage medium for RDS is presumed to be based on a variant of EBS from amazon. Performance for standard IOPS depends on the size of the volume and there are many sources stating that above 100GB storage they start to "stripe" EBS volumes. This provides better average case data access both on read and write.
We run currently about 300GB of storage allocation and can get 2k write IOP and 1k IOP about 85% of the time over a several hour time period. We use datadog to log this so we can actually see. We've seen bursts of up to 4k write IOPs, but nothing sustained like that.
The main symptom we see from an application side is lock contention if the IOPS for writing is not enough. The number and frequency you get of these in your application logs will give you symptoms for exhausting the IOPS of standard RDS. You can also use a service like datadog to monitor the IOPS.
The problem with provisioned IOPS is they assume steady state volumes of writes / reads in order to be cost effective. This is almost never a realistic use case and is the reason Amazon started cloud services to fix. The only assurance you get with P-IOPS is that you'll get a max throughput capability reserved. If don't use it, you pay for it still.
If you're ok with running replicas, we recommend running a read-only replica as a NON-RDS instance, and putting it on a regular EC2 instance. You can get better read-IOPS at a much cheaper price by managing the replica yourself. We even setup replicas outside AWS using stunnel and put SSD drives as the primary block device and we get ridiculous read speeds for our reporting systems - literally 100 times faster than we get from RDS.
I hope this helps give some real world details. In short, in my opinion - unless you must ensure a certain level of throughput capability (or your application will fail) on a constant basis (or at any given point) there are better alternatives to provisioned-IOPS including read-write splitting with read-replicas memcache etc.

So, I just got off of a call with an Amazon System Engineer, and he had some interesting insights related to this question. (ie. this is 2nd hand knowledge.)
standard EBS blocks can handle bursty traffic well, but eventually it will taper off to about 100 iops. There were several alternatives that this engineer suggested.
some customers use multiple small EBS blocks and stripe them. This will improve IOPS, and be the most cost effective. You don't need to worry about mirroring because EBS is mirrored behind the scenes.
some customers use the ephemeral storage on the EC2 instance. (or RDS instance) and have multiple slaves to "ensure" durabilty. The ephemeral storage is local storage and much faster than EBS. You can even use SSD provisioned EC2 instances.
some customers will configure the master to use provisioned IOPS, or SSD ephemeral storage, then use standard EBS storage for the slave(s). Expected performance is good, but failover performance is degraded (but still available)
anyway, If you decide to use any of these strategies, I would recheck with amazon to make sure I haven't forgotten any important steps. As I said before, this is 2nd hand knowledge.

online fast computation environment for ruby

I'm writing a ruby program that need raw cpu power (I know ruby is a very bad choice for this!).. but I don't have a powerful computer, so I wanted to rent something online that you pay per hour..
Any idea? Something simple to use yet very powerful, with multiple cores. I took a look at amazon ec2, that's a possibility. Anything else, more CPU oriented?

As far as on-demand compute power, Amazon's EC2 is a good choice. You can either pay market rates or you can go to their special discount spot market which is similar, but your instance can and will be terminated without warning when demand picks up again.
It's best to have a system that either uses a persistent EBS drive to save results, or saves them to something like S3 frequently.
If you can parallelize your processing, try and split it across the most cost-effective instance type instead of having to pay a premium for a single instance. For example, the XL Hi-CPU On-Demand Instance gives you 20 compute units for $0.68/hr. compared with the 4XL Cluster Compute Instance which is only 33.5 for $1.60/hr.
Remember that a single Ruby process can only use one CPU core unless you're using a combination of JRuby and threads. You will need to support multiple processes in order to make full use of the machine if this is not the case.

SimpleWorker - I think it's more simpler then EC2.

Prototyping for amazon Ec2

How do people (and start up companies) actually go about prototyping/deploying things on amazon and keep costs reasonable? Last month we were experimenting with some specific applications and running own hadoop cluster and managed to spend almost 1.5k just for tests ? Sure - they have micro instances, but what if you application is so intensive it actually requires a larger instance to even test? So I'd like some input as to how people go about doing this?

Several key issues:
Consider a local testbed for some purposes & consider if a given test really needs EC2. If it's really so hard to wrangle 2-4 machines to use as a testbed for Hadoop, there's a different problem. Get your head around whatever you're going to run, how Hadoop will play a role, and kick the tires on that. In time, you will also want to change your grid, upgrade software, tinker with other ideas, etc. When you go to EC2, you'll have smoothed some rough edges already.
Don't use a larger capacity machine than you need while getting the hang of things. If you're not pushing lots of data or compute cycles through at this stage, don't bother with cluster compute nodes, massive RAM instances, etc. Just focus on getting things set up correctly.
When you are ready to retarget to more powerful machines, try a few different machine setups. Maybe the cluster compute instances will pay off, maybe you don't need that kind of throughput: until you know your bottlenecks, don't overspend.
Be sure to use spot instances frequently during the testing phase. You will typically pay about 50% of the on-demand price.
If you get to a point where you want to pay for on-demand instances, have a separate instance start and stop Hadoop instances as needed - unless you need a big cluster all on cluster compute instances.
Prepare your AMIs to get launched as quickly as possible (under 1 minute) and never leave anything running overnight or over a weekend if it isn't necessary.
Until you get the system set up and running, you're basically paying tuition to learn how to get everything tailored to your needs. Just pay the "tuition" to learn each lesson (configurations, bottlenecks, scaling up, etc.), rather than try to take on everything at once. When you approach it as a series of lessons to be learned, it is less painful to spend the money, but as long as you know what you're about to test and learn, you will also spend money more judiciously.
Finally, compare the $1500 to the labor costs of this learning experience - it probably isn't a big deal in the long run. Once you know that something is going to be a reasonable block of computational effort, it's well engineered, and will finish quickly (albeit on many machines), it isn't so painful to spend money on it. Right now, it's hard to appreciate what you're learning because it doesn't yet benefit your org's goals.

To address cost issue while doing proof-of-concept of using Amazon Cloud.
I created a light-weight Java Application using Amazon AWS API, which creates the amazon cloud instances when I want to run a test on them. Once the test is finished or failed-to-start the application terminates the instances immediately by sending out diagnostic mail.
So, no amazon instance kept running or sitting ideal. Which can happen if you create/terminate manually or through a separate program.

Consider using spot instances. If you overbid, you can be almost sure it won't be terminated. In longer run they have price on a level of reserved instances, but you don't need to pay upfront. I believe you could also schedule the tests for non-peak hours, reaching even better prices, or switch to on-demand if spot instance price exceeds on-demand one - Hadoop should handle it nicely. Check this article about spot instances. It has also references to two other articles that analyze the potential of spot instances.

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio