Is taskset for CPU affinity applicable when trying to use the L2 cache efficiently on a multi-core processor in a virtualised environment like Amazon EC2?
No. Especially on the smaller instances there is heavy CPU sharing, and the hypervisor decides which physical cores your virtual CPUs actually run on, so you're dependent on what the other instances are doing with the CPU.
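For reference, this is the mechanism in question: on a dedicated Linux box you can pin a process to specific cores with the taskset command, or from Python with os.sched_setaffinity. A minimal sketch (Linux-only, core numbers chosen arbitrarily):

    import os

    # Pin the current process (pid 0 = "this process") to CPUs 0 and 1.
    # On bare metal this keeps the process on the same physical cores, which
    # helps L2 cache locality; on a shared EC2 instance these are only vCPUs,
    # and the hypervisor still decides which physical cores they map to.
    os.sched_setaffinity(0, {0, 1})

    print("current affinity:", os.sched_getaffinity(0))

This is equivalent to starting the program with taskset -c 0,1.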
Mesos is advertised as a system that lets you program against your datacenter like it's a single pool of resources (see the Mesos website). But is it really true that you don't need to consider the configuration of the individual machines? Using Mesos, can you request more resources for a task than are available on a single machine?
For example, if you have 10 machines each with 2 cores and 2g of RAM and 20g HD, can you really request 10 cores, 15g of RAM and 100g of disk space for a single task?
If so, how does this work? Is Mesos able to address memory across machines for you, use remote CPUs as if they were local threads, and create a single filesystem from a number of distributed nodes?
How does it accomplish this without suffering from the Fallacies of distributed computing, especially those related to network latency and transport cost?
According to the Mesos architecture documentation, you can't aggregate resources from different slaves (agents / machines) and use them for one task.
As you can see, there is a strict "tasks per agent" arrangement: each task runs within the resources of a single agent.
Their example also says pretty much the same thing:
Let’s walk through the events in the figure.
Agent 1 reports to the master that it has 4 CPUs and 4 GB of memory
free. The master then invokes the allocation policy module, which
tells it that framework 1 should be offered all available resources.
The master sends a resource offer describing what is available on
agent 1 to framework 1. The framework’s scheduler replies to the
master with information about two tasks to run on the agent, using <2
CPUs, 1 GB RAM> for the first task, and <1 CPUs, 2 GB RAM> for the
second task. Finally, the master sends the tasks to the agent, which
allocates appropriate resources to the framework’s executor, which in
turn launches the two tasks (depicted with dotted-line borders in the
figure). Because 1 CPU and 1 GB of RAM are still unallocated, the
allocation module may now offer them to framework 2.
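A rough way to picture the constraint: each offer describes the free resources of a single agent, and a task has to fit entirely inside one offer. The toy sketch below is not the Mesos API, just an illustration of why a 10-CPU task can never be placed on a cluster of 2-CPU agents:

    # Toy illustration (not the Mesos API): offers are made per agent, so a task
    # must fit inside the free resources of one agent.
    offers = [{"agent": f"agent{i}", "cpus": 2, "mem_gb": 2} for i in range(10)]

    def place(task, offers):
        for offer in offers:
            if task["cpus"] <= offer["cpus"] and task["mem_gb"] <= offer["mem_gb"]:
                return offer["agent"]   # fits on this single agent
        return None                     # no single agent can hold it

    print(place({"cpus": 1, "mem_gb": 1}, offers))    # -> agent0
    print(place({"cpus": 10, "mem_gb": 15}, offers))  # -> None: resources cannot be aggregated

If you genuinely need more than one machine's worth of resources, the framework has to split the work into multiple tasks and handle the coordination (and the network costs) itself.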
My tests with standalone (single-threaded) Redis show that load from a number of parallel clients can drive Redis CPU usage to 100% (in my memory cache use case).
Starting it in cluster mode and sharding the content to multiple masters is a possible approach for speeding it up, if persistence is turned on.
I have a configuration without persistence (RDB and AOF both turned off). Would starting multiple masters help performance (still using the same cumulative amount of RAM)?
Redis is single-threaded, so the performance of a standalone instance is limited by the processing power of a single CPU core and the network bandwidth of a single machine. However, Redis is very fast, so normally the bottleneck is network bandwidth, unless you run lots of slow commands/Lua scripts.
If you deploy a Redis cluster on multiple machines, the performance should be improved regardless of whether persistence is turned on or off, since you have more CPU cores and more network bandwidth.
If you deploy a Redis cluster on a single machine (each node listening on a unique port), the performance might be improved. It depends: if the bottleneck is network bandwidth, it won't be improved; if the bottleneck is CPU processing power, it should be. So, in this case, you should run some benchmarks with your specific data, specific environment, and specific commands/Lua scripts.
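If you want to find out where your own bottleneck is, a quick benchmark with your actual data sizes is the most reliable approach. Below is one possible sketch using the redis-py client (assumed to be installed, with a Redis instance on localhost:6379); the bundled redis-benchmark tool gives similar information.

    import time
    import redis  # assumes the redis-py package and a local Redis on port 6379

    r = redis.Redis(host="localhost", port=6379)

    N = 100_000
    payload = b"x" * 100          # adjust to match your real value sizes

    start = time.time()
    pipe = r.pipeline(transaction=False)
    for i in range(N):
        pipe.set(f"key:{i}", payload)
        if i % 1000 == 999:
            pipe.execute()        # pipelining cuts down network round trips
    pipe.execute()
    elapsed = time.time() - start

    print(f"{N / elapsed:.0f} SET ops/sec")

If the server's CPU sits well below 100% while throughput stops scaling, the network is the limit; if one core is pegged, adding masters (and spreading them across cores or machines) should help.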
I am interested in understanding the way in which the hardware resources (CPU, disk, network, etc.) of an AWS physical server are shared between different applications. Have people experienced inexplicable performance changes in services running on AWS that they successfully attributed to another application sharing the physical resources? If so, how did you go about debugging this?
In particular, I am interested in more complicated interactions between the resources, such as CPU->Memory bandwidth. If you run 15 VMs on a single machine, you will surely have worse performance than if you ran 2 VMs.
Perhaps this is a more general question about Xen virtualization, but I don't know if there is some kind of AWS magic happening under the hood that I don't know about.
I am not sure if this is the right forum for this kind of question; if not, it would be helpful if you could point me towards a resource or another forum.
Amazon EC2 instances are not susceptible to "noisy neighbour" problems.
Based upon the Instance Type selected, the EC2 instance receives CPU, memory and (for some instance types) locally attached disk storage. These resources are dedicated to the instance and will not be impacted by other users or other virtual machines. (Exceptions to this are the t1 and t2 instance types.)
Specifically:
The instance is allocated a number of vCPUs. These are provided to the instance and no other instance can use these vCPUs (see note about t1 and t2 below). The EC2 Instance Type page defines a vCPU as:
Each vCPU is a hyperthread of an Intel Xeon core for M4, M3, C4, C3, R3, HS1, G2, I2, and D2.
The instance is allocated an amount of RAM. No other instance can use this RAM. There is no oversubscription of CPU nor RAM.
The instance might be allocated locally-attached disk storage, known as Instance Store or Ephemeral Storage. This disk storage does not persist when the instance is Stopped or Terminated, so only store temporary data or data that is replicated elsewhere.
The instance is allocated network bandwidth that is dedicated to that instance. No other instance can impact this network bandwidth. The network performance is based upon the selected instance type. Basically, larger instances receive more network performance.
None of the above factors are impacted by other instances (virtual machines) running on the same host.
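If you want to confirm what a particular instance type is allocated, the EC2 API can report it. A small sketch using boto3 (assuming the library is installed and credentials are configured; the instance type is just an example):

    import boto3  # assumes boto3 is installed and AWS credentials are configured

    ec2 = boto3.client("ec2", region_name="us-east-1")

    resp = ec2.describe_instance_types(InstanceTypes=["m4.large"])
    info = resp["InstanceTypes"][0]

    print("vCPUs:  ", info["VCpuInfo"]["DefaultVCpus"])
    print("memory: ", info["MemoryInfo"]["SizeInMiB"], "MiB")
    print("network:", info["NetworkInfo"]["NetworkPerformance"])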
t1 and t2 instance types
Exceptions to the above statement are:
t1.micro instances "provide a small amount of consistent CPU resources and allow you to increase CPU capacity in short bursts when additional cycles are available".
t2 instances provide burst capacity based upon a system of CPU Credits. CPU Credits are earned at a constant rate depending upon instance type, and these credits can be used to burst the CPU when necessary.
For both these instance types, I would assume that this burst capacity is shared between instances, so it is possible that CPU burst might be impacted by other instances also wishing to burst. The t2 instances, however, would make this 'fair' by only consuming CPU credits when the CPU did actually burst.
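If you suspect a t2 instance is being throttled, you can watch its credit balance in CloudWatch. A minimal sketch with boto3 (region and instance ID are placeholders):

    from datetime import datetime, timedelta
    import boto3  # assumes boto3 is installed and AWS credentials are configured

    cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

    resp = cloudwatch.get_metric_statistics(
        Namespace="AWS/EC2",
        MetricName="CPUCreditBalance",            # t2-specific metric
        Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],
        StartTime=datetime.utcnow() - timedelta(hours=6),
        EndTime=datetime.utcnow(),
        Period=300,
        Statistics=["Average"],
    )

    for point in sorted(resp["Datapoints"], key=lambda p: p["Timestamp"]):
        print(point["Timestamp"], point["Average"])

A balance that keeps falling towards zero means the instance is bursting more than it earns and will be throttled back to its baseline.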
Dedicated Instances and Dedicated Hosts
Dedicated instances are "Amazon EC2 instances that run in a virtual private cloud (VPC) on hardware that's dedicated to a single customer." Basically, your AWS account will be the only account running instances on that host computer.
A Dedicated Host is a "physical server with EC2 instance capacity fully dedicated to your use. Dedicated Hosts allow you to use your existing per-socket, per-core, or per-VM software licenses, including Windows Server, Microsoft SQL Server, SUSE, Linux Enterprise Server, and so on." Basically, you pay for the entire host computer and then launch individual instances on the host (at no additional charge).
The use of a Dedicated Instance or a Dedicated Host has no impact on resources allocated to each instance. They would receive the same resources as when running as a normal Shared Instance.
I understand the Mesos architecture at a high level, but I'm not clear about the OS-level techniques used to implement resource allocation. For example, if Mesos offers one framework 1 CPU and 400MB of memory, and another framework 2 CPUs and 1GB of memory, how is this actually implemented at the OS level?
tl;dr: Mesos itself doesn't "allocate" any resources at the OS-level. The resources are still allocated by the OS, although Mesos can use OS-level primitives like cgroups to ensure that a task doesn't use more resources than it should.
The Mesos agent at the node advertises that some resources are available at the host (e.g., 4 CPUs and 16GB of RAM) -- either by auto-detecting what is available at the host or because the available resources have been explicitly configured (recommended for production).
The master then offers those resources to a framework.
The framework can then launch a task, using some or all of the resources available at the agent: e.g., the framework might launch a task with 2 CPUs and 8GB of RAM.
The agent then launches an executor to run the task.
How strictly the "2 CPUs and 8GB of RAM" resource limit is enforced depends on how Mesos is configured. For example, if the agent host supports cgroups and the agent is started with --isolation='cgroups/cpu,cgroups/mem', cgroups will be used to throttle the CPU appropriately, and to kill the task if it tries to exceed its memory allocation.
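To make the cgroups part concrete, here is roughly what an isolator ends up doing for a "2 CPUs and 8GB of RAM" task on a cgroups-v1 host. This is a simplified sketch of the OS mechanism, not Mesos source code, and it shows the hard CFS-quota variant; depending on configuration Mesos may instead use relative CPU shares. It needs root, and paths differ between distributions:

    import os

    CG = "/sys/fs/cgroup"
    TASK = "demo_task"   # Mesos would use a per-executor cgroup name

    os.makedirs(f"{CG}/cpu/{TASK}", exist_ok=True)
    os.makedirs(f"{CG}/memory/{TASK}", exist_ok=True)

    # "2 CPUs": allow 200ms of CPU time per 100ms period, i.e. two full cores.
    with open(f"{CG}/cpu/{TASK}/cpu.cfs_period_us", "w") as f:
        f.write("100000")
    with open(f"{CG}/cpu/{TASK}/cpu.cfs_quota_us", "w") as f:
        f.write("200000")

    # "8GB of RAM": exceeding this triggers the kernel OOM killer for the group.
    with open(f"{CG}/memory/{TASK}/memory.limit_in_bytes", "w") as f:
        f.write(str(8 * 1024**3))

    # The executor's PID would then be written into the cgroup's "tasks" file,
    # so the limits apply to it and everything it forks.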
I'm using a small-size EC2 instance.
It's noticeably slower than my sub-$800 home Linux machine
(an average machine purchased about 6 months ago).
I don't know whether the CPU or the hard disk is the bottleneck.
I wonder if there's a way to tell which.
Yes. If you want to monitor your EC2 instance, consider using Amazon CloudWatch ( http://aws.amazon.com/cloudwatch/ ). This service can monitor your instance's resources, such as CPU utilization, disk I/O, and network traffic. It's also free within the Amazon free tier.
If you're looking for more detailed monitoring, consider the Server Density service ( http://www.serverdensity.com/cloud-monitoring/ ). It can monitor software installed on the server itself, such as the Apache service.
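If you just want a quick on-instance check of whether the CPU or the disk is the busy resource, you can also sample both locally. A sketch using the psutil package (my suggestion, not something the services above require), run while the slow workload is active:

    import psutil  # assumes the psutil package is installed on the instance

    # Sample CPU and disk activity for one minute while the workload runs.
    disk_before = psutil.disk_io_counters()
    cpu = psutil.cpu_percent(interval=60)      # average CPU % over the window
    disk_after = psutil.disk_io_counters()

    read_mb = (disk_after.read_bytes - disk_before.read_bytes) / 1e6
    write_mb = (disk_after.write_bytes - disk_before.write_bytes) / 1e6

    print(f"CPU {cpu:.0f}%, disk read {read_mb:.1f} MB/min, write {write_mb:.1f} MB/min")

A pegged CPU with little disk traffic points at the CPU; low CPU with heavy disk traffic points at the disk (or EBS) as the bottleneck.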