How to monitor a server's resources/performance - amazon-ec2

I've been given a task to monitor an Amazon EC2 instance's resources/performance. I do not have access to the Amazon Control Panel/Dashboard, but I'm allowed to install free software on the EC2 instance that can track the stats.
I know you need to pay for in-depth/custom charts/graphs in the Amazon Control Panel. Is that the best approach for accurate stats, or is there preferable free software that can track the following stats?
Total used memory and free memory in X amount of time
Total requests made in X amount of time
Total CPU usage in X amount of time

You may want to use a good, basic monitoring service like New Relic. They have both server and application monitoring available that, together, could give you the stats you list. Your first and third items are more server-centric, while your second item is specific to the application you're running (e.g. Apache, NGINX, Postfix, etc.).
Here is a list of other monitoring options.
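If you'd rather script something yourself than sign up for a service, here is a minimal sketch in Python using the psutil library (an assumption on my part; any similar library would do) that covers the memory and CPU items. Request counts would still have to come from your web server's logs:

    # Minimal resource-logger sketch; assumes the third-party psutil
    # library is installed (pip install psutil).
    import time
    import psutil

    INTERVAL_SECONDS = 60  # sample once a minute; adjust to taste

    while True:
        mem = psutil.virtual_memory()         # system-wide memory stats
        cpu = psutil.cpu_percent(interval=1)  # CPU % over a 1-second window
        print(f"used={mem.used} free={mem.available} cpu={cpu}%")
        time.sleep(INTERVAL_SECONDS)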

Related

Live video streaming performance testing for 10k VU

In basic terms, I want to make sure that our livestreaming shows can handle 10,000 viewers at one time without issues, and that the following things are working well:
Video Quality
Video Resolution
Video Latency
Can this be done using a local machine? I have read that a local machine cannot produce such a huge number of requests.
Do I need to purchase an additional premium platform, or can it be done using JMeter alone?
Can this be done using a local machine? We don't know; it depends on your machine's hardware specifications. I would take the following steps:
set up monitoring of the machine's resource consumption (CPU, RAM, etc.). If you don't have better alternatives, you can go for the JMeter PerfMon Plugin
make sure to follow JMeter Best Practices
start with 1 virtual user and gradually increase the load towards 10,000, watching the resource consumption as you go
when any monitored resource starts exceeding a reasonable threshold, e.g. 90% of total available capacity, stop the test and check how many users were online at that point via e.g. the Active Threads Over Time plugin
that is how many users you can simulate for this particular test from this particular machine (see the sketch below). If it's 10,000, you're good to go with a single machine; if it's less, divide 10,000 by the number of users you were able to mimic, and that is the number of machines of that hardware specification you will need for the test
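To illustrate the sizing arithmetic in Python (the per-machine figure is hypothetical; substitute the number your own ramp-up test gives you):

    import math

    target_users = 10_000
    users_per_machine = 2_500  # hypothetical result of your ramp-up test

    # Round up: a partially loaded extra machine is still a machine you need.
    machines_needed = math.ceil(target_users / users_per_machine)
    print(machines_needed)  # -> 4 machines of that hardware specification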
JMeter out of the box can be run in clustered (distributed) mode, so if you have machines you can use as load generators there is no need to purchase anything else. If you don't, you can rent VMs from e.g. MS Azure or AWS EC2 or whatever your favorite cloud provider is. In this case you will need to pay for the machine/computing time according to the vendor's price list.
There are companies which offer "JMeter as a service"; they normally charge more than cloud VM vendors, but you won't need to worry about JMeter distributed configuration, results collection, etc. Examples are BlazeMeter, Flood.io, RedLine13, etc.

What is an ECU (AWS)? What does it mean?

I ran a performance test on my server (1 ECU), but my server only reached 1,000 users in testing. How many ECUs do I need for 15,000 users?
The ECU (Elastic Compute Unit) was a unit of measure designed to provide a relative measure of performance between Amazon EC2 instance types. For example, an m1.small instance had 1 ECU, an m1.large had 2 ECUs, etc.
However, it is no longer possible to summarize the power of an instance in a single number. Some instances have more RAM, some have more CPUs or more powerful CPUs, GPUs, enhanced networking and even burst capabilities.
Therefore, the ECU has slowly disappeared from AWS services and documentation. It can still be viewed as an optional column in the Amazon EC2 Launch Instance console.
The ECU is definitely not a good measure of "the number of users" that a system can support. The number of users a system can support is totally dependent upon the application architecture and its system requirements. When testing the number of users a system can support, closely monitor all system components (e.g. CPU load, RAM utilization, disk queues) to identify the bottleneck. You can then try to modify the application or improve the bottleneck to provide better application performance.
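If you want to watch instance metrics from outside the box during such a test, something like this boto3 sketch could work. The instance ID is a placeholder, and note that the default EC2 CloudWatch metrics cover CPU but not memory, so RAM and disk queues need an agent or OS-level tools:

    from datetime import datetime, timedelta
    import boto3

    cloudwatch = boto3.client("cloudwatch")

    # Average CPU utilization for one instance over the past hour,
    # in 5-minute buckets.
    stats = cloudwatch.get_metric_statistics(
        Namespace="AWS/EC2",
        MetricName="CPUUtilization",
        Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],
        StartTime=datetime.utcnow() - timedelta(hours=1),
        EndTime=datetime.utcnow(),
        Period=300,
        Statistics=["Average"],
    )
    for point in sorted(stats["Datapoints"], key=lambda p: p["Timestamp"]):
        print(point["Timestamp"], point["Average"])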

Policy for EC2 and ELB based on number of transcoding processes on each instance

I need to transcode a massive number of audio files on a series of auto-scaling instances behind an ELB. The core of the transcoding script is based on Node.js and FFmpeg. Queuing is impossible because users are not patient! I need to control the number of transcodings on each instance to avoid the 100% CPU problem.
My questions:
A- Is there any way to define a policy for the ELB to control the number of connections to each instance? If not, is there any parameter to control average CPU utilization on each instance and add a new one after a trigger level is reached? (I have found this slide, but it is not complete.) If it adds a new instance on the fly, how long does it take for the new instance to become fully operational and serve users? (I mean, does auto scaling have a long latency?)
B- Is there an alternative architecture to achieve the same transcoding solution? (I have included my current idea in this question as a drawing.) I cannot use third-party solutions like Transcoding.com; I need to have my own native solution.
C- I use Node.js on each instance and show progress to the user's browser via a socket. From the browser side I regularly send AJAX requests to the Node.js side to get the progress information. Does this mechanism have a problem with sticky sessions?
Thank you.
If your scaling needs to take place in response to individual requests on the server (i.e. a single request would require X machines to execute in the desired timeframe), then autoscaling is probably not going to be the answer for you, as you will have a delay while the new instances become active. You could also pay much more to run the service in this manner, since you could scale up and down a number of times in response to individual requests, incurring the one-hour minimum charge for each instance that is started.
If, however, you are using autoscaling to, for example, increase your fleet 50% during peak times when you get request volume spikes (i.e. you already have many servers serving many requests, but you just need to keep latency down during peak hours by adding more instances), then autoscaling should probably work just fine for you.
There are any number of triggers you can configure to control scaling events in such a case.
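For example, here is a hedged sketch with boto3; the group and policy names are placeholders, and I'm assuming average CPU utilization as the trigger, per your question A:

    import boto3

    autoscaling = boto3.client("autoscaling")
    cloudwatch = boto3.client("cloudwatch")

    # Add one instance when the policy fires; wait 5 minutes between actions.
    policy = autoscaling.put_scaling_policy(
        AutoScalingGroupName="my-asg",
        PolicyName="scale-out",
        AdjustmentType="ChangeInCapacity",
        ScalingAdjustment=1,
        Cooldown=300,
    )

    # Fire the policy when average CPU stays above 70% for two 5-minute periods.
    cloudwatch.put_metric_alarm(
        AlarmName="my-asg-high-cpu",
        Namespace="AWS/EC2",
        MetricName="CPUUtilization",
        Dimensions=[{"Name": "AutoScalingGroupName", "Value": "my-asg"}],
        Statistic="Average",
        Period=300,
        EvaluationPeriods=2,
        Threshold=70.0,
        ComparisonOperator="GreaterThanThreshold",
        AlarmActions=[policy["PolicyARN"]],
    )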
ELB does support session affinity ("sticky" sessions).
You will want to use an AWS SDK. Normally you'd use one of the official ones for C#, Ruby, etc. Since you're on Node.js, try using this SDK on GitHub to monitor, throttle, create instance connection pools, etc.:
https://github.com/awssum/awssum
There's also AWS2JS:
https://github.com/SaltwaterC/aws2js

EC2 for handling demand spikes

I'm writing the backend for a mobile app that does some CPU-intensive work. We anticipate the app will not have heavy usage most of the time but will have occasional spikes of high demand. I was thinking we should reserve a couple of 24/7 servers to handle the steady state of low-demand traffic and then add and remove EC2 instances as needed to handle the spikes. The mobile app will first hit a simple load-balancing server that does a simple round-robin distribution of users among all the available processing servers. The load balancer will handle bringing new EC2 instances up and turning them back off as needed.
Some questions:
I've never written something like this before; does this sound like a good strategy?
What's the best way to handle bringing new EC2 instances up and down? I was thinking I could just create X instances ahead of time, set them up as needed (install software, etc.), and then stop each instance. The load balancer would then start and stop the instances as needed (e.g. through boto; see the sketch after these questions). I think this should be a lot faster and easier than trying to create new instances and install everything through a script. Good idea?
One thing I'm concerned about here is the cost of turning EC2 instances off and back on again. I looked at the AWS Usage Report and had difficulty interpreting it. I could see starting a stopped instance being a potentially costly operation. But it seems like since I'm just starting a stopped instance rather than provisioning a new one from scratch it shouldn't be too bad. Does that sound right?
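Something like this is what I had in mind; a rough sketch using boto3 (the modern successor to boto), with placeholder instance IDs for the pre-configured fleet:

    import boto3

    ec2 = boto3.client("ec2")
    SPIKE_FLEET = ["i-0aaaaaaaaaaaaaaaa", "i-0bbbbbbbbbbbbbbbb"]  # placeholders

    def scale_up():
        # Starting a stopped instance reuses its EBS root volume, so the
        # software installed ahead of time is already in place.
        ec2.start_instances(InstanceIds=SPIKE_FLEET)

    def scale_down():
        # While stopped, you pay only for the EBS storage, not for compute.
        ec2.stop_instances(InstanceIds=SPIKE_FLEET)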
This is a very reasonable strategy. I used it successfully before.
You may want to look at Elastic Load Balancing (ELB) in combination with Auto Scaling. Conceptually the two should solve this exact problem.
Back when I did this around 2010, ELB had some problems with certain types of HTTP requests that prevented us from using it. I understand those issues are resolved.
Since ELB was not an option, we manually launched instances from EBS snapshots as needed and manually added them to an NGINX load balancer. That certainly could have been automated using the AWS APIs, but our peaks were so predictable (end of month) that we just tasked someone with spinning up the new instances and never got around to automating the task.
When an instance is stopped, I believe the only cost you pay is for the EBS storage backing the instance and its data. Unless your instances have a huge amount of data associated with them, the EBS storage charge should be minimal. Perhaps things have changed since I last used AWS, but I would be surprised if this changed much, if at all.
First, with regard to costs: whether an instance is started from scratch or from a stopped state has no impact on cost. You are billed for the compute you use over time, period.
Second, what you are looking to do is called autoscaling. You set up a launch config that specifies the AMI you are going to use (along with any user-data configs, the ELB and availability zones you are going to use, the min and max number of instances, etc.). You set up a scaling group using that launch config. Then you set up scaling policies to determine what scaling actions are going to be attached to the group. You then attach CloudWatch alarms to each of those policies to trigger the scaling actions.
You don't have servers in reserve that you attach to the ELB or anything like that. Everything is based on creating a single AMI that is used as the template for the servers you need.
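As a hedged sketch with boto3 (AMI ID, names, and zones are placeholders; newer deployments would use launch templates, but the flow matches the description above):

    import boto3

    autoscaling = boto3.client("autoscaling")

    # The launch config points at the single AMI used as the template.
    autoscaling.create_launch_configuration(
        LaunchConfigurationName="my-launch-config",
        ImageId="ami-0123456789abcdef0",
        InstanceType="t3.small",
    )

    # The scaling group keeps a steady-state floor and a spike ceiling.
    autoscaling.create_auto_scaling_group(
        AutoScalingGroupName="my-asg",
        LaunchConfigurationName="my-launch-config",
        MinSize=2,
        MaxSize=10,
        AvailabilityZones=["us-east-1a", "us-east-1b"],
        LoadBalancerNames=["my-elb"],
    )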
You should read up on autoscaling at the link below:
http://aws.amazon.com/autoscaling/

How does the Windows Azure platform scale for my app?

Just a question about Azure.
Yes, I know roughly about Azure and cloud computing. I will put it this way:
Say, in the normal way, I build a program listening on a TCP port and run this server program on a server. I also build a client program which connects to the server through the specified port. Once a client is connected, my server program will compute something and return it to the client.
The above is the normal model, or rather, my program's model.
Now I want to use Azure, because I have too many clients, let's say 1 million a day. I don't want to rent 1,000 servers and maintain them. (This is just an assumption for the number of clients.)
I have looked at the Azure pricing plan. It talks about CPU and about small, medium, and large instances.
I don't know what these mean. For example, in my assumed case above, how many instances do I need? Or is the most I can get from Azure an extra-large instance (8 small instances)?
How does Azure scale for my program? If I choose a small instance (my server program is very small; it just computes some data and returns it to clients), will Azure scale for me, or does Azure just give me one virtual server and let it get overloaded?
Please consider CPU only, not storage or network traffic.
You choose two things: what size of VM to run (small, medium, large) and how many of those VMs to run. That means you could choose a small VM (single processor) and run 100 "instances" of it (100 VMs), or you could choose a large VM (eight processors on the same server) and run 10 instances of it (10 VMs).
Today, Windows Azure doesn't automatically adjust your scale, so it's up to you to use the web portal or the Service Management API to increase the number of instances as your need increases.
One factor to consider is whether your app can take advantage of multi-core environments (multi-threading, shared memory, etc.) to improve its scale. If it can, it may be better to use five 2-core (i.e. medium) VMs than ten 1-core (small) VMs. You may find in some cases that two 4-core VMs perform better than five 2-core VMs.
If your app is not parallel/multi-core, then you could just run some number x of small VMs. The charges are linear anyway, i.e. a 2-core VM is twice the cost of a single-core one.
Other factors would include the scratch disk size and the memory available in the VM.
One other suggestion: you may want to look into leveraging Azure queues (i.e. have the client post to the queue and the workers pull from there). This would allow you to transparently (to the client) increase or decrease the workers without worrying about connections, etc. Also, if a processing step failed and crashed your instance, the message would persist and be picked up by one of the other workers.
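If it helps, here is a rough sketch of that queue pattern with today's Python SDK (azure-storage-queue); the connection string and queue name are placeholders:

    from azure.storage.queue import QueueClient

    queue = QueueClient.from_connection_string(
        conn_str="<your-storage-connection-string>",
        queue_name="work-items",
    )

    # Client side: enqueue a unit of work instead of holding a connection open.
    queue.send_message("compute-job-42")

    # Worker side: pull, process, then delete. If a worker crashes before
    # delete_message, the message becomes visible again for another worker.
    for msg in queue.receive_messages():
        print("processing", msg.content)  # do the actual computation here
        queue.delete_message(msg)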
I suggest you also monitor, evaluate, and tune your Azure configuration.
For "Monitoring Applications in Windows Azure" (and performance) please reference
http://channel9.msdn.com/learn/courses/Azure/Deployment/DeployingApplicationsinWindowsAzure/Exercise-3-Monitoring-Applications-in-Windows-Azure/
There is also a good blog entry titled "Visualizing Windows Azure diagnostic data"
Check out http://www.paraleap.com - a simple service for automatically adjusting the number of instances you have according to demand.
