Extremely slow response in EC2 (with less than 100 requests) - performance

I have an EC2 instance; everything was going fine, but for the last few days my website (https://www.cinescondite.com) has had a very slow response. I didn't change anything, and the site is well optimized. I saw that I have very high network rates, but there are so many stats that I don't know which ones really matter or which ones are affecting my site. I have a Bitnami WordPress installation on my instance.
AWS Stats:
Chrome console stats:

It looks like you are using one of the t-class instance types, which rely on burstable CPU credits, and you ran out (for details, see the AWS documentation on burstable credits). You can see this in the "CPU Credit Balance" metric, which is the last graph.
You need to switch to a fixed-performance instance type or choose a larger size. If you are using a t3.large or larger, try switching to an m5 of the same size; the difference between t3.large and m5.large is only $10 a month.
If you are using a t3.medium or smaller, then try stepping up to the next size.
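If you want to confirm the diagnosis before resizing, you can pull the CPUCreditBalance metric from CloudWatch yourself. Here is a minimal sketch using boto3; the region, instance ID, and time window are placeholders you'd replace with your own.

# Sketch: fetch the CPUCreditBalance metric for one instance (placeholder region/instance ID).
import boto3
from datetime import datetime, timedelta, timezone

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")
resp = cloudwatch.get_metric_statistics(
    Namespace="AWS/EC2",
    MetricName="CPUCreditBalance",
    Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],
    StartTime=datetime.now(timezone.utc) - timedelta(days=3),
    EndTime=datetime.now(timezone.utc),
    Period=3600,                 # one data point per hour
    Statistics=["Average"],
)
for point in sorted(resp["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], point["Average"])   # a balance stuck near zero means you are being throttled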
Another way to improve performance is to offload content delivery to a CDN (content delivery network). By doing this, your server will not have to work as hard, and your user experience will be improved by the delivery speed of the CDN.
One option is to use the W3 Total Cache WordPress plugin.
And here is an article with step-by-step instructions on using Total Cache with AWS S3 and a CDN.
Note: I have not stepped through the instructions to see if they are correct; however, on reading them they appear to be complete.

Related

Freezing while downloading large datasets through Shodan?

I'm using Shodan's API through the Anaconda Terminal on Windows 10 to get data for the query below, but after a few seconds of running, the ETA timer freezes and my network activity drops to zero. Hitting Ctrl+C when this happens gets it moving again for a few seconds, but it stops again soon after.
shodan download --limit 3100000 data state:"wa"
Also, when it is running, the download speed seems pretty slow, and I wanted to ask if there is any way to speed it up. My university's internet is capable of upwards of 300 Mbps, but the download seems to cap at 5 Mbps.
I don't know how to solve either of these issues; my device has enough space and my internet isn't disconnecting. We've tried running the Anaconda Terminal as an Administrator, but that hasn't helped either.
I am not familiar with the specific website, but in general, limited speeds or stalled downloads like this are not caused by things 'on your side', such as the university connection or even your download script.
Odds are that the website is protecting itself and that you need to use the API differently (for example, with a different account), or that you are hitting usage limits tied to your account.
The best course of action may be to contact the website and ask them how to do this.
I heard back from Shodan support; cross-posting some of their reply here:
The API is not designed for large, bulk export of data. As a result, you're encountering a few problems/limits:
There is a hard limit of 1 million results per search query. This means that it isn't possible to download all results for the search query "state:wa".
The search API performs best on the first few pages and progressively responds slower the deeper into the results you get. This means that the first few pages return instantly, whereas the 100th page will potentially take 10+ seconds.
You can only send 1 request per second, so you can't multiplex/parallelize the search requests.
A lot of high-level analysis can be performed using search facets.
There's documentation on facets (for returning summary information from their API) in the shodan.pdf booklet floating around their site.
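For example, here is a minimal sketch using the count() call from the official shodan Python library with a couple of facets; the API key and the facet choices are placeholders. It returns summary breakdowns without paging through millions of results.

# Sketch: get summary statistics via facets instead of a bulk download (placeholder API key).
import shodan

api = shodan.Shodan("YOUR_API_KEY")
# count() returns totals and facet breakdowns without downloading individual banners
result = api.count('state:"wa"', facets=[("port", 10), ("org", 10)])
print("Total results:", result["total"])
for facet, buckets in result["facets"].items():
    print(facet)
    for bucket in buckets:
        print("  ", bucket["value"], bucket["count"])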

Fast webpage delivery on shared hosting

I have a website (.org) for a project of mine, built on LAMP and hosted on a shared plan.
It started very small, but I have since extended this community to other states (in the US) and it's growing fast.
About 4 months ago I had 30,000 or so visits per day and my site was doing fine; today I reached 100,000 visits.
I want to make sure my site loads fast for everyone, but since it's not making any money I can't really move it to a private server (it's volunteer work).
Here's my setup:
- Apache 2
- PHP 5.1.6
- MySQL 5.5
I have 10 pages PER state, and on each page people can contribute, write articles, like, share, etc. On a few pages I can hit 10,000 per hour during lunch time, and the rest of the day it's quiet.
All databases are set up properly (I personally paid a DBA expert to build the code). I am pretty sure the code is also good. Now, I could make pages faster if I used memcached, but the problem is that I can't use it since I am on shared hosting.
Will MySQL be able to support that many people, with lots of requests per minute, or should I create a fund to move to a private server and install all the tools I need to make it fast?
Thanks
To be honest, there's not much you can do on shared hosting. There's a reason why these plans are cheap: they restrict exactly the kind of things you want to do.
Either you move to a VPS that allows memcached (they can be cheap) and put up some Google ads, OR you keep going on your shared hosting with a pre-generated content system.
A VPS can be very cheap (look for coupons) and you can install whatever you want since you are root.
For example, hostmysite.com with the coupon 50OffForLife is $20 per month for life ... versus $5 shared hosting ...
If you want to keep the current hosting, then what you can do is this:
Pages are generated by a process (a cron job, or on the fly) every time someone writes a comment or makes an update. This process starts, fetches all the data for the page, and saves it as a static web page.
So let's say you have a page with comments: grab the contents (meta, h1, p, etc.) and the comments, and save both into the file.
Example (using .htaccess to route the pretty URL to the script - based on your answer you are familiar with this), for /topic/1/:
// serve the cached file if it exists; otherwise build it from the DB
// ($pdo and render_page() are placeholders for your own DB handle and templating code)
$cacheFile = '/my/public_html/topic/1/index.html';
if (is_file($cacheFile)) {
    readfile($cacheFile);
} else {
    $page     = $pdo->query('SELECT * FROM pages WHERE page_id = 1')->fetch();
    $comments = $pdo->query('SELECT * FROM comments WHERE page_id = 1')->fetchAll();
    $content  = render_page($page, $comments);   // meta, h1, p, comments, ...
    file_put_contents($cacheFile, $content);
    echo $content;
}
Or something along these lines.
Therefore, serving the static HTML will be very fast since you don't have to call the DB at all; it just loads the file once it's generated.
I know I'm stepping onto shaky ground by providing an answer to this question, but I think it is quite instructive.
Pat R Ellery didn't provide enough details to do any kind of assessment, but the good news is that there can never be enough details. The explanation is quite simple: anybody can build as many mental models as they want, but the real system will always behave a bit differently.
So Pat, do test your system all the time, as much as you can. What you are trying to do is capacity planning for your solution.
You need the following:
Capacity test - To determine how many users and/or transactions a given system will support and still meet performance goals.
Stress test - To determine or validate an application’s behavior when it is pushed beyond normal or peak load conditions.
Load test - To verify application behavior under normal and peak load conditions.
Performance test - To determine or validate speed, scalability, and/or stability.
See details here:
Software performance testing
Types of Performance Testing
In other words (and a bit simplistically): if you want to know whether your system is capable of handling N requests per time_period, simulate N requests per time_period and see the result.
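As a toy illustration of that idea (not a substitute for the tools listed below), here is a small Python sketch that fires N requests at a placeholder URL and reports how they fared.

# Toy load-generation sketch: N requests against a placeholder URL, then a summary.
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

URL = "http://your-dev-server.example/page"   # placeholder target
N = 100                                       # requests per test run

def hit(_):
    start = time.time()
    try:
        with urllib.request.urlopen(URL, timeout=10) as resp:
            resp.read()
        return True, time.time() - start
    except Exception:
        return False, time.time() - start

with ThreadPoolExecutor(max_workers=10) as pool:
    results = list(pool.map(hit, range(N)))

times = [t for ok, t in results if ok]
print(f"{len(times)}/{N} requests succeeded")
if times:
    print(f"average {sum(times) / len(times):.3f}s, worst {max(times):.3f}s")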
There are a lot of tools available:
Load Tester LITE
Apache JMeter
Apache HTTP server benchmarking tool
See list here

Complete EC2 spot price history

I am using the API to get EC2 spot price history, but I cannot get anything except the last 90 or so days, and I cannot specify the frequency of observations. Is there a way to get a complete history of spot prices, preferably at minute or hourly frequency?
While not explicitly documented for the DescribeSpotPriceHistory API action, this restriction is at least mentioned for the AWS Management Console (which uses that API in turn); see Viewing Spot Price History:
You can view the Spot Price history over a period from one to 90 days based on the instance type, the operating system you want the instance to run on, the time period, and the Availability Zone in which it will be launched.
Since anybody could have retrieved and logged the entire spot price history ever since this API became available (and without a doubt quite a few users and researchers have done just that; the AWS blog even listed some dedicated third-party AWS tracking sites, though these all appear defunct at first sight), this restriction admittedly seems a bit arbitrary. It is certainly pragmatic from a strictly operational point of view, though: you have all the information you need to base future bids upon (especially given that AWS has so far only ever reduced prices, and does so regularly, much to the delight of its customers).
Likewise, there's no option to change the frequency, so you'd need to resort to client-side code for the hourly aggregation.
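For example, here is a minimal sketch of that client-side approach with boto3; the region, instance type, and time window are placeholders. It pages through DescribeSpotPriceHistory and collects the raw points for later aggregation.

# Sketch: pull the raw (up to ~90 days) spot price history via boto3 (placeholder filters).
import boto3
from datetime import datetime, timedelta, timezone

ec2 = boto3.client("ec2", region_name="us-east-1")
paginator = ec2.get_paginator("describe_spot_price_history")
points = []
for page in paginator.paginate(
    InstanceTypes=["m5.large"],
    ProductDescriptions=["Linux/UNIX"],
    StartTime=datetime.now(timezone.utc) - timedelta(days=90),
    EndTime=datetime.now(timezone.utc),
):
    points.extend(page["SpotPriceHistory"])
print(len(points), "price changes retrieved")
# each item has AvailabilityZone, InstanceType, SpotPrice (a string), and Timestamp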
This website has re-sampled EC2 spot price histories for some regions; you can access them via a simple API directly from your Python script:
http://ec2-spot-prices.ai-mmo-games.de/
I hope this helps.
AWS only provides 90 days of history. And the data is raw, i.e., not normalized by hour or even minute, so there are sometimes holes in the data.
One approach would be to pull the data into an IPython notebook and use pandas' excellent time-series tools to resample by minute, 5 minutes, etc. Here's a short tutorial:
https://medium.com/cloud-uprising/the-data-science-of-aws-spot-pricing-8bed655caed2
here are more details on using pandas for time-series resampling:
http://pandas.pydata.org/pandas-docs/stable/timeseries.html
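As a small sketch of that resampling step, assuming you already have the raw history with the Timestamp/SpotPrice fields that DescribeSpotPriceHistory returns (e.g. in a list of dicts called points):

# Sketch: normalize raw spot price points to an hourly series with pandas.
import pandas as pd

df = pd.DataFrame(points)                     # points: list of dicts with Timestamp/SpotPrice
df["SpotPrice"] = df["SpotPrice"].astype(float)
df = df.set_index(pd.to_datetime(df["Timestamp"])).sort_index()
hourly = df["SpotPrice"].resample("1h").mean().ffill()   # use "5min" for 5-minute bins
print(hourly.tail())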
hope that helps...

Web Development: What page load times do you aim for?

Website Page load times on the dev machine are only a rough indicator of performance of course, and there will be many other factors when moving to production, but they're still useful as a yard-stick.
So, I was just wondering what page load times you aim for when you're developing?
I mean page load times on Dev Machine/Server
And, on a page that includes a realistic quantity of DB calls
Please also state the platform/technology you're using.
I know that there could be a big range of performance across the actual machines out there; I'm just looking for rough figures.
Thanks
Less than 5 sec.
If it's just on my dev machine I expect it to be basically instant. I'm talking 10s of milliseconds here. Of course, that's just to generate and deliver the HTML.
Do you mean that, or do you mean complete page load/render time (HTML download/parse/render, image downloading/display, CSS downloading/parsing/rendering, JavaScript download/execution, Flash download/plugin startup/execution, etc.)? The latter is really hard to quantify because a good bit of that time will be burnt up on the client machine, in the web browser.
If you're just trying to ballpark decent download + render times with an untaxed server on the local network then I'd shoot for a few seconds... no more than 5-ish (assuming your client machine is decent).
Tricky question.
For a regular web app, you don't want your page load time to exceed 5 seconds.
But let's not forget that:
the 20%-80% rule applies here; if it takes 1 sec to load the HTML code, total rendering/loading time is probably 5-ish seconds (like fiXedd stated).
on a dev server, you're often not dealing with the real deal (traffic, DB load and size - number of entries can make a huge difference)
you want to take into account the way users want your app to behave. 5 seconds load time may be good enough to display preferences, but your basic or killer features should take less.
So in my opinion, here's a simple method to get rough figures for a simple web app (using, for example, Spring/Tapestry):
Sort the pages/actions according to your app profile (which pages should be lightning fast?) and give them a rough figure for the production environment.
Then take into account the browser loading/rendering stuff. Dividing by 5 is a good start, although you can use best practices to reduce that time.
Think about your production environment (DB load, number of entries, traffic...) and take an additional margin.
You've got your target load time on your production server; now it's up to you and your dev server to think about your target load time on your dev platform :-D
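To make that concrete with purely illustrative numbers: if a killer feature should load in about 2 seconds total in production, dividing by 5 for the browser loading/rendering share leaves roughly 0.4 s for generating the HTML; after adding a margin for production DB size and traffic, you might then aim for something like 0.2-0.3 s on the dev server.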
One of the most useful benchmarks we use for identifying server-side issues is the "internal" time taken from request-received to response-flushed by the web server itself. This means ignoring network traffic / latency and page render times.
We have some custom components (.net) that measure this time and inject it into the HTTP response header (we set a header called X-Server-Response); we can extract this data using our automated test tools, which means that we can then measure it over time (and between environments).
By measuring this time you get a pretty reliable view into the raw application performance - and if you have slow pages that take a long time to render, but the HTTP response header says it finished its work in 50ms, then you know you have network / browser issues.
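The components described above are .NET-specific; as a language-agnostic sketch of the same idea, here is roughly what it could look like as Python WSGI middleware (the X-Server-Response header name comes from the answer; everything else is hypothetical).

# Sketch: inject the server-side processing time into a response header, WSGI-style.
import time

class ServerTimingMiddleware:
    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        start = time.perf_counter()

        def timed_start_response(status, headers, exc_info=None):
            # measured up to the point the headers are sent, approximating
            # the request-received to response-flushed window described above
            elapsed_ms = (time.perf_counter() - start) * 1000
            headers = list(headers) + [("X-Server-Response", f"{elapsed_ms:.1f}ms")]
            return start_response(status, headers, exc_info)

        return self.app(environ, timed_start_response)

An automated test can then read the header from each response and track it over time and across environments.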
Once you push your application into production, you (should) have things like caching, static-file sub-domains, JS/CSS minification, etc. - all of which can offer huge performance gains (especially caching), but which can also mask underlying application issues (like pages that make hundreds of DB calls).
All of which is to say: the value we use for this time is sub-1 second.
In terms of what we offer to clients around performance, we usually use 2-3s for read-only pages, and up to 5s for transactional pages (registration, checkout, upload etc.)

How many users can an Amazon EC2 instance serve?

The use will be to serve dynamic content from data on S3. You can make up any definition of "normal" you think is normal.
What about small, medium, and large instances?
Ok. People want some data to work with, so here:
The web service is about 100 KB at start, and uses AJAX, so it doesn't have to reload the whole page much, if at all. When it loads the page, it will send between 20-30 requests to the database (S3) to get small chunks of text (like comments). The average user will stay on the page for 10 min, translating to about 100 KB at the outset and about 400 KB more through requests. Assume that hit volume is the same at night and day.
It depends on what you're serving the content with and how, not to mention how often those users will be accessing it, the size and type of the content, etc. There's essentially not one bit of information you've provided that allows us to answer your question in any sort of meaningful way.
As others have said, this might require testing under your exact conditions. Fortunately, if you're willing to go as far as setting up a test version of your server setup, you can spawn instances that simulate users. Create a bunch of these test instances, and run Apache's ab benchmarking tool on them, directing them at your test site. If the instances are within the same availability zone as your test site, you won't be charged for bandwidth, just by the hour for the running instances. Run a test for under an hour, shutting down the test instances afterward, and it will cost you very little to organize this stress test.
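For example (hypothetical hostname and numbers), a single run from one of those test instances might look like:
ab -n 1000 -c 50 http://your-test-site.example/
which sends 1,000 requests at a concurrency of 50 and reports requests per second along with latency percentiles.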
As one data point, running the Apache ab tool locally on my small instance, which is serving up a database-heavy Drupal site, it reported the ability of the server to handle 45-60 requests per second. I'm assuming that ab is a reasonable tool for benchmarking, and I might be wrong there, but this is what I'm seeing.
As a suggestion, not knowing too much about your particular case, I'd move your database to an Elastic Block Store (EBS) volume. S3 is not really intended to host databases, and the latency it has might kill your performance. EBS volumes can easily be snapshotted to S3 for backup, if that's what you're worried about.
One can argue that properly designed, it doesn't matter how many users an instance can support. Ideally, when your instance is saturated, you fire up a new instance to manage the traffic.
Obviously, this grossly complicates the deployment and design.
But beyond that, an EC2 instance is effectively a low-end Linux box (depending on which model you choose).
Let's rephrase the question, how many users do you want to support?
