Fast delivery of webpages on shared hosting - performance

I have a website (.org) for a project of mine on LAMP, hosted on a shared plan.
It started very small, but I've now extended this community to other states (in the US) and it's growing fast.
About 4 months ago I had 30,000 or so visits per day and my site was doing fine; today I reached 100,000 visits per day.
I want to make sure my site will load fast for everyone, and since it's not making any money I can't really move it to a private server. (It's volunteer work.)
Here's my setup:
- Apache 2
- PHP 5.1.6
- MySQL 5.5
I have 10 pages PER state, and on each page people can contribute, write articles, like, share, etc... On a few pages I can hit 10,000 visits per hour during lunch time, and the rest of the day it's quiet.
All databases are set up properly (I personally paid a DBA expert to build the code). I am pretty sure the code is also good. Now, I could make pages faster if I used memcached, but the problem is I can't use it since I am on shared hosting.
Will MySQL be able to support that many people, with lots of requests per minute? Or should I create a fund to move to a private server and install all the tools I need to make it fast?
Thanks

To be honest, there's not much you can do on shared hosting. There's a reason why it's cheap... it limits you from doing the kind of thing you want to do.
Either you move to a VPS that allows memcached (they can be cheap) and put up some Google ads, OR you keep going on your shared hosting using a pre-generated content system.
A VPS can be very cheap (look for coupons) and you can install whatever you want since you are root.
For example, hostmysite.com with the coupon 50OffForLife: you pay $20 per month for life... vs. a $5 shared hosting plan...
If you want to keep the current hosting, then what you can do is this:
Pages are generated by a process (cron job or on the fly) every time someone writes a comment or makes an update. This process starts, fetches all the data for the page, and saves it to a static web page.
So let's say you have a page with comments: grab the contents (meta, h1, p, etc.) and the comments, and save both into the file.
Example (using .htaccess - based on your answer you are familiar with this): rewrite /topic/1/ to a PHP script that receives page_id=1, then:

    $pageId    = (int) $_GET['page_id'];
    $cacheFile = "/my/public_html/topic/$pageId/index.html";

    if (file_exists($cacheFile)) {
        readfile($cacheFile); // cached copy: no DB calls at all
    } else {
        $page     = $db->query("SELECT * FROM pages WHERE page_id = $pageId");
        $comments = $db->query("SELECT * FROM comments WHERE page_id = $pageId");
        $content  = renderPage($page, $comments); // your own templating
        file_put_contents($cacheFile, $content);
        echo $content;
    }

Or something along these lines ($db and renderPage() stand in for whatever DB layer and templating you already use).
Serving static HTML this way will be very fast, since you don't have to call the DB at all; the file is just loaded once it has been generated.

I know I'm stepping on unstable ground by providing an answer to this question, but I think it is very indicative.
Pat R Ellery didn't provide enough details to do any kind of assessment, but the good news is there can never be enough details. The explanation is quite simple: anybody can build as many mental models as he wants, but the real system will always behave a bit differently.
So Pat, do test your system all the time, as much as you can. What you are trying to do is capacity planning for your solution.
You need the following:
Capacity test - To determine how many users and/or transactions a given system will support and still meet performance goals.
Stress test - To determine or validate an application’s behavior when it is pushed beyond normal or peak load conditions.
Load test - To verify application behavior under normal and peak load conditions.
Performance test - To determine or validate speed, scalability, and/or stability.
See details here:
Software performance testing
Types of Performance Testing
In other words (and a bit primitively): if you want to know whether your system can handle N requests per time period, simulate N requests per time period and see the result.
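That simulation can be sketched in a few lines of Python (illustrative only - for real tests use dedicated tools such as those listed below; the URL and request counts here are placeholders):

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

def hit(url: str) -> float:
    """Fetch the URL once and return the elapsed time in seconds."""
    start = time.perf_counter()
    with urlopen(url) as resp:
        resp.read()
    return time.perf_counter() - start

def load_test(url: str, n_requests: int = 100, concurrency: int = 10) -> dict:
    """Fire n_requests at the URL with the given concurrency and
    report simple latency statistics."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = sorted(pool.map(hit, [url] * n_requests))
    return {
        "median_s": statistics.median(latencies),
        "p95_s": latencies[int(len(latencies) * 0.95) - 1],
        "max_s": latencies[-1],
    }
```

If the median stays flat but the 95th percentile climbs as you raise the concurrency, you are approaching the capacity limit these tests are meant to find.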
There are a lot of tools available:
Load Tester LITE
Apache JMeter
Apache HTTP server benchmarking tool
See list here


XPages performance - 2 apps on same server, 1 runs and 1 doesn't

We have been having a bit of a nightmare this last week with a business-critical XPage application: all of a sudden it has started crawling really badly, to the point where I have to reboot the server daily, and even then some pages can take 30 seconds to open.
The server has 12GB RAM and 2 CPUs; I am waiting for another 2 to be added to see if this helps.
The database has around 100,000 documents in it, with no more than 50,000 displayed in any one view.
The same database, set up as a training application with far fewer documents on the same server, always responds, even when the main copy is crawling.
There are a number of view panels in this application - I have read these are really slow. Should I get rid of them and replace them with a Repeat control?
There are also Readers fields on the documents containing roles, and Authors fields, as it's a workflow application.
I removed quite a few unnecessary views from the back end over the weekend to help speed it up, but that has done very little.
Any ideas where I can check to see what's causing this massive performance hit? It has only really become unworkable in the last week, but as far as I know nothing in the design has changed, apart from me deleting some old views.
Try to get more info about the state of your server and application.
Hardware troubleshooting is summarized here: http://www-10.lotus.com/ldd/dominowiki.nsf/dx/Domino_Server_performance_troubleshooting_best_practices
Going by your experience - only one of the two applications is slowed down - it is more likely a code problem. The best thing is to profile your code: http://www.openntf.org/main.nsf/blog.xsp?permaLink=NHEF-84X8MU
To go deeper, you can start to look for semaphore locks: http://www-01.ibm.com/support/docview.wss?uid=swg21094630, or look at javadumps: http://lazynotesguy.net/blog/2013/10/04/peeking-inside-jvms-heap-part-2-usage/, NSDs: http://www-10.lotus.com/ldd/dominowiki.nsf/dx/Using_NSD_A_Practical_Guide/$file/HND202%20-%20LAB.pdf, and the garbage collector: Best setting for HTTPJVMMaxHeapSize in Domino 8.5.3 64 Bit.
This presentation gives a good overview of Domino troubleshooting (among many others on the web).
OK, so we resolved the performance issues by doing a number of things. I'll list the changes we made in order of the improvement gained, starting with the simple tweaks that weren't really noticeable.
Defragmented the Domino drive - it was showing as 32% fragmented and I thought I was on to a winner, but it was really no better after the defrag, even though IBM docs say even 1% fragmentation can cause performance issues.
Reviewed all the main code in the application and took out a number of needless lookups where they could be replaced with applicationScope variables. For instance, on the search page one of the drop-downs gets its choices by doing an @Unique lookup on all documents in the database. Changed it to a keyword and put that in the applicationScope.
Removed multiple checks on database.queryAccessRole and put the user's roles in a sessionScope.
The DB had 103,000 documents - 70,000 of them were tiny little docs with about 5 fields on them. They don't need to be indexed by the FTIndex, so we moved them into a separate database and pointed the data source to that DB when those docs were needed. The FTIndex went from 500MB to 200MB = faster indexing and searches, but the overall performance of the app was still rubbish.
The big one - I finally got around to checking the application properties, Advanced tab. I set the following options:
- Optimize document table map (ran a copy-style compact)
- Don't overwrite free space
- Don't support specialized response hierarchy
- Use LZ1 compression (ran a copy-style compact with the -ZU option to convert existing attachments)
- Don't allow headline monitoring
- Limit entries in $UpdatedBy and $Revisions to 10 (as per the Domino documentation)
- And also don't allow the use of stored forms.
Now I don't know which one of these options was the biggest gain, and not all of them will be applicable to your own apps, but after doing this the application flies! It's running like there are no documents in there at all; views load super fast, documents open like they should - quickly - and everyone is happy.
Until the HTTP threads get locked out - that's another question of mine that I am about to post, so please take a look if you have any idea of what's going on :-)
Thanks to all who have suggested things to try.

I have many separate installs of Magento - can I merge them together?

I've just started a new job and we have several installs of Magento, all of different versions!
Now it really seems to me that we need to first upgrade them all and then get them all under one installation of Magento, using one database.
What is the best way (in general terms) of doing this?
Is it even possible, or is my best bet to rebuild the sites under one installation and import the products into it?
There is some talk by a fellow developer that having them under different installs helps with performance. Is this true?
Once we have them all under one install, things like stock control and orders, as well as putting products on multiple sites, should also be very straightforward - correct?
We are talking quite a few stores, say around 15, and quite a few products, around 4,000 or maybe more.
My first suggestion is to consider the reasons why you need all the Magento instances moved under one installation. The reasons are not clear from your question, so the best developer's advice is "Does it work? Then don't touch it" :)
If there are no specific reasons, then you'd better leave it as is. All reorganization processes (upgrading, infrastructure configuration, access setup, etc.) for a software system are hard, costly, time-consuming, error-prone, usually have little value from a business point of view, and are a little boring. This is not a Magento-specific thing; it is just a general characteristic of any software.
Also note that it is the holiday season, so it is better not to do anything with e-commerce stores until the middle of January.
If you see value in a reorganization of your Magento stores, then the best way to do it is to go gradually - step by step, store by store:
Take your most complex store. Prepare everything you need for the further steps - i.e. get the tools ready, write automatic scripts, and go through the process with a copy of the store on a testing server. Write a set of functional tests to cover it with at least smoke tests. You'll have to repeat such light checks many times to be sure the store appears to be working, so the automatic tests will save much time. All these preparations will decrease your downtime.
Close public access to the store.
Upgrade the store to the Magento version you need. Move it to the new infrastructure.
Verify all the user scenarios manually and with automatic tests. Fix the issues, if any.
Open public access to the store. Monitor logs, load reports at the server machines. Fix issues, if any.
Take the next store (let's call it NextStore). Make a copy of it on a sandbox server.
Make a copy of your already converted store (let's call it ConvertedStore) on the sandbox server.
Export all the data from the copy of NextStore and import it into the copy of ConvertedStore. You can use the Magento Dataflow or Import/Export modules to do that. Not all data can be imported/exported with those modules - just Catalog, Orders, and Customers. You will need to develop custom scripts to import/export other entities if you need them.
Verify the result manually and with automatic tests. Write automatic scripts that fix any issues found. You will need those scripts later during the real conversion process.
Close NextStore.
Move it to the new infrastructure, using the already prepared procedures and scripts. Consider whether to close ConvertedStore during the conversion process; it depends on whether you feel it is OK to have it open. For safety reasons it is better to close it.
Verify, that everything works fine. Monitor logs, reports.
Fix issues, if any.
Proceed with the rest of your stores.
That is my (totally personal) view on the procedures.
There is some talk by a fellow developer that having them under different installs helps with performance. Is this true?
Yes, your friend is right: separating Magento (actually, anything in this world) into smaller instances makes it lighter to handle. The performance difference is very small (for your size of 4,000 products), but it is inevitable. Consider that after combining the instances (suppose there are ten of them with 400 products each) you'll be handling data for 10x more customers, reports, products, stores, etc. Therefore any search will have to go through ten times more products in order to return data. Of course it doesn't matter if the search takes 0.00001 seconds, because 0.0001s for the combined instance is OK as well. But some things, like sorting or matching sets, grow non-linearly. Still, as said before, for 4,000 products you won't see a big difference.
Once we have them all under one install, things like stock control and orders as well as putting products on multiple sites should also be very straightforward - correct?
You're right - after combining the stores together, handling orders, stock, and customers will be a much simpler and more straightforward process.
Good luck! :)
The most important thing to consider is what problem you're solving by having all these sites on one Magento "instance". What's more important to your business/team: having these sites share products and inventory, or having the flexibility to modify each site independently? Any downtime or impact to availability may affect all sites.
Further questions/areas of investigation:
How much does the product hierarchy (categories and attributes) differ?
Is pricing the same across each site or different?
Are any of these sites multi-regional and how is pricing handled for each region?
It's certainly possible to run multiple sites on one Magento instance, even if there are some rough edges within the platform.
Since there's no way to export all entities in Magento, there's no functionality to merge stores. You'd have to write custom code - it would have to take all the records from the old store, assign them new IDs while preserving referential integrity, and insert them into the new store (this is what the product import does, but that doesn't exist for categories, orders, customers, etc.).
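The ID-reassignment idea is generic; here is a minimal sketch in plain Python (not Magento API - the row shape and the `parent_id` foreign key are made up for illustration):

```python
def merge_records(existing, incoming, next_id):
    """Merge `incoming` rows into `existing`, assigning fresh primary
    keys and rewriting internal foreign-key references (`parent_id`)
    so referential integrity is preserved. Rows are plain dicts."""
    id_map = {}            # old incoming id -> newly assigned id
    merged = list(existing)
    for row in incoming:   # pass 1: assign new primary keys
        id_map[row["id"]] = next_id
        next_id += 1
    for row in incoming:   # pass 2: rewrite ids and references
        new_row = dict(row, id=id_map[row["id"]])
        if new_row.get("parent_id") in id_map:
            new_row["parent_id"] = id_map[new_row["parent_id"]]
        merged.append(new_row)
    return merged, next_id
```

Multiply that two-pass remapping by every entity type and every foreign key in a real Magento schema, and the effort estimate in the answer below starts to look plausible.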
The amount of code you'd have to write to do that would take almost longer than just starting over, in my opinion. You'd basically be writing the missing functionality for Magento; if it were easy, they would have done it already.
However, splitting two stores apart is very easy, since you don't have to worry about reassigning unique identifiers in the DB.

RavenDB - slow write/save performance?

I started porting a simple ASP.NET MVC web app from SQL to RavenDB, and I noticed that the pages were faster on SQL than on RavenDB.
Drilling down with MiniProfiler, it seems the culprit is the time it takes to do session.SaveChanges (150-220ms). The code for saving in RavenDB looks like:
var btime = new TimeData() { Time1 = DateTime.Now, TheDay = new DateTime(2012, 4, 3), UserId = 76 };
session.Store(btime);
session.SaveChanges();
Authentication Mode: When RavenDB is running as a service, I assume it is using "Windows Authentication". When deployed as an IIS application I just used the defaults - which was "Windows Authentication".
Background: The database machine is separate from my development machine, which acts as the web server. Both databases are running on the same database machine. The test data is quite small - say 100 rows. The queries are simple, returning an object with 12 properties, 48 bytes in size. Using Fiddler to run a WCAT test against RavenDB generated higher utilization on the database machine (vs. SQL) and far fewer pages. I tried running Raven as a service and as an IIS application, but didn't see a noticeable difference.
Edit
I wanted to ensure it wasn't a problem with a) one of my machines or b) the solution I created. So I decided to test it on AppHarbor using another solution created by Michael Friis: the RavenDB sample app, with MiniProfiler simply added to that solution. Michael is one of the awesome guys at AppHarbor, and you can download the code here if you want to look at it.
Results from AppHarbor
You can try it here (for now):
Read: (7-12ms with a few outliers at 100+ms).
Write/Save: (197-312ms) * WOW that's a long time to save *. To test the save, just create a new "thingy" and save it. You might want to do it at least twice since the first one usually takes longer as the application warms up.
Unless we're both doing something wrong, RavenDB is very slow to save - around 10-20x slower to save than read. Given that it re-indexes asynchronously, this seems very slow.
Are there ways to speed it up or is this to be expected?
First - Ayende is "the man" behind RavenDB (he wrote it). I have no idea why he's not addressing the question, although even in the Google groups he seems to chime in once to ask some pointed questions, but rarely comes back to provide a complete answer. Maybe he's working hard to get RavenHQ off the ground?!?
Second - We experienced a similar problem. Below's a link to a discussion on Google Groups that may be the cause:
RavenDB Authentication and 401 Response.
A reasonable question might be: "If these recommendations fix the problem, why doesn't RavenDB work that way out of the box, or at least provide documentation about how to get decent write performance?"
We played for a while with the suggestions made in the thread above, and the response time did improve. In the end, though, we switched back to MySQL because it's well-tested, because we ran into this problem early (luckily), which caused concern that we might hit more problems, and finally because we did not have the time to:
fully test whether it fixed the performance problems we saw on the RavenDB server
investigate and test the implications of using UnsafeAuthenticatedConnectionSharing & pre-authentication.
To summarize Ayende's response: you're actually measuring the sum of network latency and authentication chatter. As Joe pointed out, there are ways you can optimize the authentication to be less chatty. This does, however, arguably reduce security; clearly Microsoft built security to be secure first and performant second. You, as the user of RavenDB, can choose whether the default security model is more robust than you need, as it arguably is for protected server-to-server communication.
RavenDB is clearly designed to be READ oriented. 10-20x slower for writes than reads is entirely acceptable, because writes are fully ACID and transactional.
If write speed is your limiting factor with RavenDB, you've likely not modeled your transaction boundaries properly - you are saving documents that are too similar to RDBMS table rows rather than actual well-modeled documents.
Edit: Reading your question again and looking at the background section, you explicitly define your test conditions to be an optimal scenario for SQL Server while being one of the least efficient methods for RavenDB. For data that size, it would almost certainly be 1 document in real-world usage.

Web Development: What page load times do you aim for?

Website Page load times on the dev machine are only a rough indicator of performance of course, and there will be many other factors when moving to production, but they're still useful as a yard-stick.
So, I was just wondering what page load times you aim for when you're developing?
I mean page load times on Dev Machine/Server
And, on a page that includes a realistic quantity of DB calls
Please also state the platform/technology you're using.
I know that there could be a big range of performance regarding the actual machines out there, I'm just looking for rough figures.
Thanks
Less than 5 sec.
If it's just on my dev machine I expect it to be basically instant. I'm talking 10s of milliseconds here. Of course, that's just to generate and deliver the HTML.
Do you mean that, or do you mean complete page load/render time (HTML download/parse/render, images downloading/displaying, CSS downloading/parsing/rendering, JavaScript download/execution, Flash download/plugin startup/execution, etc.)? The latter is really hard to quantify, because a good bit of that time will be spent on the client machine, in the web browser.
If you're just trying to ballpark decent download + render times with an untaxed server on the local network then I'd shoot for a few seconds... no more than 5-ish (assuming your client machine is decent).
Tricky question.
For a regular web app, you don't want your page load time to exceed 5 seconds.
But let's not forget that:
the 20%-80% rule applies here; if it takes 1 sec to load the HTML code, total rendering/loading time is probably 5-ish seconds (as fiXedd stated).
on a dev server, you're often not dealing with the real deal (traffic, DB load and size - the number of entries can make a huge difference)
you want to take into account the way users want your app to behave. 5 seconds of load time may be good enough to display preferences, but your basic or killer features should take less.
So in my opinion, here's a simple method to get rough figures for a simple web app (using, for example, Spring/Tapestry):
Sort the pages/actions according to your app profile (which pages should be lightning fast?) and give them rough target figures for the production environment.
Then take into account the browser loading/rendering stuff. Dividing by 5 is a good start, although you can use best practices to reduce that time.
Think about your production environment (DB load, number of entries, traffic...) and take an additional margin.
You've got your target load time on your production server; now it's up to you and your dev server to think about your target load time on your dev platform :-D
One of the most useful benchmarks we use for identifying server-side issues is the "internal" time taken from request-received to response-flushed by the web server itself. This means ignoring network traffic / latency and page render times.
We have some custom components (.net) that measure this time and inject it into the HTTP response header (we set a header called X-Server-Response); we can extract this data using our automated test tools, which means that we can then measure it over time (and between environments).
By measuring this time you get a pretty reliable view into the raw application performance - and if you have slow pages that take a long time to render, but the HTTP response header says it finished its work in 50ms, then you know you have network / browser issues.
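The original components were .NET, but the same pattern - timing request-received to response-ready on the server and injecting the result into a response header - can be sketched as a Python WSGI middleware (the `X-Server-Response` header name comes from the answer; everything else here is illustrative):

```python
import time

def timing_middleware(app):
    """Wrap a WSGI app so the server-side processing time (request
    received -> response body built) is injected as a response header."""
    def wrapped(environ, start_response):
        start = time.perf_counter()
        captured = {}
        def capture(status, headers, exc_info=None):
            captured["status"] = status
            captured["headers"] = list(headers)
        body = b"".join(app(environ, capture))  # run the real app
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        headers = captured["headers"] + [("X-Server-Response", "%.1fms" % elapsed_ms)]
        start_response(captured["status"], headers)
        return [body]
    return wrapped

def hello_app(environ, start_response):
    """Trivial app used to demonstrate the middleware."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]

app = timing_middleware(hello_app)
```

Automated tests can then read the header off each response and track it over time, exactly as described above.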
Once you push your application into production, you (should) have things like caching, static-file sub-domains, JS/CSS minification, etc. - all of which can offer huge performance gains (especially caching), but can also mask underlying application issues (like pages that make hundreds of DB calls).
All of which is to say: the value we aim for with this internal time is sub-1 second.
In terms of what we offer to clients around performance, we usually use 2-3s for read-only pages, and up to 5s for transactional pages (registration, checkout, upload etc.)

How many users can an Amazon EC2 instance serve?

The use will be to serve dynamic content from data on S3. You can make up any definition of "normal" you think is normal.
What about small, medium, and large instances?
OK, people want some data to work with, so here it is:
The web service is about 100KB at the start and uses AJAX, so it doesn't have to reload the whole page much, if at all. When it loads the page, it will send 20-30 requests to the database (S3) to get small chunks of text (like comments). The average user will stay on the page for 10 minutes, translating to about 100KB at the start and about 400KB more through requests. Assume that hit volume is the same at night and day.
It depends on what you're serving the content with and how, not to mention how often those users will be accessing it, the size and type of the content, etc. There's essentially not one bit of information you've provided that allows us to answer your question in any meaningful way.
As others have said, this might require testing under your exact conditions. Fortunately, if you're willing to go as far as setting up a test version of your server setup, you can spawn instances that simulate users. Create a bunch of these test instances, and run Apache's ab benchmarking tool on them, directing them at your test site. If the instances are within the same availability zone as your test site, you won't be charged for bandwidth, just by the hour for the running instances. Run a test for under an hour, shutting down the test instances afterward, and it will cost you very little to organize this stress test.
As one data point, running the Apache ab tool locally on my small instance, which is serving up a database-heavy Drupal site, it reported the ability of the server to handle 45-60 requests per second. I'm assuming that ab is a reasonable tool for benchmarking, and I might be wrong there, but this is what I'm seeing.
As a suggestion, not knowing too much about your particular case, I'd move your database to an Elastic Block Store (EBS) volume. S3 is not really intended to host databases, and the latency it has might kill your performance. EBS volumes can easily be snapshotted to S3 for backup, if that's what you're worried about.
One can argue that, properly designed, it doesn't matter how many users an instance can support: ideally, when your instance is saturated, you fire up a new instance to handle the traffic.
Obviously, this grossly complicates the deployment and design.
But beyond that, an EC2 instance is effectively a low-end Linux box (depending on which model you choose).
Let's rephrase the question: how many users do you want to support?
