Openshift Prestashop 1.5.6 Performance Tuning - performance

In the prestashop performance page, it offers caching with APC, memcached, File System, and XCache. There's a warning about making sure the infrastructure contains one front-end server. If not sure, ask your hosting company. What's available and best here?
I'm currently using the free 3 gear set up. 1 small gear with PHP 5.4 and scaling turned on (minimum 1, maximum all available), 1 small gear for mysql 5.5. the HA Proxy shows one of my gears as active/down (red) with 0 bytes of anything. I'm assuming because I don't have the High Availability turned on correctly. I'm getting page load times of 1.5 to 3 seconds on average. I know the small gear is low cpu, but is that a normal response time? How can I tune this to sub 1 with what I have? I'd like to see the performance capabilities and scaling abilities before I kick over a CC to grab bigger gears and more of them.
Speaking of scaling I see this The code in the git repository is copied to each new gear, but the data directory begins empty. I just recently made some post_deploy hooks to symlink an images directory so that my git updates from my local machine didn't destroy custom uploads. Am I going to have to switch to some kind of CDN? I believe I saw an option to set CDN servers in the application.
Thanks in advance,
Alan

Prestashop is probably a pretty beefy application to run on a small gear, even with moving the database to it's own gear. Your web gear would have to receive enough traffic to scale up to another web gear, but as long as you are just getting a few requests to it, it might respond slowly and not scale up. For a few cents an hour, it might be worth spinning up a medium/large gear and checking it's performance.
As for the data directory symlink. On a scaled application, anything you upload into your OPENSHIFT_DATA_DIR will not be synced across gears that are spun up. You might want to consider finding some kind of Amazon S3 plugin or similar.

Related

How can I improve the performance of this architecture?

I'm running a website that is CPU heavy due to a lot of thumbnailing of images.
This is how I currently do things:
User uploads image to server
Server keeps a copy, and stores the image on Amazon S3
When an thumbnail is requested, server uses the local copy to generate it, and then stores it on S3; then gives the S3 URL to the client
Subsequent requests are optimized like this: Server caches S3 URL in memcached, so it won't do the work again; server never generates a thumbnail again if the file exists; the server uses mid-sized thumbnails to generate small-sized one, so not to work with large files of not necessary
Now, I'm hosting on a Linode 4G instance (8 cores with 4x priority, 4GB RAM), and despite my optiomizations and having a memcached hit ratio of 70%, my average CPU is 170%. I'm constantly seeing all 8 CPUs working with frequent spikes of 100% for many of them at the same time.
I'm using nginx and gunicorn to serve a Django application, and the thumbnails are generated with PIL.
How can I improve this architecture?
I was thinking about a few possibilities:
#1. Easiest: add a second identical server with a load balancer in front, so that they'd share the load.
The problem with this is that the two servers would not share the local image cache. Could I solve this by placing such share on a network drive, or would the latency ultimately hinder the gains?
#2. A little harder: split the thumbnailing code out of my app, as a separate webservice, that would run on a second server. This way the main application and database would not suffer from high CPU usage, and the web pages would be served fast. The thumbnails are anyway already served asynchronously with JavaScript
Can anyone recommend some other solution?
Are you sure your performance problems come from thumbnails? OK, I suppose you've checked that.
You can downsize and upload the 2 thumbnails to S3 immediately (or shortly) after user uploaded the image. This way you should be able to save unnecessary CPU load you're now wasting for every HTTP request checking those thumbnails and doing IPC with memcached.
In a way your problem is a "good" problem to have (or at least it could have been a lot worse), in that there are no dependencies between separate image resizing tasks, so you can trivially distribute them over multiple servers. A few comments:
Have you checked to see if there is anything you can do to make the image resizing operations faster? (Google brought this up, don't know if it's any help: http://dmmartins.appspot.com/blog/speeding-up-image-resizing-with-python-and-pil) Even if you still find you need to add more servers, anything you can do to make each resize operation more efficient will make each server go farther.
If your users keep becoming more and more, you will eventually need to "scale out", but for the short term, it is possible you could solve the problem simply by paying another $80 for the next "tier" of service (8 cores at 8x priority).
Is image resizing really your app's only bottleneck? If image resizing was "free", how much further can you scale on your existing server before rendering pages, running DB queries, etc. would limit throughput? If you don't know, it would be good to do some simulated load testing and find out. I ask because if rendering pages, DB queries, etc. are also bottlenecks, or are soon to become bottlenecks, you are going to have to distribute the app anyways. In that case, you might as well keep thumbnailing in the main app, and distribute it right now, rather than making your thumbnailing run as a web service on a 2nd server.
Regardless of whether you distribute the main app, or split out thumbnailing into a separate app on a different server, you need some kind of authoritative store to keep track of where each thumbnail is kept on S3. You can keep that information in memcached, in a database, or wherever you want. It doesn't really matter. Even if you keep it in memcached, that doesn't mean you can't share the cache between 2 servers -- 1 server can connect to a memcached instance running on the other server.
You asked if "the latency" of checking a cache which is held on a different server will "hinder the gains". I don't think you need to worry about that. Your problem is throughput, not latency. Those high-latency network operations parallelize very well. So if you just service more requests in parallel, you can still make full use of your CPUs (which is the resource bottleneck right now).

What's the speediest web hosting choices out there that are scalable to large traffic spikes and can handle fast page loads?

Is cloud hosting the way to go? Or is there something better that delivers fast page loads?
The reason I ask is because I run a buddypress site on a bluehost dedicated server, but it seems to run slow at most times of the day. This scares me because at the moment the sites not live and I'm afraid when it gets traffic it'll become worse and my visitors will lose interest. I use Amazon Cloud to handle all my media, JS, and CSS files along with a catching plugin, but it still loads slow at times.
I feel like the problem is Bluehost, because I visit other sites running buddypress and their sites seem to load instantly. Im not web hosting savvy so can someone please point me in the right direction here?
The hosting choice depends on many factors such as technical requirements, growth rates, burst rates, budgets and more.
Bigger Hardware
To scale up hosting operation, your first choice is often just using a more powerful server, VPS, or cloud instance. The point is not so much cloud vs. dedicated but that you simply bring more compute power to the problem. Cloud can make scaling up easier - often with a few clicks.
Division of Labor
The next step often is division of labor. You offload database, static content, caching or other items to specific servers or services. For example, you could offload static content to a CDN. You could a dedicated database.
Once again, cloud vs non-cloud is not the issue. The point is to bring more resources to your hosting problems.
Pick the Right Application Stack
I cannot stress enough picking the right underlying technology for your needs. For example, I've recently helped a client switch from a Apache/PHP stack to a Varnish/Nginx/PHP-FPM stack for a very business Wordpress operation (>100 million page views/mo). This change boosted capacity by nearly 5X with modest hardware changes.
Same App. Different Story
Also just because you are using a specific application, it does not mean the same hosting setup will work for you. I don't know about the specific app you are using but with Drupal, Wordpress, Joomla, Vbulletin and others, the plugins, site design, themes and other items are critical to overall performance.
To complicate matter, user behavior is something to consider as well. Consider a discussion form that has a 95:1 read:post ratio. What if you do something in the design to encourage more posts and that ratio moves to 75:1. That means more database writes, less caching, etc.
In short, details matter, so get a good understanding of your application before you start to scale out hosting.
A hosting service is part of the solution. Another part is proper server configuration.
For instance this guy has optimized his setup to serve 10 million requests in a day off a micro-instance on AWS.
I think you should look at your server config first, then shop for other hosts. If you can't control server configuration, try AWS, Rackspace or other cloud services.
just an FYI: You can sign up for AWS and use a micro instance free for one year. The link I posted - he just optimized on the same server. You might have to upgrade to a small server because Amazon has stated that micro is only to handle spikes and sustained traffic.
Good luck.

Magento 503 errors, intermittent, capacity problems?

I'm having intermittent 503 errors, out of the blue, on a Magento install. Occasionally the page will half load, without JS or CSS, or sometimes the images will not load, but the rest of it will.
I'm running 1.5 magento, and no settings have been changed before it started going awry.
My hosting guys (it's on shared hosting) have said:
Basically every time someone hits the site, you spawn about 10 - 20+
connections for every hit to:
/media/catalog/product/cache/1/....
For example:
/media/catalog/product/cache/1/image/9df78eab33525d08d6e5fb8d27136e95/2/0/200_2_series-max.jpg
And this is maxing out the capacity. I've disabled cache, and the problem persists. Is there a way I can check why this is happening, or am I at the whim of my hosting chaps.
Thanks in advance.
Not the cache you've disabled, turn that back on as it only causes more severe system loading as everything that would normally be cached has to be recompiled on every request.
/media/catalog/product/cache/1/
This is the image cache. Your system is being overloaded serving out images to your customers. Therefore your server is probably suboptimal and not able to properly handle the loads being placed on it. Typical symptom on a shared hosting plan when attempting to run Magento, this typically causes your website to fall flat on its face the first time Bing, Yahoo and Google decide to simultaneously index your website.
The first thing to do when you start noticing the website to get boggy is to go into "Customer - Online Customers" and see how many Magento reports are online. Sort by IP and see who's hogging resources and take that ip over to Bots VS Browsers to see if it's a known web indexer. The next step is to get your Web Server Access log and start viewing what's being requested to make sure you don't have some script kiddie on your site trying to break it.
One way of eliminating your image loading problem is to go over to Amazon Web Services, sign up for CloudFront and serve your images out through this CDN. Basically it serves as a proxy system so your images only get requested on the initial view and then get served out through CloudFront. Your server still probably is going to have problems with overloading, but it won't be image serving that causes most of it.
Your "shared hosting guys" guys don't have the decency to tell you that they're not capable of hosting a Magento installation for you. This,
you spawn about 10 - 20+ connections for every hit to:
/media/catalog/product/cache/1/....
while I'm sure describes something they've uncovered, doesn't accurately describe it.
Whatever the specifics of the problem you're running into are, the servers your on right now are designed to host a different type of application. You'll continue to run into problems. Move to a host that supports Magento.

Logging in to Magento's admin backoffice is slow

It takes about 20 seconds to login to the admin backoffice. During this time load would spike (from 0.02 to 0.20)
Yet I see no "slow queries" in the mysql log. I've optimized all the tables.
Is there some way to hunt down the source of this slow down?
Which browser are you using? The Magento admin can be very slow in some browsers i.e. Internet explorer. I recommend using Chrome or Firefox for admin work, especially adding/editing products. If you are generally looking at ways to speed up Magento then here is a good place to start:
https://web.archive.org/web/20140327173649/http://www.gxjansen.com/101-ways-to-speed-up-your-magento-e-commerce-website/
The admin panel pulls in a lot of files and is heavily javascripted. Turning on things like mod_deflate and mod_expires can help tell your browser to not download these files as often.
Magento is extremely resource intensive and the admin interface does not use any caching. A lot of it depends on the hardware your site is on. For any site is decent traffic and heavy admin use, the base line should be a dedicated dual quad core server with 24-32mb of ram, especially if you have the db and web on the same server. Any shared server will just not cut it. Cloud based solutions vary, we have had some good results with splitting db and load balanced web in a cloud environment. Make sure you have apache and mysql properly tuned based on the memory of the server and use a php accelerator like apc.

mosso versus gogrid which is better?

I have reasonable experience to manage my own server, so gogrid style management is not a problem. But seems mosso is a tag cheaper somewhat- except the very difficult to access compute cycles terms. Anyone could share about this would be very welcomed.
Well, even at the current moment as correct answer is marked GoGrid choice, I think I need to share my experience with GoGrid.
It's been several weeks after we broke our commitment with them and I think I'm pretty calm now to write cons for them.
1) Images. We were trying to use Windows 2008 images and those were pretty old. To be up to date, you need to install 80+ updates and that takes a while. But that's not the worst thing. Worst thing is, that default image hdd size is 20gb and that was not enough to complete windows updating, at least in automatic way (not talking about installing additional software). There's no way to increase image size, so you need to make all kinds of workarounds (for example disable virtual memory, when installing).
2) Support. It's not fanatic. I would call it robotic. Although live chat is working, at least we were unable to solve by live chat most of the problems, because live chat support personel would always forward request to upper level, which is not accessible through live chat. Another thing is, that as I understood, engineers, that have real knowledge and access to infrastructure don't work at night and in weekends (I was working from Europe, so I had completely different time zone).
3) Service Level Agreement. You need to be careful about small print (for example I've missed that rule 1hour of non working is compensated 100x was working only for one month bill), but there are things, that are not mentioned - for example I was told, that SLA terms do not work for cloud storage, although I think you won't find this mentioned in SLA.
4) Reaction time. Although in SLA they say, that will solve any issue in two hours, we couldn't get solution in 10 days. Problem was clear: network speed between gogrid server instances, also between instance and cloud storage was 10-15kbps (measured using several tools, such as netio and etc., tested several instances and so on). That wasn't because they forgot or smth., we were checking status at various levels every day. My management talked with VP of technology or something and he promised that problem will be solved in nearest time, several days passed and no solution was proposed. And some of the emails about how they are investigating problem made me laugh.
5) Internet speeds. Sometimes they were really good (I've measured 550mbps download speed), but sometimes they are terrible (upload up to 0.05mbps).
If someone thinks, that this is some kind of competitors posting, I have chat and email logs about mentioned issues, also screen shots of internet speed tests and could provide under request.
Ok, and one good thing about their service - you can use several IP addresses on one instance (what our current hosting provider - Amazon EC2 is unable to do).
Stay away from GoGrid !
I don't have any experience with Mosso, but I do have (unfortunately) VERY bad experience with GoGrid.
As other people mentioned, their support is horrible. Most times you will get a live chat person that really is no help at all - doesn't really know their system or how it works so he can't really help with any problem beyond restarting your server.
Another issue is their performance which is at best unreliable and at worst just not there. Starting from I/O which can drop to < 1mb/s (measured by a few tools) - ranging to network connections that are very slow - load balancers which do not spread the load (2 servers on RoundRobin get 70/30)
Not to mention a very buggy portal - new server picks a free ip, which I am then told is in use...and not by me - even though I have the whole range "assigned" to me -
new cases which are saved without the text - buttons which say "upgrade to a new plan" but do nothing... etc... etc...
Their billing department which is not responsive and you have to argue about everything (why am I paying $0.5/gb traffic when the site states $0.29 ?????)
I have been using them for about a year now - and that's only because I don't have the time to move. Hopefully I will be able to get the hell out of there in a month.
As you can tell, I am very very frustrated with them. I know it's my fault I didn't run away sooner, but I really didn't expect such a low level of service and quality.
beware....
Yoav.
Mosso has way better service though, and the clients stay happy. The only issue I have experienced with them ever was installing DNN (which is a pain period) and a single client machine refused to allow for FTP access to their site... but again, Mosso techs did everything they could to get it going.
It's simple, Mosso is just like a "reseller" hosting. They provide you everything whitelabel from billing to control panel then you sell it back to customers.
If you are developer, I recommend you choose GoGrid. Firstly, Mosso doesn't provide SSH access. Secondly, if you are RoR/Mongrel user, you are capped to limited RAM (unless you pay extra in addition to $100). Moreover, GoGrid allows you to choose server image (CentOS, Redhat, Windows) with some out-of-the-box support for RoR and LAMP.
Somemore, GoGrid provides you initial credits ($50 or $95 if you use MS-WEBFWRD) for you to try out before actually paying for it.
Mosso does not give you Admin control over the "servers" anymore...
Disclosure: I am the Technology Evangelist for GoGrid.
I wanted to address some of the points above by #Giedrius and #Yoav. I'm sorry if your experience was lower than expected. We have and continue to make dramatic improvements and upgrades to both our product features as well as our service. That being said, I want to answer a few points that you listed above, specifically:
1) Images - Do note that the HD size (persistent storage) is tied to the RAM allocation. Our base images for the lowest RAM allocation (512 MB) is now 30 GBs. Also, because some users experienced some performance issues with low allocations of RAM on Windows servers, we have set a minimum allocation of 1 GB or higher for most Windows instances. Also, all of our Windows 2008 instances now have SP2 on them: wiki.gogrid.com/wiki/index.php/Server_Images#Windows_2008_Server
2) Support - We are always working on making our support team and processes even better. Remember that there are several public clouds that charge for support, something we don't do. Yes, it is available 24/7/365 and you are correct that there are typically more support personnel available during business hours (that is the norm for many companies). Be we are here to help 24x7. Also, every GoGrid account gets a dedicated service team which consists of a variety of personnel from our organization (acct mgmt, tech support, billing, etc.)
3) SLA - We offer one of the most robust SLAs in the marketplace. Also, Cloud Storage IS in fact covered in our SLA under Section VI here: www.gogrid.com/legal/sla.php .
4) Reaction time - I do not believe that we ever state in the SLA that any issue will be "resolved" within 2 hours. I doubt that ANY hosting provider can offer that, simply because of the nature of hosting and the complexity therein. We will acknowledge and respond to tickets (as stated within the SLA) within 2 hours or 30 minutes depending on the nature of the ticket. I'm sorry if that isn't clear so please let me know where it can be better explained.
5) Internet speeds - we have multiple bandwidth providers for our datacenter. It is not typical that there is latency, jitter or slow transfer speeds. If a situation is encountered where the speeds are not what you expect, I encourage you to open a support ticket so that we can investigate.
6) I/O - recently we have been benchmarked by an independent 3rd party, CloudHarmony.com, as having the best I/O of cloud providers: http://blog.cloudharmony.com/2010/06/disk-io-benchmarking-in-cloud.html
7) Network Connections - see #5 above
8) Load Balancers - if you are encountering balancing issues, we encourage you to report it. Details on our LB can be found on the wiki: wiki.gogrid.com/wiki/index.php/(F5)_Load_Balancer
9) Portal - We continue to make optimizations to the web portal including recently launching a "list view" for customers with larger environments. If the portal is "misbehaving", I recommend clearing your cache and using the latest browser version (I personally use Chrome and Firefox regularly on the portal w/o issue). Alternatively, you could use the API to manage your GoGrid infrastructure.
10) Transfer Plan - A few months ago, we released some new RAM and Transfer Plans. It seems that you are still on the old Transfer plan if you have $0.50/GB instead of $0.29. We don't automatically change customers' plans without their permission. So I recommend that you upgrade your plan to enjoy the new pricing.
Hope that helps answer the questions/concerns. I didn't mean for it to be a sales pitch (as I'm not a sales guy) but I wanted to be sure that other readers had "the other side of the story."
Please contact me should you have any questions: michael[at]gogrid.com
Thanks!
-Michael

Resources