Any negative impacts when using Mod-Rewrite? - performance

I know there are a lot of positive things mod-rewrite accomplishes. But are there any negatives? Obviously, if you have poorly written rules, you're going to have problems. But what if you have a high-volume site and you're constantly using mod-rewrite: is it going to have a significant impact on performance? I did a quick search for benchmarks on Google and didn't find much.

I've used mod_rewrite on sites that get millions of hits per month without any significant performance issues. You do have to know which rewrites get applied first, depending on your rules.
Using mod_rewrite is most likely faster than parsing the URL with your current language.
If you are really worried about performance, don't use .htaccess files, those are slow. Put all your rewrite rules in your Apache config, which is only read once on startup. .htaccess files get re-parsed on every request, along with every .htaccess file in parent folders.

To echo what Ryan says above, rules in a .htaccess file can really hurt your load times on a busy site compared to having the rules in your config file. We initially tried this (~60 million pages/month) but it didn't last very long before our servers started smoking :)
The obvious downside to having the rules in your config is you have to reload the config whenever you modify your rules.
The last flag ("L") is useful for speeding up execution of your rules, provided your more frequently-accessed rules are towards the top and assessed first. It can make maintenance much trickier if you have a long set of rules, though - I wasted a couple of very frustrating hours one morning because I was editing mid-way down my list of rules and had one up the top that was trapping more than intended!
We had difficulty finding relevant benchmarks also, and ended up working out our own internal suite of tests. Once we got our rules sorted out, properly ordered and into our Apache conf, we didn't find much of a negative performance impact.

If you're worried about Apache's performance, one thing to consider if you have a lot of rewrite rules is to use the "skip" flag. It is a way to skip matching on a block of rules, so whatever overhead would have been spent on matching them is saved.
Be careful though: I was on a project which used the "skip" flag a lot, and it made maintenance painful, since the behaviour depends on the order in which the rules are written in the file.


How does scary transcluding caching work on MediaWiki?

We are running a small wiki farm (same topic; six languages and growing) and have recently updated most templates to use several layers of meta-templates in order to facilitate maintenance and readability.
We wish to standardise those templates for all languages, therefore most of them are going to contain the exact same code on each wiki. This is why, in order to further simplify maintenance, we are considering the use of scary transcluding (more specifically, substitution) so that those meta-templates are only stored on one wiki and only have to be updated on that wiki, not on every single version.
(Note: if you can think of a better idea, don't hesitate to comment on this post!)
However, scary transcluding is so called because it is scarily inefficient, so I need to know more about how content included that way is cached by MediaWiki.
If I understand correctly, the HTML output of a page is stored in the parser cache for a duration of $wgParserCacheExpireTime. The default is 1 day, but it's safe to increase it on a small to medium wiki because the content will get updated anyway if the page itself or an included page is updated (and in some other minor cases).
There's also a cache duration for scary transcluding: $wgTranscludeCacheExpiry. Good, because you wouldn't want to make that HTTP call every time. However, the default value of 1 hour is not suitable for smaller wikis, on which an article may only be viewed every now and then, therefore rendering that cache absolutely useless.
If a page A uses a template B that includes template C from another wiki, does page A have to be entirely regenerated after $wgTranscludeCacheExpiry has been exceeded? Or can it still make use of the parser cache of template B until $wgParserCacheExpireTime has been exceeded?
You could then increase $wgTranscludeCacheExpiry to a month, just like the parser cache, but a page wouldn't get updated automatically if the transcluded template was, would it?
If yes, would refreshing the pages using that transcluded template be the only solution to update the other wikis?
IMHO the solution to find out is simple: try it! $wgScaryTranscluding is rarely used, but the few who tried enabling it reported having very few problems. There are also JavaScript-based alternatives, see the manual.
Purging is rarely a big issue: a crosswiki template is unlikely to contain stuff you absolutely want to get out right now. If the cache doesn't feel aggressive enough for you, set it to a week or month and see if something goes wrong. Ilmari Karonen suggests such a long cache even for HTML after all.

caching snippets (modX)

I was simply cruising through the MODX options and I noticed the option to cache snippets. I was wondering what kind of effect this would have (downsides) on my site. I know that caching would improve the loading time of the site by keeping pages 'cached' after the first time and then only reloading the updates, but this all seems too good to be true. My question is simple: are there any downsides to caching snippets? Cheers, Marco.
Great question!
The first rule of Modx is (almost) always cache. They've said so in their own blog.
As you said, the loading time will be lower. Let's just get the basics on the floor first. When you choose to cache a page, the page with all its output is stored as a file in your cache folder. If you have a small and simple site, you might not see much difference between caching and not caching, but if you have a complex one with lots of chunks-in-chunks, snippets parsing chunks etc., the difference is enormous. Some of the websites I've made go down 15-30 levels to parse the content in some sections. Loading all this fresh from the database can take up to a couple of seconds, while loading a flat file takes only a few microseconds. There is a HUGE difference (remember that).
Now: you can cache both snippets and chunks. Important to remember. You can also cache one chunk while leaving the next level uncached. Using MODX's brilliant markup, you can choose what to cache and what not to, but in general you want as much as possible cached.
You ask about the downside. There are none as such, but there are a few cases where you can't use cached snippets/chunks. As mentioned earlier, the cached response is stored per page. That means you can't cache a page where the content depends on, for example, GET parameters: you can't cache a search result (because the content changes), or a page with pagination (?page=1, ?page=2 etc. would produce different output on the same page). Another case is when a snippet's output is random/different every time. Say you put a random quote in your header: this needs to be uncached, or you will just see the first random result every time. In all other cases, use caching.
Also remember that every time you save a change in the manager, the cache will be wiped. That means that if you for example display the latest news-articles on your frontpage, this can still be cached because it will not display different content until you add/edit a resource, and then the cache will be cleared.
To sum it all up: caching is GREAT and you should use it as much as possible. I usually make all my snippets/chunks cached, and if I run into problems, that is the first thing I check.
Using caching makes your webserver respond quicker (good for the user) and produces fewer queries to the database (good for you). All in all. Caching is a gift. Use it.
There are no downsides to caching, and honestly I wonder what made you think there were?
You should always cache everything you can - there's no point in having something be executed on every page load when it's exactly the same as before. By caching the output and the source, you bypass the need for processing time and improve performance.
Assuming MODX Revolution (2.x), all template tags you use can be called both cached and uncached.
Cached:
[[*pagetitle]]
[[snippet]]
[[$chunk]]
[[+placeholder]]
[[%lexicon]]
Uncached:
[[!*pagetitle]] - this is pointless
[[!snippet]]
[[!$chunk]]
[[!+placeholder]]
[[!%lexicon]]
In MODX Evolution (1.x) the tags are different and you don't have as much control.
Some time ago I wrote about caching in MODX Revolution on my blog and I strongly encourage you to check it out as it provides more insight into why and how to use caching effectively: https://www.markhamstra.com/modx/2011/10/caching-guidelines-for-modx-revolution/
(PS: If you have MODX specific questions, I'd suggest posting them on forums.modx.com - there's a larger MODX audience there that can help)

What's the optimal number of queries an ExpressionEngine page should load?

I saw #parscale tweet: How many queries are you happy with for a home page? When do you say this is Optimized?
I saw responses that < 50 is good, 30 or less is best, and 100+ is the danger zone. Is there really any proper number? And if, say, you do have > 50 queries running on your pages, what are some ways to bring that number down?
My sites run the gamut: some are under 50 queries, some more, though the "more" don't seem to be too slow. I'm always interested in making things faster. How?
How to reduce queries will vary from site to site, template to template, but there's been a few articles on EE optimisation and performance:
http://expressionengine.com/wiki/Reduce_Queries/
http://expressionengine.com/blog/entry/troubleshooting_site_performance_issues/
http://www.netmagazine.com/tutorials/optimise-your-expressionengine-site
http://www.leezilla.net/post/12377053779/ab-seeing-your-sites-performance
http://eeinsider.com/articles/using-cache-wisely-with-expressionengine/
But if you've done all that and still need to speed things up, then your next step is to look at add-ons like CE Cache.
The thing to remember is that not all queries are created equal. You can have 1,000 queries that do very little to impact performance, or a single query that slows everything way down.
In EE it's actually better to look at the template debug output and identify key slow-down spots in the template build than to always focus on just the query count.
As others have pointed out, products like CE Cache, Solspace's Template Morsels, or even adding a Varnish caching server in front of an intensive EE website can do wonders, though given the added work required to get Varnish fully set up in front of EE, I would currently stick to the other solutions/directions first.
There is not a magic query number. In my opinion, your server environment dictates what can be supported. The more resources you have, the more complex your code can be.
With that said, there are lots of options you can use if issues do arise on an EE website. The links in the answer above give you a solid list but here are some first things to check:
Remove search:field_name="" parameters
Reduce use of channel tags, combine if you can
Add the disable="" parameter to channel tags to turn off what you don't need
Reduce use of embeds
Turn off all EE tracking code
Stop using advanced conditionals if you have a channel tag inside
Following on from Nevin's point, I find that JB Graphite is a huge help: it turns the debug output into a pretty graph, so you can easily spot bottleneck queries.
http://devot-ee.com/add-ons/jb-graphite
I'll expand on MediaGirl's point number 6 - you can often greatly simplify conditionals by using Croxton's Ifelse and/or Switchee add-ons. Definitely worth a look.
I used CE Cache on a really intensive build and it reduced page load from 6 seconds to 0.7 seconds. Awesome add-on, with incredible documentation and the best support you can get anywhere.

Web site performance - what is it about? Readability and ignorance Vs. Performance

Let me cut to the chase...
On one hand, much of the programming advice given (here and in other places) emphasizes the notion that code should always be as readable and as clear as possible, at (almost?!) any performance cost.
On the other hand, there are SO many slow websites (at least one of which I know from personal experience).
Obviously, round trips and DB access are issues a web developer should always keep in mind. But the trade-off between readability and what to avoid because it slows things down is, for me, very unclear.
My questions are: 1. What else? 2. Is there a rule (preferably simple, but probably quite general) one should adhere to in order to make sure one's code does not slow things down too much?
General best practices as well as specific advice would be much appreciated. Advice based on experience would be especially appreciated.
Thanks.
Edit: a little clarification. General performance advice isn't hard to find; that's not what I'm looking for. I'm asking about two things: 1. While trying to make my code as readable as possible, when should I stop and say: "Now I'm hurting performance too much"? 2. Little, less-known things, like whether selecting just one column is faster than selecting all (thanks Otávio)... Thanks again!
See the Stack Overflow discussion here:
What is the most important effect on performance in a database-backed web application?
The top voted answer was, "write it clean, and use a profiler to identify real problems and address them."
In my experience, the biggest mistake (using C#/asp.net/linq) is over-querying due to LINQ's ease-of-use. One huge query is usually much faster than 10000 small ones.
The other ASP.NET gotcha I see a lot is when the viewstate gets extremely fat and bloated. EnableViewState=false is your best friend, start every new project with it!
For web applications that have a database back end, it is extremely important that:
indexing is done properly
retrieval is done for what is needed (avoid select * when selecting specific fields will do - even more so if they are part of a covered index)
Also, whenever possible an appropriate caching strategy can help performance
Optimize your code.
While making your code as readable as possible is very important, optimizing it is equally important. I've listed some items that will hopefully point you in the right direction.
For example in regards to Databases:
When you define the schema of your database, make sure it is normalized and that indexes are defined on the right fields.
When running a query, specifically SELECT, only select the fields you need.
You should only make one connection to the database per page load.
Re-factor. This is probably the most important factor in producing clean, optimized code. Always go back and look at your code and see what can be done to improve it.
PHP Code:
Always test your work with a tool like PHPUnit.
echo is faster than print.
Wrapping your strings in single quotes (') instead of double quotes (") is faster, because PHP searches for variables inside "…" but not inside '…'. Use single quotes whenever your string contains no variables that need evaluating.
Use echo's multiple parameters (or stacked strings) instead of string concatenation.
Unset or null your variables to free memory, especially large arrays.
Use strict code, and avoid suppressing errors, notices and warnings; this results in cleaner code and less overhead. Consider having error_reporting(E_ALL) always on.
Incrementing an undefined local variable is 9-10 times slower than a pre-initialized one.
Methods in derived classes run faster than ones defined in the base class.
Error suppression with @ is very slow.
Website Optimization
A good place to start is here (http://developer.yahoo.com/performance/rules.html)
Performance is a huge topic and there are a lot of things that you can do to help improve the performance of your website. It's something that takes time and experience.
Best,
Richard Castera
Scott and Rcastera did a good job covering DB and querying optimization. To address your question from a HTML / CSS / JavaScript standpoint:
CSS:
Readability is key. CSS is rendered so fast that you should never feel it necessary to sacrifice readability for performance. As such, focus on adding as many comments as necessary to document the code: why certain rules (like hacks) are there, and whatever else floats your comment boat. In CSS there are a few obvious rules to follow: 1) Use external stylesheets. 2) Limit the number of external stylesheets to limit GET requests.
HTML: Like CSS, HTML is read so fast by the browser that you should really only focus on writing clean code. Use whitespace, indentation, and comments to properly document your work. The only major things in HTML to remember are: 1) declare the <meta charset /> early within the head section. 2) Follow this guy's advice to minimize browser reflows. *This rule actually applies to CSS as well.
JavaScript: Most optimizations for JavaScript are really well known by now, so these will seem obvious: initializing variables outside of loops, pushing JavaScript to the bottom of the body so the DOM loads before scripts start tying up all of the resources, avoiding costly statements like eval() or with(). Not to sound like a broken record, but keeping a well-commented and easily readable script should still be a priority when developing JavaScript code, especially since you can just minify and compress away all the excess when you deploy.

mod_rewrite vs php parsing

I'm working on a little project of mine and need your help in deciding whether mod_rewrite or parsing each URL in PHP is more performance-friendly.
URLs will almost always have a fixed pattern; very few will differ.
For instance, most URLs would be like so:
dot.com/resource
Some others would be:
dot.com/other/resource
I expect around 1000 visitors a day to the site. Will server load be an issue?
Intuitively, I think mod_rewrite would work better, but just for peace of mind I'd like input from you guys. If anyone has carried out any tests, or can point me towards some, I'd be obliged.
Thanks.
You may want to check out the following Stack Overflow post:
Any negative impacts when using Mod-Rewrite?
Quoting the accepted answer:
I've used mod_rewrite on sites that get millions/hits/month without any significant performance issues. You do have to know which rewrites get applied first depending on your rules.
Using mod_rewrite is most likely faster than parsing the URL with your current language.
If you are really worried about performance, don't use htaccess files, those are slow. Put all your rewrite rules in your Apache config, which is only read once on startup. htaccess files get re-parsed on every request, along with every htaccess file in parent folders.
To add my own, mod_rewrite is definitely capable of handling 1,000 visitors per day.
