When to use which mod_rewrite rule for self routing?

There are several ways to write a mod_rewrite rule for self routing. At the moment I am using this one:
RewriteCond %{REQUEST_URI} !\.(js|ico|gif|jpg|png|css)$
RewriteRule ^.*$ index.php [NC,L]
But I could also use
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*) index.php
OR
ErrorDocument 404 /index.php
There may be many more.
Are there any drawbacks for using one of these examples?
Are there any use cases where one rule makes more sense than the other?
Could you explain the difference between these rules in detail?
Thx for your time and help.

When your condition is:
RewriteCond %{REQUEST_URI} !\.(js|ico|gif|jpg|png|css)$
Then only images, icons, stylesheets, and JavaScript are excluded from routing. This means you can't access static HTML, directories, or directory indexes, so you can't just drop a static HTML page somewhere and serve it without it being routed through index.php. It also means that if you accidentally put an image, script, or stylesheet in the wrong place and try to access it, the request won't be routed through index.php even though the file doesn't exist; it will yield the default 404 error page instead.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
These conditions exclude any URI that points to an existing file or directory. So if you put an image, a script, a directory, static HTML, etc. anywhere in your document root, you'll be able to go there without it being routed through index.php. Sometimes the condition RewriteCond %{REQUEST_FILENAME} !-l is also included, which excludes URIs that point to a symlink (-s, by contrast, tests for a regular file with non-zero size). This is usually what you'd see when doing routing; WordPress uses this approach.
ErrorDocument 404 /index.php
This does essentially the same thing as the previous conditions, except it does it outside of mod_rewrite and there's no way to impose additional conditions in the future or as needed. The downside of doing routing outside of mod_rewrite is that mod_rewrite and the core directives (ErrorDocument in this case) do processing on the URI at different times in the URI-file mapping pipeline. So if you have rules that do other things, they could get applied, and then ultimately still get routed through index.php because the 2 directives are conflicting with each other. Simply because rewrite rules are applied at one point in the pipeline doesn't mean other directives won't get applied later down in the pipeline. This is a bad way to do routing.
There's also stuff like:
RewriteCond %{REQUEST_URI} !^/index\.php
RewriteRule ^.*$ index.php [L]
Which will blindly route everything. Even javascript, even images, even static html, everything. Sometimes this is what people want. Ultimately, this is going to be dependent on what you want and what your index.php script does. Is it going to handle 404's? (like what you'd want in the first routing rule), is it just going to handle non-static resources? (like what the second rule does), or is it a literal catch all and will do everything (what the rule above does)?
Also note that your rewrite flags are different between the first and second rules. Those are significant if you have other rules.
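As a rough sketch of the existing-file/existing-directory approach with the flags spelled out, assuming the rules live in an .htaccess file in the document root and that index.php is the front controller, as in the question:

```apache
RewriteEngine On

# Skip the rewrite when the request maps to an existing file or directory.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d

# Send everything else to the front controller.
# L stops processing further rules in this pass; NC would add nothing here
# because the pattern contains no letters.
RewriteRule ^ index.php [L]
```

Without L, any rules placed after this one would still be evaluated against the rewritten path, which is why the flags matter once you have more than one rule.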

The biggest drawback to the first example (which is the one you say that you use) is that this method hard codes the file extensions (.js, .ico, .gif, .png and so on) that are excluded from being rewritten to index.php. The problem with this is that if you need to add new static content that uses a file extension that is not in your exclusion list, you must modify your rewrite condition accordingly. For example, if you were to start hosting Flash content and needed to host .swf and .flv files, you would need to update your existing RewriteCond.
The middle solution is best (IMHO) because it does exactly what it says it does: if the request does not map to an existing file (the !-f condition) AND does not map to an existing directory (the !-d condition), then rewrite the request to index.php.
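A short sketch of how the conditions combine: consecutive RewriteCond lines are joined with AND unless a line carries the [OR] flag. The same routing can therefore be written in either form (again assuming an .htaccess file next to index.php):

```apache
RewriteEngine On

# Form 1, implicit AND: rewrite only if the request is neither an
# existing file nor an existing directory.
#   RewriteCond %{REQUEST_FILENAME} !-f
#   RewriteCond %{REQUEST_FILENAME} !-d
#   RewriteRule ^ index.php [L]

# Form 2, explicit OR: leave the request untouched if it is an existing
# file OR an existing directory, then route whatever is left.
RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^ - [L]

RewriteRule ^ index.php [L]
```

The dash in the first rule of form 2 means "no substitution", so existing resources are served as they are.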

Related

Rewrite rule for one or two parameters

I have the following rewrite rule:
RewriteRule ^(.*)/(.*)$ /?page=$1&id=$2 [L,QSA]
But I want the id parameter to be optional … how do I have to write the rule?
RewriteRule ^(.*)$ /?page=$1 [L,QSA]
RewriteRule ^(.*)/(.*)$ /?page=$1&id=$2 [L,QSA]
Doesn't work.
Your own attempt looks just fine from the point of view of the rewriting itself. Your claim that it "doesn't work" is nothing we can argue with, but it does not really help either, since it says nothing about what exactly goes wrong.
Assuming that in general rewriting does work for you and you only get the wrong (undesired) result I would suggest some slightly altered rule set:
RewriteEngine on
RewriteRule ^/?([^/]+)/?$ /?page=$1 [L,QSA]
RewriteRule ^/?([^/]+)/([^/]+)/?$ /?page=$1&id=$2 [L,QSA]
And a general hint: you should always prefer to place such rules inside the HTTP server's (virtual) host configuration instead of using dynamic configuration files (.htaccess style files). Those files are notoriously error prone, hard to debug, and they really slow down the server. They are only supported as a last option for situations where you do not have control over the host configuration (read: really cheap hosting service providers) or if you have an application that relies on writing its own rewrite rules (which is an obvious security nightmare).
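A minimal sketch of what placing the rules in the host configuration could look like. The host name and paths are hypothetical, Apache 2.4 syntax is assumed, and the target is written as /index.php on the assumption that index.php is what answers requests for / in the question:

```apache
<VirtualHost *:80>
    # Hypothetical host name and path; substitute your own.
    ServerName www.example.com
    DocumentRoot /var/www/example

    <Directory /var/www/example>
        # Do not look for .htaccess files at all.
        AllowOverride None
        Require all granted
    </Directory>

    RewriteEngine On
    # In virtual host context the matched path starts with a slash.
    RewriteRule ^/([^/]+)/?$ /index.php?page=$1 [L,QSA]
    RewriteRule ^/([^/]+)/([^/]+)/?$ /index.php?page=$1&id=$2 [L,QSA]
</VirtualHost>
```

With AllowOverride None the server no longer checks every directory for an .htaccess file on each request, which is where the performance cost mentioned above comes from.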

Single instance of Joomla!, subdirectories and multiple domains

We recently moved a number of static websites from multiple (regional) domains onto a single .com domain which uses Joomla! to serve up content. The new site uses subdirectories and allows users to navigate between countries. Like this:
newdomain.com/country-name1
newdomain.com/country-name2
newdomain.com/country-name3
We would now like each site to go back to having its own domain, but to essentially serve up the same website the user would be seeing by viewing the subdirectory (we'll probably drop the ability to navigate between countries, but that's largely irrelevant to this post).
How can we do this with as little work as possible to the templates whilst retaining a single Joomla! instance? Has anyone got any experience of similar? I've read some articles but am not sure any of them give the user a true sense of being on a separate domain. I could of course be wrong (tbh, this is a little out of my field of expertise). Spoon-feeding appreciated. :)
You could try mapping the requests to the relevant folders
RewriteCond %{HTTP_HOST} ^(www\.)?newdomain\.com [NC]
RewriteRule ^/(.*) /country-name1/$1 [L]
RewriteCond %{HTTP_HOST} ^(www\.)?newdomain\.fr [NC]
RewriteRule ^/(.*) /country-name2/$1 [L]
RewriteCond %{HTTP_HOST} ^(www\.)?newdomain\.de [NC]
RewriteRule ^/(.*) /country-name3/$1 [L]
That should map a request for
newdomain.com/somepage to /country-name1/somepage
newdomain.fr/page to /country-name2/page
etc.
Obviously you'd also have to make all domain names resolve to the same folder.
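One way to make all the domains resolve to the same folder is a single virtual host with aliases. This is only a sketch: the document root path is hypothetical, and the domain names are the placeholders used above.

```apache
<VirtualHost *:80>
    ServerName newdomain.com
    # Every regional domain is answered by the same virtual host.
    ServerAlias www.newdomain.com newdomain.fr www.newdomain.fr newdomain.de www.newdomain.de

    # Hypothetical path to the single Joomla! installation.
    DocumentRoot /var/www/joomla
</VirtualHost>
```

The host-based rewrite rules would then go inside this virtual host, where the leading slash in their patterns matches as written.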

How to keep mod_rewrite from recognizing directories

So, lately I've been dealing with an issue relating to mod_rewrite and it seems nobody is trying to do anything like it. Every question people have is about trying to exclude directories from the rewrite, when I want them to be included like any other.
For instance, assuming my root directory with .htaccess file in it is www.example.com/root/
When I type in a made-up directory, such as www.example.com/root/asdfasdf, I have my .htaccess file set to redirect me to www.example.com/root/index.php?url=asdfasdf without changing what's in the address bar of my browser.
However, in trying to do the same with a real directory, such as www.example.com/root/admin, it not only changes the url in the address bar but changes it to www.example.com/root/admin/?url=admin.
Can anyone explain to me what's going on? I've tried all kinds of different regular expressions and flags, and the ones that redirect anything still cause this same issue. How can I go to www.example.com/root/admin and still get redirected to the root folder while hiding that the query string is ?url=admin?
[UPDATE: additional information 11-30-2012]
Like I said, I've tried it with multiple different lines of code and come out with the exact same redirect issue, assuming the redirect doesn't just fail altogether and produce a 500 error. Here's one of my latest iterations, though, which has produced the issue of not ignoring directories.
RewriteEngine On
RewriteBase /root/
RewriteCond %{REQUEST_FILENAME} !^(.*\.("png"|"jpg"|"gif") [NC]
RewriteRule (.*?) index.php?url=$1 [QSA]
The rewrite condition is to keep the engine from rewriting if a picture is being requested (for css and img tags). I only didn't mention it previously because I have tried removing that line and it has made no difference.
I'm not exactly a master of mod_rewrite, though, so if you see any errors with anything I've written, please feel free to let me know.
It's not entirely clear from your question what you are trying to do and it would have been helpful to see what your .htaccess file actually looked like. However the following lines in an .htaccess file in the root folder:
RewriteCond %{REQUEST_URI} !^/root/index\.php
RewriteRule (.*) /root/index.php?url=$1 [L]
Will silently redirect requests made to http://www.example.com/root/madeupfolder/madeupfile.php to http://www.example.com/root/index.php?url=madeupfolder/madeupfile.php and will also do the same for real folders. So if the folder admin exists under root, then requests to http://www.example.com/root/admin will be silently redirected to http://www.example.com/root/index.php?url=admin
If however you wanted to serve up folders and files that actually exist, but rewrite requests for folders and files that do not exist, then you would need to adjust the rewrite like so
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} !^/root/index\.php
RewriteRule (.*) /root/index.php?url=$1 [L]
This would still rewrite requests made to http://www.example.com/root/madeupfolder/madeupfile.php to http://www.example.com/root/index.php?url=madeupfolder/madeupfile.php, but for real folders and files, such as requests made to http://www.example.com/root/admin, the admin folder would be served up.
Hope this helps, but if you can clarify your question a bit then I can try and help again.
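One plausible cause of the visible redirect to /root/admin/?url=admin, not confirmed anywhere in this thread, is mod_dir: by default it answers a request for a real directory that lacks a trailing slash with an external redirect that adds the slash. A sketch that switches this behaviour off for the hypothetical /root/.htaccess from the question:

```apache
# Assumption: the redirect comes from mod_dir's trailing-slash handling.
# Turning it off keeps /root/admin from being redirected to /root/admin/.
DirectorySlash Off

RewriteEngine On
RewriteBase /root/

# Route everything except the front controller itself, real directories included.
RewriteCond %{REQUEST_URI} !^/root/index\.php
RewriteRule ^(.*)$ index.php?url=$1 [L,QSA]
```

The Apache documentation warns that turning DirectorySlash off can have side effects, such as a directory listing being shown instead of the index file, so it is worth testing this on a directory where that does not matter first.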

Using mod_rewrite to view cached version from usual URL

My PHP site generates static html versions of dynamic db driven code and stores them in a folder called cache.
This means that when you visit say, /about-us/, the request is routed through index.php?page=about-us, and produces a file called /cache/about-us.html.
If that file exists, the PHP includes it, then exits. This seems a waste of time, why not just get apache to serve up /cache/about-us.html when /about-us/ is requested, but only if it exists.
My current mod_rewrite section just includes this so far:
RewriteRule ^([A-Za-z0-9-_/\.]+)\/$ /?page=$1 [L]
Which rewrites any /foo_bar/ request to index.php?page=foo_bar. What can I put before this to request my cached version if it exists?
First send everything to the /cache/ folder
RewriteRule ^([A-Za-z0-9-_/\.]+)/$ /cache/$1
Then, check if it's not found, and reroute
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^/cache/([A-Za-z0-9-_/\.]+)$ /?page=$1 [L]
Depending on how your server is set up you might have to change the paths a bit (I'd recommend using absolute ones). Also note that this will also apply to images that match the RewriteRule
I'm not sure whether you'll see any significant performance increase with this; just including a static file doesn't take that long (if you don't hit a database). Also, if you have APC you could cache the files in memory.
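An alternative sketch that tests for the cached file before rewriting, rather than rewriting first and falling back afterwards. It assumes the rules sit in an .htaccess file in the document root and that the cache file for /about-us/ is named cache/about-us.html, as described in the question:

```apache
RewriteEngine On

# Serve /cache/<page>.html directly when that file exists.
# $1 in the condition refers to the group captured by the rule below it.
RewriteCond %{DOCUMENT_ROOT}/cache/$1.html -f
RewriteRule ^([A-Za-z0-9_/.-]+)/$ /cache/$1.html [L]

# Otherwise fall back to the existing dynamic rule.
RewriteRule ^([A-Za-z0-9_/.-]+)/$ /?page=$1 [L]
```

Because both patterns require a trailing slash, requests for images and other files without one are not affected by either rule.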

htaccess rewrite

I would like to rewrite /anything.anyextension into /?post=anything.
eg:
/this-is-a-post.php into /?post=this-is-a-post or
/this-is-a-post.html into /?post=this-is-a-post or even
/this-is-a-post/ into /?post=this-is-a-post
I tried
RewriteRule ^([a-zA-Z0-9_-]+)(|/|\.[a-z]{3,4})$ ?$1 [L]
but it doesn't work.
Any help appreciated.
If you have access to the main server configuration, use this:
RewriteRule ^/(.+)\.\w+$ /?post=$1 [L]
If not, and you are forced to put this in a .htaccess file, you could try
RewriteRule ^(.+)\.\w+$ /?post=$1 [L]
In either case, this assumes you will only be rewriting URLs with a single path component (i.e. if you get a request like /path/anything.anyextension it might not work as you expect, the rewrite rule would need to be modified to handle that)
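A sketch of a single .htaccess rule that also covers the trailing-slash form from the question. The character class is taken from the asker's own attempt, and the extra condition is an assumption added here so that real files, including index.php itself, are left alone:

```apache
RewriteEngine On

# Leave requests for files that really exist untouched.
RewriteCond %{REQUEST_FILENAME} !-f

# Capture the post name; allow an optional ".ext" or a trailing slash after it.
RewriteRule ^([A-Za-z0-9_-]+)(?:\.[A-Za-z0-9]+|/)?$ /?post=$1 [L]
```

This would turn /this-is-a-post.php, /this-is-a-post.html and /this-is-a-post/ into /?post=this-is-a-post, while still handling only a single path component.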
You need a better way to determine when to apply the rewrite rule, otherwise your page won't be able to display external JS or CSS, unless you define an exception.
SilverStripe (or the core, Sapphire) offers a good approach to this, something like:
RewriteEngine On
RewriteCond %{REQUEST_URI} !\.(css|js|swf)$ [NC]
RewriteCond %{REQUEST_URI} .+
RewriteRule ^([^\.]+) /?post=$1 [L,R=301]
This requires the URI not to be empty, not to be JS, CSS or SWF, and redirects back to your root directory:
http://localhost/this-is-a-post.php
http://localhost/?post=this-is-a-post
If you don't want a redirect but an internal rewrite, remove the R=301 flag.
