mod_rewrite: match only if no previous rules have matched?

I've got a large set of rewrite rules like the following:
RewriteRule ^foo foo.php?blah [L]
RewriteRule ^bar foo.php?baz [L]
And then I have a sort of catch-all rule that I want to apply only if none of the above rules match (say, for /blatz). As long as I remember to include the [L] flag, that works fine, but I've already had issues twice after accidentally forgetting it.
Is there an easy way to force my catch-all rule not to match if an earlier rule has matched? (Ideally without appending something to every rule.)

The only solutions I can imagine are either to use the S flag to skip the catch-all rule:
RewriteRule ^foo foo.php?blah [L,S=999]
RewriteRule ^bar foo.php?baz [L,S=999]
RewriteRule …
Or to set an environment variable:
RewriteRule ^foo foo.php?blah [L,E=FLAG:1]
RewriteRule ^bar foo.php?baz [L,E=FLAG:1]
RewriteCond %{ENV:FLAG} ^$
RewriteRule …
Edit: Alright, here's another solution that compares the current URL with the originally requested one:
RewriteCond %{THE_REQUEST} ^[A-Z]+\ (/[^?\s]*)\??([^\s]*)
RewriteCond %{REQUEST_URI}?%{QUERY_STRING}<%1?%2 ^([^<]*)<\1$
RewriteRule …
But I think that requires at least Apache 2, because Apache 1.x used POSIX ERE, which doesn't support \n backreferences in the pattern.

Related

Mod rewrite rule for mapping one or two parameters

I have the following .htaccess:
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^/?([^/]+)/?$ /?page=$1 [L,QSA]
RewriteRule ^/?([^/]+)/([^/]+)/?$ /?page=$1&id=$2 [L,QSA]
These rules allow URLs like:
example.com/my-account-dashboard
example.com/my-account-dashboard/1
which are pretty urls for:
example.com?page=my-account-dashboard
example.com?page=my-account-dashboard&id=1
This works fine so far, but internally the links still use those query parameters. Is it possible to redirect to the pretty URLs where possible? What rewrite rules would do that?
First of all, a few remarks about your current code, which contains some errors.
1) A RewriteCond applies only to the very next RewriteRule, so your second RewriteRule can match without those conditions (try it and you'll see). You need to repeat the conditions before the other RewriteRule (or use the S skip flag to simulate an if/else, but that gets complicated for nothing).
2) I'm pretty sure you don't want to use the QSA flag the way you do. It tells mod_rewrite to append the original query string to the rewritten URL. Example: example.com/my-account-dashboard/?foo=bar will be rewritten to /?page=my-account-dashboard&foo=bar. Unless you really want that, you don't need it. A lot of people think they need QSA whenever they add a query string in the substitution, as you do. This isn't an error that will make everything crash, but it's still not quite correct.
3) Your rules create duplicate content, which is bad for SEO. For instance, example.com/my-account-dashboard and example.com/my-account-dashboard/ (notice the trailing slash) both lead to the same page, but search engines won't treat them as the same URL. Search for "duplicate content" in any search engine to read more about it. A simple way to avoid this is to choose one form, either with or without the trailing slash.
Now that the basics are clear, let's answer your question. You can't simply use a redirect (R flag) from the old URL to the new URL, because you'd end up with an infinite loop: the redirect target is rewritten back to the old URL, which then matches the redirect rule again. THE_REQUEST solves this. It holds the original request line sent by the client and is not changed by internal rewrites, so a condition on it matches only direct client requests, not URLs produced by mod_rewrite itself.
All in all, here is how your code should look:
RewriteEngine On
RewriteBase /
# Redirect old-url /?page=XXX to new-url equivalent /XXX
RewriteCond %{THE_REQUEST} \s/\?page=([^/&\s]+)\s [NC]
RewriteRule ^ /%1? [R=301,L]
# Redirect old-url /?page=XXX&id=YYY to new-url equivalent /XXX/YYY
RewriteCond %{THE_REQUEST} \s/\?page=([^/&\s]+)&id=([0-9]+)\s [NC]
RewriteRule ^ /%1/%2? [R=301,L]
# if /XXX is not a file/directory then rewrite to /?page=XXX
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^/?([^/]+)$ /?page=$1 [L]
# if /XXX/YYY is not a file/directory then rewrite to /?page=XXX&id=YYY
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^/?([^/]+)/([0-9]+)$ /?page=$1&id=$2 [L]
NB: I chose the "without trailing slash" option (e.g. example.com/my-account-dashboard and example.com/my-account-dashboard/1). Feel free to ask if you want the version with a trailing slash.
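If requests that arrive with a trailing slash should end up on the slash-less URL instead of falling through, one possible approach is an extra external redirect. This is only a sketch and not part of the rules above; it assumes the .htaccess lives in the document root and that real directories should keep their trailing slash.

```apache
# Assumption: placed after the THE_REQUEST redirects and before the internal rewrites.
# Leave real directories alone, since mod_dir expects their trailing slash.
RewriteCond %{REQUEST_FILENAME} !-d
# /XXX/ -> /XXX and /XXX/YYY/ -> /XXX/YYY (the query string, if any, is carried over)
RewriteRule ^(.+)/$ /$1 [R=301,L]
```

With this in place only one form of each URL is served, which addresses the duplicate-content remark in point 3.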

Rewrite URL so that query string is in path

I'm trying to get the following path: /faculty/index.php?PID=FirstLast&type=alpha
To rewrite to this: /faculty/FirstLast
Am I correct to assume the following would be acceptable to put in .htaccess?
# Rewrite old URLS
RewriteCond %{QUERY_STRING} ^PID=([0-9a-zA-Z]*)$
RewriteRule ^/faculty/index.php$ /faculty/%1 [R=302,L]
I'm okay to throw away any other query string variables. I'm applying these rules at the .htaccess file level. This project is a migration from an older system into Drupal.
Outcome:
My .htaccess looks like
# Rewrite old URLS
RewriteCond %{QUERY_STRING} PID=([0-9a-zA-Z]*)
RewriteRule ^faculty/ /faculty/%1/? [R=301,L]
RewriteCond %{QUERY_STRING} vidID=([0-9]*)
RewriteRule ^videos/ /video/id/%1/? [R=301,L]
I also found this wonderful tool -- a mod_rewrite tester
http://htaccess.madewithlove.be/
All good!
Keep your RewriteCond and try this rule instead:
RewriteRule ^faculty/index\.php$ /faculty/%1? [R=302,L]
In .htaccess (per-directory) context, the leading slash is stripped from the URI path tested by the rule, so it can't be in the regex either.
The existing query string is passed through unchanged to the substitution URL unless the rule creates a new one, so the trailing question mark ? is there to discard the original query string when the rule fires.
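Putting the condition and the rule together could look like the sketch below. Two things in it are assumptions rather than part of the answer: the condition is loosened so that PID is found even when other parameters such as type=alpha follow it, and the QSD flag (available from Apache 2.4) is used in place of the trailing ? to drop the query string.

```apache
RewriteEngine On
# Find PID anywhere in the query string; the non-capturing groups keep %1 as the PID value
RewriteCond %{QUERY_STRING} (?:^|&)PID=([0-9a-zA-Z]+)(?:&|$)
# QSD discards the original query string (Apache 2.4+).
# On older versions, drop QSD and end the substitution with "?" instead.
RewriteRule ^faculty/index\.php$ /faculty/%1 [QSD,R=302,L]
```

Using + rather than * in the capture means an empty PID does not trigger the redirect.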

mod_rewrite conflict with other $_GET vars

I am using mod_rewrite to make my URLs clean. By doing so:
RewriteRule ^([a-zA-Z0-9_-]+)/([a-zA-Z0-9_-]+)$ index.php?page=$1&sub=$2
RewriteRule ^([a-zA-Z0-9_-]+)/([a-zA-Z0-9_-]+)/$ index.php?page=$1&sub=$2
RewriteRule ^([a-zA-Z0-9_-]+)/([a-zA-Z0-9_-]+)/([0-9]+)$ index.php?page=$1&sub=$2&id=$3
RewriteRule ^([a-zA-Z0-9_-]+)/([a-zA-Z0-9_-]+)/([0-9]+)/$ index.php?page=$1&sub=$2&id=$3
RewriteRule ^([a-zA-Z0-9_-]+)/([a-zA-Z0-9_-]+)/([a-zA-Z0-9_-]+)$ index.php?page=$1&sub=$2&action=$3
RewriteRule ^([a-zA-Z0-9_-]+)/([a-zA-Z0-9_-]+)/([a-zA-Z0-9_-]+)/$ index.php?page=$1&sub=$2&action=$3
This changes everything nicely, for example from:
website.com/index.php?page=user&sub=profile to website.com/user/profile/.
But what if there are other $_GET variables after profile? For example, if the user requests:
website.com/user/profile/?do=that&go=ahead
When I try to print $_GET['do'] and $_GET['go'], they return empty.
Any ideas?
Suggestions to make my mod_rewrite code shorter are also welcome :)
In that case, you need to add the QSA flag, which appends the original query string to the rewritten URL. Furthermore, I'd say all six of your rules are basically doing the same thing. You can make parts of a regular expression optional with the ? operator. For instance, these lines:
RewriteRule ^([a-zA-Z0-9_-]+)/([a-zA-Z0-9_-]+)$ index.php?page=$1&sub=$2
RewriteRule ^([a-zA-Z0-9_-]+)/([a-zA-Z0-9_-]+)/$ index.php?page=$1&sub=$2
... can be merged as:
RewriteRule ^([a-zA-Z0-9_-]+)/([a-zA-Z0-9_-]+)/?$ index.php?page=$1&sub=$2 [QSA]
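Applying the same idea to all six rules from the question might look like the following sketch. The [L] flag is an addition that is not in the original rules, and the numeric id rule is kept ahead of the action rule because the action pattern would also match digits.

```apache
# page/sub, with or without trailing slash
RewriteRule ^([a-zA-Z0-9_-]+)/([a-zA-Z0-9_-]+)/?$ index.php?page=$1&sub=$2 [QSA,L]
# page/sub/123 -> id (must come before the more general action rule)
RewriteRule ^([a-zA-Z0-9_-]+)/([a-zA-Z0-9_-]+)/([0-9]+)/?$ index.php?page=$1&sub=$2&id=$3 [QSA,L]
# page/sub/anything-else -> action
RewriteRule ^([a-zA-Z0-9_-]+)/([a-zA-Z0-9_-]+)/([a-zA-Z0-9_-]+)/?$ index.php?page=$1&sub=$2&action=$3 [QSA,L]
```

With QSA on each rule, a request for website.com/user/profile/?do=that&go=ahead should reach index.php with page, sub, do and go all present in $_GET.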

mod rewrite and static pages

Is it possible to exclude a URL from being processed by mod_rewrite?
My .htaccess has rewrite rules like
RewriteRule ^contact contact_us.php
and a couple more static pages.
Currently my site has no trouble because it uses http://domain.com/user.php?user=username
but now I need to rewrite to:
http://domain.com/username
I've tried:
RewriteRule ^(.*)$ user.php?user=$1 [L]
but my whole site stops working...
Is it possible to keep my static pages like contact/feed/etc. from being treated as usernames?
Edit, as David requested:
This is my actual .htaccess file:
RewriteEngine On
Options +Followsymlinks
RewriteRule ^contact contact_us.php [L]
RewriteRule ^terms terms_of_use.php [L]
RewriteRule ^register register.php [L]
RewriteRule ^login login.php [L]
RewriteRule ^logout logout.php [L]
RewriteRule ^posts/(.*)/(.*) viewupdates.php?username=$1&page=$2
RewriteRule ^post(.*)/([0-9]*)$ viewupdate.php?title=$1&id=$2
RewriteRule ^(.*)$ profile.php?username=$1 [L]
I've also enabled the mod_rewrite log; here is my first log file: http://pastie.org/1044881
Put the rewrite rules for the static pages first, and add the [L] flag to them:
RewriteRule ^contact contact_us.php [L]
...
then after those, use your rewrite rule for the username:
RewriteRule ^(.*)$ user.php?user=$1 [L]
(hopefully nobody has a username of contact).
EDIT: Based on the log output you posted (which I'm assuming corresponds to an unsuccessful attempt to access the contact page... right?), try changing the contact rewrite rule to either
RewriteRule ^contact$ contact_us.php [L]
or
RewriteRule ^contact contact_us.php [L,NS]
That is, either add $ to make the pattern match only the literal URL contact, or add the NS flag to keep it from applying to subrequests. According to the log output, what seems to have happened is that Apache rewrites contact to contact_us.php and then does an internal subrequest for that new URL. So far so good. The weird thing is that the ^contact pattern again matches contact_us.php, "transforming" it to contact_us.php, i.e. the same thing, which Apache interprets as a signal that it should ignore the rule entirely. Now, I would think Apache would have the sense to ignore the rule only on the subrequest, but I'm not sure if it's ignoring the entire rewriting process and leaving the original URL, /contact, as is. If that's the case, making one of the changes I suggested should fix it.
EDIT 2: your rewrite log excerpt reminded me of something: I'd suggest making the rewrite rule
RewriteRule ^([^/]+)$ user.php?user=$1 [L]
since slashes shouldn't be occurring in any usernames. (Right?) Or you could do
RewriteRule ^(\w+)$ user.php?user=$1 [L]
if usernames can only include word characters (letters, numbers, and underscore). Basically, make a regular expression that matches only any sequence of characters that could be a valid username, but doesn't match URLs of images or CSS/JS files.
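The -f and -d tests in RewriteCond check whether the test string (here REQUEST_FILENAME) is an existing file or directory on disk. Negated with !, they restrict the following rule to requests that don't map to an existing file or directory: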
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ....
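Combining the suggestions from both answers, a trimmed-down version of the .htaccess from the question might look like the sketch below. Only two of the static pages are shown, and the choice of anchored patterns plus the [^/]+ catch-all is one of the options discussed above, not the only possible one.

```apache
RewriteEngine On

# Static pages first, anchored so that e.g. contact_us.php is not matched again
RewriteRule ^contact$ contact_us.php [L]
RewriteRule ^terms$ terms_of_use.php [L]

# Catch-all: only when the request is not an existing file or directory,
# so images, CSS/JS and the rewritten *.php targets are left alone
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^/]+)$ profile.php?username=$1 [L]
```

Because the two conditions apply only to the rule directly after them, they have to sit immediately before the catch-all rule.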

Rewrite rule fails to match single quote chars

I asked this question earlier:
mod_rewrite: match only if no previous rules have matched?
And have been using the suggested solution with success for a while now:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{THE_REQUEST} ^.+\ (/[^?\s]*)\??([^\s]*)
RewriteCond %{REQUEST_URI}?%{QUERY_STRING}<%1?%2 ^([^<]*)<\1$
RewriteRule .* /pub/dispatch.php [L]
However, we've since discovered that this rule fails for URLs containing single quote chars, e.g. http://example.com/don't_do_it (which is actually requested as http://example.com/don%27t_do_it)
Specifically, this is the line that's failing to match:
RewriteCond %{REQUEST_URI}?%{QUERY_STRING}<%1?%2 ^([^<]*)<\1$
Commenting it out causes the rule to match as expected, but breaks the "match only if no previous rules have matched" behavior. This is presumably related to the fact that ' is URL-encoded as %27.
Here's the relevant RewriteLog entry (for the url /asdf'asdf aka /asdf%27asdf):
RewriteCond: input='/asdf'asdf?</asdf%27asdf?' pattern='^([^<]*)<\1$' => not-matched
What I'm seeing here is that %{REQUEST_URI} is unescaped while the path captured from %{THE_REQUEST} is still escaped, hence the mismatch. Is there an alternative to either one of those that I should be using?
Any ideas how to rewrite the above line so that it will also match lines that contain ' chars?
Try the C flag and chain the sequence of rules of which you just want one to be applied. So actually chain all of your rules.
You could try adding the [NE] flag at the end of the RewriteRule.
After beating on it for quite some time, things are looking good with:
RewriteMap unescape int:unescape
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{THE_REQUEST} ^.+\ (/[^?\s]*)\??([^\s]*)
RewriteCond %{REQUEST_URI}?%{QUERY_STRING}<${unescape:%1}?%2 ^([^<]*)<\1$
RewriteRule .* /pub/dispatch.php [L]
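One caveat when reproducing this: RewriteMap may only be declared in server or virtual-host context, not in .htaccess, although a map declared there can then be referenced from rules in .htaccess. A minimal sketch of the declaration, where the virtual host and its ServerName are placeholders:

```apache
# httpd.conf or a vhost file; example.com is a placeholder
<VirtualHost *:80>
    ServerName example.com
    RewriteEngine On
    # Expose the built-in unescape function under the map name "unescape"
    RewriteMap unescape int:unescape
</VirtualHost>
```

The conditions and the rule themselves can then stay wherever they were, using ${unescape:%1} as shown above.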
