New NICE URLs with 301s. How to make them work Together? - mod-rewrite

I have this old website URL structure:
site.com/folder/prod.php?cat=MAIN%20CAT%20&prodid1=123&prodtitle=PROD%20TITLE&subcat=SUB%20CAT
and a real example would be something like:
site.com/folder/prod.php?cat=CAR%20AUDIO&prodid1=4444&prodtitle=MTX%20AMPS&subcat=AMPS
Here you can see that the product page takes 4 variables: category, product id, product title and sub category. Some of these variables were used to open a menu. And yes, the URL carries values with spaces and in both lower and upper case.
The new site url has a new structure:
site.com/x/product-title-prodid2
A real example would be:
site.com/x/mtx-amps-8888
This is accomplished using two variables (friendly slug + a second product id: prodid2) with the following code in the .htaccess:
<IfModule mod_rewrite.c>
Options +Indexes
Options +FollowSymlinks
RewriteEngine on
RewriteBase /
RewriteRule ^p/(.*)/$ product.php?prodid2=$1
RewriteRule ^p/(.*)$ product.php?prodid2=$1
</IfModule>
Internally we can get prodid2 if we have prodid1 from the same table, but not vice versa.
Everything works fine, but we now have to create 301 redirects. Since the old and new URLs do not use the same variables, this gets tricky: apparently we need a single rule that handles both the nice URL creation and the 301s?
We have tried adding the following to the htaccess:
RewriteCond %{QUERY_STRING} ^cat=CAR%20AUDIO&prodid1=4444&prodtitle=MTX%20AMPS&subcat=AMPS$ [NC]
RewriteRule ite.com/folder/prod.php site.com/x/mtx-amps-8888? [R=301,L]
and it works for only 1 product, but when adding 2 or more, the site goes down. I imagine this would be an infinite loop?
An alternative would be adding a:
ErrorDocument 404 /404.php
to get the URL and redirect to the page, but this would be ugly for SEs.
UPDATE:
Sorry for my lack of understanding on this topic, I am very new to this.
The product has 2 important ids. For example:
MTX AMP (which is the actual product title), if listed in 3 categories, will have 1 single prodid2 repeated and 3 different prodid1 values (1 for each category). They all reside in the same table. So, if we have a prodid1 we can get the prodid2, which is right next to it in the db table.
The rule to get a nice URL on the new site is pulled using prodid2:
RewriteRule ^p/(.*)$ product.php?prodid2=$1
which brings the complete value stored in the database. e.g. mtx-amps-8888 << this is a mix of a slug + the prodid2
complete url is:
site.com/p/mtx-amps-888
(the p is just a virtual folder and we take advantage of that variable to show the right page template)
So mtx-amps-888 is not 3 keys; the value is generated when creating a product and saved as a whole in a single field in the db. It already includes the - separators, so this is not done in the htaccess.
The cat (key) value is really only used to expand a menu in the old site, but to create the 301 redirect we would probably use prodid1, since we can match that value to get a prodid2. prodid2 is used as the main query to get the nice URL in the new site, and its value will bring the nice URL stored in the db.
What makes sense from all my research would be the following:
<IfModule mod_rewrite.c>
Options +Indexes
Options +FollowSymlinks
RewriteEngine on
RewriteBase /
RewriteRule ^p/(.*)$ product.php?prodid2=$1
RewriteCond %{QUERY_STRING} ^cat=CAR%20AUDIO&prodid1=4444&prodtitle=MTX%20AMPS&subcat=AMPS$ [NC]
RewriteRule ite.com/folder/prod.php site.com/x/mtx-amps-8888? [R=301,L]
RewriteCond %{QUERY_STRING} ^cat=CAR%20AUDIO&prodid1=5555&prodtitle=BOSS%20AMPS&subcat=AMPS$ [NC]
RewriteRule ite.com/folder/prod.php site.com/x/mtx-amps-8888? [R=301,L]
RewriteCond %{QUERY_STRING} ^cat=CAR%20VIDEO&prodid1=6666&prodtitle=ALPINE%20DVDS&subcat=DVD%20PLAYERS$ [NC]
RewriteRule ite.com/folder/prod.php site.com/x/mtx-amps-8888? [R=301,L]
</IfModule>
Please note that I removed a line from the main rewrite rule:
RewriteRule ^p/(.*)/$ product.php?prodid2=$1
This only ensures that the user can also use / at the end of the URL: site.com/p/mtx-amps-888/
I also repeated the rewrite condition for the 301 redirects of 3 products, but I really have about 3K products to list here. If I keep 1, it works, but if I add 2, I believe a loop is created.
Hopefully this makes sense. You have no idea how important it is for me to get this up and running, so my best wishes to those who can help :)

Just re-create the file /folder/prod.php and have php do the redirect. This is the easiest and cleanest solution.
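<?php
// Old URLs carry prodid1; cast to int so only a numeric id is ever used.
$prodid1 = isset($_GET['prodid1']) ? (int) $_GET['prodid1'] : 0;
// Calculate prodid2 based on prodid1, or use MySQL to retrieve the prodid2 belonging to prodid1.
$prodid2 = $prodid1; // just for testing
$newpath = "/p/$prodid2/";
// Redirect permanently: the third argument of header() sets the 301 status code.
header("Location: http://{$_SERVER['HTTP_HOST']}{$newpath}", true, 301);
exit; // stop here so nothing else is sent after the redirect
?>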
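If the redirects have to live in mod_rewrite rather than PHP, one option for a few thousand products is a RewriteMap lookup instead of one condition per product. This is only a sketch under some assumptions: you can edit the server or virtual-host config (RewriteMap is not allowed in .htaccess), and the map name prodmap, the file path and the file contents are hypothetical, exported from your table as one "prodid1 slug" pair per line.

```apache
# Virtual-host (or server) config; RewriteMap cannot be declared in .htaccess.
RewriteEngine on

# Hypothetical map file, one "prodid1 new-slug" pair per line, e.g.:
#   4444 mtx-amps-8888
RewriteMap prodmap "txt:/path/to/prodmap.txt"

# Capture prodid1 from the old query string, wherever it appears in it.
RewriteCond %{QUERY_STRING} (?:^|&)prodid1=([0-9]+)(?:&|$)
# Look the id up; this condition only matches when the map has an entry.
RewriteCond ${prodmap:%1} ^(.+)$
# %1 now holds the slug; the trailing ? drops the old query string.
RewriteRule ^/folder/prod\.php$ /p/%1? [R=301,L]
```

Requests with an unknown prodid1 do not match the second condition, so they fall through to whatever /folder/prod.php does today.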

Related

Mod_rewrite engine enabled slows my dedicated beast of a box to a crawl

I've been troubleshooting an issue with .htaccess on a script that I've purchased and it's causing me quite a bit of strife. If I have rewrite enabled and use Blitz.io for a 1-250 test, it gets to about 5 users before timing out on all requests. I can see no server resource contention when this occurs, yet on occasion I do see an event from Apache saying that I've used up the maximum connections. This can't be right, as I've set it to handle several thousand connections.
Further backing up the rewrite theory, if I disable rewrite and run a Blitz against the same php page it completes the test without errors or timeouts of any significance (it also breaks most of the script :)). I also notice that my response time in Blitz with rewrite off is about 250ms max, whereas if I enable the rewrite engine it shoots up to past one second.
Any suggestions would be greatly appreciated. I've searched quite a bit and haven't come up with much; granted, I'm a rewrite n00b.
Thanks in advance, going to go ice my head now...
# enable apache modRewrite module #
RewriteEngine on
RewriteBase /
# set files headers
<IfModule mod_headers.c>
<FilesMatch "\.(css|js|png|gif|jpg|jpeg|htc)$">
Header set Cache-Control "max-age=2678400, public, must-revalidate"
</FilesMatch>
</IfModule>
# allow request methods
<Limit POST PUT DELETE GET OPTIONS HEAD>
Order deny,allow
Allow from All
</Limit>
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
ErrorDocument 404 /404.html
# non last slash redirect
RewriteCond %{REQUEST_URI} !(\.php|\.html|\.xml|\.txt|[\/])$ [NC]
RewriteRule ^(.*)$ http://%{HTTP_HOST}/$1/ [NC,L,R=301]
# define system languages
#RewriteRule ^([a-zA-Z]{2})$ index.php?page=$1 [QSA,L]
# define paging
RewriteRule ^([^//]+)/?(.*)?/index([0-9]*).ht(m?ml?)$ index.php?page=$1&rlVareables=$2&pg=$3 [QSA,L]
# define listing
RewriteRule ^(([\w\-\_]+)?/)(.+)-l?([0-9]+).ht(m|ml)$ index.php?page=$2&rlVareables=$3&listing_id=$4 [QSA,L]
# wildcard request
RewriteCond %{HTTP_HOST} ^((?!www\.|m\.|mobile\.).*)\..+\.[^/]+$ [NC]
#RewriteCond %{HTTP_HOST} ^((?!www\.|m\.|mobile\.).*)\..+$ [NC] # FIRST LEVEL DOMAIN (localhost) USAGE
RewriteRule (.*) index.php?page=%1&wildcard&rlVareables=$1 [QSA,L]
# account request (sub-directory)
RewriteRule ^((\w{2})/)?([\w-_]{3,})$ index.php?page=$3&lang=$2&account_request [QSA,L]
# define single pages
RewriteRule ^([^//]+)/?(^/*)?.ht(m?ml?)$ index.php?page=$1 [QSA,L]
# define other pages
RewriteRule ^([^//]+)/?(.*)?/?(.*)?(.ht(m?ml?)|/+)$ index.php?page=$1&rlVareables=$2 [QSA,L]
Have you looked at your access logs and rewrite.logs (if you can temporarily enable the latter)?
One thing that does jump out is the Header directive for your furniture (css, jpegs, etc.) and specifically the must-revalidate flag. This can force client browsers to issue a conditional GET for every image etc. This is not the default behaviour. Browsers will assume a cacheable life of 10% of the age of any static file (that is, if it is 10 weeks old, the browser will only revalidate the file once per week). OK, most of these GETs will result in a 304 "not modified" response, but this still means that Apache has to validate these requests, and this could easily increase the overall request rate to your server by 5-10x.
The "non last slash redirect" will fire for all URIs other than php, html, xml and txt files, including jpegs etc. The two REQUEST_FILENAME conditions should immediately precede the REQUEST_URI condition, i.e. the ErrorDocument directive needs to be moved up, above those two conditions.
You also need to use test vectors to check out the single and other page regexps. They are valid syntax but won't give you what I think you want. For example, [^//] is the same as [^/]; .ht(m?ml?) matches shtmml; and the ^/* should probably read [^/]*, so this rule currently only matches if ^([^//]+)/? matches to the null string and thus degenerates to (^/*)?.ht(m?ml?)$.
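The reordering described above might look like the following. It is a sketch that only rearranges the directives from the question, so the patterns themselves are unchanged.

```apache
ErrorDocument 404 /404.html

# non last slash redirect
# Skip existing directories and files (images, css, js ...), then test the URI.
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !(\.php|\.html|\.xml|\.txt|[\/])$ [NC]
RewriteRule ^(.*)$ http://%{HTTP_HOST}/$1/ [NC,L,R=301]
```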
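As an illustration of a tightened pattern, here is a sketch of the "single pages" rule. It assumes single pages are requested as name.htm or name.html at the top level, which is a guess at the intent and not something the script documents.

```apache
# define single pages
# [^/]+ is one path segment; \. is a literal dot; html? accepts only .htm and .html.
RewriteRule ^([^/]+)\.html?$ index.php?page=$1 [QSA,L]
```

Run the same test URLs against the old and new patterns before swapping them, since the script may rely on matches the original pattern allowed.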
I'd ask for my money back if I were you :-(

Joomla htaccess rewrite url - Parameter must index by a number - Why?

This is the first time I have put a question here, so don't be hard on me. Thank you.
I am currently setting up a Joomla site. I created a page, a new template and a module, and inside template/index.php I call my module.
The original url that works is something like:
index.php/danh-sach-game?gt_name=game_mang_xa_hoi
danh-sach-game: is the page.
game-mang-xa-hoi: is the input parameter to the module.
Everything works fine, but I want to rewrite the URL to this:
danh-sach-game/game-mang-xa-hoi
So I created a .htaccess with this content:
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^danh-sach-game/(.*)$ index.php/danh-sach-game?gt_name=$1 [L]
</IfModule>
Now it is time for the "MAGIC":
If I enter the URL:
danh-sach-game/game-mang-xa-hoi
then Joomla shows the message "An error has occurred. The requested page cannot be found."
But if I prefix the parameter with a number like this:
danh-sach-game/1-game-mang-xa-hoi (note: the number 1).
then it works fine. Any parameter prefixed with a number works fine.
If I rewrite the URL to a test file (replacing index.php with test.php), then test.php receives the parameter as usual, with or without the number prefix.
This is because Joomla and most Joomla extensions use the ID column of the content as an identifier to look up the database. Some SEF tools (for example AceSef, sh404Sef) provide the facility to look up using the alias name (the text after the number and hyphen), but at the additional cost of database queries (they in turn query for the proper URL internally).
The number in the last part of the URL will be processed and passed as ID of the particular page/content that you are viewing. This is done in the particular component's router.php file. So check the router.php file of the component you are using to check how the url gets parsed.

Apache Mod Rewrite Rule Special Condition

I'm trying to redirect an old domain to its new domain, but there are some rules which I need to put in place and so far I haven't managed to get it quite right.
The old domain, e.g. www.old-domain.com, has hundreds of folders named after UK towns like this:
www.old-domain.com/sheffield/
www.old-domain.com/london/
www.old-domain.com/essex/
Each of these folders contains an index.html file and possibly other directories and files.
I need to redirect them to the new domain so that old domain/town maps to new domain/town, but old domain/town/index.html doesn't put index.html on the new domain end. However, if the path after the town is anything other than index.html, it should redirect to the same path on the new domain.
Sorry, that isn't the easiest to explain and not the easiest to read and understand, I'm sure.
www.old-domain.com/sheffield => www.new-domain.com/sheffield
www.old-domain.com/sheffield/ => www.new-domain.com/sheffield/
www.old-domain.com/sheffield/index.html => www.new-domain.com/sheffield/
www.old-domain.com/sheffield/main.html => www.new-domain.com/sheffield/main.html
www.old-domain.com/sheffield/innerFolder/ => www.new-domain.com/sheffield/innerFolder
www.old-domain.com/sheffield/innerFolder/file.php => www.new-domain.com/sheffield/innerFolder/file.php
The two in bold above I managed to get working by this:
Options +FollowSymLinks
RewriteEngine on
RewriteRule ^sheffield/(.*)$ http://www.new-domain.com/sheffield/$1 [R=301,L]
However I'm really struggling to get old-domain.com/sheffield/index.html to not put index.html on the new domain.
Can anyone shed any light on this before I pull my hair out staring at mod_rewrite tutorials for any more hours?
Hint: The rewrite rules are processed on a first-match basis.
You can put your exception before the main rule.
After roughly 4-5 hours of trying different combinations and reading God knows how many RewriteRule tutorials I managed to get there. Here's the htaccess file for just 3 locations, which all work wonderfully now.
Options +FollowSymLinks
RewriteEngine on
RewriteRule ^sheffield\/index\.html http://www.new-domain.co.uk/sheffield/ [R=301,L]
RewriteRule ^sheffield/(.*)$ http://www.new-domain.co.uk/sheffield/$1 [R=301,L]
RewriteRule ^bolton\/index\.html http://www.new-domain.co.uk/bolton/ [R=301,L]
RewriteRule ^bolton/(.*)$ http://www.new-domain.co.uk/bolton/$1 [R=301,L]
RewriteRule ^coventry\/index\.html http://www.new-domain.co.uk/coventry/ [R=301,L]
RewriteRule ^coventry/(.*)$ http://www.new-domain.co.uk/coventry/$1 [R=301,L]
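With hundreds of towns, the per-town pairs could probably be collapsed into two generic rules. This is a sketch, assuming every path on the old domain should go to the same path on the new one and that the only special case is index.html directly inside a top-level folder.

```apache
Options +FollowSymLinks
RewriteEngine on
# Exception first: /town/index.html redirects to /town/ for any town.
RewriteRule ^([^/]+)/index\.html$ http://www.new-domain.co.uk/$1/ [R=301,L]
# General rule: everything else keeps its path on the new domain.
RewriteRule ^(.*)$ http://www.new-domain.co.uk/$1 [R=301,L]
```

The order matters for the same reason as in the per-town version: the first matching rule with [L] wins, so the exception has to come before the catch-all.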

Rewrite Condition not working for Rewrite rule on individual pages

I am trying to write a rule that will capture any url that does NOT have sales/anything up to a .php or .php3 file and anything after that - if there is anything - and rewrite that to a new website as per below:
RewriteCond %{REQUEST_URI} !^(/sales/.*php3?).*
RewriteRule ^/sales/([^./]*)$ http://www2.domain.com/sales$1/index.shtml [R,L]
It captures if I put in www.domain.com/sales, but if I put in just http://www.domain.com/sales/trucks.shtml it does not capture the individual pages.
Can anyone see what I need to do to get this to work correctly please?
To clarify:
If I put in the url www.domain.com/sales, the site redirects to www2.domain.com/sales/index.shtml. However, if I put in the url www.domain.com/sales/trucks.shtml, the condition is not picked up and the url does not rewrite to the www2 site, so I am still stuck on the old page. Thanks for your help.
Alright, use these 2 rules for your requirements:
RewriteRule ^sales/?$ http://www2.domain.com/sales/index.shtml [R,L,NC]
RewriteRule ^sales/(?!.*\.php3?$).*$ http://www2.domain.com%{REQUEST_URI} [R,L,NC]
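For readers who find the negative lookahead hard to read, the second rule can probably be written with a RewriteCond instead. This is a sketch, assuming the rules live in the .htaccess of the document root and that only URLs ending in .php or .php3 should stay on the old site.

```apache
# /sales or /sales/ goes to the new index page.
RewriteRule ^sales/?$ http://www2.domain.com/sales/index.shtml [R,L,NC]
# Anything else under /sales/ is redirected unless it ends in .php or .php3.
RewriteCond %{REQUEST_URI} !\.php3?$ [NC]
RewriteRule ^sales/.+$ http://www2.domain.com%{REQUEST_URI} [R,L,NC]
```

Note that in .htaccess the RewriteRule pattern is matched against the path without its leading slash, which is why the patterns start with ^sales and not ^/sales.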

Dealing with non-hardcoded domain names with mod_rewrite

I am migrating my application which provides a subsite for each user from domain.com/~user to user.domain.com. In order to do this, I wrote the following RewriteRule:
RewriteRule ^~([a-z_]+)(/.*)?$ http://$1.%{HTTP_HOST}$2 [R=301,QSA,NC]
However, %{HTTP_HOST} doesn't do exactly what I need it to, because if for instance a user browses to www.domain.com/~user, it'll redirect to user.www.domain.com which is obviously not what I'm looking for.
I know that I can replace %{HTTP_HOST} with a hardcoded domain, but I don't want to do this either, because I will be rolling out the changes on multiple domains and don't want to have to customize it for each one. Is there a better way to make a singular change without hardcoding? (Furthermore, what if the base domain already has a subdomain -- ie. sub.domain.com/~user -> user.sub.domain.com)
Try it with this additional RewriteCond:
RewriteCond %{HTTP_HOST} ^(www\.)?(.+)
RewriteRule ^~([a-z_]+)(/.*)?$ http://$1.%2$2 [R=301,QSA,NC]
This will remove the www. prefix from the host if present.
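To make the backreferences concrete, here is the same pair of directives annotated with what each capture holds for two sample requests, using the hostnames from the question.

```apache
# Request: http://www.domain.com/~user/page
#   %1 = "www."   %2 = "domain.com"       $1 = "user"   $2 = "/page"
#   redirects to http://user.domain.com/page
# Request: http://sub.domain.com/~user
#   %1 = ""       %2 = "sub.domain.com"   $1 = "user"   $2 = ""
#   redirects to http://user.sub.domain.com
RewriteCond %{HTTP_HOST} ^(www\.)?(.+)
RewriteRule ^~([a-z_]+)(/.*)?$ http://$1.%2$2 [R=301,QSA,NC]
```

The %N references come from the RewriteCond and the $N references from the RewriteRule pattern, so the host is never hardcoded.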

Resources