mod_rewrite to strip special characters and shorten URL simultaneously

Gang,
Long-time sysadmin but first-time poster to this excellent site, so please be gentle.
I am not strong at regex yet, and I am trying to do two things at once on our internally hosted MediaWiki site.
We are running an otherwise pretty plain-Jane LAMP stack (CentOS 5.x, Apache 2.x, PHP 5.x). We are root. We are using /etc/httpd/conf.d/wiki.conf and not .htaccess. The physical path is /var/www/html/wiki/
I have had partial success with some combination of the below, but I am not good enough to get it all the way there. I know there are some mod_rewrite studs on this site whose help I am hoping to enlist.
I am following this recipe, https://www.mediawiki.org/wiki/Manual:Short_URL/Apache, so as to shorten URLs from www.example.com/wiki/index.php?title=Garden_Store to www.example.com/wiki/Garden_Store
Still allow the use of www.example.com/wiki/index.php?title=Garden_Store should the user choose to type out that style of URL. (I believe it is possible with MediaWiki to use both styles of URL at the same time. If it is impossible, then I will be forced to skip the short URL and use the style with the variable in it.)
Last, strip special characters from the URL, so that www.example.com/wiki/index.php?title=Garden,_Store! becomes www.example.com/wiki/index.php?title=Garden_Store .
Another example of that: www.example.com/wiki/index.php?title=Garden_Store,_Inc. ought to become www.example.com/wiki/index.php?title=Garden_Store_Inc
One last example, just to make sure that I am communicating well: getting this "/title=Garden%20Store,%20Inc" but wanting this "/index.php?title=Garden%20Store%20Inc", as I know that the spaces are replaced with underscores inside of MediaWiki.
Thanks so much for walking a newbie the last bit to the finish line on this one.
Cheers.
Jason

Something like the following rules should do what you need:
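RewriteRule ^(.*)\ (.*)$ $1\_$2 [N]
RewriteRule ^(.*)[^a-zA-Z0-9\/\._](.*)$ $1$2 [N]
The first rule replaces a space with an underscore, and the second strips a character you don't want to keep in the resulting URL. Each rule changes only one character per pass, so the [N] flag is used to rerun the rules until nothing is left to replace; with [L], processing would stop after the first substitution. Note that you will probably need to adjust the set of allowed characters: as written it keeps periods, so the trailing "." in "Inc." survives. Also note that RewriteRule matches only the URL path, not the query string, so these rules clean /wiki/Garden,_Store! but not index.php?title=Garden,_Store!.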
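For context, here is a minimal sketch of how rules like these might sit in /etc/httpd/conf.d/wiki.conf. It assumes server or <VirtualHost> context on Apache 2.2 (the version shipped with CentOS 5). The /wiki/ prefix in the patterns, the log file path, and the use of [N] to repeat the rules are assumptions for illustration, not part of the MediaWiki recipe.

```apache
RewriteEngine On

# Apache 2.2 only: log each rewrite pass while testing.
# The log path is an assumption; remove both lines once the rules work.
RewriteLog /var/log/httpd/rewrite.log
RewriteLogLevel 3

# Replace one space with an underscore per pass, only under /wiki/.
# [N] restarts the rule set, so the loop ends when no spaces remain.
RewriteRule ^(/wiki/.*)\ (.*)$ $1_$2 [N]

# Remove one disallowed character per pass, only under /wiki/.
# Periods and slashes are kept; adjust the class to taste.
RewriteRule ^(/wiki/.*)[^a-zA-Z0-9/._](.*)$ $1$2 [N]
```

Restricting the patterns to /wiki/ keeps the rules from altering other paths on the server, such as file names that contain hyphens. On Apache 2.4 the two RewriteLog directives no longer exist and the LogLevel directive is used for rewrite tracing instead.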

Related

mod_rewrite one folder somewhere

Hi, I have been running a forum for quite a while, and people started using it for spam etc. Even though I took it down, I still get around 100 clicks a day for threads that don't exist, on a forum that no longer exists either.
I am tired of seeing this crap in my Piwik stats. I want to handle it with mod_rewrite so that every time someone accesses site.com/samples/forum they go somewhere else: an actual redirect to fbi.com etc., preferably without showing my URL of course, or just to some non-existent folder, so that it does not trigger my stats.
You're probably looking for something along the lines of:
RewriteEngine on
RewriteRule /samples/forum http://othersite.com/path [L,R]
(The pattern can of course be adjusted if you need to match other paths.) There are many handy options you can use, all covered in Apache's mod_rewrite documentation.
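As a slightly fuller sketch of the same idea, the fragment below anchors the pattern and shows a second option that answers "410 Gone" instead of redirecting. It assumes the rules live either in the server config or in an .htaccess file at the document root; othersite.com/path is the placeholder target from the rule above.

```apache
RewriteEngine On

# Option 1: permanent redirect of the old forum path to another site.
# The optional leading slash lets the pattern match in both
# server config (path starts with /) and .htaccess (no leading slash).
RewriteRule ^/?samples/forum(/.*)?$ http://othersite.com/path [R=301,L]

# Option 2 (use instead of option 1): send "410 Gone" with no redirect.
# RewriteRule ^/?samples/forum(/.*)?$ - [G]
```

The [G] flag tells clients that the resource has been removed on purpose, which may be a better fit than a redirect when the goal is simply to make the old forum URLs go away.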

(ISAPI_Rewrite) Rewrite domain when certain file is requested

I am trying to create a condition to rewrite a url from http://subdomain.example.com/foo/foofile.txt to http://example.com/foo/foofile.txt when the request has foo/foofile.txt in it. If someone could point me in the right direction, I would appreciate any and all help...
Thanks in advance,
B
This is redirecting a specific subdomain to the main domain, for a specific file.
Which version of isapi_rewrite are you using, v2 or v3?
This assumes v3:
RewriteCond %{HTTP:Host} ^subdomain\.(.*)$
RewriteRule ^(/foo/foofile3\.txt)$ http://%1$1 [NC,R=301]
The RewriteCond tests for a host beginning with subdomain (and a dot), and captures the remaining (example.com) for convenience in the rewrite rule. You could also put the specific example.com in there, but this keeps it more generic.
The RewriteRule then looks for a specific /foo/foofile3.txt, and captures it (again for convenience of not repeating it in the rule). The result has %1 for the capture up in the condition (example.com) and $1 for the capture in the rule (/foo/foofile3.txt)
NC makes the match case-insensitive; R=301 makes the redirect permanent.
With v3, querystring parameters are automatically handled, so this keeps any that are after the file.
Other possibilities that make it a little more complex are any subdomain at all, or filename patterns instead of the one specific file.
I was having trouble with the browser caching my attempts while getting the rule right. So even though I changed the rule, the browser had cached my previous redirect. My quick fix was to count up the filename for new urls: foofile2, foofile3 (I guess I got it on the 3rd try...). I could have cleared the browser cache each time too; this was quicker.
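For the "any subdomain at all" case mentioned above, a possible variant is sketched below. It assumes the same ISAPI_Rewrite v3 syntax as the rule in this answer and that the main domain is example.com; it has not been tested against a live server.

```apache
# Hypothetical variant: match any single-label subdomain of example.com
# and capture the main domain as %1.
RewriteCond %{HTTP:Host} ^[^.]+\.(example\.com)$ [NC]
# Redirect the one specific file to the same path on the main domain.
RewriteRule ^(/foo/foofile\.txt)$ http://%1$1 [NC,R=301]
```

Because [^.]+ requires at least one character followed by a dot, a request that already arrives on example.com does not match the condition, so it is not redirected again.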

mod_rewrite: how to strip url of query string yet retain its values

I'd like to strip a URL of its query string using mod_rewrite but retain the values of the query string. For example, I'd like to change:
http://new.app/index.php?lorem=1&ipsum=2
to a nice clean:
http://new.app/
but retain the values of lorem and ipsum, so inside index.php:
$_GET["lorem"]
would still return 1 etc.
This is my first dabble with mod_rewrite so any help is greatly appreciated, and if you could explain exactly how your solution works, I can learn a little for next time too!
Thanks!
As Roland mentioned, you don't seem to understand the way rewriting works. It's typically done using Apache mod_rewrite in .htaccess, which silently rewrites the pretty URLs to the php script as /index.php?lorem=1&ipsum=2
Even Joomla uses .htaccess, except it has a single rewrite rule that passes EVERYTHING to a PHP script which does the actual rewriting in PHP.
What you are not understanding is that something still needs to exist in the "pretty" version for the php script to pull the value of $_GET["lorem"]
So it would be like http://new.app/lorem/ or http://new.app/section/lorem which would then (using mod_rewrite in .htaccess) rewrite TO the php script.
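To make that concrete, here is a minimal sketch of such a rule. It assumes an .htaccess file in the document root, and a hypothetical URL layout in which the two values appear as numeric path segments (http://new.app/1/2); neither assumption comes from the question.

```apache
RewriteEngine On

# /1/2  ->  index.php?lorem=1&ipsum=2
# This is an internal rewrite: the address bar keeps showing /1/2,
# while PHP sees $_GET["lorem"] and $_GET["ipsum"] as usual.
# QSA appends any query string the visitor supplied.
RewriteRule ^([0-9]+)/([0-9]+)/?$ index.php?lorem=$1&ipsum=$2 [L,QSA]
```

The values still travel in the URL, just in the path rather than after a question mark, which is why a completely bare http://new.app/ cannot carry them.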
I don't understand exactly what you want. Your first URL is the external form, which the users see and can type into their browsers.
The second form has almost all information stripped, so when you send that to a server, how is the server supposed to know that lorem=1&ipsum=2?
If your question is really
How do I make the URLs in the browser look nice, even if the user is somewhere deep in the website clicking on URLs that carry lots of information?
then there are two solutions:
You can pass the information in small bits to the server and save them all in a session. I don't like that because then the user cannot take the URL, show it to a friend and have him see the same page.
You can have your entire web site in an HTML <frameset> containing only one <frame>. That way, the URL of the top-level window will not change, only the inner URL (which is not displayed by the browser) will.

Creating user/search engine friendly URLs

I want to create a URL like www.facebook.com/username, just like Facebook does. Can we use mod_rewrite to do it? The username is the name of the user in a table. It is not a subdirectory. Please advise.
Sure, mod_rewrite can do that. Here is a tutorial on it.
Yes you can do this but you might have a couple of initial hurdles to get it going correctly.
The first is that you will have to use a regular expression to match it. If you don't know regex then this can be confusing at first.
The second is that, if you are going to rewrite the top-level path on the domain, you will need some mechanism for only rewriting when the file doesn't exist.
I guess that if mod_rewrite supports testing whether the URL points at a real file, that will be easy. If not, you might have to use a blacklist of reserved words that it won't rewrite.
This would include at the least the folder that contains your images, css, js, etc and the index.php your site runs off, plus any other php files you have kicking around.
I would like to be more help but I am a .net guy and I usually help out in asp.net url rewriting issues with libraries such as UrlRewriter.net which have different configurations than mod_rewrite.
To match the username I would use a regex like this:
^/(\w*)/?$
This captures the part in parentheses so that you can use it in the rewrite target, for example (mod_rewrite refers to the first captured group as $1):
/index.php?profileName=$1
The regex I provided means:
^ nothing before this
/ forward slash
(\w*) zero or more word characters (letters, digits, or underscore), captured as a group
/? optional forward slash
$ nothing after this
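mod_rewrite does support testing whether a request maps to a real file or directory, through the -f and -d tests of RewriteCond. A minimal sketch follows, assuming an .htaccess file in the document root and the index.php script and profileName parameter used as examples above.

```apache
RewriteEngine On

# Leave requests for existing files and directories alone
# (images, css, js, other PHP scripts).
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d

# /username or /username/  ->  index.php?profileName=username
# In .htaccess the matched path has no leading slash.
RewriteRule ^([A-Za-z0-9_]+)/?$ index.php?profileName=$1 [L,QSA]
```

With the two conditions in place a blacklist of reserved words is not needed for paths that exist on disk, although usernames that collide with real directory names would still have to be refused at signup.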

Rewrite URL to HTML file causing rewrite to .html/

I'm currently working on an overhaul of my blog site, and have found a way to convert all my current pages into static html pages. They are currently using friendly url's which remap to a central index.php page with GET parameters attached on.
The change I am trying to make is have those same friendly URL's map to their html counterparts. I am currently using this rule:
RewriteRule ^archives?/([^/]+)/([^/.]+)/?$ archives/$1/$2.html
The error log is reporting that it can't find blah.html/, which means it's looking for a .html directory instead of the .html file. So, a better example:
/archives/2009/original-name
should be getting mapped to
/archives/2009/original-name.html
but is really getting mapped to
/archives/2009/original-name.html/
What am I missing here?
Don't you need to use it the other way around? I didn't test the code, but it should be something like this:
RewriteRule ^archives/(.*)/(.*)\.html$ archives/$1/$2
I can't see anything obviously wrong with your regex.
At a guess I'd say you might have a rule somewhere following this, which is redirecting anything without a trailing slash to its equivalent with the slash (a common thing to do to avoid duplicate content issues).
You didn't escape your period in the 2nd statement. Try this.
RewriteRule ^archives?/([^/]+)/([^/\.]+)/?$ archives/$1/$2.html
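If the cause is indeed a later rule that appends a trailing slash, one thing that might help is to mark the rule as the last one for the current pass and to rewrite only when the static file exists. The sketch below assumes an .htaccess file in the document root and that the static pages live under archives/ there; it is untested against the asker's setup.

```apache
RewriteEngine On

# Rewrite only if the static page actually exists on disk.
# $1 and $2 refer to the groups captured by the RewriteRule below.
RewriteCond %{DOCUMENT_ROOT}/archives/$1/$2.html -f
# /archives/2009/original-name  ->  archives/2009/original-name.html
# [L] stops later rules in this pass from touching the result.
RewriteRule ^archives?/([^/]+)/([^/.]+)/?$ archives/$1/$2.html [L]
```

Any trailing-slash rule that comes later in the file would also need to skip paths ending in .html, since .htaccess rules are evaluated again after an internal rewrite.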
