I am caching pages in my (Rails) application based on subdomain. The pages for certain actions are cached to /public/cache/(subdomain)/. The application is running under Apache with Phusion Passenger. The caching is working fine. The problem is that Apache is not picking up the cached pages and bypassing Rails like it should be. My rewrite rules are wrong and I need help fixing them.
I have used, as one example of many, the suggestion located at: https://github.com/yeah/page_cache_fu#readme, which is as follows:
RewriteMap uri_escape int:escape
<Directory /var/www/example.com/current/public>
RewriteEngine On
RewriteCond %{REQUEST_METHOD} GET [NC]
RewriteCond %{DOCUMENT_ROOT}/cache/%{HTTP_HOST}%{REQUEST_URI}%{QUERY_STRING}.html -f
RewriteRule ^([^.]+)$ cache/%{HTTP_HOST}/$1${uri_escape:%{QUERY_STRING}}.html [L]
RewriteCond %{REQUEST_METHOD} GET [NC]
RewriteCond %{DOCUMENT_ROOT}/cache/%{HTTP_HOST}/index.html -f
RewriteRule ^$ cache/%{HTTP_HOST}/index.html
The problem with this is it seems to be expecting the directory to be the full http host (i.e. it's looking in cache/subdomain.example.com rather than just cache/subdomain).
Edit: Even when I change the Rails app to cache to cache/subdomain.example.com Apache still does not use them so it seems that there is more wrong than just the subdomain aspect.
Could someone please help me come up with the correct rule?
Edit(2):
I have simplified my rewrite to the following (just to try to get to a working starting point):
RewriteEngine On
RewriteCond %{HTTP_HOST} ^([^.]+)\.example\.com$ [NC]
RewriteRule ^stats$ cache/%1/stats.html [L]
I would think this would cause http://abc.example.com/stats to be rewritten to http://abc.example.com/cache/abc/stats.html
It is not. I also added a RewriteLog entry, and what I see there makes me think it is trying to redirect to http://abc.example.com/var/www/example.com/current/public/cache/abc/stats.html. This is further confirmed by the fact that if I add an 'R' flag along with the 'L', I see http://abc.example.com/var/www/....etc. in my browser. In other words, it seems to be adding the full document root instead of just the public-facing part.
Of course the result of the above is that I get a 404 error returned to the browser.
Can you see what is still wrong with my rule?
Edit: It's actually a bug.
http://code.google.com/p/phusion-passenger/issues/detail?id=563
Alright, this looks like it should work, but it doesn't. I've done a lot of testing with this, and it seems like the problem is the ^([^.]+)$ in the RewriteRule. Now, I did Google this, and it seems like it's a common enough pattern, so I don't understand what the issue could be. I just know that when I use that pattern in a RewriteRule, the rule fails. If I change it to ^([^.]+), it seems to work.
Hopefully someone with more experience with mod_rewrite can come along and explain to us what the problem with that pattern might be.
Edit: I just realized the problem with ^([^.]+)$:
Since you're building a cache, the "normal" file will exist in its usual place. This means that if you ask the server for /file then, depending on your configuration, it may say "hey, file doesn't exist, so let's try the default extension of .html!" and go off and find file.html. By the time you get to the RewriteRule, the ^([^.]+)$ regex is matched against file.html, NOT file.
The ^([^.]+)$ says "the start of the string, followed by as many non-period characters as you can grab, followed by the end of the string" which works fine against file because it contains no periods. It fails against file.html because ^[^.]+ will match against file, but where the regex then expects to find the end of the string (i.e. $), it instead finds .html and fails.
The reason ^(.*)$ works is that .* matches "as many of any character as possible", so it always consumes the whole string and nothing can be left over between the (.*) and $ portions of the regex. That's not the case with [^.]+, which stops at the first period.
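To make the difference concrete, here is a small sketch that runs the three patterns discussed above through Python's re module. Python is only standing in for mod_rewrite's regex engine here (an assumption that holds for these simple patterns), and the two sample paths are the ones used in the explanation.

```python
import re

# The anchored pattern from the rule: non-dot characters up to the end of the string.
NO_DOTS = re.compile(r"^([^.]+)$")
# The same pattern without the trailing anchor.
PREFIX_ONLY = re.compile(r"^([^.]+)")
# The catch-all pattern.
ANYTHING = re.compile(r"^(.*)$")

for path in ("file", "file.html"):
    anchored = NO_DOTS.search(path)
    print(
        path,
        "anchored match:", anchored is not None,
        "| prefix capture:", PREFIX_ONLY.search(path).group(1),
        "| catch-all capture:", ANYTHING.search(path).group(1),
    )
```

The anchored pattern only accepts file, the unanchored one captures file from either input, and the catch-all captures whatever it is given.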
In order to extract the subdomain, you're going to need to backreference a RewriteCond. Basically, if you capture a reference (i.e. encapsulate something inside parens) in a RewriteCond, those references are available to a RewriteRule which immediately follows it.
For example, if I wrote this:
RewriteCond %{HTTP_HOST} ^([^.]+)\.example\.com
Then the parentheses would capture the subdomain - note the () around [^.]+
If I were then to write a RewriteRule on the next line, the text captured above would become accessible as %1.
So your RewriteRule would look like this:
RewriteRule ^([^.]+) cache/%1/$1${uri_escape:%{QUERY_STRING}}.html [L]
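As a rough illustration of how the two captures combine, the sketch below mimics the condition (%1) and the rule ($1) in Python. The function name cache_path is made up for this example, and urllib's quote() only approximates what the int:escape map does, so treat it as a model of the idea rather than of Apache's exact output.

```python
import re
from urllib.parse import quote

# Plays the role of the RewriteCond: group 1 becomes %1 (the subdomain).
HOST_RE = re.compile(r"^([^.]+)\.example\.com", re.IGNORECASE)
# Plays the role of the RewriteRule pattern: group 1 becomes $1.
PATH_RE = re.compile(r"^([^.]+)")


def cache_path(host, path, query=""):
    """Return the cache file a request would map to, or None if nothing applies."""
    host_match = HOST_RE.search(host)
    path_match = PATH_RE.search(path)
    if host_match is None or path_match is None:
        return None
    # quote() is only an approximation of mod_rewrite's int:escape map.
    return f"cache/{host_match.group(1)}/{path_match.group(1)}{quote(query)}.html"


print(cache_path("abc.example.com", "stats"))
print(cache_path("www.example.org", "stats"))
```

A request for stats on abc.example.com maps to cache/abc/stats.html, while a host that does not match the condition yields no rewrite at all.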
Hope that helps.
Related
I need help with a nice-looking referral link. For example, here is a referral link:
http://www.my-domain.com/register.php?ref=john.doe
This URL works, but it does not look as good as the following:
http://www.my-domain.com/john.doe
How can I achieve this using a .htaccess file? Please note that I have index.php, member.php and many other PHP files on my server. Moreover, if someone requests my-domain.com, it needs to hit the index.php file.
Any help is highly appreciated.
Based on your example you can use the following .htaccess:
DirectoryIndex index.php
RewriteEngine On
RewriteRule ^([a-zA-Z]+\.php)$ $1 [NC,L]
RewriteRule ^([a-zA-Z]+\.[a-zA-Z]+)$ register.php?ref=$1
This works as follows:
Line 1: Sets your index file to index.php if someone accesses http://www.my-domain.com/. This should already be working, but just in case.
Line 2: Enables the RewriteEngine.
Line 3: If someone wants to access anything ending with .php (technically: one or more letters followed by .php), it is just passed through (i.e. foo.php will be mapped to foo.php). [NC,L] enables case-insensitive matching (for the extension - you never know) and prevents any further rules from being executed. Otherwise the second one would also match every time.
Line 4: If someone wants to access anything matching "at least one letter, a dot, at least one more letter", then this will be mapped to register.php?ref=<input>
Note: This will effectively prevent all user names ending with .php, but allow access to all your files. It will also prevent user names containing fewer or more than one dot, and in its current form it will not work if your files or user names contain any other characters (e.g. foo_bar.php or i_love_php). But those two limitations can easily be overcome if needed; just provide more details regarding the expected behaviour.
You could add a RewriteCond to check if there actually exists a .php file with the requested name and treat it as user name otherwise, but I really don't think you should do that (think about adding new files).
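Here is a small Python sketch of how those two rules sort incoming paths, in case it helps to see the limitations from the note in action. The route function is a made-up name, and the paths have no leading slash because that is how a .htaccess rule sees them; it is a model of the matching only, not of Apache itself.

```python
import re

PHP_FILE = re.compile(r"^([a-zA-Z]+\.php)$", re.IGNORECASE)  # first rule, [NC,L]
REFERRAL = re.compile(r"^([a-zA-Z]+\.[a-zA-Z]+)$")           # second rule


def route(path):
    """Return what the request path is mapped to under the two rules."""
    if PHP_FILE.search(path):
        return path  # passed through unchanged; [L] stops processing here
    match = REFERRAL.search(path)
    if match:
        return f"register.php?ref={match.group(1)}"
    return path  # neither rule matched, so the path is left alone


for candidate in ("member.php", "john.doe", "foo_bar.php", "a.b.c", "i_love_php"):
    print(candidate, "->", route(candidate))
```

Only john.doe ends up at register.php; member.php is passed through by the first rule, and the last three fall through both rules untouched, which is the limitation described in the note.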
This will get you what you need. These rules handle the nice link and also redirect the old link to the nice link.
DirectoryIndex index.php
RewriteEngine On
RewriteCond %{THE_REQUEST} ^GET\ /+register\.php\?ref=([^\s&]+)
RewriteRule ^ %1? [R=301,L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^/]+)/?$ /register.php?ref=$1 [L]
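One detail worth checking when matching against THE_REQUEST is what the capture group actually picks up, because that variable holds the whole request line including the trailing protocol token. The sketch below compares a greedy (.+) capture with one that stops at whitespace or an ampersand; the request line is a made-up example and Python's re module is used as a stand-in for Apache's regex engine.

```python
import re

# Hypothetical raw request line, as THE_REQUEST would hold it.
REQUEST_LINE = "GET /register.php?ref=john.doe HTTP/1.1"

# Greedy capture: takes everything after "ref=" to the end of the line.
GREEDY = re.compile(r"^GET /+register\.php\?ref=(.+)")
# Bounded capture: stops at the first whitespace or "&".
BOUNDED = re.compile(r"^GET /+register\.php\?ref=([^\s&]+)")

greedy_ref = GREEDY.search(REQUEST_LINE).group(1)
bounded_ref = BOUNDED.search(REQUEST_LINE).group(1)

print(repr(greedy_ref))
print(repr(bounded_ref))
```

The greedy group also swallows " HTTP/1.1", so the bounded form is the safer one to use as a redirect target.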
I'm trying to use mod_rewrite to point the blog portion of a site to a separate blog site.
This is what I have to handle the normal pages:
RewriteRule ^(\w+)/?$ index.php?page=$1
This is what I'm trying to use for the blog site:
RewriteRule ^blog/?$ http://url.to.my.blogger.site
But it's not working: when I go to site/blog it directs me to index.php?page=blog. Is there something I need to do so that the second rewrite is skipped when the first one matches, like an if/else? Sorry, I don't know much about mod_rewrite, so any advice would be awesome.
I also noticed that if I request something like site/home everything works fine, but if I request site/home/ all of my relative URLs resolve against the wrong base path, so, for example, my CSS and images don't load correctly.
My full file is this:
RewriteEngine on
RewriteRule ^blog/?$ remote/blog/uri/here
RewriteRule ^(\w+)/?$ index.php?page=$1
RewriteCond %{THE_REQUEST} index\.php
RewriteRule ^index\.php - [F]
When I hit site/blog it still tries to serve index.php?page=blog. I'm guessing I have to break out of the rules at some point? I couldn't find documentation on if/else statements.
I needed to add flags to my RewriteRule lines so that the server wouldn't evaluate further. Changing them to be
RewriteRule ^/blog http://url.to.blog [L]
did the trick. The problem was that it was evaluating all the way down: since I wasn't requesting index.php, the last rule to match was the general rewrite rule.
I've been working on a solution to this for several hours now & figured if someone doesn't mind helping me out, it might save me some time. My question is with regards to Apache mod_rewrite; of course there is tons of documentation out there, however nothing specific to my requirements which are:
to take a URL in this format:
language/pagename.php
(language will be either 'english' or 'french'; I will write a separate rule for each, so I only need an example for one. The page name will be any run of word characters (\w+). All URLs will have a .php extension.)
And then rewrite it so the URL doesn't change in the user's browser, but so that PHP receives it in this format:
language/page.php?slug=pagename
e.g. so $_GET['slug'] would return the value pagename, and all requests are then handled by page.php.
So far my best guess is
RewriteEngine On
RewriteBase /
RewriteRule ^english/(\w+).php$ english/page.php?slug=$1
However, this makes PHP tell me that slug=page rather than slug=financial for a URL such as english/financial.php.
I have tried a bunch of other regex variations too, such as (.) instead of \w and so on.
Use these rules:
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(english|french)/([^/\.]+)\.php$ /$1/page.php?slug=$2 [NC,QSA,L]
This needs to be placed in the .htaccess file in the root folder. If you place it in a config file instead (e.g. httpd-vhost.conf, inside a <VirtualHost> directive), the rule needs to be slightly altered.
This rule works for any language, as long as you add it to the rule (the english|french part).
This rule has a condition that skips the rewrite if the requested file already exists. This should solve your problem with slug=page: your rule most likely causes a rewrite loop (after a rewrite occurs, mod_rewrite goes through the rules again -- that's how it works, and you need some logic in place to break the loop). Instead of RewriteCond %{REQUEST_FILENAME} !-f you could use RewriteCond %{REQUEST_URI} !^/(english|french)/page\.php [NC] but it is a bit more difficult to maintain (you need to add languages there as well as in the rewrite rule itself).
If you already have other rewrite rules, take care to place these in the correct position (the order of rules matters).
Because I do not know for sure what the page names (slugs) will be, I've used this pattern: [^/\.]+ (any characters except / or .), but you may change it to \w+ or whatever you think is better.
The rule preserves any optional page parameters (query string) -- useful for preserving referral/tracking codes etc.
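The slug=page symptom can be reproduced with a toy model of the loop described above. The sketch below is Python, not Apache: it simply re-runs a rule until it stops matching or the path stops changing, which is a simplification of how per-directory rules get re-processed. The EXISTING_FILES set is a hypothetical stand-in for the -f file test, and paths are written without a leading slash.

```python
import re

ORIGINAL_RULE = re.compile(r"^english/(\w+).php$")
FIXED_RULE = re.compile(r"^(english|french)/([^/.]+)\.php$", re.IGNORECASE)
# Hypothetical set of files that really exist on disk (stands in for the -f test).
EXISTING_FILES = {"english/page.php", "french/page.php"}


def original_rewrite(path):
    """The question's rule: returns (new_path, slug), or None if it does not match."""
    match = ORIGINAL_RULE.search(path)
    return ("english/page.php", match.group(1)) if match else None


def fixed_rewrite(path):
    """The answer's rule, including the 'skip existing files' condition."""
    if path in EXISTING_FILES:  # models RewriteCond %{REQUEST_FILENAME} !-f
        return None
    match = FIXED_RULE.search(path)
    return (f"{match.group(1)}/page.php", match.group(2)) if match else None


def run(path, rewrite, max_passes=10):
    """Re-run the rule until it no longer matches or the path stops changing."""
    slug = None
    for _ in range(max_passes):
        result = rewrite(path)
        if result is None:
            break
        new_path, slug = result
        if new_path == path:
            break
        path = new_path
    return path, slug


print(run("english/financial.php", original_rewrite))
print(run("english/financial.php", fixed_rewrite))
```

With the original rule the second pass matches english/page.php itself and overwrites the slug with page; with the file-exists condition the second pass is skipped and the slug stays financial.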
I am currently facing some htaccess/RewriteRule issues (and I am new to this area).
Let's assume we have an url like this:
http://mypage.at/very/cool
The URL is supposed to look like this (because I am using AJAX-loaded content, which requires it):
http://mypage.at/#ajx/very/cool
So I would like to add the '#ajx' part to every URL which does not already contain it.
Which means that if a URL already looks like http://mypage.at/#ajx/so/pretty, there is no need for changes.
As I am not sure whether this creates trouble with the Google search index, I would additionally like to know if there is a way to exclude search bots from this rule.
Thanks for any help.
Ripei
Since you reported that this does not work (which is probably because your version of Apache doesn't support Perl-style RegEx):
RewriteRule ^(?!#ajx)(.*)$ http://mypage.at/#ajx/$1 [L]
I think this should do it:
RewriteCond %{REQUEST_URI} !^/#ajx
RewriteRule ^(.*)$ http://mypage.at/#ajx/$1 [L]
EDIT: After trying this myself and reading around on the Internet, I'm not sure this is actually possible. Everything after a pound sign (#) is the URL fragment, which the browser keeps to itself and never sends to the server, so a rewrite rule cannot test for it. This answer comes close, but I'm going to have to leave this to somebody who knows more to say whether this can even be done the way @Ripei asked for.
Something like this might work:
RewriteCond %{REQUEST_URI} !^/#ajx
RewriteRule ^(.*) http://%{SERVER_NAME}/#ajx$1 [R,L]
Hah, finally, after lots of reading and testing, I got it.
RewriteCond %{REQUEST_URI} !^#ajx
RewriteRule . /\#ajx%{REQUEST_URI} [L,R,NE]
This code works for me... don't ask me why this one works; to be honest, I do not have a clue! But I am fine with the fact that it IS working :)
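A likely explanation for why that version works, sketched in Python rather than Apache config: the server only ever sees the path, never the fragment, and a '#' in a redirect target gets percent-encoded unless escaping is switched off, which is what the NE flag does. The calls below are from Python's standard urllib.parse and only model the URL handling, not mod_rewrite itself.

```python
from urllib.parse import quote, urlsplit

parts = urlsplit("http://mypage.at/#ajx/very/cool")

# The path is all a server-side rule can test; the fragment stays in the browser.
print("path seen by the server:", parts.path)
print("fragment kept by the browser:", parts.fragment)

# Escaping the redirect target turns '#' into %23, which is no longer a fragment.
escaped_target = quote("/#ajx/very/cool")
print("escaped redirect target:", escaped_target)
```

So a condition on REQUEST_URI can never see #ajx; after the redirect the browser requests just /, which the single-dot pattern does not match in a .htaccess file, and that would explain why the rule does not redirect in a loop.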
Okay, it's been a while since I used mod_rewrite. I created a couple of rewrite rules a long time ago and forgot what they do, and I was wondering what exactly this code snippet does. Can someone please explain it in as much detail as possible? Thanks!
Here is my mod rewrite code.
RewriteRule ^/?sitemap.xml?$ sitemap.php [L,NC,QSA]
Basically anything that looks for sitemap.xml will be passed to sitemap.php but without it showing to the user, that is, the url doesn't change for the user. Here is some of the documentation:
Taken from the Apache mod_rewrite Flags:
RewriteRule pattern target [Flag1,Flag2,Flag3]
L|last
The [L] flag causes mod_rewrite to stop processing the rule set. In most contexts, this means that if the rule matches, no further rules will be processed.
If you are using RewriteRule in either .htaccess files or in <Directory> sections, it is important to have some understanding of how the rules are processed. The simplified form of this is that once the rules have been processed, the rewritten request is handed back to the URL parsing engine to do what it may with it. It is possible that as the rewritten request is handled, the .htaccess file or <Directory> section may be encountered again, and thus the ruleset may be run again from the start. Most commonly this will happen if one of the rules causes a redirect - either internal or external - causing the request process to start over.
It is therefore important, if you are using RewriteRule directives in one of these contexts, that you take explicit steps to avoid rules looping, and not count solely on the [L] flag to terminate execution of a series of rules, as shown below.
The example given here will rewrite any request to index.php, giving the original request as a query string argument to index.php, however, if the request is already for index.php, this rule will be skipped.
RewriteCond %{REQUEST_URI} !index\.php
RewriteRule ^(.*) index.php?req=$1 [L]
NC|nocase
Use of the [NC] flag causes the RewriteRule to be matched in a case-insensitive manner. That is, it doesn't care whether letters appear as upper-case or lower-case in the matched URI.
In the example below, any request for an image file will be proxied to your dedicated image server. The match is case-insensitive, so that .jpg and .JPG files are both acceptable, for example.
RewriteRule (.*\.(jpg|gif|png))$ http://images.example.com$1 [P,NC]
QSA|qsappend
When the replacement URI contains a query string, the default behavior of RewriteRule is to discard the existing query string, and replace it with the newly generated one. Using the [QSA] flag causes the query strings to be combined.
Consider the following rule:
RewriteRule /pages/(.+) /page.php?page=$1 [QSA]
With the [QSA] flag, a request for /pages/123?one=two will be mapped to /page.php?page=123&one=two. Without the [QSA] flag, that same request will be mapped to /page.php?page=123 - that is, the existing query string will be discarded.
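The QSA behaviour described in that excerpt can be modelled in a few lines of Python. This is only a sketch of the query string handling for the /pages/ rule quoted above; the function name rewrite_pages is made up for the example.

```python
import re

PAGES_RULE = re.compile(r"/pages/(.+)")


def rewrite_pages(path, query, qsa):
    """Return (new_path, new_query) for the /pages/ rule, with or without [QSA]."""
    match = PAGES_RULE.search(path)
    if match is None:
        return path, query  # rule does not apply; request is left alone
    new_query = f"page={match.group(1)}"
    if qsa and query:
        # [QSA]: append the original query string to the new one.
        new_query = f"{new_query}&{query}"
    return "/page.php", new_query


print(rewrite_pages("/pages/123", "one=two", qsa=True))
print(rewrite_pages("/pages/123", "one=two", qsa=False))
```

With qsa=True the original one=two survives after page=123; with qsa=False it is dropped, matching the two outcomes in the documentation text.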
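When "sitemap.xml" (or "sitemap.xm", since the final l is optional in the pattern) is requested, serve "sitemap.php" instead. This is an internal rewrite, so the URL the user sees does not change. "L" means last (skip any following rules).
NOTE
The QSA flag has to do with query string handling (it combines the old and new query strings). NC makes the match case-insensitive.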
The reference has the full detail (Apache).
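For anyone curious what the pattern ^/?sitemap.xml?$ from the question accepts in practice, here is a quick check using Python's re module as a stand-in for Apache's regex engine, with IGNORECASE playing the role of [NC]. The candidate strings are made up for illustration.

```python
import re

SITEMAP = re.compile(r"^/?sitemap.xml?$", re.IGNORECASE)

candidates = [
    "sitemap.xml",   # the intended request
    "/sitemap.xml",  # leading slash is optional
    "SiteMap.XML",   # case does not matter with [NC]
    "sitemap.xm",    # the trailing "l" is optional because of "l?"
    "sitemap-xml",   # the unescaped "." matches any character
    "sitemap.xmls",  # extra characters after "xml" are rejected
]

results = {name: SITEMAP.search(name) is not None for name in candidates}
for name, matched in results.items():
    print(f"{name!r}: {'match' if matched else 'no match'}")
```

Everything except the last candidate matches. If only sitemap.xml is wanted, escaping the dot and dropping the question mark after the l would tighten the pattern.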