Can't use parentheses in RewriteCond QUERY_STRING - laravel

Moved from https://serverfault.com/questions/1013461/cant-use-parentheses-in-rewritecond-query-string because it's on topic here.
I need to capture a UID from an old url and redirect it to a new format.
example.com/?uid=123 should redirect to example.com/user/123
What should work:
RewriteCond %{QUERY_STRING} ^uid=(\d+)$
RewriteRule ^$ /user/%1? [L]
This does not redirect at all.
However, this does:
RewriteCond %{QUERY_STRING} ^uid=\d+$
RewriteRule ^$ /user/%1? [L]
It goes to example.com/user. The UID is left out, but it DOES redirect.
Notice: All I did was remove the parentheses in the second example.
Why is this?? How can I match the query AND capture the value of UID?
Updates
This is a laravel app. I've discovered that the redirects I did see may have been coming from the app, not Apache.
Self-answer coming soon...
Temporarily adding R=302 gives the desired result:
RewriteCond %{QUERY_STRING} ^uid=(\d+)$
RewriteRule ^$ /user/%1? [L,R=302]
This, of course, sends a 302 redirect to /user/123. I'd like to see if this can be done with an internal rewrite though...
Here are some rules in laravel's default .htaccess:
# Handle Front Controller...
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^ index.php [L]
This catches paths that do not point to real files and hands them to the laravel app. When this is removed, Apache responds with a 404 for /user/1234.
https://httpd.apache.org/docs/2.4/rewrite/flags.html#flag_l
Such a rewrite goes back to Apache's URL parser. Then the .htaccess is processed again (since it's still applicable to this new URL). At this point, I'd expect the above rules to pick up the non-existent path and point it to the laravel app...
Found it. Writing an answer now.

The Answer
MrWhite was right. You have to add R=302 or R=301 to perform a redirect. A plain ol' rewrite won't work.
RewriteCond %{QUERY_STRING} ^uid=(\d+)$
RewriteRule ^$ /user/%1? [L,R=302]
The Reason
So, the way Laravel works is:
you request /some/file
.htaccess tells apache, "hey apache, if you have a request for a file that doesn't exist just pretend it's for index.php"
apache says, "hey php, I have a request to run index.php and the url is /some/file"
php runs the script which --whoah-- is a huge laravel application
whatever, "hey laravel, the server said /some/file is the url"
laravel does all its fancy stuff, and it tries to match the url to one of your routes
Now, I added a rule to rewrite a certain URL to a virtual URL that Laravel should handle. I was matching against query parameters, but that was irrelevant. (see below for details)
When Apache's Rewrite Module hits a RewriteRule without an [R] flag, it rewrites the URL and sends it back to the URL Handler. Apache's URL Handler then processes the new URL against all the rules, including those in any applicable .htaccess files.
So all the proper rules did get applied.
Here's the key revelation:
The originally requested URL never changed. So while Apache was able to pass the request to PHP with the correct file, it was also sending along the old URL.
Therefore, we have to tell Apache to send a 301 or 302 Redirect response, instead of just rewriting the request. The user will send another request with the URL that Laravel needs to resolve the route.
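As a rough sketch of how this could fit together, the redirect can sit ahead of the front controller rules in the same .htaccess. The ordering and the choice of R=302 are assumptions based on the rules quoted in the question, not a copy of any particular Laravel release:

```apache
RewriteEngine On

# Legacy URL: /?uid=123 -> external redirect to /user/123.
# The trailing "?" drops the old query string from the target.
# (On Apache 2.4+ the QSD flag does the same job.)
RewriteCond %{QUERY_STRING} ^uid=(\d+)$
RewriteRule ^$ /user/%1? [R=302,L]

# Laravel front controller, as quoted in the question.
# The browser's second request (/user/123) lands here, and Laravel
# then sees /user/123 as the requested URL.
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^ index.php [L]
```

Switching to R=301 once the behavior is confirmed avoids browsers caching a wrong permanent redirect during testing.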
But what about the different behavior with/without the parentheses?
The answer lies within Laravel's default .htaccess. Let's take a look at my old rules without the parentheses:
RewriteCond %{QUERY_STRING} ^uid=\d+$
RewriteRule ^$ /user/%1? [L]
Without the parentheses to grab the uid value, %1 is empty. So we end up rewriting the URL to just /user/.
Now, we have to look at another set of Laravel rules:
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)/$ /$1 [L,R=301]
This normalizes urls so that virtual paths/routes don't contain trailing slashes. Doing this makes route parsing easier.
This returns a 301 Redirect to /user. This is very different from the 200 we were getting with the parentheses, but it does not mean the parentheses were behaving differently. As MrWhite said in the comments, surely something else was doing it.
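The chain of events for the version without parentheses can be traced roughly like this. It is an annotated sketch of the two rule sets quoted above, assuming both live in the same .htaccess and that no /user/ directory exists on disk:

```apache
# Request: GET /?uid=123

# Pass 1: the condition matches, but there is no capture group,
# so %1 expands to an empty string.
RewriteCond %{QUERY_STRING} ^uid=\d+$
RewriteRule ^$ /user/%1? [L]
# Result: internal rewrite to /user/ and the rules are processed again.

# Pass 2: /user/ is not a real directory and ends in a slash,
# so the trailing-slash rule fires.
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)/$ /$1 [L,R=301]
# Result: 301 redirect to /user, which is what the browser showed.
```

With the parentheses in place, pass 1 rewrites to /user/123 instead, which has no trailing slash, so the second rule is skipped and no redirect is issued.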
I hope you enjoyed the ride. And I hope even more that this will save some poor, confused soul from hours of torment. :)

Related

Issues with RewriteRule on Generic Anchor Redirect

I want to redirect any request to the root of my site to an anchor on the index. So
https://example.com/foo
Gets sent to
https://example.com/#foo
I've written this .htaccess file (it also redirects http requests to https, that part works, but is included for completeness)
RewriteEngine On
RewriteRule ^/(.*) /#$1 [NE,R=302]
RewriteCond %{HTTPS} !=on
RewriteRule ^(.*)$ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]
Based on the discussion in this thread ("mod_rewrite with anchor link"), this should work, but it's not matching for some reason. I tried out the rule given in that thread using this tool: http://htaccess.madewithlove.be/ and it doesn't seem to work there either.
I've tried clearing my cache and accessing in incognito mode, to no avail. Any help?

Return 410 for all but robots.txt

I have a machine I'm leasing that was assigned an IP address that must have previously been assigned to some kind of link spamming company. Said company has hundreds of domains that still resolve to the IP address of my server, and Google and the like are constantly attempting to index the site with their bots (hundreds of thousands of pages). I've been unsuccessful in getting said link spammer to change their DNS records to resolve elsewhere. Fine.
I decided I could use mod_rewrite to deal with this in a fairly direct manner: I want any request that doesn't include one of my domain names to return 410, unless the request is for /robots.txt. For the robots file I want to return a simple file that disallows everything with a 200. By my thinking I can quickly extinguish the bots and return to normal.
My mod_rewrite configuration looks like this:
RewriteEngine On
RewriteCond %{HTTP_HOST} !^.*foo\.com$
RewriteRule ^/robots\.txt$ /robots-off.txt [L]
RewriteCond %{HTTP_HOST} !^.*foo\.com$
RewriteRule !^/robots\.txt$ - [G]
Where all of the domains I might host on this IP fall somewhere under/at the foo.com domain. So I would expect the first rule to tell Apache to output the contents of /robots-off.txt with a 200 whenever a request is made for /robots.txt for any domain other than my own.
Sadly, what's happening is that every request results in a 410, so the bots never get the chance to learn that they should stop indexing the entire site. Here is the response when I query the wrong host:
The requested resource<br />/robots-off.txt<br />
is no longer available on this server and there is no forwarding address.
Please remove all references to this resource.
This has been going on for over a week with no end in sight. The first rule is running, but the [L] seems to be ignored and the second rule is then run. I don't understand why.
OK, I misunderstood how [L] works. See here: mod_rewrite seems to ignore [L] flag
The working code looks like this:
RewriteCond %{HTTP_HOST} !^.*foo\.com$
RewriteRule ^robots\.txt$ /robots-off.txt [L]
RewriteCond %{HTTP_HOST} !^.*foo\.com$
RewriteRule !^robots-off\.txt$ - [L,G]
Hope this helps somebody.
It's a bit late, but this would return a redirect to the browser. The browser would then re-request robots-off.txt; this would be a new request and so would be rewritten again. However, if you do a pass-through, then Apache will return the final file inline, so no new request is made and the [L] is honoured in the way you expect.
RewriteCond %{HTTP_HOST} !^.*foo\.com$
RewriteRule ^robots\.txt$ /robots-off.txt [PT,L]
RewriteCond %{HTTP_HOST} !^.*foo\.com$
RewriteRule !^robots-off\.txt$ - [L,G]
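Another option, sketched here on the assumption that the server runs Apache 2.4 and the rules live in a .htaccess file, is the [END] flag. Unlike [L], it also stops any later round of per-directory rewrite processing, so the rewritten /robots-off.txt is never handed to the 410 rule:

```apache
RewriteEngine On

# Foreign hosts asking for robots.txt get the "disallow everything" file.
# [END] stops rewriting for good, so the second rule never sees the result.
RewriteCond %{HTTP_HOST} !^.*foo\.com$
RewriteRule ^robots\.txt$ /robots-off.txt [END]

# Every other request on a foreign host is answered with 410 Gone.
RewriteCond %{HTTP_HOST} !^.*foo\.com$
RewriteRule ^ - [G]
```

With this variant, a direct request for /robots-off.txt on a foreign host also gets a 410, which should be harmless since crawlers only ask for /robots.txt.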

code igniter & mod_rewrite - one rewrite rule breaking another

I have a site built in codeigniter. We use short urls from our database & rewrite rules to redirect them to their full path.
For example,
RewriteRule ^secure-form$ form/contract/secure-form [L]
This works fine by itself. But I would like to use SSL on certain pages. I have edited the code so that if you go to one of these pages, all instances of http:// within the page are replaced with https:// but I need to rewrite the url to use it as well.
The pages all use the same template and all the content comes from the database so I can't just specify ssl on a particular directory.
The URLs for the secure pages all start with 'secure', so I wrote the following rules and placed them above the other rewrites.
RewriteCond %{HTTPS} off
RewriteCond %{REQUEST_URI} ^/secure/?.*$
RewriteRule ^(.*)$ https://%{HTTP_HOST}/$1 [R=301,L]
RewriteCond %{HTTPS} on
RewriteCond %{REQUEST_URI} !^/secure/?.*$
RewriteRule ^(.*)$ http://%{HTTP_HOST}/$1 [R=301,L]
RewriteRule ^secure-form$ form/contract/secure-form [L]
RewriteRule ^secure-different-form$ form/contract/secure-different-form [L]
all other rewrite rules for specific pages follow
then the default rewrite further down...
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php/$1 [L]
The problem is that when I add the rules to change the protocol, it ends up displaying 'form/contract/secure-form' in the url instead of 'secure-form'.
This renders the actual form on the page broken since it uses that url to build itself.
If I take out the rules that change the protocol, it displays secure-form in the url as it should, but the page is not secure.
What am I doing wrong?
----UPDATE----
Ooh, after over 20 hrs of searching, I think I finally have an answer. So, first time through, https is off & gets turned on. Then, because of the 301, it's run again & the page gets sent to form/contract/secure... But this time, https is on. Since the uri no longer STARTS with secure, it turns https off.
Hopefully, this will help someone else.
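One way to avoid the second pass undoing the first, offered as a sketch rather than the fix the author actually used, is to test %{THE_REQUEST} instead of %{REQUEST_URI}. THE_REQUEST holds the original request line sent by the browser (for example "GET /secure-form HTTP/1.1") and is not changed by internal rewrites:

```apache
# Force HTTPS when the browser originally asked for /secure...
RewriteCond %{HTTPS} off
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/secure
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]

# Force HTTP for everything else. After the internal rewrite to
# form/contract/secure-form, THE_REQUEST still starts with /secure,
# so this block no longer fires on the second pass.
RewriteCond %{HTTPS} on
RewriteCond %{THE_REQUEST} !^[A-Z]+\s/secure
RewriteRule ^ http://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]

RewriteRule ^secure-form$ form/contract/secure-form [L]
```

These blocks are meant to replace the two protocol-switching blocks in the question, with the page-specific rewrites and the default index.php rule left as they are.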

dynamic subdomains with htaccess: URL shouldnt change in the browser

Trying to implement subdomains with htaccess.
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteCond %{HTTP_HOST} !^www.
RewriteCond %{HTTP_HOST} ^([a-z0-9]+)\.domain.com(.*)$
RewriteRule ^(.*)$ http://domain.com/index.php?/public_site/main/%1/$1 [L]
</IfModule>
When I enter ahser.domain.com, the browser URL changes. Is there an htaccess option to prevent this when absolute URLs are used in RewriteRule?
Don't rewrite to a full URL with domain in it. That generates a redirect since it's going to a different website! You could put microsoft.com there; so how would it work without redirecting?
What you have to do is make sure that the web pages work under the original domain. So when the client asks for myname.domain.com/... how about rewriting that to myname.domain.com/index.php?public_site/main/myname/.... Keep the domain the same. The index.php? can be made to work in any of those domains. For instance, even this could work:
http://OTHER.domain.com/index.php?public_site/main/MYNAME/...
I.e. set it up so it doesn't matter which virtual host accesses that path.
Once you have that, the rewrite can then just do:
# will not trigger redirect
RewriteRule ^(.*)$ /index.php?/public_site/main/%1/$1 [L]
You have to be careful not to introduce a loop, since you're now rewriting a URL to a longer URL on the same domain, which matches the same rewrite rule. You need an additional RewriteCond not to apply this rewrite if the URL already starts with /index.php?public_site/.
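A minimal sketch of the whole rule set with such a guard might look like the following. It assumes the rules live in the document root's .htaccess, and it checks only for the /index.php path, because %{REQUEST_URI} does not include the query string. The host-capturing condition is deliberately placed last so that %1 refers to its capture group:

```apache
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /

# Leave the www host alone.
RewriteCond %{HTTP_HOST} !^www\. [NC]
# Loop guard: do nothing if the request was already rewritten to index.php.
RewriteCond %{REQUEST_URI} !^/index\.php
# Capture the subdomain as %1 (an optional port is tolerated).
RewriteCond %{HTTP_HOST} ^([a-z0-9]+)\.domain\.com(:\d+)?$ [NC]
# Relative target on the same host: internal rewrite, no redirect.
RewriteRule ^(.*)$ /index.php?/public_site/main/%1/$1 [L]
</IfModule>
```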

htaccess rewrite

I would like to rewrite /anything.anyextension into /?post=anything.
eg:
/this-is-a-post.php into /?post=this-is-a-post or
/this-is-a-post.html into /?post=this-is-a-post or even
/this-is-a-post/ into /?post=this-is-a-post
I tried
RewriteRule ^([a-zA-Z0-9_-]+)(|/|\.[a-z]{3,4})$ ?$1 [L]
but it doesn't work.
Any help appreciated.
If you have access to the main server configuration, use this:
RewriteRule ^/(.+)\.\w+$ /?post=$1 [L]
If not, and you are forced to put this in a .htaccess file, you could try
RewriteRule ^(.+)\.\w+$ /?post=$1 [L]
In either case, this assumes you will only be rewriting URLs with a single path component (i.e. if you get a request like /path/anything.anyextension it might not work as you expect, the rewrite rule would need to be modified to handle that)
You need a better way to determine when to apply the rewrite rule, otherwise your page won't be able to display external JS or CSS, unless you define an exception.
SilverStripe (or the core, Sapphire) offers a good approach to this, something like:
RewriteEngine On
RewriteCond %{REQUEST_URI} !\.(css|js|swf)$ [NC]
RewriteCond %{REQUEST_URI} .+
RewriteRule ^([^\.]+) /?post=$1 [L,R=301]
This requires the URI not to be empty, not to be JS, CSS or SWF, and redirects back to your root directory:
http://localhost/this-is-a-post.php
http://localhost/?post=this-is-a-post
If you want an internal rewrite rather than a redirect, remove the R=301 flag.
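For the three cases listed in the question (extension, no extension, trailing slash), a single rule along these lines may be enough. This is a sketch for a .htaccess in the document root; it assumes the post slugs contain only letters, digits, underscores and hyphens, as in the asker's own pattern, and it skips anything that exists on disk so that real scripts, CSS and JS are still served:

```apache
RewriteEngine On

# Do not touch real files or directories (index.php, assets, and so on).
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# slug, optionally followed by "/" or a 3-4 letter extension.
RewriteRule ^([A-Za-z0-9_-]+)(?:/|\.[a-z]{3,4})?$ /?post=$1 [L]
```

Because there is no R flag, the browser keeps showing /this-is-a-post.php while the server handles /?post=this-is-a-post.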

Resources