Apache: rewrite only if external resource exists - mod-rewrite

I use Apache as a reverse proxy. There is no web content on the dedicated server itself. If a client requests a resource on the local Apache server, Apache should determine on which remote (proxied) server the resource exists and do a proxy rewrite to that server.
A snippet (which currently does not work) demonstrates what I want to do:
RewriteCond http://200.202.204.11:3000%{REQUEST_URI} -U
RewriteRule ^(.*)$ http://200.202.204.11:3000$1 [P]
I left out the rest of my configuration (ProxyPass, ProxyPassReverse, other RewriteCond, ...) to focus on my problem:
How can I check whether an external resource exists / is available before rewriting?
The -U option for RewriteCond always returns true. The -F option always returns false. Is there a working solution for what I intend?

After searching for weeks for a solution I came to the conclusion: there is no reliable RewriteRule that checks whether an external resource exists.
You are much better off addressing each service behind the reverse proxy via a subdomain, e.g. 'gitlab.youdomain.net' if you want to address a resource on your gitlab server behind your reverse proxy. That way the reverse proxy does not get confused if the resource lies in the root directory '/' of the gitlab server.

I had the same problem and, as far as I know, got the same result: it is not possible to do this using only Apache httpd directives (at least with version 2.2).
In my solution I used a RewriteMap and a PHP script that checks whether the external resource exists.
In this example, when a new request comes in, the RewriteMap checks whether the requested path exists on Server A and, if it is found, the request is reverse proxied to that server.
If the requested path is not found on Server A, a second rewrite rule reverse proxies the request to serverB.
As said, I used a RewriteMap with MapType prg: and a PHP script.
Here are the Apache directives:
# Please pay attention to RewriteLock
# this directive must be defined in server config context
RewriteLock /tmp/if_url_exists.lock
RewriteEngine On
ProxyPreserveHost Off
ProxyRequests Off
RewriteMap url_exists "prg:/usr/bin/php /opt/local/scripts/url_exists.php"
RewriteCond ${url_exists:http://serverA%{REQUEST_URI}} >0
RewriteRule . http://serverA%{REQUEST_URI} [P,L]
RewriteRule . http://serverB%{REQUEST_URI} [P,L]
Here comes the interesting and tricky part.
This is the url_exists.php script, executed by Apache. It waits on the standard input stream and writes to standard output.
The script returns 1 if the resource is found and readable, otherwise 0.
It is lightweight because it only issues an HTTP request using the HEAD method.
<?php
// Returns true if a HEAD request to $url succeeds (HTTP status below 400).
function check_if_url_exists($url) {
    $curl_inst = curl_init($url);
    curl_setopt($curl_inst, CURLOPT_CONNECTTIMEOUT, 30);
    curl_setopt($curl_inst, CURLOPT_LOW_SPEED_LIMIT, 1);
    curl_setopt($curl_inst, CURLOPT_LOW_SPEED_TIME, 180);
    curl_setopt($curl_inst, CURLOPT_HEADER, true);
    curl_setopt($curl_inst, CURLOPT_FAILONERROR, true);
    // Exclude the body from the output; the request method is set to HEAD.
    curl_setopt($curl_inst, CURLOPT_NOBODY, true);
    curl_setopt($curl_inst, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($curl_inst, CURLOPT_RETURNTRANSFER, true);
    $raw = curl_exec($curl_inst);
    curl_close($curl_inst);
    return $raw !== false;
}
set_time_limit(0);
$stdin = fopen("php://stdin", "r");
// Apache writes one lookup key per line and waits for exactly one line back.
// Stop when stdin is closed instead of spinning on a failed fgets().
while (($line = fgets($stdin)) !== false) {
    $line = trim($line);
    // Always answer, even for an empty key, so Apache never blocks on a missing reply.
    echo (($line !== '' && check_if_url_exists($line)) ? "1" : "0") . "\n";
}
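The same stdin/stdout protocol can be sketched in Python, in case PHP is not available on the proxy host. This is only an illustration under assumptions: the names url_exists and serve are made up for this example, and the check treats any non-error response to a HEAD request as "exists". The point it shows is that a prg: map program must answer every line it reads with exactly one line, flushed immediately.

```python
import sys
import urllib.error
import urllib.request


def url_exists(url, timeout=30):
    """Return True if a HEAD request to url gets a non-error response."""
    try:
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout):
            return True
    except (urllib.error.URLError, ValueError, OSError):
        # HTTP errors (4xx/5xx), unreachable hosts and malformed URLs all count as "missing".
        return False


def serve(lines, out, check=url_exists):
    """Answer every lookup key with exactly one line: "1" or "0"."""
    for line in lines:
        key = line.strip()
        out.write("1\n" if key and check(key) else "0\n")
        out.flush()  # Apache blocks until the reply line arrives
```

To use it as a map program, the script would end with a call such as serve(sys.stdin, sys.stdout). Passing the check function as a parameter keeps the loop testable without network access.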

Related

ManagedFusion Rewriter 404 if trailing slash is missing?

I'm using ManagedFusion Rewriter as a reverse proxy. The configuration is fairly simple:
RewriteRule ^/api/(.*) http://www.example.com/api/$1 [P]
This works for pretty much any URL. However, if the URL does not end with a trailing slash, it fails.
A request like this works perfectly: GET api/report/
2013-10-10T11:27:11 [Rewrite] Input: http://localhost:50070/api/report/
2013-10-10T11:27:11 [Rule 0] Input: /api/report/
2013-10-10T11:27:11 [Rule 0] Rule Pattern Matched
2013-10-10T11:27:11 [Rule 0] Output: http://www.example.com/api/report/
2013-10-10T11:27:11 [Rewrite] Proxy: http://www.example.com/api/report/
2013-10-10T11:27:11 **********************************************************************************
2013-10-10T11:27:11 [Proxy] Request: http://www.example.com/api/report/
2013-10-10T11:27:12 [Proxy] System.Net.HttpWebResponse
2013-10-10T11:27:12 [Proxy] Received '200 OK'
2013-10-10T11:27:12 [Proxy] Response: http://localhost:50070/api/report/
2013-10-10T11:27:12 [Proxy] Response is being buffered
2013-10-10T11:27:12 [Proxy] Responding '200 OK'
However, a request like this will return a 404 without even making the request on the proxied URL: GET api/report/1
2013-10-10T11:27:13 [Rewrite] Input: http://localhost:50070/api/report/1
2013-10-10T11:27:13 [Rule 0] Input: /api/report/1
2013-10-10T11:27:13 [Rule 0] Rule Pattern Matched
2013-10-10T11:27:13 [Rule 0] Output: http://www.example.com/api/report/1
2013-10-10T11:27:13 [Rewrite] Proxy: http://www.example.com/api/report/1
(the log file finishes right here)
This is my whole configuration file:
RewriteEngine On
RewriteLog "log.txt"
RewriteLogLevel 9
RewriteRule ^/api/(.*) http://www.example.com/api/$1 [P]
Any idea where I may be going wrong?
EDIT: My workaround has been accepted as the solution in the Rewriter codebase, so I'll make this the accepted answer. Please, still provide feedback on possible approaches to it.
Found a workaround, but I don't think this is the actual solution, so I'll answer my own question but won't accept it as an answer. (Unless I change my mind later. Fate is a fickle mistress.)
I downloaded the source code of ManagedFusion.Rewriter (the latest one, apparently from GitHub, here: https://github.com/managedfusion/managedfusion-rewriter/releases) and integrated it into my code base.
The class ManagedFusion.Rewriter.RewriterModule contains the following two methods:
private void context_PostResolveRequestCache(object sender, EventArgs e)
{
    var context = new HttpContextWrapper(((HttpApplication)sender).Context);
    // check to see if this is a proxy request
    if (context.Items.Contains(Manager.ProxyHandlerStorageName))
        context.RewritePath("~/RewriterProxy.axd");
}
private void context_PostMapRequestHandler(object sender, EventArgs e)
{
    var context = new HttpContextWrapper(((HttpApplication)sender).Context);
    // check to see if this is a proxy request
    if (context.Items.Contains(Manager.ProxyHandlerStorageName))
    {
        var proxy = context.Items[Manager.ProxyHandlerStorageName] as IHttpProxyHandler;
        context.RewritePath("~" + proxy.ResponseUrl.PathAndQuery);
        context.Handler = proxy;
    }
}
As the names imply, the first one is the handler of PostResolveRequestCache, while the second one is the handler for PostMapRequestHandler.
In both of my example requests, the PostResolveRequestCache handler was being invoked and working fine. However, for my failing request, PostMapRequestHandler was not being executed.
This made me think that, for some reason, using RewritePath to rewrite a resource that does not look like a directory to a resource that looks like a file was preventing the actual handler from being picked up, and so preventing PostMapRequestHandler from being raised.
As such, I upgraded the Rewriter project from .NET 3.5 to 4.5 and replaced these lines:
if (context.Items.Contains(Manager.ProxyHandlerStorageName))
    context.RewritePath("~/RewriterProxy.axd");
with these:
if (context.Items.Contains(Manager.ProxyHandlerStorageName)) {
    var proxyHandler = context.Items[Manager.ProxyHandlerStorageName] as IHttpHandler;
    context.RemapHandler(proxyHandler);
}
With this, all the requests were being properly picked up by the handler and started working.
As a side note, my original rule had some mistakes. Instead of
RewriteRule ^/api/(.*) http://www.example.com/api/$1 [P]
it should have been:
RewriteRule ^/api/(.*) http://www.example.com/api/$1 [QSA,P,NC]
QSA appends the query string of the original request.
NC makes the regex match case-insensitively.

Mod-Rewrite Dynamic URLs

I've searched and searched and tried and tried for 2 days solid now on how to make myself some friendly URLs on a CMS I am making to teach myself PHP.
I am trying to change:
www.mydomain.com/cms/index.php?id=30
To:
www.mydomain.com/cms/30
to begin with. I have already created another function to change it from an id to an SEO URL, but I can't even get the basic numeric version working yet.
I have tried hundreds of combinations of how to write my .htaccess file. This is my current one, which seemingly does nothing:
Options +FollowSymLinks
RewriteEngine on
RewriteRule cms/index/id/(.*) cms/index.php?id=$1
RewriteRule cms/index/id/(.*)/ cms/index.php?id=$1
How my URLs are dynamically created:
$sqlCommand = "SELECT id, linklabel, seourl FROM pages WHERE showing='1' ORDER BY pageorder ASC";
// mysqli_error() needs the connection it should report on
$query = mysqli_query($myConnection, $sqlCommand) or die (mysqli_error($myConnection));
$menuDisplay = '';
while ($row = mysqli_fetch_array($query)) {
    $pid = $row["id"];
    $linklabel = $row["linklabel"];
    $seourl = $row["seourl"];
    $menuDisplay .= '<a href="index.php?id=' . $pid . '">' . $linklabel . '</a><br />';
}
mysqli_free_result($query);
Does anyone have any idea or solutions on what I could be doing wrong?
Thanks
How about:
RewriteRule cms/(\d+)/?$ cms/index.php?id=$1 [L]
Here \d+ matches one or more digits, so cms/30 is rewritten to cms/index.php?id=30.
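A character class written as [/d] matches only the literal characters '/' and 'd', while \d matches a digit. The sketch below uses Python's re module as a stand-in for Apache's regex engine (an assumption made for illustration; the two agree on this syntax) to compare both patterns against the path cms/30. The helper name page_id is hypothetical.

```python
import re

slash_or_d = re.compile(r"cms/([/d]+)")  # character class: '/' or 'd'
digits = re.compile(r"cms/(\d+)")        # one or more digits


def page_id(pattern, path):
    """Return the captured id, or None when the pattern does not match."""
    match = pattern.search(path)
    return match.group(1) if match else None


print(page_id(slash_or_d, "cms/30"))  # None
print(page_id(digits, "cms/30"))      # 30
```

Testing a pattern this way before putting it into .htaccess is usually quicker than reloading pages to see whether a rule fires.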

htaccess internal and external request distinction

I have a problem with an .htaccess file. I've tried googling but could not find anything helpful.
I have an AJAX request loading pages into index.php. The link triggering it gets "#" prepended via jQuery. So if you click on the link domain.com/foo/bar (a WordPress permalink) you get domain.com/#/foo/bar in the browser and the content is loaded via AJAX.
My problem is: since these are blog posts, external links point to the real link (domain.com/foo/bar), so I want them to be redirected to domain.com/#/foo/bar (because then the AJAX code checks the hash and does its magic).
Example here.
The jquery code for the prepend is:
$allLinks.each(function() {
$(this).attr('href', '#' + this.pathname);
...
and then the script checks
if (hash) { //we know what we want, the url is not the home page!
hash = hash.substring(1);
URL = 'http://' + top.location.host + hash;
var $link = $('a[href="' + URL + '"]'), // find the link of the url
...
Now I am trying to get the redirect to work with .htaccess. I need to check whether the request is external or internal
RewriteCond %{REMOTE_HOST} !^127\.0\.0\.1 #???
and whether the URI starts with "/#/", which is a problem because "#" starts a comment in .htaccess, and escaping it as \%23 does not seem to work either.
RewriteCond %{REQUEST_URI} !^/\%23/(.*)$ #???
How do I get this to work to simply redirect an external request from domain.com/foo/bar to domain.com/#/foo/bar without affecting the internal AJAX stuff?
I suppose your $allLinks variable is assigned in a fashion similar to this:
$allLinks = $('a');
Do this instead:
$allLinks = $('a[href^="' + document.location.protocol + '//' + document.location.hostname + '"]');
This will transform only internal links to your hash-y style.
OK, I've done it with PHP. Here is the code:
$path = $_SERVER["REQUEST_URI"];
if (isset($_SERVER['HTTP_X_REQUESTED_WITH']) && strtolower($_SERVER['HTTP_X_REQUESTED_WITH']) == 'xmlhttprequest') {
    echo "It's ajax";
} else {
    if (strpos($path, '/#/') === false) {
        // only works if no output (e.g. a <body> tag) has been sent yet
        header("Location: http://schnellebuntebilder.de/#".$path);
        exit;
    }
}
There surely is a better solution, but this does the trick for now. Since the page /foo/bar does not, in my case, include header.php, there is no <body> tag and the PHP header() function works. If anyone knows the .htaccess rules for this, I am keen to know and learn.
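One reason the "/#/" check is hard to do on the server: the part of a URL after "#" is the fragment, which the browser keeps to itself and does not send in the HTTP request. Neither a RewriteCond on REQUEST_URI nor PHP's REQUEST_URI will therefore ever contain "/#/". The sketch below uses Python's urllib.parse purely to illustrate how such a URL splits; the helper hash_redirect_target is a made-up name, and domain.com is the placeholder host from the question.

```python
from urllib.parse import urlsplit


def hash_redirect_target(host, request_path):
    """Build the hash-style URL that the AJAX loader expects."""
    return "http://" + host + "/#" + request_path


parts = urlsplit("http://domain.com/#/foo/bar")
print(parts.path)      # /
print(parts.fragment)  # /foo/bar

# A direct hit on /foo/bar is what the server actually sees for an external link.
print(hash_redirect_target("domain.com", "/foo/bar"))  # http://domain.com/#/foo/bar
```

So the server only needs to distinguish AJAX requests from ordinary ones (for example via the X-Requested-With header, as above); a request for the hash-style URL always arrives as a request for "/".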

Making PHP GET parameters look like directories

I am trying to make it so:
http://foo.foo/?parameter=value
"converts" to
http://foo.foo/value
Thanks.
Assuming you're running on Apache, this code in .htaccess works for me:
RewriteEngine on
RewriteRule ^([a-zA-Z0-9_-]+)/$ /index.php?parameter=$1
Depending on your site structure you may have to add a few rules though.
Enable mod_rewrite on your Apache server and use .htaccess rules to redirect requests to a controller file.
.htaccess
# Enable rewrites
RewriteEngine On
# The following two lines skip over other HTML/PHP files or resources like CSS, Javascript and image files
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# test.php is our controller file
RewriteRule ^.*$ test.php [L]
test.php
// REDIRECT_URL is provided by Apache when a URL has been rewritten
$args = explode('/', $_SERVER['REDIRECT_URL']);
array_shift($args);
$data = array();
for ($i = 0; $i < count($args); $i++) {
    $k = $args[$i];
    $v = ++$i < count($args) ? $args[$i] : null;
    $data[$k] = $v;
}
print_r($data);
Accessing the url http://localhost/abc/123/def/456 will output the following:
Array
(
[abc] => 123
[def] => 456
)
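The pairing logic of the controller can be checked outside a web server. The sketch below restates it in Python for illustration only; pairs_from_path is a hypothetical name, and it mirrors the PHP loop above, including the case where a trailing key has no value.

```python
def pairs_from_path(path):
    """Turn '/k1/v1/k2/v2' into {'k1': 'v1', 'k2': 'v2'}."""
    segments = path.split("/")[1:]  # drop the empty string before the leading slash
    data = {}
    for i in range(0, len(segments), 2):
        key = segments[i]
        # A key without a following segment gets None, like null in the PHP version.
        value = segments[i + 1] if i + 1 < len(segments) else None
        data[key] = value
    return data


print(pairs_from_path("/abc/123/def/456"))  # {'abc': '123', 'def': '456'}
```

An odd number of segments, as in /abc/123/def, leaves the last key mapped to None, so the controller should be prepared for missing values.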
Assuming you are using Apache, the following tutorials are epically helpful:
.htaccess part one
.htaccess part two
The second tutorial has your answer. Prepare to dig deep into a dungeon called mod_rewrite.
Use mod_rewrite rules if you are using Apache. This is a better and more secure way to make a virtual directory.

LibCURL sends filename rather than file contents when I try to upload a binary file in Windows Server 2008

I'm getting this weird behavior with libCURL. When I try to upload a file by prepending "@" to the filename (as documented in libCURL's man page), instead of the file contents being uploaded, libCURL sends the filename itself (with the @ at the beginning).
This is running on Windows 2008 R2, with xampp version 5.6.8, which has curl compiled in (curl version 7.40.0).
Here's the relevant code fragment:
$post['pic'] = "@C:\\image.png";
$ret = curl_setopt( $ch, CURLOPT_POST, TRUE );
if (!$ret) die("curl_setopt CURLOPT_POST failed");
$ret = curl_setopt( $ch, CURLOPT_POSTFIELDS, $post );
if (!$ret) die("curl_setopt CURLOPT_POSTFIELDS failed");
$response = curl_exec( $ch );
This code works on Linux but not Windows Server 2008.
Here's the form data that I get:
Content-Type: multipart/form-data; \\
boundary=------------------------c74a6af8b52d997a
--------------------------c74a6af8b52d997a
Content-Disposition: form-data; name="pic"
@C:\image.png
--------------------------c74a6af8b52d997a--
As you can see I receive @C:\image.png rather than the contents.
Does anyone know why libCURL wouldn't upload the file contents?
From the documentation of curl_setopt:
CURLOPT_POSTFIELDS The full data to post in a HTTP "POST" operation.
To post a file, prepend a filename with @ and use the full path. The
filetype can be explicitly specified by following the filename with
the type in the format ';type=mimetype'. This parameter can either be
passed as a urlencoded string like 'para1=val1&para2=val2&...' or as
an array with the field name as key and field data as value. If value
is an array, the Content-Type header will be set to
multipart/form-data. As of PHP 5.2.0, value must be an array if files
are passed to this option with the @ prefix. As of PHP 5.5.0, the @
prefix is deprecated and files can be sent using CURLFile. The @
prefix can be disabled for safe passing of values beginning with @ by
setting the CURLOPT_SAFE_UPLOAD option to TRUE.
The behavior depends on the PHP release, and the @ prefix is now deprecated.
You should use the CURLFile class to set the CURLOPT_POSTFIELDS of the curl request, like this:
$post['pic'] = new CURLFile('C:\\image.png');
$ret = curl_setopt( $ch, CURLOPT_POSTFIELDS, $post );

Resources