Is it possible to create conditional rewrite rules - url-rewriting

Here is my predicament.
I've inherited support of a website and have been tasked with moving it to a new host.
An issue I have encountered with this is that there is an uploads folder with over 93,000 files in it. I have to move these files into a 'Year\Month' directory structure, based on the date of the files, while keeping external links alive.
Putting aside the complexity of updating the database rows for the individual files to reflect the new structure, is it possible to create conditional rewrite statements?
What I mean is that when a request is made for a file in that directory, specifically in the root 'Uploads' folder, there would be a list of corresponding rewrite rules reflecting the new positions.
Would having this many rules have a serious performance issue?
I suppose I could simplify it further where instead of putting the existing files into 'Year/Month' structure I could put them into an alphanumeric structure based on the first character of the files i.e. files starting with symbols would all go in the 'Sorted\Symbols' folder, files starting with 1 will go in the 'Sorted\1' folder and so on.

I'm going to assume you're using Apache.
Yes, it is possible to create conditional rewrite rules using RewriteCond.
See here for more info: http://wiki.apache.org/httpd/RewriteCond
If you are sensible about how you set it up, it shouldn't have a noticeable impact on performance.
For example, if the first rewrite condition can eliminate all the requests that won't be rewritten, then it shouldn't create a big performance hit.
E.g.
RewriteCond %{REQUEST_URI} ^/mydir
This only performs rewrites on requests whose URI starts with /mydir.
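On the simpler alphanumeric layout mentioned in the question: because the new location can be computed from the file name alone, the mapping can be written as a function instead of one rule per file. Here is a minimal Python sketch of that mapping. The function name `sorted_path`, the `/Uploads/Sorted` prefix and the choice to lowercase the bucket letter are assumptions made for illustration, not part of the original site.

```python
def sorted_path(filename: str, base: str = "/Uploads/Sorted") -> str:
    """Return the hypothetical new location of a file from the old flat folder."""
    if not filename:
        raise ValueError("filename must not be empty")
    first = filename[0]
    if first.isascii() and first.isalnum():
        # Assumption: 'a' and 'A' share one bucket so that case-insensitive
        # file systems do not end up with clashing folder names.
        bucket = first.lower()
    else:
        # Anything that is not an ASCII letter or digit goes to 'Symbols'.
        bucket = "Symbols"
    return f"{base}/{bucket}/{filename}"
```

The same script could be used both to move the files and to check that a single pattern-based rewrite rule sends each old URL to the right place.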

Make: Shall we give same targets for multiple rules?

I am pretty new to GNU makefiles.
I am going to place two rules with the same target pattern as follows:
$(TGTSIP)/$(VERILOG_DIR)/src/%: $(TGTDDVAPI)/$(VERILOG_DIR)/%
$(test_file)
$(TGTSIP)/$(VERILOG_DIR)/src/%: $(TGTMODEL)/%
$(test_file)
I wrote the rules above in my makefile because the prerequisite changes for a few files. The build works as expected, but I am not sure whether this is the right way to write such rules. If it is not, could anyone share the best way to simplify this?
For more context: I am trying to copy files from two different paths to the same location. The target is the path we are copying to, and the prerequisites are the different source paths. How can we handle this in a single rule?
Make does not object to multiple concurrent pattern rules with non-empty recipes. In fact, it's perfectly legitimate.
However, you should keep in mind that in this case the timestamps of the concurrent sources do not matter: if both patterns match directly (i.e. not by implicit chaining) and with the same stem length, then the first one always wins (and no warning is issued). Only the winner's timestamp is then compared against the target's.
Therefore, it's possible that the target will not be updated by a source from the second directory (being shadowed by an old source from the first one). If it's okay for you (e.g. there's no conflict between the source dirs) then just do this.
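The shadowing effect described above can be illustrated with a small Python model. This is a simplified sketch of the "first matching rule wins" behaviour only, not make's actual search algorithm; the function `pick_source` and the directory names in the usage note are hypothetical.

```python
from typing import Optional, Sequence, Set


def pick_source(stem: str, source_dirs: Sequence[str], existing: Set[str]) -> Optional[str]:
    """Return the prerequisite chosen for a target with the given stem.

    source_dirs lists the prerequisite directories in the order their rules
    appear in the makefile. Timestamps are deliberately ignored: the first
    directory that contains the file is chosen, even if a later directory
    holds a newer copy.
    """
    for directory in source_dirs:
        candidate = f"{directory}/{stem}"
        if candidate in existing:
            return candidate
    return None  # no rule applies to this stem
```

For example, with files `ddvapi/a.v`, `model/a.v` and `model/b.v` on disk and the rules listed in the order `ddvapi`, `model`, the target for `a.v` is always built from `ddvapi/a.v`, while `b.v` falls through to `model/b.v`.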

Is there an efficient way in docpad to keep static and to-be-rendered files in the same directory?

I am rebuilding a site with docpad and it's very liberating to form a folders structure that makes sense with my workflow of content-creation, but I'm running into a problem with docpad's hard-division of content-to-be-rendered vs 'static'-content.
Docpad recommends that you put things like images in /files instead of /documents, and the documentation makes it sound as if otherwise there will be some processing overhead incurred.
First, I'd like an explanation if anyone has it of why a file with a
single extension (therefore no rendering) and no YAML front-matter,
such as a .jpg, would impact site-regeneration time when placed
within /documents.
Second, the real issue: is there a way, if it does indeed create a
performance hit, to mitigate it? For example, to specify an 'ignore'
list with regex, etc...
My use case
I would like to do this for posts and their associated images to make authoring a post more natural. I can easily see the images I have to work with and all the related files are in one place.
I also am doing this for an artwork I am displaying. In this case it's an even stronger use case, as the only data in my html.eco file is yaml front matter of various meta data, my layout automatically generates the gallery from all the attached images located in a folder of the same-name as the post. I can match the relative output path folder in my /files directory but it's error prone, because you're in one folder (src/files/artworks/) when creating the folder of images and another (src/documents/artworks/) when creating the html file -- typos are far more likely (as you can't ever see the folder and the html file side by side)...
Even without justifying a use case I can't see why docpad should be putting forth such a hard division. A performance consideration should not be passed on to the end user like that if it can be avoided in any way; since with docpad I am likely to be managing my blog through the file system I ought to have full control over that structure and certainly don't want my content divided up based on some framework limitation or performance concern instead of based on logical content divisions.
I think the key is the line about "metadata". Even though a file does NOT have a double extension, it can still have metadata at the top of the file which needs to be scanned and read. The double extension really just tells docpad to convert the file from one format and output it as another. If I create a straight html file in the document folder I can still include the metadata header in the form:
---
tags: ['tag1','tag2','tag3']
title: 'Some title'
---
When the file is copied to the out directory, this metadata will be removed. If I do the same thing to a html file in the files directory, the file will be copied to the out directory with the metadata header intact. So, the answer to your question is that even though your file has a single extension and is not "rendered" as such, it still needs to be opened and processed.
The point you make, however, is a good one. Keeping images and documents together. I can see a good argument for excluding certain file extensions (like image files) from being processed. Or perhaps, only including certain file extensions.
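To make the cost concrete, here is a minimal Python sketch of what "opening and processing" a file for a metadata header involves. It is not DocPad's actual implementation (DocPad is written in CoffeeScript/JavaScript); the function name `split_front_matter` is hypothetical and the header is returned as raw text rather than parsed YAML.

```python
from typing import Optional, Tuple


def split_front_matter(text: str) -> Tuple[Optional[str], str]:
    """Split a document into (metadata header, body).

    Returns (None, text) when the file does not start with a '---' line
    or the header is never closed by a second '---' line.
    """
    lines = text.splitlines(keepends=True)
    if not lines or lines[0].strip() != "---":
        return None, text
    for index in range(1, len(lines)):
        if lines[index].strip() == "---":
            header = "".join(lines[1:index])
            body = "".join(lines[index + 1:])
            return header, body
    return None, text
```

Even when there is no header, the file still has to be read to find that out, which is the overhead the documentation is hinting at.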

Variable directory path for AC_CONFIG_FILES in configure.ac

I am writing a small tool in c++. It is actually more of a framework that is open to customization. It has the following directory structure (simplified example).
src/
main/myexec # linked to libapple.so
apple/
coder/libapple.so
john/libapple.so
.
.
james/libapple.so
Here, the directory "coder" is a generic dummy, with some example code to generate libapple.so. Different users can checkout this tool, create directories of their own, copy the template code from "coder" and customize as they wish. Depending on the configure option (indicating the user), the respective libapple.so needs to be generated.
As I mentioned, this is a simplified example. It is not a matter of generic programming, inheritance etc. In fact, similar to the "apple" folder there are others like "scripts", "docs", "configs" etc each having similar user specific folders. Also, the tool will be maintained at a single repository location to allow me to support & maintain all the code that is not specific to user. As a policy, users are expected to modify and check-in only the contents of their folders.
The problem I am facing is with "configure.ac". I do not want to use the "AC_ARG_WITH" option as it would require each new user to edit configure.ac. Also, for each user the AC_CONFIG_FILES entries would be exactly the same except for their folder name. I tried using "--enable-user=User" and then AC_SUBST(USERDIR), which also helps in setting "SUBDIRS = #USERDIR#" in Makefile.am. Everything looks good except for the fact that "Makefile.in" is not getting created under the user folder when I specify "AC_CONFIG_FILES([apple/${USERDIR}/Makefile])".
Please advise how to overcome this issue. In the worst case I may end up creating symlinks :(
After one full day of scratching my head, here is the solution I came up with.
Create a file "project_makefiles.m4.in" like this:
AC_CONFIG_FILES([ apple/USERDIR/Makefile ])
Add the line below to configure.ac:
m4_include([project_makefiles.m4])
Create a wrapper script like "build.sh" which will create "project_makefiles.m4" from "project_makefiles.m4.in" by replacing "USERDIR". This is done before running automake.
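The substitution step of that wrapper could look roughly like the following Python sketch. The original "build.sh" is not shown, so this is only an assumption about what it does; the function names `render_m4` and `write_project_makefiles` are hypothetical.

```python
from pathlib import Path


def render_m4(template_text: str, user_dir: str) -> str:
    """Replace the USERDIR placeholder with the chosen user's folder name."""
    return template_text.replace("USERDIR", user_dir)


def write_project_makefiles(
    user_dir: str,
    template: str = "project_makefiles.m4.in",
    output: str = "project_makefiles.m4",
) -> None:
    """Generate the m4 fragment that configure.ac pulls in via m4_include."""
    rendered = render_m4(Path(template).read_text(), user_dir)
    Path(output).write_text(rendered)
```

Calling `write_project_makefiles("john")` before the autotools run would produce a fragment containing `apple/john/Makefile`, so that a literal path is present when automake processes configure.ac.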

Lighttpd's mod_rewrite module

I have a problem with the mod_rewrite module with Lighty.
I am trying to make this: example.com/index.php?search=whatever, appear as example.com/whatever or example.com/search/whatever (have not decided yet -- don't know much about SEO)
While I want it to act like above, I also want to exclude all physical directories and all files, such as the directory /images/ and the files index.php, favicon.ico, style.css etc. from the rewrite, because it acts weird.
How would I achieve this? I've tried the following, which worked okay for what I wanted, but didn't really work with the exclusion of the directories and files:
url.rewrite-once = (
"^/([a-zA-Z0-9_-]+)" => "/index.php?search=$1",
"^/(images|js|wp-content)/(.*)" => "$0",
"(.*\.php|.*\.css|favicon.ico)" => "$0" )
By the way, what difference is there between this:
"^/([a-zA-Z0-9_-]+)" => "/index.php?search=$1",
And this:
"^/(.*)$" => "/index.php?search=$1"
To avoid having to add numerous RewriteCond directives checking that the visitor's request is not an actual file rather than an artificial path, I recommend going with the /search/whatever pattern in favour of the /whatever pattern. Then, so long as you never create an actual directory called "search", you'll never need to check whether a path beginning with /search is an actual file path. So your RewriteRule becomes this simple:
RewriteRule ^/search/([a-zA-Z0-9_-]+)$ /index.php?search=$1
(I'm not familiar with Lighty, so I'm not sure how to translate this into a url.rewrite-once instruction, but this is such a simple rewrite that it should be straightforward.)
However, the visitor's browser will now think that they are viewing a page which is in a sub-directory called "search", so if you have any image elements or CSS files specified with relative paths (paths not anchored to the root directory) such as src="images/photo.jpg" or href="stylesheets/clean.css" then the browser will think those paths are relative to the "search" directory and will ask your web server for /search/images/photo.jpg and /search/stylesheets/clean.css respectively.
There are two ways to deal with this. The first is to change all page decoration (images, stylesheets, JavaScript) paths to absolute paths. That is, change the path so that it begins with a forward slash, which represents the root directory of the website. So your image path would need to be changed to src="/images/photo.jpg" and your stylesheet path to href="/stylesheets/clean.css". The forward slash at the start tells the web browser that the path starts at the site's root directory, so there is no ambiguity.
The second option is to create convoluted RewriteRules to redirect requests for images, stylesheets, script files, etc, to the correct directories. This tends to become ugly and fragile if you have a lot of media types in a lot of different directories and you need them to work from a lot of different sub-directories (virtual and/or otherwise).
Which option you choose depends on your requirements and preferences.
Regarding your question about the difference between [a-zA-Z0-9_-]+ and .*: the first pattern only allows letters a to z (lowercase or uppercase), digits, underscores and hyphens. The second pattern allows any characters. For security and debugging reasons, it's usually better to use the pattern which limits characters to only those which should be allowed. So I'd go with the first of those patterns, adding additional permitted characters if necessary, rather than allow all characters.
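The difference between the two patterns can be checked with a short Python sketch. It uses Python's `re` module rather than Lighttpd's own matcher, so treat it as an illustration of the regular expressions only; the helper name `captured` is hypothetical.

```python
import re
from typing import Optional

# The two patterns from the question, copied verbatim.
STRICT = re.compile(r"^/([a-zA-Z0-9_-]+)")
LOOSE = re.compile(r"^/(.*)$")


def captured(pattern: "re.Pattern[str]", path: str) -> Optional[str]:
    """Return what the first capture group ($1) would hold, or None if no match."""
    match = pattern.match(path)
    return match.group(1) if match else None
```

With these definitions, `captured(LOOSE, "/images/a.png")` gives `images/a.png`, while `captured(STRICT, "/images/a.png")` gives only `images`. Because the strict pattern in the question has no `$` at the end, it also matches the leading part of `/style.css` and captures `style`, which may be part of why the exclusions appeared not to work.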

Redirect images from folder to folder

I had a folder that contained 15000 images, and I decided to split them across 10 folders.
The initial folder url where i had my 15000 images was :
www.mysite.com/images/games/
And the new folders' urls are :
www.mysite.com/images/games/1/
www.mysite.com/images/games/2/
www.mysite.com/images/games/3/
www.mysite.com/images/games/4/
www.mysite.com/images/games/5/
www.mysite.com/images/games/6/
www.mysite.com/images/games/7/
www.mysite.com/images/games/8/
www.mysite.com/images/games/9/
www.mysite.com/images/games/10/
How can I redirect my images to match their correct folder to get rid of 404 errors?
Given your expansion in the comments above:
I know which images go to which folder, but it's a big list
I suggest putting that list in a text file, and using find and replace (or awk?) to transform that list into Redirect rules (the Redirect directive is provided by mod_alias), e.g.
Redirect /images/games/oldname /images/games/8/oldname
Then paste those into a configuration file. If there's no actual algorithm that determined how these got sorted into directories, there's no program that can redirect requests, because it would require a formula to work from.
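As an alternative to find and replace or awk, the list-to-rules transformation can be sketched in a few lines of Python. The input format (pairs of file name and folder number) and the function name `redirect_lines` are assumptions, since the actual list was never shown.

```python
from typing import Iterable, List, Tuple


def redirect_lines(
    assignments: Iterable[Tuple[str, int]],
    base: str = "/images/games",
) -> List[str]:
    """Build one Redirect directive per (file name, folder number) pair."""
    return [
        f"Redirect {base}/{name} {base}/{folder}/{name}"
        for name, folder in assignments
    ]
```

The resulting lines can be written to a file and pasted into the configuration. File names containing spaces would need quoting, which this sketch does not handle.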
Much as I love to cite When Not To Use Rewrite, this is actually a case where mod_rewrite is called for.
You don't say how you distributed your images between the ten new directories, but I am assuming the images were numbered (e.g. 00001.png, 00002.png, ... 15000.png). Then you used the last digit to determine which image goes into which folder, i.e. 00001.png goes into the 1 directory, all the way up to 15000.png which goes into the 10 directory (I think - why didn't you name it 0?) That's what you did, right?
You don't mention what server you're using on this site, so I'm going to assume you're using Apache, right? In that case you'd edit the appropriate configuration file (I started trying to assume which one, but honestly it could be any one of four I can think of, depending on whether you have edit rights to the vhost configuration, or root on the server, or if you're on a Debian-flavored or Red-Hat-flavored distro, you are using Linux, right? You didn't say.) You'd make sure there's a RewriteEngine On in there somewhere, then you'd do something like this:
RewriteRule ^/images/games/(\d+)(\d)\.png$ /images/games/$2/$1$2.png
...except that you named the directory 10 instead of 0, so all the ones ending in 0 are going to get misdirected. Maybe we should try this?
RewriteRule ^/images/games/(\d+0)\.png$ /images/games/10/$1.png [L]
RewriteRule ^/images/games/(\d+)([1-9])\.png$ /images/games/$2/$1$2.png [L]
...and catch the 0s first, then everything else.
I'd check this answer, but you haven't given us anywhere near enough information to know how.
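One way to sanity-check the two rules without touching the server is to replay them in Python. This sketch assumes the numbering scheme guessed at above and uses Python's `re` module in place of mod_rewrite, so it only verifies the regular expressions, not the Apache configuration; the function name `rewrite` is hypothetical.

```python
import re

# The two rules from above, in the same order: names ending in 0 first.
RULES = [
    (re.compile(r"^/images/games/(\d+0)\.png$"), r"/images/games/10/\1.png"),
    (re.compile(r"^/images/games/(\d+)([1-9])\.png$"), r"/images/games/\2/\1\2.png"),
]


def rewrite(path: str) -> str:
    """Apply the first matching rule, like the [L] flag; otherwise return path unchanged."""
    for pattern, target in RULES:
        if pattern.match(path):
            return pattern.sub(target, path)
    return path
```

Here `rewrite("/images/games/00001.png")` yields `/images/games/1/00001.png` and `rewrite("/images/games/15000.png")` yields `/images/games/10/15000.png`. Note that both patterns need at least two digits, so a name such as `5.png` is left untouched.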