I'm looking for some help setting up a UrlRewrite for the same page with different query string parameters.
Below are the two URLs that I am looking to rewrite:
stockists.aspx?product=1&fragrance=2
stockists.aspx?store=1
I set up the URL rewrite for stockists.aspx?product=1&fragrance=2 (in config/UrlRewriting.config) first and tested it successfully:
<add name="Stockists"
virtualUrl="^~/stockists/(.*)/(.*).aspx"
rewriteUrlParameter="ExcludeFromClientQueryString"
destinationUrl="~/stockists.aspx?product=$1&amp;fragrance=$2"
ignoreCase="true" />
I then set up the URL rewrite for stockists.aspx?store=1 (in config/UrlRewriting.config), and now neither rewrite works:
<add name="Stores"
virtualUrl="^~/stockists/(.*).aspx"
rewriteUrlParameter="ExcludeFromClientQueryString"
destinationUrl="~/stockists.aspx?store=$1"
ignoreCase="true" />
Any suggestions on how the above can be achieved?
As it stands, the Stores rewrite also matches the URL format that the Stockists rewrite is meant to match, because (.*) matches any number of any characters, including forward slashes. To fix this, change each (.*) to ([^/]+). That matches any character except a forward slash, and also ensures that there is at least one character.
<add name="Stockists"
virtualUrl="^~/stockists/([^/]+)/([^/]+).aspx"
rewriteUrlParameter="ExcludeFromClientQueryString"
destinationUrl="~/stockists.aspx?product=$1&amp;fragrance=$2"
ignoreCase="true" />
<add name="Stores"
virtualUrl="^~/stockists/([^/]+).aspx"
rewriteUrlParameter="ExcludeFromClientQueryString"
destinationUrl="~/stockists.aspx?store=$1"
ignoreCase="true" />
Now, this might not completely solve the problem, but it should allow the Stockists rewrite to work again. I suspect there is something else wrong with the Stores rewrite, but the Stockists rewrite stopped working because the url pattern was matched by the Stores rewrite.
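The overlap between the two patterns can be checked outside the rewriter. The following is a minimal Python sketch, not UrlRewriting.config syntax: the pattern names are made up for illustration, the "~" is treated as a literal character, and the dot before "aspx" is escaped here even though the rules above leave it unescaped.

```python
import re

# Hypothetical stand-ins for the virtualUrl patterns discussed above.
GREEDY_STORES = re.compile(r"^~/stockists/(.*)\.aspx", re.IGNORECASE)
STRICT_STORES = re.compile(r"^~/stockists/([^/]+)\.aspx", re.IGNORECASE)
STRICT_STOCKISTS = re.compile(r"^~/stockists/([^/]+)/([^/]+)\.aspx", re.IGNORECASE)

two_segments = "~/stockists/1/2.aspx"
one_segment = "~/stockists/1.aspx"

# (.*) swallows the slash, so the single-segment rule also claims the two-segment URL.
print(GREEDY_STORES.match(two_segments).group(1))     # 1/2

# ([^/]+) cannot cross a slash, so each URL is matched by exactly one rule.
print(STRICT_STORES.match(two_segments))              # None
print(STRICT_STOCKISTS.match(two_segments).groups())  # ('1', '2')
print(STRICT_STORES.match(one_segment).group(1))      # 1
print(STRICT_STOCKISTS.match(one_segment))            # None
```

With the stricter patterns, the order of the two rules no longer decides which one wins.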
Related
I have a requirement to generate user-friendly URLs. I am on IIS.
My dynamic URLs look like this:
www.testsite.com/blog/article.cfm?articleid=4432
The client wants the URLs to look like this:
www.testsite.com/blog/article_title
I know this can easily be done using IIS URL Rewrite 2.0.
But the client wants to do it using ColdFusion only. The basic idea he gave is:
The user hits the URL www.testsite.com/blog/article_title
I fetch the article ID using the article_title in the URL.
I use the ID to call the article.cfm page, load the output into cfsavecontent, and then deliver that output to the browser.
But I do not think this is possible at the application server level. How will IIS understand our user-friendly URLs? Or am I missing something important? Is it possible to do this using ColdFusion at the application server level?
First, I hate to recommend reinventing the wheel. Web servers do this and do it well.
ColdFusion can do something like this with #cgi.path_info#. You can jump through some hoops, as Adam Tuttle explains in "Can I have 'friendly' url's without a URL rewriter in IIS?".
Option #2 (my favorite): onMissingTemplate.
It is only available to users of Application.cfc (I'm pretty sure Application.cfm has no counterpart to onMissingTemplate).
You can use this function within Application.cfc, and any request for a "missing" page will be passed to this event. You can then handle the request there, for example:
<cffunction name="onMissingTemplate">
    <cfargument name="targetPage" type="string" required="true" />
    <!--- Use a try block to catch errors. --->
    <cftry>
        <cfset local.pagename = listlast(cgi.script_name,"/")>
        <cfswitch expression="#listfirst(cgi.script_name,"/")#">
            <cfcase value="blog">
                <cfinclude template="mt_blog.cfm">
                <cfreturn true />
            </cfcase>
        </cfswitch>
        <!--- If no match, return false to pass back to default handler. --->
        <cfreturn false />
        <cfcatch>
            <!--- Do some error logging here --->
            <cfreturn false />
        </cfcatch>
    </cftry>
</cffunction>
mt_blog.cfm can then contain something like the following, if your URL looks like /blog/How-to-train-your-flea-circus.cfm:
<!--- get everything after the slash and before the dot --->
<cfset pagename = listfirst(listlast(cgi.script_name,"/"),".")>
<!--- you may want to cache blog post queries --->
<cfquery name="getblogpost">
select bBody,bTitle,bID
from Blog
where urlname = <cfqueryparam cfsqltype="cf_sql_varchar" value="#pagename#">
</cfquery>
<!--- This assumes you have a field, e.g. urlname, that holds a URL-friendly value to match
against. The trouble is that most blogs generate URL titles by changing every special char
to - or _, so it's difficult to change them back for this sort of comparison, and an add'l
db field is probably best. It also makes it a little easier to make sure no two blog posts have
identical (after url-safe-conversion) titles. --->
...
Or, if you use a URL like /blog/173_How-to-train-your-flea-circus.cfm (where 173 is a post ID):
<!--- get everything after the last slash and before the underscore --->
<cfset pageID = listfirst(listlast(cgi.script_name,"/"),"_")>
<!--- you may want to cache blog post queries --->
<cfquery name="getblogpost">
select bBody,bTitle,bID
from Blog
where bID = <cfqueryparam cfsqltype="cf_sql_integer" value="#pageID#">
</cfquery>
...
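The two URL shapes above (a bare URL name, or a numeric ID followed by an underscore) can be told apart with a small amount of parsing. This is a rough Python sketch of that logic rather than CFML; parse_blog_path is a hypothetical helper name, and it assumes the ID form always starts with decimal digits followed by "_".

```python
def parse_blog_path(script_name: str) -> dict:
    """Split a path such as /blog/173_Some-title.cfm into an optional ID and a URL name."""
    # Last path segment, e.g. "173_How-to-train-your-flea-circus.cfm".
    filename = script_name.rstrip("/").rsplit("/", 1)[-1]
    # Drop the extension: everything from the first dot onward.
    stem = filename.split(".", 1)[0]
    prefix, separator, rest = stem.partition("_")
    # Only treat the prefix as an ID when it is purely numeric.
    if separator and prefix.isdecimal():
        return {"id": int(prefix), "slug": rest}
    return {"id": None, "slug": stem}


print(parse_blog_path("/blog/173_How-to-train-your-flea-circus.cfm"))
print(parse_blog_path("/blog/How-to-train-your-flea-circus.cfm"))
```

Checking that the prefix is numeric before querying avoids passing a non-integer value to an integer query parameter when someone requests a malformed URL.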
I don't recommend using a missing file handler (or CF's onMissingTemplate). Otherwise IIS will return a 404 status code and your page will not be indexed by search engines.
What you need to do is identify a unique prefix pattern you want to use and create a web.config rewrite rule. Example: I sometimes use "/detail_"+id for product detail pages.
You don't need to retain a physical "/blog" sub-directory if you don't want to. Add the following rewrite rule to the web.config file in the web root to accept anything after /blog/ in the URL and interpret it as /?blogtitle=[everythingAfterBlog]. (I've added an additional clause in case you want to continue to support /blog/article.cfm links.)
<rules>
    <rule name="Blog" patternSyntax="ECMAScript" stopProcessing="true">
        <match url="blog/(.*)$" ignoreCase="true" />
        <conditions logicalGrouping="MatchAll" trackAllCaptures="false">
            <add input="{SCRIPT_FILENAME}" matchType="IsFile" negate="true" />
            <add input="{PATH_INFO}" pattern="^.*(blog/article.cfm).*$" negate="true" />
        </conditions>
        <action type="Rewrite" url="/?blogtitle={R:1}" appendQueryString="true" />
    </rule>
</rules>
I recommend using a 301 redirect to the new SEO-friendly URL. I also advise using dashes (-) between word fragments and keeping the character case consistent (i.e., lowercase), or you could get penalized for "duplicate content".
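For the dashed, lowercase URL names recommended here, a title can be normalized once and stored alongside the post. The following Python sketch shows one possible normalization; slugify is a hypothetical helper, and the exact rules (dropping accents, collapsing punctuation into a single dash) are assumptions rather than anything IIS or ColdFusion prescribes.

```python
import re
import unicodedata


def slugify(title: str) -> str:
    """Turn a post title into a lowercase, dash-separated URL name."""
    # Decompose accented characters, then drop anything that is not ASCII.
    ascii_title = (
        unicodedata.normalize("NFKD", title).encode("ascii", "ignore").decode("ascii")
    )
    # Lowercase, then collapse every run of non-alphanumeric characters into one dash.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_title.lower())
    # Remove dashes left over at either end.
    return slug.strip("-")


print(slugify("How to Train Your Flea Circus!"))  # how-to-train-your-flea-circus
```

Two different titles can still collapse to the same slug, so a uniqueness check when saving the post remains necessary.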
To add to what cfqueryparam suggested, the post "Using ColdFusion to Handle 404 errors" shows how to replace the web server's 404 handler with a CFM script, giving you full rewrite capabilities. It is for an older version of IIS, but you should be able to find the proper settings in the IIS version you are using.
As Adam and others have said (and the same point is made in the post), this is not something you should do if you can avoid it. Web servers working at the HTTP level are much better equipped to do this efficiently. When you rely on CF to do it, you are intentionally catching errors that are thrown in order to get the behavior you want. That's expensive and unnecessary. Typically the issue with most clients or stakeholders is a simple lack of understanding of, or familiarity with, technology like URL rewriting. See if you can bend them a little. Good luck! :)
Note that, in my attempt to display code examples, I will redact/edit out any references to the company for whom I work in an effort to obscure their identity, not so much to hide the fact that I'm even asking. It should also be of note that I am very new to this game of UrlRewrite/Tuckey/dotCMS.
I have been having trouble getting a redirect to work. It's using Tuckey URLRewrite through dotCMS. The attempt is to redirect, but as a forward versus a proxy, for SEO purposes.
I've found that the following works ('redirect' and 'proxy' are interchangeable here):
<to type="proxy">http://[redacted]:8080$1$3?%{query-string}</to>
However, the following leads to a 404 ('forward' and 'passthrough' are interchangeable here):
<to type="forward">http://[redacted]:8080$1$3?%{query-string}</to>
The entirety of the rule is as follows:
<!-- EN with Query Params -->
<rule>
<from>^/([^/]+)/en/([^/]+)?$</from>
<to type="proxy" qsappend="true">[redacted]:8080$1$3&%{query-string}</to>
</rule>
<!-- EN without Query Params -->
<rule>
<from>^(.*)(\/en)(\/.*)?$</from>
<to type="proxy">[redacted]:8080$1$3?%{query-string}</to>
</rule>
Some of my initial questions (as many more are likely to arise):
Is there such a difference between 'proxy'/'redirect' and 'forward'/'passthrough' that more specialized efforts to achieve a meaningful redirect need to be implemented?
Am I missing something in other configuration files that may affect the outcomes of these attempts at redirection?
EDIT: The differences in RegEx are me trying things to see if that could possibly be where the disconnect is occurring
Because URLs in dotCMS are virtual and do not correspond to physical resources, the servlet RequestDispatcher, which forward rules rely on, does not work. Instead, you need to set a request attribute, CMS_FILTER_URLMAP_OVERRIDE, which dotCMS will respect. In code, this looks like:
NormalRule forwardRule = new NormalRule();
forwardRule.setFrom( "^/example/forwardDotCMS/(.*)$" );
SetAttribute attribute = new SetAttribute();
attribute.setName("CMS_FILTER_URLMAP_OVERRIDE");
attribute.setValue("/about-us/index");
forwardRule.addSetAttribute(attribute);
addRewriteRule( forwardRule );
I have a urlrewrite.xml file with several rules in it. I'd like to put a rule at the very beginning to match any/all css or js requests and just pass them through AND STOP COMPARING TO ANY OTHER RULES...
I've got this:
<rule>
<name>RULE: ignore js and css</name>
<from>^(.*(\.css|\.js))$</from>
<to last="true">$1</to>
</rule>
but it seems this just causes the whole thing to loop repeatedly.
How can I make this work?
I found an answer...posting it here for future searchers...
I just had to change the "to" part. Instead of using $1 I found a reference to using the dash '-' so I gave it a try and it worked...
<rule>
<name>RULE: ignore js and css</name>
<from>^(.*(\.css|\.js))$</from>
<to last="true">-</to>
</rule>
I'm building a project using Spring Boot and Angular 1.5.X and I am struggling to handle full page refreshes of Angular routes - typical "404 because the path I made doesn't actually exist" problem. I've done a fair bit of research and the solution that I keep seeing is to implement a .htaccess file with the following snippet in order to redirect all unknown requests back to the index (I pulled the following from this post)
RewriteEngine On
Options FollowSymLinks
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /#/$1 [L]
I have Tuckey's UrlRewriteFilter installed - according to this blog post since I don't have a WEB-INF folder - and it is working. It starts and it reads the urlrewrite.xml successfully. However, I don't know what to put in my urlrewrite.xml - I haven't the slightest clue of how to translate the above into something that the UrlRewriteFilter can understand. I've browsed the manual for the UrlRewriteFilter and I don't really know where / how to start.
Basically, what do I have to put in my urlrewrite.xml so that if I hit F5, my website doesn't puke back 404 errors?
Any help is appreciated.
Edit 1
I should mention that all of my API endpoints are prefaced with /api/** in order to distinguish them from links on my front end - an example would be /api/open/getUser and /api/secured/updateSettings.
Edit 2
A couple of things I've discovered so far. One is that the UrlRewriteFilter can actually support .htaccess files, and I did get it (as far as I can tell) to load by moving the .htaccess into my Resources folder and tweaking the code sample in the above blog post slightly, changing this
private static final String CONFIG_LOCATION = "classpath:/urlrewrite.xml";
to
private static final String CONFIG_LOCATION = "classpath:/.htaccess";
and
Conf conf = new Conf(filterConfig.getServletContext(), resource.getInputStream(), resource.getFilename(), "MyProject");
to
Conf conf = new Conf(filterConfig.getServletContext(), resource.getInputStream(), resource.getFilename(), "MyProject", true);
The addition of the true tells the filter to use a .htaccess file. Awesome, problem solved, right? Not quite - it's hard to explain, but it doesn't seem like the UrlRewriteFilter was/is reading the .htaccess correctly. I was using an .htaccess tester to verify that the regexes and rewrite conditions were working as I expected, and they seemed to be. The tester said that they were fine. However, the UrlRewriteFilter would freak out and get stuck in some kind of loop, to the point that Java would throw a stack overflow exception (as to why, I've no idea - I can't seem to find a way to set the filter's logging level to debug via Java D:< ).
So clearly that didn't work - I am currently attempting to translate the .htaccess into urlrewrite.xml myself, and here is what I've managed to create so far.
<urlrewrite use-query-string="true">
<rule match-type="regex" enabled="true">
<note>
Any URI that ends with one of the following extensions will be allowed to continue on unimpeded.
Buried in the manual was the single line that said a "-" in the "to" will allow the request to
continue on unmodified.
</note>
<from>\.(html|css|jpeg|gif|png|js|ico|txt|pdf)$</from>
<to last="true">-</to>
</rule>
<rule match-type="regex" enabled="true">
<note>
Any URI that is prefaced with "/api/open/**" or "/api/secured/**" will be allowed through unmodified.
</note>
<condition type="request-uri" operator="equal">\/api\/(open|secured)\/([a-zA-Z0-9\/]+)</condition>
<from>^.*$</from>
<to last="true">-</to>
</rule>
<rule match-type="regex" enabled="false">
<note>
This one is supposed to be a "when all else fail" rule - if the other two rules don't match,
forward to the index and let Angular figure out the rest.
!! This one seems to be getting stuck in a loop of sorts !!
</note>
<from>^.*$</from>
<to last="true">/</to>
</rule>
</urlrewrite>
The first two seem to be working splendidly. The third rule (the one with enabled set to false for good reason) does not - it also appears to be getting stuck in the same filter loop (or whatever is happening - the stack trace is so big that IntelliJ is like "nah man") as the .htaccess method. Making progress.
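One plausible explanation for the loop, assuming the filter also runs on forwarded requests, is that the catch-all pattern matches its own target: "/" is matched by ^.*$ and forwarded to "/" again. The Python sketch below only models that first-match-then-forward behaviour; dispatch, CATCH_ALL and GUARDED are hypothetical names, and none of this is UrlRewriteFilter code.

```python
import re

MAX_HOPS = 10  # safety limit so the simulation terminates


def dispatch(path, rules):
    """Apply the first matching rule; re-run the rules after every forward."""
    hops = 0
    while hops < MAX_HOPS:
        for pattern, target in rules:
            if re.search(pattern, path):
                if target == "-":
                    return path, hops  # "-" means: let the request through unchanged
                path = target          # forward, then evaluate the rules again
                hops += 1
                break
        else:
            return path, hops          # no rule matched
    raise RecursionError("forward loop: the target keeps matching a rule")


# A lone catch-all forwards "/" to "/" forever.
CATCH_ALL = [(r"^.*$", "/")]
# Letting the target itself (and static assets) pass through breaks the cycle.
GUARDED = [(r"^/$", "-"), (r"\.(html|css|js)$", "-"), (r"^.*$", "/")]

try:
    dispatch("/post/20", CATCH_ALL)
except RecursionError as error:
    print(error)

print(dispatch("/post/20", GUARDED))  # ('/', 1)
```

Under that assumption, any catch-all rule needs either a condition that excludes its own target or an earlier pass-through rule for it.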
Huzzah, I managed to get it! It was a right pain in the butt since I couldn't figure out how to turn on debugging and see what the filter was actually doing, but I have succeeded!
Spent one metric crap ton of time using a regex tester, and this is what I came up with. I am by no means even remotely close to a regex master, so please try to contain your nausea should you have any.
<urlrewrite use-query-string="true">
<rule match-type="regex" enabled="true">
<note>
- "/post/**" and "/user/.../**" are optional - this is because when you're on, say, "/post/20" and you hit
F5, the browser will attempt to get the static assets from "/post/**"
- the second group is used to see if the request is for a static asset
- take advantage of back references and forward only the part that matches the second group
- i.e. "/post/20" as URI -> hit F5 -> "/post/scripts/mainController.js" request of server -> "/scripts/mainController.js" forwarded
- i.e. "/user/Tester/home" -> hit F5 -> "/user/Tester/scripts/mainController.js" -> "/scripts/mainController.js" forwarded
</note>
<condition type="request-uri" operator="equal">\/?(post\/|user\/[a-zA-Z0-9]+\/)(.*\.(html|css|jpe?g|gif|png|js|ico|txt|pdf))</condition>
<from>^.*$</from>
<to last="true">/%2</to>
</rule>
<rule match-type="regex" enabled="true">
<note>
Any URI that is prefaced with "/api/open/**" or "/api/secured/**" will be allowed through unmodified.
</note>
<condition type="request-uri" operator="equal">\/api\/(open|secured)\/([a-zA-Z0-9\/]+)</condition>
<from>^.*$</from>
<to last="true">-</to>
</rule>
<rule match-type="regex" enabled="true">
<note>
- Register, browse, search, and upload are all single level urls - the "\z" is to match the end of the string,
otherwise "/register" would match "/registerController.js"
- Inbox CAN be like "/inbox/favorites" so that's why it has a secondary regex - my Regex-Fu isn't good enough to combine
- Settings always has a secondary level
- User always has either home, gallery (w/ page and number), or favorites (w/ page and number)
- A post will always have a number
</note>
<condition type="request-uri" operator="equal" next="or">\/(register\z|browse\z|search\z|upload\z|inbox\z|tag\z)</condition>
<condition type="request-uri" operator="equal" next="or">\/inbox\/(favorites\z|uploads\z|comments\z)?</condition>
<condition type="request-uri" operator="equal" next="or">\/settings\/[A-Za-z-_0-9]+</condition>
<condition type="request-uri" operator="equal" next="or">\/user\/[A-Za-z-_0-9]+\/(home\z|gallery\/[0-9]+\/[0-9]+|favorites\/[0-9]+\/[0-9]+)</condition>
<condition type="request-uri" operator="equal" next="or">\/post\/[0-9]+</condition>
<from>^.*$</from>
<to last="true">/</to>
</rule>
</urlrewrite>
The rules are not as general as I'd like, but they are functional (I have a sneaking suspicion that those five conditionals daisy-chained together are a bit of a performance hit). The rules are pretty much tailored solely to my needs, but hopefully they can at least be a starting point for anybody else who was in my shoes about 4 days ago.
Another important thing to take note of is that in your Angular config (if you're using HTML5 mode - I don't believe that the following is required for hashbang mode), make sure you set requireBase to true, like:
$locationProvider.html5Mode({
enabled: true,
requireBase: true
});
and include a
<base href="/">
in the <head> of your index.html file. If you don't, Angular will get confused and parts of your application might not quite load correctly - parts of my URI were being trimmed, for example.
Also, tip for anybody new to using .htaccess / UrlRewriteFilter, go get yourself Postman in order to test your rules - probably a major "well, duh" for most, but for the rest of us it'll be a life saver :)
If anybody has any tips on how to improve the efficiency of the regexes or combine them at all, please let me know.
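On combining the regexes: the five route conditions can in principle be folded into one alternation. The sketch below tries this in Python rather than in urlrewrite.xml, so the syntax is Python's (whole-string matching via fullmatch instead of \z), and is_client_route is a hypothetical name. It is also deliberately stricter than the original conditions, which only anchor some branches at the end.

```python
import re

# One alternation covering the client-side routes listed in the rule notes above.
CLIENT_ROUTE = re.compile(
    r"""/(?:
          (?:register|browse|search|upload|tag)
        | inbox(?:/(?:favorites|uploads|comments))?
        | settings/[\w-]+
        | user/[\w-]+/(?:home|(?:gallery|favorites)/\d+/\d+)
        | post/\d+
    )""",
    re.VERBOSE | re.ASCII,
)


def is_client_route(path: str) -> bool:
    """True when the whole path is one of the Angular routes."""
    return CLIENT_ROUTE.fullmatch(path) is not None


for candidate in ("/register", "/registerController.js", "/inbox/favorites",
                  "/user/Tester/gallery/1/20", "/api/open/getUser"):
    print(candidate, is_client_route(candidate))
```

Whether a single alternation is measurably faster than five chained conditions inside the filter is untested here; the main gain is having the route list in one place.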
When my URL is localhost:8080, the rule below for Tuckey UrlRewriteFilter always (wrongly) redirects localhost:8080 to www.example.com.
That behaviour seems contrary to the Tuckey UrlRewriteFilter reference manual!
What I want is for localhost:8080 to remain unchanged, without redirection, to allow testing on a local computer.
I wish to avoid unwanted URLs which are NOT at the example.com domain from being indexed by search engines. The unwanted URLs have a different domain but point to the same/duplicate example.com pages.
<urlrewrite>
<rule>
<name>Avoid wrong hostname's pages being indexed by search engines</name>
<condition name="host" operator="notequal" next="and">www.example.com</condition>
<condition name="host" operator="notequal" next="and">localhost:8080</condition>
<from>^/(.*)</from>
<to type="permanent-redirect" last="true">http://www.example.com/$1</to>
</rule>
</urlrewrite>
Alternative:
I also tried it another way: removing all condition elements, and altering "from" to be:
<from>^/(^www.example.com|^localhost:8080)(\?.*)?$</from>
i.e. not equal to example.com and not equal to localhost -- but that has same problem.
I had the same problem as you but couldn't find a solution using Tuckey. I ended up reconciling local testing with domain-name consistency by using an interceptor in Spring. My code looks like this:
public boolean preHandle(HttpServletRequest request,
        HttpServletResponse response, Object handler) throws Exception {
    String url = request.getRequestURL().toString();
    // Redirect any request that is neither local testing nor the canonical domain.
    if (!url.startsWith("http://localhost") && !url.startsWith("http://www.example.com")) {
        response.sendRedirect("http://www.example.com" + request.getRequestURI());
        return false;
    }
    return true;
}
Note that this adds the overhead of a check on every request. Hope this helps!
Since you are using the regex match type, try giving the condition a regex too, e.g. <condition name="host" operator="notequal">^www\.example\.com$</condition>
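Whichever mechanism is used, the intended decision is "redirect unless the host is one of two allowed values". A minimal Python sketch of that decision follows; redirect_target, ALLOWED_HOSTS and CANONICAL_ORIGIN are hypothetical names, and the exact-match comparison is an assumption (it is stricter than a startsWith check, which would also accept a host such as localhost.evil.test).

```python
from typing import Optional

ALLOWED_HOSTS = {"www.example.com", "localhost:8080"}
CANONICAL_ORIGIN = "http://www.example.com"


def redirect_target(host_header: str, path_and_query: str) -> Optional[str]:
    """Return the URL to redirect to, or None when the host is acceptable."""
    # Host names are case-insensitive, so compare in lowercase.
    if host_header.lower() in ALLOWED_HOSTS:
        return None
    return CANONICAL_ORIGIN + path_and_query


print(redirect_target("localhost:8080", "/products"))            # None
print(redirect_target("mirror.example.net", "/products?id=1"))   # http://www.example.com/products?id=1
```

Passing the path together with the query string keeps parameters intact across the redirect.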