How do I strip out ?_escaped_fragment_= using .htaccess - ajax

Google discovered that I let end users navigate my content using AJAX loading, and it now crawls my pages the AJAX way instead of requesting them as new page loads. So instead of trying to index www.mysite.com/page, it requests www.mysite.com/?_escaped_fragment_=/page
That is not at all what I want. My snapshots are served at the same URL as the AJAX-loaded content. The site does not use query strings, does not support them, and I don't want to build that support. This means all the pages look broken to Google, which is unfortunate.
Currently all page requests are routed server side by .htaccess to index.php, which compiles the HTML document on the server before serving it to the client. The site serves perfectly valid and unique HTML documents for all pages, but Google insists on doing it the AJAX way and adding the query string, which always returns a broken page.
I'm not an .htaccess expert, but it seems to me the easiest fix would be to rewrite the request: remove the ?_escaped_fragment_= part and permanently redirect any such request to what currently works, which is loading the pages at their correct URLs.
Does anyone know how I would go about doing that? Below is the current redirect part of my .htaccess file, which needs to be amended with the _escaped_fragment_ stripping code:
<IfModule mod_rewrite.c>
RewriteEngine on
RewriteBase /
#if trailing / remove it with a permanent redirect
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)/$ /$1 [L,R=301]
#if missing www. add it with a permanent redirect
RewriteCond %{HTTP_HOST} !^www\.
RewriteRule ^(.*)$ http://www.%{HTTP_HOST}/$1 [L,R=301]
#requests for index.php never rewritten
RewriteRule ^index\.php$ - [L]
#if file or directory are missing, route to index.php
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>

This is how I rewrote it so that all ?_escaped_fragment_=/XXXXX requests are redirected to /XXXXX without the query string:
# %1 is the value captured from the query string; the trailing ? drops the query from the target
# no extra www. prefix here: %{HTTP_HOST} already contains it on www requests
RewriteCond %{QUERY_STRING} ^_escaped_fragment_=(.*)$
RewriteRule ^(.*)$ http://%{HTTP_HOST}%1? [L,R=301]
This makes www.domain.com/?_escaped_fragment_=/somepage redirect (permanently) to www.domain.com/somepage
...which is just what I wanted.
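For placement, here is a minimal sketch of how such a redirect could sit in the .htaccess from the question. It assumes the fragment value always starts with a literal / (as in the example above) and that the rule goes before the existing trailing-slash, www and index.php rules; neither assumption is confirmed by the original post.

```apache
<IfModule mod_rewrite.c>
RewriteEngine on
RewriteBase /

# Assumption: the fragment value starts with a literal "/", e.g. ?_escaped_fragment_=/page
# %1 is the captured path; the trailing "?" drops the original query string.
RewriteCond %{QUERY_STRING} ^_escaped_fragment_=(/.*)$
RewriteRule ^ http://%{HTTP_HOST}%1? [L,R=301]

# ... the existing trailing-slash, www and index.php rules follow here unchanged ...
</IfModule>
```

Putting it first means the query string is gone before any other rule looks at the request. A request that arrives without www is then picked up by the existing www rule on the next round trip.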

Related

rewriting simple html files for SEO friendly URLS

I have a simple file, mydomain.com/business_nottingham.html,
and I want to rewrite that to an SEO-friendly URL, e.g. mydomain.com/business-nottingham/
I've googled all the examples, but they seem to be designed for either CMSes or PHP scripts.
Is there a simple .htaccess rewrite example available that lets me do something as simple as the above?
Edit: I finally managed to find the following code:
Options +FollowSymlinks
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)\.html$ /$1 [L,R=301]
However, it's still not quite working. If I enter the tidy URL, I get redirected to a 404 page, while entering the exact .html file name gets me to the right file but doesn't present me with a clean URL.
I've tried various combinations of the above after reading various articles and tutorials, but for some reason it doesn't work for me.
You want to redirect only when the actual request is for an .html file, and then internally rewrite the clean URL back to the .html file. The way URLs resolve is this: the browser shows where it thinks it is going (the URL in the address bar) and sends that request to the web server. If the web server (where the .htaccess file lives) wants to change what is in the address bar, it has to tell the browser to load an entirely different URL, and the browser then requests the new URL. The server must then internally rewrite that new URL back to where the actual resource is (the first URL), but the browser never sees this happen.
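RewriteEngine On
# 1) The client literally asked for /something.html: redirect to /something/
#    (THE_REQUEST is the original request line, so internal rewrites never trigger this)
RewriteCond %{THE_REQUEST} ^(GET|HEAD)\ /([^\ ]+)\.html
RewriteRule ^ /%2/ [L,R=301]
# 2) /something/ is not a real file or directory, but something.html exists: serve it internally
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} ^/(.*?)/?$
RewriteCond %{DOCUMENT_ROOT}/%1.html -f
RewriteRule ^ /%1.html [L]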
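The rules above keep the file name as the URL, so business_nottingham.html would be reachable as /business_nottingham/. If the underscore really should become a hyphen, as in the question, a one-off pair of rules is one option. This is only a sketch for that single page, using the file name and target URL from the question and assuming the .htaccess file lives in the document root:

```apache
RewriteEngine On

# External redirect: only when the client literally asked for the .html file.
RewriteCond %{THE_REQUEST} ^(GET|HEAD)\ /business_nottingham\.html
RewriteRule ^ /business-nottingham/ [L,R=301]

# Internal rewrite: serve the real file for the friendly URL.
# The browser keeps showing /business-nottingham/.
RewriteRule ^business-nottingham/?$ /business_nottingham.html [L]
```

Because the first rule checks THE_REQUEST rather than the current URL, the internal rewrite in the second rule does not trigger another redirect.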

mod_rewrite forward shortened URL

I am looking for a way to create a short URL path for a longer URL on my site.
The long URL is: domain.com/tagcloud/user.html?t=1234ABCD
I would like to offer a short version of the URL for easy access:
domain.com/t/1234ABCD
I tried a few examples, but I just don't get how to forward these rules.
RewriteRule ^(.*)/t/$ /tagcloud/user.html?t=$1 [L]
I am also using MODX, which already brings its own rules.
In addition, here is my .htaccess file:
RewriteEngine On
RewriteBase /
# Always use www
RewriteCond %{HTTP_HOST} .
RewriteCond %{HTTP_HOST} !^www\.domain\.com [NC]
RewriteRule (.*) http://www.domain.com/$1 [R=301,L]
# The Friendly URLs part
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php?q=$1 [L,QSA]
I must keep the code snippets above in my .htaccess file. The first one simply forwards http://domain.com requests to www.domain.com
The friendly URLs part is needed to translate the internal IDs of my CMS into the URL aliases. This feature must remain, because the rest of the site must not be affected by the changes I am trying to make in .htaccess.
I would simply like to add a rule that applies only if the URL matches www.domain.com/t/abcd1234
So I need something that identifies the www.domain.com/t/ URL.
Your help is much appreciated.
Try this:
RewriteCond %{REQUEST_URI} ^/t/.*
RewriteRule ^t/(.*)$ /tagcloud/user.html?t=$1 [R=301,L]
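Since the existing friendly-URL rule catches everything that is not a real file, the position of the new rule matters. One possible layout is sketched below. It assumes the token after /t/ consists only of letters and digits, which is a guess based on the 1234ABCD example, and it drops the RewriteCond because the rule pattern already restricts the match to /t/.

```apache
RewriteEngine On
RewriteBase /

# Always use www (unchanged)
RewriteCond %{HTTP_HOST} .
RewriteCond %{HTTP_HOST} !^www\.domain\.com [NC]
RewriteRule (.*) http://www.domain.com/$1 [R=301,L]

# Short URL: /t/1234ABCD -> /tagcloud/user.html?t=1234ABCD
# Must come before the catch-all below, otherwise index.php handles /t/...
RewriteRule ^t/([A-Za-z0-9]+)/?$ /tagcloud/user.html?t=$1 [R=301,L]

# The Friendly URLs part (unchanged)
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php?q=$1 [L,QSA]
```

With R=301 the browser ends up on the long URL. Browsers tend to cache permanent redirects, so it can be easier to test with R=302 first.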

Using mod_rewrite to redirect old urls with Codeigniter

I'm redeveloping a website using the CodeIgniter framework.
When we go live, we want to ensure that a few of the old URLs are redirected to the appropriate pages on the new site.
So I put what I thought would be the correct rules into the existing .htaccess file, above the other rules that CodeIgniter applies.
However, they are not taking effect. Can anyone suggest what I'm missing here?
# pickup links pointing to the old site structure
RewriteRule ^(faq|contact)\.php$ /info/ [R=301]
RewriteRule ^registration\.php$ /register/ [R=301]
RewriteRule ^update_details\.php$ /change/ [R=301]
# Removes access to the system folder by users.
RewriteCond %{REQUEST_URI} ^_system.*
RewriteRule ^(.*)$ /index.php?/$1 [L]
# This snippet prevents user access to the application folder
# Rename 'application' to your applications folder name.
RewriteCond %{REQUEST_URI} ^myapp.*
RewriteRule ^(.*)$ /index.php?/$1 [L]
# This snippet re-routes everything through index.php, unless
# it's being sent to resources
RewriteCond $1 !^(index\.php|resources)
RewriteRule ^(.*)$ index.php/$1
Try adding the [L]ast flag to your R=301 flag, i.e. [L,R=301], which makes sure no other rules are applied to the request. Just to be sure, also try redirecting to a complete URL. And to be even more sure that nothing has been deleted, add RewriteEngine On at the top and set the RewriteBase.
Make your first rows look like
RewriteEngine On
RewriteBase /
RewriteRule ^(faq|contact)\.php$ http://www.YOURDOMAIN.XYZ/info/ [L,R=301]
and check whether the URL in your browser changes when you request, for instance, the faq page.
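Applied to all three old URLs from the question, the top of the file might look like the sketch below. It uses path-only targets instead of a full URL, on the assumption that the old and new site share the same host name.

```apache
RewriteEngine On
RewriteBase /

# Old URLs first. [L] stops rule processing for the request, so the
# CodeIgniter rules further down never see these redirects.
RewriteRule ^(faq|contact)\.php$ /info/ [L,R=301]
RewriteRule ^registration\.php$ /register/ [L,R=301]
RewriteRule ^update_details\.php$ /change/ [L,R=301]

# ... the existing CodeIgniter rules follow here unchanged ...
```

Without [L], processing continues into the later rules, and the catch-all that routes everything through index.php can rewrite the redirect target again before it is sent.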

using mod_rewrite on ajax site to load root index page when you access folders in url

I'm building an Ajax site that runs off of a root-level index.html file and uses history.js for pushState/popState, which I have updating the urls such that they are nice and clean without hashes or bangs (Example: site.com/section/1).
How can I do a mod_rewrite so that when a user tries to link to site.com/section/1 or site.com/section (or anywhere other than the root), the server serves up site.com/index.html?
From there the js would load the requested content in the url via ajax.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule (.*) /index.html [L]
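If you want to do something with the actual request (the section/1 part), you can access it via $1. Example (replacing the last line above):
RewriteRule (.*) /index.html?path=$1 [L]
In an .htaccess file, where the leading slash is stripped before the pattern is matched, this rewrites /section/1 to /index.html?path=section/1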

mod_rewrite - some requests being rewritten should produce a 404 but don't

I've been working on creating seo friendly urls for my site and have done this successfully with the following rules:
RewriteEngine on
RewriteBase /
#redirect non www to www
RewriteCond %{HTTP_HOST} ^example.co.uk [NC]
RewriteRule ^(.*)$ http://www.example.co.uk/$1 [L,R=301]
# Redirect any request with page=var to /var/ format
RewriteCond %{QUERY_STRING} ^page=(.+[^/])$ [NC]
RewriteRule ^index\.cfm$ http://%{HTTP_HOST}/%1/? [R=301,L,NC]
# If not an existing file or directory rewrite any request
# to page var.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+[^/])/$ index.cfm?page=$1 [QSA,L]
My problem is that, because of the last rule (I think), any root-directory request is turned into a friendly URL such as /pagethatdoesntpointanywhere/ whether it points anywhere or not. I'd be happy with a 404, but instead it just displays the homepage.
I've also tried adding a blanket 404 rule:
# 404 files that don't exist
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule .+ /notfound.html [R=404,L,NC]
But that makes all of the friendly urls 404s as well.
Could someone explain where I am going wrong here please?
1. Use ErrorDocument 404 /notfound.cfm to display a not-found error page when Apache cannot find the file (replace notfound.cfm with your own file).
2. In index.cfm, if the value of the page parameter is unknown (e.g. pagethatdoesntpointanywhere), display the 404 error page (use the same or similar code as notfound.cfm). This is the right place to do it, considering your rewrite rules and the fact that you are checking which page to display here anyway. Lots of products and frameworks work in a similar fashion (for example, WordPress).
3. Use both #1 and #2. #2 will work for URLs one folder deep (e.g. /meow/), while #1 will catch any other URLs (e.g. /meow/kitten/ or /meow/wuf/oink.css).

Resources