I'm a new osCommerce user.
I've been fiddling around with Chemo's Ultimate SEO add-on the last few days. I've mostly got it working (minus one bizarre redirect loop for category pages?) but I'm a little disappointed in the limited options for formatting URLs.
I'm seeing:
http://www.website.com/category-awesomeproduct-p-1735.html
When we'd really like to do something more in line with:
http://www.website.com/category/awesomeproduct
What are my options? Am I out of luck?
I fear that the stock URL parameters are rigidly defined and that there's no way to hide the less friendly ones.
After researching this for quite a while, and receiving no answers here, I believe the answer is no.
The view controller expects that data and it can't be omitted, even with customizations installed.
Is it possible to make a Joomla 2.5 component without using the TableHelloWorld class and all the field-type code from the tutorial page below? Or is it compulsory?
http://docs.joomla.org/Developing_a_Model-View-Controller_Component/2.5/Using_the_database
The system will actually function fairly well without it. All you need to get something running is a base file named after your component, a controller.php file, and the view as outlined in this section: http://docs.joomla.org/Developing_a_Model-View-Controller_Component/2.5/Adding_a_view_to_the_site_part
From that you will get something that runs and loads. And if you choose, you can just make raw SQL queries to the database.
That being said, the framework is there to help you, not to hinder you. I've cut a lot of corners over the years, and almost always you end up regretting it later. Feel free to play around with skipping the pieces, but just remember that there are pieces out there that can help you with all kinds of important things that you may not think you need right now. (Binding input, table row hierarchies, and check-in/check-out functionality are just a few that come to mind that I'm glad I didn't have to make myself.)
I have seen these "domain.com/#!/" formatted URLs and, driven merely by curiosity, I chose to ask you people... what is that used for? A kind of "exclamated hashtag", if you know what I mean.
I see it on sites such as "hypem.com" or "buzzchips.com", both of them delivering asynchronous dynamic content in a similar way.
I uploaded a tiny shot just so you actually see what I see, here and there.
It appears to be a standard for allowing dynamically created content to be crawled.
You can see a good explanation of this under the SEO heading for the following answer:
https://softwareengineering.stackexchange.com/questions/46716/what-should-a-developer-know-before-building-a-public-web-site/46760#46760
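For what it's worth, here is a rough sketch of the mapping that crawling scheme relies on, as I understand it: a crawler that sees a `#!` URL requests the same page with the fragment moved into an `_escaped_fragment_` query parameter, and the server answers with a pre-rendered snapshot. The function name and the example.com URLs are my own, the escaping is only an approximation of the scheme's rules, and Google has since deprecated the scheme, so treat this as an illustration rather than something to build on.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def to_escaped_fragment(url: str) -> str:
    """Map a '#!' URL to the '_escaped_fragment_' form a crawler would request."""
    parts = urlsplit(url)
    if not parts.fragment.startswith("!"):
        return url  # not a hashbang URL, nothing to translate
    # Drop the leading '!' and percent-encode the rest (approximate escaping).
    state = quote(parts.fragment[1:], safe="/=")
    extra = "_escaped_fragment_=" + state
    # Append to an existing query string if there is one.
    query = parts.query + "&" + extra if parts.query else extra
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


print(to_escaped_fragment("http://example.com/#!/popular"))
```

The fragment never reaches the server in a normal request, which is why the crawler needs this query-string form to fetch the content at all.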
I'm trying to clear up a grey area about this much talked about topic...
Like most devs, I've made some pretty URLs with mod_rewrite. My site's internal links point to the pretty URLs and things are working nicely.
But, I can still access the old URL if I point to it directly.
Now, this is most certainly going to cause duplicate content issues so after doing some research it seems that 301 redirects are the way to go.
But.... and here's the grey bit...
If you are working on a site with thousands of URLs, what's best practice to achieve this? I don't want to list 1k+ lines in .htaccess. I thought of a regexp in my rewrite rule, but my pretty URLs have names from the database in them... and I can't access that from .htaccess :)
Have I hit a dead end? Is there a way around this? Would Google's canonical tag be a possibility??
Well, I don't know if this is the "definitive" answer, but I have a bunch of "functional" URLs like:
http://www.flipscript.com/product.aspx?cid=7&pid=42&ds=asdjlf8i7sdfkhsjfd978
but I remap the URLs, link to them and list them in my site map as:
http://www.flipscript.com/ambigram-ring.aspx
I haven't seen ANY evidence that different URLs pointing to the same content within the same domain have any negative impact on SEO.
In fact, over the past year, I have climbed to the #1 position on Google with this in place for my primary keyword.
My theory about why this should be so is that Google applies the duplicate content penalty for entire "clone sites", not for just linking with different URLs to the same content within a single site.
A quick and dirty way would be to re-route everything on the site via a PHP file that checks whether the path is still valid, querying the database if necessary. Use a 301 redirect if the path has permanently moved. Soon enough these "grey URLs" should hardly ever be requested, and search engines should have updated their indexes, at which point you can remove the router.
If you could specify what your "grey url" looks like I may be able to suggest a better alternative.
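To make the router idea a bit more concrete, here is a minimal sketch of the decision it has to make, written in Python rather than PHP. The two lookup tables stand in for database queries, and the paths in them are made up for illustration; the only point is the three outcomes: serve the page, send a 301 to the new path, or return a 404.

```python
from typing import Optional, Tuple

# Hypothetical stand-ins for database lookups.
VALID_PATHS = {"/brand-name-product-name-79.html"}
MOVED_PATHS = {"/product.php?id=79": "/brand-name-product-name-79.html"}


def route(path: str) -> Tuple[int, Optional[str]]:
    """Return (HTTP status, redirect location or None) for a requested path."""
    if path in VALID_PATHS:
        return 200, None  # current URL, serve it as usual
    target = MOVED_PATHS.get(path)
    if target is not None:
        return 301, target  # permanently moved, send the client to the new URL
    return 404, None  # unknown path


print(route("/product.php?id=79"))
```

Because the lookup lives in application code, the pretty names can come straight from the database, which is exactly what .htaccess cannot do on its own.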
"Would Google's canonical tag be a possibility??" -- Why not?
--> It automatically transfers page rank
--> Google recommends canonical tag even if the content differs slightly but is more or less similar.
--> Too many 301 redirects to pages within a site are bad for SEO (my personal experience with Bing).
--> Too many 301 redirects increase the effective load time of content for your users (especially bad if the ping times from their location to your server are high).
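In case it helps, the tag itself is just a `<link>` element in the page's `<head>` that names the preferred URL. A small sketch of generating it follows; the helper name is mine, and the URL is the pretty one from the other answer, used purely as an example.

```python
from html import escape


def canonical_link(pretty_url: str) -> str:
    """Build the <link rel="canonical"> element for a page's <head>."""
    # Escape the URL so characters such as '&' are valid inside the attribute.
    return '<link rel="canonical" href="{}">'.format(escape(pretty_url, quote=True))


print(canonical_link("http://www.flipscript.com/ambigram-ring.aspx"))
```

Every variant of the page, old URL or new, would emit the same tag, so no redirect is involved and the old URLs keep working.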
Having a look at how google perceives our site at the moment and coming up short...
Basically, we use a bog-standard structure of URL rewriting to make them look SEO friendly.
for instance, a product URL takes the shape of any string_([0-9]+).html and so forth. of course, this allows us to put whatever we want before the product id... which we have done. In the past, a product page was Product_Name_79.html and then became Brand_Name_Product_Name_79.html. apache does not really care and id 79 gets passed on in either case. However, google now has 2 versions of this product cached under different URLs, and that's not a good thing as it continues to arrive at the first URL and spider it.
same thing applies to our rewrite rules for brands and categories, some of which had been dropped and some of which have been modified.
there are over 11k urls in site:domain whereas our sitemap lists only some 5.8k. how would you prevent spiders from fetching older versions of urls that you no longer link to (considering it's not a manual process and such urls can often be very dynamic)?
eg, Mens_Merrell_Trail_Running_Shoes__50-100__10____024/ is a dynamic url for the merrell brand, narrowed down by items in trail running shoes that cost between 50 and 100 and size 10 with gender set to men's.
if we decide to nofollow any size and money filter urls, that leaves google still being able to access them through its old cache...
what is the best practice for disallowing a particular type of urls? as the combinations above are nearly infinite, i cannot produce a list and it certainly cannot be backdated against what brands and categories google may hold for us historically.
shall we add noindex when such filters are applied? shall we export them to robots.txt? do nothing in the hope that google stops returning?
to put it into perspective, we have 2600 product page urls that are now redundant / disabled, what would you do with them? redirect to homepage, brand page, 404, do nothing?
thanks for any advice
i think you're looking for rel="canonical". google should start ignoring your links if they're really not linked to. You can check any incoming links with a tool like this: http://www.seomoz.org/linkscape.
Also, if your old urls match (or don't match) a consistent pattern, you could set up a 301 redirect in apache either for pages matching the old pattern or for pages not matching the new pattern...
hope this helps!
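Here is a rough sketch of the pattern-based check, in Python just to show the logic (in practice it would sit in the script that the rewrite rule hands product requests to). It assumes the URL shape from the question, any name followed by `_<id>.html`, and a hypothetical table of current names keyed by product id; none of the names below come from a real system.

```python
import re
from typing import Optional

# Assumed URL shape from the question: /<any name>_<numeric id>.html
PRODUCT_URL = re.compile(r"^/(?P<name>.+)_(?P<id>[0-9]+)\.html$")

# Hypothetical stand-in for a database lookup: product id -> current name.
CANONICAL_NAMES = {79: "Brand_Name_Product_Name"}


def redirect_target(path: str) -> Optional[str]:
    """Return the path to 301 to, or None if no redirect is needed."""
    match = PRODUCT_URL.match(path)
    if match is None:
        return None  # not a product URL
    product_id = int(match.group("id"))
    name = CANONICAL_NAMES.get(product_id)
    if name is None or name == match.group("name"):
        return None  # unknown product (caller decides on a 404) or already canonical
    return "/{}_{}.html".format(name, product_id)


print(redirect_target("/Product_Name_79.html"))
```

Since only the id is trusted, any old or mistyped name in front of it ends up at the one current URL instead of being served as a second copy.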
Just be sure to set up redirects for any URL you change. Also, I don't recommend using rel=nofollow since it indicates to Google that your site is not trustworthy.
What do you think... are clean URLs a backend or a frontend 'discipline'?
The answer is BOTH.
For example:
https://stackoverflow.com/questions/203278/are-clean-urls-a-backend-or-a-frontend-thing
The number in that URL (203278) is a database id, a back-end thing. Chop off the pretty part and it goes to the same page. Therefore the "are-clean-urls-a-backend-or-a-frontend-thing" part is a front-end thing.
If we're talking url's being 'clean' from an end user experience then I'm going to break the mould a bit and say that url's in general are not intuitive and they never will be, they are intended to be machine readable.
There is no standard for the format of a URL, so when navigating from site to site humans will never remember how to reach a resource purely by remembering URLs and their 'friendly syntax'. We can argue the toss about whether to use '?' and '&' or '/' to identify a resource via a URL. Is one method better than the other? It doesn't matter. At the end of the day a machine parses it and sends back the result.
We should stop deluding ourselves that people actually type these things in and realise that uri's are for machines, not people.
I have yet to use or remember a URI that goes beyond the first few characters of the http://domain.com/ part of an address, and I've been using the web for a long time. That's what bookmarks are for. Nowhere on a website does it say 'change this part here in our URL to view some other resource', because URLs are usually undocumented and opaque.
Yes, make your URIs SEO friendly (hell, even those rules change periodically), but forget about the whole 'human/clean' resource identifier thing; it's a mystical pipe dream.
I agree with Vlion that url's should provide a unique mechanism to bookmark a resource and return to it (unlike some of these abominable web 2.0 ajax/silverlight/flash creations), but the bookmark will never be for humans to comprehend and understand. There seems to be quite a lot of preoccupation and energy spent in dreaming up url strategies that humans can remember and type in, it's a waste of energy. Let's get on and solve real problems.
Sorry for the rant, but there's a lot of web 2.0 nonsense related to urls going on in certain circles that are just a total waste of time.
Now that Firefox's Awesome Bar and Google Chrome's Omnibox can be used to search the browsing history, it is much easier for users to find previously visited sites, so having clean URLs may help users find sites in their history more easily.
Making sure the page has an appropriate Title is important (as both browsers search the title as well as the url) but by making sure the url has relevant keywords in it as well, when those keywords are typed in the address bar the urls will be more likely to show up higher in the suggestions as the keyword will be matched twice, in the url and the title.
Also, once a user has typed the name of a site they will be presented with example urls from the site which they can then use as a template for narrowing down their search. So using verbs and nouns in the url for different sections or actions of the site will aid the user to narrow their search to just the part of the site they are interested in, e.g. the /questions/ or /tag/ sections of stackoverflow, or the "/doc" at the end of docs.google.com/doc that can be used to view just document pages on Google docs*.
Since both Firefox and Chrome search for each space separated word typed into the address bar, it could be argued that it isn't necessary for searching that the url be completely human readable, but to allow the user to actually read the keywords they are interested in from the url the amount of "noise" should be kept to a minimum.
* which are of the form http://docs.google.com/Doc?id=gibberish
My perspective is simple:
Every place I visit with my browser (with various edge case exceptions) should be bookmarkable, and Forward/Back should be usable and not destroy any data entry.
Backend for sure. Your server is the one that has to take care of the routing to the resources requested by the URL.
I think the main reasons for using friendly URLs are:
Ease of linking / sharing
Presentation
SEO
So I think it's purely a client-side pleasure. While they're nice on the server as well, they're not mission critical.