Why do sites regularly place language "en", "fr", etc in the URL rather than leaving it in a session variable/cookie etc?

Surely language is an individual thing, and better set based on the user's browser settings or an explicit setting that they've selected and saved (via session/cookie).
If I'm sending a page I've just read to my French-speaking friend, it would be better without any language code in the URL, so that it opens in French for him based on his browser setting or on an earlier visit to the site.
This seems very obvious to me... but a LOT of major sites put the language in the URL. So I feel I must be missing something... What?

The only advantage I can see is for indexing. If your website needs to be indexed (e-commerce products, for instance), then the language becomes very important for regional search engines.
For instance, here is an extract from Google's documentation:
Google uses the content of the page to determine its language, but the
URL itself provides human users with useful clues about the page’s
content. For example, the following .ca URLs use fr as a subdomain or
subdirectory to clearly indicate French content:
http://example.ca/fr/vélo-de-montagne.html and


make google show two different websites depending on searcher's language

Hi, I created a Google Webmaster Tools account and submitted two sitemaps for my site: one for the Italian version and one for the English version.
My site has a script in the index page that redirects users to mywebsite.it/it if they are Italian, and to mywebsite.it/en otherwise.
The problem is that Google's crawler (which obviously is not Italian) only sees the English version of the site, not both.
Is there a way to make it crawl both versions and show the right one depending on the searcher's language?
Do you use JavaScript to redirect people? It would be better to use a server-side redirect, for example with .htaccess.
However, as long as you link both language versions from your index page and Google has accepted your sitemaps, your site should be fine to index. It may just take some more time until the crawler visits your Italian site, too.
Update: You could/should add a language switcher for users to your site, and also link the translations in the head area of your site with the link element, using rel="alternate" together with hreflang="it" or hreflang="en" respectively. See Google: rel="alternate" hreflang="x"
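As a rough illustration of the link elements mentioned above, here is a minimal Python sketch that generates them. The domain mywebsite.it and the /it and /en prefixes come from the question; the function name hreflang_links and the page name index.html are assumptions made up for the example.

```python
from html import escape


def hreflang_links(base_url, page_path, languages):
    """Build one <link rel="alternate"> element per language version of a page."""
    links = []
    for lang in languages:
        # Assumes each translation lives under /<lang>/ on the same domain.
        href = f"{base_url}/{lang}/{page_path.lstrip('/')}"
        links.append(
            f'<link rel="alternate" hreflang="{escape(lang)}" href="{escape(href)}" />'
        )
    return links
```

Calling hreflang_links("http://mywebsite.it", "index.html", ["it", "en"]) gives two tags, one per language, and the same pair would go in the head of both the Italian and the English page.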

is it possible to run multiple websites from the same URL?

I'm in the process of adding a US site to my current UK site. I'd like to do this as transparently as possible so that we don't lose any traffic to existing links. We're currently running this under version of Magento on a shared hosting setup.
The new website (US) will be essentially the same as the current (UK) site, but with US Dollar pricing instead of Pound Sterling.
We currently have a GeoIP setup whereby visitors are redirected to either the UK or the US site while utilising the same URL. This essentially means that we have switch statements in our index.php to indicate which run code to use.
Here's my question:
What's the best way of selecting/overriding the GeoIP selection via the standard store switcher drop-down? Both websites are listed in the drop-down; however, since both use the same URL (www.example.com/boutique), only the default one is ever selected.
I've also tried the &_store= as well as the &_website= arguments with no success.
Any ideas? Are URL rewrites in .htaccess the answer? If so, any ideas as to what to use?
P.S. This is pretty much the method being followed; however, my aim is to let users override their location-specific website (e.g. US) if necessary: http://www.magentocommerce.com/wiki/4_-_themes_and_template_customization/navigation/multiple-website-setup#multiple_website_setup_for_useuuk_storespricing
Have you tried using a getUrl() method to build the store arguments for you? It can help clear up those little misunderstandings, for example I'm pretty sure the store parameter is supposed to have three underscores but cannot really remember so I use the function instead.
The best way to over-ride is to have a little php program, e.g. 'countries.php' that sets a cookie depending on the country code that you choose or 'auto' to test regular geoip. Then in your index.php have an 'if cookie then use cookie code else use geoip code'. Naturally the cookie can only be set by your test program.
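The "if cookie then use cookie code else use geoip code" rule can be sketched roughly as follows. This is Python rather than the PHP of index.php, purely to show the decision logic; the cookie name site_override and the website codes uk and us are hypothetical, and the GeoIP country is assumed to have been looked up already.

```python
VALID_WEBSITES = {"uk", "us"}  # hypothetical website codes


def choose_website(cookies, geoip_country):
    """Return the website code: an explicit cookie wins, GeoIP is the fallback."""
    choice = cookies.get("site_override", "auto")  # hypothetical cookie name
    if choice in VALID_WEBSITES:
        return choice
    # 'auto' (or any unrecognised value) falls through to the GeoIP result.
    return "us" if geoip_country == "US" else "uk"
```

Validating the cookie against a fixed set means a stale or tampered value simply behaves like 'auto' instead of selecting a website that does not exist.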
And yes, you only need set 'website' not 'store'. There is no benefit in your US customers being able to see your UK prices (and vice-versa) so don't even bother with setting up a frontend drop-down. Or, if you really want, you can have rest-of-the-world customers choose their currency/website and put your own cookie-setting code in the header for them, with a couple of nice flag icons.

how to get the country of the user to use his language

I've heard of many approaches that depend on files such as a CSV or a database,
but I don't think uploading an extra database to my site just for that is a good idea.
I'm more comfortable with external providers.
Is using an external site that returns the country for an IP address a good approach,
or is it a bad one because the server has to wait for the external provider's response, which will slow down the site?
Look at this: http://www.rubyquiz.com/quiz139.html
It's actually not a great idea to base the language choice on the IP address anyway. What if I'm an American browsing from Germany, and I don't speak German very well? The most standards-compliant way (I think) would be to parse the Accept-Language header of the web request and use that to set the user's default, but always give them a way to override the default and pick their language (which you'd then store in their session or user prefs).
I would use the PHP variable $_SERVER['HTTP_ACCEPT_LANGUAGE'], which in my case holds the value sk,cs;q=0.8,en-us;q=0.5,en;q=0.3. That means my browser's preferred language is Slovak.
I think this option is better. Just imagine that you are English but on vacation somewhere, using your notebook there. Your IP address would tell the server that you are in Croatia and would like content in Croatian, but your browser still says you are an English-speaking person. That's the difference ;)
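To make the header format concrete, here is a minimal Python sketch that parses a value like the one quoted above and picks a language the site supports. The function names and the set of supported languages are assumptions for illustration; a production parser would follow the HTTP specification more strictly (wildcards, malformed parameters, and so on).

```python
def parse_accept_language(header):
    """Return language tags from an Accept-Language header, best first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        tag, _, params = item.strip().partition(";")
        tag = tag.strip().lower()
        if not tag:
            continue
        quality = 1.0  # a tag without q= carries the default weight of 1
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat an unreadable weight as "not acceptable"
        if quality > 0:
            # Sort by descending weight, then by original position.
            weighted.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(weighted)]


def pick_language(header, supported, default="en"):
    """Pick the first supported language, falling back from 'en-us' to 'en'."""
    for tag in parse_accept_language(header):
        for candidate in (tag, tag.split("-")[0]):
            if candidate in supported:
                return candidate
    return default
```

With the header above and a site that only offers English and Italian, pick_language skips sk and cs and settles on en via the en-us entry. The result should still only be a default that the user can override.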
I wouldn't use GeoIP for this - there are too many scenarios when it fails or produces the wrong results.
As @Paul says, the HTTP Accept-Language header specifies the user's language preferences as defined in the browser. You can view what your browser is set to by visiting Browser Language Detection.
For a real worked example see Parse Accept-Language to detect a user's language.
Also remember that crawlers don't use Accept-Language, so it is important to have a strategy for making each language version reachable (e.g. a separate URL for each language) and to include those URLs in your sitemap.
Also see Apache Module mod_negotiation for content selection.

Get URI fragment (hash) to affect SEO? Get indexed by SEs?

I am building a forum site where the post is retrieved on the same page as the listing via AJAX. When a new post is shown, the URI fragment is changed (ex: .php#1_This-is-the-first-post). Also the title and meta tags are changed.
My question is this: I have read that search engines aren't able to use #these-words, so my entire site won't be indexable (it will look like one page).
What can I do to get around this, or at least make my sub-pages indexable?
NOTE: I have built almost all of the site, so radical changes would be hard. SEO is my weakest geek skill.
Add non-AJAX versions of every page, and link to them from your popups as "permalinks" (or whatever you want to call them). Not only aren't your pages available to search engines, they can't be bookmarked or emailed to friends. I recently worked with some designers on a site and talked them out of using an AJAX-only design. They ended up putting article "teasers" in popups and making users go to a page with a bookmarkable URL to read the complete texts.
As difficult as it may be, the "best" answer may be to re-architect your site to use the hash-based URL scheme more sparingly.
Short of that, I'd suggest the following:
Create an alternative, non-hash based URL scheme. This is a must.
Create a site-map that allows search engines to find your existing pages through the new URL scheme.
Slowly port your site over. You might consider adding these deeper links on the page, or encourage users to share those links instead of the hash-based ones, etc.
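The first two steps above can be sketched roughly as follows, in Python. The fragment format is the one from the question (#1_This-is-the-first-post); the /post/ prefix, the example domain and both function names are assumptions, not anything the site already has.

```python
from xml.sax.saxutils import escape


def fragment_to_path(fragment):
    """Turn '#1_This-is-the-first-post' into '/post/1/this-is-the-first-post'."""
    post_id, _, slug = fragment.lstrip("#").partition("_")
    if not post_id.isdigit():
        raise ValueError(f"unexpected fragment: {fragment!r}")
    return f"/post/{post_id}/{slug.lower()}"  # hypothetical URL scheme


def sitemap_xml(base_url, fragments):
    """Build a minimal sitemap listing the non-hash URL of every post."""
    entries = "\n".join(
        f"  <url><loc>{escape(base_url + fragment_to_path(f))}</loc></url>"
        for f in fragments
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        f"{entries}\n"
        "</urlset>"
    )
```

The server would still need to answer those non-hash URLs with the full post, which is the part that requires real changes to the site.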
Hope this helps!

websites urls without file extension?

When I look at Amazon.com and I see their URL for pages, it does not have .htm, .html or .php at the end of the URL.
It is like:
Why and how? What kind of extension is that?
Your browser doesn't care about the extension of the file, only the content type that the server reports. (Well, unless you use IE because at Microsoft they think they know more about what you're serving up than you do). If your server reports that the content being served up is Content-Type: text/html, then your browser is supposed to treat it like it's HTML no matter what the file name is.
Typically, it's implemented using a URL rewriting scheme of some description. The basic notion is that the web should be moving to addressing resources with proper URIs, not classic old URLs which leak implementation detail, and which are vulnerable to future changes as a result.
A thorough discussion of the topic can be found in Tim Berners-Lee's article Cool URIs Don't Change, which argues in favour of reducing the irrelevant cruft in URIs as a means of helping to avoid the problems that occur when implementations do change, and when resources do move to a different URL. The article itself contains good general advice on planning out a URI scheme, and is well worth a read.
More specifically than most of these answers:
Browsers don't use the file extension to determine what kind of file is being served (unless the browser is Internet Explorer). Instead, they use the Content-type HTTP header, which is sent down the wire before the content of the image, HTML page, download, or whatever. For example:
Content-type: text/html
denotes that the page you are viewing should be interpreted as HTML, and
Content-type: image/png
denotes that the page is a PNG image.
Web servers often use the file extension if the file is served directly from disk to determine what Content-type to assign, but web applications can also generate pages with any Content-type they like in response to a request. No matter the filename's structure or extension, so long as the actual content of the page matches with the declared Content-type, the data renders as intended.
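As a small illustration of an application choosing the Content-type itself, here is a minimal WSGI sketch in Python. The /about URL and the page body are made up for the example; the point is only that the URL has no extension and the header alone tells the browser it is HTML.

```python
def app(environ, start_response):
    """WSGI app: the URL has no extension; the Content-Type header names the format."""
    path = environ.get("PATH_INFO", "/")
    if path == "/about":  # hypothetical extension-less URL
        status = "200 OK"
        body = b"<html><body><h1>About</h1></body></html>"
        headers = [("Content-Type", "text/html; charset=utf-8")]
    else:
        status = "404 Not Found"
        body = b"Not found"
        headers = [("Content-Type", "text/plain; charset=utf-8")]
    headers.append(("Content-Length", str(len(body))))
    start_response(status, headers)
    return [body]
```

To try it locally, the standard library's wsgiref.simple_server.make_server("127.0.0.1", 8000, app).serve_forever() will serve it, and /about renders as an HTML page.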
Websites that run on Apache are probably using mod_rewrite, which lets them rewrite URLs (and make them more user- and SEO-friendly).
You can read more here http://httpd.apache.org/docs/2.0/mod/mod_rewrite.html
and here http://www.sitepoint.com/article/apache-mod_rewrite-examples/
EDIT: There are rewriting modules for IIS as well.
Traditionally the file extension represents the file that is being served.
For example
Later, that same approach was used to let a script process a parameter.
In this case the file was a PHP script that processed the "request" and returned a dynamically created file.
Nowadays, applications are much more complex than that (Amazon, which you mentioned, being one).
There is no longer a single script that handles the request, but a much more complex app with several files, methods, functions, objects, etc., and the URL is more like the entry point into a web application (it may have a script behind it, but that's another matter). So web apps like Amazon, and yes, Stack Overflow, don't show a file in the URL; anything coming in is processed by the app on the server side.
Here, questions represents the web app and 322747 the parameter.
I hope this little explanation helps you to understand better all the other answers.
Well, how about having an index.html file in the directory and then typing the path into the browser? I see that my Firefox and IE7 both add the trailing slash automatically; I don't have to type it. This is more suited to people like me who do not think every single URL on earth should invoke PHP, Perl, CGI and 10,000 other applications just to send a few kilobytes of data.
A lot of people are using a more "RESTful" type of architecture... or at least, REST-looking URLs.
This site (StackOverflow) doesn't show a file extension... it's using ASP.NET MVC.
Depending on the settings of your server you can use (or not) any extension you want. You could even set extensions to be ".JamesRocks" but it won't be very helpful :)
Anyway, just in case you're new to web programming: all that gibberish on the end there is a set of arguments to a GET operation, not the page's extension.
A number of posts have mentioned this, and I'll weigh in. It absolutely is a URL rewriting system, and a number of platforms have ways to implement this.
I've worked for a few larger ecommerce sites, and it is now a very important part of the web presence, and offers a number of advantages.
I would recommend taking the technology you want to work with and researching samples of the URL rewriting mechanism for that platform. For .NET, for example, google 'asp.net url rewriting' or use a framework like ASP.NET MVC, which provides this functionality out of the box.
In Django (a web application framework for Python), you design the URLs yourself, independent of any file name, or even any path on the server for that matter.
You just say something like "I want /news/<number>/ urls to be handled by this function"
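The idea of "these URLs are handled by this function" can be sketched in plain Python as below. This is not Django's API, just a toy dispatcher written for illustration; the names route, dispatch and news_item are made up.

```python
import re

ROUTES = []  # list of (compiled pattern, handler) pairs


def route(pattern):
    """Register the decorated function as the handler for URLs matching pattern."""
    def register(handler):
        ROUTES.append((re.compile(pattern), handler))
        return handler
    return register


@route(r"^/news/(?P<number>\d+)/$")
def news_item(number):
    # The captured URL segment arrives as a string; convert it before use.
    return f"news item {int(number)}"


def dispatch(path):
    """Call the first handler whose pattern matches the path."""
    for pattern, handler in ROUTES:
        match = pattern.match(path)
        if match:
            return handler(**match.groupdict())
    return "404 Not Found"
```

Here dispatch("/news/42/") ends up in news_item, and no file called news or 42 exists anywhere on the server.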
