Do I have to have language in URL? [Codeigniter] - codeigniter

I am making a multi-language site. I have seen that some websites have URLs with the language in them, like so:
http://example.com/en/homepage
I hear it is important for SEO, but I was wondering, doesn't that make it more complicated in terms of routing, URI, controllers, rather than just having a session/cookie that holds the desired language?
What are pluses and minuses of each way and which way should I go?
thank you

You could add some lines to your route config and to your core to do what you want.
Here are two links with a lot of information to implement this: http://codeigniter.com/wiki/URI_Language_Identifier/
http://sumonbd.wordpress.com/2009/09/16/develop-multilingual-site-using-codeigniter-i18n-library/
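As a rough, framework-neutral illustration of what such a routing hook has to do (this is a Python sketch, not CodeIgniter code; the language set, the default language, and the function name are all assumptions made for the example), the language segment can be split off the path before normal routing takes over:

```python
# Hypothetical configuration: which language prefixes the site accepts.
SUPPORTED_LANGUAGES = {"en", "de", "es"}
DEFAULT_LANGUAGE = "en"


def split_language(path):
    """Return (language, remaining_path) for a request path.

    If the first path segment is a supported language code it is removed,
    so the rest of the application can route "/homepage" as usual.
    """
    segments = [segment for segment in path.split("/") if segment]
    if segments and segments[0].lower() in SUPPORTED_LANGUAGES:
        return segments[0].lower(), "/" + "/".join(segments[1:])
    # No language prefix: fall back to the default language.
    return DEFAULT_LANGUAGE, "/" + "/".join(segments)


if __name__ == "__main__":
    print(split_language("/en/homepage"))
    print(split_language("/homepage"))
```

With this approach the controllers never see the language segment, which keeps the extra routing complexity in one place.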

Related

Internationalizing Title/Meta Tags ok or bad practice?

Is there a problem if I have both English and Chinese versions of the same title/meta tags under the same exact url? I detect the language the user has set for the browser (through the http header "accept-language" field) and change the titles/meta tags based on the language set. I get a large percentage of my traffic from China and felt this was a better-localized user experience for those users BUT I have no idea how Google would view this. My gut feeling tells me that this is not good for SEO.
Baidu.com, a major Chinese search engine, does in fact pick up my translated tags; however, for other US-based sites it does not translate their English title/meta tags into Chinese. I would think Chinese users are less likely to click on those.
Creating subdomains and/or separate domains for other countries is not an option at this point. That being said, should I only have one language (English) for my title/meta tags to avoid any search engine issues?
Thanks for any advice / wisdom you can offer. Really hoping to get clarity on best practices.
Thanks all!
Yes, it probably is a problem. Search engines see mixed language content. You are not describing how you “detect and change the titles/meta tags based on the users browser language”, but you are probably doing it client-side and using “browser language”, which is wrong whatever it means in detail (it does not specify the user’s preferred language).
To get a more targeted answer, ask a more real question, with a URL.
If you want to get search traffic from search engines in both English and Chinese, you should have two urls instead of one.
When googlebot crawls a page, it does not even send the "Accept-Language" header. You have to send it your default language. When there is one url, there is no way for you to have your second language indexed. You won't be ranked in search engines in multiple languages.
For best SEO, use separate top level domains, subdomains, or folders for different languages.
http://example.de/
http://example.es/
http://example.com/
http://de.example.com/
http://es.example.com/
http://www.example.com/
http://example.com/de/
http://example.com/es/
http://example.com/en/
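To see why a single URL only ever exposes one language to a crawler that sends no Accept-Language header, here is a minimal Python sketch of header-based selection. The function name, the supported languages, and the default are assumptions for illustration, and the parsing is deliberately simplified compared with the full HTTP rules:

```python
def pick_language(accept_language, supported=("en", "zh"), default="en"):
    """Pick a supported language from an Accept-Language header value.

    Returns `default` when the header is missing or names no supported
    language, which is the case for a client that omits the header.
    """
    if not accept_language:
        return default
    candidates = []
    for part in accept_language.split(","):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        quality = 1.0  # a tag without an explicit q-value counts as 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        candidates.append((quality, tag.strip().lower()))
    # Highest quality first; the sort is stable, so ties keep header order.
    candidates.sort(key=lambda candidate: candidate[0], reverse=True)
    for quality, tag in candidates:
        primary = tag.split("-")[0]  # "zh-CN" -> "zh"
        if quality > 0 and primary in supported:
            return primary
    return default


if __name__ == "__main__":
    print(pick_language("zh-CN,zh;q=0.9,en;q=0.8"))
    print(pick_language(None))
```

Whatever the header-less request receives is the only version that can be indexed under that URL, which is the argument for giving each language its own URL.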
I think there is no problem when you use English and Chinese in the same meta tags.

ASP MVC3 localizing URL query string parameters

I already know how to localize an ASP.NET MVC3 URL (using this technique).
This solution is very elegant and I have already managed to tweak it to my needs.
But now I have this small (or rather huge) problem:
how is it possible to have localized URL query parameters?
For example, how is it possible to have this (US) English version:
http://www.mysite.com/en-US/Classifieds/Search?ZipCode=92274
German (DE) version:
http://www.mysite.com/de-DE/Anzeigen/Suche?Postleitzahl=71710
Spanish (ES) version:
http://www.mysite.com/es-ES/Clasificados/Busqueda?Codigo_postal=08110
See the localized query parameter names? This is what I'm looking for!
Thanks in advance
PS. I need this because I think it will give much better SEO rankings. Is there anyone who can confirm this?
I can think of several ways of doing what you need. You may want to try creating your own HTML helper for building localized links. That could include translation logic based on a DB table (baseName, Culture, translation). Once you've got this in place, you could either refer to the Request object and get the parameter by index, or create logic to translate back (again based on your table) to the base name.
Regarding your SEO question: I only know that MVC rewriting logic and the 'friendliness' of the links are based on the fact that static-looking links are crawled faster than dynamic ones. So that's something to consider on your site (http://www.seo-consultant-services.co.uk/static-html-vs-dynamic-urls.html). I'm not an expert, but I would guess that translating your URL parameters makes sense if you expect users to search for services like this, for example 'near ZipCode 92274' (I may be wrong).
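The translation-table idea can be sketched roughly as follows. This is Python rather than ASP.NET MVC, purely to show the lookup logic; the in-memory dictionary stands in for the (baseName, Culture, translation) table, and both function names are hypothetical:

```python
from urllib.parse import urlencode

# Stand-in for the (baseName, Culture, translation) table.
TRANSLATIONS = {
    ("ZipCode", "en-US"): "ZipCode",
    ("ZipCode", "de-DE"): "Postleitzahl",
    ("ZipCode", "es-ES"): "Codigo_postal",
}


def localize_params(params, culture):
    """Translate base parameter names into the names used for `culture`."""
    return {
        TRANSLATIONS.get((name, culture), name): value
        for name, value in params.items()
    }


def to_base_params(params, culture):
    """Translate localized parameter names back to their base names."""
    reverse = {
        local: base
        for (base, table_culture), local in TRANSLATIONS.items()
        if table_culture == culture
    }
    return {reverse.get(name, name): value for name, value in params.items()}


if __name__ == "__main__":
    query = urlencode(localize_params({"ZipCode": "71710"}, "de-DE"))
    print("/de-DE/Anzeigen/Suche?" + query)
```

The link helper would call the first function when rendering URLs, and the request handling would call the second before binding parameters, so the rest of the application only ever sees base names.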

What is a good approach for extracting keywords from user-submitted text?

I'm building a site that allows users to make sense of a debate by graphically representing arguments for and against a particular issue. (Wrangl)
I'd like to categorise these debates so they are more easily found and connected. I don't want to irritate the person creating the debate by asking them to add tags and categories before they see any benefit, so I'm looking at a way of automatically extracting keywords.
What's a good approach for taking the debate's title and description (and possibly the content of the arguments themselves once there are some) to pull out, say, ten strong keywords that could be used as metadata to connect similar debates together, or even as the content of the "meta" keywords tag in the head of the HTML page where the debate is viewable. Eg. Datamapper vs ActiveRecord
The site is coded in Ruby with Sinatra, using DataMapper for data storage. I'm ideally looking for something which will work on Heroku (I don't have a way of writing files to disk dynamically), and I'd consider a web service, an API or ideally a Ruby gem.
Maybe you can use TextAnalyzer.
I understand that you want to find an easy way of achieving this. I've recently dived into the world of NLP (Natural Language Processing) and text mining, and it's a daunting field, most of which went far above my head.
I did manage to code some functionality that resembles what you're looking for, though I did it in PHP. If you want it tailored to your project (Wrangl), I would suggest doing it yourself.
You could use the Porter stemming algorithm, for which I'm sure there is Ruby code.
Ruby Porter stemmer
You can try the salsaAPI to automatically extract keywords and categorize the debates!
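For a sense of what a do-it-yourself approach involves, here is a minimal Python sketch (the site itself is Ruby, so treat this as pseudocode for the idea). It counts word frequencies after removing stop words; the stop-word list is a tiny made-up sample, and `crude_stem` is only a placeholder, not the Porter algorithm:

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a real one would be much longer.
STOP_WORDS = {
    "a", "an", "and", "are", "as", "be", "by", "for", "in", "is", "it",
    "of", "on", "or", "that", "the", "this", "to", "vs", "with",
}


def crude_stem(word):
    """Placeholder stemmer: strips a plural "s". NOT the Porter algorithm."""
    if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        return word[:-1]
    return word


def extract_keywords(text, limit=10):
    """Return up to `limit` of the most frequent stemmed, non-stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    stems = [
        crude_stem(word)
        for word in words
        if word not in STOP_WORDS and len(word) > 2
    ]
    return [stem for stem, _count in Counter(stems).most_common(limit)]


if __name__ == "__main__":
    print(extract_keywords("Datamapper vs ActiveRecord: is Datamapper faster?"))
```

Everything runs in memory, so nothing needs to be written to disk; swapping in a real stemmer and weighting title words more heavily would be natural next steps.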

Why not use HTML tags in websites' text editors?

I may need to implement this sometime in the future, but I think the trigger for the question now is mainly curiosity.
I was thinking about how to write a text editor for a website I'll build soon, and saw this site's (and others') way, so I thought: isn't it a bit too complicated? If tags have to be used in the first place, why not let users use HTML tags? The only reason I can think of is HTML injection, which I don't know much about, but it sounds like an easy issue to solve, doesn't it?
Thank you.
Simply because not all of your users will know HTML. *bold text* is a lot easier to understand (and read in its raw form) than <b>bold text</b>. Especially if you get into links.
The reason we use Markdown, Textile and the rest is to provide a nice alternative that's accessible to more users.
Of course you can still provide the ability to use HTML to your users (it's in the Markdown spec) but you'll have to do a lot of checking to make sure there's nothing malicious going on - for example, blocking <script>, <iframe>, large images, javascript in the form <a href="javascript:alert("...");"> etc.
There are several reasons why you should not use HTML tags in such an editor:
1) It might be less complex for the user if you introduce your own reduced tag set
2) HTML Injection: There is a big risk of dangerous HTML code getting injected.
If you really want to allow HTML code you have to be very careful.
Historically, systems like BBCode were designed to limit available formatting elements to things that would not break the layout of the site, but now, with more mature and smarter HTML parsers, it's not necessary to invent a new markup language just to bar certain un-safe HTML tags.
The current main reason I've seen is that HTML is foreign to most users, and the HTML substitutes are aimed at providing a simplified version of the formatting directives an every-day user would need.
HTML script injection is most emphatically not an easy problem to solve. HTML is a fairly complicated, non-regular language - detecting all possible vulnerabilities is a really hard problem. Many sites have tried, and failed. It's easier, from a vulnerability-prevention POV, to just prohibit HTML entirely, or allow only a small subset of tags.
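The "small subset of tags" option can be sketched with Python's standard-library HTML parser, as below. The allowed-tag set and the names are assumptions for the example, and this is an illustration of the allow-list idea rather than a vetted sanitizer; a production site would be better served by a maintained sanitization library:

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allow-list: only these tags survive, and without attributes.
ALLOWED_TAGS = {"b", "i", "em", "strong", "p"}


class AllowListSanitizer(HTMLParser):
    """Rebuilds markup, keeping allow-listed tags and escaping all text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED_TAGS:
            # Attributes are dropped entirely (no onclick, href, style, ...).
            self.parts.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        # All text is re-escaped, so it can never be read back as markup.
        self.parts.append(escape(data))


def sanitize(markup):
    parser = AllowListSanitizer()
    parser.feed(markup)
    parser.close()
    return "".join(parser.parts)


if __name__ == "__main__":
    print(sanitize('<b onclick="x()">hi</b><script>alert(1)</script>'))
```

The key design choice is that output is rebuilt from scratch out of known-safe pieces instead of trying to detect and remove dangerous ones, which is the part that tends to fail.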

What are the url parameters naming convention or standards to follow

Are there any naming conventions or standards to follow for URL parameters? I generally use camel casing, like userId or itemNumber. As I am about to start a new project, I searched for anything on this and could not find it. I am not looking at this from the perspective of a language or framework, but more as a general web standard.
I recommend reading "Cool URIs don't change" by Tim Berners-Lee for an insight into this question. If you're using parameters in your URI, it might be better to rewrite them to reflect what the data actually means.
So instead of having the following:
/index.jsp?isbn=1234567890
/author-details.jsp?isbn=1234567890
/related.jsp?isbn=1234567890
You'd have
/isbn/1234567890/index
/isbn/1234567890/author-details
/isbn/1234567890/related
It creates a more obvious data structure, and means that if you change the platform architecture, your URIs don't change. Without the above structure,
/index.jsp?isbn=1234567890
becomes
/index.aspx?isbn=1234567890
which means all the links on your site are now broken.
In general, you should only use query strings when the user could reasonably expect the data they're retrieving to be generated, e.g. with a search. If you're using a query string to retrieve an unchanging resource from a database, then use URL-rewriting.
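A minimal Python sketch of the rewriting idea, assuming the three example views from above: the public path carries the ISBN and the view name, and a single pattern maps it to whatever handler the current platform uses. The pattern, the view names, and the `resolve` function are illustrative, not any particular framework's API:

```python
import re

# Public, platform-independent URL shape: /isbn/<isbn>/<view>
ROUTE = re.compile(
    r"^/isbn/(?P<isbn>\d+)/(?P<view>index|author-details|related)$"
)


def resolve(path):
    """Map a public path to (view_name, params), or None if it doesn't match."""
    match = ROUTE.match(path)
    if match is None:
        return None
    return match.group("view"), {"isbn": match.group("isbn")}


if __name__ == "__main__":
    print(resolve("/isbn/1234567890/related"))
    print(resolve("/index.jsp"))
```

If the backend later moves from .jsp to .aspx, only the code behind `resolve` changes; the published links stay the same.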
There are no standards that I'm aware of. Just be mindful of IE's URL length limit of 2,083 characters.
The standard for URIs is defined by RFC 3986 (which obsoleted RFC 2396).
Anything after the standardized portion of the URL is left to you.
You probably only want to follow a particular convention on your parameters based on the framework you use.
Most of the time you won't really care, because these are not under your control. When they are, you probably want to at least be consistent and try to generate user-friendly bits that:
are short,
are easy to remember, if they are meant to be directly accessible by users,
are case-insensitive (may be hard depending on the server OS),
follow some SEO guidelines and best practices, which may help you a lot.
I would say that cleanliness and user-friendliness are laudable goals to strive for when presenting URLs.
StackOverflow does a fairly good job of it.
I use lowercase. Depending on the technology you use, the query string is either treated as case-sensitive (e.g. PHP) or not (e.g. ASP). Using lowercase avoids possible confusion.
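If incoming links may already use mixed case, one option is to normalize parameter names at the edge of the application. A small Python sketch using only the standard library follows; the function name is made up, and lowercasing only the names (not the values) is an assumption about what you want:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def lowercase_query_keys(url):
    """Return `url` with every query parameter name lowercased.

    Values are left untouched, since they may legitimately be case-sensitive.
    """
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    query = urlencode([(name.lower(), value) for name, value in pairs])
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, query, parts.fragment)
    )


if __name__ == "__main__":
    print(lowercase_query_keys("http://example.com/search?UserId=42&ItemNumber=7"))
```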
Like the other answers I've not heard about any conventions.
The only "standard" I would adhere to is to use the more search engine friendly practice of using a URL rewriter.
There are no standards that I know of, and case shouldn't matter.
However within your application (website), you should stick to your own standards. For your own sanity if nothing else.
