Language system design

Language system design - codeigniter

I am about to implement a language system in my CodeIgniter project (following this tutorial: http://www.sitepoint.com/multi-language-support-in-codeigniter/), but I am a bit stuck on the design.
The website will contain a lot of text, so there would have to be a lot of individual language files like error_english.php, user_english.php, etc.
But I'm wondering, is that the right way to go? For example, what if a page already loads several language files because it has a lot of text, and I then need just one more word, something like "First name"?
That would mean I'd have to load user_english.php, which might contain hundreds of lines of text. Wouldn't that be a lot of loading just for one word? There would be so many unneeded array entries.
Does anyone know a good design/routing pattern to keep server load time and performance at their best?

Language packs are simply key-value pairs. Yes, you load thousands of lines if need be, and it works fine. One way to reduce that line count somewhat is to have different language pack files for different sections of the site.
So you could have an errors file and do:
$this->load->language('errors');
But really it doesn't make a practical difference for speed; it is an organisational thing.
I have extended mine to use field substitution. Language packs also provide a way to separate code from presentation. I always use them in models, even if I am only writing in English, because separating code logic from output makes the code easier to read.
So if you put this in your helpers:
function lang_sub($str, $params = '')
{
    $CI =& get_instance();
    $return_string = $str;
    if (array_key_exists($str, $CI->lang->language))
    {
        // $str is a language key, so look up its line.
        $return_string = $CI->lang->line($str);
        // If a parameter array is sent, substitute the %s placeholders in order.
        if (is_array($params))
        {
            $return_string = vsprintf($return_string, $params);
        }
    }
    return $return_string;
}
You can do something like this in your language entry:
$lang['login_message'] = "Welcome %s, good to see you again! We last saw you on %s.";
And in your model you can do something like:
$welcome_message = lang_sub('login_message', array($username, $last_seen));
Note that it is a good idea to think about naming and collisions of your entries before you start.
Good luck!
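
The same idea, one key-value file per site section plus positional substitution, can be sketched outside CodeIgniter. The following Python sketch is only an illustration: the class name LanguagePacks, the JSON file format and the section_language.json naming are assumptions made for the example, not part of CodeIgniter.

```python
import json
import tempfile
from pathlib import Path


class LanguagePacks:
    """Loads one key-value file per site section, each at most once (hypothetical helper)."""

    def __init__(self, base_dir, language):
        self.base_dir = Path(base_dir)
        self.language = language
        self.lines = {}       # merged key -> text, like one flat language array
        self.loaded = set()   # sections already read from disk

    def load(self, section):
        if section in self.loaded:
            return  # already in memory, so no second disk read
        path = self.base_dir / f"{section}_{self.language}.json"
        self.lines.update(json.loads(path.read_text(encoding="utf-8")))
        self.loaded.add(section)

    def line(self, key, params=None):
        if key not in self.lines:
            return key  # not a known key: return the string unchanged
        text = self.lines[key]
        # Substitute the %s placeholders in order when parameters are given.
        return text % tuple(params) if params else text


# Demo with a temporary "user" pack containing a single entry.
with tempfile.TemporaryDirectory() as tmp:
    pack = {"login_message": "Welcome %s, good to see you again! We last saw you on %s."}
    Path(tmp, "user_english.json").write_text(json.dumps(pack), encoding="utf-8")
    packs = LanguagePacks(tmp, "english")
    packs.load("user")
    packs.load("user")  # second call is a no-op
    message = packs.line("login_message", ["alice", "Monday"])
    untouched = packs.line("not_a_key")

print(message)
```

Each section is read once per request and then served from memory, which is why splitting files is mostly about organisation rather than speed.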

Related

Single or multiple translation.json files for i18n?

I'm working on a project in Aurelia and using the aurelia-i18n plugin. So far it looks great: translation is working, and the interface language updates instantly when I change locale.
Question: is there a logical, organizational or performance advantage to using multiple translation files vs. a single translation file? For instance:
Should I just put everything into one file?
my-aurelia/locales/en/translation.json
my-aurelia/locales/es/translation.json
Or should I separate into multiple translation files?
my-aurelia/locales/en/nav.json
my-aurelia/locales/en/words.json
my-aurelia/locales/en/phrases.json
my-aurelia/locales/es/nav.json
my-aurelia/locales/es/words.json
my-aurelia/locales/es/phrases.json
Here's how I have instantiated the plugin for this example (inside the export function configure(aurelia) { of my-aurelia/src/main.js), but I'm at an important design crossroads.
aurelia.use.plugin('aurelia-i18n', (instance) => {
  // register backend plugin
  instance.i18next.use(XHR);
  // adapt options to your needs (see http://i18next.com/docs/options/)
  instance.setup({
    backend: {
      loadPath: '/locales/{{lng}}/{{ns}}.json',
    },
    lng: 'es',
    ns: ['words', 'phrases', 'nav'],
    defaultNS: 'words',
    attributes: ['t', 'i18n'],
    fallbackLng: 'en',
    debug: false
  });
});
One json language file or multiple json language files? Any additional advice?

Performance-wise, a single file per language will be slightly faster on the initial load because fewer requests are necessary. However, this micro-optimization will be negligible, and you should put more value towards code structure and readability, especially for other people working on the code after you.
Will a single file become so large, that it will be hard for people to find the right entry, and change the content of the JSON file? If not, and you do not expect it to grow to such a size, you're probably best off using a single file.
Will people wonder if you put "Gracias/Thank You" in words (thanks) or phrases (thank you)? I recommend using a structure which is clear for someone who is not familiar with your code.
Lastly, one organization structure I have not seen you mention, but which I have used myself, is to split i18n files by view. This makes it easy to find the file which needs to be changed: you already know which view you're working on, so you don't have to search for the right i18n file.
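
Whichever file layout is chosen, a lookup boils down to "namespace plus key, with a fallback language". The Python sketch below illustrates that resolution order using the ns, defaultNS and fallbackLng ideas from the question's configuration. It is not i18next's implementation; the translate function and the in-memory catalogs are assumptions for the example.

```python
def translate(catalogs, key, lng, fallback_lng="en", default_ns="words"):
    """Resolve 'ns:key' (or a bare 'key') against catalogs[language][namespace]."""
    ns, _, name = key.rpartition(":")
    ns = ns or default_ns  # a bare key uses the default namespace
    for language in (lng, fallback_lng):
        value = catalogs.get(language, {}).get(ns, {}).get(name)
        if value is not None:
            return value
    return name  # nothing found anywhere: show the key itself


# One dict per language and namespace, as if loaded from
# locales/<lng>/<ns>.json (contents invented for the example).
catalogs = {
    "en": {"nav": {"home": "Home"}, "words": {"thanks": "Thanks"}},
    "es": {"nav": {"home": "Inicio"}, "words": {}},
}

home = translate(catalogs, "nav:home", "es")       # found in es/nav
thanks = translate(catalogs, "thanks", "es")       # falls back to en/words
missing = translate(catalogs, "nav:missing", "es") # not found in either language
print(home, thanks, missing)
```

With one file per view, the namespace would simply be the view name, so the file to edit is obvious from the key.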

Internationalizing content-heavy pages into message files seems cumbersome in Play 2?

Internationalization in Play 2 can be done with Messages.get("home.title") and language files. What about when you internationalize a page full of textual content and not just one specific header or link?
For example, writing a message file for a long page presenting, e.g., product info:
_First header_
Some paragraphs of text
...
_Tenth header_
Tenth paragraph and more text*
Messagefile
a)
product.info = "<many paragraphs of text including headers>"
or splitting one page into html elements
b)
product.info.h1 = "<first header>"
product.info.p1 = "<first para>"
product.info.p2 = "<2nd para>"
To me, neither solution sounds right. In the first, having a vast value for a single key seems like bad convention, and in the latter, separating a single page into dozens of keys doesn't sound good either.
Big websites often follow the convention www.site.com/en-us/product/1, with the language in the URL. So the question is: how do I do it this way, and is it a better way at all? I could easily end up not just translating into a dozen languages but also making layout changes a dozen times.
I could use global code snippets backed by the message file for elements that have little text and don't change often, e.g. navigation in /view/global/header/somenavbar.scala.html, but then I just end up with a complex folder structure.
Is there another way, a best practice, for internationalization in Play 2 besides message files?

Take a look at Joscha Feth's solution in the play_authenticate Java sample.
There are email templates in 3 languages for email confirmation, password resetting, etc.
The template for each 'type' of email and each language is kept in a single file, i.e.:
_password_reset_en.scala.html
_password_reset_de.scala.html
_password_reset_pl.scala.html
_verify_email_en... etc
And for each 'type' there is a 'parent' template, which contains a condition (a common Scala match; check the Tags section of the template docs) that returns the rendered view depending on the detected language:
password_reset.scala.html
Finally, yes, at the beginning I also thought it was some kind of madness, but believe me, the technique can be useful. There's room for further improvement, I think. Maybe it would be better to move the language condition to the controller. I think that depends on many factors, and it would be great if you could find the time to investigate this topic.
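
The parent-template idea is a dispatch on (type, language) with a default. Here is a minimal Python sketch of that dispatch, not the play_authenticate code itself; the TEMPLATES table, the render function and the one-line template texts are assumptions standing in for the per-language .scala.html files.

```python
# One entry per email type and language, standing in for files such as
# _password_reset_en.scala.html (the texts are invented for the example).
TEMPLATES = {
    ("password_reset", "en"): "Reset your password: {link}",
    ("password_reset", "de"): "Setzen Sie Ihr Passwort zurück: {link}",
    ("password_reset", "pl"): "Zresetuj hasło: {link}",
}


def render(kind, lang, default_lang="en", **values):
    """Play the role of the 'parent' template: pick the per-language version."""
    template = TEMPLATES.get((kind, lang))
    if template is None:
        # No version for the detected language: use the default one.
        template = TEMPLATES[(kind, default_lang)]
    return template.format(**values)


german = render("password_reset", "de", link="https://example.com/r/1")
fallback = render("password_reset", "fr", link="https://example.com/r/1")
print(german)
print(fallback)
```

Moving this choice into the controller, as suggested above, would mean calling something like render from the controller instead of from a parent template.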

Formatting Issue when Using Manipulate to display Lists

Please consider the following problem.
I'm writing a quick Manipulate[] program to display a ton of information, but am running into a problem with Unicode. Here is what I currently have as input and output:
Manipulate[
request = filenumber <> "*";
filenames = FileNames[request];
display = Import[type, "List"];
Short[display, 25]
, {filenumber, "001", InputField}, {type, filenames, PopupMenu}]
The problem is that the French-language accents are showing up oddly. The quick workaround I thought of was to change my code to Import[type,"Plaintext"]; which works, but then displays the information in list form, like so:
What would you suggest as a way to get the clarity of the second example with the straightforward list format of the former? So that it wraps on the line rather than having a line break after each entry.
As an aside - probably just as important as the actual question itself - could anybody explain the rationale behind why importing as a "List" distorts the unicode? I've had a lot of trouble working around this, and understanding the underlying behaviour might help me move forward quicker.

Although Import does not have options associated with itself, it takes options relevant to the format being imported. Specifically see the Options section of ref/Format/List for the list of options.
In the case at hand, you can indicate the file encoding with CharacterEncoding->"UTF8":
Import[filename, "List", CharacterEncoding -> "UTF8"]
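
As for why accents come out distorted, a likely explanation (an assumption here, since the thread does not show the files) is that the files are UTF-8 but are being decoded with a single-byte encoding. The effect is easy to reproduce in Python, where Latin-1 stands in for whatever single-byte encoding is actually applied:

```python
text = "café, hôtel"
raw = text.encode("utf-8")      # the bytes as stored in a UTF-8 file

wrong = raw.decode("latin-1")   # every byte read as one character
right = raw.decode("utf-8")     # multi-byte sequences read correctly

print(wrong)
print(right)
```

Each accented letter occupies two bytes in UTF-8, so a single-byte decoder shows two odd characters in its place. Stating the encoding explicitly, as with the CharacterEncoding option above, avoids the guess.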

Why do people use plain english as translation placeholders?

This may be a stupid question, but here goes.
I've seen several projects using some translation library (e.g. gettext) working with plain english placeholders. So for example:
_("Please enter your name");
instead of abstract placeholders (which has always been my instinctive preference)
_("error_please_enter_name");
I have seen various recommendations on SO to work with the former method, but I don't understand why. What I don't get is what do you do if you need to change the english wording? Because if the actual text is used as the key for all existing translations, you would have to edit all the translations, too, and change each key. Or don't you?
Isn't that awfully cumbersome? Why is this the industry standard?
It's definitely not proper normalization to do it this way. Are there massive advantages to this method that I'm not seeing?

Yes, you have to alter the existing translation files, and that is a good thing.
If you change the English wording, the translations probably need to change, too. Even if they don't, you need someone who speaks the other language to check.
You prep a new version, and part of the QA process is checking the translations. If the English wording changed and nobody checked the translation, it'll stick out like a sore thumb and it'll get fixed.

The main language is already existent: you don't need to translate it.
Translators have better context with a real sentence than vague placeholders.
The placeholders are just the keys; it's still possible to change the original-language wording by creating a translation for it, because when a translation doesn't exist, the placeholder is used as the translated text.

We've been using abstract placeholders for a while and it was pretty annoying having to write everything twice when creating a new function. When English is the placeholder, you just write the code in English, you have meaningful output from the start and don't have to think about naming placeholders.
So my reason would be less work for the developers.

I like your second approach. When translating texts you always have the problem of homonyms. For example, 'open' can mean the state of a window but also the verb for performing the action. In other languages these homonyms may not exist. That's why you should be able to add meaning to your placeholders. The best approach is to put this meaning in your text library. If this is not possible on the platform or framework you use, it might be a good idea to define a 'development language'. This language adds meaning to the text entries, like 'action_open' and 'state_open'. You will of course have to put extra effort into translating this language to plain English (or the language you develop for). I have applied this philosophy in some large projects, and in the long run it saves time (and headaches).
The best way in my opinion is keeping meaning separate, so if you develop your own translation library, or the one you use supports it, you can do something like this:
_(i18n("Please enter your name", "error_please_enter_name"));
Where:
i18n(text, meaning)
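
A minimal Python sketch of such an i18n(text, meaning) lookup follows. The catalog layout and the German strings are assumptions for the example. gettext itself offers a comparable feature called message contexts (msgctxt), which could be used instead of a home-grown function.

```python
def i18n(catalog, text, meaning=None):
    """Look up text under an optional meaning; fall back to the text itself."""
    return catalog.get((meaning, text), text)


# The same English word needs two different German translations.
german = {
    ("action_open", "Open"): "Öffnen",     # verb: perform the action
    ("state_open", "Open"): "Geöffnet",    # adjective: state of a window
}

button_label = i18n(german, "Open", "action_open")
status_label = i18n(german, "Open", "state_open")
untranslated = i18n(german, "Please enter your name", "error_please_enter_name")
print(button_label, status_label, untranslated)
```

The source text stays readable in the code, and the meaning is only consulted when the translator needs to tell homonyms apart.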

Interesting question. I assume the main reason is that you don't have to care about translation or localization files during development as the main language is in the code itself.

Well it probably is just that it's easier to read, and so easier to translate. I'm of the opinion that your way is best for scalability, but it does just require that extra bit of effort, which some developers might not consider worth it... and for some projects, it probably isn't.

There's a fallback hierarchy, from most specific locale to the unlocalised version in the source code.
So French in France might have the following fallback route:
1. fr_FR
2. fr
3. Unlocalised. Source code.
As a result, having proper English sentences in the source code ensures that if a particular translation is not provided in step (1) or (2), you will at least get a proper, understandable sentence rather than random programmer garbage like “error_file_not_found”.
Plus, what do you do if it is a format string: “Sorry but the %s does not exist” ? Worse still: “Written %s entries to %s, total size: %d” ?
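
The fallback route can be sketched in a few lines of Python. This is an illustration of the lookup order only, not gettext's implementation; the localize function, the in-memory catalogs and the French string are assumptions for the example.

```python
def fallback_chain(locale):
    """'fr_FR' gives ['fr_FR', 'fr']: most specific locale first."""
    parts = locale.split("_")
    return ["_".join(parts[:i]) for i in range(len(parts), 0, -1)]


def localize(catalogs, locale, source_text, *args):
    """Try each locale in the chain, then fall back to the source text."""
    text = source_text
    for name in fallback_chain(locale):
        found = catalogs.get(name, {}).get(source_text)
        if found is not None:
            text = found
            break
    # Format strings still work when no translation exists.
    return text % args if args else text


# Only a generic French catalog exists; there is no fr_FR catalog.
catalogs = {"fr": {"Sorry but the %s does not exist": "Désolé, mais %s n'existe pas"}}

translated = localize(catalogs, "fr_FR", "Sorry but the %s does not exist", "report.txt")
unlocalised = localize(
    catalogs, "fr_FR", "Written %s entries to %s, total size: %d", 3, "out.log", 42
)
print(translated)
print(unlocalised)
```

With an abstract key as the source text, the last step would have nothing to format and the user would see the raw key.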

Quite old question but one additional reason I haven't seen in the answers yet:
You could end up with more placeholders than necessary, thus more work for translators and possible inconsistent translations. However, good editors like Poedit or Gtranslator can probably help with that.
To stick with your example:
The text "Please enter your name" could appear in a different context in a different template (that the developer is most likely not aware of and shouldn't need to be). E.g. it could be used not as an error but as a prompt like a placeholder of an input field.
If you use
_("Please enter your name");
it would be reusable, the developer can be unaware of the already existing key for an error message and would just use the same text intuitively.
However, if you used
_("error_please_enter_name");
in a previous template, developers wouldn't necessarily be aware of it and would make up a second key (most likely according to a predefined wording scheme to not end up in complete chaos), e.g.
_("prompt_please_enter_name");
which then has to be translated again.
So I think that doesn't scale very well. A pre-agreed wording scheme of suffixes/prefixes e.g. for contexts can never be as precise as the text itself I think (either too verbose or too general, beforehand you don't know and afterwards it's difficult to change) and is more work for the developer that's not worth it IMHO.
Does anybody agree/disagree?

Organize pictures on the website

I am designing a website which will involve a lot of photos.
There are two modules, Restaurants and Dishes. Which is the best way to create the directory structure?
images/Restaurants/ID
images/Dishes/ID
I am using the following to create the filename:
function imgName($imgExtension)
{
    return time() . substr(md5(microtime()), 0, 12) . "." . $imgExtension;
}
I want two different-sized thumbnails. Which is the best way to name the thumbnails, given that the db will hold only the main picture's filename with extension?

I wouldn't worry too much about the directory structure; what you have seems good. Even better might be to use S3 buckets.
As for the filename part, I've found the simplest way is to prepend thumb_ to the thumb filename. So: somefilename.jpg -> thumb_somefilename.jpg
This way you can store somefilename.jpg in the database and simply add the extra part to the front when you want the thumb.

Is there any reason for randomising the filenames like this? It's my personal opinion that you shouldn't be giving them random names in the first place, they should relate to what's actually in the picture - simply because it's nicer for users and it's more meaningful to search engines.
In a perfect world you'd have a logical structure like
/images/dishes/moules-de-mariniere.jpg
where the dish / restaurant name is a unique slug. That's pretty much impossible to implement in the real world, though, so
/images/dishes/id/moules-de-mariniere.jpg
is a fair compromise to avoid collisions.
Thumbs I generally put under their own thumbs/ directory so I can use the same filenames in all locations (laziness more than anything):
/images/dishes/id/thumbs/moules-de-mariniere.jpg
but thenduks' prepending suggestion works too, it's really just personal preference.
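
Both conventions let the thumbnail name be derived from the one filename stored in the database. A small Python sketch of each follows, extended to the two sizes asked about. The size labels "small" and "large" and the function names are assumptions for the example.

```python
from pathlib import PurePosixPath


def thumb_prefixed(filename, size):
    """Prefix convention: somefilename.jpg becomes thumb_<size>_somefilename.jpg."""
    return f"thumb_{size}_{filename}"


def thumb_in_subdir(image_path, size):
    """Directory convention: same filename under thumbs/<size>/ next to the original."""
    path = PurePosixPath(image_path)
    return str(path.parent / "thumbs" / size / path.name)


prefixed = thumb_prefixed("somefilename.jpg", "small")
nested = thumb_in_subdir("/images/dishes/42/moules-de-mariniere.jpg", "large")
print(prefixed)
print(nested)
```

Either way the database keeps only the main picture's filename, and adding a third size later needs no schema change.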
