Do I need English beside Base localization which would contain the exact same 'translation'? - xcode

I'd expect the base file to contain my English words since my project has "Localization native development region" set to English.
Update - to clarify my question:

Apart from the question of which language end users will see, you also need to consider what will be shown in the App Store.
My current experience is that if you use Base for English, English won't appear in the list of supported languages in the description of your app (how would Apple know which language your Base localization is in?).
I ran into this issue myself with Base (English), German and Russian.
Target settings refer to:
Localization native development region = en
But on the App Store it appears in this form:
Languages: German, Russian
no reference to English
I am considering duplicating the Base localization into English (not a high priority, since users can see from the screenshots that the app works in English anyway).
Edit: there seems to be different behavior in iOS 8: the application settings (Settings.bundle) seem to ignore the Base translation if any of the translations matches your "Preferred Language Order".
In other words, the app is localized in Base, German and Russian.
The iPhone is configured to use English; the preferred language order is English, German, Russian.
The application settings show up in ... German!
Once again: this applies to Settings only, not to the application itself!

Although I am not entirely sure I understand this correctly, I will try to answer your question to the best of my knowledge.
Suppose you're using NSLocalizedString(key, comment) in your code. You can clearly see that the first argument is a key for a string, rather than the translated (or to-be-translated) string itself. Therefore, when you "write code" you don't actually write strings in the base language, or in any other language for that matter. You should think of it as adding string placeholders to your code.
Later on, you're supposed to create a Localizable.strings file for each language you would like to support, in the form key = value;. To make your UI appear in at least one human-readable language, you need at least one Localizable.strings file with proper string values for each placeholder key.
For example: if you had NSLocalizedString(@"ConfirmationButtonTitle", @"Yada yada") in your code, then it makes total sense to have a Localizable.strings file that contains a "ConfirmationButtonTitle" = "Tap here to confirm"; entry. If you don't create a Localizable.strings file, or no Localizable.strings file contains the ConfirmationButtonTitle key, then the button title falls back to ConfirmationButtonTitle, since that is the name of the placeholder key.
Having said that, most people prefer naming their keys exactly as the string values, for various reasons. This is arguably a convenient, and very common, practice, but it can cause confusion.
So, if you were to write the previous NSLocalizedString example as NSLocalizedString(@"Tap here to confirm", @"Yada yada") instead, then your default/base Localizable.strings file would probably contain an entry like "Tap here to confirm" = "Tap here to confirm";.
What happens here isn’t that you’re repeating yourself, but instead you’re naming your key exactly as your base language’s string value, that’s all.
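The key-then-fallback behaviour described above can be modelled in a few lines. This is only a Python sketch of the idea, not Apple's implementation; localized_string and the strings dictionary are hypothetical stand-ins for NSLocalizedString and a Localizable.strings file.

```python
def localized_string(key, table):
    """Return the value stored for key, or the key itself when there is no entry."""
    return table.get(key, key)


# Hypothetical contents of one Localizable.strings file.
strings = {"ConfirmationButtonTitle": "Tap here to confirm"}

print(localized_string("ConfirmationButtonTitle", strings))
print(localized_string("CancelButtonTitle", strings))  # no entry, so the key comes back
```

Seen this way, a key that happens to equal the English text is just a key whose fallback is already readable.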
EDIT
There has always been a base language concept, but as I understand it Xcode 5 emphasizes this even more, which is good. If your base language is English, then you don't need a separate Localizable.strings file for English.

According to the documentation (scroll down to Creating Strings Files for User-Facing Text in Your Code), you shouldn't add Localizable.strings to the Base localization. Even if your development language is English, create a separate folder and Localizable.strings for English. Create others for each additional language you want to add.
Further reading
Managing Strings Files Yourself
Localizing Your App
Internationalizing the User Interface
iOS Localization Tutorial
Working with Localization in iOS 8 and Xcode 6
What’s new in localization in Xcode 9 and iOS 11

Related

How to enable Latin names for places in some places instead of local ones in Carto Services?

Some places on the map are labeled with Cyrillic names, but I need only English/Latin names of places on the map; however, sometimes there are only local names. How can I implement this?
P.S.: I have spotted this issue on Belarusian and partly on Russian places.
About languages in general: in the end, it depends on which languages a specific place name is tagged with. OpenStreetMap always has a "local" variant in the local primary language, and CARTO Mobile SDK uses this by default, but the data also has other languages, so you can control it as follows.
CartoVectorTileLayer (both CartoOnlineVectorTileLayer and CartoOfflineVectorTileLayer are subclasses of it) has a method setLanguage(String) to select the language, so e.g.:
layer.setLanguage("en");
will give you English language maps.
With SDK 4.0.2 and the nutiteq.osm tile source you can use the following languages: local/default, en, es, de, fr, it, ru, zh (Chinese), tr (Turkish) and et (Estonian).
With the latest CARTO SDK 4.1.0 and the new carto.streets source you can use any OSM language. I would suggest configuring the map based on the device language settings, with something like:
// Android
layer.setLanguage(Locale.getDefault().getLanguage());
// iOS / Xamarin
layer.Language = Foundation.NSLocale.PreferredLanguages[0].Substring(0, 2);
What if a specific name is not available in the given language? Then the MapView falls back to the 'local' language by default, so the map will not be empty. But what if the 'local' language is still unreadable and you would prefer Latin-alphabet names? In SDK 4.1.0 you can configure primary and secondary fallback languages. For example, you set the primary language to 'de' for Germans; then, to avoid unfamiliar alphabets (say Hebrew, Greek, most of Asia), set 'en' as the primary fallback. The local name is then used only if both your primary and English names are missing:
layer.FallbackLanguage = "en";
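The lookup order described above can be modelled roughly as follows. This is only an illustrative Python sketch, not the SDK's actual implementation: pick_name is a hypothetical helper, the sample places are made-up data, and the tags follow the OpenStreetMap convention of name for the local name and name:xx for a language-specific one.

```python
def pick_name(tags, primary, fallback=None):
    """Try the primary language, then the fallback language, then the local name."""
    for lang in (primary, fallback):
        if lang and tags.get(f"name:{lang}"):
            return tags[f"name:{lang}"]
    return tags.get("name", "")


# Made-up sample data in the shape of OpenStreetMap name tags.
minsk = {"name": "Мінск", "name:en": "Minsk", "name:ru": "Минск"}
village = {"name": "Вёска"}  # only the local name is tagged

print(pick_name(minsk, "de", "en"))    # no German name, so English is used
print(pick_name(village, "de", "en"))  # nothing but the local name is available
```

The second call shows the remaining gap: when only a Cyrillic name is tagged, no fallback language can produce a Latin one.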
Now I know you want automatically transliterated / Romanized names, so that even if the source data from OpenStreetMap has names in Cyrillic only (Russia, Belarus etc.), the map would show them in Latin characters. This is not exactly the same as translation (e.g. Moscow would become Moskva with Romanization), but it can be helpful in many cases, in fact with all non-Latin scripts, especially from Asia (Chinese etc.). The problem here is that many languages, including Russian, have several competing Romanization rules, so even if we wanted to, we could not do it at the general SDK map rendering level. Our CARTO SDK could provide an API for the app to apply your preferred translation table, but we do not have this at the moment. The SDK is open source and you are welcome to provide a patch for the feature. I added an issue in the project for this: https://github.com/CartoDB/mobile-sdk/issues/147

How do I reverse engineer Mac OS X language localisation files for natural language learning?

OK, the goal of this question is not strictly programming related but it is a question programmers can answer using programming tools, and programmers may find useful answers here. Bear with me.
I find changing the system language in Mac OS X a useful way to augment my learning of natural languages, eg French. However sometimes I find a menu item or dialog box in French that I can't understand and it's a bore to google the translation or change the system language back to English. But I know that the English translation is hidden away somewhere in the localisation file and maps somehow to the French phrase. So what I want to do is extract all the text from all the localisation files to develop a mapping of this phrase in English = that phrase in French so I can look it up easily.
I know that the localisation files are stored in something like Localizable.strings, lproj files and nib files but I can't make head or tail of how they are stored or how to work with them. I can program but I've never written anything in Xcode. All the information I can find is for Mac OS / iOS programmers to localise their software, not for hackers to extract already made localisation information.
How can I extract the foreign language information as plain text from Mac OS X system and 3rd party software localisation files? Thanks!
Strings files are easy. They're simply dictionaries serialized as property lists. The dictionary keys are used by the program to look up the given string for a particular localization. You can build a mapping from English to another language by loading both dictionaries, iterating over the keys, and using the value from the English dictionary as the key in your output and the value from the other language dictionary as the value in your output.
NIBs are harder. The build process "compiles" NIB files into a form that's not conducive to editing or parsing. If you have access to uncompiled NIB files then you can use ibtool --export-strings-file to dump a strings file, which you could then process as described above. If you don't, then I think you may have a hard time.
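The strings-file part of this can be sketched in Python. The sketch assumes the files are in the textual "key" = "value"; form and have already been read into strings (escape sequences are left untouched); parse_strings and build_mapping are hypothetical helper names, and the sample entries are made up. Files stored as XML or binary property lists could instead be loaded with the standard plistlib.load.

```python
import re

# One "key" = "value"; pair, allowing backslash escapes inside the quotes.
PAIR = re.compile(r'"((?:[^"\\]|\\.)*)"\s*=\s*"((?:[^"\\]|\\.)*)"\s*;')


def parse_strings(text):
    """Parse textual .strings content into a dict; /* ... */ comments are dropped."""
    text = re.sub(r"/\*.*?\*/", "", text, flags=re.S)
    return dict(PAIR.findall(text))


def build_mapping(english, other):
    """Map each English value to the other language's value for the same key."""
    return {english[key]: other[key] for key in english if key in other}


en = parse_strings('/* File menu */\n"menu.open" = "Open";\n"menu.close" = "Close";')
fr = parse_strings('"menu.open" = "Ouvrir";\n"menu.close" = "Fermer";')
mapping = build_mapping(en, fr)
print(mapping)
```

Inverting the result (French value as key) gives the lookup direction the question asks for.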

How do you name i18n text?

When I work on a web application with my colleagues, the names of i18n texts are given quite freely.
It's as if each person has their own naming rule.
For example, we have the text "Create a new item", which is used for a link.
A names the key in the resources file CreateANewItem, which puts all the words together.
B prefers to name it CreateLinkText, which describes its usage in the application.
C, however, wants to use CreateItemText, which summarizes its literal meaning.
When a text is longer or contains a format with dynamic content, the naming varies a lot and agreement is hard to reach.
So I wonder whether there's a good naming rule or convention for the i18n text in different cases: short, long, with format, vulnerable to change, etc. Or how do you do this in your project? With this convention, maintenance can be easy and code is more readable.
Thanks a lot.
It's not something that I have seen so far in coding conventions. I guess it matters a lot less than other issues when it comes to coding standards. That said, I don't know what platform you are using, but if it's .NET there is a very short page on naming conventions for resource identifiers here: http://msdn.microsoft.com/en-us/library/vstudio/ms229037%28v=vs.100%29.aspx

Why do people use plain english as translation placeholders?

This may be a stupid question, but here goes.
I've seen several projects using some translation library (e.g. gettext) working with plain English placeholders. So for example:
_("Please enter your name");
instead of abstract placeholders (which has always been my instinctive preference)
_("error_please_enter_name");
I have seen various recommendations on SO to work with the former method, but I don't understand why. What I don't get is what you do if you need to change the English wording. Because if the actual text is used as the key for all existing translations, you would have to edit all the translations, too, and change each key. Or don't you?
Isn't that awfully cumbersome? Why is this the industry standard?
It's definitely not proper normalization to do it this way. Are there massive advantages to this method that I'm not seeing?
Yes, you have to alter the existing translation files, and that is a good thing.
If you change the English wording, the translations probably need to change, too. Even if they don't, you need someone who speaks the other language to check.
You prep a new version, and part of the QA process is checking the translations. If the English wording changed and nobody checked the translation, it'll stick out like a sore thumb and it'll get fixed.
The main language already exists: you don't need to translate it.
Translators have better context with a real sentence than with vague placeholders.
The placeholders are just the keys; it's still possible to change the original language by creating a translation for it, because when a translation doesn't exist, the placeholder is used as the translated text.
We've been using abstract placeholders for a while and it was pretty annoying having to write everything twice when creating a new function. When English is the placeholder, you just write the code in English, you have meaningful output from the start and don't have to think about naming placeholders.
So my reason would be less work for the developers.
I like your second approach. When translating texts you always have the problem of homonyms. For example, 'open' can mean the state of a window but also the verb for performing the action. In other languages these homonyms may not exist. That's why you should be able to add meaning to your placeholders. The best approach is to put this meaning in your text library. If this is not possible on the platform or framework you use, it might be a good idea to define a 'development language'. This language adds meaning to the text entries, like 'action_open' and 'state_open'. You will of course have to put extra effort into translating this language to plain English (or the language you develop for). I have applied this philosophy in some large projects, and in the long run it saves some time (and headaches).
The best way in my opinion is keeping meaning separate so if you develop your own translation library or the one you use supports it you can do something like this:
_(i18n("Please enter your name", "error_please_enter_name"));
Where:
i18n(text, meaning)
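One way such an i18n(text, meaning) helper might work is to key the catalog on the pair of meaning and text, so that homonyms get separate entries. This is a hypothetical Python sketch with a made-up French catalog, not any particular library's API; gettext itself offers a comparable mechanism through message contexts (msgctxt, exposed as pgettext in Python 3.8+).

```python
# Hypothetical catalog: the same English text maps to different French words
# depending on the meaning attached to it.
FRENCH = {
    ("action_open", "Open"): "Ouvrir",
    ("state_open", "Open"): "Ouvert",
}


def i18n(text, meaning, catalog=FRENCH):
    """Look up (meaning, text); fall back to the plain text when nothing matches."""
    return catalog.get((meaning, text), text)


print(i18n("Open", "action_open"))
print(i18n("Open", "state_open"))
print(i18n("Please enter your name", "error_please_enter_name"))
```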
Interesting question. I assume the main reason is that you don't have to care about translation or localization files during development as the main language is in the code itself.
Well it probably is just that it's easier to read, and so easier to translate. I'm of the opinion that your way is best for scalability, but it does just require that extra bit of effort, which some developers might not consider worth it... and for some projects, it probably isn't.
There's a fallback hierarchy, from most specific locale to the unlocalised version in the source code.
So French in France might have the following fallback route:
1. fr_FR
2. fr
3. Unlocalised: the source code.
As a result, having proper English sentences in the source code ensures that if a particular translation is not provided in step (1) or (2), you will at least get a proper, understandable sentence rather than random programmer garbage like “error_file_not_found”.
Plus, what do you do if it is a format string: “Sorry but the %s does not exist” ? Worse still: “Written %s entries to %s, total size: %d” ?
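The fallback route above, including the format-string case, can be sketched like this. It is an illustrative Python model rather than any library's real lookup code: CATALOGS, fallback_chain and translate are hypothetical names, and the French entry is made-up sample data.

```python
# Hypothetical catalogs: fr_FR has no entries of its own, fr has one.
CATALOGS = {
    "fr_FR": {},
    "fr": {"Sorry but the %s does not exist": "Désolé, mais %s n'existe pas"},
}


def fallback_chain(locale):
    """Most specific locale first, e.g. 'fr_FR' gives ['fr_FR', 'fr']."""
    parts = locale.split("_")
    return ["_".join(parts[:i]) for i in range(len(parts), 0, -1)]


def translate(message, locale):
    """Walk the chain; if no catalog has the message, return the source text."""
    for name in fallback_chain(locale):
        translated = CATALOGS.get(name, {}).get(message)
        if translated is not None:
            return translated
    return message


print(translate("Sorry but the %s does not exist", "fr_FR") % "file")
print(translate("Written %s entries to %s, total size: %d", "fr_FR") % (3, "out.txt", 42))
```

The second message has no translation anywhere in the chain, yet it still formats correctly because the source text itself is a valid format string.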
Quite old question but one additional reason I haven't seen in the answers yet:
You could end up with more placeholders than necessary, thus more work for translators and possible inconsistent translations. However, good editors like Poedit or Gtranslator can probably help with that.
To stick with your example:
The text "Please enter your name" could appear in a different context in a different template (that the developer is most likely not aware of and shouldn't need to be). E.g. it could be used not as an error but as a prompt like a placeholder of an input field.
If you use
_("Please enter your name");
it would be reusable, the developer can be unaware of the already existing key for an error message and would just use the same text intuitively.
However, if you used
_("error_please_enter_name");
in a previous template, developers wouldn't necessarily be aware of it and would make up a second key (most likely according to a predefined wording scheme to not end up in complete chaos), e.g.
_("prompt_please_enter_name");
which then has to be translated again.
So I think that doesn't scale very well. A pre-agreed wording scheme of suffixes/prefixes e.g. for contexts can never be as precise as the text itself I think (either too verbose or too general, beforehand you don't know and afterwards it's difficult to change) and is more work for the developer that's not worth it IMHO.
Does anybody agree/disagree?

Steps to develop a multilingual web application

What are the steps to develop a multilingual web application?
Should I store the language texts and resources in a database, or should I use property files or resource files?
I understand that I need to use CurrentCulture with C#, along with CultureFormat etc.
I wanted to know your opinions on the steps to build a multilingual web application.
Doesn't have to be language specific. I'm just looking for steps to build this.
The specific mechanisms are different depending on the platform you are developing on.
As a cursory set of work items:
Separation of code from content. Generally, resources are compiled into assemblies with the help of resource files (in dot net) or stored in property files (in java, though there are other options), or some other location, and referred to by ID. If you want localization costs to be reasonable, you need to avoid changes to the IDs between releases, as most localization tools will treat new IDs as new content.
Identification of areas in the application which make assumptions about the locale of the user, especially date/time, currency, number formatting or input.
Create some mechanism for locale-specific CSS content; not all fonts work for all languages, and not all font-sizes are sane for all languages. Don't paint yourself into a corner of forcing Thai text to be displayed in 8 pt. Also, text directionality is going to be right-to-left for at least two languages.
Design your page content to reflow or resize reasonably when more or less content than you expect is present. Many languages expand 50-80% from English for short strings, and 30-40% for longer pieces of content (that's a rough rule of thumb, not a law).
Identify cultural presumptions made by your UI designers, and try to make them more neutral, or, if you've got money and sanity to burn, localizable. Mailboxes don't look the same everywhere, hand gestures aren't universal, and something that's cute or clever or relies on a visual pun won't necessarily travel well.
Choose appropriate encodings for your supported languages. It's now reasonable to use UTF-8 for all content that's sent to web browsers, regardless of language.
Choose appropriate collation for your databases, or enable alternate collations, if you are dealing with content in multiple languages in your databases. Case-insensitivity works differently in many languages than it does in English, and accent insensitivity is acceptable in some languages and generally inappropriate in others.
Don't assume words are delimited by spaces or that sentences are delimited by punctuation, if you're trying to support search.
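As a small illustration of the case-insensitivity point in the list above, here is a Python sketch; same_ignoring_case is a hypothetical helper. It only demonstrates Unicode case folding, which is locale-independent, so it is not a substitute for proper locale-aware collation in the database.

```python
def same_ignoring_case(a, b):
    """Compare with Unicode case folding instead of lower()."""
    return a.casefold() == b.casefold()


# German sharp s: lower() keeps it, so a naive comparison fails.
print("Straße".lower() == "STRASSE".lower())
print(same_ignoring_case("Straße", "STRASSE"))

# Turkish dotted capital I lowercases to two code points (i plus a combining dot).
print(len("İ".lower()))
```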
Avoid:
Storing localized content in databases, unless there's a really, really, good reason. And then, think again. If you have content that is somewhat dynamic and representatives of each region need to customize it, it may be reasonable to store certain categories of content with an associated locale ID.
Trying to be clever with string concatenation. Also, try not to assume rules about pluralization or counting work the same for every culture. Make sure, at least, that the order of strings (and controls) can be specified with format strings that are typical of your platform, or well documented in your localization kit if you elect to roll your own for some reason.
Presuming that it's ok for code bugs to be fixed by localizers. That's generally not reasonable, at least if you want to deliver your product within a reasonable time at a reasonable cost; it's sometimes not even possible.
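To illustrate the point about concatenation versus format strings, here is a minimal Python sketch. TEMPLATES and render are hypothetical names and the German sentence is sample data; named placeholders let each translation place the values wherever its grammar needs them. Pluralization is deliberately not handled here.

```python
# Hypothetical per-locale templates with named placeholders.
TEMPLATES = {
    "en": "{user} uploaded {count} files to {folder}",
    "de": "{user} hat {count} Dateien in {folder} hochgeladen",
}


def render(locale, **values):
    """Fill the locale's template; unknown locales fall back to English."""
    template = TEMPLATES.get(locale, TEMPLATES["en"])
    return template.format(**values)


print(render("en", user="Ana", count=3, folder="Photos"))
print(render("de", user="Ana", count=3, folder="Fotos"))
```

Building the English sentence as user + " uploaded " + ... would leave the translator no way to move the verb to the end, as the German template does.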
The first step is to internationalize. The second step is to localize. The third step is to translate.

Resources