How to enable Latin names for places in some places instead of local ones in Carto Services? - xamarin

Some places on the map is labeled with Cyrillic names, but I need only English/Latin names of places on the map, however sometimes there are only local names. How can I implement this?
P.S.: I have spotted this issue on Belorussian and partly on Russian places.
Screenshot

About languages in general: after all, it depends on which languages specific placename is tagged with. OpenStreetMap has always "local" variant in local primary language, and CARTO Mobile SDK uses this by default, but the data has also other languages, so you can control it as following.
CartoVectorTileLayer (both CartoOnlineVectorTileLayer and CartoOfflineVectorTileLayer are subclasses of it) has method setLanguage(String) to select language, so e.g.:
layer.setLanguage("en");
will give you English language maps.
In SDK 4.0.2 SDK and nutiteq.osm tile source you can use following languages: local/default, en, es, de, fr, it, ru, zh (Chinese), tr (Turkish) and et (Estonian) as language
With latest CARTO SDK 4.1.0 and new carto.streets source you can use any OSM language. I would suggest to configure map based on device language settings, with something like:
// Android
layer.setLanguage(Locale.getDefault().getLanguage());
// iOs / Xamarin
layer.Language = Foundation.NSLocale.PreferredLanguages[0].Substring(0, 2);
What if specific name is not available in given language? Then the MapView will fallback to 'local' language by default, the map will not be empty. But if the 'local' language is still unreadable, so I'd prefer latin alphabet names? In SDK 4.1.0 you can configure primary and secondary fallback languages, e.g. you set primary language to 'de' for Germans, then to avoid strange alphabets (say Hebrew, Greek, most of Asia) set 'en' as primary fallback; then local is used only if both your primary and English names are missing:
layer.FallbackLanguage = "en";
Now I know you want automatically transliterated / Romanizied names, so even if source data from OpenStreetMap has e.g. names in Cyrillic only (Russia, Belorussia etc), then it would show them in Latin chars. It is not exactly same as translation, e.g. Moscow would become Moskva with Romanization, but can be helpful for many cases, actually with all none-Latin scripts, especially from Asia (Chinese etc). The problem here is that many languages, including Russian have many competing Romanization rules, so even if we would want to, then we could not do it in general SDK map rendering level. Our CARTO SDK may provide API for app to apply your preferred translation table, but we do not have this anyway. The SDK is open source and you are welcome to provide patch for the feature. I added issue ih the project for this: https://github.com/CartoDB/mobile-sdk/issues/147

Related

WinRT Localization - Multiple Translations per Language

I am building a Windows Store application in XAML/C# for a Windows 8.1 Professional environment.
My project has a requirement that I must support multiple languages in addition to multiple translations for any given language. For example, I may have a label that would be displayed in English or French, but in English it may need to display the word "Title" or the word "Heading" depending on the customer's preferences.
My issue is that I cannot figure out a way to package and switch between multiple resource dictionaries for the same language while still using the built-in localization functionality provided by XAML for WinRT (i.e. using the Uid property on my controls to bind them to a resource dictionary).
I've noticed two functions, ResourceManager.LoadPriFiles and ResourceManager.UnloadPriFiles, that I thought might allow me to swap out resource dictionaries at runtime, but I can't figure out how to get the PRI files to be package outside of the application's main resource map to allow the loading and unloading.
I've also considered creating a custom data binding or converter that I could use to bind the controls' text manually, but that would cost me the ability to see labels at design time in Blend as well as sacrificing the convenience of the built-in localization capabilities.
Another option was to compile a separate instance of the application for each of the custom translations the customer might require, but obviously that's not a very maintainable way of solving the issue...
Finally, I had considered repurposing something like the homeregion qualifier of the ResourceContext to solve the issue; however, that seems very limiting as there are already pre-established homeregions that I would have to choose from. Repurposing fields seems like a bad idea in general.
You can use several resources files and use the PrimaryLanguageOverride property to select a different language than the default one. This will allow you to change the current resources set without doing anything specific.
You can use a structure like this one for your resources:
Strings
+- en-US
+-Resources.resw
+- fr-FR
+-Resources.resw
+- fr-other
+-Resources.resw
Then in you code, you will just have to call any of the following lines :
Windows.Globalization.ApplicationLanguages.PrimaryLanguageOverride = "fr-other";
Windows.Globalization.ApplicationLanguages.PrimaryLanguageOverride = "en-US";
Windows.Globalization.ApplicationLanguages.PrimaryLanguageOverride = "fr-FR";
You application will now use the "fr-other" language. You can use up to 8 characters in the second part of the language tag.

Do I need English beside Base localization which would contain the exact same 'translation'?

I'd expect the base file to contain my English words since my project has "Localization native development region" set to English.
Update - to clarify my question:
Apart from addressing question what language end-users will see, you need to consider also what will be shown in the AppStore.
My current experience is that if you use Base for English, English won't appear in list of supported languages (how Apple knows in which language your base localization is) in the description of your app.
I've met this issue myself - base (English), German and Russian
Target settings refer to:
Localization native development region = en
But on Appstore it appears in this form:
Languages: German, Russian
no reference to English
I consider to duplicate base localization to English (not a high priority, as users see from screenshots that App works in English anyway)
Edit: there seem to be a different behavior in iOS8 - Application Settings (Settings.bundle) seem to ignore Base translation, if any of translations match your "Preferred Language Order".
In other words, App is localized: Base, German, Russian.
iPhone is configured to use English, preferred languages order is English, German, Russian.
Application settings come in ... German!
Once again: this is applied to Settings only not to the application itself!
Although I am not entirely sure if I get this correctly, I will try to answer your question TTBOMK.
Suppose you’re using NSLocalizedString(key, comment) from in your code. You can clearly see that the first argument is actually is a key for a string, rather than the translated (or to be translated) string itself. Therefore when you “write code” you actually don’t write strings in base language — or any other language for that matter. You should think it as if you're adding string placeholders in your code.
Later on, you’re supposed to create a Localizable.strings file for each language you would like to support, in the form of key = value;. To make your UI appear at least in one humanly–readable language you should at least have one Localizable.strings file with proper string values for each placeholder key.
For example: if you had NSLocalizedString(#“ConfirmationButtonTitle", #“Yada yada”) in your code, then it makes totally sense having a Localizable.strings file that contains ”ConfirmationButtonTitle” = “Tap here to confirm”; element in it. If you don’t create a Localizable.strings file or no Localizable.strings file contain ConfirmationButtonTitle key, then button title falls back to ConfirmationButtonTitle, since it is the name of the placeholder key.
Having said that, most people prefer naming their keys exactly as string values for various reasons. This is arguably a convenient — and very common — practice, but could lead to conflicts in people’s minds.
So, if you were to create the previous NSLocalizedString example like NSLocalizedString(#“Tap here to confirm", #“Yada yada”) instead, then your default/base Localizable.strings file would probably contain an element like “Tap here to confirm” = “Tap here to confirm”;.
What happens here isn’t that you’re repeating yourself, but instead you’re naming your key exactly as your base language’s string value, that’s all.
EDIT
There always have been a base language concept, but as I understand it Xcode 5 emphasizes this even more: that’s good. If your base language is English, then you don’t have to have a Localizable.strings file for English, again.
According to the documentation (scroll down to Creating Strings Files for User-Facing Text in Your Code), you shouldn't add Localizable.strings to the Base localization. Even if your development language is English, create a separate folder and Localizable.strings for English. Create others for each additional language you want to add.
Further reading
Managing Strings Files Yourself
Localizing Your App
Internationalizing the User Interface
iOS Localization Tutorial
Working with Localization in iOS 8 and Xcode 6
What’s new in localization in Xcode 9 and iOS 11

Language codes for simplified Chinese and traditional Chinese?

We are creating multi-language subsites on our website.
I would like to use the 2-letter language codes. Spanish and French are easy. They will get URLs like:
mydomain.com/es
mydomain.com/fr
but I run into a problem with Traditional and Simplified chinese. Are there standards for which 2 letter codes to use for these languages?
mydomain.com/zh
mydomain.com/?
#dkarp gives an excellent general answer. I will add some additional specifics regarding Chinese:
There are several countries where Chinese is the main written language. The major difference between them is whether they use simplified or traditional characters, but there are also minor regional differences (in vocabulary, etc). The standard way to distinguish these would be with a country code, e.g. zh_CN for mainland China, zh_SG for Singapore, zh_TW for Taiwan, or zh_HK for Hong Kong.
Mainland China and Singapore both use simplified characters, and the others use traditional characters. Since China and Taiwan are the two with the biggest populations, just zh_CN and zh_TW are often used to distinguish the simplified and traditional character versions of a website.
More technically correct but not commonly used in practice, however, would be to use zh_HANS for (generic) simplified Chinese characters, and zh_HANT for traditional Chinese characters, except for rare cases when it is meaningful to distinguish different countries.
There is indeed a standard representation for this. As people have run into the exact same problem you are seeing -- same language, but different dialects or characters -- they've extended the two-letter language code with a two-letter region code. So you might have a universal French page at mydomain.com/fr, but internationalizing for French Canadian readers might leave you with a mydomain.com/fr_CA (Canada) and mydomain.com/fr_FR (France). Some platforms use a dash instead of an underscore to separate the language and region codes (hence fr-CA and fr-FR).
The standard locale for simplified Chinese is zh_CN. The standard locale for traditional Chinese is zh_TW.
I hesitate to point you towards the actual BCP 47 standards documents, as they're, uh, a little heavy on the detail and a little light on the readability. Just go with standard locale identifiers, like the ones in used by Java, and you'll be fine.
I'm just going to leave this here.
CODE
LANG
FORM
REGION
zh
Chinese
-
-
zh_Hans
Chinese
Han Simplified
-
zh_Hans_CN
Chinese
Han Simplified
China
zh_Hans_HK
Chinese
Han Simplified
Hong Kong SAR China
zh_Hans_MO
Chinese
Han Simplified
Macau SAR China
zh_Hans_SG
Chinese
Han Simplified
Singapore
zh_Hant
Chinese
Han Traditional
-
zh_Hant_HK
Chinese
Han Traditional
Hong Kong SAR China
zh_Hant_MO
Chinese
Han Traditional
Macau SAR China
zh_Hant_TW
Chinese
Han Traditional
Taiwan
Language is dependent upon where it is spoken (doh!), so language and locale codes reflect that reality. zh is the basic language code, but because there are two major forms of it, there are zh_Hans and zh_Hant, but they are still only language codes, not locales.
Location-specific
To fully specify which language is used in a particular location, the country code still has to be suffixed, so making zh_Hans_HK and zh_Hant_HK for simplified and traditional Chinese, respectively, both as spoken in Hong Kong.
Actually, the reality is that something more specific than country code is often required in many countries, but that is likely to exponentially increase the complexity and maintenance of databases like CLDR, plus the support infrastructure to feed into it, like IP to location details extraction, is not generally available or accurate enough.
Fixed text
Now, if the code is just to specify which set of fixed strings to use in the user interface, or even whole pages sets on a site, a country suffix is not really necessary, unless there are more than a few places where the language varies significantly enough (location-based info) to bother creating a whole separate resource set.
The larger the resource set, the more likely that a language code based upon locale [in this context, just a language attribute, rather than a true locale, so you can call it what you like!] will be required, but at least you only have to do that when necessary.
On-the-fly values
However, if wanting to format particular variable values, like dates, times, currencies and numbers, on-the-fly, locales become important, because all the tools that support such functionality (like those based upon Unicode CLDR data) expect them. The locale for these needs to be a separate setting to the code for which an in-house-generated UI language is set to use, unless you want to create a resource set for every known locale, and maintain them ad nauseum!
Browser language tools
Note that when specifying locale for a web page that can be edited, as in input boxes, and spellcheck in attributes or css has been enabled for the field, the browser's language tools will spellcheck the field according to that locale.
Criteria
You have to be clear about what the resource set is providing, so consider:
Fixed strings? Language only.
Formatting on-the-fly? Locale.
Spellchecking in the viewing environment? Locale.
Whole pages/subsite? Language only, else locale (as a language variant) if significantly different content required.
Spreadsheet to minimise maintenance overhead
I use a spreadsheet to hold UI strings where each language code has a parent code, so that the cell for its version of a string has a formula that gets its string from the parent. To create a custom string for that language and string, I just overwrite the cell formula with the exact text. That minimises the amount of resource maintenance. I run a macro at the end that generates a complete resource file for each language.

Steps to develop a multilingual web application

What are the steps to develop a multilingual web application?
Should i store the languages texts and resources in database or should i use property files or resource files?
I understand that I need to use CurrentCulture with C# alone with CultureFormat etc.
I wanted to know you opinions on steps to build a multilingual web application.
Doesn't have to be language specific. I'm just looking for steps to build this.
The specific mechanisms are different depending on the platform you are developing on.
As a cursory set of work items:
Separation of code from content. Generally, resources are compiled into assemblies with the help of resource files (in dot net) or stored in property files (in java, though there are other options), or some other location, and referred to by ID. If you want localization costs to be reasonable, you need to avoid changes to the IDs between releases, as most localization tools will treat new IDs as new content.
Identification of areas in the application which make assumptions about the locale of the user, especially date/time, currency, number formatting or input.
Create some mechanism for locale-specific CSS content; not all fonts work for all languages, and not all font-sizes are sane for all languages. Don't paint yourself into a corner of forcing Thai text to be displayed in 8 pt. Also, text directionality is going to be right-to-left for at least two languages.
Design your page content to reflow or resize reasonably when more or less content than you expect is present. Many languages expand 50-80% from English for short strings, and 30-40% for longer pieces of content (that's a rough rule of thumb, not a law).
Identify cultural presumptions made by your UI designers, and try to make them more neutral, or, if you've got money and sanity to burn, localizable. Mailboxes don't look the same everywhere, hand gestures aren't universal, and something that's cute or clever or relies on a visual pun won't necessarily travel well.
Choose appropriate encodings for your supported languages. It's now reasonable to use UTF-8 for all content that's sent to web browsers, regardless of language.
Choose appropriate collation for your databases, or enable alternate collations, if you are dealing with content in multiple languages in your databases. Case-insensitivity works differently in many languages than it does in English, and accent insensitivity is acceptable in some languages and generally inappropriate in others.
Don't assume words are delimited by spaces or that sentences are delimited by punctuation, if you're trying to support search.
Avoid:
Storing localized content in databases, unless there's a really, really, good reason. And then, think again. If you have content that is somewhat dynamic and representatives of each region need to customize it, it may be reasonable to store certain categories of content with an associated locale ID.
Trying to be clever with string concatenation. Also, try not to assume rules about pluralization or counting work the same for every culture. Make sure, at least, that the order of strings (and controls) can be specified with format strings that are typical your platform, or well documented in your localization kit if you elect to roll your own for some reason.
Presuming that it's ok for code bugs to be fixed by localizers. That's generally not reasonable, at least if you want to deliver your product within a reasonable time at a reasonable cost; it's sometimes not even possible.
The first step is to internationalize. The second step is to localize. The third step is to translate.

Which ISO format should I use to store a user's language code?

Should I use ISO 639-1 (2-letter abbreviation) or ISO 639-2 (3 letter abbrv) to store a user's language code? Both are official standards, but which is the de facto standard in the development community? I think ISO 639-1 would be easier to remember, and is probably more popular for that reason, but thats just a guess.
The site I'm building will have a separate site for the US, Brazil, Russia, China, & the UK.
http://en.wikipedia.org/wiki/ISO_639
You should use IETF language tags because they are already used for HTTP/HTML/XML and many other technologies. They are based on several standards including the ISO-639 collection (yes language, region and culture selection are not so simple to define).
I wrote a more detailed article regarding the proper language code selection and usage. The idea is to use the simplest/shorter ISO-639-1 codes and specify more only for special cases. Inside the article there are codes for ~30 most used languages with reasons why I consider one alternative better than another.
In case you want to skip reading the entire article here is a short list of language codes (not to be confused with country codes): ar, cs, da, de, el, en, en-gb, es, fr, fi, he, hu, it, ja, ko, nb, nl, pl, pt, pt-pt, ro, ru, sv, tr, uk, zh, zh-hant
The following points may not be obvious but should be borne in mind:
en is used for en-us - American English, and for British English is used en-gb
pt is used for pt-br, and not pt-pt witch has much less speakers
zh is used instead of zh-hans, zh-CN,...
zh-hant (Traditional Chinese) is used instead of more specific codes like zh-hant-TW or zh-TW
You can find more explanations inside the article.
I would go with a derivative of ISO 639. Specifically I like to use this: http://en.wikipedia.org/wiki/IETF_language_tag
I'm no expert, but every site I've ever seen uses ISO 639-1, including the current site I'm working on.
It works for us!
I've only ever seen 2-character language codes in use - so I'd recommend going with them unless your work involves delving into linguistics in some way. If all you're doing is customizing the browsing experience for the world at large, you won't need the extra repertoire offered by 3-character codes.
ISO 639-1 Alpha-2 are used pretty much universally.
They are used for example in HTTP content negotiation. If you ever wondered how an international website can automatically show you their homepage in your native language, that's how it works. (Although it's sometimes kinda annoying. I, for example, often get shown the default Apache homepage in German, because the webmaster turned on content negotiation, but only put content for English in.)
Most web browsers use them directly in their settings dialog box.
Most operating systems use them in their settings dialog boxes or configuration files.
Wikipedia uses them in their server names for the different language versions.
In other words: if your users aren't native English speakers, they will probably already have encountered them when configuring their software, because otherwise they wouldn't be able to use their computers.
The other members of the ISO 639 family are mostly of interest to linguists. Unless you expect Jesus Christ himself (ISO 639-2 Alpha-3 code arc) to visit your website, or maybe Klingons (tlh), ISO 639-1 has more languages than you ever can hope to support.

Resources