Does SPIED CoreNLP support languages other than English? - stanford-nlp

I don't know Java at all so I'm struggling a bit to figure out whether SPIED could work with languages other than English.
I've tried replacing the default models.jar with the Spanish-specific models.jar and overriding the default props with the Spanish-specific props.
Still, edu.stanford.nlp.patterns.GetPatternsFromDataMultiClass seems to be using English specific annotators.
The command executed was:
java -cp stanford-corenlp-3.9.2.jar:stanford-spanish-corenlp-2018-10-05-models.jar:javax.json.jar:joda-time.jar:jollyday.jar edu.stanford.nlp.patterns.GetPatternsFromDataMultiClass -props patterns/example.properties
Where example.properties contains the properties for the Spanish model (as in the default Spanish properties file) as well as the properties for the patterns module.
That didn't work. Is there any straightforward way to apply the patterns module to other languages?

Related

How to implement a language / culture (that is not yet supported) in ActiveReports 11?

We already have a large set of .rpx files that contain report definitions in German. We have used scripting to translate some of the text to Romansh (an official Swiss language used by <1% of the population, yet it has been requested).
The vision now is to create reports in French and maybe Italian as well. Yet we are well aware that the current scripting approach, like:
if (txtSprache.Text == "RM")
{
    lblAbonnentenNr.Text = "Abo-nr.:";
    lblAbrechnungVon.Text = "Quen dils:";
    lblBis.Text = " -";
    lblZahlbarBis.Text = "Pagabel tochen:";
    lblObjekt.Text = "Object:";
    lblRechnungsNr.Text = "Nr. dil quen:";
    lblRechnungsdatum.Text = "Datum da quen:";
    lblERechnungsID.Text = "ID";
    lblAbonnent.Text = "Abonnent";
}
is not well suited for that. I have been asked to create options for I18n support. Quoted from the AR11 documentation:
To localize a Report at design time
1. Click the gray area around the design surface to select the Report in the Properties window.
2. In the Properties window, drop down the Culture or Language property and select the culture that you want to apply to the report.
The old default was (default, inherit); I have now changed that to German (Switzerland). I couldn't find any difference: no new files under C:\Program Files (x86)\GrapeCity\ActiveReports 11\Localization, nor anything new in the XML.
How do I add a new language sheet / values for the current report and its labels?
How do I add a culture that does not exist yet? (In the worst case I'd pick any locale and use it as Romansh, since only German, Italian, French and, unlikely, English could be used.)
The Language property mentioned in the documentation works for code-based templates only. When you set a new language with this property, VSID creates an additional resource file, e.g. myreport.ja-JP.resx.
In that case, the compiled report will load the needed resources according to the CurrentThread.CurrentUICulture value. It does not work for RPX (XML-based) templates.
So if you want to use this functionality, you need to convert the XML-based templates to code-based templates.
The Culture property only helps to specify a locale for the OutputFormat feature (e.g. conversion to currency).
For localizing RPX templates, I think you could combine external localization resource files with the current scripting approach: load the resource file in the script and update the report items from it.

WinRT Localization - Multiple Translations per Language

I am building a Windows Store application in XAML/C# for a Windows 8.1 Professional environment.
My project has a requirement that I must support multiple languages in addition to multiple translations for any given language. For example, I may have a label that would be displayed in English or French, but in English it may need to display the word "Title" or the word "Heading" depending on the customer's preferences.
My issue is that I cannot figure out a way to package and switch between multiple resource dictionaries for the same language while still using the built-in localization functionality provided by XAML for WinRT (i.e. using the Uid property on my controls to bind them to a resource dictionary).
I've noticed two functions, ResourceManager.LoadPriFiles and ResourceManager.UnloadPriFiles, that I thought might allow me to swap out resource dictionaries at runtime, but I can't figure out how to get the PRI files packaged outside of the application's main resource map to allow the loading and unloading.
I've also considered creating a custom data binding or converter that I could use to bind the controls' text manually, but that would cost me the ability to see labels at design time in Blend as well as sacrificing the convenience of the built-in localization capabilities.
Another option was to compile a separate instance of the application for each of the custom translations the customer might require, but obviously that's not a very maintainable way of solving the issue...
Finally, I had considered repurposing something like the homeregion qualifier of the ResourceContext to solve the issue; however, that seems very limiting as there are already pre-established homeregions that I would have to choose from. Repurposing fields seems like a bad idea in general.
You can use several resources files and use the PrimaryLanguageOverride property to select a different language than the default one. This will allow you to change the current resources set without doing anything specific.
You can use a structure like this one for your resources:
Strings
+- en-US
|  +- Resources.resw
+- fr-FR
|  +- Resources.resw
+- fr-other
   +- Resources.resw
Then, in your code, you just have to call one of the following lines:
Windows.Globalization.ApplicationLanguages.PrimaryLanguageOverride = "fr-other";
Windows.Globalization.ApplicationLanguages.PrimaryLanguageOverride = "en-US";
Windows.Globalization.ApplicationLanguages.PrimaryLanguageOverride = "fr-FR";
After setting the override to "fr-other", for example, your application will use the "fr-other" resources. You can use up to 8 characters in the second part of the language tag.

Do I need English beside Base localization which would contain the exact same 'translation'?

I'd expect the base file to contain my English words since my project has "Localization native development region" set to English.
Update - to clarify my question:
Apart from addressing question what language end-users will see, you need to consider also what will be shown in the AppStore.
My current experience is that if you use Base for English, English won't appear in the list of supported languages in your app's description (how would Apple know which language your Base localization is in?).
I've met this issue myself - base (English), German and Russian
Target settings refer to:
Localization native development region = en
But on Appstore it appears in this form:
Languages: German, Russian
no reference to English
I am considering duplicating the Base localization into English (not a high priority, as users can see from the screenshots that the app works in English anyway).
Edit: there seems to be different behavior in iOS 8: application settings (Settings.bundle) seem to ignore the Base translation if any of the translations matches your "Preferred Language Order".
In other words, App is localized: Base, German, Russian.
iPhone is configured to use English, preferred languages order is English, German, Russian.
Application settings come in ... German!
Once again: this is applied to Settings only not to the application itself!
Although I am not entirely sure if I get this correctly, I will try to answer your question TTBOMK.
Suppose you’re using NSLocalizedString(key, comment) in your code. You can clearly see that the first argument is actually a key for a string, rather than the translated (or to be translated) string itself. Therefore when you “write code” you don’t actually write strings in the base language, or in any other language for that matter. You should think of it as adding string placeholders to your code.
Later on, you’re supposed to create a Localizable.strings file for each language you would like to support, in the form of key = value;. To make your UI appear at least in one humanly–readable language you should at least have one Localizable.strings file with proper string values for each placeholder key.
For example: if you had NSLocalizedString(@"ConfirmationButtonTitle", @"Yada yada") in your code, then it makes total sense to have a Localizable.strings file that contains a "ConfirmationButtonTitle" = "Tap here to confirm"; entry. If you don’t create a Localizable.strings file, or no Localizable.strings file contains the ConfirmationButtonTitle key, then the button title falls back to ConfirmationButtonTitle, since it is the name of the placeholder key.
Having said that, most people prefer naming their keys exactly as string values for various reasons. This is arguably a convenient — and very common — practice, but could lead to conflicts in people’s minds.
So, if you were to write the previous NSLocalizedString example as NSLocalizedString(@"Tap here to confirm", @"Yada yada") instead, then your default/base Localizable.strings file would probably contain an entry like "Tap here to confirm" = "Tap here to confirm";.
What happens here isn’t that you’re repeating yourself, but instead you’re naming your key exactly as your base language’s string value, that’s all.
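To make the fallback concrete, here is a minimal Python sketch of the lookup idea. It is not Apple's implementation: localized_string and the english table are made-up names, and the table only stands in for the contents of a Localizable.strings file.

```python
def localized_string(key, table):
    """Return the translation for key, or the key itself when none exists."""
    return table.get(key, key)


# Hypothetical stand-in for an English Localizable.strings file.
english = {"ConfirmationButtonTitle": "Tap here to confirm"}

confirm_title = localized_string("ConfirmationButtonTitle", english)
missing_title = localized_string("CancelButtonTitle", english)

print(confirm_title)  # the translated value
print(missing_title)  # no entry, so the key itself is shown
```

If you name the key exactly like the English text, a missing entry falls back to that same text, which is why the duplication looks redundant but is harmless.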
EDIT
There has always been a base language concept, but as I understand it Xcode 5 emphasizes this even more, which is good. If your base language is English, then you don’t need a separate Localizable.strings file for English.
According to the documentation (scroll down to Creating Strings Files for User-Facing Text in Your Code), you shouldn't add Localizable.strings to the Base localization. Even if your development language is English, create a separate folder and Localizable.strings for English. Create others for each additional language you want to add.
Further reading
Managing Strings Files Yourself
Localizing Your App
Internationalizing the User Interface
iOS Localization Tutorial
Working with Localization in iOS 8 and Xcode 6
What’s new in localization in Xcode 9 and iOS 11

Should language selectors list in English or in native language?

In a UI that lets the user select the language, should the languages in that list be named in:
English
the language that the UI is currently localised to
the native language itself
Look at what the big guys do. Twitter simply uses the pattern "name in the current UI language - name in the language itself". This way both audiences will understand. For example, English gets shown as Engels - English here (a Dutch UI), and Simplified Chinese is listed as Vereenvoudigd Chinees - 简体中文
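A minimal Python sketch of that labelling pattern, assuming you already have each language's name both in the current UI language and in the language itself. The LANGUAGES list and selector_label are hypothetical names; the two entries are the ones from the example above.

```python
# Hypothetical data: (name in the current UI language, name in the language itself).
LANGUAGES = [
    ("Engels", "English"),
    ("Vereenvoudigd Chinees", "简体中文"),
]


def selector_label(localized_name, native_name):
    """Build one selector entry as 'localized - native'."""
    # My own addition: avoid 'English - English' when both names are identical.
    if localized_name == native_name:
        return native_name
    return f"{localized_name} - {native_name}"


labels = [selector_label(loc, nat) for loc, nat in LANGUAGES]
for label in labels:
    print(label)
```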

Is there a naming convention for locale-specific static files?

I have some static resources (images and HTML files) that will be localized. One piece of software I've seen do this is Apache, which appends the locale to the name; for example, test_en_US.html or test_de_CH.html. I'm wondering whether this naming scheme is considered standard, or whether every project does it differently.
While there is no documented standard for naming localized files, I'd recommend using the format filename[_language[_country]], where
language is the ISO 639-1 two-letter language code
country is the ISO 3166-1 two-letter country code
For example:
myFile.txt (non-localized file)
myFile_en.txt (localized for global English)
myFile_en_US.txt (localized for US English)
myFile_en_GB.txt (localized for UK English)
Why? This is the most typical format used by operating systems, globalization tools (such as Trados and WorldServer), and programming languages. So unless you have a particular fondness for a different format, I see no reason to deviate from what most other folks are doing. It may save you some integration headaches down the road.
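As a rough Python sketch of how this naming scheme can be used for lookup: the function below lists candidate file names from most to least specific. The fallback order (country-specific, then language-only, then non-localized) is my assumption about how you would want to resolve files, and candidate_names is a made-up helper, not part of any library.

```python
import os


def candidate_names(filename, language=None, country=None):
    """List names for filename[_language[_country]], most specific first."""
    stem, ext = os.path.splitext(filename)
    names = []
    if language and country:
        names.append(f"{stem}_{language}_{country}{ext}")
    if language:
        names.append(f"{stem}_{language}{ext}")
    names.append(filename)  # the non-localized file is the last resort
    return names


us_english = candidate_names("myFile.txt", "en", "US")
print(us_english)
```

You would then serve the first candidate that actually exists on disk.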
While there doesn't appear to be a standard convention for where in the file name to place them, the international codes for language (e.g. "en") and language plus region (e.g. "en-US") are both very common and very straightforward. Variations I've seen, leaving aside "enUS" vs. "en_US" vs. "en-US":
foo.enUS.ext
foo.ext_enUS
enUS.foo.ext
foo/enUS.ext
enUS/foo.ext
…ad nauseam
I personally favor the first and last variants. The former for grouping files by name/resource (good for situations in which a limited number of files need to be localized) and the latter for grouping files by locale (better for situations with a large number of localized files).
You should always use the de facto standard, which is the Unix/POSIX way with gettext. And you should use gettext to do your localization!
Therefore the one and only correct way is to name locales like this:
en
en_US
en_GB
Some applications, and especially Java developers, sometimes use en-US (hyphenated rather than underscored), and it is ALL WRONG!!!
gettext standard is this and only this:
locale
|_en_US
  |_LC_MESSAGES
    |_appname.mo
Where:
locale - Name of the directory; it can vary, but it is highly recommended to stay with the name "locale"
en_US - Any standard locale like es_ES, pt_PT, ...
LC_MESSAGES - mandatory and cannot be changed!
appname.mo - msgfmt-compiled appname.po file (appname is whatever you want)
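For illustration, a small Python sketch that builds the path of a compiled catalog from the layout above. catalog_path is a made-up helper and the argument values are just the names used in this answer.

```python
from pathlib import PurePosixPath


def catalog_path(localedir, locale_name, appname):
    """Return <localedir>/<locale>/LC_MESSAGES/<appname>.mo as a string."""
    # LC_MESSAGES is the fixed directory name; the other parts are up to you.
    return str(PurePosixPath(localedir) / locale_name / "LC_MESSAGES" / f"{appname}.mo")


path = catalog_path("locale", "en_US", "appname")
print(path)
```

Python's standard gettext module searches for catalogs using this same localedir/language/LC_MESSAGES/domain.mo layout.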

Resources