Ruby localization: i18n, g18n, gettext, padrino... - what's the difference?

Ruby localization: i18n, g18n, gettext, padrino... - what's the difference? - ruby

Being somewhat new to Ruby I'm exploring existing libraries to do what I'd normally do in other scripting languages, and I'm a bit stumped by the localization libraries that might be available for something built on top of Sinatra/Sequel (Rails/AR being a bit too opinionated to my taste).
Now, I ran into a couple (i18n, r18n, GetText) though this wiki page, and there apparently is an extra library used in Padrino (based on the i18n thing from Rails?); and apparently plenty more.
Except for the obvious (i.e. GetText mo/po style vs yml files), I'm somewhat confused as to how these options might be different. The wiki doesn't point to much in that respect except saying the that they exist; not how they're different.
Adding to this confusion is the fact that essentially every piece of documentation seems to cover a single one of them (and typically in a RoR context). Moreover, these options don't look entirely incompatible with one another on closer inspection -- in the sense that, if I understood this properly, they can understand each other's files to a large extent.
Might anyone here be able to give a quick and to the point explanation/overview of these libraries, and outline the difference between the them? Some pointers on performance would also be welcome, if you're aware of any (besides the ones from the fast_gettext docs, which made little sense considering my lack of understanding the difference between these options).

I can see how this situation is confusing without knowing some of the history of i18n/l10n libraries in Ruby. I should probably write a few words up on that, but for now I'll try to give an overview from my perspective:
Gettext is obviously the oldest player in this game and it inherits both strengths and weaknesses from its ancestry which is being invented for a C dominated world. It has most features one needs, comes with some tools support that others lack (like desktop po file editors) and is widely accepted in the so called enterprise world.
Gettext as such defines an API and there are basically two libraries that implement it in the Ruby world, the traditional Ruby Gettext packages by Masao Mutoh and the fast_gettext gem by Michael Grosser.
Ruby Gettext is quite powerful and ships a lot of features that you may or may not need. The fast_gettext gem on the other hand focusses on raw speed and is implemented as a shiny, modern code-style Ruby library that is easily hackable and the author is a very smart and supportive person. Out of the two I'd personally strongly recommend fast_gettext.
The I18n gem is the result of the joint effort of various Ruby i18n/l10n solutions that existed a few years ago and that all strived to supersede Gettext for various reasons at that point of time. The resulting I18n API is basically covering the requirements and usecases of all the i18n/l10n solutions involved at that time, including the API of Gettext. So, today's Ruby I18n API is a superset of Gettext's API from the early 90s.
Today the I18n gem is the official solution that is shipped with Ruby on Rails, but it is also the probably most popular one in the Ruby world in general.
The I18n gem also makes it very easy to extend the featureset and add things like caching, other storage mechanisms (like Gettext po files, database tables, key-value stores; storage defaults to plain Ruby files and YAML) etc. and it ships with a number of modules for that (but external or custom modules can easily be crafted, tested and integrated).
There are translation files for 70+ languages (locales) for strings used by Ruby on Rails (which are useful in other projects, too) maintained by the community.
I can not tell much about R18n except that it was invented right after I18n hit its first release and as far as I remember it originated from the Merb community. It seems to be rather strong in the Russian Ruby world, but I might be wrong with all these assertions.
So, unless you have a very good reason to pick any other solution I'd strongly recommend using I18n.
Then on the other hand that means nothing because I've been leading this project more or less since it was invented.
I hope this helps.
[EDIT] added links to various references

I18n is a main stream.
R18n is a alternative with some extra features (model translations, syntax sugar) and some difference in ideology and architecture (flex extensibility by powerful filters).
G18n need to add model transtions to I18n.
Padrino is not a i18n library, it is just Sinatra framework with build-in I18n.
Gettext is IMHO old conception with very ugly format and problem with pluralization. Anyway, it isn’t popular in Ruby community.

First:
as svenfuchs wrote, I18n is a framework that provides modules for many translation and internationalisation approaches.
'gettext' is just one of many modules.
So there is really no question to use I18n.
The default setup of a Rails application is to use I18n with the YAML backend and I understand part of your question to compare that backend with other ones.
IMHO there are two major differences between the gettext and YAML based approaches:
life cycle support
hierarchy
gettext
One idea of gettext is, that translating an app is not a singular event but a life cycle process.
It is build to support this live cycle.
gettext is designed to use plain english as the keys for the translations. So the idea is to write the app in english and mark all text that is to be translated, typically by wrapping it with _().
As a result, the app source code is easily readable in english.
Then a programm scans all source code and extracts the textes to be translated and builds a repository (the .pot file) of these textes.
In the next step, and here comes the live cycle, the repository is merged with existing translations (.po files, one for each target language) and new or changed items are marked.
Mature editors support the translators by focusing on the new and changed items. Additionally project specific dictionaries can support partial automatic translations.
gettext is flat, meaning that each key phrase is translates exactly once in the translation files. There is no hierarchy. But there is context. In the translation files, all the source code positions of a key phrase are listed. An editor with access to the source code can display the source along with the translation (and some do).
Finally, .po files are translated to machine readable fast access forms (can be .mo, the classic standard, or a database or json or …)
YAML
YAML an the other hand is hierarchical so it’s easy to have variations of translations in different contexts.
I18n uses this structure to support scopes and uses the current file path as scope when using keys starting with a dot.
There is no information, where a key is used in the project (well unless auto scoped, but the key may be used in other places explicitly).
There is no information, whether there are any changes.
Unless your IDE supports you, the developer has to find the right place to put a key in the YAML and searching the usage can be cumbersome.
A lot more is said in the other answers.
I18n
I intentionally said YAML and not I18n, because I18n is a framework for internationalization (not only translation), and YAML is only one possible backend.
Plural support in I18n differs from plural support of vanilla gettext. I don’t have experience how they cooperate.
Examples
gettext with positional parameters:
sprintf(
_('Do you really want to delete tour %1$s_%2$s? Only empty tours can be deleted!'),
tag, idx)
translations are text files, but PO-Editors provide GUIs:
#: js/addDelRow.js:15
msgid "" "Do you really want to delete tour %1$s_%2$s? Only empty tours can be deleted!"
msgstr "" "Wollen sie die Spalte %1$s_%2$s wirklich löschen? Nur leere Spalten können "
"gelöscht werden."
YAML with parameters:
Source
<%= t('.checked_at', ts: l(checked_at), user: full_name) %>
translation
from
en:
hotels:
form:
checked_at: „set to checked by %{user} on %{ts}“
to
de:
hotels:
form:
checked_at: "geprüft gesetzt am %{ts} von %{user}“
Conclusion
YAML is much easier to start with, especially if you have support by an IDE.
Vanilla RAILS has it built in.
The is no native language. The first translation can be any language.
With growing projects and multiple laguages, my YAML files tend to repetition (same translation scattered over the hierarchy) and tracing of changes and therefore new translations is cumbersome.
gettext needs an extra toolchain and therefore a more difficult setup.
It supports the whole life cycle of continous translation of developing apps.
It is based on english source code.
I usually use the best parts of both, using YAML for internationalisation (number and date format, maybe model names?) and gettext for translation.

Andrey's response as to point me back to the r18n docs, which basically break it down to a single line:
R18n uses hierarchical, not English-centric, YAML format for translations by default.
Found this slideshare from Andrey. It's in Russian, but it's making a lot more sense now (slides 7 to 9 in particular clear-cut differences between i18n and r18n):
http://www.slideshare.net/iskin/r18n

Related

Handling comments in code when doing i18n

I'm in the process of translating a Open Source project from Chinese to English, and I've used i18n (in this case babel) to separate the code from both English and Chinese translations.
Everything's done, except for a rather large number of inline comments in the code.
Obviously, babel can't translate comments inline (and it would be rather obnoxious if it did, anyway. Since code would not be unique across languages and therefore less easily verifiable.)
The way I see it, there are a number of options:
Leave comments in -
Pro: Helps original author, etc.
Con: Makes it distracting for ongoing translation and anyone who doesn't speak the language
Strip out all the comments -
Pro: Code is native-language-agnostic, so it makes sense. Who needs comments anyway? Use the source, Luke!
Con: Goes against SE principles. Could lose something important in understanding how the code works - maybe something's been done to avoid a security risk, etc.
Place English comments near Chinese comments
(Possibly moved to lines above and prefixed with "EN" and "ZH", for example).
Pro: Best of both worlds, comments kept close to code
Con: Not conducive to dictionary-style translation. Can get bulky with more languages.
Create a comment dictionary / notes
Pro: Keeps the comments in a separate file for easy translation.
Con: Difficult to keep synced with code. Not intuitive to remember to update comments related to code when changing coe.
Use a different preprocessor for i18n before/after each development cycle.
Pro: Comments et al would be in your language. Could link this to git pull/push so you only ever see the code in your language.
Con: Bulky, non-obvious process. Could result in code-verification or even compilation errors.
None of these seem like really great solutions.
If you do alot of this, and the code is shared publicly between developers who don't share a native tongue, is there a recommended way to handle translating (or not) comments in the code itself?

I am not sure I understand... You say you separated the code from the languages part. So now you should have code (with comments) + English resources + Chinese resources (i used resources for whatever your programming language use to store localizable content)
Translators only see the resources, not the code, nor the comments. The comments stay untranslated, for the developers.

Short Answer
It seems to be a mixture of:
Strip out all the comments, and
Place English comments near Chinese comments.
Inline comments are almost always trivial - Strip them
Functional comments are not as intrusive - Translate them (possibly with a i18n prefix e.g. "[cn]:" or "[en]:").
Explanation
My meagre amount of research tends to suggest that larger projects make strong attempts to reduce comments and let the code explain itself, instead focusing on code quality which reduces the need for comments.
e.g. From the Linux Kernel Coding guidelines:
NEVER try to explain HOW your code works in a comment: it's much
better to write the code so that the working is obvious, and it's
a waste of time to explain badly written code.
...and from the MySQL coding standards:
Comment your code when you do something that someone else may think is
not trivial.
Both of these standards (and others) recommend minimal function descriptions also, so that's not as obtrusive to understanding the code, and, since function descriptions are generally multi-lined and above the code itself, multiple languages can be included as necessary.
Maybe someone, somewhere has built an interface that can isolate comments into the readers language, but I couldn't (yet) find any that do so.

I always think that API comments exported in the project and private comments in open source projects should be internationalized, which is very convenient for developers in other countries.
On Github, there are actually many developers who use their own national language to comment on some well-known open source projects and some of their own annotations. Most of the reason is that if they do not translate, the efficiency of developers reading comments very low.
Similar to .d.ts in TypeScript, I think function annotation translation can also take a similar form, which is more convenient for the community to feedback translation content, because in fact many developers are willing to do so.

hypothetical: how would you implement bidirectional language support in Sublime text editor, and what features would you like it to have?

Maybe this question is too open-ended and someone will kill it
--- however:
I am building systems (web apps and native) requiring multiple language support, including rtl languages like Arabic and Hebrew. Currently I have no need to be able to program in those languages, but writing content is a must.
There are some difficult choices to make I think in the implementation, because I think at some level (I don't know it's why I'm asking) the text file needs to have a consistent direction of string flow, but when we read and compose these files we need to view these elements with their character order reversed in order for them to be sensible.
(Open ended and non-constructive? I'm hoping to construct a solution.)

I fail to see the connection with SublimeText.
You need RTL support, you use a pre-made component that can handle it.
Or start with a library that can help with that support and does the heavy-lifting (for instance Uniscribe, http://msdn.microsoft.com/en-us/library/windows/desktop/dd374091%28v=vs.85%29.aspx,
or HarfBuzz, http://www.freedesktop.org/wiki/Software/HarfBuzz/)
Adding it yourself means a lot of work (SublimeText fails miserably at it, I don't even think it tries).
To get an idea what you have to deal with, take a look at the Unicode Bidirectional Algorithm
(http://www.unicode.org/reports/tr9/)

Just vote for adding RTL Languages here...
https://sublimetext.userecho.com/topic/37207-right-to-left-languages-support/
They will add it if the votes reach 600

Is there an easy way to replace a deprecated method call in Xcode?

So iOS 6 deprecates presentModalViewController:animated: and dismissModalViewControllerAnimated:, and it replaces them with presentViewController:animated:completion: and dismissViewControllerAnimated:completion:, respectively. I suppose I could use find-replace to update my app, although it would be awkward with the present* methods, since the controller to be presented is different every time. I know I could handle that situation with a regex, but I don't feel comfortable enough with regex to try using it with my 1000+-files-big app.
So I'm wondering: Does Xcode have some magic "update deprecated methods" command or something? I mean, I've described my particular situation above, but in general, deprecations come around with every OS release. Is there a better way to update an app than simply to use find-replace?

You might be interested in Program Transformation Systems.
These are tools that can automatically modify source code, using pattern-directed source-to-source transformations ("if you see this source-level pattern, replace it by that source-level pattern") that operate on code structures rather than text. Done properly, these transformations can be reliable and semantically correct, and they're a lot easier to write than low-level procedural code that navigates and smashes nanoscopic actual tree structures.
It is not the case that using such tools is easy; such tools have to know how to parse the language of interest into compiler data structures, (e.g., ObjectiveC), process the patterns, and regenerate compilable source code from the modified structures. Even with the basic transformation engine, somebody needs to carefully define parsers (and unparsers!) for the dialects of the languages of interest. And it takes time to learn how to use such a even if you have such parsers/unparsers. This is worth it if the changes you need to make are "regular" (in the program transformation sense, not the regexp sense) and widespread (as yours seem to be).
Our DMS Software Reengineering toolkit has an ObjectiveC front end, and can carry out such transformations.

no there is no magic like that

Code "internationalization"

I worked on different projects in different countries and remarked that sometimes the code became internationalized, like
SetLargeurEtHauteur() (for SetWidthAndHeight, fr)
Dim _ListaDeObiecte as List(Of Object) (for _ObjectList, ro)
internal void SohranenieUserov() (for SaveUsers, ru)
etc.
It happens that in countries with Latin alphabet this mix is more pronounced, because there is no need of transliteration.
More than that, often the programming "jargon" is inspired by the project specifications language. There are cases that terms in "project language" have a meaning that is not "translatable" in English.
There are also projects on which works only, say a French team, uses French words (say, Personne, Vehicule, Projet etc).
In that cases I personally add in specifications a "Dictionary" that explains all business object names and only these objects are used in other (French) language.
Say:
Collectif - ensemble des Personnes;
All the actions(Get, Set, Update, Modify, Load, etc) are in English.
Now that "strong" names could be used in code:
AddPersonneToCollectif.
What is your approach to "internationalization"?
PS.
I was amused that VisualStudio compiles and runs projects in .NET with buttons named à la "btnAddÉlève" or "кпкСтоп"...

My personal approach, which is shared by many but not all in the programming community, is that source code should be in English and, if possible, all the development tools should be in English too.
The most important reason for this is being able to share your problems and solutions with the world (like we are doing now in StackOverflow, no less) without having to translate class names, error messages, paths and other artifacts every time.
It also helps consistency, because most libraries are written in English and having element names that mix two languages doesn't really help anyone, besides being a constant focus of internal conflicts when a verb like Add isn't always traslated.
English code also makes it easier to add foreign people to a project without worrying about comprehension and misunderstandings (especially between closely related languages, like Spanish and Portuguese, which have lots of false cognates)
A good link on this subject: http://www.codinghorror.com/blog/2009/03/the-ugly-american-programmer.html
(In case anyone wonders, I'm south-american and English is not my primary language)

Even if everyone on the team is a good English speaker (which is not a given), they may not necessarily know the English equivalent of all the business terminology.
I think it's a project-specific decision what to allow, but I would generally tolerate and in some cases encourage business terms (e.g. entity names) in the local language, but not technical terms (i.e. not Largeur/Hauteur instead of Width/Height).
For example in the financial world in France, everyone knows what is meant by OPCVM and FCP - if you attempt English translations you might end up with more misunderstandings than you do by allowing mixed languages.

I have the same issue with Norwegian currently. I guess it depends on your position in the project, the available time and the role of the software.
In my case, I have decided to keep all terms in an existing protocol and library I am working with in Norwegian, as I can reasonably expect that generations of administrative workers have gotten used to these, and since the library depends on the protocol. In a library wrapper for an international project, I have translated each method name literally, and added an English language documentation of the method.
Comments and documentation on the code are in English.
If designing a software from scratch, I would try to find English terms for all method names and even business terms (if reasonable. I can hardly think of an example where no term can be found though.), to keep it "portable".

If you're writing code that you may be used internationally one day, write it in English. In doubt, write it in English, even comments if you can (although I suppose you can add a few comments in the language of your workplace).
It's not specific to coding unfortunately. English isn't my native language, but I've been able to read a number of technical papers and participate to international conferences with people from all over the world. These collaborations simply wouldn't work if everyone published in their respective native languages.
It may be sad if you feel like defending your language at all cost, but you have to be realistic about it. I suppose English has the advantage to be relatively simple for achieving a basic level: no genders for names, no conjugations, no cases.

Generally code referring to language concepts should be in the same mother language as the programming language (i.e. English - for, while, string are all English words).
It's OK (but not great) to have variables and domain concepts in a local language, but you definitely don't want to be translating List, Object, Decimal, etc. into terms which cause programmers more work in reconciling two languages. Even still, I would strongly lobby to restrict very common domain concepts like Collection, Membership, Person, User and possibly less common domain concepts like Invoice, Receipt to English where this is possible.
It would be like coding half your classes in VB and half in C# - your brain has to make a cognitive shift. While this is good for hybrid apps (JavaScript on the web and C# on the backend) because it helps you keep what's running where clear, it isn't good for a general programming.
In addition, using English for everything makes the domain and language words work together better.
There are always exceptions. There are certain cases where you would use a native word anyway - where the word describes the domain best. For instance, in our (English) code base, we had references to Mexican Spanish terms for certain concepts which were only relevant for people running our software in Mexico. Typically, Japanese terms were spelled out phonetically/Romaji, though - it was difficult for non-Japanese to be able to pronounce the pictograms ;-).

I think I'd call a code base like that "abysmal" rather than "internationalized", but the general rule I've always heard is that if you ever think that someone other than one who speaks your language might ever touch the code, do it in english.

I think that good design guideline is to write code with use of English names and with english comments only (of course if your team is capable to do this, but in case of international team English seems to be natural choose, since it's a most popular language, expecially in IT world).
Good explanation of such guidline is that keywords in most of programming languages are taken from English so writing your code using English names gives more consistent look and thanks to this you end with code that is easier to read.
Another reason is that most of compilers can handle only ascii characters as names of classes, methods, etc. so probably you will end with some strange names when you decide to use some language with alphabet containing non ascii chars.
Third reason that came to my mind is sharing your code on site like SO. Today I opened a post with a piece of code where classes had Spanish names. It was hard for me to guess what was the purpose of this class (even if sometimes is not necessary it is good when you read code and understand all used words:)).
To sum up I think that internationalization of code is not a good idea. You can imagine that keywords in programming languages (e.g. class, try, while) could also be localized and probably you can imagine also how hard life could be then...

To keep things consistent, I would make the code the same (human) language as the (programming) language. That is, if the programming language uses English keywords (like for, switch, public, etc) then keep the rest of the code in English. If you are using a compiler that recognizes (say) the Swahili translations of keywords, then keep the rest of the code in Swahili.
Many APIs have standardized naming schemes that are followed regardless of (human) language, and the accompanying documentation is translated as needed (instead of the source code).
No matter what (human) language you choose, pick one and stick with it. I'd much rather try to wade through source code in German than code that was a mix of German and English.

Which tools can help in translating (as in french -> english and not C++ -> java) source code? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 6 years ago.
Improve this question
I have some code that is written in french, that is the variables, class, function all have french names. The comments are also in french. I'd like to translate the code to english. This will be quite a challenge, since it's a 18K lines project and I'd like to know if there is any tool that could help me, especially with the variables/class/function names, since it will be error prone to rename them all.
Is there any tools that can help me? Advices?
edit : I'm not looking for machine translation. I'm looking for a tool that would help me translate the code. Let's say there is class name C and this class has a method named TraverserLaRue and I rename it CrossTheRoad I'd like all references to TraverserLaRue in all files to be translated as CrossTheRoad. However I don't want the method TraverserLaRue of class B to be translated.

I assume the langauge in question is one of the common ones, such as C, C++, C#, Java, ...
(You don't have a language with French keywords? I once encountered an entirely Swedish version of Pascal, and I gave up on working that).
So you have two problems:
Translating identifiers in the source code
Translating comments
Since comments contain arbitrary natural language text, you'll need an arbitrary translation of them. I don't think you can find an automated tool to do that.
Unlike others, however, I think you have a decent chance at translating the identifiers
and changing them en masse.
SD makes a line of source code "obfuscator" products. These tools don't process the code as raw text, rather they process the source code in terms of the targeted language; they accurately distinguish identifiers from operators, numbers, comments etc. In particular, they
operate reliably as need on just the identifiers.
One of the things these tools do is to replace one identifier name by another (usually a nonsense name) to make the code really hard to understand. Think abstractly of a map of identifier names I -> N. (They do other things, but that's not interesting here). Because you often want to re-obfuscate a file that has changed, the same way as an original, these tools allow you to reuse a previous cycle's identifier map, which is represented as list of I -> N pairs.
I think you can abuse this to do what you want.
Step 1: Run such an obfuscator on your original French code. This will produce a text file containing all the identifiers in the code as a map of the form
I1 -> N1
I2 -> N2
....
You don't care about the Ns, just the I's.
Step 2: Manually translate each French I to an English name E you think fits best.
(I have no specific suggestions about how to do this; some of the other answers here
have suggestions).
Some of the I's are likely to be library calls and are thus already correct.
You can modify the text obfuscation map file to be:
I1 -> E1
I2 -> E2
Step 3: Run the obfuscation tool, and make it use your modified obfuscation map.
It can be told to do that.
Viola, all the identifiers in your code will be changed the way you specify.
[You may get, as a freebie, the re-formatting of your original text. These tools can also format code nicely. Your name changes are likely to screw up the indentation/spacing in the original text so this is a nice bonus].

Any refactoring tool has a rename feature. Many questions on SO address language specific refactoring tools.
For the comments, you will have to handle them manually.

I did this with German code a while ago, but had mixed results because of abbreviations in names, etc. Using regular expressions, I wrote a parser that removed all of the language specific keywords and characters, then separated comments from the rest of the code, and now I had a lot of words that didn't necessarily mean anything to me by themselves. So I wrote a unique word finder that added them all to a ordered text file. Next stop was Google's language tools that attempted to translate every word in the list. I ran through the list to see if each word really translated, and if it did, I did a replace all in the code with the english equivalent. The comments I put back in with the complete translation, if it worked. What I found was that I ended up having to talk with someone who understood "Germish" to translate the abbreviations, slang terms, and mixed language pieces. So in short, regular expressions with a dictionary, unless someone has a real tool for this, which I would be interested in also.

You should definitely look into https://launchpad.net/rosetta
Ubuntu uses this to translate thousands of its packages written in hundreds of programming languages into hundreds of human languages, with updates for each new version. Truly herculean task.
edit: ...to clarify how Rosetta is used at Ubuntu: it modifies all natural language strings occuring in source code of the open-source apps, creating a language-specific source packages, which upon compiling create given kinds of binaries. Of course it does not edit binaries themselves.
First maintainers create "template files" which are something like "Patch with wildcards" - a set of rules what and where in the source tree needs to be translated, but not to what. Then Rosetta displays strings to be translated, and allows volunteering translators to provide translations to their language for each entry. Each entry can be discussed, modified, suggestions submitted and moderated. Stats are provided how much needs to be translated, which translations are unsure, which are missing etc. When the translation is complete, patch of given language is applied to the source creating its version for given language. Then a distribution is compiled from the modified sources.
This allows translation both for sources that use some external resources for multilingual allowing for language change on the fly, and for ones that have literal native language strings right in the source code, mixed with business logic.
When a new version of the package is released, template must be edited to include all new strings but it has quite good automation for preserving the existing ones. Of course only translations for new strings are required.

IMHO automatic tools won't be of any help here. Just translating variable and function names is not enough and will make the code worse because they cannot infer the original programmer intent when he choose a variable name.
Depending on what programming language this code is written to there are modern IDEs that might ease the refactoring but if you want to have good results manual code review is a must.

A good IDE will be able to list classes, methods, variables. There's also documentation generation tools that'll do that such as Javadoc for Java, Doxygen for many languages, etc.
To do the actual translation, there will be no tool that will perform well, or even to a satisfactory level. The only way to get something worthwile is to have a bilingual translator translate the terms. I've been doing freelance translations for many years, and can tell you that trying to have some machine do the translating is a waste of time. Many examples, choice of words, will be relevant to your culture and not the other. And that's just the tip of the iceberg.
Unless you find someone that can do the translation, I suggest you abandon the idea. Leave the source code as is. If a non-French speaker reads it, and needs to understand something, let them do the Google lookup. If they are native English speakers they'll probably do a better job of understanding the automatic translated stuff than you would, being French. When translating, you always want to translate into your native language.

For translating only comments you may try this simple utility I wrote (it's using Microsoft's Translator API): transource.

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio