How to add a custom tmLanguage syntax to Sphinx/RST - python-sphinx

Is there a method to import a tmLanguage.json into Sphinx to add support for a new/custom language for RST?

There is not directly; if necessary you'll have to write a lexer for a new language in Python. I say if necessary because Sphinx's syntax highlighting is provided under the hood by Pygments, which supports a huge number of languages; you just need to turn support on in Sphinx using the highlight_language config value. The short names for all the various lexers are shown here.
If, somehow, your language doesn't have a lexer already, there are instructions on how to write your own. It's largely (but not entirely) a process of translating the Oniguruma regexes in the .tmLanguage.json file to Python-flavored ones.
One would also hope that you'd contribute it to the pygments Github project, too.

Related

Source code documentation in Racket (Scheme)

Is it possible to write documentation in source files like in Common Lisp or Go, for example, and extract it from source files? Or everybody uses Scribble to document their code?
The short answer is you can write in-source documentation by using scribble/srcdoc.
Unlike the other languages you mentioned, these aren't "doc strings":
Although you can write plain text, you have full Racket at-expressions and may use scribble/manual forms and functions.
Not only does this allow for "richer" documentation, the documentation goes into its own documentation submodule -- similar to how you can put tests into test submodules. This means the documentation or tests add no runtime overhead.
You do need one .scrbl file, in which you use scribble/extract to include the documentation submodule(s). However you probably want such a file, anyway, for non-function-specific documentation (topics such as introduction, installation, or "user's guide" prose as opposed to "reference" style documentation).
Of course you can define your own syntax to wrap scribble/srcdoc. For example, in one project I defined a little define/doc macro, which expands into proc-doc/names as well as a (module+ test ___) form. That way, doc examples can also be used as unit tests.
How Racket handles in-source documentation intersects a few interesting aspects of Racket:
Submodules let you define things like "test-time" and "doc-time" as well as run-time.
At-expressions are a different syntax for s-expressions, especially good when writing text.
Scribble is an example of using a custom language -- documentation-as-a-program -- demonstrating Racket's ability to be not just a programming language, but a programming-language programming language.

Writing ontologies in DL syntax?

I just discovered OWL and Protege. Upon reading through this reference page (which I quote below), I am left wondering whether it is possible to not use the abstract OWL syntax, and rather to write in DL syntax. My background is in logic, so it sounds like it would be more fun even if I would have to translate the ontologies later (though I am sure there must be applications to do this--besides, don't reasoners use DL?).
If it is possible, what configuration of settings should I use in Protege (or other software of your suggestion) in order to do this? I suspect it's not possible, but I want to be sure, as I see no good reason for this other than the awkwardness of special symbols.
EDIT: If it is NOT possible, how exactly are DL languages used?
OWL DL is the description logic SHOIN with support of data values, data types
and datatype properties, i.e., SHOIN(D), but since OWL is based on RDF(S), the
terminology slightly differs. ... For description of OWL ontology or knowledge
base, the DL syntax can be used. There is an "abstract" LISP-like syntax
defined that is easier to write in ASCII character set.
Here's a very brief working example of the two syntax styles for the same data.
don't reasoners use DL?
Not necessarily. They use all kinds of logics, some of which are DLs, some are not.
If it is possible, what configuration of settings should I use in Protege (or other software of your suggestion) in order to do this?
I'm pretty sure there is no such pluggin for Protégé. But if you really want some fun, use a text editor and write your ontology by hand. There are many syntaxes you can use: the functional syntax, the OWL/XML syntax, the RDF/XML syntax are all normative. In addition, you can use the Manchester syntax, Turtle, N-Triples, JSON-LD, that will be future recommendations for writing RDF (and therefore OWL). Or the more exotic RDF/JSON, HDT. Or again, more "powerful" syntaxes like Notation3, TriG, TriX, NQuads. Plenty of fun!
In any case, if you would like to write in the DL syntax, you would need to use special Unicode characters or special commands like in LaTeX for instance. And the parser that deals with it would have to read those characters or commands. Not ideal if you are programming. But you can always use the DL syntax in your writings.
BTW, the current standard Web Ontology Language is OWL 2. Its DL variant (viz., OWL 2 DL) is based on the even more irresistible SROIQ.

ruby markdown processor that does footnotes?

Do any of the maintained decent ruby markdown processors do an extension for footnotes? I know some markdown processors in other languages do (although I'm not sure which ones).
The ruby ones aren't so great at documenting what markdown extensions they might support. (Heck, neither are the ones in other languages).
anyone know?
I have used kramdown, and it's my library of choice for markdown. It actually provides a superset of Markdown syntax, borrowing additional functionality from other libraries. For example, the footnote capability was borrowed from PHP Markdown Extra package.
Example syntax:
That's some text with a footnote.[^1]
[^1]: And that's the footnote.

Ruby localization: i18n, g18n, gettext, padrino... - what's the difference?

Being somewhat new to Ruby I'm exploring existing libraries to do what I'd normally do in other scripting languages, and I'm a bit stumped by the localization libraries that might be available for something built on top of Sinatra/Sequel (Rails/AR being a bit too opinionated to my taste).
Now, I ran into a couple (i18n, r18n, GetText) though this wiki page, and there apparently is an extra library used in Padrino (based on the i18n thing from Rails?); and apparently plenty more.
Except for the obvious (i.e. GetText mo/po style vs yml files), I'm somewhat confused as to how these options might be different. The wiki doesn't point to much in that respect except saying the that they exist; not how they're different.
Adding to this confusion is the fact that essentially every piece of documentation seems to cover a single one of them (and typically in a RoR context). Moreover, these options don't look entirely incompatible with one another on closer inspection -- in the sense that, if I understood this properly, they can understand each other's files to a large extent.
Might anyone here be able to give a quick and to the point explanation/overview of these libraries, and outline the difference between the them? Some pointers on performance would also be welcome, if you're aware of any (besides the ones from the fast_gettext docs, which made little sense considering my lack of understanding the difference between these options).
I can see how this situation is confusing without knowing some of the history of i18n/l10n libraries in Ruby. I should probably write a few words up on that, but for now I'll try to give an overview from my perspective:
Gettext is obviously the oldest player in this game and it inherits both strengths and weaknesses from its ancestry which is being invented for a C dominated world. It has most features one needs, comes with some tools support that others lack (like desktop po file editors) and is widely accepted in the so called enterprise world.
Gettext as such defines an API and there are basically two libraries that implement it in the Ruby world, the traditional Ruby Gettext packages by Masao Mutoh and the fast_gettext gem by Michael Grosser.
Ruby Gettext is quite powerful and ships a lot of features that you may or may not need. The fast_gettext gem on the other hand focusses on raw speed and is implemented as a shiny, modern code-style Ruby library that is easily hackable and the author is a very smart and supportive person. Out of the two I'd personally strongly recommend fast_gettext.
The I18n gem is the result of the joint effort of various Ruby i18n/l10n solutions that existed a few years ago and that all strived to supersede Gettext for various reasons at that point of time. The resulting I18n API is basically covering the requirements and usecases of all the i18n/l10n solutions involved at that time, including the API of Gettext. So, today's Ruby I18n API is a superset of Gettext's API from the early 90s.
Today the I18n gem is the official solution that is shipped with Ruby on Rails, but it is also the probably most popular one in the Ruby world in general.
The I18n gem also makes it very easy to extend the featureset and add things like caching, other storage mechanisms (like Gettext po files, database tables, key-value stores; storage defaults to plain Ruby files and YAML) etc. and it ships with a number of modules for that (but external or custom modules can easily be crafted, tested and integrated).
There are translation files for 70+ languages (locales) for strings used by Ruby on Rails (which are useful in other projects, too) maintained by the community.
I can not tell much about R18n except that it was invented right after I18n hit its first release and as far as I remember it originated from the Merb community. It seems to be rather strong in the Russian Ruby world, but I might be wrong with all these assertions.
So, unless you have a very good reason to pick any other solution I'd strongly recommend using I18n.
Then on the other hand that means nothing because I've been leading this project more or less since it was invented.
I hope this helps.
[EDIT] added links to various references
I18n is a main stream.
R18n is a alternative with some extra features (model translations, syntax sugar) and some difference in ideology and architecture (flex extensibility by powerful filters).
G18n need to add model transtions to I18n.
Padrino is not a i18n library, it is just Sinatra framework with build-in I18n.
Gettext is IMHO old conception with very ugly format and problem with pluralization. Anyway, it isn’t popular in Ruby community.
First:
as svenfuchs wrote, I18n is a framework that provides modules for many translation and internationalisation approaches.
'gettext' is just one of many modules.
So there is really no question to use I18n.
The default setup of a Rails application is to use I18n with the YAML backend and I understand part of your question to compare that backend with other ones.
IMHO there are two major differences between the gettext and YAML based approaches:
life cycle support
hierarchy
gettext
One idea of gettext is, that translating an app is not a singular event but a life cycle process.
It is build to support this live cycle.
gettext is designed to use plain english as the keys for the translations. So the idea is to write the app in english and mark all text that is to be translated, typically by wrapping it with _().
As a result, the app source code is easily readable in english.
Then a programm scans all source code and extracts the textes to be translated and builds a repository (the .pot file) of these textes.
In the next step, and here comes the live cycle, the repository is merged with existing translations (.po files, one for each target language) and new or changed items are marked.
Mature editors support the translators by focusing on the new and changed items. Additionally project specific dictionaries can support partial automatic translations.
gettext is flat, meaning that each key phrase is translates exactly once in the translation files. There is no hierarchy. But there is context. In the translation files, all the source code positions of a key phrase are listed. An editor with access to the source code can display the source along with the translation (and some do).
Finally, .po files are translated to machine readable fast access forms (can be .mo, the classic standard, or a database or json or …)
YAML
YAML an the other hand is hierarchical so it’s easy to have variations of translations in different contexts.
I18n uses this structure to support scopes and uses the current file path as scope when using keys starting with a dot.
There is no information, where a key is used in the project (well unless auto scoped, but the key may be used in other places explicitly).
There is no information, whether there are any changes.
Unless your IDE supports you, the developer has to find the right place to put a key in the YAML and searching the usage can be cumbersome.
A lot more is said in the other answers.
I18n
I intentionally said YAML and not I18n, because I18n is a framework for internationalization (not only translation), and YAML is only one possible backend.
Plural support in I18n differs from plural support of vanilla gettext. I don’t have experience how they cooperate.
Examples
gettext with positional parameters:
sprintf(
_('Do you really want to delete tour %1$s_%2$s? Only empty tours can be deleted!'),
tag, idx)
translations are text files, but PO-Editors provide GUIs:
#: js/addDelRow.js:15
msgid "" "Do you really want to delete tour %1$s_%2$s? Only empty tours can be deleted!"
msgstr "" "Wollen sie die Spalte %1$s_%2$s wirklich löschen? Nur leere Spalten können "
"gelöscht werden."
YAML with parameters:
Source
<%= t('.checked_at', ts: l(checked_at), user: full_name) %>
translation
from
en:
hotels:
form:
checked_at: „set to checked by %{user} on %{ts}“
to
de:
hotels:
form:
checked_at: "geprüft gesetzt am %{ts} von %{user}“
Conclusion
YAML is much easier to start with, especially if you have support by an IDE.
Vanilla RAILS has it built in.
The is no native language. The first translation can be any language.
With growing projects and multiple laguages, my YAML files tend to repetition (same translation scattered over the hierarchy) and tracing of changes and therefore new translations is cumbersome.
gettext needs an extra toolchain and therefore a more difficult setup.
It supports the whole life cycle of continous translation of developing apps.
It is based on english source code.
I usually use the best parts of both, using YAML for internationalisation (number and date format, maybe model names?) and gettext for translation.
Andrey's response as to point me back to the r18n docs, which basically break it down to a single line:
R18n uses hierarchical, not English-centric, YAML format for translations by default.
Found this slideshare from Andrey. It's in Russian, but it's making a lot more sense now (slides 7 to 9 in particular clear-cut differences between i18n and r18n):
http://www.slideshare.net/iskin/r18n

Unifying enums across multiple languages

I have one large project with components in multiple languages that each depend on some of the same enum values. What solutions have you come up with to unify enums across multiple arbitrary languages? I can think of a few, but I'm looking for the best solution.
(In my implementation, I'm using Php, Java, Javascript, and SQL.)
You can put all of the enums in a text file, then use a code generator to write out the appropriate syntax for each language from that common file so that each component has the enums. Make that text file the authoritative source of information.
You can express the text file in XML but I'd think a tab-delimited flat file would work just fine.
Make them in a format that every language can understand or has a library for. I am using JSON for this at the moment.
Then you can include it with two ways:
For development: Load it from a file/URL at runtime
good for small changes you want too see immediately
slow
For productive usage: Include it in the files
using a build script
fast
no instant feedback
I would apply the dry principle and using code generator as such you could add anew language easely even if it has not enum natively existing.

Resources