Writing ontologies in DL syntax?

I just discovered OWL and Protégé. Upon reading through this reference page (which I quote below), I am left wondering whether it is possible to skip the abstract OWL syntax and instead write in DL syntax. My background is in logic, so that sounds like more fun, even if I would have to translate the ontologies later (though I am sure there must be applications to do this; besides, don't reasoners use DL?).
If it is possible, what configuration of settings should I use in Protégé (or other software of your suggestion) in order to do this? I suspect it's not possible, but I want to be sure, as I see no good reason for it other than the awkwardness of the special symbols.
EDIT: If it is NOT possible, how exactly are DL languages used?
OWL DL is the description logic SHOIN with support of data values, data types
and datatype properties, i.e., SHOIN(D), but since OWL is based on RDF(S), the
terminology slightly differs. ... For description of OWL ontology or knowledge
base, the DL syntax can be used. There is an "abstract" LISP-like syntax
defined that is easier to write in ASCII character set.
Here's a very brief working example of the two syntax styles for the same data.

don't reasoners use DL?
Not necessarily. They use all kinds of logics, some of which are DLs, some are not.
If it is possible, what configuration of settings should I use in Protege (or other software of your suggestion) in order to do this?
I'm pretty sure there is no such plugin for Protégé. But if you really want some fun, use a text editor and write your ontology by hand. There are many syntaxes you can use: the functional syntax, the OWL/XML syntax, and the RDF/XML syntax are all normative. In addition, you can use the Manchester syntax, or Turtle, N-Triples, and JSON-LD, which are slated to become recommendations for writing RDF (and therefore OWL). Or the more exotic RDF/JSON and HDT. Or, again, more "powerful" syntaxes like Notation3, TriG, TriX, and N-Quads. Plenty of fun!
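To make that concrete, here is a minimal sketch of the hand-written route, assuming Python with the rdflib package (my own illustration, not from the question): a tiny made-up ontology is written in Turtle and then re-serialized in two of the other syntaxes.

from rdflib import Graph

# A tiny, made-up ontology: ex:Parent is a subclass of ex:Person.
TURTLE = """
@prefix ex:   <http://example.org/onto#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

ex:Person a owl:Class .
ex:Parent a owl:Class ;
    rdfs:subClassOf ex:Person .
"""

g = Graph()
g.parse(data=TURTLE, format="turtle")  # read the Turtle serialization

# The very same triples written out in two other syntaxes.
# (rdflib >= 6 returns a str here; older versions return bytes.)
print(g.serialize(format="xml"))   # RDF/XML
print(g.serialize(format="nt"))    # N-Triples

Whichever syntax you hand-write, a library like this (or the OWL API on the Java side) can convert between the serializations.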
In any case, if you wanted to write in the DL syntax itself, you would need to use special Unicode characters or special commands, as in LaTeX for instance, and the parser that deals with them would have to read those characters or commands. Not ideal if you are programming. But you can always use the DL syntax in your writings.
BTW, the current standard Web Ontology Language is OWL 2. Its DL variant (viz., OWL 2 DL) is based on the even more irresistible SROIQ.

Related

CL/Scheme DSELs with non-lisp syntax

I have been curious lately about DSLs, specifically how to implement them in Lisp, since it looks like a piece of cake compared to the alternatives.
Looking for information, I cannot find any evidence of a non-Lisp DSEL in Lisp on the internet.
So my question is:
Is it possible to implement a DSL with non-Lisp syntax in Lisp using macros?
How is this achieved?
Can the Lisp reader be replaced by a custom reader that translates code into Lisp structure?
If the former is true: is this a common way to implement "non-lispy" DSELs?
Short version: Racket does this.
In more detail: Racket, a descendant of Scheme, has a really well-thought-out story here. A Racket module/file can begin with a language declaration, e.g.
#lang algol60
... and then the rest of the file can be written in the given language. (Yes, algol60 is built in.)
In order to develop your own language, you need to write a package containing a language specification, which shows how to expand the syntax of that language into the syntax of the underlying language (in this case, Racket). Anyone can write such packages and distribute them to allow others to write programs in the language. There are examples of such language specifications included with Racket, e.g. the algol60 example mentioned earlier.
I think this is exactly what you're asking for?
ObDisclaimer: Yes, I am a Racket developer.
How do you implement the surface language of a programming language? You write a parser or use a parser generator. You can do that in Lisp, too.
There are many examples of general purpose and domain specific languages written in Lisp - not using s-expression syntax.
Historically, the first ML (an extension language for a theorem prover) was written in Lisp. Macsyma (a language for computer algebra) is written in Lisp. In many cases there is some kind of 'end user' for whom a non-s-expression language needs to be written and supported. Sometimes there are existing languages that simply need to be supported.
Using macros and read macros you can implement some languages or extend the Lisp language. For example, it is easy to add JSON syntax to Lisp using a read macro, or some kind of infix syntax, or XML (example: XMLisp).
There's no problem supporting non-Lisp-syntax DSLs in Lisp. You'll need to use some parser or parser-generator library, as Rainer has mentioned. A good example is esrap, which is used to parse Markdown (see 3bmd) and also for the pgloader command language, which is exactly the kind of external DSL you're asking about.
From Let Over Lambda, there is an implementation of Perl style regular expressions: http://letoverlambda.com/index.cl/guest/chap4.html#sec_4.
Also there are several attempts at making a "non-lispy" version of Lisp, the main one being the Readable Lisp S-expression Project: http://readable.sourceforge.net/.
One implementation-specific solution that sticks out (if you want to use Scheme rather than CL) is Gambit Scheme's built-in support for infix syntax via its SIX-script extension.
This provides a rich set of loosely C-like operators and syntax forms, which can either be used out of the box to write code in a C-like style, or redefined to mean whatever you want (you can easily redefine e.g. the function definition format if you aren't a fan of type name(args) {}). for, case, := and so on (even goto) are all already present and ready to mean whatever you need.
The actual core of the syntax (operator precedence, expressions vs. statements) is fixed, but you can assign things like a Scheme binding construct to the s-expression produced by an operator for a reasonably large amount of freedom.
a = b * c;
is translated by the reader into
(six.x=y (six.identifier a) (six.x*y (six.identifier b) (six.identifier c)))
You can then override the definitions of those macros with your own to make the syntax do whatever you want. Turning the C-style base into a Haskell-looking functional language isn't too hard (strategically redefine = and -> and you're halfway there...).
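To give a rough feel for what such a reader translation does, here is a toy sketch, written in Python purely for illustration (it is not Gambit's actual implementation), that parses the infix statement above into a prefix, s-expression-like structure:

import re

def tokenize(src):
    # identifiers/numbers, plus the single-character operators we care about
    return re.findall(r"\w+|[=*;]", src)

def parse(tokens):
    """Parse 'a = b * c;' into a nested, prefix structure."""
    def primary():
        tok = tokens.pop(0)
        return ("identifier", tok) if tok.isidentifier() else ("number", int(tok))

    def product():                      # handles '*'
        node = primary()
        while tokens and tokens[0] == "*":
            tokens.pop(0)
            node = ("x*y", node, primary())
        return node

    def assignment():                   # handles '=' (right-associative)
        node = product()
        if tokens and tokens[0] == "=":
            tokens.pop(0)
            node = ("x=y", node, assignment())
        return node

    node = assignment()
    if tokens and tokens[0] == ";":
        tokens.pop(0)
    return node

print(parse(tokenize("a = b * c;")))
# ('x=y', ('identifier', 'a'), ('x*y', ('identifier', 'b'), ('identifier', 'c')))
# -- compare with (six.x=y (six.identifier a) (six.x*y ...)) above

In Gambit the reader produces real s-expressions and the six.* macros give them meaning; the sketch only mimics the shape of that translation.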

Scheme core language specification

I am learning my way around Scheme, and I am especially interested in how the language is constructed. I'm trying to find a nice description of the core syntax for a Scheme implementation. I don't know enough about the standards, but I assume that they all contain macro systems. If not, I'd like to read about a standard that also includes macros (they can't possibly be implemented in simpler Scheme constructs, can they?).
Does anyone have a good reference for the minimal syntax needed for a Scheme dialect?
Just an update:
I also stumbled upon this: http://matt.might.net/articles/compiling-to-java/#sec1. If you also add define-syntax and delay then it seems like it might be a good start.
In the R5RS specification, the following page appears to be what I was looking for: formal syntax
Although it may be a bit dry, you should read over the R5RS spec or the R6RS spec.
The docs really do not take that long to read through and you can just skim most of the sections until you need more detail. But either document does cover all of the minimal syntax required, including macros.

Which tools can help in translating (as in French -> English, not C++ -> Java) source code? [closed]

I have some code written in French; that is, the variables, classes, and functions all have French names. The comments are also in French. I'd like to translate the code to English. This will be quite a challenge, since it's an 18K-line project, and I'd like to know if there is any tool that could help me, especially with the variable/class/function names, since renaming them all by hand will be error prone.
Are there any tools that can help me? Any advice?
Edit: I'm not looking for machine translation. I'm looking for a tool that would help me translate the code. Let's say there is a class C that has a method named TraverserLaRue, and I rename it to CrossTheRoad; I'd like all references to TraverserLaRue in all files to be renamed to CrossTheRoad. However, I don't want the method TraverserLaRue of class B to be renamed.
I assume the language in question is one of the common ones, such as C, C++, C#, Java, ...
(You don't have a language with French keywords, do you? I once encountered an entirely Swedish version of Pascal, and I gave up on working with that.)
So you have two problems:
Translating identifiers in the source code
Translating comments
Since comments contain arbitrary natural-language text, you'll need arbitrary natural-language translation for them. I don't think you can find an automated tool to do that.
Unlike others, however, I think you have a decent chance at translating the identifiers and changing them en masse.
SD makes a line of source code "obfuscator" products. These tools don't process the code as raw text; rather, they process the source code in terms of the targeted language, so they accurately distinguish identifiers from operators, numbers, comments, etc. In particular, they operate reliably, as needed, on just the identifiers.
One of the things these tools do is replace one identifier name by another (usually a nonsense name) to make the code really hard to understand. Think of it abstractly as a map of identifier names I -> N. (They do other things, but that's not interesting here.) Because you often want to re-obfuscate a changed file the same way as the original, these tools allow you to reuse a previous cycle's identifier map, which is represented as a list of I -> N pairs.
I think you can abuse this to do what you want.
Step 1: Run such an obfuscator on your original French code. This will produce a text file containing all the identifiers in the code as a map of the form
I1 -> N1
I2 -> N2
....
You don't care about the Ns, just the I's.
Step 2: Manually translate each French I to an English name E you think fits best.
(I have no specific suggestions about how to do this; some of the other answers here
have suggestions).
Some of the I's are likely to be library calls and are thus already correct.
You can modify the text obfuscation map file to be:
I1 -> E1
I2 -> E2
Step 3: Run the obfuscation tool, and make it use your modified obfuscation map.
It can be told to do that.
Voilà, all the identifiers in your code will be changed the way you specify.
[You may get, as a freebie, the re-formatting of your original text. These tools can also format code nicely. Your name changes are likely to screw up the indentation/spacing in the original text so this is a nice bonus].
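As an illustration of the same rename-via-map idea (not SD's tool, and only for Python source, where the standard tokenize module provides language-aware tokens for free), a sketch might look like this; the map entries are made-up examples:

import io
import tokenize

# Hypothetical French -> English rename map, in the spirit of the I -> E list above.
RENAMES = {
    "TraverserLaRue": "CrossTheRoad",
    "compteur": "counter",
}

def rename_identifiers(source: str) -> str:
    """Replace identifiers token-wise, so strings and comments are left alone."""
    result = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME and tok.string in RENAMES:
            tok = tok._replace(string=RENAMES[tok.string])
        result.append(tok)
    return tokenize.untokenize(result)

print(rename_identifiers("def TraverserLaRue(compteur):\n    return compteur + 1\n"))

A real C++/C#/Java codebase would need a language-aware tool as described above, but the shape of the workflow is the same: a map from old names to new names, applied mechanically.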
Any refactoring tool has a rename feature. Many questions on SO address language-specific refactoring tools.
For the comments, you will have to handle them manually.
I did this with German code a while ago, but had mixed results because of abbreviations in names, etc. Using regular expressions, I wrote a parser that removed all of the language-specific keywords and characters, then separated the comments from the rest of the code, and at that point I had a lot of words that didn't necessarily mean anything to me by themselves. So I wrote a unique-word finder that added them all to an ordered text file. The next stop was Google's language tools, which attempted to translate every word in the list. I ran through the list to see if each word really translated, and if it did, I did a replace-all in the code with the English equivalent. The comments I put back in with the complete translation, when it worked. What I found was that I ended up having to talk with someone who understood "Germish" to translate the abbreviations, slang terms, and mixed-language pieces. So in short: regular expressions with a dictionary, unless someone has a real tool for this, which I would also be interested in.
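The unique-word-list step is easy to sketch; something like the following (in Python, with a hypothetical src/ tree and output file name) collects every identifier-like word for later dictionary lookup:

import pathlib
import re

words = set()
for path in pathlib.Path("src").rglob("*.cpp"):   # hypothetical source tree
    text = path.read_text(encoding="utf-8", errors="ignore")
    # crude "identifier-like word" pattern; keywords can be filtered out afterwards
    words.update(re.findall(r"[A-Za-z_][A-Za-z0-9_]+", text))

pathlib.Path("unique_words.txt").write_text("\n".join(sorted(words)) + "\n")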
You should definitely look into https://launchpad.net/rosetta
Ubuntu uses this to translate thousands of its packages, written in hundreds of programming languages, into hundreds of human languages, with updates for each new version. Truly a herculean task.
Edit: to clarify how Rosetta is used at Ubuntu: it modifies all the natural-language strings occurring in the source code of the open-source apps, creating language-specific source packages which, when compiled, produce the corresponding binaries. Of course, it does not edit the binaries themselves.
First, maintainers create "template files", which are something like a patch with wildcards: a set of rules for what needs to be translated and where in the source tree, but not what to translate it to. Rosetta then displays the strings to be translated and lets volunteer translators provide translations into their language for each entry. Each entry can be discussed and modified, and suggestions can be submitted and moderated. Stats show how much still needs to be translated, which translations are uncertain, which are missing, and so on. When the translation is complete, the patch for a given language is applied to the source, creating that language's version, and a distribution is compiled from the modified sources.
This allows translation both for sources that use external resources for multilingual support (allowing the language to change on the fly) and for ones that have literal native-language strings right in the source code, mixed with the business logic.
When a new version of the package is released, the template must be edited to include all the new strings, but there is quite good automation for preserving the existing ones. Of course, only translations for the new strings are required.
IMHO automatic tools won't be of any help here. Just translating variable and function names is not enough, and will make the code worse, because the tools cannot infer the original programmer's intent in choosing a variable name.
Depending on what programming language the code is written in, there are modern IDEs that might ease the refactoring, but if you want good results, a manual code review is a must.
A good IDE will be able to list classes, methods, and variables. There are also documentation-generation tools that will do that, such as Javadoc for Java, Doxygen for many languages, etc.
For the actual translation, no tool will perform well, or even at a satisfactory level. The only way to get something worthwhile is to have a bilingual translator translate the terms. I've been doing freelance translations for many years, and can tell you that trying to have some machine do the translating is a waste of time. Many examples and word choices will be relevant to your culture and not the other. And that's just the tip of the iceberg.
Unless you find someone who can do the translation, I suggest you abandon the idea. Leave the source code as is. If a non-French speaker reads it and needs to understand something, let them do the Google lookup. If they are native English speakers, they'll probably do a better job of understanding the automatically translated text than you would, being French: when translating, you always want to translate into your native language.
For translating only the comments, you may try this simple utility I wrote (it uses Microsoft's Translator API): transource.

Migrating from one C-family language to another, should I change style?

I find myself in conflict about which code style I should follow when using a different C-family language.
Currently I am doing work (different projects) in C++, C#, and Objective-C.
I noticed there is a lot of discrepancy in the conventions the basic frameworks follow. Generally, I don't think it's a bad idea to adhere to these conventions, as it makes the code feel more "integrated" into its environment. However, it is hard for me to remember all the differences and apply the principles correctly.
In C#, for example, all methods of a class start with an uppercase letter, while Objective-C seems to prefer camelCase method names.
What tactic would you choose:
One style to rule them all (as far as applicable)
Stick with what is common in the given environment
I especially like the Google style guides, which seem to recommend the latter. However, I disagree with them on using spaces instead of tabs, and with their indentation in general (e.g. methods on the same level as the class, etc.).
I think you should stick to the "accepted" styles for each language. My rationale for that is that I think it would be much easier to recall what environment you're in when you have to think in the style used for that language. It will also be much easier for someone who is familiar with that environment to look at your code and feel more comfortable with the style and formatting (i.e. less chance for them to misunderstand what they're looking at).
My rule with porting code is: Don't touch it unless you have to.
My rule with modifying old code is: Use the style of the file.
Outside of those two situations, things like coding standards and perhaps your own opinion on good style can come into play.

Are semantics and syntax the same?

What is the difference in meaning between 'semantics' and 'syntax'? What are they?
Also, what's the difference between things like "semantic website vs. normal website", "semantic social networking vs. normal social networking" etc.
Syntax is the grammar. It describes the way to construct a correct sentence. For example, "this water is triangular" is syntactically correct.
Semantics relates to the meaning. "This water is triangular" does not mean anything, though the grammar is fine.
Talking about the semantic web has become trendy recently. The idea is to enhance the markup (structural, with HTML) with additional data so that computers can make sense of web pages more easily.
Syntax is the grammar of a language - the rules by which to form sentences or expressions.
Semantics is the meaning you are trying to express with your code.
A program that is syntactically correct will compile and run.
A program that is semantically correct will actually do what you as the programmer intended it to do. i.e. it doesn't have any bugs in it.
Two programs written to perform the same task in different languages will use different syntaxes, but they would be the same semantically.
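As a small concrete illustration (my own, not from any of the answers), in Python: both functions below are syntactically valid and run without errors, but only the first is semantically correct if the intent is "compute the average of a list".

def average(numbers):
    return sum(numbers) / len(numbers)        # matches the intent

def average_buggy(numbers):
    return sum(numbers) / (len(numbers) + 1)  # equally valid syntax, wrong meaning

print(average([2, 4, 6]))        # 4.0  -- what the programmer intended
print(average_buggy([2, 4, 6]))  # 3.0  -- a semantic bug, not a syntax error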
If you are talking about web (rather than programming languages):
The syntax of the language is whatever the browser (or processing program) can legally recognize and handle, and render to you. For example, your browser can render HTML, while your API can parse XML trees.
Semantics involve what is actually being represented. There's a lot of buzz now about semantic webs and all that stuff, but it essentially means that each entity is also associated with some human-readable information or metadata, so that a certain tag would have a supposed meaning and refer you to it.
Social networks are the same story: you put the knowledge in the links.
"An ant ate an aunt." has correct syntax, but does not make sense semantically. A syntax is a set of rules that can be combined to produce an infinite number of grammatically valid sentences, only very few of which have a sensible semantics.
Syntax is the word order of a sentence. In English it would be the subject-verb-object form.
Semantics is the meaning behind the words. E.g.: "She ate a saw." The word "saw" doesn't fit the meaning of the sentence, but the sentence is grammatically correct, so its syntax is fine. =)
Specifically, semantic social networking means embedding the actual social relationships within the page markup. The standard format for doing this, as defined by microformats, is XFN (XHTML Friends Network). With regard to the semantic web in general, microformats should be the go-to guide for defining embedded semantic content.
Semantic web sites use the concept of the semantic web, which aims to bring meaning to web content by using special annotations to identify certain concepts in a page. This makes automatic reasoning about the content possible (by a computer, not a human), which improves its aggregation, extraction, indexing, and searching.
The explanations above are vague on the semantics side. Semantics can also mean the set of elements at one's disposal for building meaningful content: content that is comprehensible to the human end user and digestible to the machine.
Of course this puts semantics, and the programmer-editor-writer-communicator, in the middle: he decides on the semantics that should ideally be defined for his audience, comprehensible to that audience, conventional for that audience, and digestible to the machine. Semantics are conceptual, should be agreed upon, and must be implementable on both sides.
Think of footnotes, inline and block quotes, titles and so on, ending up in a well-defined, finite list. MediaWiki's wikitext, as an example, fails from that perspective: it defines syntax for elements whose semantic meaning is left undefined, with no finite, agreed-upon list. There is also "meaning by form" on top of whatever textual content a title, for example, carries. "This is a title" only becomes semantics by supposition within an agreed-upon set of semantics, and there can be more than one such set, say "this is important and will be detailed".
AsciiDoc and Pandoc markup are quite different in their semantics, regardless of how each translates them, by syntactic convention, into output formats.
In programming, output formats such as HTML, PDF, and EPUB can consequently carry meaning by form, by semantics, the syntax having disappeared as a temporary tool of translation; as a further consequence, the output can be scanned robotically for meaning, the home turf of "grep"-like algorithms: Google. Think of looking for the meaning of "what" in "What is it that is looked for", depending on whether a title, a footnote, or a link is considered.
There can be more than one layer of semantics; even the textual message itself carries semantics (Chomsky) and could thus be read as meaning by form, creating functional differences from anything else in the output chain, including for the human reader.
In conclusion, programmers and academics should be integrated: no academic should be without knowledge of his tools, just like any bread-and-butter carpenter, and programmers should be academics in the sense that the other end of the bridge they build is the end user; that bridge, very much, is semantics.

Resources