ISO error classification for multiple definition in Prolog DSL

I'm building a DSL that uses names for procedures (essentially) that must be unique.
It's unclear what sort of error term to use to represent a second definition.
existence_error sorta kinda fits, but I'm uncomfortable with it: it seems to imply a missing definition, not a multiple definition.
permission_error(modify, procedure, Name/Arity) looks promising, but it seems to imply "some people could do this, just not you". Without further enlightenment, I'll use this.
syntax_error sorta kinda fits, but is defined as being for read_term only.
Should I define my own here? The spec says 'use these when you can'.
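For concreteness, here is roughly what I plan to do in the meantime (a sketch; register_proc/1 and dsl_proc/1 are made-up names standing in for my DSL's machinery):

:- dynamic dsl_proc/1.

% Reject a second definition of the same procedure name.
register_proc(Name/Arity) :-
    (   dsl_proc(Name/Arity)
    ->  throw(error(permission_error(modify, procedure, Name/Arity),
                    register_proc/1))
    ;   assertz(dsl_proc(Name/Arity))
    ).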

In the old days there was no SWISH or Pengines, where a Prolog processor is used by multiple users, and with batch processing there was probably little awareness that resources could be blocked by other users. So the explanation of the error term permission_error/3 is most likely as SICStus Prolog describes it here:
"A permission error occurs when an operation is attempted that is
among the kinds of operation that the system is in general capable of
performing, and among the kinds that you are in general allowed to
request, but this particular time it isn't permitted."
http://sicstus.sics.se/sicstus/docs/4.0.4/html/sicstus/ref_002dere_002derr_002dper.html
But I agree: from the name of the error term, we would expect its application range to cover only violations of access or modification rules, not semantic restrictions over syntactic structure such as in a DSL.
But you are probably not the only one who has these problems. If your Prolog system has a messaging subsystem, where you can easily associate error terms with user-friendly text, I see no reason not to introduce new error terms.
You could adopt the following error terms, already suggested by SICStus Prolog and not found in the ISO core standard:
"A consistency error occurs when two otherwise valid values or
operations have been specified that are inconsistent with each other."
http://sicstus.sics.se/sicstus/docs/4.0.4/html/sicstus/ref_002dere_002derr_002dcns.html
"A context error occurs when a goal or declaration appears in the wrong
place. There may or may not be anything wrong with the goal or
declaration as such; the point is that it is out of place."
http://sicstus.sics.se/sicstus/docs/4.0.4/html/sicstus/ref_002dere_002derr_002dcon.html
SWI-Prolog in particular has such a messaging subsystem, and SWI-Prolog has long said good-bye to interoperability with other Prolog systems. So the only danger in using SWI-Prolog's messaging is a certain lock-in, which might not bother you.
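For example, a minimal sketch in SWI-Prolog (the consistency_error/3 shape is modeled loosely on the SICStus error class; the predicate names are made up):

% Throw a SICStus-style consistency error for a duplicate definition.
duplicate_definition_error(Name/Arity) :-
    throw(error(consistency_error(Name/Arity, previous_definition,
                                  multiple_definition),
                register_proc/1)).

% Hook SWI-Prolog's message subsystem to render the new term as text.
:- multifile prolog:message//1.
prolog:message(error(consistency_error(PI, _, multiple_definition), _)) -->
    [ 'Procedure ~q is already defined.'-[PI] ].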

Related

SWI-Prolog semweb library processing of URIs

Being new to Prolog, I am reading existing code (as well as trying to write some). Having some prior background in semweb, I started to play with the library and saw something that confuses me. Example assertion:
?- rdf_assert(ex:bob, rdf:type, foaf:'Person').
I also did find the following in the documentation:
Remember: Internally, all resources are atoms. The transformations
above are realised at compile-time using rules for goal_expansion/2
provided by the rdf_db library
Am I correct in assuming that somehow the library is treating the three URIs as atoms? I thought the compiler would treat this as module_name:predicate, but that does not seem to be the case. If that is true, could you please provide a simple example of how this could be done in Prolog?
Thanks
Prolog is not a functional language. This implies 2+3 does not evaluate to 5; it is just a term that gets meaning from the predicate that processes it. Likewise, ex:bob is just a term that has no direct relation to modules or predicates. Only predicates such as call/1 will interpret it as "call bob in the module ex".
Next to that, SWI-Prolog (like most Prolog systems, but not all) has term expansion, which allows you to rewrite a term that has been read before it is handed to the compiler. That is used to rewrite the arguments of rdf/3: each appearance of prefix:local is expanded to a full atom. You can check that by using listing/1 on predicates that call rdf/3 using the prefix notation.
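As an illustration of the mechanism only (not the actual rdf_db code, which is more general), a minimal sketch with a made-up prefix table could look like this:

% my_prefix/2 is a made-up table standing in for the registered prefixes.
my_prefix(ex,   'http://example.org/').
my_prefix(rdf,  'http://www.w3.org/1999/02/22-rdf-syntax-ns#').
my_prefix(foaf, 'http://xmlns.com/foaf/0.1/').

expand_uri(Prefix:Local, URI) :-
    atom(Prefix), atom(Local),
    my_prefix(Prefix, Base),
    atom_concat(Base, Local, URI).

expand_arg(In, Out) :-
    (   expand_uri(In, Out0)
    ->  Out = Out0
    ;   Out = In
    ).

% Compile-time rewrite: every rdf/3 goal gets its Prefix:Local
% arguments replaced by plain atoms before the clause is compiled.
user:goal_expansion(rdf(S0, P0, O0), rdf(S, P, O)) :-
    expand_arg(S0, S), expand_arg(P0, P), expand_arg(O0, O).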
See also rdf_meta

How to create an AST parser which allows syntax errors?

First, what to read about parsing and building ASTs?
How to create parser for a language (like SQL) that will build an AST and allow syntax errors?
For example, for "3+4*5":
  +
 / \
3   *
   / \
  4   5
And for "3+4*+" with syntax error, parser would guess that the user meant:
  +
 / \
3   *
   / \
  4   +
     / \
    ?   ?
Where to start?
SQL:
SELECT_________________
  /        \           \
 .         FROM        JOIN
/ \          |         /   \
a  city_name people address ON
                             |
                             =______________
                            /               \
                           .                 .
                          / \               / \
                         p   address_id    a   id
The standard answer to the question of how to build parsers (that build ASTs) is to read the standard texts on compiling. Aho and Ullman's "Dragon" compiler book is pretty classic. If you haven't got the patience to work through the best reference materials, you're going to have more trouble, because they provide theory and investigate subtleties. But here is my answer for people in a hurry: build recursive descent parsers.
One can build parsers with built-in error recovery. There are many papers on this sort of thing, a hot topic in the 1980s. Check out Google Scholar, hunting for "syntax error repair". The basic idea is that the parser, on encountering a parsing error, skips to some well-known beacon (";", a statement delimiter, is pretty popular for C-like languages, which is why you were asked in a comment if your language has statement terminators), or proposes various input-stream deletions or insertions to climb over the point of the syntax error. The sheer variety of such schemes is surprising. The key idea is generally to take into account as much information around the point of error as possible. One of the most intriguing ideas I ever saw had two parsers, one running N tokens ahead of the other looking for syntax-error land-mines, with the second parser being fed error repairs based on the N tokens available before it encounters the syntax error. This lets the second parser choose to act differently before arriving at the syntax error. If you don't have this, most parsers throw away left context and thus lose the ability to repair. (I never implemented such a scheme.)
The choice of things to insert can often be derived from information used to build the parser (often First and Follow sets) in the first place. This is relatively easy to do with L(AL)R parsers, because the parse tables contain the necessary information and are available to the parser at the point where it encounters an error. If you want to understand how to do this, you need to understand the theory (oops, there's that compiler book again) of how the parsers are constructed. (I have implemented this scheme successfully several times).
Regardless of what you do, syntax error repair doesn't help much, because it is almost impossible to guess what the writer of the parsed document actually intended. This suggests fancy schemes won't be really helpful. I stick to simple ones; people are happy to get an error report and some semi-graceful continuation of parsing.
A real problem with rolling your own parser for a real language is that real languages are nasty, messy things; people building real implementations get it wrong, and it gets frozen in stone because of existing code bases, or they insist on bending/improving the language (standards are for wimps, goodies are for marketing) because it's cool. Expect to spend a lot of time re-calibrating what you think the grammar is against the ground truth of real code. As a general rule, if you want a working parser, it is better to get one that has a track record than to roll it yourself.
A lesson most people who are hell-bent on building a parser don't get is that if they want to do anything useful with the parse result or tree, they'll need a lot more basic machinery than just the parser. Check my bio for "Life After Parsing".
There are two things the parser could do:
Report the error and have the user try again.
Repair the error and proceed.
Generally speaking the first one is easier (and safer). There may not always be enough information for the parser to infer the intent when the syntax is wrong. Depending on the circumstances, it may be dangerous to proceed with a repair that makes the input syntactically correct but semantically wrong.
I've written a few hand-rolled recursive descent parsers for little languages. When writing code to interpret the grammar rules explicitly (as opposed to using a parser generator), it's easy to detect errors, because the next token doesn't fit the production rule. Generated parsers tend to spit out a simplistic "expected $(TOKEN_TYPE) here" message, which isn't always useful to the user. With a hand-written parser, it's often easy to give a more specific diagnostic message, but it can be time-consuming to cover every case.
If your goal is to report the problem but keep parsing (so that you can see if there are additional problems), you can put a special AST node in the tree at the point of the error. This keeps the tree from falling apart.
You then have to resync to some point beyond the error in order to continue parsing. As Ira Baxter mentioned in his answer, you might look for a token, like ';', that separates statements. The correct token(s) to look for depends on the language you're parsing. Another possibility is to guess what the user meant (e.g., infer an extra token or a different token at the point the error was detected) and then continue. If you encounter another syntax error within the next few tokens, you could backtrack, make a different guess, and try again.
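To make the error-node idea concrete, here is a hedged sketch in Prolog, as a DCG over a token list; hole is a made-up AST node that stands for a missing operand:

expr(T) --> term(L), expr_rest(L, T).
expr_rest(L, T) --> [+], term(R), expr_rest(plus(L, R), T).
expr_rest(T, T) --> [].

term(T) --> factor(L), term_rest(L, T).
term_rest(L, T) --> [*], factor(R), term_rest(times(L, R), T).
term_rest(T, T) --> [].

factor(N)    --> [N], { number(N) }.
factor(hole) --> [].   % recovery: a missing operand becomes a hole node

For the tokens in ?- phrase(expr(T), [3,+,4,*,+]). the first solution is T = plus(plus(3, times(4, hole)), hole): the missing operands show up as hole nodes instead of aborting the parse. Resynchronising on ';' can be handled in the same style, with a recovery clause that skips tokens up to the next ';' and emits an error node.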

How do I reinstate constraints collected with copy_term/3 in SICStus Prolog?

The documentation says that
copy_term(+Term, -Copy, -Body) makes a copy of Term in which all
variables have been replaced by new variables that occur nowhere
outside the newly created term. If Term contains attributed
variables, Body is unified with a term such that executing Body
will reinstate equivalent attributes on the variables in Copy.
I first affirm numerical CLP(R) constraints over some variables, and at some point I collect these constraints using copy_term/3. Later, when I try to reinstate the constraints using call(Body), I get an instantiation error in arguments of the form [nfr:resubmit_eq(...)].
Here's a simplified example that demonstrates the problem:
:- use_module(library(clpr)).
?- {Old >= 0, A >= 0, A =< 10, NR = Old + Z, Z = Old*(A/D)},
   copy_term(Old, New, CTR),
   call(CTR).
Results in:
Instantiation error in argument 1 of '.'/2
! goal: [nfr:resubmit_eq([v(-1.0,[_90^ -1,_95^1,_100^1]),v(1.0,[_113^1])])]
My question is: how do I reinstate the constraints in Body over New? I haven't been able to find concrete examples.
copy_term/3 is a relatively new built-in predicate, first introduced in SICStus around 2006. Its motivation was to replace the semantically cumbersome call_residue/2, which originated in SICStus 0.6 of 1987, with a cleaner and more efficient interface that splits the functionality in two:
call_residue_vars(Goal, Vars), which is like call(Goal) and upon success unifies Vars with a list of variables (in unspecified order) that are attached to constraints and have been created or affected in Goal.
copy_term(Term, Copy, Body), which is like copy_term/2 and upon success unifies Body with a term to reinstate the actual constraints involved. Originally, Body was a goal that could be executed directly. Many systems that adopted this interface (like SWI, YAP), however, switched to using a list of goals instead. This simplifies frequent operations since the representation is less defaulty, but at the expense of making reinstating more complex: you need to use maplist(call,Goals).
Most of the time, these two built-in predicates will be used together. You are using only one, which makes me a bit suspicious. You first need to figure out which variables are involved, and only then can you copy them. Typically you will use call_residue_vars/2 for that. If you are copying only a couple of variables (as in your example), you are effectively projecting the constraints onto these variables. This may or may not be your intention.
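By way of illustration, a minimal sketch of the intended pairing in SWI-Prolog syntax (linear CLP(R) constraints only, steering clear of the nonlinear case that triggers the bug mentioned in the next answer; in SWI the third argument of copy_term/3 is a list of goals, hence maplist(call, ...)):

:- use_module(library(clpr)).

% Run the goal, collect the constrained variables it created or
% affected, copy them together with their constraints, and reinstate
% the constraints on the fresh copies.
demo(Copy) :-
    call_residue_vars({X >= 0, X =< 10}, Vars),
    copy_term(Vars, Copy, Goals),
    maplist(call, Goals).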
This is simply a bug in CLPR, which is unsupported. We lost touch with the CLPR supplier a long time ago.

On the use of Internal`Bag, and any official documentation?

(Mathematica version: 8.0.4)
lst = Names["Internal`*"];
Length[lst]
Pick[lst, StringMatchQ[lst, "*Bag*"]]
gives
293
{"Internal`Bag", "Internal`BagLength", "Internal`BagPart", "Internal`StuffBag"}
The Mathematica GuideBook for Programming by Michael Trott, page 494, says of the Internal context:
"But similar to Experimental` context, no guarantee exists that the behavior and syntax of the functions will still be available in later versions of Mathematica"
Also, here is a mention of Bag functions:
Implementing a Quadtree in Mathematica
But since I've seen a number of Mathematica experts here suggest Internal`Bag functions and use them themselves, I assume it is sort of safe to use them in actual code? If so, I have the following question:
Where can I find a more official description of these functions (the API, etc.) like one finds in the Documentation Center? There is nothing about them now:
??Internal`Bag
Internal`Bag
Attributes[Internal`Bag]={Protected}
If I am to start using them, I find it hard to learn about new functions by just looking at some examples and using trial and error to see what they do. I wonder if someone here might have a more complete and self-contained document on the use of these functions, describing the API and so on beyond what is already out there, or a link to such a place.
The Internal context is exactly what its name says: meant for internal use by Wolfram developers.
This means, among other things, that the following hold for anything you might find in there:
You most likely won't be able to find any official documentation on it, as it's not meant to be used by the public.
It's not necessarily as robust about invalid arguments. (Crashing the kernel can easily happen on some of them.)
The API may change without notice.
The function may disappear completely without notice.
Now, in practice some of them may be reasonably stable, but I would strongly advise you to steer away from them. Using undocumented APIs can easily set you up for a lot of pain and a nasty surprise in the future.

Are semantics and syntax the same?

What is the difference in meaning between 'semantics' and 'syntax'? What are they?
Also, what's the difference between things like "semantic website vs. normal website", "semantic social networking vs. normal social networking" etc.
Syntax is the grammar. It describes the way to construct a correct sentence. For example, "this water is triangular" is syntactically correct.
Semantics relates to the meaning. "This water is triangular" does not mean anything, though the grammar is OK.
Talking about the semantic web has become trendy recently. The idea is to enhance the markup (structural, with HTML) with additional data so computers can make sense of web pages more easily.
Syntax is the grammar of a language - the rules by which to form sentences or expressions.
Semantics is the meaning you are trying to express with your code.
A program that is syntactically correct will compile and run.
A program that is semantically correct will actually do what you as the programmer intended it to do. i.e. it doesn't have any bugs in it.
Two programs written to perform the same task in different languages will use different syntaxes, but they would be the same semantically.
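A tiny, made-up Prolog illustration of the difference:

% Syntactically correct: it compiles and runs.
% Semantically wrong: the name promises a sum, but the body computes a
% difference, so the program does not do what the programmer intended.
sum(X, Y, Z) :- Z is X - Y.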
If you are talking about the web (rather than programming languages):
The syntax of the language is whatever the browser (or processing program) can legally recognize and handle, and render to you. For example, your browser can render HTML, while your API can parse XML trees.
Semantics involve what is actually being represented. There's a lot of buzz now about semantic webs and all that stuff, but it essentially means that each entity is also associated with some human-readable information or metadata, so that a certain tag would have a supposed meaning and refer you to it.
Social networks are the same story: you put knowledge in the links.
"An ant ate an aunt." has correct syntax, but does not make sense semantically. A syntax is a set of rules that can be combined to produce an infinite number of grammatically valid sentences, but very few of these sentences have a semantics.
Syntax is the word order of a sentence. In English it would be the subject-verb-object form.
Semantics is the meaning behind words. E.g., "she ate a saw": the word "saw" doesn't fit the meaning of the sentence, but the sentence is grammatically correct, so its syntax is correct. =)
Specifically, semantic social networking means embedding the actual social relationships within the page markup. The standard format for doing this as defined by microformats is XFN, XHTML Friends Network. In regards to the semantic web in general, microformats should be the go-to guide for defining embedded semantic content.
Semantic web sites use the concept of the semantic web, which aims to bring meaning to web content by using special annotations to identify certain concepts in a page. This makes possible the automatic (by a computer, not a human) reasoning about the content, which improves its aggregation, extraction, indexing and searching.
The explanations above are vague on the semantics side. Semantics could mean the different elements at one's disposal for building arguments of value (these being comprehensible to the end user and digestible to the machine).
Of course this puts semantics, and the programmer-editor-writer-communicator, in the middle: he decides on the semantics that should ideally be defined for his public, comprehended by his public, conventional for his public, and digestible to the machine. Semantics should be agreed upon, are conceptual, and must be implementable on both sides.
Take footnotes, inline and block quotes, titles, and so on, ending up in a well-defined and finite list. MediaWiki's wikitext, as an example, fails in that perspective: it defines syntax for elements whose semantic meaning is left undefined, with no finite agreed-upon list. "Meaning by form" comes on top of whatever textual content a title, say, carries. "This is a title" becomes semantics only by supposition within an agreed-upon set of semantics, and there can be more than one such set, say "this is important and will be detailed".
AsciiDoc and Pandoc markup are quite different in their semantics, regardless of how each translates them, by syntactic convention, to output formats.
Output formats such as HTML, PDF, and EPUB can consequently carry meaning by form, by semantics, the syntax having disappeared as a temporary tool of translation. As a further consequence, the output can be scanned robotically for meaning, the domain of "grep"-like algorithms: Google, looking for the meaning of "what" in "what is it that is looked for" depending on whether a title, a footnote, or a link is considered.
Semantics can have more than one layer: even the textual message itself carries semantics (Chomsky), and thus can be translated as meaning by form, creating functional differences for everything further along the output chain, including a human being, the reader.
In conclusion, programmers and academics should be integrated: no academic should be without knowledge of his tools, like any bread-and-butter carpenter, and programmers should be academics in the sense that the other end of the bridge they build is the end user. The bridge itself: semantics.
