Does Prolog need GC when the occurs check is globally enabled?

As far as I can tell, with sound unification, SLD resolution should not create cyclic data structures (Is this correct?)
If so, one could, in theory, implement Prolog in such a way that it wouldn't need garbage collection (GC). But then again, one might not.
Is this true for WAM-based Prolog implementations?
Is this true for SWI-Prolog? (I believe it's not WAM-based) Is it safe to disable GC in SWI-Prolog when the occurs check is globally enabled?
Specifically:
:- set_prolog_flag(occurs_check, true).
:- set_prolog_flag(gc, false). /* is this safe? */

The creation of cyclic terms is far from being the only operation that can create garbage (in the garbage-collection sense) in Prolog. It is also worth noting that not all Prolog systems provide comprehensive support for cyclic terms, but most of them do support some form of garbage collection.
As an example, assume that you have in your code the following sequence of calls to go from a number to an atom:
...,
number_codes(Number, Codes),
atom_codes(Atom, Codes),
...
Here, Codes is a temporary list that should be garbage collected. Another example: assume that you're calling setof/3 to get an ordered list of results where you're only interested in the first two:
...,
setof(X, x(X), [X1, X2| _]),
...
You just created another temporary list. Or assume that you forgot about sub_atom/5 and decided to use atom_concat/3 to check the prefix of an atom:
...,
atom_concat(Prefix, _, Atom),
...
That second argument, the atom suffix that you don't care about (hence the anonymous variable), is a temporary atom that you just created. But not all Prolog systems provide an atom garbage collector, which can lead to trouble in long running applications.
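For comparison, a sketch of what the sub_atom/5 variant might look like: it checks the prefix in place, without constructing a throwaway suffix atom.
...,
% succeeds if Prefix is a prefix of Atom; no temporary atom is created
sub_atom(Atom, 0, _, _, Prefix),
...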
But even when you think that you have carefully written your code to avoid the creation of temporary terms, the Prolog system may still be creating garbage when running your code. Prolog systems use different memory areas for different purposes, and operations may need to make temporary copies of segments of memory between different memory areas, depending on the implementation. The Prolog system may be written in a language, e.g. Java, that may eventually take care of that garbage. But most likely it's written in C or C++ and some sort of garbage collection is used internally. Not to mention that the Prolog system may be grabbing a big block of memory to be able to prove a query and then reclaiming that memory after the query terminates.

Well, something has to free up multiply-referenced memory, references to which can be dropped at any step in the computation. This is the case independently of whether structures are cyclic or not.
Consider variables A and B naming the same structure in memory (they "name the same term"). The structure is referenced from 2 places. Suppose the predicate in which B is defined succeeds or fails. The Prolog Processor cannot just deallocate that structure: it is still referenced from A. This means you need at least reference counting to make sure you don't free memory too early. That's a reference counting garbage collector.
I don't know what kind of garbage collection is implemented in any specific Prolog implementation (there are many approaches, some better suited to Prolog than others; in a not completely unrelated context, 25 years of Java have produced a whole family of collectors), but you need to use one, not necessarily a reference-counting one.
(Cyclic structures are only special to garbage collection because reference counting garbage collection algorithms are unable to free up cyclic structures, as all the cells in a loop have a reference count of at least 1.)
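(To make the cyclic case concrete: with the occurs check left at its default of false, ordinary unification already builds such a loop. In SWI-Prolog:
?- X = f(X).
X = f(X).
Here the f/1 cell references itself, so a pure reference counter would never see its count drop to zero.)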
(Also, IMHO, never trust a programming language in which you have to call free yourself. There is probably a variation of Greenspun's tenth rule ("Any sufficiently complicated C or Fortran program contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Common Lisp.") to the effect that any program written in a programming language in which you have to call free yourself contains an ad hoc, informally-specified, bug-ridden, slow implementation of a garbage collection algorithm.)
(OTOH, Rust seems to take kind of a middle way, offloading some effort onto the developer but having the advantage of being able to decide whether to free memory when a variable goes out of scope. But Rust is not Prolog.)

This here would be safe:
:- set_prolog_flag(gc, false).
But if your Prolog system has garbage collection, switching it off might not be a good idea, since even with the occurs check set to true, there can still be garbage from temporary results. And having garbage continuously removed can improve locality, i.e. your memory gets thrashed less by cache misses:
p(X,Y) :- q(X,Z), r(Z,Y).
The variable Z might point to some Prolog term which is only temporarily needed. Most modern Prolog systems can remove such Prolog terms through so-called environment trimming.
But an occurs check opens up the path to a special kind of garbage collection. Namely since no more cyclic terms can appear, it is possible to use reference counting. An old Prolog system that had reference counting was this beast here:
xpProlog: High Performance Extended Pure Prolog - Lüdemann, 1988
https://open.library.ubc.ca/media/download/pdf/831/1.0051961/1
Jekejeke Prolog also still does reference counting. A problem for reference counting is attributed variables, which can nevertheless create a cyclic term; for example, a freeze/2 as follows creates a cycle through the frozen goal back to the variable:
?- freeze(X, (write(X), nl)).
Edit 04.09.2021:
What also might demand garbage collection is setarg/3. It can create cycles which cannot be so easily removed by reference counting.
?- X = f(0), setarg(1,X,X).
X = f(X).
Since setarg/3 is backtrackable, the cycle would go away on backtracking, at least I guess so. But the cycle might still bother us when we are deep in backtracking and running out of memory. Or the cycle might not go away through backtracking at all, because we used the non-backtrackable version nb_setarg/3.

Related

What makes a DCG predicate expensive?

I'm building a Definite Clause Grammar to parse 20,000 pieces of semi-natural text. As the size of my database of predicates grows (now up to 1,200 rules), parsing a string can take quite a long time -- particularly for strings that are not currently interpretable by the DCG, due to syntax I haven't yet encoded. The current worst-case is 3 minutes for a string containing 30 words. I'm trying to figure out how I can optimize this, or if I should just start researching cloud computing.
I'm using SWI-Prolog, and that provides a "profile" goal, which provides some statistics. I was surprised to find that the simplest rules in my database are taking up the majority of execution time. My corpus contains strings that represent numbers, and I want to capture these in a scalar/3 predicate. These are hogging ~50-60% of total execution time.
At the outset, I had 70 lines in my scalars.pl, representing the numeric and natural language representations of the numbers in my corpus. Like so:
scalar(scalar(3)) --> ["three"].
scalar(scalar(3)) --> ["3"].
scalar(scalar(4)) --> ["four"].
scalar(scalar(4)) --> ["4"].
...and so on.
Thinking that the length of the file was the problem, I put in a new rule that would automatically parse any numeric representations:
scalar(scalar(X)) --> [Y], { atom_number(Y, X) }.
Thanks to that, I've gone from 70 rules to 31, which helped a bit -- but it wasn't a huge savings. Is there anything more that can be done? My feeling is maybe not, because what could be simpler than a single atom in a list?
These scalars are called in a lot of places throughout the grammar, and I assume that's the root of the issue. Though they're simple rules, they're everywhere, and unavoidably so. A highly general grammar just won't work for my application, and I wouldn't be surprised if I end up with 3,000 rules or more.
I've never built a DCG this large, so I'm not sure how much I can expect in terms of performance. Happy to take any kind of advice on this one: is there some other way of encoding these rules? Should I accept that some parses will take a long time, and figure out how to run parses in parallel?
Thank you in advance!
EDIT: I was asked to provide a reproducible example, but to do that I'd have to link SO to the entire project, since this is an issue of scale. Here's a toy version of what I'm doing for the sake of completeness. Just imagine there were large files describing hundreds of nouns, hundreds of verbs, and hundreds of syntactic structures.
sent(sent(VP, NP)) --> vp(VP), np(NP).
vp(vp(V)) --> v(V).
np(np(Qty, Noun)) --> qty(Qty), n(Noun).
scalar(scalar(3)) --> ["three"].
scalar(scalar(X)) --> [Y], { atom_number(Y, X) }.
qty(qty(Scalar)) --> scalar(Scalar).
v(v(eat)) --> ["eat"].
n(n(pie)) --> ["pie"].
One aspect of your program that you might investigate is to make sure individual predicates succeed quickly and fail quickly. This is particularly useful to check for predicates that have many clauses.
For instance, when scalar(X) is evaluated on a token that is not a scalar then the program will have to try 31 (by your last count) times before it can determine that scalar//1 fails. If the structure of your program is such that scalar(X) is checked against every token then this could be very expensive.
Further, if scalar(X) does happen to find that a token matches but a subsequent goal fails then it appears that your program will retry the scalar(X) until all of the scalar//1 clauses have been attempted.
The judicious use of cut (!) or if-then-else (C1->G1;C2->G2;G3) can provide a tremendous performance improvement.
Or you can structure your predicates so that they rely on indexing to select the appropriate clause. E.g.:
scalar(scalar(N)) --> [Token], {scalar1(Token, scalar(N))}.
scalar1("3", scalar(3)) :- !.
scalar1(Y, scalar(X)) :- atom_number(Y, X).
This uses both cut and clause indexing (if the compiler provides it) with the scalar1/2 predicate.
EDIT: You should read R. A. O'Keefe's The Craft of Prolog. It is an excellent guide to the practical aspects of Prolog.
Here's how I've tackled performance and optimization problems as a novice Prologer.
1.) Introduce timeouts to your application. I'm calling Prolog via the subprocess module in Python 3.6, which lets you set a timeout. As I've worked with my code base more, I've got a pretty good sense of how long a successful parse might take, and can assume anything taking longer is not going to work. (A Prolog-side alternative is sketched after this list.)
2.) Make use of the graphical profiler that's packaged with the SWI-Prolog IDE. This gives a lot more insight, as you can bounce around the call tree. I found it particularly helpful to sort predicates by the execution time of their children. Before, I was thinking about it like pollution in a river. "Man, there's a lot of junk floating in here," I thought, not considering that upstream some factories were contributing a lot of that junk.
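Regarding point 1, a timeout can also be imposed from within SWI-Prolog itself, for example with call_with_time_limit/2 from library(time). A minimal sketch (the wrapper name and the 10-second limit are my own; sent//1 is the toy grammar above):
:- use_module(library(time)).

parse_with_timeout(Tokens, Tree) :-
    catch(call_with_time_limit(10, phrase(sent(Tree), Tokens)),
          time_limit_exceeded,
          fail).   % treat an over-long parse as a plain failure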
As for how to optimize a DCG without hurting the semantics & expressivity of one's grammar, I think that will have to be a question for another Stack Overflow. And as for my initial question, that's still an open one -- predicates that seem simple (to me) take quite a while.

Space / time requirements for ISO-Prolog processor compliance

All implementations of the functional programming language Scheme are required to perform tail-call optimization whenever it is applicable.
Does ISO Prolog have this and/or similar requirements?
It's clear to me that Prolog processor features like first argument principal functor indexing and atom garbage collection are widely adopted, but are not prescribed by the ISO standard.
But what about Prolog's cut?
Make believe that some Prolog system gets the semantics right, but does not guarantee that ...
rep.
rep :- !, rep.
rep.
?- rep, false.
... can run forever with constant stack space?
Could that system still be ISO-Prolog compliant?
Whenever you are reading a standard, first look at its scope (domaine d'application, область применения, Anwendungsbereich), that is, check whether or not the standard applies to what you want to know. And in 13211-1:1995, 1 Scope there is a note:
NOTE - This part of ISO/IEC 13211 does not specify:
a) the size or complexity of Prolog text that will exceed the
capacity of any specific data processing system or language
processor, or the actions to be taken when the corresponding
limits are exceeded;
b) the minimal requirements of a data processing system
that is capable of supporting an implementation of a Prolog
processor;
...
Strictly speaking, this is only a note. But if you leaf through the standard, you will realize that there are no such requirements. For a similar situation see also this answer.
Further, resource errors (7.12.2 h) and system errors may occur at "any stage of execution".
Historically, the early implementations of DEC-10 Prolog did not contain last-call optimization, and programmers invested a lot of effort into either using failure-driven loops or achieving logarithmic stack usage.
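For readers who have not seen one, a failure-driven loop looks roughly like this (a sketch using between/3, which most systems provide): the work is redone on backtracking rather than through recursion, so no stack frames pile up.
print_squares(N) :-
    between(1, N, I),
    Sq is I*I,
    write(Sq), nl,
    fail.          % force backtracking into between/3
print_squares(_).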
In your example rep, a conforming system may run out of space. And that overflow may be signaled with a resource error, but even that is not required since the system might bail out with a system error. What is more irritating to me is the following program
rep2 :- rep2.
rep2.
Even this program may run forever without ever running out of space! And this even though nobody cuts away the extra choice point.
In summary, recall that conformance with a standard is just a precondition for a working system.

Attributed variables: library interfaces / implementations / portability

When I was skimming some Prolog-related questions recently, I stumbled upon this answer by @mat to the question How to represent directed cyclic graph in Prolog with direct access to neighbour vertices.
So far, my personal experience with attributed variables in Prolog has been very limited. But the use case given by @mat sparked my interest. So I tried using it for answering another question, ordering lists with constraint logic programming.
First, the good news: My first use of attributed variables worked out like I wanted it to.
Then, the not so good news: When I had posted my answer, I realized there were several APIs and implementations of attributed variables in Prolog.
I feel I'm in over my head here... In particular I want to know the following:
Which APIs are in widespread use? Up to now, I have found two: SICStus and SWI.
Which features do the different attributed variable implementations offer? The same ones? Or does one subsume the other?
Are there differences in semantics?
What about the actual implementation? Are some more efficient than others?
Can using attributed variables be (or is it already) a portability issue?
Lots of question marks, here... Please share your experience / stance?
Thank you in advance!
Edit 2015-04-22
Here's a code snippet of the answer mentioned above:
init_att_var(X, Z) :-
    put_attr(Z, value, X).

get_att_value(Var, Value) :-
    get_attr(Var, value, Value).
So far I "only" use put_attr/3 and get_attr/3, but---according to the SICStus Prolog documentation on attributed variables---SICStus offers put_attr/2 and get_attr/2.
So even this very shallow use-case requires some emulation layer (one way or the other).
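One possible shape of such an emulation layer, just as a sketch (the put_value/2 and get_value/2 names are invented; conditional compilation as supported by recent SWI-Prolog and SICStus versions):
:- if(current_prolog_flag(dialect, swi)).
put_value(Var, Value) :- put_attr(Var, value, Value).
get_value(Var, Value) :- get_attr(Var, value, Value).
:- elif(current_prolog_flag(dialect, sicstus)).
:- use_module(library(atts)).
:- attribute value/1.
put_value(Var, Value) :- put_atts(Var, value(Value)).
get_value(Var, Value) :- get_atts(Var, value(Value)).
:- endif.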
I would like to focus on one important general point I noticed when working with different interfaces for attributed variables: When designing an interface for attributed variables, an implementor should also keep in mind the following:
Is it possible to take attributes into account when reasoning about simultaneous unifications, as in [X,Y] = [0,1]?
This is possible for example in SICStus Prolog, because such bindings are undone before verify_attributes/3 is called. In the interface provided by hProlog (attr_unify_hook/2, called after the unification and with all bindings already in place), it is hard to take the (previous) attributes of Y into account when reasoning about the unification of X in attr_unify_hook/2, because Y is no longer a variable at this point! This may be sufficient for solvers that can make decisions based on ground values alone, but it is a serious limitation for solvers that need additional data, typically stored in attributes, to decide whether a unification should succeed -- data which is then no longer easily available. One obvious example: Boolean unification with decision diagrams.
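For readers unfamiliar with the hook-based interface, here is roughly what attr_unify_hook/2 could look like for the value attribute from the question (a sketch with invented semantics: the variable may only be bound to its stored value):
value:attr_unify_hook(Stored, Other) :-
    (   get_attr(Other, value, OtherValue)
    ->  Stored == OtherValue                 % both attributed: values must agree
    ;   var(Other)
    ->  put_attr(Other, value, Stored)       % plain variable: pass the attribute on
    ;   Stored == Other                      % nonvar: must equal the stored value
    ).
Note that the hook only ever sees Other after the binding has already taken place, which is exactly the limitation described above.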
As of 2016, the verify-attributes branch of SWI-Prolog also supports verify_attributes/3, thanks to great implementation work by Douglas Miles. The branch is ready for testing and intended to be merged into master as soon as it works correctly and efficiently. For compatibility with hProlog, the branch also supports attr_unify_hook/2: It does so by rewriting such definitions to the more general verify_attributes/3 at compilation time.
Performance-wise, it is clear that there may be a downside to verify_attributes/3, because making several variables ground at the same time may let you see sooner (in attr_unify_hook/2) that a unification cannot succeed. However, I will gladly and at any time exchange this typically negligible advantage for the improved reliability, ease of use, and increased functionality that the more general interface gives you -- behaviour that is in any case already standard in SICStus Prolog, which, on top of its generality, is also one of the faster Prolog systems around.
SICStus Prolog also features an important predicate called project_attributes/2: It is used by the toplevel to project constraints to query variables. SWI-Prolog also supports this in recent versions.
There is also one huge advantage of the SWI interface: The residual goals that attribute_goals//1 and hence copy_term/3 give you are always a list. This helps users to avoid defaultyness in their code, and encourages a more declarative interface, because a list of pure constraint goals cannot contain control structures.
Interestingly, neither interface lets you interpret unifications other than syntactically. Personally, I think there are cases where you may want to interpret unifications differently than syntactically, however, there may also be good arguments against that.
The other interface predicates for attributed variables are mostly easily interchangeable with simple wrapper predicates for different systems.
Jekejeke Minlog has state-less or thin attribute variables. Well, not exactly: an attribute variable can have zero, one or many hooks, which are allowed to be closures and hence can carry a little state.
But typically an implementation manages the state elsewhere. For this purpose Jekejeke Minlog allows creating reference types from variables, so that they can be used as indexes into tables.
The full potential is unleashed if this is combined with trailing and/or forward chaining. As an example we have implemented CLP(FD). There is also a little solver tutorial.
The primitive ingredients in our case are:
1) State-less Attribute Variables
2) Trailing and Variable Keys
3) Continuation Queue
The attribute variable hooks might have binding effects, up to extending the continuation queue, but they are executed only once. Goals from the continuation queue can be non-deterministic.
There are some additional layers before realizing applications; these are mostly aggregations of the primitives that make changes temporarily.
The main applications so far are open source:
a) Finite Domain Constraint Solver
b) Herbrand Constraints
c) Goal Suspension
Bye
An additional perspective on attributed variable libraries is how many attributes can be defined per module. In the case of SWI-Prolog/YAP and citing SWI documentation:
Each attribute is associated to a module, and the hook
(attr_unify_hook/2) is executed in this module.
This is a severe limitation for implementers of libraries such as CLP(FD), as it forces using additional modules for the sole purpose of having multiple attributes, instead of being able to define as many attributes as required in the module implementing the library. This limitation doesn't exist in the SICStus Prolog interface, which provides a directive attribute/1 that allows the declaration of an arbitrary number of attributes per module.
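Roughly, the SICStus-style declaration looks like this (a sketch with invented module and attribute names), with several attributes living in one and the same module:
:- module(my_clpfd, []).
:- use_module(library(atts)).
:- attribute domain/1, suspensions/1.

set_domain(Var, Dom) :-
    put_atts(Var, domain(Dom)).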
You can find one of the oldest and most elaborate implementations of attributed variables in ECLiPSe, where it forms part of the wider infrastructure for implementing constraint solvers.
The main characteristics of this design are:
attributes must be declared, and in return the compiler supports efficient access
a syntax for attributed variables, so that they can be read and written
a more complete set of handlers for attribute operations, so that attributes are not only taken into account for unification, but also for other generic operations such as term copying and subsumption tests
a clear separation between the concepts of variable attribute and suspended goals
used in over a dozen of ECLiPSe's libraries
This paper (section 4) and the ECLiPSe documentation have more details.

O(1) term look up

I wish to be able to look up the existence of a term as fast as possible in my current Prolog program, without the Prolog engine traversing all the terms until it finally reaches the existing term.
I have not found any proof of it, but I assume that given
animal(lion).
animal(zebra).
...
% thousands of other animals
...
animal(tiger).
The SWI-Prolog engine will have to go through thousands of animals trying to unify with tiger in order to confirm that animal(tiger) is in my Prolog database.
In other languages I believe a HashSet would solve this problem, enabling an O(1) lookup... However I cannot seem to find any hash sets or hash tables in the SWI-Prolog documentation.
Is there an SWI-Prolog library for hash sets, or can I somehow build one myself using term_hash/2?
Bonus info: I will most likely have to do the lookup on some dynamically added data, either added to a hash-set data structure or using assertz/1.
All serious Prolog systems perform this O(1) lookup via hashing automatically and implicitly for you, so you do not have to do it yourself.
It is called argument-indexing, and you find this explained in all good Prolog books. See also "JIT (just-in-time) indexing" in more recent versions of many Prolog systems, including SWI. Indexing is applied to dynamically added clauses too, and is one reason why assertz/1 is slowed down and therefore not a good choice for data that changes more often than it is read.
You can also easily test this yourself by creating databases with increasingly more facts and seeing that the lookup time remains roughly constant when argument indexing applies.
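A rough way to run such a test in SWI-Prolog (predicate names are my own): assert increasingly many facts and time a lookup of the last one; with argument indexing the lookup time stays roughly flat as N grows.
:- dynamic animal/1.

make_facts(N) :-
    forall(between(1, N, I),
           (   atom_concat(animal_, I, A),
               assertz(animal(A))
           )).

test(N) :-
    make_facts(N),
    atom_concat(animal_, N, Last),
    time(animal(Last)).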
When the built-in first argument indexing is not enough (note that some Prolog systems also provide multi-argument indexing), depending on the system, you can construct your own indexing scheme using a built-in or library term hashing predicate. In the case of ECLiPSe, GNU Prolog, SICStus Prolog, SWI-Prolog, and YAP, look into the documentation of the term_hash/4 predicate.
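If you do roll your own scheme, the general shape is: hash the key term, store the hash as the first argument (which the system then indexes), and hash again at lookup time. A sketch using term_hash/2, which the question already mentions (note that it only hashes ground terms; the predicate names are invented):
:- dynamic fact_by_hash/2.

store(Term) :-
    term_hash(Term, Hash),              % Term must be ground
    assertz(fact_by_hash(Hash, Term)).

lookup(Term) :-
    term_hash(Term, Hash),
    fact_by_hash(Hash, Term).           % first-argument indexing on the integer Hash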

Does "Call by name" slow down Haskell?

I assume it doesn't.
My reason is that Haskell is purely functional (setting the I/O monad aside), so they could have made every "call by name" use the same evaluated value whenever the "name"s are the same.
I don't know anything about the implementation details but I'm really interested.
Detailed explanations will be much appreciated :)
BTW, I tried Google; it was quite hard to find anything useful.
First of all, Haskell is a specification, not an implementation; the report does not actually require use of call-by-name evaluation, or lazy evaluation for that matter. Haskell implementations are only required to be non-strict, which does rule out call-by-value and similar strategies.
So, strictly (ha, ha) speaking, evaluation strategies can't slow down Haskell. I'm not sure what can slow down Haskell, though clearly something has or else it wouldn't have taken 12 years to get the next version of the Report out after Haskell 98. My guess is that it involves committees somehow.
Anyway, "lazy evaluation" refers to a "call by need" strategy, which is the most common implementation choice for Haskell. This differs from call-by-name in that if a subexpression is used in multiple places, it will be evaluated at most once.
The details of what qualifies as a subexpression that will be shared are a bit subtle and probably somewhat implementation-dependent, but to use an example from GHC Haskell: consider the function cycle, which repeats an input list infinitely. A naive implementation might be:
cycle xs = xs ++ cycle xs
This ends up being inefficient because there is no single cycle xs expression that can be shared, so the resulting list has to be constructed continually as it's traversed, allocating more memory and doing more computation each time.
In contrast, the actual implementation looks like this:
cycle xs = xs' where xs' = xs ++ xs'
Here the name xs' is defined recursively as itself appended to the end of the input list. This time xs' is shared, and evaluated only once; the resulting infinite list is actually a finite, circular linked list in memory, and once the entire loop has been evaluated no further work is needed.
In general, GHC will not memoize functions for you: given f and x, each use of f x will be re-evaluated unless you give the result a name and use that. The resulting value will be the same in either case, but the performance can differ significantly. This is mostly a matter of avoiding pessimizations--it would be easy for GHC to memoize things for you, but in many cases this would cost large amounts of memory to gain tiny or nonexistent amounts of speed.
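A tiny illustration of that point (function names are my own; behaviour as typically observed with GHC without aggressive optimisation):
sumTo :: Integer -> Integer
sumTo n = sum [1 .. n]

twiceUnshared :: Integer -> Integer
twiceUnshared n = sumTo n + sumTo n   -- the call is made twice; nothing is memoized

twiceShared :: Integer -> Integer
twiceShared n = s + s                 -- named once, so computed at most once
  where
    s = sumTo n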
The flip side is that shared values are retained; if you have a data structure that's very expensive to compute, naming the result of constructing it and passing that to functions using it ensures that no work is duplicated--even if it's used simultaneously by different threads.
You can also pessimize things yourself this way--if a data structure is cheap to compute and uses lots of memory, you should probably avoid sharing references to the full structure, as that will keep the whole thing alive in memory as long as anything could possibly use it later.
Yes, it does, somewhat. The problem is that Haskell can't, in general, evaluate a value early (e.g. doing so might raise an exception that lazy evaluation would have avoided), so it sometimes needs to keep a thunk (code for calculating the value) instead of the value itself, which uses more memory and slows things down. The compiler tries to detect cases where this can be avoided, but it's impossible to detect all of them.

Resources