How to choose between bagof, setof and findall in Prolog - prolog

How does one choose between bagof, setof and findall? Are there any important differences? Which is most commonly used and which is the safest? Thanks for your comments/answers.
I checked the SWI-Prolog manual page on findall/3 and found them to be very similar.

That's a great question!
I am not attempting to give an exhaustive answer, but I would like to suggest a few lines of thought that can help you to understand these predicates better:
Think about which of these predicates are more fundamental in the sense of what can be expressed in terms of what. For example, can you express findall/3 in terms of setof/3? And what about the other way around? This helps to see what you need to provide at least to implement all these predicates.
Think about which declarative properties are preserved by these predicates. For example, does the order in which the solutions are found influence the results? For which of these predicates precisely? Can any of these predicates fail? In which cases precisely?
Think about the space and time complexity of each of these predicates. Which of these predicates, if any, can be implemented and used more efficiently than others? And at what cost and tradeoffs?
In addition, I recommend you read Richard O'Keefe's book The Craft of Prolog for valuable information about these predicates.

an important difference I found between findall/3 and bagof/3, is that the latter doesn't make copies of the terms it accumulates. This can be crucial, for instance, if your list collects attributed variables, like when modelling a problem with library(clpfd).

Related

Dealing with complicated prolog loops

I am using Prolog to encode some fairly complicated rules in a project of mine. There is a lot of recursion, including mutual recursion. Part of the rules look something like this:
pred1(X) :- ...
pred1(X) :- someguard(X), pred2(X).
pred2(X) :- ...
pred2(X) :- othercondition(X), pred1(X).
There is a fairly obvious infinite loop between pred1 and pred2. Unfortunately, the interaction between these predicates is very complicated and difficult to isolate. I was able to eliminate the infinite loop in this instance by passing around a list of objects that have been passed to pred1, but this is extremely unwieldy! In fact, it largely defeats the purpose of using Prolog in this application.
How can I make Prolog avoid infinite loops? For example, if in the course of proving pred1(foo) it tries to prove pred1(foo) as a sub-goal, fail and backtrack.
Is it possible to do this with meta-interpreters?
Yes, you can use meta-interpreters for this purpose, as mat suggests. But for the normal use case, that is going far beyond the regular effort.
What you may consider instead is to separate the looping functionality from your actual logic using higher-order predicates. That is a very safe way to go — SWI even checks if all the uses have a corresponding definition. This checking is either invoked when typing make. or check.
As an example, consider closure0/3 and path/4 which both handle loop checks "once and forever".
One feature that is available in some Prolog systems and that may help you to solve such issues is called tabling. See for example the related question and prolog-tabling.
If tabling is not available, then yes, meta-interpreters can definitely help a lot with this. For example, you can change the executation strategy etc. with a meta-interpreter.
In SWI-Prolog, also check out call_with_inference_limit/3 to robustly limit the execution, independent of CPU type and system load.
Related and also useful are termination analyzers like cTI: They allow you to statically derive termination conditions.

Why are DCGs defined as a compact way to describe lists?

In my opinion, define the DCGs (Definite Clause Grammars) as a compact way to describe the lists in Prolog, is a poorly way to define them. As far as I know, the DCGs are not only used in Prolog, but also in other programming languages, such as Mercury.
In addition, they are called DCGs, because they represent a grammar in a set of definite clauses (Horn clauses), the basis of logic programming.
So why if an entire Prolog program can be written using definite clauses, DCGs are solely defined as a compact way to describe the lists in Prolog?
Note: The doubt arises from the description for the tag dcg given by SO.
The extended info from the DCG tag wiki provides additional information, which I think is both correct and also in close agreement with your first point:
"DCGs are usually associated with Prolog, but similar languages such
as Mercury also include DCGs."
Regarding your second point: Emphasizing the close association with Prolog lists is in my opinion well justified, since a DCG indeed always describes a list, and typically also quite compactly.

O(1) term look up

I wish to be able to look up the existence of a term as fast as possible in my current prolog program, without the prolog engine traversing all the terms until it finally reaches the existing term.
I have not found any proof of it.. but I assume that given
animal(lion).
animal(zebra).
...
% thousands of other animals
...
animal(tiger).
The swi-prolog engine will have to go through thousands of animals trying to unify with tiger in order to confirm that animal(tiger) is in my prolog database.
In other languages I believe a HashSet would solve this problem, enabling a O(1) look up... However I cannot seem to find any hashsets or hashtables in the swi-prolog documentation.
Is there a swi-prolog library for hashsets, or can I somehow built it myself using term_hash\2?
Bonus info, I will most likely have to do the look up on some dynamically added data, either added to a hashset data-structure or using assertz
All serious Prolog systems perform this O(1) lookup via hashing automatically and implicitly for you, so you do not have to do it yourself.
It is called argument-indexing, and you find this explained in all good Prolog books. See also "JIT (just-in-time) indexing" in more recent versions of many Prolog systems, including SWI. Indexing is applied to dynamically added clauses too, and is one reason why assertz/1 is slowed down and therefore not a good choice for data that changes more often than it is read.
You can also easily test this yourself by creating databases with increasingly more facts and seeing that the lookup time remains roughly constant when argument indexing applies.
When the built-in first argument indexing is not enough (note that some Prolog systems also provide multi-argument indexing), depending on the system, you can construct your own indexing scheme using a built-in or library term hashing predicate. In the case of ECLiPSe, GNU Prolog, SICStus Prolog, SWI-Prolog, and YAP, look into the documentation of the term_hash/4 predicate.

Implementing arithmetic for Prolog

I'm implementing a Prolog interpreter, and I'd like to include some built-in mathematical functions (sum, product, etc). For example, I would like to be able to make calculations using knowledge bases like this one:
NetForce(F) :- Mass(M), Acceleration(A), Product(M, A, F)
Mass(10) :- []
Acceration(12) :- []
So then I should be able to make queries like ?NetForce(X). My question is: what is the right way to build functionality like this into my interpreter?
In particular, the problem I'm encountering is that, in order to evaluate Sum, Product, etc., all their arguments have to be evaluated (i.e. bound to numerical constants) first. For example, while to code above should evaluate properly, the permuted rule:
NetForce(F) :- Product(M, A, F), Mass(M), Acceleration(A)
wouldn't, because M and A aren't bound when the Product term is processed. My current approach is to simply reorder the terms so that mathematical expressions appear last. This works in simple cases, but it seems hacky, and I would expect problems to arise in situations with multiple mathematical terms, or with recursion. Is there a better solution?
The functionality you are describing exists in existing systems as constraint extensions. There is CLP(Q) over the rationals, CLP(R) over the reals - actually floats, and last but not least CLP(FD) which is often extended to a CLP(Z). See for example
library(clpfd).
In any case, starting a Prolog implementation from scratch will be a non-trivial effort, you will have no time to investigate what you want to implement because you will be inundated by much lower level details. So you will have to use a more economical approach and clarify what you actually want to do.
You might study and implement constraint languages in existing systems. Or you might want to use a meta-interpreter based approach. Or maybe you want to implement a Prolog system from scratch. But don't expect that you succeed in all of it.
And to save you another effort: Reuse existing standard syntax. The syntax you use would require you to build an extra parser.
You could use coroutining to delay the evaluation of the product:
product(X, A, B) :- freeze(A, freeze(B, X is A*B))
freeze/2 delays the evaluation of its second argument until its first argument is ground. Used nested like this, it only evaluates X is A*B after both A and B are bound to actual terms.
(Disclaimer: I'm not an expert on advanced Prolog topics, there might be an even simpler way to do this - e.g. I think SICStus Prolog has "block declarations" which do pretty much the same thing in a more concise way and generalized over all declarations of the predicate.)
Your predicates would not be clause order independent, which is pretty important. You need to determine usage modes of your predicates - what will the usage mode of NetForce() be? If I were designing a predicate like Force, I would do something like
force(Mass,Acceleration,Force):- Force is Mass * Acceleration.
This has a usage mode of +,+,- meaning you give me Mass and Acceleration and I will give you the Force.
Otherwise, you are depending on the facts you have defined to unify your variables, and if you pass them to Product first they will continue to unify and unify and you will never stop.

Is there a library/technique to collect statistics for optimal clause ordering in Prolog?

I'm writing a program where I need to see if strings match a particular pattern. Right now I've got this implemented in Prolog as a rule matchesPattern(S), with well over 20 different definition.
I end up running all the binary strings up to a certain length through the pattern checking predicate. The program is fairly slow (as Prolog often is), and since there are so many different definitions, I'd ideally like to order them so the ones most matched are earliest in the ordering, and thus matched first by Prolog, avoid backtracking as much as I can.
I'm using SWI Prolog right now, but I have access to SICStus, so I'm willing to use it or any Prolog interpreter I can get for free.
SWI-Prolog has profile/3 and show_profile/2 that could help with your task.
Left factoring your pattern rules, and applying cuts, could improve the runtime, if there are common parts between patterns. Such analysis should be combined with the statistics.
You shoudl think about using DCG and cuts.
You should look into cuts. The prolog syntax for this is:
!

Resources