Prolog unknowns in the knowledge base

I am trying to learn Prolog and it seems the completeness of the knowledge is very important because obviously if the knowledge base does not have the fact, or the fact is incorrect, it will affect the query results. I am wondering how best to handle unknown details of a fact. For example,
%life(<name>,<birth year>,<death year>)
%ruler(<name>,<precededBy>,<succeededBy>)
Some people I add to the knowledge base would still be alive, so their year of death is not known. In the example of rulers, the first ruler did not have a predecessor and the current ruler does not have a successor. When there are unknowns like these, should I put in some kind of "unknown" flag value, or can the detail be left out? In the case of the ruler with no known predecessor, would the fact look like this?
ruler(great_ruler,,second_ruler).

Well, you have a few options.
In this particular case, I would question your design. Rather than putting both previous and next on the ruler, you could just put next and use a rule to find the previous:
ruler(great_ruler, second_ruler).
ruler(second_ruler, third_ruler).
previous(Ruler, Previous) :- ruler(Previous, Ruler).
This predicate will simply fail for great_ruler, which is probably appropriate—there wasn't anyone before them, after all.
In other cases, it may not be straightforward. So you have to decide if you want to make an explicit value for unknown or use a variable. Basically, do you want to do this:
ruler(great_ruler, unknown, second_ruler).
or do you want to do this:
ruler(great_ruler, _, second_ruler).
In the first case, you might get spurious answers featuring unknown, unless you write some custom logic to catch it. But I actually think the second case is worse, because that empty variable will unify with anything, so lots of queries will produce weird results:
ruler(_, SucceededHimself, SucceededHimself)
will succeed, for instance, unifying SucceededHimself = second_ruler, which probably isn't what you want. You can check for variables using var/1 and ground/1, but at that point you're tampering with Prolog's search and it's going to get more complex. So a blank variable is not as much like NULL in SQL as you might want it to be.
In summary:
prefer representations that do not lead to this problem
if forced, use a special value
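As a minimal sketch of the "special value" option, the sentinel can be hidden behind small wrapper predicates so that ordinary queries never see it. The three-argument facts and the names predecessor/2 and successor/2 are assumptions made up for this illustration, not part of the original question:

```prolog
% Hypothetical facts: the atom `unknown` marks a missing neighbour.
ruler(great_ruler,  unknown,      second_ruler).
ruler(second_ruler, great_ruler,  third_ruler).
ruler(third_ruler,  second_ruler, unknown).

% predecessor(?Ruler, ?Previous): true only for a real, known predecessor.
predecessor(Ruler, Previous) :-
    ruler(Ruler, Previous, _),
    Previous \== unknown.

% successor(?Ruler, ?Next): true only for a real, known successor.
successor(Ruler, Next) :-
    ruler(Ruler, _, Next),
    Next \== unknown.
```

With these facts, ?- predecessor(great_ruler, P). simply fails, and ?- ruler(_, S, S). fails as well, unlike the variant that stores an anonymous variable in the fact.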

Related

Reporting *why* a query failed in Prolog in a systematic way

I'm looking for an approach, pattern, or built-in feature in Prolog that I can use to return why a set of predicates failed, at least as far as the predicates in the database are concerned. I'm trying to be able to say more than "That is false" when a user poses a query in a system.
For example, let's say I have two predicates. blue/1 is true if something is blue, and dog/1 is true if something is a dog:
blue(X) :- ...
dog(X) :- ...
If I pose the following query to Prolog and foo is a dog, but not blue, Prolog would normally just return "false":
?- blue(foo), dog(foo).
false.
What I want is to find out why the conjunction of predicates was not true, even if it is an out of band call such as:
?- getReasonForFailure(X).
X = not(blue(foo)).
I'm OK if the predicates have to be written in a certain way, I'm just looking for any approaches people have used.
The way I've done this to date, with some success, is by writing the predicates in a stylized way and using some helper predicates to find out the reason after the fact. For example:
blue(X) :-
    recordFailureReason(not(blue(X))),
    isBlue(X).
And then implementing recordFailureReason/1 such that it always remembers the "reason" that happened deepest in the stack. If a query fails, whatever failure happened the deepest is recorded as the "best" reason for failure. That heuristic works surprisingly well for many cases, but does require careful building of the predicates to work well.
Any ideas? I'm willing to look outside of Prolog if there are predicate logic systems designed for this kind of analysis.
As long as you remain within the pure monotonic subset of Prolog, you may consider generalizations as explanations. To take your example, the following generalizations might be thinkable depending on your precise definition of blue/1 and dog/1.
?- blue(foo), * dog(foo).
false.
In this generalization, the entire goal dog(foo) was removed. The prefix * is actually a predicate defined like :- op(950, fy, *). *(_).
Informally, above can be read as: Not only this query fails, but even this generalized query fails. There is no blue foo at all (provided there is none). But maybe there is a blue foo, but no blue dog at all...
?- blue(_X/*foo*/), dog(_X/*foo*/).
false.
Now we have generalized the query by replacing foo with the new variable _X. In this manner the sharing between the two goals is retained.
There are more such generalizations possible like introducing dif/2.
This technique can be both manually and automatically applied. For more, there is a collection of example sessions. Also see Declarative program development in Prolog with GUPU
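To try the generalization technique described above, a small self-contained file might look like the following sketch. The facts blue(sky) and dog(foo) are hypothetical stand-ins for the asker's definitions; only the operator declaration and *(_) come from the answer itself:

```prolog
:- op(950, fy, *).
*(_).   % a goal prefixed with * always succeeds, so it is effectively removed

% Hypothetical database: foo is a dog, but nothing named foo is blue.
blue(sky).
dog(foo).

% ?- blue(foo), dog(foo).     % fails
% ?- blue(foo), * dog(foo).   % still fails: blue(foo) alone is already enough to fail
% ?- * blue(foo), dog(foo).   % succeeds: dog(foo) is not the culprit
```

The smallest generalized query that still fails points at the goal responsible, here blue(foo).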
Some thoughts:
Why did the logic program fail: The answer to "why" is of course "because there is no variable assignment that fulfills the constraints given by the Prolog program".
This is evidently rather unhelpful, but it is exactly the case of the "blue dog": there is no such thing (at least in the problem you model).
In fact the only acceptable answer to the blue dog problem is obtained when the system goes into full theorem-proving mode and outputs:
blue(X) <=> ~dog(X)
or maybe just
dog(X) => ~blue(X)
or maybe just
blue(X) => ~dog(X)
depending on assumptions. "There is no evidence of blue dogs". Which is true, as that's what the program states. So a "why" in this question is a demand to rewrite the program...
There may not be a good answer: "Why is there no x such that x² < 0" is ill-posed and may have as answer "just because" or "because you are restricting yourself to the reals" or "because that 0 in the equation is just wrong" ... so it depends very much.
To make a "why" more helpful, you will have to qualify it somehow. This may be done by structuring the program and extending the query so that additional information collected during proof tree construction bubbles up, but you will have to decide beforehand what information that is:
query(Sought, [Info1, Info2, Info3])
This query will always succeed: for query/2, "success" no longer means "success in finding a solution to the modeled problem" but "success in finishing the computation".
The variable Sought will be the reified answer of the actual query you want answered, i.e. one of the atoms true or false (and maybe unknown if you have had enough of two-valued logic), and Info1, Info2, Info3 will be additional details to help you answer the "why" in case Sought is false.
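One possible shape for such a reified query is sketched below. It is a simplified variation, not the answer's exact query/2: it takes the list of goals as an extra first argument, and it checks each goal on its own, so it is only meaningful for ground goals that share no variables. The facts are hypothetical:

```prolog
% Hypothetical database.
blue(sky).
dog(foo).

% query(+Goals, -Sought, -Infos)
% Always succeeds. Sought is the atom true or false; Infos lists not(G)
% for every goal G in Goals that cannot be proved on its own.
query(Goals, Sought, Infos) :-
    findall(not(G), ( member(G, Goals), \+ call(G) ), Infos),
    (   Infos == []
    ->  Sought = true
    ;   Sought = false
    ).
```

For example, ?- query([blue(foo), dog(foo)], Sought, Infos). succeeds with Sought = false and Infos = [not(blue(foo))].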
Note that much of the time, the desire to ask "why" is down to the mix-up between the two distinct failures: "failure in finding a solution to the modeled problem" and "failure in finishing the computation". For example, you want to apply maplist/3 to two lists and expect this to work but erroneously the two lists are of different length: You will get false - but it will be a false from computation (in this case, due to a bug), not a false from modeling. Being heavy-handed with assertion/1 may help here, but this is ugly in its own way.
In fact, compare with imperative or functional languages w/o logic programming parts: In the event of failure (maybe an exception?), what would be a corresponding "why"? It is unclear.
Addendum
This is a great question, but the more I reflect on it, the more I think it can only be answered in a task-specific way: You must structure your logic program to be why-able, and you must decide what kind of information why should actually return. It will be something task-specific: something about missing information, "if only this or that were true" indications, where "this or that" are chosen from a dedicated set of predicates. This is of course expected, as there is no general way to make imperative or functional programs explain their results (or lack thereof) either.
I have looked a bit for papers on this (including IEEE Xplore and ACM Library), and have just found:
Reasoning about Explanations for Negative Query Answers in DL-Lite which is actually for Description Logics and uses abductive reasoning.
WhyNot: Debugging Failed Queries in Large Knowledge Bases which discusses a tool for Cyc.
I also took a random look at the documentation for Flora-2 but they basically seem to say "use the debugger". But debugging is just debugging, not explaining.
There must be more.

Prolog: Looping through elements of list A and comparing to members of list B

I'm trying to write Prolog logic for the first time, but I'm having trouble. I am to write logic that takes two lists and checks for like elements between the two. For example, consider the predicate similarity/2 :
?- similarity([2,4,5,6,8], [1,3,5,6,9]).
true.
?- similarity([1,2,3], [5,6,8]).
false.
The first query will return true as those two lists have 5 and 6 in common. The second returns false as there are no common elements between the two lists in that query.
I CANNOT use built in logic, such as member, disjoint, intersection, etc. I am thinking of iterating through the first list provided, and checking to see if it matches each element in the second list. Is this an efficient approach to this problem? I will appreciate any advice and help. Thank you so much.
Writing Prolog for the first time can be really daunting, since it is unlike many traditional programming languages that you will most likely encounter; however it is a very rewarding experience once you've got a grasp on this new style of programming! Since you mention that you are writing Prolog for the first time I'll give some general tips and tricks about writing Prolog, and then move onto some hints to your problem, and then provide what I believe to be a solution.
Think Recursively
You can think of every Prolog program that you write to be intrinsically recursive in nature. i.e. you can provide it with a series of "base-cases" which take the following form:
human(john). or wildling(ygritte). (Note the lowercase names: an identifier that starts with a capital letter, such as John, is a variable in Prolog, not a constant.) In my opinion, these facts should always be the first thing that you write. Try to break down the problem into its simplest case and then work from there.
On the other hand, you can also provide it with more complex rules which will look something like this:
contains(X, [_|T]) :- contains(X, T).
The key bit is that writing a rule like this is very much equivalent to writing a recursive function in, say, Python. This rule does a lot of the heavy lifting in looking to see whether a value is contained in a list, but it isn't complete without a "base-case". A complete contains rule would actually be two rules put together:
contains(X, [X|_]).
contains(X, [_|T]) :- contains(X, T).
The big takeaway from this is to try and identify the simple cases of your problem, which can act like base cases in a recursive function, and then try to identify how you want to "recurse" and actually do work on the problem at hand.
Pattern Matching
Part of the great thing about Prolog is the pattern matching system that it has in place. You should 100% use this to your advantage whenever you can -- it is especially helpful when trying to do anything with lists. For example:
head(X, [X|_]).
Will evaluate to true when called thusly: head(1, [1, 2, 3]) because intrinsic in the rule is the matching of X. This sort of pattern matching on the first element of a list is incredibly important and really the key way that you will do any work on lists in Prolog. In my experience, pattern matching on the head of a list will often be one of the "base-cases" that I mentioned beforehand.
Understand The Flow of the Program
Another key component of how Prolog works is that it takes a "top-down" approach to reading code. What I mean by that is that every time a predicate is called, Prolog tries its clauses in the order they appear in the file, from top to bottom, until one succeeds or there are none left (on backtracking it resumes with the next clause). Therefore, the ordering of your clauses is incredibly important. I'm assuming that you know that you can combine goals with a comma to indicate logical AND, but what is maybe more subtle is that writing several clauses for the same predicate acts as a logical OR: an earlier clause is tried before a later one, and if it is recursive it can cause the program to recurse before the later clause is ever considered.
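A tiny sketch of clause order acting as OR, with made-up facts (cat/1, dog/1 and pet/1 are illustrative names only):

```prolog
% Hypothetical facts.
cat(tom).
dog(rex).

% Two clauses for pet/1: X is a pet if X is a cat OR X is a dog.
% The first clause is tried first, so cats are reported before dogs.
pet(X) :- cat(X).
pet(X) :- dog(X).
```

Here ?- pet(X). answers X = tom first and X = rex on backtracking; swapping the two pet/1 clauses swaps the order of the answers.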
Specific Example
Now that I've gotten all of my general advice out of the way, I'll actually reference the given problem. First, I'd write my "base-case". What would happen if you are given two lists whose first elements are the same? If the first element in each list is not the same, then they have to be different. So, you have to look through the second list to see if this element is contained anywhere in the rest of the list. What kind of rule would this produce? OR it could be the case that the first element of the first list is not contained within the second at all, in which case you have to advance once in the first list, and start again with the second list. What kind of rule would this produce?
In the end, I would say that your approach is the correct one to take, and I have provided my own solution below:
similarity([H|_], [H|_]).
similarity([H1|T1], [_|T2]) :- similarity([H1|T1], T2).
similarity([_|T1], [H2|T2]):- similarity(T1, [H2|T2]).
Hope all of this helps in some way!
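For comparison, the asker's original plan (walk through the first list and look each element up in the second) can also be written by reusing a hand-written contains/2, which avoids the forbidden built-ins. This is a sketch of an alternative formulation, not the solution given above:

```prolog
% contains(?X, +List): X occurs somewhere in List.
contains(X, [X|_]).
contains(X, [_|T]) :- contains(X, T).

% similarity(+L1, +L2): some element of L1 also occurs in L2.
similarity([H|_], L2) :- contains(H, L2).        % the head of L1 is in L2
similarity([_|T], L2) :- similarity(T, L2).      % otherwise try the rest of L1
```

In the worst case every element of the first list is compared with every element of the second, so the work grows with the product of the two lengths, which is fine for short lists like the ones in the question.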

Could anybody explain independence in Bayesian nets to me?

Could anybody explain conditional independence to me in the following cases? Could you give me any other appropriate examples for each case?
The first and third examples fall under the rule that says: given the values of all its parents, a variable is conditionally independent of all variables that are not its descendants.
In the first example the random variable JohnCalls (child) is conditionally independent of the random variable Burglary (grandparent), which means that, if we know the state of the random variable Alarm (parent), JohnCalls will act accordingly regardless of whether there was a Burglary or not.
A similar example would be WasPartying -> HomeworkWasntCompleted -> ReceivedBadGrade. Here, regardless of whether you were partying or not, if the homework wasn't completed (the parent is known), you are going to receive a bad grade. So if we have the value of HomeworkWasntCompleted, learning the value of WasPartying doesn't give us any new information about ReceivedBadGrade.
In the third example it's the same: if we know that Alarm is on, MaryCalls won't give us any new hint about JohnCalls, so JohnCalls is conditionally independent of MaryCalls given the value of Alarm.
The second example is a little bit trickier. Although we know all the parents of Burglary (trivially, because it doesn't have any), we can't say that Burglary is conditionally independent of Earthquake given Alarm. If we know that Alarm is on, and we then learn that there was an Earthquake, we would guess that the Alarm was triggered by the Earthquake, and the chance of a Burglary becomes considerably lower. So, in this case Earthquake gives us some information about Burglary. This example doesn't fall under the rule described above, because the two variables in question share a common descendant (Alarm), and that descendant is observed.
A similar example would be WasPartying -> HomeworkWasntCompleted <- DidntUnderstandTopic (pay attention to arrow directions).
Here you can find a nice lecture about conditional independence.
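Written as equations, the three cases discussed above can be summarized as follows. This is a sketch using the abbreviations B = Burglary, E = Earthquake, A = Alarm, J = JohnCalls, M = MaryCalls, which are introduced here for brevity:

```latex
\begin{align*}
&\text{Chain } B \to A \to J \text{ (first example):}
  & P(J \mid A, B) &= P(J \mid A) \\
&\text{Common cause } J \leftarrow A \to M \text{ (third example):}
  & P(J, M \mid A) &= P(J \mid A)\, P(M \mid A) \\
&\text{Common effect } B \to A \leftarrow E \text{ (second example):}
  & P(B, E) &= P(B)\, P(E), \\
& & \text{but in general } P(B \mid A, E) &\neq P(B \mid A)
\end{align*}
```

In the last case the two causes are independent as long as nothing is known about Alarm, and become dependent once Alarm is observed.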

Negated possibilities in Prolog

This is a somewhat silly example but I'm trying to keep the concept pretty basic for better understanding. Say I have the following unary relations:
person(steve).
person(joe).
fruit(apples).
fruit(pears).
fruit(mangos).
And the following binary relations:
eats(steve, apples).
eats(steve, pears).
eats(joe, mangos).
I know that querying eats(steve, F). will return all the fruit that steve eats (apples and pears). My problem is that I want to get all of the fruits that Steve doesn't eat. I know that this: \+eats(steve,F) will just return "no" because F can't be bound to an infinite number of possibilities, however I would like it to return mangos, as that's the only existing fruit possibility that steve doesn't eat. Is there a way to write this that would produce the desired result?
I tried this but no luck here either: \+eats(steve,F), fruit(F).
If a better title is appropriate for this question I would appreciate any input.
Prolog provides only a very crude form of negation; in fact, (\+)/1 simply means "not provable at this point in time of the execution". So you have to take into account the exact moment when (\+)/1 is executed. In your particular case, there is an easy way out:
fruit(F), \+eats(steve,F).
In the general case, however, this is far from being fixed easily. Think of \+ X = Y, see this answer.
Another issue is that negation, even if used properly, will introduce non-monotonic properties into your program: By adding further facts for eats/2 less might be deduced. So unless you really want this (as in this example where it does make sense), avoid the construct.
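Putting the question's facts together with the reordered query gives the following runnable sketch. The wrapper name does_not_eat/2 is an assumption made up for this illustration:

```prolog
person(steve).
person(joe).
fruit(apples).
fruit(pears).
fruit(mangos).
eats(steve, apples).
eats(steve, pears).
eats(joe, mangos).

% does_not_eat(+Person, -Fruit): enumerate the known fruits first,
% then keep those for which eats/2 cannot be proved.
does_not_eat(Person, Fruit) :-
    fruit(Fruit),
    \+ eats(Person, Fruit).
```

Here ?- does_not_eat(steve, F). answers F = mangos. The order of the two goals matters: fruit(Fruit) binds the variable before the negation is tested, whereas \+ eats(steve, F), fruit(F) fails immediately because eats(steve, F) is provable for some F.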

In PROLOG, How can I use assert recursively without getting 'true' result?

I'm planning to make new facts based on existing facts, by using assert.
However, the number of facts to be made will be more than 500, so typing a semicolon to step to each further solution becomes pretty tedious work.
Thus I want to ignore or skip past the 'true' (in SWI-Prolog).
Is there any way to deal with this? (e.g. automatically pass all the 'true's...)
Here's a part of my code:
%initialize
initialize :-
discipline(X,Y),
assert(result(X,0)).
I have too many Xs in discipline(X,Y)..
maybe
?- forall(a_fact(F), your_fact_processing(F)).
In this specific case forall is actually preferred, but in general, in Prolog you have to rely on the language's mechanism for this kind of iteration. Here's an example for your case:
initialize:-
discipline(X,Y),
assert(result(X,0)),
fail.
initialize.
In this bit of code above, you are telling the interpreter that initialize should perform all the 'asserts' given possible disciplines through the backtracking mechanism. Unless you become really familiar with this, Prolog will never "click" for you.
Note that in this example initialize will never fail, even if there are no disciplines (and therefore no results) to assert. You'll need some extra work to detect edge cases like that, which is why forall is actually preferred for this specific task of asserting many facts.
Also note that it's good practice not to leave singleton variables in your clauses; for variables that you won't use, you can use a name that starts with the '_' (underscore) character.
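A sketch of the forall/2 variant for this case follows. The two discipline/2 facts are hypothetical placeholders for the asker's data, and the dynamic declaration is added so that result/2 can be queried even before anything has been asserted:

```prolog
:- dynamic result/2.

% Hypothetical facts standing in for the asker's discipline/2.
discipline(math, core).
discipline(art,  elective).

% initialize: assert result(X, 0) for every discipline in one call.
% The second argument of discipline/2 is not needed, hence the underscore.
initialize :-
    forall(discipline(X, _), assertz(result(X, 0))).
```

Calling ?- initialize. succeeds once, without leaving choice points to step through with semicolons, and afterwards result(math, 0) and result(art, 0) are in the database.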
