Which is more common practice in Prolog?

I'm writing a rule that searches a database of facts in the form:
overground(Station1, Station2, DurationOfTravel).
and allows you to search for all journeys that take the same duration of travel.
I've written these two rules:
timesearch(Duration) :-
    overground(Station1, Station2, Duration),
    print([Station1, Station2]).
timesearch(Duration, [Station1,Station2]) :-
    overground(Station1, Station2, Duration).
Which do essentially the same thing. What I'm unsure about is which is best practice? Or are they two equally good solutions?

They don't do essentially the same thing; they contain the same "business" logic, but the first mixes in presentation logic (output code). It's a general principle of program design that business logic and presentation should be separated, so go with the second option and put the printing in some kind of main predicate.
In particular, in this example you don't want the printing to be done in the timesearch predicate; what if you decide one day that you want a more complicated algorithm that can determine the duration of a route of more than two hops? You can implement such an algorithm in terms of the second definition of timesearch, but not in terms of the first.
(This has very little to do with Prolog and all the more with the craft of good software design.)

In addition to @larsmans's answer, I'd like to add a link about pure functions. In any language where you have the chance to apply this concept, prefer pure functions when possible and handle the IO in separate parts.
Especially here in Prolog, where backtracking is involved, producing output inside your business logic predicates can prove really problematic, since things may be printed during the execution of a branch that won't lead to a relevant result.
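The same separation can be sketched outside Prolog. The following is a minimal Python illustration, not the asker's code: the OVERGROUND table and its station names are invented for the example, and a generator stands in only loosely for Prolog's enumeration of solutions on backtracking.

```python
# Hypothetical facts standing in for overground/3 (station names are invented).
OVERGROUND = [
    ("a", "b", 5),
    ("b", "c", 7),
    ("c", "d", 5),
]


def timesearch(duration):
    """Pure search: yields matching journeys and performs no I/O."""
    for station1, station2, d in OVERGROUND:
        if d == duration:
            yield (station1, station2)


def main():
    """All printing lives here, outside the search logic."""
    for journey in timesearch(5):
        print(journey)


if __name__ == "__main__":
    main()
```

Because timesearch only produces values, other code can reuse it (for example to build longer routes) without any unwanted output appearing along the way.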

Looking for a more compact syntax for Prolog

Prolog is a nice language. I use it occasionally, from time to time.
But approaching it every subsequent time makes me feel less and less comfortable syntactically.
Modern programming languages are moving toward letting the programmer
repeat himself less
omit unnecessary pieces if they can be deduced, or if their names are just placeholders.
The DCG is a step in the right direction allowing one to write
sentence --> noun_phrase, verb_phrase.
instead of
sentence(A,Z) :- noun_phrase(A,B), verb_phrase(B,Z).
but its entanglement with difference lists makes it less useful.
So what I am looking for are projects giving Prolog
a more compact syntactic representation, while preserving its semantic expressiveness.
Higher-order programming based on call/N is still a pretty much unexplored terrain. Major implementations like SICStus Prolog added call/N as late as 2006. So there is still a lot to explore. Consider library(lambda), library(reif) (both here) and other definitions using the meta-predicate declaration.
One thing you might want to look into, in the case of SWI-Prolog, is the set of language extensions introduced specifically by SWI-Prolog 7:
http://www.swi-prolog.org/pldoc/man?section=extensions
Another is the quasi-quotation library, which allows you to insert pieces of code in your own language (defined using DCG) inside "regular" Prolog code:
http://www.swi-prolog.org/pldoc/man?section=quasiquotations
The last thing I can recommend is the list of additional SWI-Prolog packages, some of which are specifically designed to extend the language, e.g. 'func', 'lambda', etc.:
http://www.swi-prolog.org/pack/list

Dealing with complicated prolog loops

I am using Prolog to encode some fairly complicated rules in a project of mine. There is a lot of recursion, including mutual recursion. Part of the rules look something like this:
pred1(X) :- ...
pred1(X) :- someguard(X), pred2(X).
pred2(X) :- ...
pred2(X) :- othercondition(X), pred1(X).
There is a fairly obvious infinite loop between pred1 and pred2. Unfortunately, the interaction between these predicates is very complicated and difficult to isolate. I was able to eliminate the infinite loop in this instance by passing around a list of objects that have been passed to pred1, but this is extremely unwieldy! In fact, it largely defeats the purpose of using Prolog in this application.
How can I make Prolog avoid infinite loops? For example, if in the course of proving pred1(foo) it tries to prove pred1(foo) as a sub-goal, fail and backtrack.
Is it possible to do this with meta-interpreters?
Yes, you can use meta-interpreters for this purpose, as mat suggests. But for the normal use case, that is going far beyond the regular effort.
What you may consider instead is to separate the looping functionality from your actual logic using higher-order predicates. That is a very safe way to go: SWI even checks whether all the uses have a corresponding definition. This check is invoked by typing either make. or check.
As an example, consider closure0/3 and path/4 which both handle loop checks "once and forever".
One feature that is available in some Prolog systems and that may help you to solve such issues is called tabling. See for example the related question and prolog-tabling.
If tabling is not available, then yes, meta-interpreters can definitely help a lot with this. For example, you can change the execution strategy etc. with a meta-interpreter.
In SWI-Prolog, also check out call_with_inference_limit/3 to robustly limit the execution, independent of CPU type and system load.
Related and also useful are termination analyzers like cTI: They allow you to statically derive termination conditions.

Theorem Proof Using Prolog

How can I write theorem proofs using Prolog?
I have tried to write it like this:
parallel(X,Y) :-
    perpendicular(X,Z),
    perpendicular(Y,Z),
    X \== Y,
    !.
perpendicular(X,Y) :-
    perpendicular(X,Z),
    parallel(Z,Y),
    !.
Can you help me?
I was reluctant to post an Answer because this Question is poorly framed. Thanks to theJollySin for adding clean formatting! Something omitted in the rewrite, indicative of what Aman had in mind, was "I inter in Loop" (sic).
We don't know what query was entered that resulted in this looping, so speculation is required. The two rules suggest that Goal involved either the parallel/2 or the perpendicular/2 predicate.
With practice it's not hard to understand what the Prolog engine will do when a query is posed, especially a single goal query. Prolog uses a pretty simple "follow your nose" strategy in attempting to satisfy a goal. Look for the rules for whichever predicate is invoked. Then see if any of those rules, starting with the first and going down in the list of them, can be applied.
There are three topics that beginning Prolog programmers will typically struggle with. One is the recursive nature of the search the Prolog engine makes. Here the only rule for parallel/2 has a right-hand side that invokes two subgoals for perpendicular/2, while the only rule for perpendicular/2 invokes both a subgoal for itself and another subgoal for parallel/2. One should expect that trying to satisfy either kind of query inevitably leads to a Hydra-like struggle with bifurcating heads.
The second topic we see in this example is the use of free variables. If we are to gain knowledge about perpendicularity or parallelism of two specific lines (geometry), then somehow the query or the rules need to provide "binding" of variables to "ground" terms. Again without the actual Goal being queried, it's hard to guess how Aman expected that to work. Perhaps there should have been "facts" supplied about specific lines that are perpendicular or parallel. Lines could be represented merely as atoms (perhaps lowercase letters), but Prolog variables are names that begin with an uppercase letter (as in the two given rules) or with an underscore (_) character.
Finally, the third topic that can be quite confusing is how Prolog handles negation. There's only a touch of that in these rules, the place where X\==Y is invoked. But even that brief subgoal requires careful understanding. Prolog implements "negation as failure", so that X\==Y succeeds if and only if X==Y does not succeed. This latter goal is also subtle, because it asks whether X and Y are identical without trying to do any unification. Thus if these are different variables, both free, then X==Y fails (and X\==Y succeeds). On the other hand, X==Y succeeds (and thus X\==Y fails) only if the two are already identical at that point, which in practice here means both variables being bound to the same term. As discussed above, the two rules as stated don't provide a way for that to be the case, though something might have taken care of this in the query Goal.
The homework assignment for Aman is to learn about these Prolog topics:
recursion
free and bound variables
negation
Perhaps more concrete suggestions can then be made about Prolog doing geometry proofs!
Added: PTTP (Prolog Technology Theorem Prover) was written by M.E. Stickel in the late 1980's, and this 2006 web page describes it and links to a download.
It also summarizes succinctly why Prolog alone is not "a full general-purpose theorem-proving system." Pointers to later, more capable theorem provers can be followed there as well.

When should I break a function?

It's prudent to break a long function into a chief function and helper functions.
I know that only the chief function will be called from outside the module, but its length may prove intimidating.
Textbooks put a limit on the number of lines, but I feel that this is too rigid.
P.S. I am programming in Python and need to process incoming messages. The function returns a tuple containing the message, but in Python's internal data types.
So you can see somewhat independent code for each message type.
Duplicate Question
When is a function too long?
I think you need to go about this from the other end of the problem. Think bottom-up. Identify small units of work, as small as possible, and start composing your code that way. You will only run into spaghetti-code issues when you code top-down and don't keep a structured approach.
If you already have spaghetti code and need to refactor, you pretty much have to start over. It is probably more work to break up existing spaghetti code than to rewrite it, and the result may not be as good.
I don't think there should be a hard number for the lines of code in a method either, but to give you some kind of metric: well-written code rarely has methods with more than 5 to 10 lines in the lower layers, or 20 to 30 lines in the business logic.
I'm not a big fan of breaking a function into multiple functions unnecessarily. It's not a hard and fast thing - if there are things that seem like distinct logical units, then by all means, break those out and think about them separately. But don't just break things out for the sake of some guideline like "one page per function" or "N lines per function".
One good rule of thumb is that if it doesn't fit on a single screen, it is worth thinking about splitting it up. But only if it makes sense to split it up: some long functions are perfectly readable, and it doesn't make any sense to slavishly split them into multiple functions just for the sake of it.
Never write a function that, when printed on fanfold paper, is taller than you are.
I like the rule of thumb that you should break out the subfunction if you can think of a good domain-relevant name for it.
When someone can understand the top-level function without necessarily having to look up the definition of the sub-function, you've likely made a net gain. (But when you break it down too far, your names start referring to your implementation artifacts rather than the domain)
I was recently discussing this with a friend. He suggested refactoring to separate concerns, and I must say I have to agree. That is, one function should do one thing; if it does more than one thing, split it up. If not, leave it together; it makes no sense to split up a function only to obscure its meaning. After all, a function is a block of code that does one thing!
A limit in terms of number of lines is often impractical because it doesn't account for readability well. It's better to try to separate out groups of lines that have just a few inputs and just a few outputs and make each a separate function. It's not always possible; in that case it's often wise to just leave the code as it is and not refactor for the sake of refactoring.
Since I am coding in Python, I have the liberty to write functions inside functions, unlike in C, C++ or Java. This, I feel, is a better choice.
There is no fixed limit, but the line count should be as low as possible. You may follow the Rule of 30; I follow it in my PHP scripts when needed.
Rule of 30:
“Rule of 30” in Refactoring in Large Software Projects by Martin Lippert and Stephen Roock:
Methods should not have more than an average of 30 code lines.
A class should contain an average of less than 30 methods.
A package/library shouldn’t contain more than 30 classes.
Subsystems should avoid more than 30 packages.
A system with more than 30 subsystems may cause problems.
If an element consists of more than 30 subelements, it is highly probable that there is a serious problem.
Personally, I break a function if it either saves total lines or total processing time.
If I only run the helper once per chief function, I don't bother.
The point is that in principle it's better to have specialized functions. But where one sets the limit depends very much on
1) the "usual" programming style in certain languages (one can observe that object-oriented languages tend toward shorter procedures than, let's say, C or the like)
2) your own way of programming. Every hard limit must be questioned, IMHO. Overall there will probably be some "natural" distribution across programs
3) the task itself. I think what one should keep in mind is that a function should do one particular task. For example, a parsing function is usually much longer than a function that just sets a field in a structure. Or consider how an event loop in the Windows API may look. All of this suggests that there may be good reasons for long methods...
If there is independent code (in your case specifics for each message type) those areas should be broken out.
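As a rough sketch of what breaking out the per-message-type code might look like in Python: the message types ("ping", "data"), the "type:payload" wire format, and every function name below are hypothetical, since the question does not show the actual protocol.

```python
def _parse_ping(payload):
    # A ping carries no body in this made-up format; only the type is returned.
    return ("ping",)


def _parse_data(payload):
    # Assumed body format for illustration: "key=value".
    key, _, value = payload.partition("=")
    return ("data", key, value)


# One small helper per message type, selected through a lookup table.
_PARSERS = {
    "ping": _parse_ping,
    "data": _parse_data,
}


def process_message(raw):
    """Chief function: split off the type, then delegate to a helper."""
    msg_type, _, payload = raw.partition(":")
    try:
        parser = _PARSERS[msg_type]
    except KeyError:
        raise ValueError(f"unknown message type: {msg_type!r}") from None
    return parser(payload)
```

The chief function stays a few lines long however many message types are added, and each helper can be read and tested on its own.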
Size matters not. Judge me by my size do you? - Yoda
Your main concerns are readability, simplicity and maintainability. A good indicator is if you need to write comments to explain a section of a function then that section is a good candidate for a separate function.
There are many reasons to break a long function into its constituent pieces. Most important is:
readability
maintainability
code clarity/intent
Some functions simply cannot be broken into smaller pieces without negatively impacting the listed goals, so there is no hard-and-fast rule.
If you didn't write it and it's already in production: NEVER!!! If you break it up, you're likely to break it, it's that simple.
If you are writing it and you're not sure, the on-screen rule applies, as others have said.

Are there any tools to aid with complex 'if' logic?

One of my personal programming demons has always been complex logic that needs to be controlled by if statements (or similar). Not always necessarily that complex either; sometimes just a few states that need to be accounted for.
Are there any tools or steps a developer can perform during design time to help see the 'states' and take measures to refactor the code down to simplify the resulting code? I'm thinking drawing up a matrix or something along those lines...?
I'd recommend a basic course in propositional logic for every aspiring programmer. At first, the notation and Greek letters may seem off-putting to the math-averse, but it is really one of the most powerful (and oft-neglected) tools in your skillset, and rather simple, at the core.
The basic operators, de Morgan's and other basic laws, truth tables, and existence of e.g. disjunctive and conjunctive normal forms were an eye-opener to me. Before I learned about them, conditional expressions felt like dangerous beasts. Ever since, I know that I can whip them into submission whenever necessary by breaking out the heavy artillery!
Truth tables are basically the exhaustive approach and will (hopefully) highlight all the possibilities.
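To make the exhaustive approach concrete, here is a small Python sketch that checks whether a rewritten condition is equivalent to the original by trying every combination of inputs. The helper name equivalent and the example conditions are made up for illustration; only itertools.product from the standard library is used.

```python
from itertools import product


def equivalent(f, g, n_vars):
    """Return True if f and g agree on every combination of n_vars booleans."""
    return all(
        f(*values) == g(*values)
        for values in product([False, True], repeat=n_vars)
    )


# De Morgan: not (a and b) is the same as (not a) or (not b).
original = lambda a, b: not (a and b)
rewritten = lambda a, b: (not a) or (not b)

# A deliberately wrong rewrite, to show the check catching a mistake.
wrong = lambda a, b: (not a) and (not b)

print(equivalent(original, rewritten, 2))  # True
print(equivalent(original, wrong, 2))      # False
```

With n variables there are 2**n rows, so this stays practical only for a modest number of conditions.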
You might like to take a look at Microsoft Pex, which can be helpful for spotting the fringe cases you hadn't thought of.
I think that the developer is asking how to make his life easier when dealing with complex if code.
The way that I handle complex if code is to code as flat as possible and weed out all negations first. If you can get rid of compound if by placing a portion of it above, then do that.
The beauty of simplicity is that it doesn't take a book or a class to learn it. If you can break it up, do so. If you can remove any part of it, do so. If you don't understand it, do it differently. And flat is almost always better than nested (thanks, Python!).
It's simpler to read:
if (broken) {
    return false;
}
if (simple) {
    doit();
    return true;
}
if (complicated) {
    divide();
    conquer();
}
if (extra) {
    extra();
}
than it is to read:
if (!broken && (simple || complicated)) {
    ....
}
return false;
Truth tables and unit tests - draw up the tables (n dimensional for n variables), and then use these as inputs to your unit test, which can test each combination of variables and verify the results.
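A minimal Python sketch of this idea follows. The function can_proceed and its three flags are hypothetical stand-ins for whatever logic is under test; the truth table is written out by hand and then used both as test input and as a completeness check.

```python
from itertools import product


def can_proceed(broken, simple, complicated):
    """Hypothetical function under test."""
    if broken:
        return False
    return simple or complicated


# Truth table written out by hand: (broken, simple, complicated) -> expected
TRUTH_TABLE = {
    (False, False, False): False,
    (False, False, True): True,
    (False, True, False): True,
    (False, True, True): True,
    (True, False, False): False,
    (True, False, True): False,
    (True, True, False): False,
    (True, True, True): False,
}


def test_can_proceed():
    # Guard against a forgotten row: the table must cover all 2**3 combinations.
    assert set(TRUTH_TABLE) == set(product([False, True], repeat=3))
    for inputs, expected in TRUTH_TABLE.items():
        assert can_proceed(*inputs) == expected, inputs


test_can_proceed()
```

Writing the expected column by hand, rather than computing it, is what makes the table an independent check on the code.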
The biggest problem I've seen through the years with complex IFs is that people don't test all the branches. Make sure to write a test for each possible branch no matter how unlikely it seems that you will hit it.
You might also want to try Karnaugh maps, which are good for up to 4 variables.
If you haven't already, I'd highly suggest reading Code Complete. It has a lot of advice on topics such as this. I don't have my copy handy at the moment, otherwise I'd post a summary of this section in the book.
Split the logic down into discrete units (a && b, etc.), each with their own variable. Then build these up using the logic you need. Name each variable with something appropriate, so that your complex statement is fairly readable (although it may take up several extra lines and a fair few temporary variables).
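For instance, in Python this might look like the sketch below. The checkout rule and all of its names are invented purely to illustrate the naming technique.

```python
def may_checkout(user_logged_in, cart_items, payment_ok, is_admin):
    """Hypothetical rule, built up from small named conditions."""
    has_items = len(cart_items) > 0
    is_regular_customer = user_logged_in and payment_ok
    can_bypass_payment = user_logged_in and is_admin

    # The final condition now reads almost like the requirement itself.
    return has_items and (is_regular_customer or can_bypass_payment)
```

The one-line equivalent, len(cart_items) > 0 and user_logged_in and (payment_ok or is_admin), is shorter but gives the reader no hint of what each part means.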
Any reason you cannot just handle the logic with guard statements?
Karnaugh maps can be nice ways of taking information from a truth table (suggested by Visage) and turning them into compact and/or/not expressions. These are typically taught in an EE digital logic course.
Have you tried a design pattern? You might look into what is known as the Strategy pattern: http://en.wikipedia.org/wiki/Strategy_pattern
Check out the nuclear option: Drools. There's quite a lot to it; it took me a day or two of perusing the literature just to get a handle on its capabilities. But if you have applications where your complex if-then logic is an evolving part of the project (for example, an application with modular algorithms), it might be just the thing.
