Forward and Backward Chaining - prolog

I am attempting to understand the best uses of backward and forward chaining in AI programming for a program I am writing. Would anyone be able to explain the most ideal uses of backward and forward chaining? Also, could you provide an example?

I have done some research on the current understanding of "forward chaining" and "backward chaining". This brings up a lot of material. Here is a résumé.
First a diagram, partially based on:
The Sad State Concerning the Relationships between Logic, Rules and Logic Programming (Robert Kowalski)
LHS stands for "left-hand-side", RHS stands for "right-hand-side" of a rule throughout.
Let us separate "Rule-Based Systems" (i.e. systems which do local computation based on rules), into three groups as follows:
Production Rule Systems, which include the old-school Expert System Shells, which are not built on logical principles, i.e. "without a guiding model".
Logic Rule Systems, i.e. system based on a logical formalism (generally a fragment of first-order logic, classical or intuitionistic). This includes Prolog.
Rewrite Rule Systems, systems which rewrite some working memory based on, LHS => RHS rewrite rules.
There may be others. Features of one group can be found in another group. Systems of one group may be partially or wholly implemented by systems of another group. Overlap is not only possible but certain.
(Sadly, imgur does not accept .svg in 2020, so it's a .png)
Green: Forward Chaining
Orange: Backward Chaining
Yellow: Prolog
RuleML (an organization) tries to XML-ize the various rulesets which exist. They classify rules as follows:
The above appears in The RuleML Perspective on Reaction Rule Standards by Adrian Paschke.
So they make a differentiation between "deliberative rules" and "reactive rules", which fits.
First box: "Production Rule Systems"
The General Idea of the "Production Rule System" (PRS)
There are "LHS->RHS" rules & meta-rules, with the latter controlling application of the first. Rules can be "logical" (similar to Prolog Horn Clauses), but they need not be!
The PRS has a "working memory", which is changed destructively whenever a rule is applied: elements or facts can be removed, added or replaced in the working memory.
PRS have "operational semantics" only (they are defined by what they do).
PRS have no "declarative semantics", which means there is no proper way to reason about the ruleset itself: What it computes, what its fixpoint is (if there is one), what its invariants are, whether it terminates etc.
More features:
Ad-hoc handling of uncertainty using locally computable functions (i.e.
not probability computations) as in MYCIN, with Fuzzy rules, Dempster-Shaefer theory etc.
Strong Negation may be expressed in an ad-hoc fashion.
Generally, backtracking on impasse is not performed, one has to implement it explicitly.
PRS can connect to other systems rather directly: Call a neural network, call an optimizer or SAT Solver, call a sensor, call Prolog etc.
Special support for explanations & debugging may or may not exist.
Example Implementations
Ancient:
Old-school "expert systems shells", often written in LISP.
Planner of 1971, which is language with rudimentary (?) forward and backward chaining. The implementations of that language were never complete.
The original OPSx series, in particular OPS5, on which R1/XCON - a VAX system configurator with 2500 rules - was running. This was actually a forward-chaining implementation.
Recent:
CLIPS (written in C): http://www.clipsrules.net/
Jess (written in Java): https://jess.sandia.gov/
Drools (writen in "Enterprise" Java): https://www.drools.org/
Drools supports "backwards-chaining" (how exactly), but I'm not sure any of the others does, and if they do, how it looks like)
"Forward chaining" in PRS
Forward-chaining is the original approach to the PRS "cycle", also called "recognize-act" cycle, or the "data-driven cycle", which indicates what it is for. Event-Condition-Action architecture is another commonly used description.
The inner working are straightforward:
The rule LHSs are matched against the working memory (which happens at every working memory update thanks to the RETE algorithm).
One of the matching rules is selected according to some criterium (e.g. priority) and its RHS is executed. This continues until no LHS matches anymore.
This cycle can be seen as higher-level approach to imperative state-based languages.
Robert Kowalski notes that the "forward chaining" rules are actually an amalgamation of two distinct uses:
Forward-chained logic rules
These rules apply Modus Ponens repeatedly to the working memory and add deduced facts.
Example:
"IF X is a man, THEN X is mortal"
Uses:
Deliberation, refinement of representations.
Exploration of state spaces.
Planning if you want more control or space is at a premium (R1/XCON was a forward chaining system, which I find astonishing. This was apparently due to the desire to keep resource usage within bounds).
In Making forward chaining relevant (1998), Fahiem Bacchus writes:
Forward chaining planners have two particularly useful properties. First, they maintain complete information about the intermediate states generated by a potential plan. This information can be utilized to provide highly effective search control, both domain independent heuristic control and even more effective domain dependent control ... The second advantage of forward chaining planners is they can support rich planning languages. The TLPlan system for example, supports the full ADL language, including functions and numeric calculations. Numbers and functions are essential for modeling many features of real planning domains, particularly resourcs and resource consumption.
How much of the above really applies is debatable. You can always write your backward-chaining planner to retain more information or to be open to configuration by a search strategy selecting module.
Forward-chaining "reactive rules" aka "stimulus-response rules"
Example:
"IF you are hungry THEN eat something"
The stimulus is "hunger" (which can be read off a sensor). The response is to "eat something" (which may mean controlling an effector). There is an unstated goal, hich is to be "less hungry", which is attained by eating, but there is no deliberative phase where that goal is made explicit.
Uses:
Immediate, non-deliberative agent control: LHS can be sensor input, RHS can be effector output.
"Backward chaining" in PRS
Backward chaining, also called "goal-directed search", applies "goal-reduction rules" and runs the "hypothesis-driven cycle", which indicates what it is for.
Examples:
BDI Agents
MYCIN
Use this when:
Your problem looks like a "goal" that may be broken up into "subgoals", which can be solved individually. Depending on the problem, this may not be possible. The subgoals have too many interdependencies or too little structure.
You need to "pull in more data" on demand. For example, you ask the user Y/N question until you have classified an object properly, or, equivalently, until a diagnosis has been obtained.
When you need to plan, search, or build a proof of a goal.
One can encode backward-chaining rules also as forward-chaining rules as a programming exercise. However, one should choose the representation and the computational approach that is best adapted to one's problem. That's why backward chaining exists after all.
Second box: "Logic Rule Systems" (LRS)
These are systems based on some underlying logic. The system's behaviour can (at least generally) be studied independently from its implementation.
See this overview: Stanford Encyclopedia of Philosophy: Automated Reasoning.
I make a distinction between systems for "Modeling Problems in Logic" and systems for "Programming in Logic". The two are merged in textbooks on Prolog. Simple "Problems in Logic" can be directly modeled in Prolog (i.e. using Logic
Programming) because the language is "good enough" and there is no mismatch. However, at some point you need dedicated systems for your task, and these may be quite different from Prolog. See Isabelle or Coq for examples.
Restricting ourselves to Prolog family of systems for "Logic Programming":
"Forward chaining" in LRS
Forward-chaining is not supported by a Prolog system as such.
Forward-chained logic rules
If you want to forward-chained logic rules you can write your own interpreter "on top of Prolog". This is possible because Prolog is general purpose programming language.
Here is a very silly example of forward chaining of logic rules. It would certainly be preferable to define a domain-specific language and appropriate data structures instead:
add_but_fail_if_exists(Fact,KB,[Fact|KB]) :- \+member(Fact,KB).
fwd_chain(KB,KBFinal,"forall x: man(x) -> mortal(x)") :-
member(man(X),KB),
add_but_fail_if_exists(mortal(X),KB,KB2),
!,
fwd_chain(KB2,KBFinal,_).
fwd_chain(KB,KBFinal,"forall x: man(x),woman(y),(married(x,y);married(y,x)) -> needles(y,x)") :-
member(man(X),KB),
member(woman(Y),KB),
(member(married(X,Y),KB);member(married(Y,X),KB)),
add_but_fail_if_exists(needles(Y,X),KB,KB2),
!,
fwd_chain(KB2,KBFinal,_).
fwd_chain(KB,KB,"nothing to deduce anymore").
rt(KBin,KBout) :- fwd_chain(KBin,KBout,_).
Try it:
?- rt([man(socrates),man(plato),woman(xanthippe),married(socrates,xanthippe)],KB).
KB = [needles(xanthippe, socrates), mortal(plato),
mortal(socrates), man(socrates), man(plato),
woman(xanthippe), married(socrates, xanthippe)].
Extensions to add efficient forward-chaining to Prolog have been studied but they seem to all have been abandoned. I found:
1989: Adding Forward Chaining and Truth Maintenance to Prolog (PDF) (Tom_Finin, Rich Fritzson, Dave Matuszek)
There is an active implementation of this on GitHub: Pfc -- forward chaining in Prolog, and an SWI-Prolog pack, see also this discussion.
1997: Efficient Support for Reactive Rules in Prolog (PDF) (Mauro Gaspari) ... the author talks about "reactive rules" but apparently means "forward-chained deliberative rules".
1998: On Active Deductive Database: The Statelog Approach (Georg Lausen, Bertram Ludäscher, Wolfgang May).
Kowalski writes:
"Zaniolo (LDL++?) and Statelog use a situation calculus-like representation with frame axioms, and reduce Production Rules and Event-Condition-Action rules to Logic Programs. Both suffer from the frame problem."
Forward-chained reactive rules
Prolog is not really made for "reactive rules". There have been some attempts:
LUPS : A language for updating logic programs (1999) (Moniz Pereira , Halina Przymusinska , Teodor C. Przymusinski C)
The "Logic-Based Production System" (LPS) is recent and rather interesting:
Integrating Logic Programming and Production Systems in Abductive Logic Programming Agents (Robert Kowalski, Fariba Sadri)
Presentation at RR2009: Integrating Logic Programming and Production Systems in Abductive Logic Programming Agents
LPS website
It defines a new language where Observations lead to Forward-Chaining and Backward-Chaining lead to Acts. Both "silos" are linked by Integrity Constraints from Abductive Logic Programming.
So you can replace a reactive rule like this:
By something like this, which has a logic interpretation:
Third Box: "Rewrite Rule Systems" (forward-chaining)
See also: Rewriting.
Here, I will just mention CHR. It is a forward-chaining system which successively rewrites elements in a working memory according to rules with match working memory elements, verify a logic guard condition , and removed/add working memory elements if the logic guard condition succeeds.
CHR can be understood as an application of a fragment of linear logic (see "A Unified Analytical Foundation for Constraint Handling Rules" by Hariolf Betz).
A CHR implementation exists for SWI Prolog. It provides backtracking capability for CHR rules and a CHR goal can be called like any other Prolog goal.
Usage of CHR:
General model of computational (i.e. like Turing Machines etc.)
Bottom up parsing.
Type checking.
Constraint propagation in constraint logic programmning.
Anything that you would rather forward-chain (process bottom-up)
rather than backward-chain (process top-down).

I find it useful to start with your process and goals.
If your process can be easily expressed as trying to satisfy a goal by satisfying sub-goals then you should consider a backward-chaining system such as Prolog. These systems work by processing rules for the various ways in which a goal can be satisfied and the constraints on these applying these ways. Rule processing searches the network of goals with backtracking to try alternatives when one way of satisfying a goal fails.
If your process starts with a set of known information and applies the rules to add information then you should consider a forward-chaining system such as Ops5, CLIPS or JESS. These languages apply matching to the left hand side of the rule and invoke the right hand side of rules for which the matching succeeds. The working memory is better thought of as "what is known" than "true facts". Working memory can contain information known to be true, information known to be false, goals, sub-goals, and even domain rules. How this information is used is determined by the rules, not the language. To these languages there is no difference between rules that create values (deduce facts), rules that create goals, rules that create new domain knowledge or rules that change state. It is all in how you write your rules and organize your data and add base clauses to represent this knowledge.
It is fairly easy to implement either method using the other method. If you have a body of knowledge and want to make dedications but this needs to be directed by some goals go ahead and use a forward chaining language with rules to keep track of goals. In a backward chaining language you can have goals to deduce knowledge.
I would suggest that you consider writing rules to handle the processing of domain knowledge and not to encode your domain knowledge directly in the rules processed by the inference engine. Instead, the working memory or base clauses can contain your domain knowledge and the language rules can apply them. By representing the domain knowledge in working memory you can also write rules to check the domain knowledge (validate data, check for overlapping ranges, check for missing values, etc.), add a logic system on top of the rules (to calculate probabilities, confidence values, or truth values) or handle missing values by prompting for user input.

Related

Is there a higher order Prolog that wouldn't need a type system?

I suspect that λProlog needs a type system to make their higher
order unification sound. Otherwise through self application some
Russell type anomalies can appear.
Are there alternative higher order Prologs that don't need .sig files?
Maybe by a much simpler type system, that doesn't need that many
declarations but still has some form of higher order unification?
Can this dilemma be solved?
Is there a higher order Prolog that wouldn't need a type system?
These are type-free:
HiLog
HiOrd
From the HiOrd paper:
The framework proposed gives rise to many questions the authors hope to ad-
dress in future research. In particular, a rigorous treatment must be developed for
comparison with other higher-order formal systems (Hilog, Lambda-Prolog). For
example, it is reasonably straightforward to conservatively translate the Higher-
order Horn fragment of λProlog into Hiord by erasing types, as the resolution
rules are essentially the same (assuming a type-safe higher-order unification pro-
cedure).
Ciao (includes HiOrd)

Attributed variables: library interfaces / implementations / portability

When I was skimming some prolog related questions recently, I stumbled upon this answer by #mat to question How to represent directed cyclic graph in Prolog with direct access to neighbour verticies .
So far, my personal experience with attributed variables in Prolog has been very limited. But the use-case given by #mat sparked my interest. So I tried using it for answering another question, ordering lists with constraint logic programming.
First, the good news: My first use of attributed variables worked out like I wanted it to.
Then, the not so good news: When I had posted by answer, I realized there were several API's and implementations for attributed variables in Prolog.
I feel I'm over my head here... In particular I want to know the following:
What API's are in wide-spread use? Up to now, I found two: SICStus and SWI.
Which features do the different attributed variable implementations offer? The same ones? Or does one subsume the other?
Are there differences in semantics?
What about the actual implementation? Are some more efficient than others?
Can be (or is) using attributed variables a portability issue?
Lots of question marks, here... Please share your experience / stance?
Thank you in advance!
Edit 2015-04-22
Here's a code snippet of the answer mentioned above:
init_att_var(X,Z) :-
put_attr(Z,value,X).
get_att_value(Var,Value) :-
get_attr(Var,value,Value).
So far I "only" use put_attr/3 and get_attr/3, but---according to the SICStus Prolog documentation on attributed variables---SICStus offers put_attr/2 and get_attr/2.
So even this very shallow use-case requires some emulation layer (one way or the other).
I would like to focus on one important general point I noticed when working with different interfaces for attributes variables: When designing an interface for attributed variables, an implementor should also keep in mind the following:
Is it possible to take attributes into account when reasoning about simultaneous unifications, as in [X,Y] = [0,1]?
This is possible for example in SICStus Prolog, because such bindings are undone before verify_attributes/3 is called. In the interface provided by hProlog (attr_unify_hook/2, called after the unification and with all bindings already in place) it is hard to take into account the (previous) attributes of Y when reasoning about the unification of X in attr_unify_hook/2, because Y is no longer a variable at this point! This may be sufficient for solvers that can make decisions based on ground values alone, but it is a serious limitation for solvers that need additional data, typically stored in attributes, to see whether a unification should succeed, and which are then no longer easily available. One obvious example: Boolean unification with decision diagrams.
As of 2016, the verify-attributes branch of SWI-Prolog also supports verify_attributes/3, thanks to great implementation work by Douglas Miles. The branch is ready for testing and intended to be merged into master as soon as it works correctly and efficiently. For compatibility with hProlog, the branch also supports attr_unify_hook/2: It does so by rewriting such definitions to the more general verify_attributes/3 at compilation time.
Performance-wise, it is clear that there may be a downside to verify_attributes/3, because making several variables ground at the same time may let you sooner see (in attr_unify_hook/2) that a unification cannot succeed. However, I will gladly and any time exchange this typically negligible advantage for the improved reliability, ease of use, and increased functionality that the more general interface gives you, and which is in any case already the standard behaviour in SICStus Prolog which is on top of its generality also one of the faster Prolog systems around.
SICStus Prolog also features an important predicate called project_attributes/2: It is used by the toplevel to project constraints to query variables. SWI-Prolog also supports this in recent versions.
There is also one huge advantage of the SWI interface: The residual goals that attribute_goals//1 and hence copy_term/3 give you are always a list. This helps users to avoid defaultyness in their code, and encourages a more declarative interface, because a list of pure constraint goals cannot contain control structures.
Interestingly, neither interface lets you interpret unifications other than syntactically. Personally, I think there are cases where you may want to interpret unifications differently than syntactically, however, there may also be good arguments against that.
The other interface predicates for attributed variables are mostly easily interchangable with simple wrapper predicates for different systems.
Jekejeke Minlog has state-less or thin attribute variables. Well not exactly, an attribute variable can have zero, one or many hooks, which are allowed to be closures, and hence can carry a little state.
But typically an implementation manages the state elsewere. For this
purpose Jekejeke Minlog allows creating reference types from variables,
so that they can be used as indexes into tables.
The full potential is unleashed if this combined with trailing and/or
forward chaining. As an example we have implemented CLP(FD). There is also a little solver tutorial.
The primitive ingredients in our case are:
1) State-less Attribute Variables
2) Trailing and Variable Keys
3) Continuation Queue
The attribute variables hooks might have binding effects upto extending the continuation queue but are only executed once. Goals from the continuation queue can be non-deterministic.
There are some additional layers before realizing applications, that are mostly aggregations of the primitives to make changes temporarily.
The main applications so far are open source here and here:
a) Finite Domain Constraint Solver
b) Herbrand Constraints
c) Goal Suspension
Bye
An additional perspective on attributed variable libraries is how many attributes can be defined per module. In the case of SWI-Prolog/YAP and citing SWI documentation:
Each attribute is associated to a module, and the hook
(attr_unify_hook/2) is executed in this module.
This is a severe limitation for implementers of libraries such as CLP(FD) as it forces using additional modules for the sole purpose of having multiple attributes instead of being able to define as many attributes as required in the module implementing their library. This limitation doesn't exist on the SICStus Prolog interface, which provides a directive attribute/1 that allows the declaration of an arbitrary number of attributes per module.
You can find one of the oldest and most elaborate implementations of attributed variables in ECLiPSe, where it forms part of the wider infrastructure for implementing constraint solvers.
The main characteristics of this design are:
attributes must be declared, and in return the compiler supports efficient access
a syntax for attributed variables, so that they can be read and written
a more complete set of handlers for attribute operations, so that attributes are not only taken into account for unification, but also for other generic operations such as term copying and subsumption tests
a clear separation between the concepts of variable attribute and suspended goals
used in over a dozen of ECLiPSe's libraries
This paper (section 4) and the ECLiPSe documentation have more details.

Difference between rules in based expert systems and conditions in normal algorithmic programming

In rule based expert systems the knowledge base contains large number of rules in the form of "if (template) then (action)". The inference engine chooses the rules that match the input facts. That is those rules that their condition section matches the input data are shortlisted and one of them is selected.
Now it is possible to use a normal program with similar conditional statements in some way to possibly reach a result.
I am trying to find a "sound and clear description" of the difference between the two and why we cannot achieve what expert system rules could do with normal algorithmic programming?
Is it just that an algorithm needs complete and very well known inputs while expert systems can accept incomplete information with any order?
Thanks.
Rules are more accurately described as the form "when (template) then (action)". The semantic of "when" is quite different than those of "if". For example, the most direct translation of a distinct set of rules to a procedural programming language would look something like this:
if <rule-1 conditions>
then <rule-1 actions>
if <rule-2 conditions>
then <rule-2 actions>
.
.
.
if <rule-n conditions>
then <rule-n actions>
Since the actions of a rule can effect the conditions of another rule, every time any rule action is applied the rule conditions will all need to be rechecked. This can be quite inefficient.
The benefit provided by rule-based systems is to allow you to express rules as discrete units of logic while efficiently handling the matching process for you. Typically this will involve detecting and sharing common conditions between rules so they don't need to be checked multiple times as well as data-driven approaches where the system predetermines which rules will be effected by changes to specific data so that rule conditions do not need to be rechecked when unrelated data is changed.
This benefit is similar to the one provided by garbage collection in languages such as Java. There's nothing that automatic memory management provides in Java that can't be achieved by writing your own memory management routines in C. But since that's tedious and error prone, there's a clear benefit to using automatic memory management.
There is nothing that a "rule-based expert system" can do that a "normal algorithmic program" can't do because a rule-based expert system is a normal algorithmic program. It is not only possible to write a normal algorithmic program to match the way an expert system inference engine works, that is exactly what the people who wrote the inference engine did.
Perhaps the "difference" that you are seeing is that in one case the rules are "hard-coded" in the programming language, whereas in the other case the rules are treated as data to be processed by the program. The same logic is present in both cases, it's just that in one the "program" is specific to one task while the other shuffles the complexity out of the "program" and into the "data".
To expand on what Gary said, in hard-coded if-then systems the firing order for the rules is more constrained than in most expert systems. In an expert system, the rules can fire based on criteria other than some coded order. For example, some measure of relevance may be used to fire the rule, e.g. firing the rule the success or failure of which will rule in or out the most hypotheses.
Similarly, in many systems, the "knowledge engineer" can state the rules in any order. Although the potential order of firing may need to be considered, the order in which the rules are declared may not be important.
In some types of systems the rules are only loosely coupled. That is, a rule's contribution may not be all or nothing. If a rule fires, it may contribute evidence, if it fails to fire (or is absent), it may not contribute evidence, yet the hypothesis may succeed if some other suite of rules pushes it over a certainty threshold.
This allows experts to contribute rules in a more natural manner. The expert can think of a few rules and they can be tested. The expert can add a few more rules, perhaps even months later, etc., all the while improving the system's accuracy without the need to re-write any of the earlier rules or re-arrange any code.
The ways in which the above are accomplished are myriad, but the production rules described by Gary are one of the most common, easily understood, and effective means and are used by many expert systems.

Optimal planning with incrementally tabled abductive logic programming

I'd like to use abductive logic programming to find optimal plans. Exhaustively searching the space of plans would be impractical but there are ordering heuristics that, in ordinary logic programming, would be used to represent facts (ground predicates) as sorted lists. Sorted lists can, of course, be recast in predicate form as facts (ground predicates) with an ordering predicate -- and it is in this form that I would prefer to work given that abducibles are predicates.
In this form, I'd like to search the ground predicates with priority accorded to the their (respective) ordering predicate(s), and terminate at the first solution as it is provable that any other solutions would be less optimal.
I understand that this would require, at the very least, tabled logic programming. Fortunately tabling is now widely supported. However, it may also require incremental tabling as abducibles are asserted and retracted during abduction -- which would limit it to XSB, AFAIK.
How can one tell the Prolog engine to use an ordering predicate to search ground terms?
Also, is incremental tabling necessary to make this practical?
Tabling, at least in XSB, follows (by default) a scheduling strategy known as local scheduling. This means that answers to a goal are not returned as they are derived (like in the usual Prolog, without tabling), but only when the table of that goal has been completed. For this reason, the use of tabling to help return (and terminate at) the first solution only (as in your case) may not be appropriate. One can nevertheless opt for batched scheduling of XSB tabling, so the answers are returned as soon as they are derived. But this option can be only be set during the XSB installation, and not in the level of predicate.
Alternatively, XSB provides tries data structure that can be used to store facts. It can be used to simulate batched scheduling (returning an answer as soon as it is derived) in the presence of the default local scheduling. This technique is used for example in computing dual rules by-need. The idea is to compute and store in a trie the solutions one by one according to the given ordering.
These strategies, and their (dis)advantages, along with tries are discussed in XSB first manual.
With respect to the use of incremental tabling, it can certainly be useful if the asserted or retracted abducibles affect other tabled predicates; the latter predicates should then be incrementally tabled and abducibles should be declared incrementally dynamic (not just simply dynamic predicates). So doing, the tabled predicates will correctly reflect such updates.
I and my PhD student Ari Saptawijaya,ari.saptawijaya#gmail.com, have been publishing on tabled abduction, implemented in XSB, and you might like to see our publications, available for download at my home page (where you can find our latest paper, accepted at ICLP'14). At present we are combining tabled abduction with tabled incremental updating of fluents, where we abduce actions and incrementally propagate their effects on fluents.
One general concept we use is contextual abduction, whereby abductive may be usable from one context to another, or reject worse attempted solutions. The issues there are quite technical and not susceptible to explaining here.
I suggest you glance at our papers and come back to us, after seeing how you are attempting might benefit from our stance. I also advise you to look at the tabling chapters of the XSB user's manual, available at Sourceforge. Professor David Warren, the main architect of XSB Prolog may help you too.
Best wishes
Luis Moniz Pereira

Expert system for writing programs?

I am brainstorming an idea of developing a high level software to manipulate matrix algebra equations, tensor manipulations to be exact, to produce optimized C++ code using several criteria such as sizes of dimensions, available memory on the system, etc.
Something which is similar in spirit to tensor contraction engine, TCE, but specifically oriented towards producing optimized rather than general code.
The end result desired is software which is expert in producing parallel program in my domain.
Does this sort of development fall on the category of expert systems?
What other projects out there work in the same area of producing code given the constraints?
What you are describing is more like a Domain-Specific Language.
http://en.wikipedia.org/wiki/Domain-specific_language
It wouldn't be called an expert system, at least not in the traditional sense of this concept.
Expert systems are rule-based inference engines, whereby the expertise in question is clearly encapsulated in the rules. The system you suggest, while possibly encapsulating insight about the nature of the problem domain inside a linear algebra model of sorts, would act more as a black box than an expert system. One of the characteristics of expert systems is that they can produce an "explanation" of their reasoning, and such a feature is possible in part because the knowledge representation, while formalized, remains close to simple statements in a natural language; matrices and operations on them, while possibly being derived upon similar observation of reality, are a lot less transparent...
It is unclear from the description in the question if the system you propose would optimize existing code (possibly in a limited domain), or if it would produced optimized code, in that case driven bay some external goal/function...
Well production systems (rule systems) are one of four general approaches to computation (Turing machines, Church recursive functions, Post production systems and Markov algorithms [and several more have been added to that list]) which more or less have these respective realizations: imperative programming, functional programming, rule based programming - as far as I know Markov algorithms don't have an independent implementation. These are all Turing equivalent.
So rule based programming can be used to write anything at all. Also early mathematical/symbolic manipulation programs did generally use rule based programming until the problem was sufficiently well understood (whereupon the approach was changed to imperative or constraint programming - see MACSYMA - hmmm MACSYMA was written in Lisp so perhaps I have a different program in mind or perhaps they originally implemented a rule system in Lisp for this).
You could easily write a rule system to perform the matrix manipulations. You could keep a trace depending on logical support to record the actual rules fired that contributed to a solution (some rules that fire might not contribute directly to a solution afterall). Then for every rule you have a mapping to a set of C++ instructions (these don't have to be "complete" - they sort of act more like a semi-executable requirement) which are output as an intermediate language. Then that is read by a parser to link it to the required input data and any kind of fix up needed. You might find it easier to generate functional code - for one thing after the fix up you could more easily optimize the output code in functional source.
Having said that, other contributors have outlined a domain specific language approach and that is what the TED people did too (my suggestion is that too just using rules).

Resources