As you know, many rule engines use the Rete algorithm to process rules, and this algorithm constructs a tree, the so-called Rete tree.
What is the ideal topology for a Rete tree to ensure better rule processing performance? In other words, what tree topology should a rule set correspond to for better performance?
The short answer is that the performance is affected by the number of rules and objects, the number of tests, how you order the tests in your rules, and how many tests/conditions are shared between rules.
You should rewrite rules for optimal performance by:
Reordering tests and conditions so that the most discriminating conditions are moved to the beginning of the rule
Sharing conditions
See the Adjusting conditions IBM ODM documentation.
You should also reduce the number of objects that need to be evaluated by rules, and the number of tests.
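To make the ordering advice concrete, here is a minimal sketch (plain Python, not ODM rule syntax) that counts how many test evaluations the same two-condition rule performs in each order; the order data and the thresholds are made up for illustration.

```python
# Illustrative only: count how many test evaluations each ordering performs.
# The "orders" data and both tests are invented examples, not ODM rules.

orders = [{"region": "EU" if i % 100 == 0 else "US", "amount": i} for i in range(10_000)]

def match_selective_first(o, counter):
    counter[0] += 1                      # test 1: region == "EU" (matches ~1% of objects)
    if o["region"] != "EU":
        return False
    counter[0] += 1                      # test 2: amount > 5000 (matches ~50%)
    return o["amount"] > 5000

def match_selective_last(o, counter):
    counter[0] += 1                      # test 1: amount > 5000
    if o["amount"] <= 5000:
        return False
    counter[0] += 1                      # test 2: region == "EU"
    return o["region"] == "EU"

for match in (match_selective_first, match_selective_last):
    counter = [0]
    hits = sum(1 for o in orders if match(o, counter))
    print(match.__name__, "matches:", hits, "test evaluations:", counter[0])
```

Both orderings find the same matches, but putting the discriminating region test first performs roughly 10,100 test evaluations instead of about 15,000.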
For your reference regarding Rete and IBM ODM:
For an example of the structure of a Rete tree, refer to the RetePlus network structure IBM ODM documentation
What affects the performance of a Decision Server application: RetePlus
RetePlus is designed to optimize the evaluation of large numbers of rules across large numbers of objects. RetePlus filters the tests such that irrelevant tests are not evaluated. Tests can be shared between rules that use similar tests so that they do not need to be re-evaluated for all the rules.
What affects the performance of a Decision Server application: Rule organization
For the best results:
Common tests on different objects are shared.
The number of tests carried out is minimized.
Performance degrades when a single evaluation contains too many variable definitions and conditions.
The tests use less memory.
Simply put, if you would like to use the RetePlus algorithm in your orchestration, use only Decision Tree business rules.
It's much faster when used this way, although you may use it in combination with other algorithms, such as Sequential (for Action Rules, in this case).
So your solution could be partly Action Rules (with Sequential) and partly Decision Tables (with RetePlus).
Hope this helps.
Related
The Rete algorithm is an efficient pattern matching algorithm that compares a large collection of patterns to a large collection of objects. It is also used in one of the expert system shells that I am exploring right now: Drools.
What is the time complexity of the algorithm, based on the number of rules I have?
Here is a link for Rete Algorithm: http://www.balasubramanyamlanka.com/rete-algorithm/
Also for Drools: https://drools.org/
Estimating the complexity of RETE is a non-trivial problem.
Firstly, you cannot use the number of rules as a dimension. What you should look at are the individual constraints or matches the rules contain. You can see a rule as a collection of constraints grouped together; that is all that RETE reasons about.
Once you have a rough estimate of the number of constraints your rule base has, you will need to look at those that are inter-dependent. Inter-dependent constraints are the most complex matches and are similar in concept to JOINs in SQL queries. Their complexity varies based on their nature as well as the state of your working memory.
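As a rough illustration of the difference between single-fact constraints and inter-dependent ones, here is a small Python sketch; the fact shapes are invented, and a real engine would typically index the join key inside its beta nodes rather than scan.

```python
# Sketch of why inter-dependent constraints behave like SQL JOINs.
# The fact types (customers, orders) and the rule are illustrative assumptions.

customers = [{"id": c, "vip": c % 10 == 0} for c in range(1_000)]
orders = [{"customer_id": o % 1_000, "total": o} for o in range(5_000)]

# Single-fact ("alpha") constraints are linear in the number of facts:
vip_ids = {c["id"] for c in customers if c["vip"]}        # O(|customers|)
big_orders = [o for o in orders if o["total"] > 4_000]    # O(|orders|)

# The inter-dependent ("beta") constraint -- an expensive order placed by a VIP --
# is where complexity grows: in the worst case it touches |customers| x |orders|
# pairs, unless the join key is indexed (here, a simple hash lookup).
matches = [(o["customer_id"], o["total"]) for o in big_orders
           if o["customer_id"] in vip_ids]
print(len(matches), "activations")
```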
Then you will need to look at the size of your working memory. The number of facts you assert within a RETE-based expert system strongly influences its performance.
Lastly, you need to consider the engine's conflict resolution strategy. If you have several conflicting rules, it might take a lot of time to figure out in which order to execute them.
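For example, one common strategy orders the agenda by salience (priority) and then by recency of the matched facts; a toy sketch follows, where the activation tuples are made up rather than taken from any particular engine.

```python
# Toy conflict resolution: sort pending activations by salience, then recency.
activations = [
    # (rule name, salience, recency of the newest fact in the match)
    ("ApplyDiscount",   10, 3),
    ("FlagFraud",      100, 1),
    ("SendNewsletter",   0, 7),
]

agenda = sorted(activations, key=lambda a: (-a[1], -a[2]))
for rule, *_ in agenda:
    print("fire", rule)   # FlagFraud first, SendNewsletter last
```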
Regarding RETE performance, there is a very good PhD dissertation I'd suggest you look at. The author is Robert B. Doorenbos and the title is "Production Matching for Large Learning Systems".
I have created a big ontology (.owl) and I'm now at the reasoning step. The problem is how to ensure scalable reasoning for my ontology. I have searched the literature and found that Big Data could be an adequate solution. Unfortunately, I found that MapReduce cannot accept an OWL file as input, and semantic languages such as SWRL and SPARQL cannot be used.
My questions are:
Should I convert the OWL file to another format?
How can I transform rules (SWRL, for example) into a format acceptable to MapReduce?
Thanks
"Big data can be an adequate solution to that" is too simple a statement for this problem.
Ensuring scalability of OWL ontologies is a very complex issue. The main variables involved are the number of axioms and the expressivity of the ontology; however, these are not always the most important characteristics. A lot also depends on the API used and, for APIs where the reasoning step is separate from parsing, on which reasoner is being used.
SWRL rules add another level of complexity, as they can be of (almost) arbitrary complexity, so it is not possible to guarantee scalability in general. For specific ontologies and specific sets of rules, it is possible to provide better guesses.
A translation to a MapReduce format might help, but there is no standard transformation as far as I'm aware, and it would be quite complex to guarantee that the transformation preserves the semantics of the ontology and of the rule entailments. So the task would amount to rewriting the data in a way that allows you to answer the queries you need to run, but this might prove impossible, depending on the specific ontology.
On the other hand, what is the size of this ontology, and how much memory have you allocated to the task?
In data mining, frequent itemsets are found using different algorithms such as the Apriori algorithm, FP-Tree, etc. So are these the pattern evaluation methods?
You can try Association Rules (apriori for example), Collaborative Filtering (item-based or user-based) or even Clustering.
I don't know what you are trying to do, but if you have a dataset and you need to find the most frequent itemsets, you should try some of the techniques above.
If you're using R you should explore the arules package for association rules (for example).
The Apriori algorithm and the FP-tree algorithm are used to find frequent itemsets in the given transactional data, which helps in market basket analysis applications. For pattern evaluation, there are many measures, namely:
support,
confidence,
lift,
imbalance ratio, etc.
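As a toy illustration of how the first three measures are computed for a pattern such as {bread} -> {butter} (the transactions below are made up; tools like the R arules package report these values directly):

```python
# Toy computation of support, confidence and lift for {bread} -> {butter}.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"beer", "chips"},
    {"bread", "butter", "chips"},
]
n = len(transactions)

def support(itemset):
    return sum(1 for t in transactions if itemset <= t) / n

antecedent, consequent = {"bread"}, {"butter"}
supp = support(antecedent | consequent)    # P(bread and butter) = 0.6
conf = supp / support(antecedent)          # P(butter | bread)   = 0.75
lift = conf / support(consequent)          # confidence / P(butter) = 1.25

print(f"support={supp:.2f} confidence={conf:.2f} lift={lift:.2f}")
```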
More details can be found in the paper:
"Selecting the Right Interestingness Measure for Association Patterns" by Pang-Ning Tan, Vipin Kumar, Jaideep Srivastava, KDD 2002.
URL:
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.13.1494&rep=rep1&type=pdf
I need to build an app (Ruby) that allows the user to select one or more patterns and, when those patterns are matched, complete a set of actions.
While doing my research I've discovered the (new to me) field of rule-based systems and have spent some time reading about it, and it seems to offer exactly the kind of functionality I need.
The app will be integrated with different web services and would allow rules like this one:
When a Highrise contact is added and a Zendesk ticket is created, add the email to the database
I had two ideas for building this. The first is to build some kind of DSL to specify the rule conditions and build them on the fly from the user input.
The second is to build rule classes, each one having a pattern/matcher and action methods. The matcher would evaluate the expression and return true or false, and the action would be executed if the match is positive.
The rules will then need to be persisted and evaluated periodically.
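A rough sketch of what I mean by the second idea, shown in Python for brevity (the real app would be Ruby, and the event names and the database action are just placeholders):

```python
# Each rule is a class with a matcher (pattern) and an action method.
class Rule:
    def matches(self, facts):            # return True if the pattern holds
        raise NotImplementedError
    def execute(self, facts):            # the action, run only on a positive match
        raise NotImplementedError

class ContactAndTicketRule(Rule):
    def matches(self, facts):
        return "highrise_contact_added" in facts and "zendesk_ticket_created" in facts
    def execute(self, facts):
        print("add email to database:", facts["highrise_contact_added"]["email"])

def run(rules, facts):
    for rule in rules:
        if rule.matches(facts):
            rule.execute(facts)

run([ContactAndTicketRule()],
    {"highrise_contact_added": {"email": "jane@example.com"},
     "zendesk_ticket_created": {"id": 42}})
```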
Can anyone shed some light on this design or point somewhere where I can get more information on this?
Thanks
In a commercial rules engine, e.g. Drools, FlexRule..., the pattern matching is handled by the RETE algorithm. Some of them also provide multiple engines for different kinds of logic, e.g. procedural, validation, inference, flow, workflow..., and they also provide DSL customization.
Rule sequencing and execution are handled based on the agenda and activations, which can be defined on the engine. A conflict resolution strategy helps you find the proper activation to fire.
I recommend using a commercial product hosted on a host/service, and a simple JSON/XML format to communicate with the rule server and execute your rules. This will probably give you a better result than creating your own. However, if you are interested in creating your own pattern matching engine, consider the RETE algorithm together with agenda and activation mechanisms for a complex production system.
In the RETE algorithm you should consider implementing at least positive and negative conditions. In implementing RETE you need to build alpha and beta memories as well as join nodes that support left and right activations.
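Here is a very condensed sketch of those building blocks in Python, loosely following the structure described in Doorenbos' dissertation and heavily simplified (no negative nodes, no indexing, no retraction); the fact shapes are invented.

```python
class AlphaMemory:
    """Stores facts that passed a constant test; feeds join nodes on the right."""
    def __init__(self):
        self.facts, self.successors = [], []
    def activate(self, fact):
        self.facts.append(fact)
        for join in self.successors:
            join.right_activate(fact)

class BetaMemory:
    """Stores tokens (partial matches); feeds join nodes on the left."""
    def __init__(self):
        self.tokens, self.children = [], []
    def left_activate(self, token):
        self.tokens.append(token)
        for child in self.children:
            child.left_activate(token)

class JoinNode:
    """Pairs tokens from a beta memory with facts from an alpha memory."""
    def __init__(self, beta, alpha, test, on_match):
        self.beta, self.alpha, self.test, self.on_match = beta, alpha, test, on_match
        beta.children.append(self)
        alpha.successors.append(self)
    def left_activate(self, token):       # a new partial match arrived
        for fact in self.alpha.facts:
            if self.test(token, fact):
                self.on_match(token + [fact])
    def right_activate(self, fact):       # a new fact arrived
        for token in self.beta.tokens:
            if self.test(token, fact):
                self.on_match(token + [fact])

# Rule sketch: "order O belongs to an already-matched customer C".
orders = AlphaMemory()
customers_matched = BetaMemory()
JoinNode(customers_matched, orders,
         test=lambda token, order: order["customer"] == token[0]["name"],
         on_match=lambda match: print("activation:", match))

customers_matched.left_activate([{"name": "alice"}])   # customer partial match
orders.activate({"customer": "alice", "total": 99})    # right activation: fires
orders.activate({"customer": "bob", "total": 10})      # right activation: no match
```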
Do you think you could represent your problem in a graph-based representation? I'm pretty sure that your problem can be considered a graph-based problem.
If yes, why don't you use a graph transformation system to define and apply your rules? The one that I would recommend is GrGen.NET. The use of GrGen.NET builds on five steps:
Definition of the metamodel: Here, you define your building blocks, i.e. the types of graph nodes and graph edges.
Definition of the ruleset: This is where you put your pattern-detecting rules. Moreover, you can create rules encapsulating procedures that manipulate your graph-based data structure.
Compilation: Based on the previous two steps, a C# assembly (DLL) is created. There should be a way to access such a DLL from Ruby.
Definition of a rule sequence: Rule sequences specify the order in which individual rules are executed. Typically, it's a logical structure in which the rules are concatenated.
Graph transformation: Applying a rule sequence (via the compiled DLL) transforms the graph, which can subsequently be exported, saved or further manipulated.
You can find a very good manual of GrGen.NET here: http://www.info.uni-karlsruhe.de/software/grgen/GrGenNET-Manual.pdf
In the past I had to develop a program which acted as a rule evaluator. You had an antecedent and some consequents (actions), so if the antecedent evaluated to true, the actions were performed.
At that time I used a modified version of the RETE algorithm (there are three versions of RETE, only the first being public) for the antecedent pattern matching. We're talking about a big system here, with millions of operations per rule and some operators "repeated" in several rules.
It's possible I'll have to implement it all over again in another language and, even though I'm experienced in RETE, does anyone know of other pattern matching algorithms? Any suggestions, or should I keep using RETE?
The TREAT algorithm is similar to RETE, but doesn't record partial matches. As a result, it may use less memory than RETE in certain situations. Also, if you modify a significant number of the known facts, then TREAT can be much faster because you don't have to spend time on retractions.
There's also RETE*, which balances between RETE and TREAT by saving some join node state depending on how much memory you want to use. So you still save some assertion time, but also get memory and retraction-time savings depending on how you tune your system.
You may also want to check out LEAPS, which uses a lazy evaluation scheme and incorporates elements of both RETE and TREAT.
I only have personal experience with RETE, but it seems like RETE* or LEAPS are the better, more flexible choices.
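To make the RETE/TREAT memory trade-off concrete, here is a conceptual Python contrast (not a faithful implementation of either algorithm; the fact shapes are made up): caching partial matches means a retraction must also purge the cached tokens, while recomputing the join on demand keeps retraction cheap.

```python
customers = [{"name": "alice"}, {"name": "bob"}]
orders = [{"customer": "alice", "id": 1},
          {"customer": "alice", "id": 2},
          {"customer": "bob",   "id": 3}]

# RETE-style: materialise the partial matches and keep them up to date.
partial_matches = [(c, o) for c in customers for o in orders
                   if o["customer"] == c["name"]]

def retract_order_rete(order):
    orders.remove(order)
    # extra bookkeeping: every cached token built from this fact must go too
    partial_matches[:] = [(c, o) for (c, o) in partial_matches if o is not order]

# TREAT-style: cache nothing, recompute the join when the rule is evaluated.
def current_matches_treat():
    return [(c, o) for c in customers for o in orders
            if o["customer"] == c["name"]]

retract_order_rete(orders[0])
print(len(partial_matches), len(current_matches_treat()))   # both report 2 matches
```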