Design patterns/advice on building a rule engine - Ruby

I need to build an app (Ruby) that allows the user to select one or more patterns and, when those patterns are matched, proceed to complete a set of actions.
While doing my research I discovered the (new to me) field of rule-based systems and have spent some time reading about it, and it seems to be exactly the kind of functionality I need.
The app will be integrated with different web services and should allow rules like this one:
When Highrise contact is added and Zendesk ticket is created do add email to database
I had two ideas for building this. The first is to build some kind of DSL to specify the rule conditions and construct them on the fly from user input.
The second is to build rule classes, each with a pattern/matcher method and an action method. The pattern would evaluate the expression and return true or false, and the action would be executed if the match is positive.
The rules will then need to be persisted and then evaluated periodically.
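A minimal sketch of the second idea; the class names, the context hash, and the Highrise/Zendesk flags are all illustrative, not a real API:

```ruby
# Base class: each rule has a pattern (matches?) and an action (execute).
class Rule
  # Pattern: evaluate the condition, return true or false.
  def matches?(context)
    raise NotImplementedError
  end

  # Action: executed only when the pattern matches.
  def execute(context)
    raise NotImplementedError
  end

  def run(context)
    execute(context) if matches?(context)
  end
end

class NewContactWithTicketRule < Rule
  def matches?(context)
    context[:highrise_contact_added] && context[:zendesk_ticket_created]
  end

  def execute(context)
    context[:emails] << context[:contact_email]  # "add email to database"
  end
end

emails = []
rule = NewContactWithTicketRule.new
rule.run(highrise_contact_added: true,
         zendesk_ticket_created: true,
         contact_email: "jane@example.com",
         emails: emails)
```

Persisting instances of such classes (e.g. as serialized condition parameters) and iterating over them on a schedule would cover the periodic evaluation.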
Can anyone shed some light on this design or point somewhere where I can get more information on this?
Thanks

In a commercial rules engine, e.g. Drools or FlexRule, the pattern matching is handled by the Rete algorithm. Some of them also provide multiple engines for different kinds of logic (procedural, validation, inference, flow, workflow...), and they provide DSL customization as well.
Rule sequencing and execution are handled based on the agenda and activations defined on the engine, and a conflict resolution strategy helps you find the proper activation to fire.
I recommend using a commercial product hosted on a host/service, and communicating with the rule server in a simple JSON/XML format to execute your rules. This will probably give you better results than building your own. However, if you are interested in building your own pattern-matching engine, look into the Rete algorithm and the agenda/activation mechanisms used in complex production systems.
In the Rete algorithm you should consider implementing at least positive and negative conditions. An implementation needs beta and alpha memories, as well as join nodes that support left and right activations.
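A highly simplified sketch of two of those building blocks (all names and the fact format are made up; real Rete nodes also handle negative conditions, retractions, and a full token structure):

```ruby
# Alpha memory: stores facts that pass a single test and notifies
# successor nodes on each new fact (a "right activation").
class AlphaMemory
  attr_reader :facts

  def initialize(&test)
    @test = test
    @facts = []
    @successors = []
  end

  def subscribe(node)
    @successors << node
  end

  def activate(fact)
    return unless @test.call(fact)
    @facts << fact
    @successors.each { |n| n.right_activate(fact) }
  end
end

# Join node: pairs tokens arriving from the left (beta memory) with
# facts arriving from the right (an alpha memory).
class JoinNode
  attr_reader :matches

  def initialize(alpha, &join_test)
    @alpha = alpha
    @join_test = join_test
    @tokens = []   # beta (left) memory
    @matches = []  # complete matches produced so far
    alpha.subscribe(self)
  end

  def left_activate(token)
    @tokens << token
    @alpha.facts.each { |f| emit(token, f) }
  end

  def right_activate(fact)
    @tokens.each { |t| emit(t, fact) }
  end

  private

  def emit(token, fact)
    @matches << token + [fact] if @join_test.call(token, fact)
  end
end

# Wiring: contacts feed the join from the left, tickets from the right.
contacts = AlphaMemory.new { |f| f[:type] == :contact }
tickets  = AlphaMemory.new { |f| f[:type] == :ticket }
join = JoinNode.new(tickets) { |token, fact| token.last[:email] == fact[:email] }
left_adapter = Object.new
left_adapter.define_singleton_method(:right_activate) { |fact| join.left_activate([fact]) }
contacts.subscribe(left_adapter)

contacts.activate(type: :contact, email: "a@b.com")
tickets.activate(type: :ticket, email: "a@b.com")
tickets.activate(type: :ticket, email: "z@z.com")
```

Note how the match fires regardless of which side's fact arrives first: that incremental behavior is the point of keeping both memories.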

Do you think you could represent your problem in a graph-based representation? I'm pretty sure it can be considered a graph-based problem.
If so, why not use a graph transformation system to define and apply your rules? The one I would recommend is GrGen.NET. Using GrGen.NET involves five steps:
Definition of the metamodel: Here you define your building blocks, i.e. the types of graph nodes and graph edges.
Definition of the ruleset: This is where you can put your pattern detecting rules. Moreover, you can create rule encapsulating procedures to manipulate your graph-based data structure.
Compilation: Based on the previous two steps, a C# assembly (DLL) is created. There should be a way to access such a DLL from Ruby.
Definition of a rule sequence: Rule sequences contain the structure in which individual rules are executed. Typically, it's a logical structure in which the rules are concatenated.
Graph transformation: Applying a rule sequence via the compiled DLL results in the transformation of a graph, which can subsequently be exported, saved, or further manipulated.
You can find a very good manual of GrGen.NET here: http://www.info.uni-karlsruhe.de/software/grgen/GrGenNET-Manual.pdf

Related

What does it mean to express this puzzle as a CSP?

What is meant by the below for the attached image?
By labelling each cell with a variable, express the puzzle as a CSP. Hint: recall that a CSP is composed of three parts.
I initially thought I should just label each cell with a variable like A, B, C, etc., and then constrain those cells, but I do not believe that is correct. I do not want the answer, just an explanation of what is required in terms of a CSP.
In my opinion, a CSP is best divided into two parts:
State the constraints. This is called the modeling part or model.
Search for solutions using enumeration predicates like labeling/2.
These parts are best kept separate by using a predicate which we call core relation and which has the following properties:
It posts the constraints, i.e., it expresses part (1) above.
Its last argument is the list of variables that still need to be labeled.
By convention, its name ends with an underscore _.
Having this distinction in place allows you to:
try different search strategies without the need to recompile your code
reason about termination properties of the core relation in isolation from any concrete (and often very costly) search.
I can see how some instructors may decompose part (1) into:
1a. stating the domains of the variables, using for example in/2 constraints
1b. stating the other constraints that hold among the variables.
In my view this distinction is artificial, because in/2 constraints are constraints like all the others in the modeling part. Some instructors may teach them separately for historical reasons, dating back to the time when CSP systems were not as dynamic as they are now.
Nowadays, you can typically post additional domain restrictions any time you like and freely mix in/2 constraints with other constraints in any order.
So, the parts that are expected from you are likely: (a) state in/2 constraints, (b) state further constraints and (c) use enumeration predicates to search for concrete solutions. It also appears that you already have the right idea about how to solve this concrete CSP with this method.

Ideal Topology of Rete Tree for Rule Engine

As you know, many rule engines use the Rete algorithm when processing rules, and this algorithm constructs a tree, the so-called Rete tree.
What is the ideal topology for a Rete tree to ensure better rule-processing performance? In other words, I want to know which tree topology a rule set should correspond to for better performance.
The short answer is that the performance is affected by the number of rules and objects, the number of tests, how you order the tests in your rules, and how many tests/conditions are shared between rules.
You should rewrite rules for optimal performance by:
Reordering tests and conditions so that the most discriminating conditions are moved to the beginning of the rule
Sharing conditions
See the Adjusting conditions IBM ODM documentation.
You should also reduce the number of objects that need to be evaluated by rules, and the number of tests.
For your reference regarding Rete and IBM ODM:
For an example of the structure of a Rete tree, refer to the RetePlus network structure IBM ODM documentation
What affects the performance of a Decision Server application : RetePlus
RetePlus is designed to optimize the evaluation of large numbers of rules across large numbers of objects. RetePlus filters the tests such that irrelevant tests are not evaluated. Tests can be shared between rules that use similar tests so that they do not need to be re-evaluated for all the rules.
What affects the performance of a Decision Server application: Rule organization
For the best results:
Common tests on different objects are shared.
The number of tests carried out is minimized.
Performance degrades when a single evaluation contains too many variable definitions and conditions.
The test uses less memory.
Simply put, if you would like to use the RetePlus algorithm in your orchestration, use only Decision Trees business rules.
It's much faster when used this way, although you may use it in combination with other algorithms, such as Sequential (for Action Rules, in this case).
So your solution could be part Action Rules (with Sequential) and part Decision Tables (with RetePlus).
Hope this helps.

Data dependency and consistency

I'm developing a quite large (for me) Ruby script for engineering calculations. The script creates a few objects that are interconnected in a hierarchical fashion.
For example one object (Inp) contains the input parameters for a set of simulations. Other objects (SimA, SimB, SimC) are used to actually perform the simulations and each of them may generate one or more output objects (OutA, OutB, OutC) that contain the results and produce the actual files used for the visualization or analysis by other objects and so on.
The first time I perform and complete all the simulations, all the objects will be fully defined and I will have a series of files that represent the outputs for the user.
Now suppose that the user needs to change one of the attributes of Inp. Depending on which attribute has been modified, some simulations will have to be re-run and some objects OutX will be rendered invalid; otherwise consistency would be lost, as the outputs would no longer correspond to the inputs.
I would like to know whether there is a design pattern that would facilitate this process. I was also wondering whether some sort of graph could be used to visually represent the various dependencies between objects in a clear way.
From what I have been reading (this question is a year old) I think that the Ruby Observable class could be used for this purpose. Every time a parent object changes, it should send a message to its children so that they can update their state.
Is this the recommended approach?
I hope this makes the question a bit clearer.
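A minimal sketch of that Observable idea using Ruby's standard observer library; Inp and Sim mirror the question's hypothetical objects, and the attribute names are invented:

```ruby
require 'observer'

class Inp
  include Observable

  def initialize(params)
    @params = params
  end

  def update_param(key, value)
    @params[key] = value
    changed
    notify_observers(key)  # tell dependents which attribute changed
  end
end

class Sim
  attr_reader :valid

  def initialize(inp, depends_on:)
    @depends_on = depends_on
    @valid = false
    inp.add_observer(self)
  end

  def run
    @valid = true  # stand-in for actually running the simulation
  end

  # Called by Observable#notify_observers; invalidate the results only
  # if the changed attribute is one this simulation depends on.
  def update(changed_key)
    @valid = false if @depends_on.include?(changed_key)
  end
end

inp   = Inp.new(mesh_size: 0.1, velocity: 5.0)
sim_a = Sim.new(inp, depends_on: [:mesh_size])
sim_b = Sim.new(inp, depends_on: [:velocity])
sim_a.run
sim_b.run
inp.update_param(:velocity, 7.5)  # only sim_b becomes stale
```

Passing the changed key to the observers is what lets each dependent decide for itself whether it must be re-run.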
I'm not sure that I fully understand your question, but the problem of stages which depend on results of previous stages which in turn depend on results from previous stages which themselves depend on result from previous stages, and every one of those stages can fail or take an arbitrary amount of time, is as old as programming itself and has been solved a number of times.
Tools which do this are typically called "build tools", because this is a problem that often occurs when building complex software systems, but they are in no way limited to building software. A more fitting term would be "dependency-oriented programming". Examples include make, ant, or Ruby's own rake.

Whats the best way to approach rule validation

So I'm currently working as an intern at a company and have been tasked with creating the middle-tier layer of a UI rule editor for an analytical engine. As part of this task I have to ensure that all rules created are valid. These rules can be quite complex, consisting of around 10 fields with multiple possibilities for each field.
I'm in way over my head here. I've been trying to find some material to guide me on this task, but I can't seem to find much. Is there any pattern or design approach I can take to break this up into more manageable tasks? A book to read? Any ideas or guidance would be appreciated.
You may want to invest the time to learn a lexer/parser generator, e.g. Antlr4. You can use the AntlrWorks2 IDE to assist with visualization and debugging.
Antlrworks2: http://tunnelvisionlabs.com/products/demo/antlrworks
You can get off the ground by searching for example grammars and then tweak them for your particular needs.
Grammars: https://github.com/antlr/grammars-v4
Antlr provides output bindings in a number of different languages - so you will likely have one that fits your needs.
This is not a trivial task in any case - but an interesting and rewarding one.
You need to build the validation algorithm yourself.
Points to be followed:
1.) Validate parameters based on the datatypes supported and their compatibility.
2.) Determine which operators may follow an operand of a specific datatype.
3.) The result of a subexpression should again be compatible with the next operand or operator.
Also provide a feature for simulating a rule, where the user can select the dataset on which the rule should be fired.
e.g.
a + b > c
Possible combinations:
1.) a and b can each be a String, number, or integer.
2.) But if the result of a + b is a String, the operator ">" cannot follow.
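A hypothetical sketch of points 2.) and 3.): track the result type of each subexpression and reject operators that are incompatible with it. The type names and rules here are a toy type system, not the intern's actual field definitions:

```ruby
NUMERIC_TYPES = [:integer, :number].freeze

# Returns the result type of `left op right`, or nil if the combination
# is invalid in this toy type system.
def result_type(op, left, right)
  case op
  when :+
    return :string if left == :string && right == :string
    return :number if NUMERIC_TYPES.include?(left) && NUMERIC_TYPES.include?(right)
    nil  # e.g. String + number is rejected here
  when :>
    NUMERIC_TYPES.include?(left) && NUMERIC_TYPES.include?(right) ? :boolean : nil
  end
end

# Validate "a + b > c" given the declared types of the operands.
def valid_rule?(a_type, b_type, c_type)
  sum_type = result_type(:+, a_type, b_type)
  return false if sum_type.nil?
  !result_type(:>, sum_type, c_type).nil?
end
```

The same result-type propagation generalizes to an expression tree produced by a parser: validate bottom-up, failing as soon as any node's operand types are incompatible.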

Implementing a model written in predicate calculus in Prolog, how do I start?

I have four sets of algorithms that I want to set up as modules, but I need all algorithms executed at the same time within each module. I'm a complete noob and have no programming experience. I do, however, know how to prove my models are decidable and have already done so (I know applied logic).
The models are sensory parsers. I know how to create the state spaces for the modules, but I don't know how to program driver access into Prolog for my web cam (I have a Toshiba Satellite laptop with a built-in web cam). I also don't know how to link the input from the web cam to the variables in the algorithms I've written. The variables I use, when combined and identified with functions, are set to identify unknown input using a probabilistic database search for the best match after a breadth-first search. The parsers aren't holistic, which is why I want to run them either in parallel or as needed.
How should I go about this?
I also don't know how to link the input from the web cam to the variables in the algorithms I've written.
I think the most common way to do this is the machine-learning approach: first calculate features from your video stream (like position of color blobs, optical flow, amount of green in the image, whatever you like). Then use supervised learning on labeled data to train models like HMMs, SVMs, or ANNs to recognize the labels from the features. The labels are usually higher-level things like faces, a smile, or waving hands.
Depending on the nature of your "variables", they may already be covered on the feature-level, i.e. they can be computed from the data in a known way. If this is the case you can get away without training/learning.
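A toy feature extractor along the lines described above: given a frame as an array of `[r, g, b]` pixel values, compute the fraction of pixels in which green dominates. The function name and representation are invented; a real system would use a vision library on the raw video stream instead:

```ruby
def green_fraction(pixels)
  return 0.0 if pixels.empty?
  green = pixels.count { |r, g, b| g > r && g > b }
  green.to_f / pixels.size
end
```

Features like this one are the "known way" computations mentioned above: they need no training, only a definition.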
