Multiple events matching algorithm

I have a task to match multiple events (facts) with each other by some of their properties.
As a result of the matching, an action should be generated. An action can be generated only when events of all existing types have been matched.
Is there an algorithm that could be used for such a task, or any direction to look in?
Thanks
Example:
We have several events with different types and properties.
Type SEEN is a cumulative event (several events can be merged for matching) and type FOUND is not.
Event 1 (SEEN):
DATE="2009-09-30"
EYES_COLOR="BLUE"
LEFT_SOCK_COLOR="RED"
Event 2 (SEEN):
DATE="2009-09-30"
EYES_COLOR="BLUE"
RIGHT_SOCK_COLOR="GREEN"
Event 3 (FOUND):
DATE="2009-09-30"
EYES_COLOR="BLUE"
LEFT_SOCK_COLOR="BLUE"
RIGHT_SOCK_COLOR="GREEN"
PLACE="MARKET"
Event 4 (FOUND):
DATE="2009-09-30"
EYES_COLOR="BLUE"
LEFT_SOCK_COLOR="GREEN"
PLACE="SHOP"
Event 5 (FOUND):
DATE="2009-09-30"
EYES_COLOR="BLUE"
PLACE="AIRPORT"
For the above events, the following actions should be generated (by composing matched events):
Action 1_2_3:
DATE="2009-09-30"
EYES_COLOR="BLUE"
LEFT_SOCK_COLOR="RED"
RIGHT_SOCK_COLOR="GREEN"
PLACE="MARKET"
Action 2_4:
DATE="2009-09-30"
EYES_COLOR="BLUE"
LEFT_SOCK_COLOR="GREEN"
PLACE="SHOP"
Means:
Event 1 + Event 2 + Event 3 => Action 1_2_3
Event 2 + Event 4 => Action 2_4
Event 5 does not match with anything.

In your case every two events are either compatible or not; we can denote this by C(e,e'), meaning that event e is compatible with event e'. You can of course build a maximal set of compatible events iteratively: when you have a set {e1,e2,...,en} of compatible events, you can add e' to the set if and only if e' is compatible with each of e1,...,en, i.e. C(ei,e') is true for all 1<=i<=n.
Unfortunately, the number of maximal sets of compatible events can be exponential in the number of events. For example, you can have events e1, e2, e3 and e4 such that every pair is compatible but no event can be combined with two others at once; this set alone already yields 6 different "actions" (one per pair), and they overlap each other.
A simple algorithm is a recursive search in which you add events one by one to the prospective "action"; when you can't add any more events you register the action, and then you backtrack. This is called "backtracking search". You can then improve its running time with data structures that allow matching events to be looked up quickly.
As in the comment, the question about SEEN/FOUND is open; I'm assuming here that the fields are merged "as is".
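The backtracking search described above can be sketched in Python as follows. This is only an illustration under assumptions that are not in the question: events are plain dictionaries, two events count as compatible when no field they share has different values, and the SEEN/FOUND distinction is ignored. The function names are hypothetical. The recursion follows the Bron-Kerbosch scheme without pivoting, which is one common way to organise such a search so that only maximal sets are reported.

```python
def compatible(a, b):
    """Assumed rule: compatible when no shared key has different values."""
    return all(b[k] == v for k, v in a.items() if k in b)


def maximal_compatible_sets(events):
    """Return every maximal set of pairwise-compatible events, as index lists."""
    results = []

    def extend(chosen, candidates, excluded):
        # candidates and excluded hold indices compatible with all of chosen;
        # excluded ones were already explored in an earlier branch.
        if not candidates and not excluded:
            results.append(sorted(chosen))  # nothing can be added: maximal
            return
        for i in list(candidates):
            extend(
                chosen + [i],
                [j for j in candidates if j != i and compatible(events[i], events[j])],
                [j for j in excluded if compatible(events[i], events[j])],
            )
            # backtrack: event i is done, never start a set from it again
            candidates.remove(i)
            excluded.append(i)

    extend([], list(range(len(events))), [])
    return results


events = [
    {"EYES_COLOR": "BLUE", "LEFT_SOCK_COLOR": "RED"},
    {"EYES_COLOR": "BLUE", "RIGHT_SOCK_COLOR": "GREEN"},
    {"EYES_COLOR": "BLUE", "LEFT_SOCK_COLOR": "GREEN", "PLACE": "SHOP"},
]
print(maximal_compatible_sets(events))  # [[0, 1], [1, 2]]
```

Each returned list holds the positions of events that could be merged into one candidate "action"; how the fields are actually merged is left open here, as in the answer.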

This pseudo-code may help: (C# syntax)
foreach (var found in events.Where(x => x.EventType == "Found"))
{
    // Materialise the matches once so the query is not re-evaluated.
    var matches = events.Where(x => x.EventType == "Seen"
                                 && x.Whatever == found.Whatever).ToList();
    if (matches.Any())
    {
        // Create an action based on the single "Found" event
        // and the multiple matching "Seen" events.
    }
}

I'm not sure I understand the question correctly. It seems that for every FOUND event, you want to identify all matching SEEN events and merge them? Python code:
# assume events are dictionaries, and you have 2 lists of them by type:
# (omitting DATE because it's always "2009-09-30" in your example)
seen_events = [
    {
        "EYES_COLOR": "BLUE",
        "LEFT_SOCK_COLOR": "RED",
    },
    {
        "EYES_COLOR": "BLUE",
        "RIGHT_SOCK_COLOR": "GREEN",
    },
]
found_events = [
    {
        "EYES_COLOR": "BLUE",
        "LEFT_SOCK_COLOR": "BLUE",
        "RIGHT_SOCK_COLOR": "GREEN",
        "PLACE": "MARKET",
    },
    {
        "EYES_COLOR": "BLUE",
        "LEFT_SOCK_COLOR": "GREEN",
        "PLACE": "SHOP",
    },
    {
        "EYES_COLOR": "BLUE",
        "PLACE": "AIRPORT",
    },
]
def do_action(seen_events, found):
    """DUMMY"""
    for seen in seen_events:
        print(seen)
    print(found)
    print("")
# brute force
for found in found_events:
    matching = []
    for seen in seen_events:
        for k in found:
            if k in seen and seen[k] != found[k]:
                break
        else:  # for ended without break (Python syntax)
            matching.append(seen)
    if matching:
        do_action(matching, found)
which prints:
{'EYES_COLOR': 'BLUE', 'RIGHT_SOCK_COLOR': 'GREEN'}
{'EYES_COLOR': 'BLUE', 'PLACE': 'MARKET', 'LEFT_SOCK_COLOR': 'BLUE', 'RIGHT_SOCK_COLOR': 'GREEN'}
{'EYES_COLOR': 'BLUE', 'RIGHT_SOCK_COLOR': 'GREEN'}
{'EYES_COLOR': 'BLUE', 'PLACE': 'SHOP', 'LEFT_SOCK_COLOR': 'GREEN'}
{'EYES_COLOR': 'BLUE', 'LEFT_SOCK_COLOR': 'RED'}
{'EYES_COLOR': 'BLUE', 'RIGHT_SOCK_COLOR': 'GREEN'}
{'EYES_COLOR': 'BLUE', 'PLACE': 'AIRPORT'}
Right, this is not efficient - O(n*m) - but does this even describe the problem correctly?
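If the field-by-field comparison becomes a bottleneck, one possible direction is to index the SEEN events by key and value and remove the conflicting ones with set operations. The sketch below assumes the same matching rule as the brute-force loop above (a SEEN event matches unless it shares a key with the FOUND event and holds a different value). The helper names are hypothetical, and whether this is faster in practice depends on the data.

```python
from collections import defaultdict


def build_index(events):
    """Map each key to {value: set of positions of events holding that value}."""
    index = defaultdict(lambda: defaultdict(set))
    for pos, event in enumerate(events):
        for key, value in event.items():
            index[key][value].add(pos)
    return index


def matching_positions(found, index, total):
    """Positions of indexed events with no conflicting value for any key in found."""
    candidates = set(range(total))
    for key, value in found.items():
        for other_value, positions in index.get(key, {}).items():
            if other_value != value:
                candidates -= positions  # same key, different value: conflict
    return candidates


seen_events = [
    {"EYES_COLOR": "BLUE", "LEFT_SOCK_COLOR": "RED"},
    {"EYES_COLOR": "BLUE", "RIGHT_SOCK_COLOR": "GREEN"},
]
index = build_index(seen_events)
found = {"EYES_COLOR": "BLUE", "LEFT_SOCK_COLOR": "GREEN", "PLACE": "SHOP"}
print(sorted(matching_positions(found, index, len(seen_events))))  # [1]
```

The index is built once and reused for every FOUND event, so the per-event work depends on how many distinct values each key has rather than on the number of SEEN events compared one by one.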

Related

How to implement buffering with timeout in RX.JS

I'm trying to group the values from an observable into arrays of size n, so that I can send them to a service in batches and improve overall performance.
The thing is that I want to make sure that when fewer than n items are left, they will still be passed down the chain after a certain timeout.
I'm trying to rewrite the C# solution from
https://stackoverflow.com/a/22873833/2157455
in Javascript.
The main problem is that in Rx.Js lots of methods have been deprecated and it's hard to find the new functions.
var people = new List<(string name, int age)>
{
("Sue", 25 ),
("Joe", 30 ),
("Frank", 25 ),
("Sarah", 35 ),
("John", 37)
}.ToObservable();
var buffers = people
.GroupByUntil(
// yes. yes. all items belong to the same group.
x => true,
g => Observable.Amb(
// close the group after 5 seconds of inactivity
g.Throttle(TimeSpan.FromSeconds(5)),
// close the group after 10 items
g.Skip(1)
))
// Turn those groups into buffers
.SelectMany(x => x.ToArray());
I could get this far, but I can't find the replacement for groupByUntil. I'm also not sure what the equivalent of the selectMany operator is in Rx.Js, probably toArray().
Most examples I find use deprecated or non-existing functions.
I'm using rxjs 7.8.0
The syntax does not help either; using pipe all the time makes the code difficult to read, in my opinion.
const people = [
{ name: 'Sue', age: 25 },
{ name: 'Joe', age: 30 },
{ name: 'Frank', age: 25 },
{ name: 'Sarah', age: 35 },
{ name: 'John', age: 37 }
];
const source = from(people);
const example = source.pipe(
groupBy(person => true),
mergeMap(group => group.pipe(
raceWith(
group.pipe(throttle(() => interval(1000))),
group.pipe(skip(2))
),
toArray()
)));
example.forEach(x => console.log(x.length));
I'm getting all 5 in one array, instead of two arrays, one with 3 items and the other with 2.
Perhaps there is a better way to write it in js, but I can't see the replacement for groupByUntil.
Thanks.
bufferTime is probably what you are looking for.
One of its signatures is:
bufferTime(bufferTimeSpan: number, bufferCreationInterval: number, maxBufferSize: number, scheduler?: SchedulerLike): OperatorFunction<T, T[]>
so with bufferTime(1000, null, 2) a buffer is emitted as soon as it holds 2 items or after 1 s, whichever comes first.

Data Store - dash_table conditional formatting failing

@dashapp.callback(
Output(component_id='data-storage', component_property='data'),
Input(component_id='input', component_property='n_submit')
.
.
.
return json_data
@dashapp.callback(
Output('table', component_property='columns'),
Output('table', component_property='data'),
Output('table', component_property='style_cell_conditional'),
Input(component_id='data-storage', component_property='data'),
.
.
.
column_name = 'Target Column'
value = 'This value is a string'
table_columns = [{"name": i, "id": i} for i in df.columns]
table_data = df.to_dict("records")
conditional_formatting = [{
'if': {
'filter_query': f'{{{column_name}}} = {value}'
},
'backgroundColor': 'white',
'color' : 'black',
}
]
return table_columns, table_data, conditional_formatting
When the code above is used WITH the conditional_formatting part, it works for some values of 'value' and does not work for others.
When the code above is used WITHOUT the conditional_formatting part, it works as expected for all values.
Note that when the conditional_formatting part is used, all callbacks are triggered twice. After this happens, the Data Store behaves as if it had been infected by the "sick" value and does not accept new data.
Example:
Step 1. Use working input -> All callbacks triggered once -> Data Store is populated -> Data is displayed as expected
Step 2. Use working input -> All callbacks triggered once -> Data Store is populated -> Data is displayed as expected
Step 3. Use non-working input -> All callbacks triggered once -> All callbacks are triggered again -> Data related to the input from Step 2 is displayed
Step 4. Use working input -> All callbacks triggered once -> All callbacks are triggered again -> Data related to the input from Step 2 is displayed
Any idea why this happens?
Any feedback is appreciated!
conditional_formatting = [{
'if': {
'filter_query': f'{{{column_name}}} = "{value}"'
},
'backgroundColor': 'white',
'color' : 'black',
}
]
The issue was that the failing values contained a space (e.g. San Francisco). Wrapping the value in double quotes inside the filter_query solved it.
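To make the difference concrete, the sketch below builds the query string both ways and prints the result. The column name "City" and the helper names are made up for illustration; only the two f-string patterns come from the question and the answer. It does not handle values that themselves contain double quotes.

```python
def unquoted_query(column_name, value):
    """The original, failing form: the value is pasted in bare."""
    return f'{{{column_name}}} = {value}'


def quoted_query(column_name, value):
    """The fixed form: the value is wrapped in double quotes."""
    return f'{{{column_name}}} = "{value}"'


print(unquoted_query("City", "San Francisco"))  # {City} = San Francisco
print(quoted_query("City", "San Francisco"))    # {City} = "San Francisco"
```

In the unquoted form the value reaches the filter expression as two separate words, while the quoted form passes it as a single string literal.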

How to synchronise rxjs observables

I'm looking for a way to make streams that are combined
Note: this is the simplest form of my problem, in reality I'm combining 8 different streams some are intertwined, some are async etc :(
import { BehaviorSubject, map, combineLatest } from 'rxjs';
const $A = new BehaviorSubject(1)
const $B = $A.pipe(map(val => `$B : ${val}`))
const $C = $A.pipe(map(val => `$C : ${val}`))
// prints out:
// (1) [1, "$B : 1", "$C : 1"]
combineLatest([$A,$B,$C]).subscribe(console.log)
$A.next(2)
// prints out:
// (2) [2, "$B : 1", "$C : 1"]
// (3) [2, "$B : 2", "$C : 1"]
// (4) [2, "$B : 2", "$C : 2"]
The print out (1) is great, all streams have a value of "1": [1, "$B : 1", "$C : 1"]
The print out (4) is great, all streams have a value of "2": [2, "$B : 2", "$C : 2"]
But combineLatest also fires for (2) and (3), after each stream is updated individually, which means you get a mixture of "1" and "2".
**How can I modify the code so that I am only notified once a change has fully propagated?**
My best solutions so far:
A) using debounceTime(100)
combineLatest([$A,$B,$C]).pipe(debounceTime(100)).subscribe(console.log)
But it's flaky: it can either swallow valid states if they are processed too quickly, or notify with invalid states if individual pipes are too slow
B) filter only valid state
combineLatest([$A,$B,$C]).pipe(
filter(([a,b,c])=>{
return b.indexOf(a) > -1 && c.indexOf(a) > -1
})
).subscribe(console.log)
works but adding a validation function seems like the wrong way to do it (and more work :))
C) Make B$ and C$ subjects into which we push the latest value, resetting them at every change
A$.pipe(tap((val) => {
  B$.next(undefined);
  B$.next(val);
  C$.next(undefined);
  C$.next(val);
}))
...
combineLatest([$A,$B.pipe(filter(b => !!b)),$C.pipe(filter(c => !!c))]).pipe(
  filter(([a,b,c])=>{
    return b.indexOf(a) > -1 && c.indexOf(a) > -1
  })
)
Works but quite a lot of extra code and vars
I have the feeling I'm missing a concept or not seeing how to achieve this in a clean/robust way, but I'm sure I'm not the first one :)
Thanks
As you've observed, the observable created by combineLatest will emit when any of its sources emit.
Your problem is occurring because you pass multiple observables into combineLatest that share a common source. So whenever that common source emits, it causes each derived observable to emit.
One way to "fix" this in a synchronous scenario is to simply apply debounceTime(0) which will mask the duplicate emission that happens in the same event loop. This approach is a bit naive, but works in simple scenarios:
combineLatest([$A,$B,$C]).pipe(
debounceTime(0)
)
But, since you have some async things going on, I think your solution is to not include duplicate sources inside combineLatest and handle the logic further down the chain:
combineLatest([$A]).pipe(
map(([val]) => [
val,
`$B : ${val}`,
`$C : ${val}`,
])
)
The code above produces the desired output. Obviously, you wouldn't need combineLatest with a single source, but the idea is the same if you had multiple sources.
Let's use a more concrete example that has the same issue:
const userId$ = new ReplaySubject<string>(1);
const maxMsgCount$ = new BehaviorSubject(2);
const details$ = userId$.pipe(switchMap(id => getDetails(id)));
const messages$ = combineLatest([userId$, maxMsgCount$]).pipe(
switchMap(([id, max]) => getMessages(id, max))
);
const user$ = combineLatest([userId$, details$, messages$]).pipe(
map(([id, details, messages]) => ({
id,
age: details.age,
name: details.name,
messages
}))
);
Notice that when userId$ emits a new value, the user$ observable ends up emitting values that have the new userId, but the details from the old user!
We can prevent this by only including unique sources in our combineLatest:
const userId$ = new ReplaySubject<string>(1);
const maxMsgCount$ = new BehaviorSubject(2);
const user$ = combineLatest([userId$, maxMsgCount$]).pipe(
switchMap(([id, max]) => combineLatest([getDetails(id), getMessages(id, max)]).pipe(
map(([details, messages]) => ({
id,
age: details.age,
name: details.name,
messages
}))
))
);
You can see this behavior in action in the below stackblitz samples:
Problem
Solution

pytransitions is there a simple way to get the history of triggered events

from transitions import Machine
class Matter(object):
    def __init__(self, states, transitions):
        self.states = states
        self.transitions = transitions
        self.machine = Machine(model=self, states=self.states, transitions=transitions, initial='liquid')
    def get_triggered_events(self, source, dest):
        self.machine.set_state(source)
        # call the auto-generated to_<dest>() method without eval
        getattr(self, "to_{}".format(dest))()
        return
states=['solid', 'liquid', 'gas', 'plasma']
transitions = [
    { 'trigger': 'melt', 'source': 'solid', 'dest': 'liquid' },
    { 'trigger': 'evaporate', 'source': 'liquid', 'dest': 'gas' },
    { 'trigger': 'sublimate', 'source': 'solid', 'dest': 'gas' },
    { 'trigger': 'ionize', 'source': 'gas', 'dest': 'plasma' }
]
matter=Matter(states,transitions)
matter.get_triggered_events("solid","plasma")
I want to get the history of triggered events from source to destination in the get_triggered_events method. E.g. running matter.get_triggered_events("solid","plasma") should return [["melt","evaporate","ionize"],["sublimate","ionize"]]. Is there a simple way to achieve this?
I guess what you intend to do is to get all possible paths from one state to another.
You are interested in the names of the events that must be emitted/triggered to reach your target.
A solution is to traverse through all events that can be triggered in a state. Since one event can result in multiple transitions we need to loop over a list of transitions as well. With Machine.get_triggers(<state_name>) we'll get a list of all events that can be triggered from <state_name>.
With Machine.get_transitions(<event_name>, source=<state_name>) we will get a list of all transitions that are associated with <event_name> and can be triggered from <state_name>.
We basically feed the result from Machine.get_triggers to Machine.get_transitions and loop over the list of transitions while keeping track of the events that have been processed so far.
Furthermore, to prevent cyclic traversal, we also keep track of the transition entities we have already visited:
from transitions import Machine
states = ['solid', 'liquid', 'gas', 'plasma']
transitions = [
    {'trigger': 'melt', 'source': 'solid', 'dest': 'liquid'},
    {'trigger': 'evaporate', 'source': 'liquid', 'dest': 'gas'},
    {'trigger': 'sublimate', 'source': 'solid', 'dest': 'gas'},
    {'trigger': 'ionize', 'source': 'gas', 'dest': 'plasma'}
]
class TraverseMachine(Machine):
    def traverse(self, current_state, target_state, seen=None, events=None):
        seen = seen or []
        events = events or []
        # we reached our destination and return the list of events that brought us here
        if current_state == target_state:
            return events
        paths = [self.traverse(t.dest, target_state, seen + [t], events + [e])
                 for e in self.get_triggers(current_state)
                 for t in self.get_transitions(e, source=current_state)
                 if t not in seen]
        # the len check is meant to prevent deep nested return values
        # if you just return path the result will be:
        # [[[['melt', 'evaporate', 'ionize']]], [['sublimate', 'ionize']]]
        return paths[0] if len(paths) == 1 and isinstance(paths[0], list) else paths
# disable auto transitions!
# otherwise virtually every path ending with `to_<target_state>` is a valid solution
m = TraverseMachine(states=states, transitions=transitions, initial='solid', auto_transitions=False)
print(m.traverse("solid", "plasma"))
# [['melt', 'evaporate', 'ionize'], ['sublimate', 'ionize']]
You might want to have a look at the discussion about automatic traversal in the transitions issue tracker, as it contains some insights about conditional traversal.
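The same traversal idea can also be tried out without the library. The sketch below is not part of the transitions API: it is a plain depth-first search over the transition dictionaries from the question, assuming every 'source' is a single state name. The function name is hypothetical. Because each recursive call returns a flat list of complete paths, no unwrapping of nested results is needed.

```python
def trigger_paths(transitions, source, target, visited=frozenset()):
    """Return every list of trigger names that leads from source to target."""
    if source == target:
        return [[]]  # one path: nothing left to trigger
    paths = []
    for idx, t in enumerate(transitions):
        # skip transitions from other states and ones already used on this path
        if t["source"] != source or idx in visited:
            continue
        for tail in trigger_paths(transitions, t["dest"], target, visited | {idx}):
            paths.append([t["trigger"]] + tail)
    return paths


transitions = [
    {"trigger": "melt", "source": "solid", "dest": "liquid"},
    {"trigger": "evaporate", "source": "liquid", "dest": "gas"},
    {"trigger": "sublimate", "source": "solid", "dest": "gas"},
    {"trigger": "ionize", "source": "gas", "dest": "plasma"},
]
print(trigger_paths(transitions, "solid", "plasma"))
# [['melt', 'evaporate', 'ionize'], ['sublimate', 'ionize']]
```

Tracking visited transitions by position plays the same role as the seen list in the answer: it keeps the search from looping when the state graph contains cycles.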

Ruby, Parsing a JSON response for an array of values

I'm new to Ruby, so please excuse any ignorance on my part. I was wondering how to parse a JSON response for every value belonging to a specific key. The response is in the format:
[
{
"id": 10008,
"name": "vpop-fms-inventory-ws-client",
"msr": [
{
"key": "blocker_violations",
"val": 0,
"frmt_val": "0"
},
]
},
{
"id": 10422,
"name": "websample Maven Webapp",
"msr": [
{
"key": "blocker_violations",
"val": 0,
"frmt_val": "0"
}...
There are some other entries in the response, but for the sake of not having a huge block of code, I've shortened it. The code I've written is:
require 'uri'
require 'net/http'
require 'JSON'
url = URI({my url})
http = Net::HTTP.new(url.host, url.port)
request = Net::HTTP::Get.new(url)
request["cache-control"] = 'no-cache'
request["postman-token"] = '69430784-307c-ea1f-a488-a96cdc39e504'
response = http.request(request)
parsed = response.read_body
h = JSON.parse(parsed)
num = h["msr"].find {|h1| h1['key']=='blocker_violations'}['val']
I am essentially looking for the val for each blocker violation (the JSON response contains hundreds of entries, so I'm expecting hundreds of blocker values). I had hoped num would contain an array of all the 'val's. If you have any insight into this, it would be of great help!
EDIT! I'm getting a console output of
scheduler caught exception:
no implicit conversion of String into Integer
C:/dashing/test_board/jobs/issue_types.rb:20:in `[]'
C:/dashing/test_board/jobs/issue_types.rb:20:in `block (2 levels) in <top (requi
red)>'
C:/dashing/test_board/jobs/issue_types.rb:20:in `select'
I suspect that might have too much to do with the question, but some help is appreciated!
You need to do 2 things. Firstly, you're being returned an array and you're only interested in a subset of the elements. This is a common pattern that is solved by a filter, or select in Ruby. Secondly, the condition by which you wish to select these elements also depends on the values of another array, which you need to filter using a different technique. You could attempt it like this:
res = [
{
"id": 10008,
"name": "vpop-fms-inventory-ws-client",
"msr": [
{
"key": "blocker_violations",
"val": 123,
"frmt_val": "0"
}
]
},
{
"id": 10008,
"name": "vpop-fms-inventory-ws-client",
"msr": [
{
"key": "safe",
"val": 0,
"frmt_val": "0"
}
]
}
]
# define a lambda function that we will use later on to filter out the blocker violations
violation = -> (h) { h[:key] == 'blocker_violations' }
# Select only those objects who contain any msr with a key of blocker_violations
violations = res.select {|h1| h1[:msr].any? &violation }
# Which msr value should we take? Here I just take the first.
values = violations.map {|v| v[:msr].first[:val] }
The problem you may have with this code is that msr is an array. So theoretically, you could end up with 2 objects in msr, one that is a blocker violation and one that is not. You have to decide how you handle that. In my example, I include it if it has a single blocker violation through the use of any?. However, you may wish to only include them if all msr objects are blocker violations. You can do this via the all? method.
The second problem you then face is, which value to return? If there are multiple blocker violations in the msr object, which value do you choose? I just took the first one - but this might not work for you.
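Depending on your requirements, my example might work or you might need to adapt it. Note that JSON.parse returns hashes with string keys by default, so for a parsed response either use string keys (h['key'], h1['msr']) or parse with JSON.parse(parsed, symbolize_names: true).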
Also, if you've never come across the lambda syntax before, you can read more about it here
