What's the best way to inherit properties in a tree-based structure?

I have a simple CMS that has a simple tree hierarchy:
We have pages A through E with the following hierarchy:
A -> B -> C -> D -> E
All the pages are the same class, and have a parent-child relationship.
Now, let's say I have a property I want inherited among the pages. Let's say A is red:
A (red) -> B -> C -> D -> E
In this case, B through E would inherit "red".
Or a more complex scenario:
A (red) -> B -> C (blue) -> D -> E
B would inherit red, and D/E would both be blue.
What would be the best way to solve something like this? I have a tree structure with over 6,000 leaves, and about 100 of those leaves have inheritable properties. Those 100 or so leaves have their properties saved in the database. For leaves without explicit properties, I look up the ancestors and use memcached to save the properties. Then there are overly complex algorithms to handle expiring those caches. It's terribly convoluted and I'd like to refactor to a cleaner solution / data structure.
Does anybody have any ideas?
Thanks!

There is a data model that lets you express this kind of information well: RDF/RDFS. RDF is a W3C standard for modelling data as triples (subject, predicate, object) identified by URIs, and RDFS, among other things, allows you to describe class hierarchies and property hierarchies. The good thing is that there are many libraries that help you create and query this type of data.
For instance, if I want to say that a specific document lion is of class mamal and programmer is of class geek, I could say:
doc:lion rdf:type class:mamal .
doc:programmer rdf:type class:geek .
Now I could declare a hierarchy of classes, and say that every mammal is an animal and every animal is a living thing.
class:mamal rdfs:subClassOf class:animal .
class:animal rdfs:subClassOf class:LivingThing .
And that every geek is a human and that every human is a living thing:
class:geek rdfs:subClassOf class:human .
class:human rdfs:subClassOf class:LivingThing .
There is a language similar to SQL, called SPARQL, for querying this kind of data. For instance, if I issue the query:
SELECT * WHERE {
?doc rdf:type class:LivingThing .
}
Here ?doc is a variable that binds to anything of type class:LivingThing. The result of this query would be doc:lion and doc:programmer, because the database (provided RDFS inference is enabled) follows the semantics of RDFS: by computing the closure of the class hierarchy it knows that doc:lion and doc:programmer are both of class:LivingThing.
In the same way the query:
SELECT * WHERE {
doc:lion rdf:type ?class .
}
will tell me that doc:lion has rdf:type class:mamal, class:animal and class:LivingThing.
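To make the "closure of classes" step concrete, here is a minimal sketch in plain Python. It is not how any particular RDF store implements inference; the triples are copied from the example (with the class spelled class:geek throughout), and the helper names superclasses, types_of and instances_of are hypothetical.

```python
# Triples from the example above, kept as plain prefixed strings.
TRIPLES = [
    ("doc:lion", "rdf:type", "class:mamal"),
    ("doc:programmer", "rdf:type", "class:geek"),
    ("class:mamal", "rdfs:subClassOf", "class:animal"),
    ("class:animal", "rdfs:subClassOf", "class:LivingThing"),
    ("class:geek", "rdfs:subClassOf", "class:human"),
    ("class:human", "rdfs:subClassOf", "class:LivingThing"),
]


def superclasses(cls, triples):
    """Return cls plus every class reachable from it via rdfs:subClassOf."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(o for s, p, o in triples
                     if s == current and p == "rdfs:subClassOf")
    return seen


def types_of(subject, triples):
    """All classes of subject: asserted types plus their superclasses."""
    result = set()
    for s, p, o in triples:
        if s == subject and p == "rdf:type":
            result |= superclasses(o, triples)
    return result


def instances_of(cls, triples):
    """All subjects whose inferred types include cls."""
    subjects = {s for s, p, _ in triples if p == "rdf:type"}
    return {s for s in subjects if cls in types_of(s, triples)}
```

Here instances_of("class:LivingThing", TRIPLES) plays the role of the first query and types_of("doc:lion", TRIPLES) the role of the second.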
In the same way, RDFS lets you create hierarchies of properties. Say we have:
doc:programmer doc:studies doc:computerscience .
doc:lion doc:instint doc:hunting .
And we can say that both properties doc:studies and doc:instint are sub-properties of doc:knows:
doc:studies rdfs:subPropertyOf doc:knows .
doc:instint rdfs:subPropertyOf doc:knows .
With the query:
SELECT * WHERE {
?s doc:knows ?o .
}
We will get that a lion knows how to hunt and programmers know computer science.
Most RDF/RDFS databases can easily deal with the number of elements you mentioned in your question, and there are many choices to start with. If you are a Java person you could have a look at Jena; there are also frameworks for .NET, or for Python with RDFLib.
But most importantly, have a look at the documentation of your CMS, because there may be plugins to export metadata as RDF. Drupal, for instance, is quite advanced in this respect (see http://drupal.org/project/rdf).

If your problem is performance-related...
I assume you'd want to save on memory of all these inheritable properties (or perhaps you have a lot of them), otherwise this can be trivially solved with virtual properties.
If you need sparse inheritable properties, say if you are modelling how HTML DOM properties or CSS properties propagate, you'll need to:
Keep a pointer to the parent node (for walking upwards)
Use a hash dictionary to store the properties inside each class (or each instance, depending on your needs), keyed by name
If the properties don't vary by instance, use a class-static dictionary
If the properties can be overridden instance-by-instance, add an instance dictionary on top
When accessing a property, start the lookup at the leaf: look in the instance dictionary first, then the class-static dictionary, then walk up the tree
Of course you can add more functionalities on top of this. This is similar to how Windows Presentation Foundation solves this problem via DependencyProperty.
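The lookup described above can be sketched as follows. This is only an illustration under simple assumptions: the Page class and its get method are hypothetical names, and the class-static dictionary is left out because all pages in the question share one class.

```python
class Page:
    """A tree node with sparse, inheritable properties (hypothetical class)."""

    def __init__(self, name, parent=None, **props):
        self.name = name
        self.parent = parent      # pointer used for walking upwards
        self.props = dict(props)  # only explicitly set properties live here

    def get(self, key, default=None):
        """Return the nearest value for key, starting at this node."""
        node = self
        while node is not None:
            if key in node.props:
                return node.props[key]
            node = node.parent
        return default


# The second scenario from the question: A (red) -> B -> C (blue) -> D -> E
a = Page("A", color="red")
b = Page("B", a)
c = Page("C", b, color="blue")
d = Page("D", c)
e = Page("E", d)
```

With this setup b.get("color") resolves to "red", while d.get("color") and e.get("color") resolve to "blue". Because nothing is copied down the tree, changing a property on an ancestor needs no cache invalidation.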
If your problem is database-related...
If instead your problem is to avoid reading the database while walking up the tree (i.e. loading the parents to find inherited properties), you'll need some sort of caching for the parent values. Alternatively, when you load a leaf from the database, you can load all its parents and build a master merged properties dictionary in memory.
If you want to avoid multiple database lookups to find each parent, one trick is to encode the path to each node into a text field, e.g. "1.2.1.3.4" for a leaf on the 6th level. Then load only the nodes whose paths are prefixes of that path. You can then get the entire chain of parents in one SQL query.
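As a rough sketch of the path trick, the following uses Python's built-in sqlite3 module with an in-memory database. The page table, its columns, and the helper names are assumptions made for the example, not part of the original CMS.

```python
import sqlite3


def ancestor_paths(path):
    """'1.2.1.3.4' -> ['1', '1.2', '1.2.1', '1.2.1.3']"""
    parts = path.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts))]


# Hypothetical schema: one row per node, keyed by its encoded path.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page (path TEXT PRIMARY KEY, color TEXT)")
conn.executemany("INSERT INTO page VALUES (?, ?)", [
    ("1", "red"), ("1.2", None), ("1.2.1", "blue"),
    ("1.2.1.3", None), ("1.2.1.3.4", None),
])


def inherited_color(conn, path):
    """Fetch the node and all its ancestors in one query; the deepest wins."""
    paths = ancestor_paths(path) + [path]
    marks = ",".join("?" * len(paths))
    rows = conn.execute(
        f"SELECT path, color FROM page WHERE path IN ({marks}) "
        "AND color IS NOT NULL", paths).fetchall()
    if not rows:
        return None
    # The path with the most separators is the closest ancestor (or the node itself).
    return max(rows, key=lambda row: row[0].count("."))[1]
```

The IN list is built from the node's own path, so the query touches only the handful of rows on the way to the root, regardless of how large the tree is.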


How to add this type of node description by Mermaid?

This is a flowchart pattern that I really like to use and I currently use drawio to draw it:
Notice that there are two kinds of descriptions in the flow chart
description1:How does A get to B
description2:Some properties of B
I know Mermaid can implement the description1 by:
graph TB
A --->|"description1:<br>How does A get to B"| B
But description2 is also very important to me, is there any way to achieve it?
The current workaround:
I use the heading of subgraph instead of description2:
graph TB
A --->|"description1:<br>How does A get to B"| B
subgraph description2:<br>Some properties of B
B
end
But I have to say it's a very ugly temporary solution, so I'm asking here.
While some types of Mermaid diagrams explicitly support notes (e.g. sequence diagrams), flowcharts do not.
I believe the closest you're going to get is to connect B to itself with an invisible link (~~~):
graph TB
A --->|"description1:<br>How does A get to B"| B
B ~~~|"description2:<br>Some properties of B"| B

In macOS AppleScript, how do I check if a property exists?

In AppleScript I access an object in an application, e.g.:
tell application id "DNtp"
repeat with g in (children of root of (think window 1))
set theAnnotation to annotation of g
end
end tell
I know that depending on the child accessed, g sometimes has no annotation property.
N.B. This is not the same as having a missing value or something like that.
When running the code, AppleScript simply complains that the variable theAnnotation is not defined.
Other than trapping the error with a try ... on error statement, how do I check if g has the property annotation?
If you do
tell application id "DNtp"
annotation of children of root of (think window 1)
end tell
You'll get a list containing annotations and if those are missing, missing values, like:
--{missing value, missing value, missing value, «class DTcn» id 49 of «class DTkb» id 1, missing value}
The problem you describe in DEVONthink (of AS ignoring a declared variable during compile time) is a problem I've seen in other apps, too (but it's fairly rare).
If you want to check for the existence of a property, what usually works (and I've tested this with DEVONthink3) is to use exists, like:
if (exists annotation of g) then
Which will return true or false as you would expect. I'm not sure how you'd use that in the way you first posted, but I don't know all the steps you're taking.
I hope this helps
I don't use DEVON-related software, but in ordinary¹ situations when dealing with AppleScript records, CRGreen's suggestions won't apply: exists is not a command that properties understand, especially non-existent properties; and referencing a property that does not exist will throw an error.
I'm glad you're looking for an alternative to try...end try. I've seen your previous code samples that are drowning in them, and they are expensive operations when an error is caught, so not ideal for what you're attempting. try really has no place in AppleScript at all.
Rather than testing for the existence of a property within a record, the way to approach this in a general context is to create a record object that contains all the property values that would ever be needed, and to assign default values to each of them.
In AppleScript, a record is observed to follow the following behaviour:
A single record object can only contain one property with a given identifier. Should you attempt to insert two properties both of which are identified the same, the compiler will keep the first property and its associated value, and scrub the rest:
{a:1, b:2, a:3} # will resolve on compilation immediately to {a:1, b:2}
Two record objects can contain properties with shared identifiers, like so:
set L to {a:1, b:2, c:3}
set R to {d:missing value, c:L}
Similarly to list objects, two record objects can be merged into a single record, and the properties will be amalgamated: properties with identifiers unique to each record will simply be inserted without any change into the resulting data structure. Where an identifier occurs in both record objects before the merge, again, precedence is given in a left-to-right reading order, so the properties in the prefixed record (on the left) will prevail and the suffixed record (on the right) will have its non-unique property identifiers (and their values) scrubbed:
L & R --> {a:1, b:2, c:3, d:missing value}
R & L --> {d:missing value, c:{a:1, b:2, c:3}, a:1, b:2}
Your code snippet contains this:
repeat with g in (children of root of (think window 1))
set theAnnotation to annotation of g
end
Therefore, g is an item contained within children (a list object), and the type class of g is a record. Depending on which item of children is being examined, I'm assuming that some of those items are records that do contain a property identified by annotation, and some of those items are records that do not contain such a property.
However, consider the following record that results from this merge:
g & {annotation:missing value}
Here are the two possible scenarios:
g is a record that already contains a property identified as annotation, e.g.:
set g to {cannotation:"doe", bannotation:"ray", annotation:me}
g & {annotation:missing value} --> {cannotation:"doe", bannotation:"ray", annotation:«script»}
set theAnnotation to annotation of (g & {annotation:missing value})
--> «script» (i.e. me)
OR:
g is a record in which the property identifier annotation does not exist, e.g.:
set g to {doe:"a deer", ray:"a drop of golden sun"}
g & {annotation:missing value} --> {doe:"a deer", ray:"a drop of golden sun", annotation:missing value}
set theAnnotation to annotation of (g & {annotation:missing value})
--> missing value
Therefore, for every place in your scripts where try...end try has been used to catch non-occurrences of properties inside a record data structure, simply delete the try blocks. Wherever you assign values read from speculative properties, artificially insert default values that you can then test for, so you will know whether the value came from your DEVONthink source or from your brain:
tell application id "DNtp"
repeat with g in (children of root of (think window 1))
set theAnnotation to annotation of (g & {annotation:false})
if theAnnotation ≠ false then exit repeat
end
end tell
¹ This isn't in any way meant to suggest his solution isn't viable. If DEVON returns collections that are not de-referenced (and it very well may do), these can be operated upon as a whole, without looping through individual items, and he, of course, uses DEVON. But the situation I hopefully address above is one that arises much more commonly, and will also work here.

Execute query lazily in Orient-DB

In our current project we need to find the cheapest paths in an almost fully connected graph, which can contain many edges per vertex pair.
We developed a plugin containing functions:
for a special traversal of this graph that reduces recurrences of similar paths during TRAVERSE execution. We will refer to it as search()
for efficient extraction of the desired information from the results of such traversals. We will refer to it as extract()
for extracting the best N records according to a target parameter without a costly ORDER BY. We will refer to it as best()
But the resulting query still has unsatisfactory performance on the full data.
So we decided to modify the search() function so that it visits the best edges first and prunes paths that definitely lead to undesired results, using the current state of the best() function.
The overall solution is effectively a flexible implementation of the Branch and Bound method.
The resulting query (omitting the extract() step) should look like
SELECT best(path, <limit>) FROM (
TRAVERSE search(<params>) FROM #<starting_point>
WHILE <conditions on intermediate vertices>
) WHERE <conditions on result elements>
This form is highly desirable because it lets us adapt the conditions under WHILE and WHERE to the task at hand. The path field is generated by search() and contains all the information best() needs to proceed.
The trouble is that the best() function is executed strictly after the search() function, so search() cannot prune non-optimal branches according to results already evaluated by best().
So the question is:
Is there a way to pipeline results from the TRAVERSE step to the SELECT step, so that later paths are TRAVERSEd with search() only after earlier paths have been handled by SELECT with best()?
The query execution in this case will be streamed. If you add a
System.out.println()
or put a breakpoint in your functions, you'll see that the invocation sequence will be
search
best
search
best
search
...
You can use a ThreadLocal object (http://docs.oracle.com/javase/7/docs/api/java/lang/ThreadLocal.html) to store some context data and share it between the two functions, or you can use the OCommandContext (the last parameter of the OSQLFunction.execute() method) to store context information.
You can use context.getVariable() and context.setVariable() for this.
The contexts of the two queries (the parent and the inner query) are different, but they should be linked by a parent/child relationship, so you should be able to retrieve them using OCommandContext.getParent().
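The interplay between a streamed search() and a best() that tightens a shared bound can be illustrated outside OrientDB. The sketch below is plain Python, not OrientDB's API: the generator stands in for the streamed TRAVERSE, the context dictionary stands in for the shared ThreadLocal or OCommandContext state, best() keeps only a single winner instead of N, and the graph is made up.

```python
import heapq
import math


def search(edges, start, goal, context):
    """Lazily yield (cost, path) results, expanding cheapest partial paths first.

    Partial paths whose cost already reaches context["bound"] are pruned.
    """
    frontier = [(0, [start])]
    while frontier:
        cost, path = heapq.heappop(frontier)
        if cost >= context["bound"]:
            context["pruned"] += 1
            continue
        node = path[-1]
        if node == goal:
            yield cost, path  # pauses here until best() asks for the next result
            continue
        for neighbour, weight in edges.get(node, []):
            if neighbour not in path:  # avoid cycles
                heapq.heappush(frontier, (cost + weight, path + [neighbour]))


def best(results, context):
    """Consume results one at a time and tighten the shared bound."""
    winner = None
    for cost, path in results:
        if cost < context["bound"]:
            context["bound"] = cost
            winner = (cost, path)
    return winner


# Made-up graph: node -> list of (neighbour, edge weight)
edges = {"A": [("B", 1), ("C", 5)], "B": [("C", 1), ("D", 7)], "C": [("D", 1)]}
context = {"bound": math.inf, "pruned": 0}
winner = best(search(edges, "A", "D", context), context)
```

Because the generator only advances when best() requests another result, the bound set after the first complete path is already visible to search() when it resumes, and the two remaining partial paths are discarded without being expanded.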

Defining a flexible structure in Prolog

Well, I'm a bit new to Prolog, so my question is on Prolog pattern/logic.
I have a relation called tablet. It has many parameters, such as name, operatingSystem, ramCapacity, etc. I have many facts for this relation, like
tablet(
name("tablet1"),
operatingSystem("ios"),
ramCapacity(1024),
screen(
type("IPS"),
resolution(1024,2048)
)
).
tablet(
name("tablet2"),
operatingSystem("android"),
ramCapacity(2048),
screen(
type("IPS"),
resolution(1024,2048),
protected(yes)
),
isSupported(yes)
).
And some other similar facts, BUT with different numbers of parameters. Some attributes are not needed for some objects, OR I have already created some tablets and one day add one more field and start using it in new tablets.
My requirements are:
I need the most flexible structure possible in Prolog. Some of the tablets have attributes/inner predicates and some do not, but they are all tablets.
I need to access data in the easiest way, for example find all tablets that have ramCapacity(1024), not including ones that do not have this attribute.
I need to change some attributes' values in the easiest way. For example, a query that changes ramCapacity to 2048 for the tablet named "tablet1".
If possible, it should be pretty to read in a text editor :)
Is this structure flexible? Should I use another one? Do I need additional rules to manipulate this structure? Is this structure easy to change with a query? (I keep this structure in a file.)
Since the number of attributes is not fixed and the structure needs to be this flexible, consider representing these items as option lists, like this:
tablet([name=tablet1,
operating_system=ios,
ram_capacity=1024,
screen=screen([type="IPS",
resolution = res(1024,2048)])]).
tablet([name=tablet2,
operating_system=android,
ram_capacity=2048,
screen=screen([type="IPS",
resolution = res(1024,2048)]),
is_supported=yes]).
You can easily query and arbitrarily extend such lists. Example:
?- tablet(Ts), member(name=tablet2, Ts).
Ts = [name=tablet2, operating_system=android, ram_capacity=2048, screen=screen([type="IPS", resolution=res(..., ...)]), is_supported=yes] ;
false.
Notice also the common Prolog naming_convention_to_use_underscores_for_readability instead of mixingCasesAndMakingEverythingExtremelyHardToRead.

XPATH - cannot select grandparent node

I am trying to parse a live betting XML feed and need to grab each bet from within it. In plain English, I need to use the tag 'EventSelections' as my base query and 'loop' through these tags in the XML so that I grab all that data and create an entity for each one, which I can use in a CMS.
My problem is that I want to go up two places in the tree to a grandparent node to gather that info. Each EventID refers to the unique name of a game, and some games have more bets than others. It's important that I grab each bet AND the EventID associated with it; the problem is that this ID is on the grandparent each time. Example:
<Sportsbet Time="2013-08-03T08:38:01.6859354+09:30">
<Competition CompetitionID="18" CompetitionName="Baseball">
<Round RoundID="2549" RoundName="Major League Baseball">
<Event EventID="849849" EventName="Los Angeles Dodgers (H Ryu) At Chicago Cubs (T Wood)" Venue="" EventDate="2013-08-03T05:35:00" Group="MTCH">
<Market Type="Match Betting - BIR" EachWayPlaces="0">
<EventSelections BetSelectionID="75989549" EventSelectionName="Los Angeles Dodgers">
<Bet Odds="1.00" Line=""/>
</EventSelections>
<EventSelections BetSelectionID="75989551" EventSelectionName="Chicago Cubs">
<Bet Odds="17.00" Line=""/>
</EventSelections>
Does anyone know how I can grab the grandparent tags as well?
Currently I am using:
//EventSelections (this is the context)
.//@BetSelectionID
.//@EventSelectionName
I have tried dozens of different ways to do this including the ../.. operator which won't work either. I'd be eternally grateful for any help on this. Thanks.
I think you just haven't gone far enough up the tree.
../* is a two-step location path with abbreviations, expanded to parent::node()/child::*, so in effect you are going up the tree with the first step, but back down the tree with the second step.
Therefore, ../* gives you your siblings (parent's children), ../../* gives you your aunts and uncles (grandparent's children), and ../../../* gives you your grandparent and its siblings (great-grandparent's children).
For attributes, ../@* is an abbreviation for parent::node()/attribute::*. Attributes are attached to elements but are not considered children, so you are going sideways, not down the tree, in the second step.
Therefore, unlike above, ../@* gives you your parent's attributes, while ../../@* gives you your grandparent's attributes.
But using // in your situation is really inappropriate. // is an abbreviation for /descendant-or-self::node()/, which walks all the way down a tree to its leaves. It should be used only on rare occasions (and I cringe when I see it abused in SO questions).
So ..//..//..//@RoundID may work for you, but it is in effect addressing attributes all over the tree and not just an attribute of your great-grandparent, which is why it is finding the attribute of your grandparent. ../../@RoundID should be all you need to get the attribute of your grandparent.
If you torture a stylesheet long enough, it will eventually work for you, but it really is more robust and likely faster executing to address things properly.
You could go with ancestor::Event/@EventID, which does exactly what you asked for: it matches an ancestor element named Event and returns its EventID attribute.
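If the feed is processed from a script rather than an XPath-driven importer, the same "bet plus grandparent EventID" pairing can be sketched with Python's standard library. Note that xml.etree.ElementTree supports only a subset of XPath and has no ancestor axis, so this sketch builds a child-to-parent map instead. The feed below is a trimmed copy of the one in the question with closing tags added so that it parses (an assumption about the real document), and bets_with_event_id is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the feed from the question; closing tags are assumed.
FEED = """
<Sportsbet Time="2013-08-03T08:38:01.6859354+09:30">
 <Competition CompetitionID="18" CompetitionName="Baseball">
  <Round RoundID="2549" RoundName="Major League Baseball">
   <Event EventID="849849" Group="MTCH">
    <Market Type="Match Betting - BIR" EachWayPlaces="0">
     <EventSelections BetSelectionID="75989549" EventSelectionName="Los Angeles Dodgers">
      <Bet Odds="1.00" Line=""/>
     </EventSelections>
     <EventSelections BetSelectionID="75989551" EventSelectionName="Chicago Cubs">
      <Bet Odds="17.00" Line=""/>
     </EventSelections>
    </Market>
   </Event>
  </Round>
 </Competition>
</Sportsbet>
"""


def bets_with_event_id(root):
    """Return (EventID, BetSelectionID, EventSelectionName, Odds) per selection."""
    # ElementTree elements have no parent pointer, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    rows = []
    for selection in root.iter("EventSelections"):
        event = parent_of[parent_of[selection]]  # Market first, then Event
        rows.append((event.get("EventID"),
                     selection.get("BetSelectionID"),
                     selection.get("EventSelectionName"),
                     selection.find("Bet").get("Odds")))
    return rows


rows = bets_with_event_id(ET.fromstring(FEED))
```

Each row carries the EventID of the selection's grandparent, which mirrors what ../../@EventID selects when EventSelections is the context node.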
