Disjunctive filter in SPARQL query - filter

I want to search a case law database for a specific title. I look for that title in the ?title and ?alttitle properties, using FILTER and || (logical-or).
Now I have noticed that this only seems to yield results when both FILTER conditions are true. So, the example below yields no results, even though the value of the ?alttitle field in the record I'm looking for is "NTN Toyo Bearing and Others v Council" and the value of the ?title field is "NTN Toyo Bearing Company Ltd and others v Council". I would expect the fact that there is a match in the ?alttitle property to be sufficient when using ||.
How can I make sure that this record is returned even if there is only a match with the ?alttitle value? Additionally, any tips to make this query faster are appreciated.
SPARQL Endpoint: http://publications.europa.eu/webapi/rdf/sparql
PREFIX cdm: <http://publications.europa.eu/ontology/cdm#>
SELECT DISTINCT ?work ?expression ?ecli ?celex ?alttitle ?agname ?title
WHERE {{{
?work a ?class.
?expression cdm:expression_belongs_to_work ?work.
?expression cdm:expression_title ?title.
?expression cdm:expression_uses_language <http://publications.europa.eu/resource/authority/language/ENG>.
?work cdm:case-law_ecli ?ecli.
?work cdm:resource_legal_id_celex ?celex.
OPTIONAL{?expression cdm:expression_case-law_parties|cdm:expression_title_alternative ?alttitle}
}
FILTER(?class in (<http://publications.europa.eu/ontology/cdm#judgement>,<http://publications.europa.eu/ontology/cdm#opinion_cjeu>))
FILTER (CONTAINS(?alttitle, "NTN Toyo Bearing and Others v Council")||CONTAINS(?title,"NTN Toyo Bearing v Council"))}
UNION{?work a ?class.
?expression cdm:expression_belongs_to_work ?work.
?expression cdm:expression_title ?title.
?expression cdm:expression_uses_language <http://publications.europa.eu/resource/authority/language/ENG>.
?work cdm:case-law_ecli ?ecli.
?work cdm:resource_legal_id_celex ?celex.
?work cdm:case-law_delivered_by_advocate-general ?ag.
?ag cdm:agent_name ?agname.
OPTIONAL{?expression cdm:expression_case-law_parties|cdm:expression_title_alternative ?alttitle}
}
FILTER(?class in (<http://publications.europa.eu/ontology/cdm#opinion_advocate-general>))
FILTER (CONTAINS(?alttitle, "NTN Toyo Bearing and Others v Council")||CONTAINS(?title,"NTN Toyo Bearing v Council"))}
LIMIT 15

Related

How to restrict query result from multiple instances of overlapping date ranges in Django ORM

First off, I admit that I am not sure whether what I am trying to achieve is possible (or even logical). Still I am putting forth this query (and if nothing else, at least be told that I need to redesign my table structure / business logic).
In a table (myValueTable) I have the following records:
Item  article  from_date   to_date     myStock
1     Paper    01/04/2021  31/12/9999  100
2     Tray     12/04/2021  31/12/9999  12
3     Paper    28/04/2021  31/12/9999  150
4     Paper    06/05/2021  31/12/9999  130
As part of the underlying process, I am to find out the value (of field myStock) as on a particular date, say 30/04/2021 (assuming no inward / outward stock movement in the interim).
To that end, I have the following values:
varRefDate = 30/04/2021
varArticle = "Paper"
And my query goes something like this:
get_value = myValueTable.objects.filter(from_date__lte=varRefDate, to_date__gte=varRefDate).get(article=varArticle).myStock
which should translate to:
get_value = SELECT myStock FROM myValueTable WHERE varRefDate BETWEEN from_date AND to_date
But with this I am coming up with more than one result (actually THREE!).
How do I restrict the query result to get ONLY the 3rd instance i.e. the one with value "150" (for article = "paper")?
NOTE: The upper limit of date range (to_date) is being kept constant at 31/12/9999.
Edit
Solved it, in a roundabout manner. Instead of .get, I resorted to generating a values_list with the fields from_date and myStock. Looping over the objects returned, I appended to a list a tuple holding the date difference between from_date and the reference date (30/04/2021) and the value of myStock, then sorted the list in ascending order. The first tuple in the sorted list has the smallest date difference and the corresponding myStock value, which is the value I am searching for. Tested and works.
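The approach in the edit amounts to picking, among the rows whose date range covers the reference date, the one with the latest from_date. A minimal pure-Python sketch of that selection follows, using the rows from the table above as plain tuples. The Row type and the stock_as_of helper are hypothetical names for illustration, not part of the original code:

```python
from datetime import date
from typing import NamedTuple, Optional


class Row(NamedTuple):
    item: int
    article: str
    from_date: date
    to_date: date
    my_stock: int


# Rows copied from the table in the question.
ROWS = [
    Row(1, "Paper", date(2021, 4, 1), date(9999, 12, 31), 100),
    Row(2, "Tray", date(2021, 4, 12), date(9999, 12, 31), 12),
    Row(3, "Paper", date(2021, 4, 28), date(9999, 12, 31), 150),
    Row(4, "Paper", date(2021, 5, 6), date(9999, 12, 31), 130),
]


def stock_as_of(rows, article, ref_date) -> Optional[int]:
    """Return myStock from the matching row with the latest from_date."""
    candidates = [
        r for r in rows
        if r.article == article and r.from_date <= ref_date <= r.to_date
    ]
    if not candidates:
        return None
    # The most recent from_date is the one closest to the reference date.
    return max(candidates, key=lambda r: r.from_date).my_stock


result = stock_as_of(ROWS, "Paper", date(2021, 4, 30))
```

In the ORM, the same idea could probably be expressed by filtering on the article and the date range, then chaining order_by("-from_date") and first() in place of .get; that variant has not been tested against the original model.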

Power BI DAX measure: Count occurrences of a value in a column considering the filter context of the visual

I want to count the occurrences of values in a column. In my case the value I want to count is TRUE().
Let's say my table is called Table and has two columns:
boolean  value
TRUE()   A
FALSE()  B
TRUE()   A
TRUE()   B
All solutions I found so far are like this:
count_true = COUNTROWS(FILTER(Table, Table[boolean] = TRUE()))
The problem is that I still want the visual (card), that displays the measure, to consider the filters (coming from the slicers) to reduce the table. So if I have a slicer that is set to value = A, the card with the count_true measure should show 2 and not 3.
As far as I understand, the FILTER function always overwrites the visual's filter context.
To further explain my intent: at an earlier point the TRUE/FALSE column had the values 1/0, and I could achieve my goal by just using the SUM function, which does not specify a filter context and simply acts within the visual's filter context.
I think the DAX you gave should work as long as it's a measure, not a calculated column. (Calculated columns cannot read filter context from the report.)
When evaluating the measure,
count_true = COUNTROWS ( FILTER ( Table, Table[boolean] = TRUE() ) )
the first argument inside FILTER is not necessarily the full table but that table already filtered by the local filter context (including report/page/visual filters along with slicer selections and local context from e.g. the rows/columns of a matrix visual).
So if you select Value = "A" via slicer, then the table in FILTER is already filtered to only include "A" values.
I do not know for sure if this will fix your problem, but it is more efficient DAX in my opinion:
count_true = CALCULATE(COUNTROWS(Table), Table[boolean])
If you still have the issue after changing your measure to use this format, you may have an underlying issue with the model. There is also the function KEEPFILTERS that may apply here but I think using KEEPFILTERS is overcomplicating your case.
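As a rough analogy for the point made above (this is Python, not DAX), the measure behaves like a count over a table that the slicer has already narrowed. The count_true function and its slicer argument are illustrative names only:

```python
# The example table from the question, as a list of rows.
TABLE = [
    {"boolean": True, "value": "A"},
    {"boolean": False, "value": "B"},
    {"boolean": True, "value": "A"},
    {"boolean": True, "value": "B"},
]


def count_true(table, slicer=None):
    """Count TRUE rows among those the (optional) slicer leaves visible."""
    # Step 1: the filter context narrows the table before the measure runs.
    visible = [row for row in table if slicer is None or row["value"] == slicer]
    # Step 2: the FILTER-like condition is applied to the narrowed table.
    return sum(1 for row in visible if row["boolean"])


no_slicer = count_true(TABLE)
slicer_a = count_true(TABLE, slicer="A")
```

With no slicer the count covers all rows; with the slicer set to "A" only the two "A" rows are considered, which is the behaviour the question asks for.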

How can I boost the weighting of a particular relationship in Neo4j/Cypher?

I'm finding similar movies with the following Cypher query:
MATCH (m:Movie)-[r*1..2]-(m2:Movie)
WHERE m.movieID = '1'
UNWIND r AS rels
WITH count(rels) as foo, m2, m
ORDER BY foo desc
RETURN DISTINCT m2.title
LIMIT 25
Basically it finds movies with common relationships, and returns movies ordered by those with the most common relationships to m. However, some relationships are more important than others. For example, I'd like to boost the [:DIRECTED] relationship so that movies directed by the same director are returned before others. How can I do this? Something like Dijkstra's algorithm with the :DIRECTED relationship having a low cost?
It is easier than that: you can just use a CASE expression to apply weights.
MATCH (m:Movie)-[r*1..2]-(m2:Movie)
WHERE m.movieID = '1'
UNWIND r AS rels
WITH m, m2, case type(rels) when "DIRECTED" then 1.2 else 1.0 end as weight
WITH sum(weight) as foo, m2, m
ORDER BY foo desc
RETURN DISTINCT m2.title
LIMIT 25
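The effect of the CASE weighting can be sketched in plain Python. The sample paths and movie titles below are made up for illustration, and rank_similar is a hypothetical helper; only the 1.2 / 1.0 weights come from the query above:

```python
from collections import defaultdict

# Weight per relationship type; anything not listed counts as 1.0.
WEIGHTS = {"DIRECTED": 1.2}


def rank_similar(paths):
    """paths: iterable of (title, [relationship types on one path]).

    Returns titles ordered by the summed weight of their relationships.
    """
    scores = defaultdict(float)
    for title, rel_types in paths:
        for rel_type in rel_types:
            scores[title] += WEIGHTS.get(rel_type, 1.0)
    return sorted(scores, key=lambda title: scores[title], reverse=True)


# Hypothetical data: both movies share two relationships with the source.
sample_paths = [
    ("Movie B", ["ACTED_IN", "ACTED_IN"]),
    ("Movie C", ["DIRECTED", "DIRECTED"]),
]
ranking = rank_similar(sample_paths)
```

A weight of 1.2 only breaks near-ties in favour of shared directors. If directed-by matches should always come first, a much larger weight would be needed.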

Slow Neo4j query despite indices

Here I'm trying to find all Twitter users who are followed by and who follow any members of some group G:
MATCH (x:User)-[:FOLLOWS]->(t:User)-[:FOLLOWS]->(y:User)
WHERE (x.screen_name IN {{G_SCREEN_NAMES}} OR x.id IN {{G_IDS}})
AND (y.screen_name IN {{G_SCREEN_NAMES}} OR y.id IN {{G_IDS}})
RETURN t.id
But for the group G I sometimes have their screen names and sometimes have their ids, hence the OR clause above. Unfortunately this query is long running and doesn't appear to ever return.
I have indices and constraints on both id and screen_name:
Indexes
ON :User(screen_name) ONLINE (for uniqueness constraint)
ON :User(id) ONLINE (for uniqueness constraint)
Constraints
ON (user:User) ASSERT user.screen_name IS UNIQUE
ON (user:User) ASSERT user.id IS UNIQUE
If I get rid of the OR clause (for instance if I happen to have all screen_names or all ids for group G) then the query runs quite fast.
I'm using neo4j-community-2.1.3 on a Mac. My graph has 286039 nodes, all of which have the User label.
Any ideas to improve this? Otherwise I'll have to chop this up into 4 queries to get all possible combinations of members. This is even more problematic because I really want to keep track of how commonly a user appears in a G-->user-->G relationship, and I'll need to do a lot of extra bookkeeping if the counts are spread among 4 different queries.
Update
I created an issue related to this: https://github.com/neo4j/neo4j/issues/2834
I ended up using
MATCH (x:User) WHERE x.screen_name IN ["apple","banana","coconut"]
WITH collect(id(x)) as x_ids
MATCH (x:User) WHERE x.id in [12345,98765]
WITH x_ids+collect(id(x)) as x_ids
MATCH (y:User) WHERE y.screen_name IN ["apple","banana","coconut"]
WITH x_ids,collect(id(y)) as y_ids
MATCH (y:User) WHERE y.id in [12345,98765]
WITH x_ids,y_ids+collect(id(y)) as y_ids
MATCH (x:User)-[:FOLLOWS]->(t:User)-[:FOLLOWS]->(y:User)
WHERE id(x) in x_ids AND id(y) in y_ids
RETURN count(*) as c, t.screen_name,t.id
ORDER BY c DESC
LIMIT 1000
But this basically represents a hack to get around a place where neo4j isn't using the indices that it could be.
I guess the query does not make use of the indexes because of the OR condition. You can verify this by prefixing the query with PROFILE and running it in neo4j-shell.
If the profile shows no index usage, you might split the query into two parts. The first one fetches the combined list of user node ids; instead of the OR we do a UNION of two queries (each using an index lookup):
MATCH (x:User) WHERE x.screen_name in {G_SCREEN_NAMES} RETURN id(x) as ids UNION
MATCH (x:User) WHERE x.id in {G_IDS} RETURN id(x) as ids
On the client side, use the list of node ids as parameter for the next query:
MATCH (x:User)-[:FOLLOWS]->(t)-[:FOLLOWS]->(y)
WHERE id(x) in {ids} AND id(y) in {ids}
RETURN t.id
I've intentionally removed the labels for t and y, on the assumption that you can only follow User nodes and no other kind of node. This removes an unnecessary label check.
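The two-step idea can be sketched without a database. Below, a tiny in-memory graph stands in for Neo4j; the USERS and FOLLOWS data and both helper functions are hypothetical, and node ids play the role of id(x):

```python
from collections import Counter

# Hypothetical graph: node id -> user properties.
USERS = {
    1: {"id": 11, "screen_name": "apple"},
    2: {"id": 22, "screen_name": "banana"},
    3: {"id": 33, "screen_name": "carol"},
    4: {"id": 44, "screen_name": "dave"},
}
# (follower node id, followed node id)
FOLLOWS = [(1, 3), (3, 2), (2, 3), (1, 4), (4, 3)]


def group_node_ids(users, screen_names, ids):
    """Step 1: union of two separate lookups instead of one OR condition."""
    by_name = {n for n, u in users.items() if u["screen_name"] in screen_names}
    by_id = {n for n, u in users.items() if u["id"] in ids}
    return by_name | by_id


def count_middle_users(follows, group):
    """Step 2: count x -> t -> y matches where x and y are both in the group."""
    counts = Counter()
    for x, t in follows:
        if x not in group:
            continue
        for t2, y in follows:
            if t2 == t and y in group:
                counts[t] += 1
    return counts


group = group_node_ids(USERS, screen_names={"apple"}, ids={22})
counts = count_middle_users(FOLLOWS, group)
```

Because the group is resolved once into a single set of node ids, the per-user counts end up in one place, with no need to merge results from four separate queries.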
JnBrymn,
How about this query?
MATCH (x:User)
WHERE x.screen_name IN {{G_SCREEN_NAMES}} OR x.id IN {{G_IDS}}
WITH x
MATCH (x)-[:FOLLOWS]->(t:User)
WITH t
MATCH (t)-[:FOLLOWS]->(y:User)
WHERE y.screen_name IN {{G_SCREEN_NAMES}} OR y.id IN {{G_IDS}}
RETURN t.id
Grace and peace,
Jim

Oracle full text, the syntax of CONTAINS()

contains(columnname, 'ABC')=0
This searches for the rows that don't contain the word 'ABC'.
contains(columnname, 'ABC and XYZ')=0
contains(columnname, 'ABC or XYZ')=0
What do these two conditions mean? I tested them and there is no syntax error, but they didn't work as I expected: the 'and' seems to behave like an 'or', and the 'or' like an 'and'. Could anyone help explain this?
All the documentation I found on Google covers contains()>0, which is not what I need.
Thanks in advance.
According to the Oracle documentation, the CONTAINS function:
returns a relevance score for every row selected
If you ask for
contains(columnname, 'ABC')=0
You actually ask for a score of 0 which means: columnname doesn't contain 'ABC'
According to the docs:
In an AND query, the score returned is the score of the lowest query term
In an OR query, the score returned is the score for the highest query term
So if you ask for:
contains(columnname, 'ABC and XYZ')=0
then if either 'ABC' or 'XYZ' has a score of 0 it will have the lowest score and that's what you'll get from the function, so you're actually asking for: columnname doesn't contain 'ABC' or 'XYZ' (at least one of them).
The same goes for the OR query:
contains(columnname, 'ABC or XYZ')=0
only if both 'ABC' and 'XYZ' have the score of 0 the function will return 0, so you're actually asking for: columnname doesn't contain 'ABC' and 'XYZ' (both of them).
IMHO, this behaviour is correct, since it follows De Morgan's laws.
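The reasoning can be checked with a small sketch. It assumes only what the quoted documentation states: an AND query scores as its lowest term and an OR query as its highest. A term score of 0 stands for "not contained", and the positive value 50 is an arbitrary placeholder rather than a real Oracle score:

```python
from itertools import product


def and_score(*term_scores):
    """Score of an AND query: the lowest term score."""
    return min(term_scores)


def or_score(*term_scores):
    """Score of an OR query: the highest term score."""
    return max(term_scores)


# Enumerate every combination of "missing" (0) and "present" (50).
rows = []
for abc, xyz in product((0, 50), repeat=2):
    rows.append({
        "abc": abc,
        "xyz": xyz,
        # contains(col, 'ABC and XYZ') = 0
        "and_is_zero": and_score(abc, xyz) == 0,
        # contains(col, 'ABC or XYZ') = 0
        "or_is_zero": or_score(abc, xyz) == 0,
    })
```

In every row, and_is_zero is true exactly when at least one term is missing, and or_is_zero is true exactly when both are missing. That is why the '= 0' test makes 'and' read like 'or' and 'or' read like 'and'.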
