SPARQL: Use DISTINCT and ORDER BY with multiple variables

When I use DISTINCT and ORDER BY with the following query, I get an empty result:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT ?s_type ?p ?o_type
WHERE{
?s ?p ?o.
?s rdf:type ?s_type.
FILTER NOT EXISTS {
?s rdf:type ?type1 .
?type1 rdfs:subClassOf ?s_type.
FILTER NOT EXISTS {
?type1 owl:equivalentClass ?s_type.
}
}.
FILTER EXISTS {
?s_type rdfs:subClassOf ?superType1 .
}.
FILTER strstarts(str(?s_type ), str(dbo:)).
?o rdf:type ?o_type.
FILTER NOT EXISTS {
?o rdf:type ?type2 .
?type2 rdfs:subClassOf ?o_type.
FILTER NOT EXISTS {
?type2 owl:equivalentClass ?o_type.
}
}.
FILTER EXISTS {
?o_type rdfs:subClassOf ?superType2 .
}.
FILTER strstarts(str(?s_type ), str(dbo:)).
}
The result on the DBpedia endpoint is available here.
I'd like to eliminate duplicates and order them by ?p. Any help will be appreciated.
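One way to express what is being asked, as a sketch rather than a verified fix: DISTINCT goes directly after SELECT, and ORDER BY goes after the closing brace of the WHERE block. The explicit dbo: declaration and the use of ?o_type in the last FILTER are assumptions (the original repeats the ?s_type filter, which may or may not be intended), and the secondary sort keys are only illustrative. An unrestricted ?s ?p ?o pattern is expensive to deduplicate and sort, so a public endpoint may time out or truncate the result.

```sparql
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
# Assumption: dbo: is the DBpedia ontology namespace (predefined on the DBpedia endpoint).
PREFIX dbo:  <http://dbpedia.org/ontology/>

# DISTINCT removes duplicate (?s_type, ?p, ?o_type) rows.
SELECT DISTINCT ?s_type ?p ?o_type
WHERE {
  ?s ?p ?o .

  # Most specific dbo: type of the subject.
  ?s rdf:type ?s_type .
  FILTER NOT EXISTS {
    ?s rdf:type ?type1 .
    ?type1 rdfs:subClassOf ?s_type .
    FILTER NOT EXISTS { ?type1 owl:equivalentClass ?s_type . }
  }
  FILTER EXISTS { ?s_type rdfs:subClassOf ?superType1 . }
  FILTER strstarts(str(?s_type), str(dbo:))

  # Most specific dbo: type of the object.
  ?o rdf:type ?o_type .
  FILTER NOT EXISTS {
    ?o rdf:type ?type2 .
    ?type2 rdfs:subClassOf ?o_type .
    FILTER NOT EXISTS { ?type2 owl:equivalentClass ?o_type . }
  }
  FILTER EXISTS { ?o_type rdfs:subClassOf ?superType2 . }
  # Assumption: the second namespace filter was meant for ?o_type.
  FILTER strstarts(str(?o_type), str(dbo:))
}
# Sort by predicate first; the extra keys only make the order stable.
ORDER BY ?p ?s_type ?o_type
```

If the endpoint still returns nothing, restricting ?p or one of the types to a known value first can help tell a timeout apart from a genuinely empty answer.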

Related

SPARQL DBpedia filter out specific results

I am working on a small part where I receive all types of a resource. The thing is, I don't want all types, only the "http://dbpedia.org/ontology" types. How do I filter them within a SPARQL query? I don't really care how, as long as I receive only the ontology types.
In this query I need only the DBpedia ontology classes "Country", "Location", "PopulatedPlace" and "Place".
SPARQL endpoint: http://de.dbpedia.org/sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?type WHERE {
?i rdfs:label "Deutschland"@de ; a ?type .
}
I set up a FILTER that keeps only the ontology types. But that's not a solution, as it is static and only works for this example. It also returns duplicates, but that's a minor problem.
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?type WHERE {
?i rdfs:label "Deutschland"@de ; a ?type .
FILTER (?type = <http://dbpedia.org/ontology/Country> ||
?type = <http://dbpedia.org/ontology/PopulatedPlace> ||
?type = <http://dbpedia.org/ontology/Place> ||
?type = <http://dbpedia.org/ontology/Location>)
}
Need some suggestions or help. Thanks in advance.
Okay, I thought I shouldn't have asked, but this took me some time to realize: there is a function that tests whether a string starts with a given prefix,
strstarts
Here is the solution. I hope it helps someone.
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT DISTINCT ?type WHERE {
?i rdfs:label "Deutschland"@de ; a ?type .
FILTER (strstarts(str(?type), "http://dbpedia.org/ontology/"))
}
Well, what you did is fine, but it is probably not the correct way of doing it. What you're looking for are called OWL classes, so you just need to check whether the type is an owl:Class.
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT ?type WHERE {
?i rdfs:label "Deutschland"@de ; a ?type .
?type a owl:Class .
}

sparql delete query optimization

I have a DELETE/INSERT query that I'd like to optimize if possible. The query deletes and inserts up to 50 objects at a time. My JMeter tests show that the DELETE clause takes about 4 times longer than the INSERT: the delete takes around 3300 ms and the insert about 860 ms. I'd like to improve the DELETE clause. I was thinking of using FILTER, but was told it would not scale well. Any recommendation is much appreciated.
What I have right now is:
DELETE {
?s ?p ?o.
?collection dc:identifier ?cid;
rdf:type ?ct;
rdf:li ?list.
?list rdf:first ?first;
rdf:rest ?rest.
}
WHERE
{
{ ?s dc:identifier "11111"^^xsd:int; ?p ?o. }
UNION { ?s dc:identifier "22222"^^xsd:int; ?p ?o.}
UNION {?s dc:identifier "33333"^^xsd:int; ?p ?o.}
UNION{} UNION{}.......
OPTIONAL{
?s dc:hasPart ?collection.
?collection dc:identifier ?cid;
rdf:type ?ct;
rdf:li ?list.
?list rdf:first ?first;
rdf:rest ?rest.
}
} ;
INSERT DATA
{
GRAPH <http://test.org/>
{.....}
GRAPH <http://test.org/>
{.....}
GRAPH....
}
Without having your data, or even knowing what triple store you're using, we can't really help much in optimization. It might just be that deletes are more expensive than insertions. That said, one thing that might help is to use values rather than unions in your where block. That is, instead of:
{ ?s dc:identifier "11111"^^xsd:int; ?p ?o. }
UNION { ?s dc:identifier "22222"^^xsd:int; ?p ?o.}
UNION {?s dc:identifier "33333"^^xsd:int; ?p ?o.}
UNION{} UNION{}.......
do:
values ?identifier { "11111"^^xsd:int "22222"^^xsd:int "33333"^^xsd:int "44444"^^xsd:int }
?s dc:identifier ?identifier ; ?p ?o .
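Put together, the DELETE with a VALUES block might look like the sketch below. The prefix IRIs are assumptions, since the question does not show its declarations (dc: is guessed to be the Dublin Core terms namespace because of dc:hasPart), and the identifiers are the placeholder values from the question. Whether this is actually faster depends on the triple store.

```sparql
# Assumption: these prefix IRIs are guesses; use the ones from the real query.
PREFIX dc:  <http://purl.org/dc/terms/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

DELETE {
  ?s ?p ?o .
  ?collection dc:identifier ?cid ;
              rdf:type ?ct ;
              rdf:li ?list .
  ?list rdf:first ?first ;
        rdf:rest ?rest .
}
WHERE {
  # One inline table of identifiers replaces the chain of UNION branches.
  VALUES ?identifier { "11111"^^xsd:int "22222"^^xsd:int "33333"^^xsd:int }
  ?s dc:identifier ?identifier ;
     ?p ?o .

  # Collection triples are deleted only where they exist.
  OPTIONAL {
    ?s dc:hasPart ?collection .
    ?collection dc:identifier ?cid ;
                rdf:type ?ct ;
                rdf:li ?list .
    ?list rdf:first ?first ;
          rdf:rest ?rest .
  }
}
```

Template triples whose variables are left unbound by the OPTIONAL are simply skipped, so subjects without a collection are still deleted.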

Accelerate SPARQL query - filtering out rows which contain

I am currently working with SPARQL (and TopBraid Composer). I have a query which only brings back matching literals, and then filters out the literals that belong to certain unwanted categories.
Currently, this query is taking a long time to run, and I think it is my FILTER which is causing the delay. I was wondering if someone would have a better and faster way of filtering out (NOT returning) rows which contain one of a set of keywords (e.g. cat1, cat2, cat3).
As of now, I am using:
SELECT ?category
WHERE {
?s1 ?p ?category .
?s2 ?p ?category .
FILTER (str(?category) != "Cat1") .
FILTER (str(?category) != "Cat2") .
FILTER (str(?category) != "Cat3") .
FILTER (str(?category) != "Cat4") .
FILTER (str(?category) != "Cat6") .
FILTER (str(?category) != "Cat8") .
}
It's not clear how much you've trimmed down your example, but the code you presented is doing more work than it needs to.
Suppose your data has
:a :p "Cat0" .
:b :p "Cat0" .
Then the bindings for ?s1, ?s2, ?p, and ?category can be
?s1  ?s2  ?p  ?category
-----------------------
:a   :a   :p  "Cat0"
:a   :b   :p  "Cat0"
:b   :b   :p  "Cat0"
:b   :a   :p  "Cat0"
That's four ways to select "Cat0". You said that you want literals, but right now you're hitting every kind of ?category and applying str to it multiple times. You might do this instead:
SELECT DISTINCT ?category
WHERE {
?s ?p ?category .
FILTER( isLiteral(?category) &&
!(str(?category) in ("Cat1", "Cat2", "Cat3",
"Cat4", "Cat6", "Cat8")) )
}

How to improve slow query using FILTER (?id IN ( … ) )

I just started using SPARQL, and I'm trying to create a query that retrieves all information where an id has one of a number of predefined values. I have something like this:
SELECT *
WHERE {
?id ?property ?value .
?value a ?type .
?type rdfs:label ?type_value .
FILTER ( ?id IN (<id1>,<idi>,<idn> ) )
}
The problem I've been running into is the query gets really slow when the list of ids gets increasingly large. I intuitively think there's a better way to write this query, but I'm having trouble figuring out how to create this kind of query. I'm thinking along the lines of something like this:
SELECT *
WHERE {
<id_value> ?property ?value .
?value a ?type .
?type rdfs:label ?type_value .
}
where it retrieves all values only for the given ids, eliminating the filtering of results at the end, but I can't figure out how to write the query so that it returns all values for each id_value. When I add another line for another id_value, it filters out other values I'm expecting, so I think I'm writing it incorrectly. How can I do this?
Using values, you can write:
SELECT * WHERE {
values ?id { <id1> <idi> <idn> }
?id ?property ?value .
?value a ?type .
?type rdfs:label ?type_value .
}
The SPARQL 1.1 specification says about VALUES:
Data can be directly written in a graph pattern or added to a query
using VALUES. VALUES provides inline data as a solution sequence which
are combined with the results of query evaluation by a join operation.
It can be used by an application to provide specific requirements on
query results and also by SPARQL query engine implementations that
provide federated query through the SERVICE keyword to send a more
constrained query to a remote query service.
One of the examples is actually very close to what you've already got:
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX : <http://example.org/book/>
PREFIX ns: <http://example.org/ns#>
SELECT ?book ?title ?price
{
VALUES ?book { :book1 :book3 }
?book dc:title ?title ;
ns:price ?price .
}
Try using the VALUES clause instead like so:
SELECT *
WHERE {
VALUES ?id { ...list of ids... }
?id ?property ?value .
?value a ?type .
?type rdfs:label ?type_value .
}
This should hopefully be much more efficient than using the FILTER approach.

SPARQL query for RECIPE selection

OK, let's say that I need to extract all those recipes containing 2/3 ingredients.
The recipes are represented as linked data and this is the ontology used http://linkedrecipes.org/schema.
I know how to find recipes with curry:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX recipe: <http://linkedrecipes.org/schema/>
SELECT ?label ?recipe
WHERE{
?food rdfs:label ?label2 .
?food recipe:ingredient_of ?recipe .
?recipe a recipe:Recipe .
?recipe rdfs:label ?label.
FILTER (REGEX(STR(?label2), 'curry', 'i'))
}
But how can I find recipes with curry and chicken, for example?
This should find curry and chicken:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX recipe: <http://linkedrecipes.org/schema/>
SELECT ?label ?recipe {
?recipe a recipe:Recipe .
?recipe rdfs:label ?label.
?curry recipe:ingredient_of ?recipe .
?curry rdfs:label ?curry_label .
FILTER (REGEX(STR(?curry_label), 'curry', 'i'))
?chicken recipe:ingredient_of ?recipe .
?chicken rdfs:label ?chicken_label .
FILTER (REGEX(STR(?chicken_label), 'chicken', 'i'))
}
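Repeating the ingredient pattern works for two keywords but grows with every extra ingredient. A possible alternative, sketched here under the assumption of a SPARQL 1.1 endpoint, lists the keywords once in VALUES and keeps only recipes that match all of them. The lowercase keywords and the variable names ?keyword and ?food_label are illustrative choices, not part of the original answer.

```sparql
PREFIX rdfs:   <http://www.w3.org/2000/01/rdf-schema#>
PREFIX recipe: <http://linkedrecipes.org/schema/>

SELECT ?recipe ?label
WHERE {
  # Keywords must be lowercase, because labels are lowercased before matching.
  VALUES ?keyword { "curry" "chicken" }

  ?recipe a recipe:Recipe ;
          rdfs:label ?label .
  ?food recipe:ingredient_of ?recipe ;
        rdfs:label ?food_label .

  # Keep (recipe, keyword) pairs where some ingredient label contains the keyword.
  FILTER (CONTAINS(LCASE(STR(?food_label)), ?keyword))
}
GROUP BY ?recipe ?label
# The number must equal the count of keywords listed in VALUES above.
HAVING (COUNT(DISTINCT ?keyword) = 2)
```

Lowering the number in HAVING, or using >= with a longer keyword list, would give a "at least N of these ingredients" query, which may be closer to the "2/3 ingredients" wording of the question.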

Resources