Comparison of LINQ to SPARQL? [closed] - linq

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 2 years ago.
I'm not a MS.NET person but got curious about LINQ, and this article http://www.linqpad.net/WhyLINQBeatsSQL.aspx explains very well why it's better than SQL.
I work a lot with SPARQL and in many respects it's worse than SQL (even 1.1 is a bit immature IMHO). Is there a comparison of LINQ to SPARQL in the style of the above article?
I think the LINQ aspect that's most interesting for RDF data is that LINQ can return hierarchical structures (in RDBMS speak that's table-valued variables; or think XML structures). SPARQL cannot do that:
you can't make CONSTRUCT subqueries; see GroupGraphPattern and SubSelect in the grammar
if you try to return an array of complexly CONSTRUCTed objects, I bet you'll get a mess. Nor, I believe, can you mix arrays and hierarchical structures several levels deep. If you disagree, see http://vocab.getty.edu/doc/queries/#All_Data_For_Subject and try to write a query that returns it for all ?s gvp:broaderExtended aat:300264089 (disclaimer: I built that endpoint)
So with SPARQL we either return tabular data, or a single graph object but can't mix them freely. Which is ironic, because RDF is a graph data model.
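To make the tabular limitation concrete, here is a minimal Python sketch of the client-side regrouping that flat SELECT results force on you. The rows, the `ex:` identifiers and the `nest` helper are all made up for illustration; they are not taken from the Getty endpoint or from any of the libraries listed below.

```python
from collections import defaultdict

# Hypothetical flat bindings, shaped like the rows of a SPARQL SELECT result.
# Two labels x two broader terms for ex:a already multiply into four rows.
rows = [
    {"s": "ex:a", "label": "chairs", "broader": "ex:x"},
    {"s": "ex:a", "label": "chairs", "broader": "ex:y"},
    {"s": "ex:a", "label": "stoelen", "broader": "ex:x"},
    {"s": "ex:a", "label": "stoelen", "broader": "ex:y"},
    {"s": "ex:b", "label": "tables", "broader": "ex:x"},
]


def nest(flat_rows):
    """Regroup flat rows into one nested record per subject."""
    grouped = defaultdict(lambda: {"labels": set(), "broader": set()})
    for row in flat_rows:
        grouped[row["s"]]["labels"].add(row["label"])
        grouped[row["s"]]["broader"].add(row["broader"])
    # Sort the sets so the nested output is deterministic.
    return {
        s: {"labels": sorted(v["labels"]), "broader": sorted(v["broader"])}
        for s, v in grouped.items()
    }


nested = nest(rows)
print(nested)
```

The nesting here is only one level deep; each further level of hierarchy would need another round of grouping, which is the part a query language with hierarchical results would do for you.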
There are several LINQ to SPARQL bindings:
https://github.com/Efimster/LINQtoSPARQL
http://rdfsharp.codeplex.com/
http://code.google.com/p/linqtordf/
http://www.dotnetrdf.org/
http://brightstardb.com
https://github.com/semiodesk/trinity-rdf
But does any of them handle this "hierarchical structures" aspect?

(I would prefer this to be a comment, but it is too long.)
I am not an expert in either LINQ or SPARQL, and I didn't check the links you've provided, but I recently listened to a podcast with Anders Hejlsberg, Microsoft Technical Fellow and lead designer of C# and LINQ.
http://www.se-radio.net/2008/05/episode-97-interview-anders-hejlsberg/
He said that LINQ syntax follows a FROM ... WHERE ... SELECT structure to allow for better type inference.
When it knows the table name (the FROM ... clause), the type inference engine in LINQ can start working at compile time to fetch the column names for the autocompleter and, e.g., interact with the syntax checker in Visual Studio.
I don't think this is easy, or even possible, for an RDF triple, where the property/predicate (SELECT/WHERE) can even be empty, but not the FROM (blank nodes).
These blank nodes denote the existence of an individual with specific attributes, but without providing an identification or reference.

Related

LINQ to SQL/XML, to Objects, and to DataSet comparison

Hello all,
I have read a lot of LINQ articles, but one thing I still don't understand is the difference between the LINQ flavours (LINQ to SQL/XML, LINQ to Objects, LINQ to DataSet).
I need a simple comparison that clarifies this, especially from a memory point of view.
Thanks
"LINQ" stands for "Language Integrated Query": basically, it means that the query syntax keywords (from, where, select, etc.) are now officially part of the language.
Now, at the highest level, a query against an array or a database table is doing essentially the same thing --- but the actual mechanics of how the query happens are quite different.
Linq2Sql, Linq2Object et al, are different subsystems which allow very different queries to be expressed using a common syntax.

Non aggregate functions, relational algebra

How can we translate the non-aggregate functions of Structured Query Language into relational algebra expressions? I know how to express the aggregate functions, but what about the non-aggregate ones?
E.g., how can we write the Year function (applied to a date column)? Just Year(date)?
select e.name,year(e.dateOfEmployment) from Employees e
Thanks!
(This is a very reasonable question, I don't understand why it should get downvoted.)
The "Relational" in RA means expressing functions as mathematical relations, using a set-theoretic approach. (It doesn't mean, as often thought, relating one table or datum to another as in "Entity-Relationship" modelling.) I can't grab a very succinct reference for this off the top of my head, but start here http://en.wikipedia.org/wiki/Binary_relation and follow the links.
How does this get to answer your question in context of a practical RA? Have a look at this:
http://www.dcs.warwick.ac.uk/~hugh/TTM/APPXA.pdf, especially the section Treating Operators as Relations.
See how the relations PLUS and SQRT can be 'applied' (using COMPOSE, which is a shorthand for Natural JOIN and PROJECT) to behave as a function.
For your specific question, you need a relation with two attributes (type Date and Year).
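As a rough illustration of that idea, the Python sketch below treats YEAR as a relation of (date, year) pairs and composes it with an Employees relation. The employee data is invented, and the YEAR relation is materialised only for the dates that actually occur, whereas conceptually it covers every date.

```python
from datetime import date

# Employees relation: a set of (name, dateOfEmployment) tuples (made-up data).
employees = {
    ("Ann", date(2010, 3, 1)),
    ("Bob", date(2012, 7, 15)),
}

# YEAR as a relation with two attributes: (date, year).
# Built here only for the dates present in employees.
year_rel = {(d, d.year) for (_, d) in employees}

# COMPOSE: natural join on the shared date attribute, then project it away,
# leaving (name, year) pairs.
result = {
    (name, y)
    for (name, d) in employees
    for (d2, y) in year_rel
    if d == d2
}
print(sorted(result))
```

The result corresponds to the select e.name, year(e.dateOfEmployment) query from the question, with the date column itself projected away.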

Build LINQ Select conditionally, similar to using PredicateBuilder for the Where clause [duplicate]

This question already has answers here:
LINQ : Dynamic select
(12 answers)
Closed 8 years ago.
I'm working with dynamic queries using LINQ on Entity Framework.
To query some tables by user input filters, we are using PredicateBuilder to create conditional WHERE sections. That works really well, but the number of columns returned is fixed.
Now, if we need to let the user select which columns they need in their report, in addition to their filters, we are in trouble, as we don't know how to build myQuery.Select( x => new { ... }) dynamically the way we do for the Where clause.
How can we achieve something like this?
A bit of searching reveals that this is tricky. Anonymous types are created at compile time, so it is not easy to create one dynamically. This answer contains a solution using Reflection.Emit.
If possible, I would recommend just returning something like an IDictionary<,> instead.
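The dictionary idea can be sketched in Python as follows. This only illustrates the shape of the result (one name-to-value mapping per row, with the column list chosen at run time); it does not show how Entity Framework would translate such a projection, and the `Order` type and `select_columns` helper are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Order:
    # Hypothetical entity standing in for a mapped table row.
    id: int
    customer: str
    total: float


def select_columns(rows, columns):
    """Project each row onto the requested columns, one dict per row."""
    return [{c: getattr(row, c) for c in columns} for row in rows]


orders = [Order(1, "Ann", 10.0), Order(2, "Bob", 25.5)]

# The column list would come from user input in the reporting scenario.
report = select_columns(orders, ["id", "total"])
print(report)
```

Because every row is a plain mapping, no new type has to exist at compile time; the cost is losing static typing on the projected columns.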

LINQ - Using where or join - Performance difference?

Based on this question:
What is difference between Where and Join in linq?
My question is following:
Is there a performance difference in the following two statements:
from order in myDB.OrdersSet
from person in myDB.PersonSet
from product in myDB.ProductSet
where order.Persons_Id==person.Id && order.Products_Id==product.Id
select new { order.Id, person.Name, person.SurName, product.Model,UrunAdı=product.Name };
and
from order in myDB.OrdersSet
join person in myDB.PersonSet on order.Persons_Id equals person.Id
join product in myDB.ProductSet on order.Products_Id equals product.Id
select new { order.Id, person.Name, person.SurName, product.Model,UrunAdı=product.Name };
I would always use the second one just because it's clearer.
My question now is: is the first one slower than the second one?
Does it build a Cartesian product and filter it afterwards with the where clauses?
Thank you.
It entirely depends on the provider you're using.
With LINQ to Objects, it will absolutely build the Cartesian product and filter afterwards.
For out-of-process query providers such as LINQ to SQL, it depends on whether it's smart enough to realise that it can translate it into a SQL join. Even if LINQ to SQL doesn't, it's likely that the query engine actually performing the query will do so - you'd have to check with the relevant query plan tool for your database to see what's actually going to happen.
Side-note: multiple "from" clauses don't always result in a Cartesian product - the contents of one "from" can depend on the current element of earlier ones, e.g.
from file in files
from line in ReadLines(file)
...
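The side note about dependent "from" clauses has a close analogue in Python's nested comprehensions, sketched below. The `files` dictionary and `read_lines` function are in-memory stand-ins invented for this example, not real file access.

```python
# Hypothetical in-memory stand-in for files on disk: name -> lines.
files = {
    "a.txt": ["a1", "a2"],
    "b.txt": ["b1"],
}


def read_lines(name):
    """Return the lines of one 'file'; depends on the current outer element."""
    return files[name]


# The inner iterable is computed from the current outer element,
# so this is a flattening, not a Cartesian product.
pairs = [(name, line) for name in files for line in read_lines(name)]
print(pairs)
```

Three pairs come out, one per line, rather than the six a full product of two files and three lines would give.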
"My question now is: is the first one slower than the second one? Does it build a Cartesian product and filter it afterwards with the where clauses?"
If the collections are in memory, then yes. There is no query optimizer for LinqToObjects - it simply does what the programmer asks in the order that is asked.
If the collections are in a database (which is suspected due to the myDB variable), then no. The query is translated into SQL and sent off to the database, where there is a query optimizer. This optimizer will generate an execution plan. Since both queries are asking for the same logical result, it is reasonable to expect the same efficient plan will be generated for both. The only ways to be certain are to
inspect the execution plans
or measure the IO (SET STATISTICS IO ON).
Is there a performance difference
If you find yourself in a scenario where you have to ask, you should cultivate tools with which to measure and discover the truth for yourself. Measure, don't ask.
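For the in-memory case, the "measure it" advice can be tried with a small Python sketch. It compares a product-then-filter evaluation with a lookup-based join over made-up data; the sizes, field names and function names are arbitrary assumptions, and the timings will differ between machines.

```python
import timeit

# Made-up in-memory data: every order references exactly one person.
orders = [{"id": i, "person_id": i % 200} for i in range(2000)]
persons = [{"id": i, "name": f"person{i}"} for i in range(200)]


def cross_then_filter():
    """Visit every (order, person) pair and keep the matching ones."""
    return [
        (o["id"], p["name"])
        for o in orders
        for p in persons
        if o["person_id"] == p["id"]
    ]


def hash_join():
    """Index persons by id once, then look each order up directly."""
    by_id = {p["id"]: p for p in persons}
    return [
        (o["id"], by_id[o["person_id"]]["name"])
        for o in orders
        if o["person_id"] in by_id
    ]


# Both strategies must produce the same logical result.
assert cross_then_filter() == hash_join()

t_cross = timeit.timeit(cross_then_filter, number=3)
t_hash = timeit.timeit(hash_join, number=3)
print(f"product + filter: {t_cross:.4f}s, lookup join: {t_hash:.4f}s")
```

The first function does work proportional to the product of the two collection sizes, the second roughly to their sum, which is the difference a database optimizer can usually hide and an in-memory evaluation cannot.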

Data structure to hold HQL or EJB QL

We need to produce a fairly complex dynamic query builder for retrieving reports on the fly. We're scratching our heads a little on what sort of data structure would be best.
It's really nothing more than holding a list of selectParts, a list of fromParts, a list of where criteria, order by, group by, that sort of thing, for persistence. When we start thinking about joins, especially outer joins, having clauses, and aggregate functions, things start getting a little fuzzy.
We're building it up interfaces first for now and trying to think as far ahead as we can, but definitely will go through a series of refactorings when we discover limitations with our structures.
I'm posting this question here in the hopes that someone has already come up with something that we can base it on. Or know of some library or some such. It would be nice to get some tips or heads-up on potential issues before we dive into implementations next week.
I've done something similar a couple of times in the past. A couple of the bigger things spring to mind.
The where clause is the hardest to get right. If you divide things up into what I would call "expressions" and "predicates" it makes it easier.
Expressions - column references, parameters, literals, functions, aggregates (count/sum)
Predicates - comparisons, like, between, in, is null (predicates have expressions as children, e.g. expr1 = expr2). You also have composites such as and/or/not.
The whole where clause, as you can imagine, is a tree with a predicate at the root, with maybe sub-predicates underneath eventually terminating with expressions at the leaves.
To construct the HQL you walk the model (depth first usually). I used a visitor as I need to walk my models for other reasons, but if you don't have multiple purposes you can build the rendering code right into the model.
e.g. If you had
"where upper(column1) = :param1 AND ( column2 is null OR column3 between :param2 and :param3)"
Then the tree is
Root
- AND
  - Equal
    - Function(upper)
      - ColumnReference(column1)
    - Parameter(param1)
  - OR
    - IsNull
      - ColumnReference(column2)
    - Between
      - ColumnReference(column3)
      - Parameter(param2)
      - Parameter(param3)
Then you walk the tree depth first and merge rendered bits of HQL on the way back up. The upper function for example would expect one piece of child HQL to be rendered and it would then generate
"upper( " + childHql + " )"
and pass that up to its parent. Something like Between expects three child HQL pieces.
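A minimal Python sketch of that depth-first rendering, for the example where clause above, might look like this. The class names mirror the node names in the tree, but they are illustrative only, and the rendering code is built straight into the model rather than using a visitor.

```python
from dataclasses import dataclass
from typing import List


class Node:
    """Base type for expressions and predicates in the where-clause tree."""

    def render(self) -> str:
        raise NotImplementedError


@dataclass
class ColumnReference(Node):
    name: str

    def render(self) -> str:
        return self.name


@dataclass
class Parameter(Node):
    name: str

    def render(self) -> str:
        return ":" + self.name


@dataclass
class Function(Node):
    name: str
    arg: Node

    def render(self) -> str:
        # One rendered child is wrapped in the function call.
        return f"{self.name}({self.arg.render()})"


@dataclass
class Equal(Node):
    left: Node
    right: Node

    def render(self) -> str:
        return f"{self.left.render()} = {self.right.render()}"


@dataclass
class IsNull(Node):
    operand: Node

    def render(self) -> str:
        return f"{self.operand.render()} is null"


@dataclass
class Between(Node):
    operand: Node
    low: Node
    high: Node

    def render(self) -> str:
        # Between expects three rendered children.
        return f"{self.operand.render()} between {self.low.render()} and {self.high.render()}"


@dataclass
class Composite(Node):
    op: str  # "AND" or "OR"
    children: List[Node]

    def render(self) -> str:
        # Children are rendered first, then merged on the way back up.
        return "(" + f" {self.op} ".join(c.render() for c in self.children) + ")"


where = Composite("AND", [
    Equal(Function("upper", ColumnReference("column1")), Parameter("param1")),
    Composite("OR", [
        IsNull(ColumnReference("column2")),
        Between(ColumnReference("column3"), Parameter("param2"), Parameter("param3")),
    ]),
])
hql = where.render()
print(hql)
```

Every composite is parenthesised, which costs a redundant outer pair of brackets but avoids having to reason about operator precedence while rendering.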
You can then re-use the expression model in the select/group by/order by clauses
You can skip storing the group by if you wish: store only the select and, before constructing the query, scan it for aggregates. If there are one or more, copy all the non-aggregate select expressions into the group by.
The from clause is just a table reference plus zero or more join clauses. Each join clause has a type (inner/left/right) and a table reference. A table reference is a table name plus an optional alias.
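The group-by inference and the from-clause structure described above could be modelled roughly as in the Python sketch below. All class and function names, and the Order/customer entities in the usage, are hypothetical; select expressions are kept as pre-rendered strings to keep the example short, and join conditions are not modelled, matching the description.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SelectExpr:
    hql: str                    # already-rendered expression text
    is_aggregate: bool = False  # True for count/sum and the like


@dataclass
class TableRef:
    name: str
    alias: Optional[str] = None

    def render(self) -> str:
        return f"{self.name} {self.alias}" if self.alias else self.name


@dataclass
class Join:
    kind: str  # "inner", "left" or "right"
    table: TableRef

    def render(self) -> str:
        return f"{self.kind} join {self.table.render()}"


def infer_group_by(select: List[SelectExpr]) -> List[str]:
    """If any select expression is an aggregate, group by all the others."""
    if not any(e.is_aggregate for e in select):
        return []
    return [e.hql for e in select if not e.is_aggregate]


def render_query(select: List[SelectExpr], root: TableRef, joins: List[Join]) -> str:
    parts = [
        "select " + ", ".join(e.hql for e in select),
        "from " + " ".join([root.render()] + [j.render() for j in joins]),
    ]
    group_by = infer_group_by(select)
    if group_by:
        parts.append("group by " + ", ".join(group_by))
    return " ".join(parts)


query = render_query(
    [SelectExpr("c.name"), SelectExpr("count(o)", is_aggregate=True)],
    TableRef("Order", "o"),
    [Join("left", TableRef("o.customer", "c"))],
)
print(query)
```

In a fuller model the select expressions would be nodes from the same expression tree as the where clause, and the aggregate scan would walk those nodes instead of reading a flag.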
Plus, if you ever get into wanting to parse a query language (or anything really) then I can highly recommend ANTLR. Learning curve is quite steep but there are plenty of example grammars to look at.
HTH.
If you need an EJB-QL parser and data structures, EclipseLink (well, several of its internal classes) has a good one:
JPQLParseTree tree = org.eclipse.persistence.internal.jpa.parsing.jpql.JPQLParser.buildParserFor(" _the_ejb_ql_string_ ").parse();
JPQLParseTree contains all the data.
But generating EJB-QL back from a modified JPQLParseTree is something you have to do yourself.

Resources