An alternative to LinqToRDF, a library to bring Linq to RDF data? - linq

LinqToRDF (http://code.google.com/p/linqtordf/) is a well-known library that brings Linq to RDF data, but it has not been active for nearly two years.
So I am looking for an alternative. My basic requirement is basic Linq functionality over general RDF data sources. Commercial libraries are welcome too.
Any suggestions are welcome.
Ying

You could try my library dotNetRDF, which is designed so that many things can be accessed in a Linq style, i.e. there is significant and pervasive use of IEnumerable<T> as a return type throughout the library.
But it doesn't have a completely Linq-style API like the one LinqToRdf provided. By that I mean it lacks the kind of methods LinqToRdf had, which let you write something like the following and have the library translate it to SPARQL or some other appropriate query language under the hood:
    MusicDataContext ctx = new MusicDataContext(@"http://localhost/linqtordf/SparqlQuery.aspx");
    var q = (from t in ctx.Tracks
             where t.Year == "2006" &&
                   t.GenreName == "History 5 | Fall 2006 | UC Berkeley"
             orderby t.FileLocation
             select new {t.Title, t.FileLocation}).Skip(10).Take(5);
My library is much more low level: either you'd have to write the equivalent SPARQL query yourself, or write a block of code which extracts the various Triples used to identify something and makes the relevant comparisons you want.
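To make the "block of code" route concrete, here is a minimal sketch that does the filtering with plain LINQ to Objects. The Triple class, the predicate names and the sample data are hypothetical stand-ins invented for illustration; they are not dotNetRDF's actual types, so check the library's documentation for the real API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for an RDF triple; NOT dotNetRDF's own class.
public sealed class Triple
{
    public Triple(string subject, string predicate, string obj)
    {
        Subject = subject;
        Predicate = predicate;
        Object = obj;
    }

    public string Subject { get; }
    public string Predicate { get; }
    public string Object { get; }
}

public static class TripleQueryDemo
{
    // Returns the titles of all subjects whose "year" triple matches.
    public static List<string> TitlesForYear(IEnumerable<Triple> triples, string year)
    {
        var all = triples.ToList();

        // Step 1: find the subjects that have the requested year.
        var subjects = new HashSet<string>(
            all.Where(t => t.Predicate == "year" && t.Object == year)
               .Select(t => t.Subject));

        // Step 2: pull the "title" triple for each of those subjects.
        return all.Where(t => t.Predicate == "title" && subjects.Contains(t.Subject))
                  .Select(t => t.Object)
                  .OrderBy(title => title, StringComparer.Ordinal)
                  .ToList();
    }

    public static void Main()
    {
        var triples = new[]
        {
            new Triple("track1", "title", "Lecture 1"),
            new Triple("track1", "year", "2006"),
            new Triple("track2", "title", "Lecture 2"),
            new Triple("track2", "year", "2007"),
        };

        foreach (var title in TitlesForYear(triples, "2006"))
            Console.WriteLine(title);
    }
}
```

This is the manual equivalent of the where/select parts of the LinqToRdf query above; the difference is that everything runs in memory instead of being translated to SPARQL.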
Eventually my intention is to port LinqToRdf to use dotNetRDF as its underlying library for accessing RDF, but this is fairly low on the priority list at the moment, as I'm working on a major release of the core library which adds a lot of new functionality related to SPARQL 1.1.
In terms of commercial options, take a look at Intellidimension's Semantics Framework, which is a commercial library, though there is a free express version available. I haven't used the library, so I have no idea how Linq friendly it is. The main downside of the free version is a very strict redistribution policy.

Related

Is there a Cypher syntax definition anywhere?

I'm looking for a definition of the syntax for the Cypher query language. I tried the docs but they're very vague.
Ideally, I'd like a BNF (or any variant) definition, or one of those "graph" definitions like this or this. Really, anything resembling a formal definition.
What you are looking for will be available in openCypher. Several items will be released as part of the project, one of the first of which is the BNF grammar.
Update 2016-01-30: A first draft of the grammar is now available at https://github.com/opencypher/openCypher/blob/master/grammar.ebnf.
Update 2016-10-17: EBNF and Antlr grammars, TCK, railroad diagrams, and a list of community projects are available at http://www.opencypher.org/#resources
Take a look at the recently announced (Oct 2015) openCypher project. It involves releasing the language specification, among other things.
From the announcement:
1. Cypher reference documentation:
A comprehensive user documentation describing use of the Cypher query language with examples and tutorials.
2. Technology compatibility kit (TCK):
The TCK consists of a number of tests that a software supplier would run in order to self-certify support for a given version of Cypher.
3. Reference implementation:
Distributed under the Apache 2.0 license, the reference implementation is a fully functional implementation of key parts of the stack needed to support Cypher inside a data platform or tool. The first planned deliverable is a parser that will take a Cypher statement and parse it into an AST (abstract syntax tree) representation. The reference implementation complements the documentation and tests by providing working implementations of Cypher – which are permissively licensed – and can be used as examples or as a foundation for one’s own implementation.
4. Cypher language specification:
Licensed under a Creative Commons license, the Cypher language specification is a technical expression of the language syntax to enable parsers to auto-generate the query syntax. A full semantic specification is also planned as a part of the openCypher project.
The same announcement also says that the process is open and that it is possible to submit, review and comment on language proposals.
Update!
Neo4j has changed a lot since this answer was written. In 2017 the simple answer is yes, you can download the grammar files from https://www.opencypher.org/
Below is the old answer, which was accurate in 2014
As far as I can tell, the only formal definition is in the code. That's the bad news.
The good news is that the code uses a scala library to do the parsing which makes the code rules look kinda/sorta like BNF. And there's some documentation on how to read it.
Here's a link into a scala object that defines what a query is.
This general package on github looks to me like it contains all of the cypher command implementations, and should have everything you're asking for.
Code in this package is written in scala, and looks like this:
    object Query {
      def start(startItems: StartItem*) = new QueryBuilder().startItems(startItems:_*)
      def matches(patterns:Pattern*) = new QueryBuilder().matches(patterns:_*)
      def optionalMatches(patterns:Pattern*) = new QueryBuilder().matches(patterns:_*).makeOptional()
      def updates(cmds:UpdateAction*) = new QueryBuilder().updates(cmds:_*)
      def unique(cmds:UniqueLink*) = new QueryBuilder().startItems(Seq(CreateUniqueStartItem(CreateUniqueAction(cmds:_*))):_*)
      (...)
This matches roughly with the upper right-hand quadrant of the Cypher refcard. You can sorta see that there can be a start clause, a match clause, and so on. This includes links to other implementation classes (like UpdateAction) which further define clauses considered update actions.
Make sure to also read How Neo4J Uses Scala's Parser Combinator: Cypher's Internals Part 1 for more information on what's going on here, and the mapping between the scala classes and what we'd normally consider EBNF. This blog post is old (2011) and the specific code examples it gives shouldn't be trusted, but I think it has good general information on how the implementation works, and what to look for if you want to understand the EBNF behind cypher.
Disclaimer: I'm not a scala hardcore, YMMV, IANAL, devs please overrule me if I'm wrong.
(Michael Hunger answered in a comment, so I can't accept his answer. Here's his answer:)
Cypher uses parboiled as its parser; the parboiled rule DSL is pretty easy to read and understand. https://github.com/neo4j/neo4j/blob/d18583d260a957ab1a14bd27d34eb5625df42bc5/community/cypher/cypher-compiler-2.2/src/main/scala/org/neo4j/cypher/internal/compiler/v2_2/parser/Clauses.scala
None of these seem to work any more.
I don't see anything on the opencypher.org site that looks like a grammar to download.
None of the github links from Michael Hunger work.
I'd really like access to SOME resource where I can learn how to construct queries for functions like avg that allegedly take a list expression as an argument, yet barf at every variant I can figure out.

Extend Compiler LINQ translations

Is there a way to add custom linq keywords and tell the compiler how to translate them to actual extension methods?
For example, translate the single keyword:
    var color = from c in colors
                where c.IsFavorite
                select single c
To
    var color = colors.Where( c => c.IsFavorite ).SingleOrDefault();
No, there is no way to do this.
As to why: I worked on the VB.Net LINQ implementation rather than the C# one, but the issues are mostly the same.
Adding LINQ to the language was a huge undertaking. As Eric Lippert has blogged about recently, LINQ barely fit into the VS2008 schedule and as such, essentially only the features that were absolutely essential to shipping LINQ were added to the language.
Making LINQ arbitrarily extensible to users was not one of those features. It's also something that would have been very costly. Right now LINQ is a very complex feature which has a fixed set of constructs. Allowing it to be arbitrarily extensible would have severely inflated these costs (especially on the IDE side) in at least the following areas:
Language Design (huge)
Intellisense
Pretty Printing / Formatting
Low level code emit details
etc ...
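Since the keyword set is fixed, the usual workaround is to wrap the query expression in parentheses and call the extension method on the result. A minimal sketch, where the Color class and the sample data are hypothetical stand-ins for the question's colors collection:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type standing in for the question's color objects.
public sealed class Color
{
    public string Name { get; set; }
    public bool IsFavorite { get; set; }
}

public static class SingleDemo
{
    public static Color FavoriteOf(IEnumerable<Color> colors)
    {
        // Query syntax for the part the language supports,
        // method syntax for the operator that has no keyword.
        return (from c in colors
                where c.IsFavorite
                select c).SingleOrDefault();
    }

    public static void Main()
    {
        var colors = new List<Color>
        {
            new Color { Name = "Red", IsFavorite = false },
            new Color { Name = "Blue", IsFavorite = true },
        };

        Color favorite = FavoriteOf(colors);
        Console.WriteLine(favorite == null ? "(none)" : favorite.Name);
    }
}
```

It is not as tidy as a dedicated keyword would be, but it needs no compiler support.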

How do I build up LINQ dynamically

I have a scenario where I have custom-configured column names, associated operators (<, >, =, between, etc.) and then an associated value.
I'm trying to determine whether it is possible to build up a LINQ query with a dynamic (string) where clause.
I've noticed the Predicate.OR Predicate.AND stuff, but that is not quite what I'm talking about.
Any suggestions?
If you are talking about a string Where clause (rather than building the expression etc yourself) - then the Dynamic LINQ Library (in the 3.5 samples, IIRC) should suffice.
Note that the example below is for database usage; but you can use it with LINQ-to-Objects by calling .AsQueryable() on your in-memory data.
Actually, there is a specific library from Microsoft (System.Linq.Dynamic) that comes with the C# VS2008 samples that supports this. Get it from here (Microsoft Download)
The library is included in the \LinqSamples\DynamicQuery directory of the samples of above download.
For extensive usage examples check this page: http://weblogs.asp.net/scottgu/archive/2008/01/07/dynamic-linq-part-1-using-the-linq-dynamic-query-library.aspx
Also, you can use expression trees to create dynamic queries. See:
http://msdn.microsoft.com/en-us/library/bb397951.aspx
http://www.interact-sw.co.uk/iangblog/2005/09/30/expressiontrees
http://blogs.msdn.com/charlie/archive/2008/01/31/expression-tree-basics.aspx
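For the expression tree route, here is a minimal sketch of turning a configured (column name, operator, value) triple into a predicate. The Product class and the DynamicWhere.Build helper are hypothetical names made up for this example; only the System.Linq.Expressions calls are real framework API. It assumes the column name matches a property on the entity and that the value is convertible to the property's type (nullable properties would need extra handling).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical entity used only for this illustration.
public sealed class Product
{
    public string Name { get; set; }
    public decimal Price { get; set; }
}

public static class DynamicWhere
{
    // Builds x => x.<propertyName> <op> <value> as an expression tree.
    public static Expression<Func<T, bool>> Build<T>(string propertyName, string op, object value)
    {
        ParameterExpression x = Expression.Parameter(typeof(T), "x");
        MemberExpression property = Expression.Property(x, propertyName);

        // Convert the configured value to the property's type so both
        // sides of the comparison match (e.g. int 10 -> decimal 10).
        object converted = Convert.ChangeType(value, property.Type);
        ConstantExpression constant = Expression.Constant(converted, property.Type);

        Expression body;
        switch (op)
        {
            case "=":  body = Expression.Equal(property, constant); break;
            case "<":  body = Expression.LessThan(property, constant); break;
            case ">":  body = Expression.GreaterThan(property, constant); break;
            case "<=": body = Expression.LessThanOrEqual(property, constant); break;
            case ">=": body = Expression.GreaterThanOrEqual(property, constant); break;
            default: throw new NotSupportedException("Unknown operator: " + op);
        }

        return Expression.Lambda<Func<T, bool>>(body, x);
    }

    public static void Main()
    {
        var products = new List<Product>
        {
            new Product { Name = "Pen", Price = 2m },
            new Product { Name = "Lamp", Price = 25m },
        };

        // In-memory data goes through AsQueryable(); a database-backed
        // IQueryable<Product> would accept the same expression.
        Expression<Func<Product, bool>> filter = Build<Product>("Price", ">", 10);
        foreach (Product p in products.AsQueryable().Where(filter))
            Console.WriteLine(p.Name);
    }
}
```

A "between" operator could be composed inside Build from two such comparisons on the same parameter, joined with Expression.AndAlso.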

Simple (Dumb) LINQ Provider

How easy would it be to write a dumb LINQ provider that can just use my class definitions (which don't have any object references as properties) and give me the translated SQL? It can assume that the property names match the column names and that the class names match the underlying table names. Can you please give me some pointers?
It took me about 4 months of full-time work (8 hours a day) to build a stable, working provider that implements the entire spec of linq. I would say I had a very simple, buggy and unstable version after about three weeks, so if you're just looking for something rough I would say you're probably looking at anything from a week up to two months, depending on how good you are and what types of requirements you have.
I must point you to the Wayward blog for this. Matt has written a really good walkthrough on how to implement a linq provider, and even if you're probably not going to be able to copy and paste, it will help you get to grips with how to think when working. You can find Matt's walkthrough here: http://blogs.msdn.com/mattwar/archive/2007/07/30/linq-building-an-iqueryable-provider-part-i.aspx . I recommend you go about it the same way Matt does, and extend the expression tree visitor Matt includes in the second part of his tutorial.
Also, when I began working with this, I had so much help from the expression tree visualizer, it really made parsing a whole lot easier once you could see how linq parsed to queries.
Building a provider is really a lot of fun, even if a bit frustrating at times. I wish you all the best of luck!
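To give a feel for the expression visitor approach, here is a deliberately tiny sketch that turns a where predicate into SQL text, using the question's assumption that class name = table name and property name = column name. It is not Matt's code: WhereTranslator and Customer are hypothetical names, and it subclasses the ExpressionVisitor that ships in System.Linq.Expressions (.NET 4 and later). It handles only comparisons, AND/OR, column references and literal constants; captured variables, method calls and everything else a real provider needs are left out.

```csharp
using System;
using System.Globalization;
using System.Linq.Expressions;
using System.Text;

// Hypothetical entity: class name = table name, property names = column names.
public sealed class Customer
{
    public string City { get; set; }
    public int Age { get; set; }
}

// Walks a predicate's expression tree and appends SQL text as it goes.
public sealed class WhereTranslator : ExpressionVisitor
{
    private readonly StringBuilder _sql = new StringBuilder();

    public static string Translate<T>(Expression<Func<T, bool>> predicate)
    {
        var translator = new WhereTranslator();
        translator.Visit(predicate.Body);
        return "SELECT * FROM " + typeof(T).Name + " WHERE " + translator._sql.ToString();
    }

    protected override Expression VisitBinary(BinaryExpression node)
    {
        _sql.Append("(");
        Visit(node.Left);
        _sql.Append(" ").Append(ToSqlOperator(node.NodeType)).Append(" ");
        Visit(node.Right);
        _sql.Append(")");
        return node;
    }

    protected override Expression VisitMember(MemberExpression node)
    {
        // Only direct property access on the lambda parameter maps to a column.
        if (node.Expression != null && node.Expression.NodeType == ExpressionType.Parameter)
        {
            _sql.Append(node.Member.Name);
            return node;
        }
        throw new NotSupportedException("Unsupported member: " + node.Member.Name);
    }

    protected override Expression VisitConstant(ConstantExpression node)
    {
        // Inlined for readability only; a real provider should emit parameters.
        if (node.Value is string s)
            _sql.Append("'").Append(s.Replace("'", "''")).Append("'");
        else
            _sql.Append(Convert.ToString(node.Value, CultureInfo.InvariantCulture));
        return node;
    }

    private static string ToSqlOperator(ExpressionType type)
    {
        switch (type)
        {
            case ExpressionType.Equal: return "=";
            case ExpressionType.NotEqual: return "<>";
            case ExpressionType.GreaterThan: return ">";
            case ExpressionType.GreaterThanOrEqual: return ">=";
            case ExpressionType.LessThan: return "<";
            case ExpressionType.LessThanOrEqual: return "<=";
            case ExpressionType.AndAlso: return "AND";
            case ExpressionType.OrElse: return "OR";
            default: throw new NotSupportedException("Unsupported operator: " + type);
        }
    }
}

public static class WhereTranslatorDemo
{
    public static void Main()
    {
        // Prints: SELECT * FROM Customer WHERE ((City = 'London') AND (Age > 30))
        Console.WriteLine(WhereTranslator.Translate<Customer>(c => c.City == "London" && c.Age > 30));
    }
}
```

Most of the months mentioned above go into what this sketch skips: evaluating captured variables, projections, joins, ordering, and materialising the results back into objects.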
Take a look at the LINQExtender project, a toolkit for creating custom LINQ providers.
Another option for giving you a leg up seems to be re-linq which is a framework for creating custom LINQ providers.
Here's the Source code and a nice overview (pdf) of what's involved in writing one.
I’ve written a tutorial series on my blog based on my experience developing a LINQ-to-SQL provider from scratch, starting with the expression tree composition stage (calling the LINQ methods), continuing with the expression visitor, breaking down the query into components, parsing the where clause, generating the text and parameters and, eventually, compiling the whole thing into IL using the .NET expression namespace.
I’ve seen many incomplete posts that promised to explain how to write a provider, falling very short of the mark, barely scratching the surface and not actually delivering anything remotely executable.
The blog series I’ve written based on my experience has a sample project available for download with the simple provider that covers only the functionality required by the tutorial example. However, it also includes the production version supporting a number of operations (where, join, first, count, top, etc.), subqueries, nested statements, etc. Additionally, it produces cleaner SQL than a lot of what I’ve seen from Entities and LINQ-to-SQL. There’s no unnecessary/redundant nesting, wrapping everything in brackets, etc.
For anyone with a good level of abstract thinking, developing such a provider isn’t as difficult a task as many make it out to be. I’ve developed one that’s used in a production environment in about 3 months of part-time work (meaning some evenings and weekends). From the get-go it was built with performance and tidy SQL in mind, a goal it achieved.
It was a little hard to find the time to publish this material, but I thought – if it may help someone out there, there’s no reason for this experience to go to waste:
How to write a LINQ to SQL provider in C# Part 1 - Introduction
How to write a LINQ to SQL provider in C# Part 2 - Expression Visitor
How to write a LINQ to SQL provider in C# Part 3 - Where Clause Visitor
How to write a LINQ to SQL provider in C# Part 4 - Compiling Expression Trees
I have created a project 'LinqToAnything' which is designed to make it very very easy to implement a (simple) Linq provider.

How does Linq work (behind the scenes)?

I was thinking about making something like Linq for Lua, and I have a general idea of how Linq works, but was wondering if there was a good article or if someone could explain how C# makes Linq possible.
Note: I mean behind the scenes, like how it generates code bindings and all that, not end user syntax.
It's hard to answer the question because LINQ is so many different things. For instance, sticking to C#, the following things are involved:
Query expressions are "pre-processed" into "C# without query expressions" which is then compiled normally. The query expression part of the spec is really short - it's basically a mechanical translation which doesn't assume anything about the real meaning of the query, beyond "order by is translated into OrderBy/ThenBy/etc".
Delegates are used to represent arbitrary actions with a particular signature, as executable code.
Expression trees are used to represent the same thing, but as data (which can be examined and translated into a different form, e.g. SQL)
Lambda expressions are used to convert source code into either delegates or expression trees.
Extension methods are used by most LINQ providers to chain together static method calls. This allows a simple interface (e.g. IEnumerable<T>) to effectively gain a lot more power.
Anonymous types are used for projections - where you have some disparate collection of data, and you want bits of each of the aspects of that data, an anonymous type allows you to gather them together.
Implicitly typed local variables (var) are used primarily when working with anonymous types, to maintain a statically typed language where you may not be able to "speak" the name of the type explicitly.
Iterator blocks are usually used to implement in-process querying, e.g. for LINQ to Objects.
Type inference is used to make the whole thing a lot smoother - there are a lot of generic methods in LINQ, and without type inference it would be really painful.
Code generation is used to turn a model (e.g. DBML) into code.
Partial types are used to provide extensibility to generated code.
Attributes are used to provide metadata to LINQ providers.
Obviously a lot of these aren't only used by LINQ, but different LINQ technologies will depend on them.
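Several of these pieces can be seen working together in a small sketch: extension methods, iterator blocks, lambdas converted to delegates, type inference, and the mechanical query expression translation. The MiniLinq class below is a hypothetical, stripped-down stand-in for LINQ to Objects (no argument checks, only two operators). Because the translation is purely syntactic, the query expression binds to whatever Where and Select methods are in scope, here the ones in this namespace.

```csharp
using System;
using System.Collections.Generic;

namespace MiniLinqSample
{
    // No "using System.Linq;" here: the query expression below binds to
    // the Where and Select defined in this namespace.
    public static class MiniLinq
    {
        // Extension method + iterator block: lazily yields matching items.
        public static IEnumerable<T> Where<T>(this IEnumerable<T> source, Func<T, bool> predicate)
        {
            foreach (T item in source)
                if (predicate(item))
                    yield return item;
        }

        // Extension method + iterator block: lazily projects each item.
        public static IEnumerable<TResult> Select<T, TResult>(this IEnumerable<T> source, Func<T, TResult> selector)
        {
            foreach (T item in source)
                yield return selector(item);
        }
    }

    public static class MiniLinqDemo
    {
        public static List<int> DoubledEvens(IEnumerable<int> numbers)
        {
            // The compiler rewrites this query expression as
            //   numbers.Where(n => n % 2 == 0).Select(n => n * 2)
            // and only then resolves the method names.
            IEnumerable<int> query = from n in numbers
                                     where n % 2 == 0
                                     select n * 2;

            // Nothing has run yet; copying into a list drives the iterators.
            return new List<int>(query);
        }

        public static void Main()
        {
            foreach (int n in DoubledEvens(new[] { 1, 2, 3, 4 }))
                Console.WriteLine(n); // prints 4, then 8
        }
    }
}
```

For a Lua port, the iterator blocks map naturally onto coroutines or closures; the query expression syntax is the part that needs compiler (or macro) support.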
If you can give more indication of what aspects you're interested in, we may be able to provide more detail.
If you're interested in effectively implementing LINQ to Objects, you might be interested in a talk I gave at DDD in Reading a couple of weeks ago - basically implementing as much of LINQ to Objects as possible in an hour. We were far from complete by the end of it, but it should give a pretty good idea of the kind of thing you need to do (and buffering/streaming, iterator blocks, query expression translation etc). The videos aren't up yet (and I haven't put the code up for download yet) but if you're interested, drop me a mail at skeet@pobox.com and I'll let you know when they're up. (I'll probably blog about it too.)
Mono (partially?) implements LINQ, and is opensource. Maybe you could look into their implementation?
Read this article:
Learn how to create custom LINQ providers
Perhaps my LINQ for R6RS Scheme will provide some insights.
It is 100% semantically, and almost 100% syntactically the same as LINQ, with the noted exception of additional sort parameters using 'then' instead of ','.
Some rules/assumptions:
Only dealing with lists, no query providers.
Not lazy, but eager comprehension.
No static types, as Scheme does not use them.
My implementation depends on a few core procedures:
map - used for 'Select'
filter - used for 'Where'
flatten - used for 'SelectMany'
sort - a multi-key sorting procedure
groupby - for grouping constructs
The rest of the structure is all built up using a macro.
Bindings are stored in a list that is tagged with bound identifiers to ensure hygiene. The bindings are extracted and rebound locally wherever an expression occurs.
I tracked the progress on my blog, which may provide some insight into possible issues.
For design ideas, take a look at C omega, the research project that birthed Linq. Linq is a more pragmatic or watered-down version of C omega, depending on your perspective.
Matt Warren's blog has all the answers (and a sample IQueryable provider implementation to give you a headstart):
http://blogs.msdn.com/mattwar/
