sparql queries in project - spring-boot

I have start working on project using SPARQL and springboot. How to managing very large SPARSQL queries? What is the right place to implement them in project? Currently I am just using methods with Springbuilder and returned a query as a String.

Constructing your queries as a String is fine, as long as you are very careful when injecting any user-provided input into your query.
A safer approach is to use a query builder, such as the RDF4J SparqlBuilder, so you can construct your queries in a fluent API, e.g. like this:
// SELECT ?product where { ?product a ex:book }
selectQuery.prefix(ex).select(product).where(product.isA(ex.iri("book"));
As for where to manage this stuff in your project, it depends a bit of the APIs you're using, but assuming you're using RDF4J, for example, I would typically recommend some variation of a DAO pattern, and creating your DAO class by means of a repository (connection) wrapper object.

Related

OData accent-insensitive filter

How can I apply a filter accent-insensitive? In OData the "eq" operator is case and accent sensitive. The case is easy to fix, because the "tolower" but relative to the accent I'm not getting a simple solution. I know contains is supposed to be accent-insensitive but if I use contains filtering by "São José" I am only getting these responses "São José" and "São José dos Campos", it is missing "Sao Jose".
The following example filtering by "Florianopolis" is expected to return "Florianópolis", but it does not:
url: api/cidades/get?$filter=contains(nome, 'Florianopolis')
[HttpGet]
[EnableQuery]
public ActionResult<IQueryable<CidadeDTO>> Get()
{
try
{
return Ok(_mapper.Map<IEnumerable<CidadeDTO>>(_db.Cidades));
}
catch (System.Exception e)
{
return BadRequest(e.GetBaseException().Message);
}
}
It should bring aswell, like entity framework.
If your OData model was mapped directly to EF models AND an IQueryable<T> expression was passed into OK() then the query is explicitly passed through to the database engine as SQL:
SELECT * FROM Cidades WHERE nome LIKE '%Florianopolis%'
When that occurs, the Collation settings in the database connection will determine the comparison matching logic.
When your database collation is case and accent insensitive, but your data is still filtered as if it was not, then this is an indication that an IEnumerable<T> has been passed into OK() and the comparison logic is being evaluated in C# which by default in insensitive to both case and accent. Unfortunately this means that it is very likely that the entire data table has been loaded into memory first so that the filter can be applied.
In your case the OData model is mapped to DTO expressions that are mapped to the EF models via AutoMapper and that is where the generated query can break down. By calling Map() you are loading ALL records from the EF table and leaving the $filter criteria to be applied by the EnableQueryAttribute
For OData query conventions to be applied automatically you must return an IQueryable<T> from your method, or atleast pass an IQueryable<T> into the OK() response handler. With AutoMapper, you can use the Queryable Extensions to satisfy the IQueryable<T> requirement:
Queryable Extensions
When using an ORM such as NHibernate or Entity Framework with AutoMapper’s standard mapper.Map functions, you may notice that the ORM will query all the fields of all the objects within a graph when AutoMapper is attempting to map the results to a destination type.
...
ProjectTo must be the last call in the chain. ORMs work with entities, not DTOs. So apply any filtering and sorting on entities and, as the last step, project to DTOs.
In OData the last requirement (about ProjectTo) is still problematic because the EnableQueryAttribute will append the query options to the IQueryable<T> response, which will still end up materializing the entire table into memory first (IEnumerable<T>) and then apply the filter, which is still incredibly inefficient. It is this behaviour that is generally observed when someone complains about poor performance from an OData implementation, it is not always AutoMapper, but usually the pattern that the data source is loaded into memory in its entirety and then filtered. Following the default guidance for AutoMapper will lead you in this direction.
Instead we need to use an additional package: AutoMapper.Extensions.ExpressionMapping that will give us access to the UseAsDataSource extension method.
UseAsDataSource
Mapping expressions to one another is a tedious and produces long ugly code.
UseAsDataSource().For<DTO>() makes this translation clean by not having to explicitly map expressions. It also calls ProjectTo<TDO>() for you as well, where applicable.
This changes your implementation to the following:
[HttpGet]
[EnableQuery]
public ActionResult<IQueryable<CidadeDTO>> Get()
{
return Ok(_db.Cidades.UseAsDataSource().For<CidadeDTO>());
}
Don't fall into the trap of assuming that AutoMapper is necessary or best practice for an OData API implementation. If you are not using the unique features that AutoMapper provides then adding an additional abstraction layer can end up over-complicating your solution.
I'm not against AutoMapper, I use it a lot for Integrations, ETL, GraphQL and non-DDD style data schemas where the DTO models are significantly different to the underlying EF/data storage models. But it is a maintenance and performance overhead that a simple DDD data model and OData API based solution can easily do without.
Don't hire an excavator when a simple shovel will do the job.
AutoMapper is a convention based ORM that can be useful when you want to change the structure between implementation layers in your code, traditionally you might map Business Domain models that may represent aggregates or have flattened structures to highly normalised Database models.
OData is also a convention based ORM. It was designed to facilitate many of the same operations that AutoAmpper provides with the exception of Flattening and Unflattening models. These operations are deferred to the EF engine. The types exposed via OData mapping are DTOs
If your DTO models are the same relational structure as your EF models, then you would generally not use AutoMapper at all, the OData Edm mapping is optimised specifically to manage this type of workload and is designed to be and has been integrated directly into the serialization layer, making the Edm truely Data Transfer Objects that only exist over the wire and in the client.
This did the job
[HttpGet]
public ActionResult<IQueryable<PessoaDTO>> Get(ODataQueryOptions<Pessoa> options)
{
try
{
var queryResult = options.ApplyTo(_db.Pessoas);
return Ok(queryResult);
}
catch (System.Exception e)
{
return BadRequest(e.GetBaseException().Message);
}
}

What is the benefit of using IQueryable and LINQ queries?

I have a project where was realized own configuration classes:
IconSizesConfigSection: ConfigurationSection
IconSizesCollection: ConfigurationElementCollection
IconSize: ConfigurationElement
In Config class exists this property:
public IQueryable<IconSize> IconSizes
{
get
{
IconSizesConfigSection configInfo = (IconSizesConfigSection)ConfigurationManager.GetSection("iconConfig");
return configInfo.IconSizes.OfType<IconSize>().AsQueryable<IconSize>();
}
}
IconSizes property returns IconSizesCollection which derives from ConfigurationElementCollection. In turn ConfigurationElementCollection derives from ICollection, IEnumerable.
In some another class I have such code:
var previewIconSize = Config.IconSizes.FirstOrDefault(c => c.Name == "AvatarSize");
Why in such case uses Deffered Execution?
Why initially it uses AsQueryable<IconSize>() for collection and then uses LINQ and Deffered Execution?
Is there any benefits compared with using simple List?
In these case, there is no practical benefit. Using IQueryable is helpful for cases when query rewriting/translation will optimize performance. You will actually incur decreased performance in the provided example.
One example of using IQueryable in a helpful way is the significant performance increase gained when lazily translating and evaluating queries against a database or web service. This will perform significantly better than the alternative of pulling massive result sets and applying query logic in active memory with a "simple List".
The way you can tell that using the IQueryable in your case is detrimental is that the collection is already loaded into memory, when you begin the query.
Both IEnumerable and IQueryable use deferred execution. The difference is that IQueryable is used to cross boundaries like database queries, entity framework queries or OData queries.
When an IQueryable is iterated over, the query is translated to the remote provider's idiom and executed there. When the response is received from the remote provider, it is translated to a local object representation.
Deferred Execution is good because your user may never use the result set and hence there would have been no point querying the data source.
There may be some LINQ methods your user can't use unless they cast the result to IQueryable which means you might restrict what they can do, or force them to cast/copy the list into something more useful.
If you use a List, then you're hard coding your solution to a List, do you care what the implementation of the collection is, does your user ... probably not as long as it supports the necessary interfaces.

Expose IQueryable Over WCF Service

I've been learning about IQueryable and lazy loading/deferred execution of queries.
Is it possible to expose this functionality over WCF? I'd like to expose a LINQ-to-SQL service that returns an IQueryable which I can then perform additional queries on at the client, and finally execute using a .ToList(). Is OData format applicable at all in this context?
If possible, what is the term for this technique and what are some good tutorials I can follow? Thank you.
You should check WCF Data Services which will allow you to define Linq query on the client. WCF Data Services are probably the only solution for your requirement.
IQueryable is still only interface and the functionality depends on the type implementing the interface. You can't directly expose Linq-To-Sql or Linq-To-Entities queries. There are multiple reasons like short living contexts or serialization which will execute the query so the client will get list of all objects instead of the query.
as far as i know, the datacontext is not serializable meaning that you cannot pass it around with WCF
You can use http://interlinq.codeplex.com/ which allows you to send Linq query over WCF.
WCF Data Services only can be using with webHttpBinding and not all Linq queries can be expressed. Writing queries when WCF Data Service is used is not so attractive - requires string expressions such as:
.AddQueryOption("$filter", "Id eq 100");
https://remotelinq.codeplex.com/ is another choice. But it works in AppDomain to scan the Current Assemblies and serialize them. This tech is not suitable to WinRT as no Domains for WinRT App
I've been struggling with the same question, and realized that a well formed question is a problem solved.
IQueryable basically serves to filter the query before sending it to your DB call so instead of getting 1000 records and filter only 10, you get those 10 to begin with. That filtering belongs to your Service Layer, but if you are building an API I would assume you would map it with AND/OR parameters in your URL.
http://{host}/{entity}/q?name=john&age=21.
So you end up with something like this:
Filter:Column1=Value1 > http://{host}/{entity}q?column1=value1 > SELECT *
FROM Entity
WHERE Column1=Value1
MVC > WCF > DB
You can find a very good sample [here]
Lastly, since your payload from the WCF will be most likely a JSON, you can (and should) then deserialize them in your Domain Models inside a collection. It is until this point where the paging should occur, so I would recommend some WCF caching (and since its HTTP, it's really simple). You still will be using LINQ on the WebApp side, just w/o "WHERE" LINQ clause (unless you want to dynamically create the URL expressed above?)
For a complex OR query, you mind end up with multiple WCF queries (1 per "AND") and then concatenate them all together
If it's possible to send IQuerable<> over WCF it's not a good thing security wise, since IQuerable<> could expose stuff like the connection string to the database.
Some of the previous comments seem promising though.

EF4, Lambda, Repository pattern and DTOs

I have a semi complicated question regarding Entity Framework4, Lambda expressions, and Data Transfer Objects (DTO).
So I have a small EF4 project, and following established OO principles, I have a DTO to provide a layer of abstraction between the data consumers (GUI) and the data model.
VideoDTO = DTO with getters/setters, used by the GUI
VideoEntity = Entity generated by EF4
My question revolves around the use of the DTO by the GUI (and not having the GUI use the Entity at all), combined with a need to pass a lambda to the data layer. My data layer is a basic repository pattern with Add. Change, Delete, Get, GetList, etc.
Trying to implement a Find method with a signature like so:
public IEnumerable<VideoDTO> Find(Expression<Func<VideoEntity, bool>> exp)
...
_dataModel.Videos.Where(exp).ToList<Video>()
---
My problem/concern is the "exp" needing to be of type VideoEntity instead of VideoDTO. I want to preserve the separation of concerns so that the GUI does not know about the Entity objects. But if I try to pass in
Func<VideoDTO, bool>
I cannot then do a LINQ Where on that expression using the actual data model.
Is there a way to convert a Func<VideoDTO,bool> to a Func<VideoEntity, bool>
Ideally my method signature would accept Func<VideoDTO, bool> and that way the GUI would have no reference to the underlying data entity.
Is this clear enough? Thanks for your help
Thanks for the repliesto both of you.
I'll try the idea of defining the search criteria in an object and using that in the LINQ expression. Just starting out with both EF4 and L2S, using this as a learning project.
Thanks again!
In architectures like CQRS there isn't need for such a conversion at all cause read & write sides of app are separated.
But in Your case, You can't runaway from translation.
First of all - You should be more specific when defining repositories. Repository signature is thing You want to keep explicit instead of generic.
Common example to show this idea - can You tell what indexes You need in Your database when You look at Your repository signature (maybe looking at repository implementation, but certainly w/o looking at client code)? You can't. Cause it's too generic and client side can search by anything.
In Your example it's a bit better cause expression genericness is tied with dto instead of entity.
This is what I do (using NHibernate.Linq, but the idea remains)
public class Application{
public Project Project {get;set;}
}
public class ApplicationRepository{
public IEnumerable<Application> Search(SearchCriteria inp){
var c=Session.Linq<Application>();
var q=c.AsQueryable();
if(!string.IsNullOrEmpty(inp.Acronym))
q=q.Where(a=>a.Project.Acronym.Contains(inp.Acronym));
/*~20 lines of similar code snipped*/
return q.AsQueryable();
}
}
//used by client
public class SearchCriteria{
public string Acronym{get;set;}
/*some more fields that defines how we can search Applications*/
}
If You do want to keep Your expressions, one way would be to define dictionary manually like this:
var d=new Dictionary<Expression<Func<VideoDTO,object>>,
Expression<Func<VideoEntity,object>>{
{x=>x.DtoPropNumberOne,x=>x.EntityPropNumberOne} /*, {2}, {3}, etc.*/
};
And use it later:
//can You spot it?
//client does not know explicitly what expressions dictionary contains
_dataModel.Videos.Where(d[exp]).ToList<Video>();
//and I'm not 100% sure checking expression equality would actually work
If You don't want to write mapping dictionary manually, You will need some advanced techniques. One idea would be to translate dto expression to string and then back to entity expression. Here are some ideas (sorting related though) that might help. Expressions are quite complicated beasts.
Anyway - as I said, You should avoid this. Otherwise - You will produce really fragile code.
Perhaps your design goal is to prevent propagation of the data model entities to the client tier rather than to prevent a dependency between the presentation layer and data model. If viewed that way then there would be nothing wrong with the query being formed the way you state.
To go further you could expose the searchable fields from VideoEntity via an interface (IVideoEntityQueryFields) and use that as the type in the expression.
If you don't want to add an interface to your entities then the more complicated option is to use a VideoEntityQuery object and something that translates an Expression<Func<VideoEntityQuery,bool>> to an Expression<Func<VideoEntity,bool>>.

If Entity Framework is meant to work with POCOs, then why the dependency on IObjectSet?

I keep hearing about EF 4.0, POCO, IObjectSet, UnitOfWork (by the way, UoW is atleast more than 17 years old when I first heard it) etc.
So some folks talk about Repository "pattern". etc. There are numerous bloggers showcasing their concoction of a "wrapper" or repository or something similar.
But they all require IObjectSets (or in some cases - IQueryables) to be hanging off their POCOs. Expectation seems to be that you can write queries against them.
So if one needs IObjectSet and not just IList or some other simpler collection, why are we saying this is POCO and free from EF?
If I want to swap EF from underneath, I need to make sure my "other" O/R Mapper (I know I know.. EF is not just an O/R Mapper) understands IObjectSet and be able to parse the ExpressionTrees from the queries, execute and otherwise behave similar to EF.
IObjectSet is not the interface that makes an Entity POCO, it's just the persistence container IObjectSet. The point of POCO is to prevent you from having to derive your Model classes from an EF type, which the T4 POCO template in EF4 provides.
The Repository pattern is an optional additional layer of abstraction from your ORM to allow easier implementation of a different one if the need arose. Separation of concerns etc etc.
Take a look at Entity Framework Code First: http://weblogs.asp.net/scottgu/archive/2010/08/03/using-ef-code-first-with-an-existing-database.aspx
In response to the phrase: "If I want to swap EF from underneath":
In my business, it is more likely that I would swap out the database, say from Oracle to SQL Server (or vice versa), than that I would swap out the data access framework. On the other hand, there do exist options that make EF a favorable choice.
There are other LINQ providers than those provided by EF (e.g. LLBLGen). Sure, swapping out an EF data tier for NHibernate or EasyObjects would be difficult, because the frameworks do not have sufficient feature parity to ease the transition; however, LINQ was designed to open the way for other LINQ providers to step in and provide their own solution.
Your question contains a wrong statement: Correct is that POCOs do not depend on IObjectSet.
POCOs themselves are independent from EF. Or better: They are supposed to be independent from EF. Since YOU are implementing the POCO classes you are finally responsible to make this sure. (Otherwise the term POCO would be the wrong one.)
If you are using the standard T4 template to create POCO classes from a model description instead of writing the classes on your own the template ensures that the classes do not depend on EF - they are not derived from Entity and collections as members of a class are generated with ICollection by this template, not with IObjectSet.
Repository pattern is another question. The POCO T4 template does not create a Repository as an abstract interface to act on a database with POCOs. It creates a derived ObjectContext which is rather an EF specific implementation of a possible repository interface (or at least helps to easily implement a possible repository interface).
If you want to have a repository interface which doesn't depend on EF or LINQ you have to define it this way. Nothing forces you to use IObjectSet or IQueryable in that interface. Perhaps the examples of implementing the Repository pattern you saw didn't intend to be independent from Entity Framework or LINQ.
An example:
Suppose, in your business layer you need a list of all products of a given category returned from the persistance layer. What would this layer expose to fulfill the request?
If you only have databases in mind which offer a LINQ provider you might design the repository interface like so:
public interface IProductsRepository
{
IQueryable<Product> AllProducts { get; } // Product is the POCO class
}
A concrete implementation of this repository based on EF would simply return an ObjectSet<Product> from the ObjectContext which the T4 template did create.
And your business layer runs a query this way:
IProductsRepository rep = new SomeConcreteImplementationOfProductsRepository();
IList productsOfCategory =
rep.AllProducts.Select(p => p.Category == "stuff").ToList();
But if you want to be more open what kind of persistance storage you like to support it might be better to design the repository independent from IQueryable. The consequence could be that your abstract repository interface needs more specific methods to answer requests from the business layer, for instance you need now:
public interface IProductsRepository
{
IList<Product> GetProductsOfCategory(string category);
}
and the business layer does this:
IProductsRepository rep = new SomeConcreteImplementationOfProductsRepository();
IList productsOfCategory = rep.GetProductsOfCategory("stuff");
A concrete implementation of this Repository using EF (or another data framework supporting LINQ) could still leverage a LINQ query like the business layer did in the first example. But other implementations could work in another way (say: you have a "database" which stores products in one text file per category. Then the implementation for that interface method would read one specific file from disk. Or your repository implementation asks a webservice for the data, and so on...)
Key point is: If you are using POCO classes you are open for all those kinds of repositories. EF with POCO support doesn't force you to build repository interfaces based on IQueryable or even IObjectSet. It finally depends on what kind of persistance layers you have in mind. The more different they are the more specific methods you might need to support in your repository interface and the more work you'll have to implement those methods. Using IQueryable is a comfortable compromise which allows to define a simple repository interface while enabling simple implementations by EF but also other databases with LINQ provider. I think that's the only reason why you see examples of repository pattern implementations with IQueryable so often. It's not an inherent restriction imposed by EF with POCOs.
(That's how I think about it, not being an expert in design patterns, so heavy attacks and corrections in the comments are welcome.)

Resources