Longish LINQ query breakes SQLite-parser - simplify? - linq

I'm programming a search for a SQLite-database using C# and LINQ.
The idea of the search is, that you can provide one or more keywords, any of which must be contained in any of several column-entries for that row to be added to the results.
The implementation consists of several linq-queries which are all put together by union. More keywords and columns that have to be considered result in a more complicated query that way. This can lead to SQL-code, which is to long for the SQLite-parser.
Here is some sample code to illustrate:
IQueryable<Reference> query = null;
if (searchAuthor)
foreach (string w in words)
{
string word = w;
var result = from r in _dbConnection.GetTable<Reference>()
where r.ReferenceAuthor.Any(a => a.Person.LastName.Contains(word) || a.Person.FirstName.Contains(word))
orderby r.Title
select r;
query = query == null ? result : query.Union(result);
}
if (searchTitle)
foreach (string word in words)
{
var result = from r in _dbConnection.GetTable<Reference>()
where r.Title.Contains(word)
orderby r.Title
select r;
query = query == null ? result : query.Union(result);
}
//...
Is there a way to structure the query in a way that results in more compact SQL?
I tried to force the creation of smaller SQL-statments by calling GetEnumerator() on the query after every loop. But apparently Union() doesn't operate on data, but on the underlying LINQ/SQL statement, so I was generating to long statements regardless.
The only solution I can think of right now, is to really gather the data after every "sub-query" and doing a union on the actual data and not in the statement. Any ideas?

For something like that, you might want to use a PredicateBuilder, as shown in the chosen answer to this question.

Related

Parametrize F# queries based on properties

I'm looking to factor out some common queries over several tables. In a very simple example all tables have a DataDate column, so I have queries like this:
let dtexp1 = query { for x in table1 do maxBy x.Datadate }
let dtexp2 = query { for x in table2 do maxBy x.Datadate }
Based on a previous question I can do the following:
let mkQuery t q =
query { for rows in t do maxBy ((%q) rows) }
let getMaxDt1 = mkQuery table1 (<# fun q -> q.Datadate #>)
let getMaxDt2 = mkQuery table2 (<# fun q -> q.Datadate #>)
I would be interested if there are any other solutions not using quotations. The reason being is that for more complicated queries the quotations and the splicing become difficult to read.
This for example won't work, obviously, as we don't know that x has property DataDate.
let getMaxDt t = query { for x in t do maxBy x.Datadate }
Unless I can abstract over the type of table1, table2, etc. which are generated by SqlProvider.
The answer very much depends on what kind of queries you need to construct and how static or dynamic they are. Generally speaking:
LINQ is great if they are mostly static and if you can easily list all the templates for all queries you'll need - the main nice thing is that it statically type checks the queries
LINQ is not so great when your query structure is very dynamic, because then you end up composing lots of quotations and the type checking sometimes gets into the way.
If your queries are very dynamic (including selecting the source dynamically), but are not too complex (e.g. no fancy groupings no fancy joins), then it might be easier to write code to generate SQL query from an F# domain model.
For your simple example, the query is really just a table name and aggregation:
type Column = string
type Table = string
type QueryAggregate =
| MaxBy of Column
type Query =
{ Table : Table
Aggregate : QueryAggregate }
You can then create your two queries using:
let q1 = { Table = "table1"; Aggregate = MaxBy "Datadate" }
let q2 = { Table = "table2"; Aggregate = MaxBy "Datadate" }
Translating those queries to SQL is quite simple:
let translateAgg = function
| MaxBy col -> sprintf "MAX(%s)" col
let translateQuery q =
sprintf "SELECT %s FROM %s" (translateAgg q.Aggregate) q.Table
Depending on how rich your queries can be, the translation can get very complicated, but if the structure is fairly simple then this might just be an easier alternative than constructing the query using LINQ. As I said, it's hard to say what will be better without knowing the exact use case!

Dynamic Linq on DataTable error: no Field or Property in DataRow, c#

I have some errors using Linq on DataTable and I couldn't figure it out how to solve it. I have to admit that i am pretty new to Linq and I searched the forum and Internet and couldn't figure it out. hope you can help.
I have a DataTable called campaign with three columns: ID (int), Product (string), Channel (string). The DataTable is already filled with data. I am trying to select a subset of the campaign records which satisfied the conditions selected by the end user. For example, the user want to list only if the Product is either 'EWH' or 'HEC'. The selection criteria is dynaically determined by the end user.
I have the following C# code:
private void btnClick()
{
IEnumerable<DataRow> query =
from zz in campaign.AsEnumerable()
orderby zz.Field<string>("ID")
select zz;
string whereClause = "zz.Field<string>(\"Product\") in ('EWH','HEC')";
query = query.Where(whereClause);
DataTable sublist = query.CopyToDataTable<DataRow>();
}
But it gives me an error on line: query = query.Where(whereClause), saying
No property or field 'zz' exists in type 'DataRow'".
If I changed to:
string whereClause = "Product in ('EWH','HEC')"; it will say:
No property or field 'Product' exists in type 'DataRow'
Can anyone help me on how to solve this problem? I feel it could be a pretty simple syntax change, but I just don't know at this time.
First, this line has an error
orderby zz.Field<string>("ID")
because as you said, your ID column is of type int.
Second, you need to learn LINQ query syntax. Forget about strings, the same way you used from, orderby, select in the query, you can also use where and many other operators. Also you'll need to learn the equivalent LINQ constructs for SQL-ish things, like for instance IN (...) is mapped to Enumerable.Contains etc.
With all that being said, here is your query
var productFilter = new[] { "EWH", "HEC" };
var query =
from zz in campaign.AsEnumerable()
where productFilter.Contains(zz.Field<string>("Product"))
orderby zz.Field<int>("ID")
select zz;
Update As per your comment, if you want to make this dynamic, then you need to switch to lambda syntax. Multiple and criteria can be composed by chaining multiple Where clauses like this
List<string> productFilter = ...; // coming from outside
List<string> channelFilter = ...; // coming from outside
var query = campaign.AsEnumerable();
// Apply filters if needed
if (productFilter != null && productFilter.Count > 0)
query = query.Where(zz => productFilter.Contains(zz.Field<string>("Product")));
if (channelFilter != null && channelFilter.Count > 0)
query = query.Where(zz => channelFilter.Contains(zz.Field<string>("Channel")));
// Once finished with filtering, do the ordering
query = query.OrderBy(zz => zz.Field<int>("ID"));

LINQ return records where string[] values match Comma Delimited String Field

I am trying to select some records using LINQ for Entities (EF4 Code First).
I have a table called Monitoring with a field called AnimalType which has values such as
"Lion,Tiger,Goat"
"Snake,Lion,Horse"
"Rattlesnake"
"Mountain Lion"
I want to pass in some values in a string array (animalValues) and have the rows returned from the Monitorings table where one or more values in the field AnimalType match the one or more values from the animalValues. The following code ALMOST works as I wanted but I've discovered a major flaw with the approach I've taken.
public IQueryable<Monitoring> GetMonitoringList(string[] animalValues)
{
var result = from m in db.Monitorings
where animalValues.Any(c => m.AnimalType.Contains(c))
select m;
return result;
}
To explain the problem, if I pass in animalValues = { "Lion", "Tiger" } I find that three rows are selected due to the fact that the 4th record "Mountain Lion" contains the word "Lion" which it regards as a match.
This isn't what I wanted to happen. I need "Lion" to only match "Lion" and not "Mountain Lion".
Another example is if I pass in "Snake" I get rows which include "Rattlesnake". I'm hoping somebody has a better bit of LINQ code that will allow for matches that match the exact comma delimited value and not just a part of it as in "Snake" matching "Rattlesnake".
This is a kind of hack that will do the work:
public IQueryable<Monitoring> GetMonitoringList(string[] animalValues)
{
var values = animalValues.Select(x => "," + x + ",");
var result = from m in db.Monitorings
where values.Any(c => ("," + m.AnimalType + ",").Contains(c))
select m;
return result;
}
This way, you will have
",Lion,Tiger,Goat,"
",Snake,Lion,Horse,"
",Rattlesnake,"
",Mountain Lion,"
And check for ",Lion," and "Mountain Lion" won't match.
It's dirty, I know.
Because the data in your field is comma delimited you really need to break those entries up individually. Since SQL doesn't really support a way to split strings, the option that I've come up with is to execute two queries.
The first query uses the code you started with to at least get you in the ballpark and minimize the amount of data you're retrieving. It converts it to a List<> to actually execute the query and bring the results into memory which will allow access to more extension methods like Split().
The second query uses the subset of data in memory and joins it with your database table to then pull out the exact matches:
public IQueryable<Monitoring> GetMonitoringList(string[] animalValues)
{
// execute a query that is greedy in its matches, but at least
// it's still only a subset of data. The ToList()
// brings the data into memory, so to speak
var subsetData = (from m in db.Monitorings
where animalValues.Any(c => m.AnimalType.Contains(c))
select m).ToList();
// given that subset of data in the List<>, join it against the DB again
// and get the exact matches this time
var result = from data in subsetData
join m in db.Monitorings on data.ID equals m.ID
where data.AnimalType.Split(',').Intersect(animalValues).Any ()
select m;
return result;
}

Linq: Dynamic Query Contruction: query moves to client-side

I've been following with great interest the converstaion here:
Construct Query with Linq rather than SQL strings
with regards to constructing expression trees where even the table name is dynamic.
Toward that end, I've created a Extension method, addWhere, that looks like:
static public IQueryable<TResult> addWhere<TResult>(this IQueryable<TResult> query, string columnName, string value)
{
var providerType = query.Provider.GetType();
// Find the specific type parameter (the T in IQueryable<T>)
var iqueryableT = providerType.FindInterfaces((ty, obj) => ty.IsGenericType && ty.GetGenericTypeDefinition() == typeof(IQueryable<>), null).FirstOrDefault();
var tableType = iqueryableT.GetGenericArguments()[0];
var tableName = tableType.Name;
var tableParam = Expression.Parameter(tableType, tableName);
var columnExpression = Expression.Equal(
Expression.Property(tableParam, columnName),
Expression.Constant(value));
var predicate = Expression.Lambda(columnExpression, tableParam);
var function = (Func<TResult, Boolean>)predicate.Compile();
var whereRes = query.Where(function);
var newquery = whereRes.AsQueryable();
return newquery;
}
[thanks to Timwi for the basis of that code]
Which functionally, works.
I can call:
query = query.addWhere("CurUnitType", "ML 15521.1");
and it's functionally equivalent to :
query = query.Where(l => l.CurUnitType.Equals("ML 15521.1"));
ie, the rows returned are the same.
However, I started watching the sql log, and I noticed with the line:
query = query.Where(l => l.CurUnitType.Equals("ML 15521.1"));
The Query generated is:
SELECT (A bunch of columns)
FROM [dbo].[ObjCurLocView] AS [t0]
WHERE [t0].[CurUnitType] = #p0
whereas when I use the line
query = query.addWhere("CurUnitType", "ML 15521.1");
The query generated is :
SELECT (the same bunch of columns)
FROM [dbo].[ObjCurLocView] AS [t0]
So, the comparison is now happening on the client side, instead of being added to the sql.
Obviously, this isn't so hot.
To be honest, I mostly cut-and-pasted the addWhere code from Timwi's (slightly different) example, so some of it is over my head. I'm wondering if there's any adjustment I can make to this code, so the expression is converted into the SQL statement, instead of being determined client-side
Thanks for taking the time to read through this, I welcome any comments, solutions, links, etc, that could help me with this. And of course if I find the solution through other means, I'll post the answer here.
Cheers.
The big problem is that you're converting the expression tree into a delegate. Look at the signature of Queryable.Where - it's expressed in expression trees, not delegates. So you're actually calling Enumerable.Where instead. That's why you need to call AsQueryable afterwards - but that doesn't do enough magic here. It doesn't really put it back into "just expression trees internally" land, because you've still got the delegate in there. It's now wrapped in an expression tree, but you've lost the details of what's going on inside.
I suspect what you want is this:
var predicate = Expression.Lambda<Func<TResult, Boolean>>
(columnExpression, tableParam);
return query.Where(predicate);
I readily admit that I haven't read the rest of your code, so there may be other things going on... but that's the core bit. You want a strongly typed expression tree (hence the call to the generic form of Expression.Lambda) which you can then pass into Queryable.Where. Give it a shot :)

Linq to EF Expression Tree / Predicate int.Parse workaround

I have a linq Entity called Enquiry, which has a property: string DateSubmitted.
I'm writing an app where I need to return IQueryable for Enquiry that have a DateSubmitted within a particular date range.
Ideally I'd like to write something like
IQueryable<Enquiry> query = Context.EnquirySet.AsQueryable<Enquiry>();
int dateStart = int.Parse("20090729");
int dateEnd = int.Parse("20090930");
query = (from e in query
where(enq => int.Parse(enq.DateSubmitted) < dateEnd)
where(enq => int.Parse(enq.DateSubmitted) > dateStart)
select e);
Obviously Linq to EF doesn't recognise int.Parse, so I think I can achieve what I want with an Expression method that returns a predicate???
I've been playing around with PredicateBuilder and looking all over but I've successfully fried my brains trying to work this out. Sure I could add another property to my Entity and convert it there but I'd really like to understand this. Can anyone explain or give an example/link that doesn't fry my brains?
Thanks in advance
Mark
If you know your date strings are valid, and they're really in that order (which is a natural sort order) you might be able to get away with string comparisons:
IQueryable<Enquiry> query = Context.EnquirySet.AsQueryable<Enquiry>();
string dateStart ="20090729";
string dateEnd = "20090930";
query = (from e in query
where(enq => enq.DateSubmitted.CompareTo(dateEnd)) < 0)
where(enq => enq.DateSubmitted.CompareTo(dateStart)) > 0)
select e);

Resources