LINQ: return records where string[] values match a comma-delimited string field

I am trying to select some records using LINQ for Entities (EF4 Code First).
I have a table called Monitoring with a field called AnimalType which has values such as
"Lion,Tiger,Goat"
"Snake,Lion,Horse"
"Rattlesnake"
"Mountain Lion"
I want to pass in some values in a string array (animalValues) and have the rows returned from the Monitorings table where one or more values in the AnimalType field match one or more of the values in animalValues. The following code ALMOST works as I wanted, but I've discovered a major flaw with the approach I've taken.
public IQueryable<Monitoring> GetMonitoringList(string[] animalValues)
{
    var result = from m in db.Monitorings
                 where animalValues.Any(c => m.AnimalType.Contains(c))
                 select m;
    return result;
}
To explain the problem, if I pass in animalValues = { "Lion", "Tiger" } I find that three rows are selected due to the fact that the 4th record "Mountain Lion" contains the word "Lion" which it regards as a match.
This isn't what I wanted to happen. I need "Lion" to only match "Lion" and not "Mountain Lion".
Another example is if I pass in "Snake" I get rows which include "Rattlesnake". I'm hoping somebody has a better bit of LINQ code that matches the exact comma-delimited value and not just a part of it, as with "Snake" matching "Rattlesnake".

This is a kind of hack that will do the work:
public IQueryable<Monitoring> GetMonitoringList(string[] animalValues)
{
    var values = animalValues.Select(x => "," + x + ",");
    var result = from m in db.Monitorings
                 where values.Any(c => ("," + m.AnimalType + ",").Contains(c))
                 select m;
    return result;
}
This way, you will have
",Lion,Tiger,Goat,"
",Snake,Lion,Horse,"
",Rattlesnake,"
",Mountain Lion,"
When you then check for ",Lion,", the ",Mountain Lion," row won't match.
It's dirty, I know.
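For instance, with the sample rows from the question, a call like this (hypothetical usage) now brings back only the first two rows:

var rows = GetMonitoringList(new[] { "Lion", "Tiger" }).ToList();
// ",Mountain Lion," does not contain ",Lion," or ",Tiger,", so that row is excluded,
// and ",Rattlesnake," is excluded for the same reason.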

Because the data in your field is comma-delimited, you really need to break those entries up individually. Since SQL doesn't really support a way to split strings, the option that I've come up with is to execute two queries.
The first query uses the code you started with to at least get you in the ballpark and minimize the amount of data you're retrieving. It converts the result to a List<> to actually execute the query and bring the results into memory, which gives you access to more extension methods like Split().
The second query uses the subset of data in memory and joins it with your database table to then pull out the exact matches:
public IQueryable<Monitoring> GetMonitoringList(string[] animalValues)
{
    // execute a query that is greedy in its matches, but at least
    // it's still only a subset of data. The ToList()
    // brings the data into memory, so to speak
    var subsetData = (from m in db.Monitorings
                      where animalValues.Any(c => m.AnimalType.Contains(c))
                      select m).ToList();

    // given that subset of data in the List<>, join it against the DB again
    // and get the exact matches this time
    var result = from data in subsetData
                 join m in db.Monitorings on data.ID equals m.ID
                 where data.AnimalType.Split(',').Intersect(animalValues).Any()
                 select m;

    // the second join runs as LINQ to Objects, so wrap it to satisfy the IQueryable signature
    return result.AsQueryable();
}

Related

Is there any better way to check if the same data is present in a table in .Net core 3.1?

I'm pulling data from a third-party API. The API runs multiple times a day. If the same data is already present in the table, that record should be ignored; if there are any changes, the record should be updated; and if anything new shows up in the JSON received, a new record should be inserted.
I'm using the below code for inserting any new data.
var input = JsonConvert.DeserializeObject<List<DeserializeLookup>>(resultJson).ToList();
var entryset = input.Select(y => new Lookup
{
    lookupType = "JOBCODE",
    code = y.Code,
    description = y.Description,
    isNew = true,
    lastUpdatedDate = DateTime.UtcNow
}).ToList();
await _context.Lookup.AddRangeAsync(entryset);
await _context.SaveChangesAsync();
But after the first run, when the API runs again, it inserts the same data into the table again, so duplicate entries end up in the table. To handle this, I used a foreach loop as below before inserting data into the table.
foreach (var item in input)
{
    if (!_context.Lookup.Any(r => r.code == item.Code))
    {
        // above insert code
    }
}
But this doesn't work as expected, and the API takes a lot of time to run when I put the foreach loop in. Is there a solution to this in .NET Core 3.1?
It will be better if you try it this way: collect the new items first and add them in one go.
var newList = new List<Lookup>();
foreach (var item in input)
{
    if (!_context.Lookup.Any(r => r.code == item.Code))
    {
        // map the incoming record to an entity (the "insert code" from above)
        newList.Add(new Lookup
        {
            lookupType = "JOBCODE",
            code = item.Code,
            description = item.Description,
            isNew = true,
            lastUpdatedDate = DateTime.UtcNow
        });
    }
}
await _context.Lookup.AddRangeAsync(newList);
await _context.SaveChangesAsync();
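A rough variant of the same idea, assuming the set of existing codes fits comfortably in memory: load the existing codes once into a HashSet and test membership locally, which avoids one database round-trip per incoming item (ToListAsync comes from Microsoft.EntityFrameworkCore).
// sketch only: one query for the existing codes, then filter the incoming items in memory
var existingCodes = new HashSet<string>(
    await _context.Lookup.Select(r => r.code).ToListAsync());

var newEntries = input
    .Where(item => !existingCodes.Contains(item.Code))
    .Select(y => new Lookup
    {
        lookupType = "JOBCODE",
        code = y.Code,
        description = y.Description,
        isNew = true,
        lastUpdatedDate = DateTime.UtcNow
    })
    .ToList();

await _context.Lookup.AddRangeAsync(newEntries);
await _context.SaveChangesAsync();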
I'm on my phone so forgive me for not being able to format the code in my response. The solution to your problem is something I actually just encountered myself while syncing data from an Azure Function and a third-party app into a SQL database.
Depending on your table schema, you would need one column with a unique identifier. Make this column a primary key (first step to preventing duplicates). Here’s a resource for that: https://www.w3schools.com/sql/sql_primarykey.ASP
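As a sketch (the table and column names are taken from the question and may differ in your schema; if the table already has a separate identity primary key, a unique constraint on the code column gives the same duplicate protection):
ALTER TABLE Lookup
    ADD CONSTRAINT PK_Lookup_Code PRIMARY KEY (code);
-- or, if Lookup already has its own primary key:
ALTER TABLE Lookup
    ADD CONSTRAINT UQ_Lookup_Code UNIQUE (code);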
The next step you want to take care of is your stored procedure. You’ll need to perform what’s commonly referred to as an UPSERT. To do this you’ll need to merge a table with the incoming data...on a specified column (whichever is your primary key).
That would look something like this:
MERGE Table_1 AS T1
USING Incoming_Data AS source
    ON T1.column1 = source.column1
    -- you can use an AND / OR operator here to match on additional values or combinations
WHEN MATCHED THEN
    UPDATE SET T1.column2 = source.column2
    -- etc. for more columns
WHEN NOT MATCHED THEN
    INSERT (column1, column2, column3)
    VALUES (source.column1, source.column2, source.column3);
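From EF Core 3.1 you could then drive that MERGE through a stored procedure; a minimal sketch, assuming a hypothetical procedure dbo.UpsertLookup that wraps the statement above:
// dbo.UpsertLookup is an assumed stored procedure wrapping the MERGE shown above
foreach (var item in input)
{
    await _context.Database.ExecuteSqlInterpolatedAsync(
        $"EXEC dbo.UpsertLookup @code = {item.Code}, @description = {item.Description}");
}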
First of all, you should decouple the format in which you get your data from your actual data handling. In your case: get rid of the JSON before you actually interpret the data.
Alas, I haven't got a clue what your data represents, so let's assume your data is a sequence of Customer Orders. When you get new data, you want to add all new Orders and update changed Orders.
So somewhere you have a method with input your json data, and as output a sequence of Orders:
IEnumerable<Order> InterpretJsonData(string jsonData)
{
...
}
You know JSON better than I do; besides, this conversion is a bit beside your question.
You wrote:
So, if the same data is present in the table it should ignore that record, else if there are any changes it should update that record or insert a new record
You need an Equality Comparer
To detect whether there are added or changed Customer Orders, you need something to detect whether Order A equals Order B. There must be at least one unique field by which you can identify an Order, even if all other values of the Order are changed.
This unique value is usually called the primary key, or the Id. I assume your Orders have an Id.
So if your new Order data contains an Id that was not available before, then you are certain that the Order was Added.
If your new Order data has an Id that was already in previously processed Orders, then you have to check the other values to detect whether it was changed.
For this you need equality comparers: one that says that two Orders are equal if they have the same Id, and one that checks all values for equality.
A standard pattern is to derive your comparer from class EqualityComparer<Order>
class OrderComparer : EqualityComparer<Order>
{
    public static IEqualityComparer<Order> ByValue = new OrderComparer();

    ... // TODO implement
}
First I'll show you how to use this to detect additions and changes; then I'll show you how to implement it.
Somewhere you have access to the already processed Orders:
IEnumerable<Order> GetProcessedOrders() {...}
var jsondata = FetchNewJsonOrderData();
// convert the jsonData into a sequence of Orders
IEnumerable<Order> orders = this.InterpretJsonData(jsondata);
To detect which Orders are added or changed, you could make a Dictionary of the already processed Orders and check the incoming Orders one by one to see whether they have changed:
IEqualityComparer<Order> comparer = OrderComparer.ByValue;

Dictionary<int, Order> processedOrders = this.GetProcessedOrders()
    .ToDictionary(order => order.Id);

foreach (Order order in orders)
{
    if (processedOrders.TryGetValue(order.Id, out Order originalOrder))
    {
        // order already existed. Is it changed?
        if (!comparer.Equals(order, originalOrder))
        {
            // unequal!
            this.ProcessChangedOrder(order);

            // remember the changed values of this Order
            processedOrders[order.Id] = order;
        }
        // else: no changes, nothing to do
    }
    else
    {
        // Added!
        this.ProcessAddedOrder(order);
        processedOrders.Add(order.Id, order);
    }
}
Immediately after Processing the changed / added order, I remember the new value, because the same Order might be changed again.
If you want this in a LINQ fashion, you have to GroupJoin the Orders with the ProcessedOrders, to get "Orders with their zero or more Previously processed Orders" (there will probably be zero or one Previously processed order).
var ordersWithPreviouslyProcessedOrder = orders.GroupJoin(this.GetProcessedOrders(),

    order => order.Id,                   // from every Order take the Id
    processedOrder => processedOrder.Id, // from every previously processed Order take the Id

    // parameter resultSelector: from every Order, with its zero or more previously
    // processed Orders, make one new object:
    (order, previouslyProcessedOrders) => new
    {
        Order = order,
        ProcessedOrder = previouslyProcessedOrders.FirstOrDefault(),
    })
    .ToList();
I use GroupJoin instead of Join, because this way I also get the "Orders that have no previously processed orders" (= new orders). If you would use a simple Join, you would not get them.
I do a ToList, so that in the next statements the group join is not done twice:
var addedOrders = ordersWithPreviouslyProcessedOrder
    .Where(orderCombi => orderCombi.ProcessedOrder == null);

var changedOrders = ordersWithPreviouslyProcessedOrder
    .Where(orderCombi => orderCombi.ProcessedOrder != null
                      && !comparer.Equals(orderCombi.Order, orderCombi.ProcessedOrder));
Implementation of "Compare by Value"
// equal if all values are equal
public override bool Equals(Order x, Order y)
{
    if (x == null) return y == null;   // true if both null, false if x null but y not null
    if (y == null) return false;       // because x is not null
    if (Object.ReferenceEquals(x, y)) return true;
    if (x.GetType() != y.GetType()) return false;

    // compare all properties one by one:
    return x.Id == y.Id
        && x.Date == y.Date
        && ...
}
For GetHashCode there is one rule: if X equals Y, then they must have the same hash code. If they are not equal there is no rule, but lookups are more efficient if unequal objects tend to have different hash codes. Make a tradeoff between calculation speed and hash code uniqueness.
In this case: If two Orders are equal, then I am certain that they have the same Id. For speed I don't check the other properties.
public override int GetHashCode(Order x)
{
    if (x == null)
        return 0x34339d98;   // just a fixed hash code for all null Orders
    else
        return x.Id.GetHashCode();
}
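Putting the pieces together, a minimal sketch of the whole comparer (Id and Date are assumed example properties; substitute the real fields of your Order class):
class OrderComparer : EqualityComparer<Order>
{
    public static IEqualityComparer<Order> ByValue = new OrderComparer();

    public override bool Equals(Order x, Order y)
    {
        if (x == null) return y == null;
        if (y == null) return false;
        if (Object.ReferenceEquals(x, y)) return true;
        if (x.GetType() != y.GetType()) return false;

        // compare all properties one by one (Id and Date are example properties):
        return x.Id == y.Id
            && x.Date == y.Date;
    }

    public override int GetHashCode(Order x)
    {
        // equal Orders always share the same Id, so its hash code is enough
        return x == null ? 0x34339d98 : x.Id.GetHashCode();
    }
}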

How to query for columns in EF using a list of column names as string?

Let's say I have a table that I can query with EF + LINQ like this:
var results = dbContext.MyTable.Where(q => q.Flag == true);
Then, I know that if I want to limit the columns returned, I can just add a select in that line like this:
var results = dbContext.MyTable
    .Where(q => q.Flag == true)
    .Select(model => new { model.column2, model.column4, model.column9 });
The next step that I need to figure out is how to select those columns dynamically. Said another way, I need to be able to select columns in a table without knowing what they are at compile time. So, for example, I need to be able to do something like this:
public IEnumerable<object> GetWhateverColumnsYouWant(List<string> columns)
{
    // e.g. columns = new List<string> { "column3", "column4", "column999" }
    // automagical stuff goes here.
}
It is important to keep the returned record values strongly typed, meaning the values can't just be dumped into a list of strings. Is this something that can be accomplished with reflection? Or would generics fit this better? Honestly, I'm not sure where to start with this.
Thanks for any help.
I think you want some dynamic LINQ. I'm no expert on this, but I think it will go something like this:
public static IEnumerable<object> GetWhateverColumnsYouWant<T>(this IQueryable<T> query, List<string> columns)
{
    // e.g. columns = new List<string> { "column3", "column4", "column999" }
    // Select(string) comes from the Dynamic LINQ library (System.Linq.Dynamic)
    return query.Select("new (" + String.Join(", ", columns) + ")").Cast<object>();
}
See Scott Gu's blog here http://weblogs.asp.net/scottgu/archive/2008/01/07/dynamic-linq-part-1-using-the-linq-dynamic-query-library.aspx and this question: System.LINQ.Dynamic: Select(" new (...)") into a List<T> (or any other enumerable collection of <T>)
You could probably also do this by dynamically composing an expression tree of the columns you're wanting to select, but this would be substantially more code to write.
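To illustrate that route, here is a rough sketch that composes the Select expression by hand and returns each row as a dictionary keyed by column name; whether the projection is translated to SQL or evaluated client-side depends on the EF provider and version:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class DynamicSelectExtensions
{
    // builds: source.Select(x => new Dictionary<string, object> { { "col", (object)x.col }, ... })
    public static IQueryable<Dictionary<string, object>> SelectColumns<T>(
        this IQueryable<T> source, IEnumerable<string> columns)
    {
        var parameter = Expression.Parameter(typeof(T), "x");
        var addMethod = typeof(Dictionary<string, object>).GetMethod("Add");

        var initializers = columns.Select(name =>
            Expression.ElementInit(
                addMethod,
                Expression.Constant(name),
                Expression.Convert(Expression.PropertyOrField(parameter, name), typeof(object))));

        var body = Expression.ListInit(
            Expression.New(typeof(Dictionary<string, object>)), initializers);

        var selector = Expression.Lambda<Func<T, Dictionary<string, object>>>(body, parameter);
        return source.Select(selector);
    }
}

// usage, e.g.:
// var rows = dbContext.MyTable
//     .Where(q => q.Flag == true)
//     .SelectColumns(new List<string> { "column3", "column4", "column999" });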

LINQ Lambda Select all from table where field contains all elements in list

I need a LINQ statement that will select all from a table where a field contains all elements in a List<string>, while searching other fields for the entire string regardless of words.
It's basically just an inclusive word search on one field, where all the words need to be in the record, plus a plain string search on the other fields.
I.e. I have a lookup screen that allows the user to search AccountID or Detail for the entire search string, or search ClientID for individual words; I'll expand this to the Detail field if I can figure out the ClientID component.
The complexity is that AccountID and Detail are being searched as well, which basically stops me from using the foreach in the second example because of the "or"s.
Example 1, this gives me the following error when I do query.Count() afterwards:
'query.Count()' threw an exception of type 'System.NotSupportedException':
"Local sequence cannot be used in LINQ to SQL implementations of query operators except the Contains operator."
var StrList = searchStr.Split(' ').ToList();
query = query.Where(p => p.AccountID.Contains(searchStr)
                      || StrList.All(x => p.clientID.Contains(x))
                      || p.Detail.Contains(searchStr));
Example 2, this gives me an "any search word matches" type of result:
var StrList = searchStr.Split(' ').ToList();
foreach (var item in StrList)
    query = query.Where(p => p.AccountID.Contains(searchStr)
                          || p.clientID.Contains(item)
                          || p.Detail.Contains(searchStr));
Update
I have a table with 3 fields, AccountID, ClientId, Details
Records
Id, AccountID, CLientId, Details
1, "123223", "bobo and co", "this client suxs"
2, "654355", "Jesses hair", "we like this client and stuff"
3, "456455", "Microsoft", "We love Mircosoft"
Search examples
searchStr = "232"
Returns Record 1;
searchStr = "bobo hair"
Returns no records
searchStr = "bobo and"
Returns Record 1
searchStr = "123 bobo and"
Returns nothing
The idea here is:
if the client enters a partial AccountId it returns stuff,
if the client wants to search for a ClientID, they can type search terms (i.e. a word list) to narrow down clients; due to the large number of clients, the ClientID will need to contain all the words (in any order).
I know this seems strange but it's just a simple interface to find accounts in a powerful way.
I think there are 2 solutions to your problem.
One is to count the results in memory like this:
int count = query.ToList().Count();
The other one is to not use All in your query:
var query2 = query;
foreach (var item in StrList)
{
    query2 = query2.Where(p => p.clientID.Contains(item));
}
var result = query2.Union(query.Where(p => p.AccountID.Contains(searchStr) || p.Detail.Contains(searchStr)));
The Union at the end acts like an OR between the 2 queries.

Longish LINQ query breaks SQLite parser - simplify?

I'm programming a search for an SQLite database using C# and LINQ.
The idea of the search is that you can provide one or more keywords, any of which must be contained in any of several column entries for that row to be added to the results.
The implementation consists of several LINQ queries which are all put together with Union. The more keywords and columns that have to be considered, the more complicated the query becomes, and this can lead to SQL code that is too long for the SQLite parser.
Here is some sample code to illustrate:
IQueryable<Reference> query = null;

if (searchAuthor)
    foreach (string w in words)
    {
        string word = w;
        var result = from r in _dbConnection.GetTable<Reference>()
                     where r.ReferenceAuthor.Any(a => a.Person.LastName.Contains(word) || a.Person.FirstName.Contains(word))
                     orderby r.Title
                     select r;
        query = query == null ? result : query.Union(result);
    }

if (searchTitle)
    foreach (string word in words)
    {
        var result = from r in _dbConnection.GetTable<Reference>()
                     where r.Title.Contains(word)
                     orderby r.Title
                     select r;
        query = query == null ? result : query.Union(result);
    }

//...
Is there a way to structure the query in a way that results in more compact SQL?
I tried to force the creation of smaller SQL statements by calling GetEnumerator() on the query after every loop, but apparently Union() doesn't operate on the data, it operates on the underlying LINQ/SQL statement, so I was generating overly long statements regardless.
The only solution I can think of right now is to actually gather the data after every "sub-query" and do a union on the actual data rather than in the statement. Any ideas?
For something like that, you might want to use a PredicateBuilder, as shown in the chosen answer to this question.
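As a rough sketch of that idea, using the PredicateBuilder from LINQKit / the C# in a Nutshell samples (with some providers you may also need LINQKit's AsExpandable() on the table before Where), the keyword loops build one OR-ed predicate instead of a chain of Unions:

var predicate = PredicateBuilder.False<Reference>();

if (searchAuthor)
    foreach (string w in words)
    {
        string word = w;
        predicate = predicate.Or(r =>
            r.ReferenceAuthor.Any(a => a.Person.LastName.Contains(word)
                                    || a.Person.FirstName.Contains(word)));
    }

if (searchTitle)
    foreach (string w in words)
    {
        string word = w;
        predicate = predicate.Or(r => r.Title.Contains(word));
    }

var query = _dbConnection.GetTable<Reference>()
    .Where(predicate)
    .OrderBy(r => r.Title);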

Using LINQ to SQL and chained Replace

I have a need to replace multiple strings with others in a query
from p in dx.Table
where p.Field.Replace("A", "a").Replace("B", "b").ToLower() == SomeVar
select p
Which provides a nice single SQL statement with the relevant REPLACE() SQL commands.
All good :)
I need to do this in a few queries around the application, so I'm looking for some help in this regard: a solution that, as above, works as a single SQL hit/command on the server.
It seems from looking around that I can't use RegEx, as there is no SQL equivalent.
Being a LINQ newbie, is there a nice way for me to do this?
E.g. is it possible to get it as an IQueryable "var result", say, and pass that to a function that adds the needed .Replace() calls and passes it back? Can I get a quick example of how, if so?
EDIT: This seems to work! Does it look like it would be a problem?
var data = from p in dx.Videos select p;
data = AddReplacements(data, checkMediaItem);
theitem = data.FirstOrDefault();
...

public IQueryable<Video> AddReplacements(IQueryable<Video> DataSet, string checkMediaItem)
{
    return DataSet.Where(p =>
        p.Title.Replace(" ", "-").Replace("&", "-").Replace("?", "-") == checkMediaItem);
}
Wouldn't it be more performant to reverse what you are trying to do here, i.e. reformat the string you are checking against rather than reformatting the data in the database?
public IQueryable<Video> AddReplacements(IQueryable<Video> DataSet, string checkMediaItem)
{
    var str = checkMediaItem.Replace("-", "?").Replace("&", "-").Replace("-", " ");
    return DataSet.Where(p => p.Title == str);
}
Thus you are now comparing the field in the database with a set value, rather than scanning the table and transforming the data in each row and comparing that.
