LINQ query left joining two tables with concatenation - linq

I am using this as a reference -- how concatenate multiple rows in LINQ with two tables?
I have the exact same needs, except that not all "printers" have "resolutions". In my particular case, I have a Lead table, which stores some basic information. Then there is a tag table, which stores tags used for the Lead. Not every lead has a tag.
This is what I have so far based on the above reference:
var leads = _dbRO.Leads.Join(_dbRO.Tags, p => p.LeadId, r => r.EntityId, (p, r) => new
{
LeadId = p.LeadId,
GigDate = p.GigDate,
Location = p.Location,
Tags = String.Join("|", _dbRO.Tags.Where(k => k.EntityId == p.LeadId)
.Select(lm => lm.TagName.ToString()))
}).Distinct();
This works well for me. However, leads without tags are NOT returned. How do I ensure all leads are returned regardless of tags? An empty string or null for the Tags field would be fine.
Also if you don't mind, if I want to return the Tags in an object array, how do I do that? The reason is because there could be additional information associated with each tag, like color etc. So a simple concatenated string might not be sufficient.
Thanks a bunch!

I've figured out -- I do not need to join the tag table at all. This causes the problem. I just need to select from my Lead table and in the Select section, get the tags as I was already doing.
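As a rough in-memory sketch of that idea (the Lead, Tag and LeadView types and their fields here are hypothetical stand-ins, not the poster's real entities), selecting from the leads alone and looking up the tags inside the projection keeps leads that have no tags. It also shows one way to return the tags as objects instead of a concatenated string. Whether the same shape translates to SQL depends on the LINQ provider in use.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the Lead and Tag tables from the question.
public record Lead(int LeadId, string Location);
public record Tag(int EntityId, string TagName, string Color);

// Hypothetical result shape: tags both as a joined string and as objects.
public record LeadView(int LeadId, string Location, string Tags, List<Tag> TagObjects);

public static class LeadTagsDemo
{
    // Projects every lead; no Join is used, so leads without tags are kept.
    public static List<LeadView> Project(IEnumerable<Lead> leads, IEnumerable<Tag> tags)
    {
        return leads.Select(l =>
        {
            // Tags belonging to this lead (an empty list when there are none).
            var own = tags.Where(t => t.EntityId == l.LeadId).ToList();
            return new LeadView(
                l.LeadId,
                l.Location,
                string.Join("|", own.Select(t => t.TagName)),
                own);
        }).ToList();
    }

    public static void Main()
    {
        var leads = new[] { new Lead(1, "Austin"), new Lead(2, "Boston") };
        var tags = new[] { new Tag(1, "jazz", "blue"), new Tag(1, "wedding", "red") };

        foreach (var row in Project(leads, tags))
        {
            Console.WriteLine($"{row.LeadId} {row.Location} [{row.Tags}] ({row.TagObjects.Count} tag objects)");
        }
    }
}
```

The lead with no tags comes back with an empty Tags string and an empty TagObjects list rather than being dropped.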

If you’ve declared a relationship between Lead and Tag entity types, then EF already supplies your requirements through the Include() extension method.
ctx.Leads.Include(l => l.Tags).ToList()
This requires that Lead declares a navigation property to Tag as shown below.
class Lead
{ ... public List<Tag> Tags { get; set; } }


Realm Xamarin LINQ Select

Is there a way to restrict the "columns" returned from a Realm Xamarin LINQ query?
For example, if I have a Customer RealmObject and I want a list of all customer names, do I have to query All<Customer> and then enumerate the results to build the names list? That seems cumbersome and inefficient. I am not seeing anything in the docs. Am I missing something obvious here? Thanks!
You have to remember that Realm is an object based store. In a RDBMS like Sqlite, restricting the return results to a sub-set of "columns" of an "record" makes sense, but in an object store, you would be removing attributes from the original class and thus creating a new dynamic class to then instantiate these new classes as objects.
Thus, if you want just a List of strings representing the customer names, you can do this:
List<string> names = theRealm.All<Customer>().ToList().Select(customer => customer.Name).ToList();
Note that you first convert the Realm.All<> results to a List and then use a LINQ Select to project just the property that you want. Using .Select directly on a RealmResults is not currently supported (v0.80.0).
If you need to return a complex type that is a subset of attributes from the original RealmObject, assuming you have a matching POCO, you can use:
var custNames = theRealm.All<Customer>().ToList().Select((Customer c) => new Name() { firstName = c.firstName, lastName = c.lastName } );
Remember, once you convert a RealmResult to a static list of POCOs, you lose the live-updating behavior of RealmObjects.
Personally, I avoid doing this whenever possible: Realm is so fast that using a RealmResult, and thus the RealmObjects directly, is more efficient in processing time and memory overhead than converting them to POCOs every time you need a new list.

Select distinct value from a list in linq to entity

There is a table, it is a poco entity generated by entity framework.
class Log
{
int DoneByEmpId;
string DoneByEmpName;
}
I am retrieving a list from the database. I want distinct values based on DoneByEmpId, ordered by DoneByEmpName.
I have tried a lot of ways to do it, but none of them works:
var lstLogUsers = (context.Logs.GroupBy(logList => logList.DoneByEmpId).Select(item => item.First())).ToList(); // it gives error
This one returns all the users:
var lstLogUsers = context.Logs.ToList().OrderBy(logList => logList.DoneByEmpName).Distinct();
Can anyone suggest how to achieve this?
Can I just point out that you probably have a problem with your data model here? I would imagine you should just have DoneByEmpId here, and a separate table Employee which has EmpId and Name.
I think this is why you are needing to use Distinct/GroupBy (which doesn't really work for this scenario, as you are finding).
I'm not near a compiler, so I can't test it, but...
Use the other version of Distinct(), the one that takes an IEqualityComparer<TSource> argument, and then use OrderBy().
See here for example.
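A minimal sketch of that suggestion, run against in-memory objects: the LogByEmpIdComparer class is a hypothetical name introduced here, and the Log class is filled out with public properties for illustration. A custom IEqualityComparer is generally not translated to SQL by LINQ to Entities, so this assumes the rows have already been materialized (for example with ToList()).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Log
{
    public int DoneByEmpId { get; set; }
    public string DoneByEmpName { get; set; } = "";
}

// Hypothetical comparer: two Log rows are equal when they share a DoneByEmpId.
public sealed class LogByEmpIdComparer : IEqualityComparer<Log>
{
    public bool Equals(Log x, Log y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.DoneByEmpId == y.DoneByEmpId;
    }

    public int GetHashCode(Log obj) => obj.DoneByEmpId.GetHashCode();
}

public static class DistinctLogsDemo
{
    // One row per employee id, ordered by employee name.
    public static List<Log> DistinctUsers(IEnumerable<Log> logs) =>
        logs.Distinct(new LogByEmpIdComparer())
            .OrderBy(l => l.DoneByEmpName)
            .ToList();

    public static void Main()
    {
        var logs = new List<Log>
        {
            new Log { DoneByEmpId = 2, DoneByEmpName = "Zoe" },
            new Log { DoneByEmpId = 1, DoneByEmpName = "Adam" },
            new Log { DoneByEmpId = 2, DoneByEmpName = "Zoe" },
        };

        foreach (var user in DistinctUsers(logs))
        {
            Console.WriteLine($"{user.DoneByEmpId} {user.DoneByEmpName}");
        }
    }
}
```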

Get all the includes from an Entity Framework Query?

I've the following Entity Model : Employee has a Company and a Company has Employees.
When using the Include statement like below:
var query = context.Employees.Include(e => e.Company);
query.Dump();
All related data is retrieved from the database correctly. (Using LEFT OUTER JOIN on Company table)
The problem is that when I use the GroupBy() from System.Linq.Dynamic to group by Company.Name, the Employees are missing the Company data because the Include is lost.
Example:
var groupByQuery = query.GroupBy("new (Company.Name as CompanyName)", "it");
groupByQuery.Dump();
Is there a way to easily retrieve the applied Includes on the 'query' as a string collection, so that I can include them in the dynamic GroupBy like this:
var groupByQuery2 = query.GroupBy("new (Company, Company.Name as CompanyName)", "it");
groupByQuery2.Dump();
I thought about using the ToString() functionality to get the SQL Command like this:
string sql = query.ToString();
And then use RegEx to extract all LEFT OUTER JOINS, but probably there is a better solution ?
If you're creating the query in the first place, I'd always opt to save the includes (and add to them if you're making a composite query or filtering).
e.g. instead of returning just 'query' return new QueryContext {Query = query, Includes = ...}
I'd like to see a more elegant solution - but I think that's your best bet.
Otherwise you're looking at expression trees, visitors and all those nice things.
SQL parsing isn't that straightforward either, as queries are not always that simple (often a combination of things).
For example, there is a 'span' inside the query object (if you traverse a bit) which seems to hold the Includes, but it's not much help.
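The suggestion above, returning the query together with the includes that were applied to it, might look roughly like the following. This is only a sketch: the generic QueryContext<T> shape and the BuildGroupKey helper are assumptions made for illustration, and the helper merely builds a selector string in the style used in the question without calling System.Linq.Dynamic.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical wrapper: carries a query plus the include paths applied to it.
public sealed class QueryContext<T>
{
    public QueryContext(IQueryable<T> query, IEnumerable<string> includes)
    {
        Query = query;
        Includes = includes.ToList();
    }

    public IQueryable<T> Query { get; }
    public IReadOnlyList<string> Includes { get; }

    // Builds a "new (...)" selector that lists the saved includes before the key.
    public string BuildGroupKey(string keyExpression) =>
        "new (" + string.Join(", ", Includes.Concat(new[] { keyExpression })) + ")";
}

public static class QueryContextDemo
{
    public static void Main()
    {
        // An in-memory queryable stands in for context.Employees.Include(e => e.Company).
        var query = new[] { "Alice", "Bob" }.AsQueryable();
        var ctx = new QueryContext<string>(query, new[] { "Company" });

        // The resulting string could then be passed to a dynamic GroupBy.
        Console.WriteLine(ctx.BuildGroupKey("Company.Name as CompanyName"));
    }
}
```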

LINQ Query to find all tags?

I have an application that manages documents called Notes. Like a blog, Notes can be searched for matches against one or more Tags, which are contained in a Note.Tags collection property. A Tag has Name and ID properties, and matches are made against the ID. A user can specify multiple tags to match against, in which case a Note must contain all Tags specified to match.
I have a very complex LINQ query to perform a Note search, with extension methods and looping. Quite frankly, it has a real code smell to it. I want to rewrite the query with something much simpler. I know that if I made the Tag a simple string, I could use something like this:
var matchingNotes = from n in myNotes
where n.Tags.All(tag => searchTags.Contains(tag))
select n;
Can I do something that simple if my model uses a Tag object with an ID? What would the query look like? Could it be written in fluent syntax, and what would that look like?
I believe you can find notes that have the relevant tags in a single LINQ expression:
IQueryable<Note> query = ... // top part of query
query = query.Where(note => searchTags.All(st =>
note.Tags.Any(notetag => notetag.Id == st.Id)));
Unfortunately, query expression syntax has no keywords for All and Any, so the best you can do there is
query = from note in query
where searchTags.All(st =>
note.Tags.Any(notetag => notetag.Id == st.Id))
select note;
which is not that much better either.
For starters, see my comment; I suspect the query is wrong anyway! I would simplify it by enforcing separately that each tag exists:
IQueryable<Note> query = ... // top part of query
foreach(var tagId in searchTagIds) {
var tmpId = tagId; // modified closures...
query = query.Where(note => note.Tags.Any(t => t.Id == tmpId));
}
This should have the net effect of enforcing all the tags specified are present and accounted for.
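To illustrate the chained-Where approach on in-memory data, here is a small sketch. The Note and Tag records and the TitlesWithAllTags helper are hypothetical, and AsQueryable() stands in for a real database-backed IQueryable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public record Tag(int Id, string Name);
public record Note(string Title, List<Tag> Tags);

public static class TagSearchDemo
{
    // Adds one Where clause per search tag, so a note must carry every tag to survive.
    public static List<string> TitlesWithAllTags(IQueryable<Note> notes, IEnumerable<int> searchTagIds)
    {
        IQueryable<Note> query = notes;
        foreach (var tagId in searchTagIds)
        {
            var tmpId = tagId; // local copy, as in the snippet above
            query = query.Where(note => note.Tags.Any(t => t.Id == tmpId));
        }
        return query.Select(n => n.Title).ToList();
    }

    public static void Main()
    {
        var linq = new Tag(1, "linq");
        var ef = new Tag(2, "ef");
        var sql = new Tag(3, "sql");

        var notes = new List<Note>
        {
            new Note("A", new List<Tag> { linq, ef }),
            new Note("B", new List<Tag> { linq }),
            new Note("C", new List<Tag> { linq, ef, sql }),
        }.AsQueryable();

        // Only notes carrying both tag 1 and tag 2 are returned.
        Console.WriteLine(string.Join(", ", TitlesWithAllTags(notes, new[] { 1, 2 })));
    }
}
```

An empty list of search tags adds no filters, so every note is returned in that case.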
Timwi's solution works in most dialects of LINQ, but not in Linq to Entities. I did find a single-statement LINQ query that works, courtesy of ReSharper. Basically, I wrote a foreach block to do the search, and ReSharper offered to convert the block to a LINQ statement--I had no idea it could do this.
I let ReSharper perform the conversion, and here is what it gave me:
return searchTags.Aggregate<Tag, IQueryable<Note>>(DataStore.ObjectContext.Notes, (current, tag) => current.Where(n => n.Tags.Any(t => t.Id == tag.Id)).OrderBy(n => n.Title));
I read my Notes collection from a database, using Entity Framework 4. DataStore is the custom class I use to manage my EF4 connection; it holds the EF4 ObjectContext as a property.

Can I force the auto-generated Linq-to-SQL classes to use an OUTER JOIN?

Let's say I have an Order table which has a FirstSalesPersonId field and a SecondSalesPersonId field. Both of these are foreign keys that reference the SalesPerson table. For any given order, either one or two salespersons may be credited with the order. In other words, FirstSalesPersonId can never be NULL, but SecondSalesPersonId can be NULL.
When I drop my Order and SalesPerson tables onto the "Linq to SQL Classes" design surface, the class builder spots the two FK relationships from the Order table to the SalesPerson table, and so the generated Order class has a SalesPerson field and a SalesPerson1 field (which I can rename to SalesPerson1 and SalesPerson2 to avoid confusion).
Because I always want to have the salesperson data available whenever I process an order, I am using DataLoadOptions.LoadWith to specify that the two salesperson fields are populated when the order instance is populated, as follows:
dataLoadOptions.LoadWith<Order>(o => o.SalesPerson1);
dataLoadOptions.LoadWith<Order>(o => o.SalesPerson2);
The problem I'm having is that Linq to SQL is using something like the following SQL to load an order:
SELECT ...
FROM Order O
INNER JOIN SalesPerson SP1 ON SP1.salesPersonId = O.firstSalesPersonId
INNER JOIN SalesPerson SP2 ON SP2.salesPersonId = O.secondSalesPersonId
This would make sense if there were always two salesperson records, but because there is sometimes no second salesperson (secondSalesPersonId is NULL), the INNER JOIN causes the query to return no records in that case.
What I effectively want here is to change the second INNER JOIN into a LEFT OUTER JOIN. Is there a way to do that through the UI for the class generator? If not, how else can I achieve this?
(Note that because I'm using the generated classes almost exclusively, I'd rather not have something tacked on the side for this one case if I can avoid it).
Edit: per my comment reply, the SecondSalesPersonId field is nullable (in the DB, and in the generated classes).
The default behaviour actually is a LEFT JOIN, assuming you've set up the model correctly.
Here's a slightly anonymized example that I just tested on one of my own databases:
class Program
{
static void Main(string[] args)
{
using (TestDataContext context = new TestDataContext())
{
DataLoadOptions dlo = new DataLoadOptions();
dlo.LoadWith<Place>(p => p.Address);
context.LoadOptions = dlo;
var places = context.Places.Where(p => p.ID >= 100 && p.ID <= 200);
foreach (var place in places)
{
Console.WriteLine("{0}, {1}", place.ID, place.AddressID);
}
}
}
}
This is just a simple test that prints out a list of places and their address IDs. Here is the query text that appears in the profiler:
SELECT [t0].[ID], [t0].[Name], [t0].[AddressID], ...
FROM [dbo].[Places] AS [t0]
LEFT OUTER JOIN (
SELECT 1 AS [test], [t1].[AddressID],
[t1].[StreetLine1], [t1].[StreetLine2],
[t1].[City], [t1].[Region], [t1].[Country], [t1].[PostalCode]
FROM [dbo].[Addresses] AS [t1]
) AS [t2] ON [t2].[AddressID] = [t0].[AddressID]
WHERE ([t0].[PlaceID] >= @p0) AND ([t0].[PlaceID] <= @p1)
This isn't exactly a very pretty query (your guess is as good as mine as to what that 1 as [test] is all about), but it's definitely a LEFT JOIN and doesn't exhibit the problem you seem to be having. And this is just using the generated classes; I haven't made any changes.
Note that I also tested this on a dual relationship (i.e. a single Place having two Address references, one nullable, one not), and I get the exact same results. The first (non-nullable) gets turned into an INNER JOIN, and the second gets turned into a LEFT JOIN.
It has to be something in your model, like changing the nullability of the second reference. I know you say it's configured as nullable, but maybe you need to double-check? If it's definitely nullable then I suggest you post your full schema and DBML so somebody can try to reproduce the behaviour that you're seeing.
If you make the secondSalesPersonId field in the database table nullable, LINQ-to-SQL should properly construct the Association object so that the resulting SQL statement will do the LEFT OUTER JOIN.
UPDATE:
Since the field is nullable, your problem may be in explicitly declaring dataLoadOptions.LoadWith<>(). I'm running a similar situation in my current project where I have an Order, but the order goes through multiple stages. Each stage corresponds to a separate table with data related to that stage. I simply retrieve the Order, and the appropriate data follows along, if it exists. I don't use the dataLoadOptions at all, and it does what I need it to do. For example, if the Order has a purchase order record, but no invoice record, Order.PurchaseOrder will contain the purchase order data and Order.Invoice will be null. My query looks something like this:
DC.Orders.Where(a => a.Order_ID == id).SingleOrDefault();
I try not to micromanage LINQ-to-SQL...it does 95% of what I need straight out of the box.
UPDATE 2:
I found this post that discusses the use of DefaultIfEmpty() in order to populate child entities with null if they don't exist. I tried it out with LINQPad on my database and converted that example to lambda syntax (since that's what I use):
ParentTable.GroupJoin
(
ChildTable,
p => p.ParentTable_ID,
c => c.ChildTable_ID,
(p, aggregate) => new { p = p, aggregate = aggregate }
)
.SelectMany (a => a.aggregate.DefaultIfEmpty (),
(a, c) => new
{
ParentTableEntity = a.p,
ChildTableEntity = c
}
)
From what I can figure out from this statement, the GroupJoin expression relates the parent and child tables, while the SelectMany expression aggregates the related child records. The key appears to be the use of the DefaultIfEmpty, which forces the inclusion of the parent entity record even if there are no related child records. (Thanks for compelling me to dig into this further...I think I may have found some useful stuff to help with a pretty huge report I've got on my pipeline...)
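The same GroupJoin / SelectMany / DefaultIfEmpty pattern can be tried on in-memory collections. In this sketch the Parent and Child records and the "(none)" placeholder are hypothetical, chosen only to make the left-join behaviour visible.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public record Parent(int Id, string Name);
public record Child(int ParentId, string Label);

public static class LeftJoinDemo
{
    // Left outer join: every parent appears at least once, even with no children.
    public static List<string> LeftJoin(IEnumerable<Parent> parents, IEnumerable<Child> children) =>
        parents
            .GroupJoin(children,
                       p => p.Id,
                       c => c.ParentId,
                       (p, matches) => new { p, matches })
            // DefaultIfEmpty() yields a single null when a parent has no matches.
            .SelectMany(a => a.matches.DefaultIfEmpty(),
                        (a, c) => a.p.Name + ":" + (c == null ? "(none)" : c.Label))
            .ToList();

    public static void Main()
    {
        var parents = new[] { new Parent(1, "A"), new Parent(2, "B") };
        var children = new[] { new Child(1, "x"), new Child(1, "y") };

        foreach (var line in LeftJoin(parents, children))
        {
            Console.WriteLine(line);
        }
    }
}
```

Removing the DefaultIfEmpty() call turns this into an inner join, and parent "B" disappears from the result.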
UPDATE 3:
If the goal is to keep it simple, then it looks like you're going to have to reference those salesperson fields directly in your Select() expression. The reason you're having to use LoadWith<>() in the first place is because the tables are not being referenced anywhere in your query statement, so the LINQ engine won't automatically pull that information in.
As an example, given this structure:
MailingList              ListCompany
===========              ===========
List_ID (PK)             ListCompany_ID (PK)
ListCompany_ID (FK)      FullName (string)
I want to get the name of the company associated with a particular mailing list:
MailingLists.Where(a => a.List_ID == 2).Select(a => a.ListCompany.FullName)
If that association has NOT been made, meaning that the ListCompany_ID field in the MailingList table for that record is equal to null, this is the resulting SQL generated by the LINQ engine:
SELECT [t1].[FullName]
FROM [MailingLists] AS [t0]
LEFT OUTER JOIN [ListCompanies] AS [t1] ON [t1].[ListCompany_ID] = [t0].[ListCompany_ID]
WHERE [t0].[List_ID] = @p0
