All items that match all the words in a collection - linq

I have two lists: a list of type Person and a list of type Profession. The two are related many-to-many.
In addition, I have a third list with some professions.
I would like to select all persons that match all the professions in the third list.
What would the LINQ/lambda expression be?
Thanks

The answer depends on how your sequence of Persons is connected to your sequence of Professions.
You are talking about Lists, but also about many-to-many relation, so I assume your lists are in fact tables in a relational database, with a joining table that remembers which Persons and Professions are related.
If you use Entity Framework and have set up the many-to-many relationship correctly, you don't need to declare the third (junction) table yourself:
class Person
{
public int Id {get; set;}
... // other properties
// every Person has zero or more Professions (many-to-many)
public virtual ICollection<Profession> Professions {get; set;}
}
class Profession
{
public int Id {get; set;}
... // other properties
// every Profession has zero or more Persons (many-to-many)
public virtual ICollection<Person> Persons {get; set;}
}
class MyDbContext : DbContext
{
public DbSet<Person> Persons {get; set;}
public DbSet<Profession> Professions {get; set;}
}
That is all!
Entity Framework will recognize that you are modeling a many-to-many relationship and will create the third table for it. You don't need this third table, just access the ICollections, and entity framework will automatically perform the required joins with the third table.
using (var dbContext = new MyDbContext())
{
IEnumerable<Profession> professionList = ... // the third list
// Keep only the persons that have exactly the Professions from the profession list
// do this by comparing the Ids of the professions in the list
IEnumerable<int> professionIds = professionList
.Select(profession => profession.Id)
.OrderBy(id => id);
var personsWithProfessions = dbContext.Persons
// keep only persons that have the same Profession Ids as professionIds
// first extract the profession Ids the person has
.Where(person => person.Professions
.Select(profession => profession.Id)
// order this in ascending order
.OrderBy(id => id)
// check if equal to professionIds:
.SequenceEqual(professionIds));
}
If you are not using Entity Framework, or the classes are not set up properly with the virtual ICollection, you'll have to do the join between Persons and Professions yourself.
Assuming you have a joining table that joins your Persons and Professions:
class Person_Profession
{
public int Id {get; set;}
public int PersonId {get; set;}
public int ProfessionId {get; set;}
}
IQueryable<Person_Profession> Person_Profession_Table = ...
First group every person with all ProfessionIds in the Person_Profession_Table.
var personsWithProfessionIds = Persons.GroupJoin(Person_Profession_Table,
person => person.Id,
personProfession => personProfession.PersonId,
(person, matchingJoiningItems) => new
{
Person = person,
ProfessionIds = matchingJoiningItems
.Select(matchingJoiningItem => matchingJoiningItem.ProfessionId)
.OrderBy(id => id)
.ToList(),
});
In words: take the two tables: Persons and PersonProfessions. From every person take the Id, from every personProfession element take the PersonId, for every person and all matching personProfessions make one new object: this object contains the matching Person, and all ProfessionIds of the matching joiningItems.
From these Persons with their ProfessionIds, keep only those Persons that have all ProfessionIds in your third list
// professionList is the third list from the question
IEnumerable<int> professionIds = professionList
.Select(profession => profession.Id)
.OrderBy(id => id);
IEnumerable<Person> matchingPersons = personsWithProfessionIds
.Where(personWithProfessionId => personWithProfessionId.ProfessionIds
.SequenceEqual(professionIds))
.Select(personWithProfessionId => personWithProfessionId.Person);
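SequenceEqual on the sorted Ids keeps only persons whose professions are exactly those in the list. If a person with additional professions should also count as a match, a subset test is one alternative. The following is a minimal in-memory sketch, not an Entity Framework query; the tuples, names, and sample data are made up for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data: every person with the Ids of his professions.
var persons = new[]
{
    (Name: "Ann", ProfessionIds: new[] { 1, 2 }),
    (Name: "Bob", ProfessionIds: new[] { 1, 2, 3 }),
    (Name: "Cid", ProfessionIds: new[] { 2 }),
};

// The Ids of the professions in the "third list".
var requiredIds = new HashSet<int> { 1, 2 };

// Exact match: the person has the required professions and no others.
var exactMatches = persons
    .Where(person => requiredIds.SetEquals(person.ProfessionIds))
    .Select(person => person.Name)
    .ToList();

// Superset match: the person has at least all required professions.
var supersetMatches = persons
    .Where(person => requiredIds.IsSubsetOf(person.ProfessionIds))
    .Select(person => person.Name)
    .ToList();

Console.WriteLine(string.Join(", ", exactMatches));    // Ann
Console.WriteLine(string.Join(", ", supersetMatches)); // Ann, Bob
```

Which of the two readings of "match all the professions" is intended is for the asker to decide; the set operations also make the explicit OrderBy unnecessary for in-memory data.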

Assuming your Lists are related by containing member Lists of the other type,
var AllPersons = new List<Person>();
var AllProfessions = new List<Profession>();
var desiredProfessions = new List<Profession>();
var findPersons = from p in AllPersons
// All (not Any): the person must have every desired profession
where desiredProfessions.All(dp => p.Professions.Contains(dp))
select p;

Related

are we able to Include(x=>x.Entity) then Select(x=> new Entity{ A=x.A })

I'm trying to query my CompanyCatalog table together with its FileRepo table. There is a one-to-one relationship between them, and I want to use something like Include(x=>x.FileRepo.Select(a=> new{ FileName=a.FileName} )) or a similar query.
Here is my query:
return TradeTurkDBContext.CompanyCatalog.Include(x=>x.FileRepo
.Select(x=> new FileRepo(FileName=x.FileName))).AsNoTracking().ToList();
I'm trying to do something like that. Is it possible, and if so, how?
So you have a table of CompanyCatalogs and a table of FileRepos. Every CompanyCatalog has exactly one FileRepo (one-to-one), namely the one that the foreign key refers to.
If you've followed the entity framework conventions, you will have classes similar to the following:
class CompanyCatalog
{
public int Id {get; set;}
public string Name {get; set;}
... // other properties
// every CompanyCatalog has one FileRepo, the one that the foreign key refers to
public int FileRepoId {get; set;}
public virtual FileRepo FileRepo {get; set;}
}
class FileRepo
{
public int Id {get; set;}
public string Name {get; set;}
... // other properties
// every FileRepo is the FileRepo of exactly one CompanyCatalog
// namely the one that the foreign key refers to
public int CompanyCatalogId {get; set;}
public virtual CompanyCatalog CompanyCatalog {get; set;}
}
This is enough for entity framework to detect your tables, the columns in the tables and the relations between the tables. If you had a one-to-many, you would have had a virtual ICollection<...> on the "one side". Only if you deviate from the conventions, for instance because you want other table names or other column names, do you need attributes or fluent API.
In entity framework the columns are represented by non-virtual properties. The virtual properties represent the relations between the tables (one-to-many, many-to-many, etc)
Foreign keys are columns in a table, hence they are not virtual. FileRepo is no column in the CompanyCatalogs table, hence it is declared virtual.
You want several properties of CompanyCatalogs, each with several properties of their FileRepos. You use Include for this. This is not necessary. A simple Select will do.
var companyCatalogs = dbContext.CompanyCatalogs
.Where(catalog => ...) // only if you don't want all CompanyCatalogs
.Select(companyCatalog => new
{
// Select only the CompanyCatalog properties that you plan to use:
Id = companyCatalog.Id,
Name = companyCatalog.Name,
...
// Select the FileRepo of this CompanyCatalog as one sub object
FileRepo = new
{
Date = companyCatalog.FileRepo.Date,
Title = companyCatalog.FileRepo.Title,
...
},
// if you want you can select the FileRepo properties one by one:
FileRepoDate = companyCatalog.FileRepo.Date,
FileRepoTitle = companyCatalog.FileRepo.Title,
});
Entity Framework knows your relations, and because you used the virtual properties of the class, it knows it has to perform a (Group-)Join.

Use LINQ to pull data where rows match other rows

I'm pretty new to LINQ, but I'm trying to find a fast way to take a set of data and pull out only rows where particular columns have duplicates in other rows. E.g. in a set of people, pull out only people who share a phone number with another person. Here's a breakdown of what I'm up to:
public class Person
{
public int Id { get; set; }
public string AddressLine1 { get; set; }
public string City { get; set; }
public string PostalCode { get; set; }
public int Province { get; set; }
public string Name { get; set; }
public string Phone { get; set; }
public string Email { get; set; }
public string Fax { get; set; }
public string Web { get; set; }
}
And then I want to sort them in different ways to look for possible duplicates in my input values, so if I want to find where Address Line 1 and Postal Code match, I can sort it like so:
IOrderedEnumerable<Person> sortedPeople = people.OrderBy(x => x.AddressLine1).ThenByDescending(x => x.PostalCode);
And then just go through and bundle any matches together, but there will be a lot of data that doesn't match anything, so if I can cull it out in the first place, it could potentially save a lot of time.
I have a suspicion it will end up costing me more time than it would save, but I figured I'd ask if there's an efficient way.
Your sorting lines won't find duplicates. If you want to find duplicates, you need to make groups of Persons that share something, for instance Persons that have the same PostalCode, or that live in the same City.
Making groups of items that share something is done by using one of the overloads of Enumerable.GroupBy. The most important parameter of GroupBy is parameter keySelector. With this parameter you say what should be the common value for all Persons in the group.
The following will give you a sequence of Groups of Persons. All Persons in the group have the same City. The group is identified by the Key, which has the value of the common element. So you have a Group with all Parisians, with key "Paris"; Another group contains all Amsterdammers with key "Amsterdam", etc
var result = persons.GroupBy(person => person.City);
However, you don't want to keep all groups, you only want to keep groups that have more than one member: those are the groups that have duplicates.
var duplicateCitizens = persons.GroupBy(person => person.City)
.Where(group => group.Skip(1).Any());
In words: first make groups of persons that live in the same City. Then from every group, keep only those groups that have more than one element (= if you skip one element, there are still elements left).
I use the Skip(1).Any() method for efficiency reasons. If you already know after the 1st element that there are duplicates, why continue counting all hundred elements?
The result is a sequence of Groups of more than one Person. All Persons in the group live in the same city. The Key of the group is the City that they have in common.
You can group the results by the phone number:
var query = Persons.GroupBy(p => p.Phone)
.Where(p => p.Count() > 1)
.ToList();
After this you have a list of groups. Each group has a phone number as its key and contains the persons with that number; only numbers that are assigned to more than one person are kept.
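The question asks about matching on two columns at once (Address Line 1 and Postal Code). One way to do that with GroupBy is a composite key built from an anonymous type, which compares by value. A minimal in-memory sketch; the tuples and sample data are made up for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data with only the columns needed here.
var people = new[]
{
    (Name: "Ann", AddressLine1: "1 Main St", PostalCode: "A1A 1A1"),
    (Name: "Bob", AddressLine1: "1 Main St", PostalCode: "A1A 1A1"),
    (Name: "Cid", AddressLine1: "1 Main St", PostalCode: "B2B 2B2"),
    (Name: "Dee", AddressLine1: "9 Elm Rd", PostalCode: "C3C 3C3"),
};

// Group on both columns; keep only groups with more than one member.
var duplicateGroups = people
    .GroupBy(person => new { person.AddressLine1, person.PostalCode })
    .Where(group => group.Skip(1).Any())
    .ToList();

foreach (var group in duplicateGroups)
{
    var names = string.Join(", ", group.Select(person => person.Name));
    Console.WriteLine(group.Key.AddressLine1 + " / " + group.Key.PostalCode + ": " + names);
}
```

Cid shares the address line but not the postal code, so he does not end up in the group with Ann and Bob.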

Querying many to many table in EF Core/LINQ [duplicate]

I have three tables: Posts, Tags and PostTags (link table between Post and Tag). How can I write a query to get all Posts by a TagId?
DB structure:
public class Post {
public string Id {get;set;}
public string Content {get;set;}
public List<PostTag> PostTags {get;set;}
}
public class Tag {
public string Id {get;set;}
public string Name {get;set;}
public List<PostTag> PostTags {get;set;}
}
public class PostTag
{
public string PostId { get; set; }
public Post Post { get; set; }
public string TagId { get; set; }
public Tag Tag { get; set; }
}
Relationships:
builder.Entity<PostTag>()
.HasKey(x => new { x.PostId, x.TagId });
builder.Entity<PostTag>()
.HasOne(st => st.Post)
.WithMany(s => s.PostTags)
.HasForeignKey(st => st.PostId);
builder.Entity<PostTag>()
.HasOne(st => st.Tag)
.WithMany(s => s.PostTags)
.HasForeignKey(st => st.TagId);
If you've followed the entity framework code first conventions, there are two methods to query "Posts with their Tags":
The easy way: Use the virtual ICollection<Tag> to get the tags of each post.
Do the (group-)join yourself.
Use the virtual ICollection
Your classes will be similar to the following:
class Post
{
public int Id {get; set;}
... // other properties
// every Post has zero or more Tags (many-to-many)
public virtual ICollection<Tag> Tags {get; set;}
}
class Tag
{
public int Id {get; set;}
... // other properties
// every Tag is used by zero or more Posts (many-to-many)
public virtual ICollection<Post> Posts {get; set;}
}
This is all that entity framework needs to know the many-to-many relation between Posts and Tags. You don't even have to mention the junction table; entity framework will create a standard table for you and use it whenever needed. Only if you want non-standard names for tables and/or columns do you need attributes or fluent API.
In entity framework, the columns of the tables are represented by the non-virtual properties; the virtual properties represent the relations between the tables (one-to-many, many-to-many, ...)
To get all (or some) Posts, each with all (or some of) their Tags, you can use the virtual ICollection:
var postsWithTheirTags = dbContext.Posts
// only if you don't want all Posts:
.Where(post => ...)
.Select(post => new
{
// Select only the Post properties that you plan to use:
Id = post.Id,
Author = post.Author,
...
Tags = post.Tags.Select(tag => new
{
// again: only the properties that you plan to use
Id = tag.Id,
Text = tag.Text,
...
})
.ToList(),
});
Entity framework knows your relation and will automatically create a Group-join for you using the proper junction table.
This solution seems to me the most natural one.
Do the GroupJoin yourself
For this you need to have access to the junction table, you'll have to mention it in your dbContext, and use fluent API to tell entity framework that this is the junction table for the many-to-many relation between Posts and Tags.
var postsWithTheirTags = dbContext.Posts.GroupJoin(dbContext.PostTags,
post => post.Id, // from every Post take the primary key
postTag => postTag.PostId, // from every PostTag take the foreign key to Post
(post, postTagsOfThisPost) => new
{
// Post properties:
Id = post.Id,
Title = post.Title,
...
Tags = dbContext.Tags.Join(postTagsOfThisPost,
tag => tag.Id, // from every Tag take the primary key
postTag => postTag.TagId, // from every postTagOfThisPost take the foreign key
(tag, postTagOfThisPostAndThisTag) => new
{
Id = tag.Id,
Text = tag.Text,
...
})
.ToList(),
});
You can try this:
public List<Post> GetPosts(string needTagID)
{
var dataQuery = from tags in _db.Tags
where needTagID == tags.Id
join postTags in _db.PostTags on tags.Id equals postTags.TagId
join posts in _db.Posts on postTags.PostId equals posts.Id
select posts;
// execute the query and return the matching Posts
return dataQuery.ToList();
}
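The filter "all Posts that have a given TagId" can also be written without an explicit join, by asking for every Post whether any link row connects it to the tag. The sketch below shows the idea on in-memory data; the tuples and sample values are made up for illustration. With the question's model, the analogous Entity Framework filter would presumably go through post.PostTags, but that is an assumption and is not shown here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data standing in for the Posts and PostTags tables.
var posts = new[]
{
    (Id: "p1", Content: "LINQ basics"),
    (Id: "p2", Content: "EF Core tips"),
    (Id: "p3", Content: "Unrelated"),
};
var postTags = new[]
{
    (PostId: "p1", TagId: "t1"),
    (PostId: "p2", TagId: "t1"),
    (PostId: "p2", TagId: "t2"),
    (PostId: "p3", TagId: "t2"),
};
string wantedTagId = "t1";

// Keep every Post for which at least one link row points to the wanted tag.
var postsWithTag = posts
    .Where(post => postTags.Any(postTag =>
        postTag.PostId == post.Id && postTag.TagId == wantedTagId))
    .ToList();

foreach (var post in postsWithTag)
{
    Console.WriteLine(post.Id + ": " + post.Content);
}
```

Because the filter starts from Posts, every Post appears at most once, even if the link table were to contain repeated rows.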

Linq Table select where IN IQueryable

I have two classes; the first one is:
public class People
{
public int Id {get;set;}
public Dog Dogs {get;set;}
}
public class Dog
{
public int Id {get;set;}
public string Name {get; set;}
public int PeopleId {get;set;}
public bool IsNewborn {get;set;}
}
PeopleId of Dog class is the Id of People class.
Now, with Entity Framework, I retrieve the list of newborn Dogs:
var AllNB_dogs = _dog_repository.Table;
AllNB_dogs = AllNB_dogs.Where(x => x.IsNewborn );
What I need to retrieve now is the list of People that have newborn dogs.
I try with:
var PeopleWithNB = _people_repository.Table.Where(x => AllNB_dogs.Contains(x.Id));
but I know that I cannot pass an int to "Contains"; I need to pass a People object.
I try also with:
var PeopleWithNB = _people_repository.Table.Select(x => ...);
but without success.
Can someone help me? Or there is another way to accomplish this?
I'm using EF Core 2.2.
Assuming you have a relation between People and Dogs, so that you can use Any:
var PeopleWithNB = _people_repository.Table.Where(x => x.Dogs.Any(d => d.IsNewborn)).ToList();
See Relationships in Entity-Framework Core
According to your design every human has exactly one Dog, there are no People without dogs (I'm sure there are a lot of people without dogs), nor people with more than one Dog. I wonder what you would do if someone sells his Dog.
Furthermore, People have a property Dogs, it seems that you wanted to design a one-to-many relationship between People and Dogs. Every Human has zero or more Dogs, every Dog belongs to exactly one Human, namely the Human with PeopleId.
If you really wanted a one-to-one relationship (every Human has exactly one Dog), see below. I'll first handle the one-to-many relation.
One To Many
People have zero or more Dogs, every Dog belongs to exactly one Human, namely the Human that PeopleId refers to.
Usually tables are identified with plural nouns, rows in tables are identified with singular nouns. Because the word People has some problems when explaining, I'll change it to something more standard. For your final solution the identifiers that you select are not important. Most important is that everybody immediately understands what you mean.
Rewriting your classes:
public class Customer
{
public int Id {get;set;}
public string Name {get; set;}
... // other Customer properties
// every Customer has zero or more Dogs (one-to-many)
public virtual ICollection<Dog> Dogs {get;set;}
}
public class Dog
{
public int Id {get;set;}
public string Name {get; set;}
public bool IsNewborn {get;set;}
... // Other Dog properties
// Every Dog is owned by exactly one Customer, the Customer that OwnerId refers to
public int OwnerId {get;set;}
public virtual Customer Owner {get; set;}
}
In Entity framework, the columns of the tables are represented by the non-virtual properties, the virtual properties represent the relations between the tables (one-to-many, one-to-one, many-to-many, ...)
Note that a foreign key is an actual column in your table, hence it is a non-virtual property.
The -to-many part is best represented by an ICollection<...>, not by an IList<...>. The reason for this is that all functionality of an ICollection is supported by Entity Framework: Add / Remove / Count. An IList has functionality that is not supported by entity framework, better not offer it then, don't you agree?
For completeness the DbContext:
class MyDbContext : DbContext
{
public DbSet<Customer> Customers {get; set; }
public DbSet<Dog> Dogs {get; set;}
}
Most people directly access the DbContext. It seems you have a repository class. Alas you forgot to mention this class. Anyway, you have functions like:
IQueryable<Customer> customers = ... // query all Customers
IQueryable<Dog> dogs = ... // query all Dogs
** Back to your question **
Ok, so you have a query to fetch the sequence of several Dogs (in this case: all new born Dogs) and you want the list of Customers that own these Dogs.
If you use entity framework with proper virtual relations, this is quite easy:
// Get all newborn Dogs, each with its owner:
var result = dogs.Where(dog => dog.IsNewborn)
.Select(dog => new
{
// Select only the Dog properties that you plan to use:
Id = dog.Id,
Name = dog.Name,
...
Owner = new
{
// Select only the properties that you plan to use:
Id = dog.Owner.Id,
Name = dog.Owner.Name,
...
},
});
In words: from the sequence of Dogs, keep only the newborn dogs. From every remaining Dog, select properties Id, Name. From the Owner of this Dog, select properties Id and Name.
You see: by using proper plural and singular nouns, and by using the virtual relations, the query seems to be very natural.
The problem with this is that, if Customer[4] has two new born dogs, you get this Customer twice. It would be better to query:
Give me all Customers with their newborn Dogs.
Whenever you have a one-to-many relationship, like Customers with their Dogs, Schools with their Students, Orders with their Products, so whenever you want items with their sub-items, it is best to start at the one side (Customers) and use a GroupJoin.
var customersWithTheirNewbornDogs = customers.Select(customer => new
{
// Select only the Customer properties that you plan to use
Id = customer.Id,
...
NewbornDogs = customer.Dogs.Where(dog => dog.IsNewborn)
.Select(dog => new
{
Id = dog.Id,
...
// not needed, you already know the value:
// OwnerId = dog.OwnerId,
})
.ToList(),
})
// Result: all Customers, each with their newborn Dogs,
// even if the Customer has no newborn Dogs.
// if you only want Customers that have newborn Dogs:
.Where(customer => customer.NewbornDogs.Any());
In words: from every Customer take some properties, and from all Dogs that he owns, keep only the newborn ones. From the remaining ones, take some properties. Finally, keep only Customers that have at least one newborn Dog.
** but I'm using entity framework CORE! **
I've heard from some people who can't use the virtual properties when using EF-core. In that case, you'll have to join the tables yourself:
Starting on the one side: GroupJoin:
// GroupJoin Customers and newbornDogs
var customersWithTheirNewbornDogs = customers
.GroupJoin(dogs.Where(dog => dog.IsNewborn),
customer => customer.Id, // from every Customer take the Id
dog => dog.OwnerId, // from every Dog take the foreign key
(customer, dogsOfThisCustomer) => new // when they match, take the customer and his Dogs
{ // to make one new object
Id = customer.Id,
...
NewBornDogs = dogsOfThisCustomer.Select(dog => new
{
Id = dog.Id,
...
}),
});
Or if you want to start on the many side: Join
var newbornDogsWithTheirOwners = dogs.Where(dog => dog.IsNewborn)
.Join(customers,
dog => dog.OwnerId, // from every Dog take the OwnerId,
customer => customer.Id, // from every Customer take the Id,
(dog, owner) => new // when they match, take the dog and the owner
{ // to make one new
Id = dog.Id,
...
Owner = new
{
Id = owner.Id,
...
}
});
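To see how the GroupJoin variant avoids listing an owner twice, it can be tried on in-memory data. This is a minimal sketch with made-up tuples and sample values; it uses LINQ to Objects, not Entity Framework:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data standing in for the Customers and Dogs tables.
var customers = new[]
{
    (Id: 1, Name: "Ann"),
    (Id: 2, Name: "Bob"),
    (Id: 3, Name: "Cid"),
};
var dogs = new[]
{
    (Id: 10, Name: "Rex", OwnerId: 1, IsNewborn: true),
    (Id: 11, Name: "Fido", OwnerId: 1, IsNewborn: true),
    (Id: 12, Name: "Max", OwnerId: 2, IsNewborn: false),
};

// Start at the "one" side: every Customer with his newborn Dogs,
// then keep only the Customers that have at least one newborn Dog.
var ownersOfNewborns = customers
    .GroupJoin(dogs.Where(dog => dog.IsNewborn),
        customer => customer.Id,   // primary key of the Customer
        dog => dog.OwnerId,        // foreign key of the Dog
        (customer, newbornDogs) => new
        {
            customer.Id,
            customer.Name,
            NewbornDogNames = newbornDogs.Select(dog => dog.Name).ToList(),
        })
    .Where(customer => customer.NewbornDogNames.Any())
    .ToList();

foreach (var owner in ownersOfNewborns)
{
    Console.WriteLine(owner.Name + ": " + string.Join(", ", owner.NewbornDogNames));
}
```

Ann owns two newborn dogs but appears only once, with both dogs in her list. Bob's dog is not a newborn and Cid has no dogs, so neither is in the result.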
One-to-One
If in your world every People has exactly one Dog, there are no People without Dogs, and no one has several Dogs, then you can do the Join similar to the one above.
(by the way, see how strange it sounds if you don't use proper plurals and singular nouns?)
var newbornDogsWithTheirOwners = dogs.Where(dog => dog.IsNewborn)
.Join(People,
dog => dog.PeopleId, // from every Dog take the foreign key,
customer => customer.Id, // from every Customer take the Id,
(dog, owner) => new // when they match, take the dog and the owner
{ // to make one new
Id = dog.Id,
...
Owner = new
{
Id = owner.Id,
...
}
});

How to combine linq queries and selecting distinct records?

I have an object that can have a single user assigned to it or a work group. A user may be assigned directly or through a work group, but the object can never have both set.
public class Procedure
{
.....
public Guid? AssignedToId {get;set;} //Foreign Key to AssignedTo
public Contact AssignedTo {get;set;} //Single user assignment
public Guid? AssignedWorkGroupId {get;set;} //Foreign Key to AssignedWorkGroup
public WorkGroup AssignedWorkGroup {get;set;} //Multiple user assignment
public Guid? AssignedBuisnessPartnerId {get;set;}
public BusinessPartner AssignedBuisnessPartner {get;set;}
}
I am trying to figure out how to write a single query that finds procedures where a user is assigned directly or is part of a work group that is assigned. Currently I run 2 separate queries and combine the lists I get back. That works, but it is probably not as efficient as it could be.
Here is what I have now:
var procedures = _procedureRepository.Get(p => p.AssignedToId == assignedId).ToList();
procedures.AddRange(_procedureRepository.Get(p => p.AssignedWorkGroup.Contacts.Select(c => c.Id).Contains(assignedId) || p.AssignedBuisnessPartner.Contacts.Select(c => c.Id).Contains(assignedId)));
It looks like you are looking for a UNION ALL in SQL, which is equivalent to Concat in LINQ. The following code will only execute one call to the database. I'm not sure whether it will be faster than your current method.
var procedures2 = _procedureRepository.Get(p => p.AssignedWorkGroup.Contacts
.Select(c => c.Id)
.Contains(assignedId) ||
p.AssignedBuisnessPartner.Contacts
.Select(c => c.Id)
.Contains(assignedId));
var procedures = _procedureRepository.Get(p => p.AssignedToId == assignedId)
.Concat(procedures2);
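Concat keeps every element of both sequences, so a record returned by both queries would appear twice. According to the question that cannot happen for these assignments; if it could, Union (or Concat followed by Distinct) removes the repeats. A minimal in-memory sketch of the difference, using made-up procedure Ids instead of the repository:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical Ids of procedures found by the two separate queries.
var directlyAssigned = new[] { 1, 2, 3 };
var assignedViaGroup = new[] { 3, 4 };

// Concat: plain concatenation, duplicates are kept (like UNION ALL).
var withDuplicates = directlyAssigned.Concat(assignedViaGroup).ToList();

// Union: set union, every Id at most once (like UNION).
var distinctIds = directlyAssigned.Union(assignedViaGroup).ToList();

Console.WriteLine(string.Join(", ", withDuplicates)); // 1, 2, 3, 3, 4
Console.WriteLine(string.Join(", ", distinctIds));    // 1, 2, 3, 4
```

For entity objects rather than ints, what counts as "the same record" depends on how equality is defined for the type or on how the provider translates the query, so this is worth checking against the real repository.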

Resources