Say I have a forum system with Threads, Posts and Tags.
The structure is the same as StackOverflow: Threads have a 1-many relationship to Posts, and Tags have a many-many relationship to Threads.
The (simplified) tables:
Thread
------
ThreadID int PK
Title varchar(200)
Tag
----
TagID int PK
Name varchar(50)
ThreadTag
-----------
ThreadTagID int PK
ThreadID int FK
TagID int FK
So SubSonic ActiveRecord templates generate my classes for me.
Code
For the front page I need to get a list of Threads, and attach to each of these its list of related Tags. Leaving the posts count aside, what is the best way to retrieve the Tags and build this object graph?
If I get the threads like:
var threadQuery = Thread.All().Skip(x).Take(n);
var threadList = threadQuery.ToList();
Should I add an "IList<Tag> Tags" property to a partial of the Thread class?
And to retrieve the right tags, should I execute two queries: one to get the ThreadTags and one to get the Tags themselves: e.g.
var tagLinks = (from t in threadQuery
join l in ThreadTag.All() on t.ThreadID equals l.ThreadID
select l).ToList();
var tags = (from t in threadQuery
join l in ThreadTag.All() on t.ThreadID equals l.ThreadID
join tg in Tag.All() on l.TagID equals tg.TagID
select tg).ToList();
...and then use these lists to sort the tags into the correct Thread.Tags list?
Is there a better way? I don't think I can use the IQueryable properties generated by SubSonic using the foreign keys, as that would trigger a database call for each of the Threads in my list.
After using SubSonic for a while, I think this is the best way. For large complex object graphs Entity Framework seems better suited.
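As a rough sketch of the in-memory stitching step described in the question: the classes below are plain stand-ins, not the SubSonic-generated ones (ForumThread, ThreadTag, Tag and TagAttacher are hypothetical names that only mirror the columns above), and the three lists are assumed to have been loaded already by the paged queries.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the generated classes; only the columns from the question.
public class Tag { public int TagID; public string Name; }
public class ThreadTag { public int ThreadID; public int TagID; }
public class ForumThread
{
    public int ThreadID;
    public string Title;
    public IList<Tag> Tags = new List<Tag>();
}

public static class TagAttacher
{
    // Distributes already-loaded tags onto already-loaded threads, entirely in memory.
    public static void Attach(
        IList<ForumThread> threads,
        IEnumerable<ThreadTag> links,
        IEnumerable<Tag> tags)
    {
        // The joined tag query can return the same tag once per thread, so de-duplicate by ID.
        var tagsById = tags.GroupBy(t => t.TagID)
                           .ToDictionary(g => g.Key, g => g.First());

        // One pass over the link rows; a missing key yields an empty sequence.
        var linksByThread = links.ToLookup(l => l.ThreadID);

        foreach (var thread in threads)
        {
            thread.Tags = linksByThread[thread.ThreadID]
                .Where(l => tagsById.ContainsKey(l.TagID))
                .Select(l => tagsById[l.TagID])
                .ToList();
        }
    }
}
```

This keeps the database work at a fixed number of queries per page, regardless of how many threads are shown.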
Related
I've contrived this example because it's an easily digested version of the actual problem I'm trying to solve. Here are the classes and their relationships.
First we have a Country class that contains a Dictionary of State objects indexed by a string (their name or abbreviation for example). The contents of the State class are irrelevant:
class Country
{
Dictionary<string, State> states;
}
class State { ... }
We also have a Company class which contains a Dictionary of zero or more BranchOffice objects also indexed by state names or abbreviations.
class Company
{
Dictionary<string, BranchOffice> branches;
}
class BranchOffice { ... }
The instances we're working with are one Country object and an array of Company objects:
Country usa;
Company[] companies;
What I want is an array of the State objects which contain a branch. The LINQ I wrote is below. First it grabs all the companies which actually contain a branch, then joins to the list of states by comparing the keys of both lists.
The problem is that ToArray returns an anonymous type. I understand why anonymous types can't be cast to strong types. I'm trying to figure out whether I could change something to get back a strongly typed array. (And I'm open to suggestions about better ways to write the LINQ overall.)
I've tried casting to BranchOffice all over the place (up front, at list2, at the final select, and other less-likely candidates).
BranchOffice[] offices =
(from cm in companies
where cm.branches.Count > 0
select new {
list2 =
(from br in cm.branches
join st in usa.states on br.Key equals st.Key
select st.Value
)
}
).ToArray();
You can do:
select new MyClassOfSomeType {
..
}
In the select, you can project into a custom class type instead of an anonymous type, and then call ToList (or ToArray) to get a strongly typed result. If you start from a loosely typed collection such as ArrayList, you can make it strongly typed later using Cast<>, though only for a select result that doesn't generate an anonymous class.
HTH.
If I understand the problem correctly, you want just the states that have branch offices in them, not the branches too. If so, one possible LINQ query is the following:
State[] offices =
(from cm in companies
where cm.branches.Count > 0
from br in cm.branches
join st in usa.states on br.Key equals st.Key
select st.Value
).Distinct().ToArray();
If you want both the states and the branches, then you will have to do a group by, and the result will be an IEnumerable<IGrouping<State, BranchOffice>>, which you can process afterwards.
var statesAndBranches =
from cm in companies
where cm.branches.Count > 0
from br in cm.branches
join st in usa.states on br.Key equals st.Key
group br.Value by st.Value into g
select g;
Just one more thing: even though you have states and branches declared as dictionaries, they are used here as IEnumerable (from keyValuePair in dictionary), so you will not get any perf benefit from them.
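If the keyed lookup matters, one option is to probe the states dictionary directly instead of joining against it. This is only a sketch: State, BranchOffice and BranchStates are minimal stand-ins, and the companies are passed in as their branches dictionaries because the fields in the question are private.

```csharp
using System.Collections.Generic;
using System.Linq;

// Minimal stand-ins for the classes in the question.
public class State { public string Name; }
public class BranchOffice { }

public static class BranchStates
{
    // Returns the distinct states in which at least one company has a branch.
    public static State[] Find(
        Dictionary<string, State> states,
        IEnumerable<Dictionary<string, BranchOffice>> branchesPerCompany)
    {
        return branchesPerCompany
            .SelectMany(branches => branches.Keys)    // every state key that has a branch
            .Distinct()                               // one entry per state key
            .Where(key => states.ContainsKey(key))    // hashed lookup instead of a join
            .Select(key => states[key])
            .ToArray();
    }
}
```

Companies with no branches contribute no keys, so the separate Count > 0 filter is not needed here.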
Here is the expression
x => x.stf_Category.CategoryID == categoryId
x refers to a Product entity that contains a Category. I am trying to load all Products that match a given categoryId.
In the db the Product table contains a Foreign Key reference to Category (via CategoryId).
Question: I think I am doing it wrong. Is there something else one has to do in EF4 to create a LINQ expression of this type?
Are there any good examples of EF4 LINQ expressions out there? Specifically, something that queries on the basis of related entities, as in my problem?
Thanks !
You're looking for the Include method.
var query = db.Products.Include("Categories");
This is commonly referred to as eager loading.
Entity Framework will 'infer' the JOIN constraint based on the mapping you have specified.
The "magic string" needs to match the name of the navigation property on your entity in the EDMX (which is not necessarily the Entity Set name).
Check out this post for more info.
EDIT
I'm a little confused as to whether you want the Products and Categories, or just the Products which have a specific Category ID.
If the latter, this is the way to go:
var query = from p in db.products
join c in db.categories
on p.CategoryId equals c.CategoryId
where c.CategoryId == someCategoryId
select p;
Keep in mind though, the above query returns exactly the same result as your original query.
If p is a product, then p.Categories will look at the navigation property of your Product entity on the EDMX, which in this case is backed by your Category FK.
As long as you set up your navigation properties correctly, p.Categories is fine.
If you are using EF4 and the association between the Category and Product classes has been picked up and defined in your model, then all products with a specific categoryID can be selected as simply as:
x => x.CategoryID == categoryID
You need neither a join nor eager loading for that.
Ok this should be really simple, but I am doing my head in here and have read all the articles on this and tried a variety of things, but no luck.
I have 3 tables in a classic many-to-many setup.
ITEMS
ItemID
Description
ITEMFEATURES
ItemID
FeatureID
FEATURES
FeatureID
Description
Now I have a search interface where you can select any number of Features (checkboxes).
I get them all nicely as an int[] called SearchFeatures.
I simply want to find the Items which have the Features that are contained in the SearchFeatures.
E.g. something like:
return db.Items.Where(x => SearchFeatures.Contains(x.ItemFeatures.AllFeatures().FeatureID))
Inside my Items partial class I have added a custom method Features() which simply returns all Features for that Item, but I still can't seem to integrate that in any usable way into the main LINQ query.
Grr, it's gotta be simple, such a 1 second task in SQL. Many thanks.
The following query will return the list of items based on the list of searchFeatures:
from itemFeature in db.ItemFeatures
where searchFeatures.Contains(itemFeature.FeatureID)
select itemFeature.Item;
The trick here is to start with the ItemFeatures table.
It is possible to search items that have ALL features, as you asked in the comments. The trick here is to dynamically build up the query. See here:
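// Start from the items; each pass of the loop adds one more filter.
IQueryable<Item> items = db.Items;
foreach (var temp in searchFeatures)
{
    // You will need this extra variable so that each filter captures its own value
    // (before C# 5 the foreach variable is shared by all closures). This is C# magic ;-).
    var searchFeature = temp;
    // Wrap the query with a filter: the item must have this feature.
    // Filtering ItemFeatures rows on FeatureID repeatedly would return nothing,
    // because a single row can never equal two different feature IDs.
    items =
        from item in items
        where item.ItemFeatures.Any(f => f.FeatureID == searchFeature)
        select item;
}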
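The difference between "has any of the selected features" and "has all of them" is easy to check in memory first. The sketch below uses LINQ to Objects over plain link rows; ItemFeatureRow and FeatureSearch are hypothetical names, not the generated LINQ to SQL classes, and the "all" variant uses a group-and-count approach rather than the chained filters above.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for one row of the ITEMFEATURES link table.
public class ItemFeatureRow { public int ItemID; public int FeatureID; }

public static class FeatureSearch
{
    // IDs of items that have at least one of the selected features.
    public static List<int> ItemsWithAny(IEnumerable<ItemFeatureRow> rows, int[] searchFeatures)
    {
        return rows.Where(r => searchFeatures.Contains(r.FeatureID))
                   .Select(r => r.ItemID)
                   .Distinct()
                   .OrderBy(id => id)
                   .ToList();
    }

    // IDs of items that have every selected feature.
    // Note: with an empty selection this returns no items at all.
    public static List<int> ItemsWithAll(IEnumerable<ItemFeatureRow> rows, int[] searchFeatures)
    {
        var wanted = searchFeatures.Distinct().ToArray();
        return rows.Where(r => wanted.Contains(r.FeatureID))
                   .GroupBy(r => r.ItemID)
                   // An item qualifies when its matching distinct features cover the whole selection.
                   .Where(g => g.Select(r => r.FeatureID).Distinct().Count() == wanted.Length)
                   .Select(g => g.Key)
                   .OrderBy(id => id)
                   .ToList();
    }
}
```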
Let's say I have an Order table which has a FirstSalesPersonId field and a SecondSalesPersonId field. Both of these are foreign keys that reference the SalesPerson table. For any given order, either one or two salespersons may be credited with the order. In other words, FirstSalesPersonId can never be NULL, but SecondSalesPersonId can be NULL.
When I drop my Order and SalesPerson tables onto the "Linq to SQL Classes" design surface, the class builder spots the two FK relationships from the Order table to the SalesPerson table, and so the generated Order class has a SalesPerson field and a SalesPerson1 field (which I can rename to SalesPerson1 and SalesPerson2 to avoid confusion).
Because I always want to have the salesperson data available whenever I process an order, I am using DataLoadOptions.LoadWith to specify that the two salesperson fields are populated when the order instance is populated, as follows:
dataLoadOptions.LoadWith<Order>(o => o.SalesPerson1);
dataLoadOptions.LoadWith<Order>(o => o.SalesPerson2);
The problem I'm having is that Linq to SQL is using something like the following SQL to load an order:
SELECT ...
FROM Order O
INNER JOIN SalesPerson SP1 ON SP1.salesPersonId = O.firstSalesPersonId
INNER JOIN SalesPerson SP2 ON SP2.salesPersonId = O.secondSalesPersonId
This would make sense if there were always two salesperson records, but because there is sometimes no second salesperson (secondSalesPersonId is NULL), the INNER JOIN causes the query to return no records in that case.
What I effectively want here is to change the second INNER JOIN into a LEFT OUTER JOIN. Is there a way to do that through the UI for the class generator? If not, how else can I achieve this?
(Note that because I'm using the generated classes almost exclusively, I'd rather not have something tacked on the side for this one case if I can avoid it).
Edit: per my comment reply, the SecondSalesPersonId field is nullable (in the DB, and in the generated classes).
The default behaviour actually is a LEFT JOIN, assuming you've set up the model correctly.
Here's a slightly anonymized example that I just tested on one of my own databases:
class Program
{
static void Main(string[] args)
{
using (TestDataContext context = new TestDataContext())
{
DataLoadOptions dlo = new DataLoadOptions();
dlo.LoadWith<Place>(p => p.Address);
context.LoadOptions = dlo;
var places = context.Places.Where(p => p.ID >= 100 && p.ID <= 200);
foreach (var place in places)
{
Console.WriteLine("{0}, {1}", place.ID, place.AddressID);
}
}
}
}
This is just a simple test that prints out a list of places and their address IDs. Here is the query text that appears in the profiler:
SELECT [t0].[ID], [t0].[Name], [t0].[AddressID], ...
FROM [dbo].[Places] AS [t0]
LEFT OUTER JOIN (
SELECT 1 AS [test], [t1].[AddressID],
[t1].[StreetLine1], [t1].[StreetLine2],
[t1].[City], [t1].[Region], [t1].[Country], [t1].[PostalCode]
FROM [dbo].[Addresses] AS [t1]
) AS [t2] ON [t2].[AddressID] = [t0].[AddressID]
WHERE ([t0].[PlaceID] >= @p0) AND ([t0].[PlaceID] <= @p1)
This isn't exactly a very pretty query (your guess is as good as mine as to what that 1 AS [test] is all about), but it's definitely a LEFT JOIN and doesn't exhibit the problem you seem to be having. And this is just using the generated classes; I haven't made any changes.
Note that I also tested this on a dual relationship (i.e. a single Place having two Address references, one nullable, one not), and I get the exact same results. The first (non-nullable) gets turned into an INNER JOIN, and the second gets turned into a LEFT JOIN.
It has to be something in your model, like changing the nullability of the second reference. I know you say it's configured as nullable, but maybe you need to double-check? If it's definitely nullable then I suggest you post your full schema and DBML so somebody can try to reproduce the behaviour that you're seeing.
If you make the secondSalesPersonId field in the database table nullable, LINQ-to-SQL should properly construct the Association object so that the resulting SQL statement will do the LEFT OUTER JOIN.
UPDATE:
Since the field is nullable, your problem may be in explicitly declaring dataLoadOptions.LoadWith<>(). I'm running a similar situation in my current project where I have an Order, but the order goes through multiple stages. Each stage corresponds to a separate table with data related to that stage. I simply retrieve the Order, and the appropriate data follows along, if it exists. I don't use the dataLoadOptions at all, and it does what I need it to do. For example, if the Order has a purchase order record, but no invoice record, Order.PurchaseOrder will contain the purchase order data and Order.Invoice will be null. My query looks something like this:
DC.Orders.Where(a => a.Order_ID == id).SingleOrDefault();
I try not to micromanage LINQ-to-SQL...it does 95% of what I need straight out of the box.
UPDATE 2:
I found this post that discusses the use of DefaultIfEmpty() in order to populate child entities with null if they don't exist. I tried it out with LINQPad on my database and converted that example to lambda syntax (since that's what I use):
ParentTable.GroupJoin
(
ChildTable,
p => p.ParentTable_ID,
c => c.ChildTable_ID,
(p, aggregate) => new { p = p, aggregate = aggregate }
)
.SelectMany (a => a.aggregate.DefaultIfEmpty (),
(a, c) => new
{
ParentTableEntity = a.p,
ChildTableEntity = c
}
)
From what I can figure out from this statement, the GroupJoin expression relates the parent and child tables, while the SelectMany expression flattens each parent's group of related child records back into parent/child pairs. The key appears to be the use of DefaultIfEmpty, which forces the inclusion of the parent entity record even if there are no related child records. (Thanks for compelling me to dig into this further...I think I may have found some useful stuff to help with a pretty huge report I've got in my pipeline...)
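The same GroupJoin / DefaultIfEmpty pattern can be tried without a database. The sketch below runs it over in-memory lists with LINQ to Objects; Parent, Child and LeftJoinDemo are made-up names for illustration, and the SQL that LINQ to SQL generates for the equivalent query is not shown here.

```csharp
using System.Collections.Generic;
using System.Linq;

// Made-up entities for illustration only.
public class Parent { public int ParentID; public string Name; }
public class Child { public int ParentID; public string Name; }

public static class LeftJoinDemo
{
    // Returns "parent:child" strings; the child part is "(none)" when no child matches,
    // which is the left-outer-join behaviour.
    public static List<string> Pairs(IEnumerable<Parent> parents, IEnumerable<Child> children)
    {
        return parents
            .GroupJoin(children,
                       p => p.ParentID,
                       c => c.ParentID,
                       (p, matches) => new { p, matches })
            // DefaultIfEmpty yields a single null child for parents with no matches,
            // so those parents still appear in the flattened result.
            .SelectMany(x => x.matches.DefaultIfEmpty(),
                        (x, c) => x.p.Name + ":" + (c == null ? "(none)" : c.Name))
            .ToList();
    }
}
```

Dropping the DefaultIfEmpty() call turns this back into an inner join: parents without children disappear from the output.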
UPDATE 3:
If the goal is to keep it simple, then it looks like you're going to have to reference those salesperson fields directly in your Select() expression. The reason you're having to use LoadWith<>() in the first place is because the tables are not being referenced anywhere in your query statement, so the LINQ engine won't automatically pull that information in.
As an example, given this structure:
MailingList            ListCompany
===========            ===========
List_ID (PK)           ListCompany_ID (PK)
ListCompany_ID (FK)    FullName (string)
I want to get the name of the company associated with a particular mailing list:
MailingLists.Where(a => a.List_ID == 2).Select(a => a.ListCompany.FullName)
If that association has NOT been made, meaning that the ListCompany_ID field in the MailingList table for that record is equal to null, this is the resulting SQL generated by the LINQ engine:
SELECT [t1].[FullName]
FROM [MailingLists] AS [t0]
LEFT OUTER JOIN [ListCompanies] AS [t1] ON [t1].[ListCompany_ID] = [t0].[ListCompany_ID]
WHERE [t0].[List_ID] = @p0
I need to insert to two tables in a single query. Is this possible to do in LINQ?
At present I am calling InsertOnSubmit() twice.
If your tables have a primary key/foreign key relationship to each other, then you also have two objects which you can link to each other:
InternetStoreDataContext db = new InternetStoreDataContext();
Category c = new Category();
c.name = "Accessories";
Product p = new Product();
p.name = "USB Mouse";
c.Products.Add(p);
//and finally
db.Categories.InsertOnSubmit(c);
db.SubmitChanges();
That adds your object and all linked objects when submitting the changes.
Note that for that to work, you must have a primary key in both tables. Otherwise LINQ doesn't offer you the linking possibility.
Here are good examples of using LINQ to SQL: http://weblogs.asp.net/scottgu/archive/2007/05/19/using-linq-to-sql-part-1.aspx
The database submit doesn't happen until you call SubmitChanges. There is no tangible cost associated with multiple calls to InsertOnSubmit - so why not just do that?
This will still result in two TSQL INSERT commands - it simply isn't possible to insert into two tables in a single regular INSERT command.