I've contrived this example because it's an easily digested version of the actual problem I'm trying to solve. Here are the classes and their relationships.
First we have a Country class that contains a Dictionary of State objects indexed by a string (their name or abbreviation for example). The contents of the State class are irrelevant:
class Country
{
    Dictionary<string, State> states;
}
class State { ... }
We also have a Company class which contains a Dictionary of zero or more BranchOffice objects also indexed by state names or abbreviations.
class Company
{
    Dictionary<string, BranchOffice> branches;
}
class BranchOffice { ... }
The instances we're working with are one Country object and an array of Company objects:
Country usa;
Company[] companies;
What I want is an array of the State objects which contain a branch. The LINQ I wrote is below. First it grabs all the companies which actually contain a branch, then joins to the list of states by comparing the keys of both lists.
The problem is that ToArray returns an anonymous type. I understand why anonymous types can't be cast to strong types. I'm trying to figure out whether I could change something to get back a strongly typed array. (And I'm open to suggestions about better ways to write the LINQ overall.)
I've tried casting to BranchOffice all over the place (up front, at list2, at the final select, and other less-likely candidates).
BranchOffice[] offices =
    (from cm in companies
     where cm.branches.Count > 0
     select new {
         list2 =
             (from br in cm.branches
              join st in usa.states on br.Key equals st.Key
              select st.Value)
     }).ToArray();

You can do:
select new MyClassOfSomeType { ... }
In the select, you can project into a custom class type instead of an anonymous one, and then call ToList or ToArray on the result. If you need to keep the result loosely typed (in an ArrayList, for example), you can make it strongly typed later using Cast<>, though only for a select that doesn't generate an anonymous class.

If I understand the problem correctly, you want just the states that have branch offices in them, not the branches too. If so, one possible LINQ query is the following:
State[] offices =
    (from cm in companies
     where cm.branches.Count > 0
     from br in cm.branches
     join st in usa.states on br.Key equals st.Key
     select st.Value
    ).ToArray();
If you want both the states and the branches, then you will have to do a group by, and the result will be an IEnumerable<IGrouping<State, BranchOffice>>, which you can process afterwards.
var statesAndBranches =
from cm in companies
where cm.branches.Count > 0
from br in cm.branches
join st in usa.states on br.Key equals st.Key
group br.Value by st.Value into g
select g;
Just one more thing: even though you have states and branches declared as dictionaries, the query uses them as IEnumerable (from keyValuePair in dictionary), so you will not get any perf benefit from their keyed lookup.
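To illustrate that last point, here is a minimal sketch of a version that does use the dictionary's keyed lookup through TryGetValue. The fields are made public and the State.Name field, the BranchQueries class, and the StatesWithBranches method are all hypothetical names added for the example; they are not part of the original question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class State { public string Name = ""; }
public class BranchOffice { }

public class Country
{
    public Dictionary<string, State> states = new Dictionary<string, State>();
}

public class Company
{
    public Dictionary<string, BranchOffice> branches = new Dictionary<string, BranchOffice>();
}

public static class BranchQueries
{
    // Hypothetical helper: returns every state that hosts at least one branch, once each.
    public static State[] StatesWithBranches(Country country, IEnumerable<Company> companies)
    {
        var found = new HashSet<State>();
        foreach (var key in companies.SelectMany(cm => cm.branches.Keys))
        {
            // TryGetValue uses the dictionary's hash index instead of
            // enumerating its key/value pairs as the join does.
            if (country.states.TryGetValue(key, out var state))
            {
                found.Add(state);
            }
        }
        return found.ToArray();
    }
}

public static class Demo
{
    public static void Main()
    {
        var usa = new Country();
        usa.states["WA"] = new State { Name = "Washington" };
        usa.states["OR"] = new State { Name = "Oregon" };

        var acme = new Company();
        acme.branches["WA"] = new BranchOffice();

        foreach (var state in BranchQueries.StatesWithBranches(usa, new[] { acme }))
        {
            Console.WriteLine(state.Name);
        }
    }
}
```

The HashSet also removes duplicates when several companies have a branch in the same state, which the join-based query above does not do on its own. Branch keys with no matching state are silently skipped, the same way the inner join skips them.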


Realm Xamarin LINQ Select

Is there a way to restrict the "columns" returned from a Realm Xamarin LINQ query?
For example, if I have a Customer RealmObject and I want a list of all customer names, do I have to query All<Customer> and then enumerate the results to build the names list? That seems cumbersome and inefficient. I am not seeing anything in the docs. Am I missing something obvious here? Thanks!
You have to remember that Realm is an object-based store. In an RDBMS like SQLite, restricting the returned results to a subset of the "columns" of a "record" makes sense, but in an object store you would be removing attributes from the original class, which means creating a new dynamic class and then instantiating objects of that new class.
Thus, if you want just a List of strings representing the customer names, you can do this:
List<string> names = theRealm.All<Customer>().ToList().Select(customer => customer.Name).ToList();
Note that you convert the Realm.All<> results to a List first and then use a LINQ Select to "filter" just the property that you want. Using a .Select directly on a RealmResults is not currently supported (v0.80.0).
If you need to return a complex type that is a subset of attributes from the original RealmObject, assuming you have a matching POCO, you can use:
var custNames = theRealm.All<Customer>().ToList().Select((Customer c) => new Name() { firstName = c.firstName, lastName = c.lastName } );
Remember, once you convert a RealmResult to a static list of POCOs, you lose the live-updating behaviour of RealmObjects.
Personally, I avoid doing this whenever possible, as Realm is so fast that using a RealmResult, and thus the RealmObjects directly, is more efficient in processing time and memory overhead than converting those to POCOs every time you need a new list...

update a property value during linq to sql select (involves join)

Ok I have seen many questions that based on their text could be something like this but not quite. Say I have something like this
(from r in reports
join u in SECSqlClient.DataContext.GetTable<UserEntity>()
on r.StateUpdateReportUserID equals u.lngUserID
select r).
If reports is a list of, say, reportDTO objects, and I want to select from that list but at the same time set one property to a property of userEntity, how would I do that? Basically, I want all other fields on the report maintained, but with the user name set from the user table. (There is a reason this is not done in one big query that gets a list of reports.)
What I am looking for is something like Select r).Something(SOME LAMBDA TO SET ONE FIELD TO userEntity property).
There is a dirty way to do this, which is
var repQuery = from r in reports ... select new { r, u };
var reps = repQuery.Select(x => { x.r.Property1 = x.u.Property1; return x.r; });
However, when it comes to functional programming (which LINQ is, arguably), I like to adhere to its principles, one of which is to avoid side effects in functions. A side effect is a change in state outside the function body, in this case the property value.
On the other hand, this is a valid requirement, so I would do one of two things. Either use the ForEach method after converting the query to a list (ToList()), since ForEach is expected to incur side effects, or write a clearly named extension method on IEnumerable<T> (e.g. DoForAll) that does the same, but in a deferred way. See Why there is no ForEach extension method on IEnumerable?.
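A rough sketch of what such a deferred extension method could look like is below. DoForAll is the name suggested above; its exact shape, and the ReportDto and UserRow classes with their fields, are assumptions made up for the example, and it runs against in-memory lists rather than a LINQ to SQL table.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EnumerableSideEffects
{
    // Hypothetical helper: runs the action on each element as the sequence
    // is enumerated, then passes the element through unchanged.
    public static IEnumerable<T> DoForAll<T>(this IEnumerable<T> source, Action<T> action)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (action == null) throw new ArgumentNullException(nameof(action));
        return Iterate(source, action);
    }

    private static IEnumerable<T> Iterate<T>(IEnumerable<T> source, Action<T> action)
    {
        foreach (var item in source)
        {
            action(item);
            yield return item;
        }
    }
}

// Illustrative stand-ins for the report DTO and the user entity.
public class ReportDto { public int UserId; public string UserName; }
public class UserRow { public int UserId; public string Name; }

public static class Demo
{
    public static void Main()
    {
        var reports = new List<ReportDto> { new ReportDto { UserId = 1 } };
        var users = new List<UserRow> { new UserRow { UserId = 1, Name = "alice" } };

        var query = reports
            .Join(users, r => r.UserId, u => u.UserId, (r, u) => new { r, u })
            .DoForAll(x => x.r.UserName = x.u.Name)
            .Select(x => x.r);

        // Nothing has been assigned yet; the side effect happens on enumeration.
        var updated = query.ToList();
        Console.WriteLine(updated[0].UserName);
    }
}
```

Because the method is deferred, enumerating the query twice runs the action twice, so the action should be safe to repeat.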

Use LINQ to select elements within a generic list and casting to their specific type

I have a base class, called NodeUpgrade, which has several child types. An example of a specific child class is FactoryUpgrade.
I have a list of NodeUpgrades, which can be a mix of different child types. How do I write a linq query to retrieve a type of NodeUpgrade and cast to that specific type?
My working query looks something like this:
var allFactories = (from Node n in assets.Nodes
from FactoryUpgrade u in n.NodeUpgrades
where u.ClassID == NodeUpgradeTypes.Factory
select u)
This, of course, doesn't work. Can I specify the final type of the output?
If you are sure that every type in a sequence is a given type, you can use the Cast<T>() extension method. If there can be multiple types in the list and you only want one of them, you can use OfType<T>() to filter the sequence.
List<Animal> animals = ...
// assumes all animals are cats
var cats = animals.Cast<Cat>();
// var cats = (from animal in animals where ... select animal).Cast<Cat>();
// or maybe animals can contain dogs, but you don't want them
var cats = animals.OfType<Cat>();
The difference is that Cast will throw an exception if an animal isn't a cat, whereas OfType will perform a type check before actually trying the conversion. I would favor Cast over OfType when you are confident of the uniform type. (Also note that these do not perform user-defined conversions. If you have defined an implicit or explicit conversion, those will not be supported by these methods.)
The resulting sequence in each case will be IEnumerable<Cat>, which you can do further query operations on (filters, groupings, projections, ToList(), etc.)
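Here is a small self-contained sketch of that difference on a mixed list. The Animal, Cat and Dog classes are just the illustrative types from the snippet above, filled in with empty bodies.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Animal { }
public class Cat : Animal { }
public class Dog : Animal { }

public static class Demo
{
    public static void Main()
    {
        var animals = new List<Animal> { new Cat(), new Dog(), new Cat() };

        // OfType<Cat>() silently skips the Dog.
        List<Cat> cats = animals.OfType<Cat>().ToList();
        Console.WriteLine(cats.Count);

        try
        {
            // Cast<Cat>() is deferred, so the exception surfaces only
            // when ToList() enumerates the sequence and reaches the Dog.
            animals.Cast<Cat>().ToList();
        }
        catch (InvalidCastException)
        {
            Console.WriteLine("Cast<Cat>() failed on a non-Cat element");
        }
    }
}
```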
You can use the method OfType<>
var allFactories = (from Node n in assets.Nodes
from u in n.NodeUpgrades
where u.ClassID == NodeUpgradeTypes.Factory
select u).OfType<ChildType>();
As others have said, use the .OfType extension method to filter the types (assuming you have set the inheritance model and appropriate discriminators on your data source). This will translate in the database to include the appropriate Where clause on the discriminator (ClassID).
var allFactories = from n in assets.Nodes
from u in n.NodeUpgrades.OfType<FactoryUpgrade>()
select u;
You didn't specify here if you were using EF, LINQ to SQL, or just Linq to Objects in this case. Each has a different way of modeling the inheritance. If you need help with the modeling portion, let us know which OR/M you are using.

Is there a way, using LINQ/EF, to get the top most item in a parent/child hierarchy?

I have a class called Structure:
public class Structure
{
    public int StructureId { get; set; }
    public Structure Parent { get; set; }
}
As you can see, Structure has a parent Structure. There can be an indefinite number of structures within this hierarchy.
Is there any way, using LINQ (with Entity Framework), to get the top-most structure in this hierarchy?
Currently, I'm having to hit the database quite a few times in order to find the top most parent. The top most parent is a Structure with a null Parent property:
Structure structure = structureRepository.Get(id);
while (structure.Parent != null)
{
    structure = structureRepository.Get(structure.Parent.StructureId);
}
// When we're here, `structure` is now the top-most parent.
So, is there any elegant way to do this using LINQ/Lambdas? Ideally, starting with the following code:
var structureQuery = from item in context.Structures
where item.StructureId == structureId
select item;
I just want to be able to write something like the following so that I only fire off one database hit:
structureQuery = Magic(structureQuery);
Structure topMostParent = structureQuery.Single();
This is not a direct answer, but the problem you are having is related to the way you are storing your tree. There are a couple ways of simplifying this query by structuring data differently.
One is to use a Nested Set Hierarchy, which can simplify many kinds of queries across trees.
Another is to store a denormalized table of Ancestor/Descendant/Depth tuples. This query then becomes finding the tuple with the current structure as the descendant and the maximum depth.
I think the best I'm going to get is to load the entire hierarchy in one hit from the structure I want the top parent of:
var structureQuery = from item in context.Structures
.Include(x => x.Parent)
where item.StructureId == structureId
select item;
Then just use the code:
while (structure.Parent != null)
structure = structure.Parent;
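If the parent chain is already in memory, that walk can be wrapped in a small helper. This is only a sketch: GetRoot and the StructureExtensions class are hypothetical names, and the cycle guard is an extra precaution I've added, not something the question asked for.

```csharp
using System;
using System.Collections.Generic;

public class Structure
{
    public int StructureId { get; set; }
    public Structure Parent { get; set; }
}

public static class StructureExtensions
{
    // Hypothetical helper: follows Parent references until it reaches
    // a Structure whose Parent is null, and returns that Structure.
    public static Structure GetRoot(this Structure start)
    {
        if (start == null) throw new ArgumentNullException(nameof(start));

        var seen = new HashSet<Structure>();
        var current = start;
        while (current.Parent != null)
        {
            // Guard against bad data where a structure is its own ancestor.
            if (!seen.Add(current))
            {
                throw new InvalidOperationException("Cycle detected in Parent chain.");
            }
            current = current.Parent;
        }
        return current;
    }
}

public static class Demo
{
    public static void Main()
    {
        var root = new Structure { StructureId = 1 };
        var middle = new Structure { StructureId = 2, Parent = root };
        var leaf = new Structure { StructureId = 3, Parent = middle };
        Console.WriteLine(leaf.GetRoot().StructureId);
    }
}
```

This does nothing about the number of database hits; it only assumes every Parent on the way up has already been loaded.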
I have a similar situation. I didn't manage to solve it directly with LINQ/EF. Instead I solved by creating a database view using recursive common table expressions, as outlined here. I made a user-defined function that cross applies all parents to a child (or vice versa), then a view that makes use of this user-defined function which I imported into my EF object context.
(disclaimer: simplified code, I didn't actually test this)
I have two tables, say MyTable (containing all items) and MyParentChildTable containing the ChildId,ParentId relation
I have then defined the following udf:
CREATE FUNCTION dbo.fn_getsupertree(@childid AS INT)
RETURNS @tree TABLE
(
    ChildId INT NOT NULL
    ,ParentId INT NULL
    ,[Level] INT NOT NULL
)
AS
BEGIN
    WITH Parent_Tree(ChildId, ParentId, [Level]) AS
    (
        -- Anchor Member (AM)
        SELECT ChildId, ParentId, 0
        FROM MyParentChildTable
        WHERE ChildId = @childid
        UNION ALL
        -- Recursive Member (RM)
        SELECT info.ChildId, info.ParentId, tree.[Level]+1
        FROM MyParentChildTable AS info
        JOIN Parent_Tree AS tree
            ON info.ChildId = tree.ParentId
    )
    INSERT INTO @tree
    SELECT * FROM Parent_Tree;
    RETURN;
END
and the following view:
SELECT tree.*
FROM MyTable
CROSS APPLY fn_getsupertree(MyTable.Id) as tree
This gives me for each child, all parents with their 'tree level' (direct parent has level 1, parent of parent has level 2, etc.). From that view, it's easy to query the item with the highest level. I just imported the view in my EF context to be able to query it with LINQ.
I like the question and can't think of a linq-y way of doing this. But could you perhaps implement this on your repository class? After all, there should be only one at the top and if the need for it is there, then maybe it deserves a structureRepository.GetRoot() or something.
You can use the LINQ Take operator, for instance:
var first3Customers = (
    from c in customers
    select new {c.CustomerID, c.CustomerName} )
    .Take(3);

Can I force the auto-generated Linq-to-SQL classes to use an OUTER JOIN?

Let's say I have an Order table which has a FirstSalesPersonId field and a SecondSalesPersonId field. Both of these are foreign keys that reference the SalesPerson table. For any given order, either one or two salespersons may be credited with the order. In other words, FirstSalesPersonId can never be NULL, but SecondSalesPersonId can be NULL.
When I drop my Order and SalesPerson tables onto the "Linq to SQL Classes" design surface, the class builder spots the two FK relationships from the Order table to the SalesPerson table, and so the generated Order class has a SalesPerson field and a SalesPerson1 field (which I can rename to SalesPerson1 and SalesPerson2 to avoid confusion).
Because I always want to have the salesperson data available whenever I process an order, I am using DataLoadOptions.LoadWith to specify that the two salesperson fields are populated when the order instance is populated, as follows:
dataLoadOptions.LoadWith<Order>(o => o.SalesPerson1);
dataLoadOptions.LoadWith<Order>(o => o.SalesPerson2);
The problem I'm having is that Linq to SQL is using something like the following SQL to load an order:
FROM Order O
INNER JOIN SalesPerson SP1 ON SP1.salesPersonId = O.firstSalesPersonId
INNER JOIN SalesPerson SP2 ON SP2.salesPersonId = O.secondSalesPersonId
This would make sense if there were always two salesperson records, but because there is sometimes no second salesperson (secondSalesPersonId is NULL), the INNER JOIN causes the query to return no records in that case.
What I effectively want here is to change the second INNER JOIN into a LEFT OUTER JOIN. Is there a way to do that through the UI for the class generator? If not, how else can I achieve this?
(Note that because I'm using the generated classes almost exclusively, I'd rather not have something tacked on the side for this one case if I can avoid it).
Edit: per my comment reply, the SecondSalesPersonId field is nullable (in the DB, and in the generated classes).
The default behaviour actually is a LEFT JOIN, assuming you've set up the model correctly.
Here's a slightly anonymized example that I just tested on one of my own databases:
using System;
using System.Data.Linq;
using System.Linq;
class Program
{
    static void Main(string[] args)
    {
        using (TestDataContext context = new TestDataContext())
        {
            DataLoadOptions dlo = new DataLoadOptions();
            dlo.LoadWith<Place>(p => p.Address);
            context.LoadOptions = dlo;
            var places = context.Places.Where(p => p.ID >= 100 && p.ID <= 200);
            foreach (var place in places)
                Console.WriteLine("{0} {1}", place.ID, place.AddressID);
        }
    }
}
This is just a simple test that prints out a list of places and their address IDs. Here is the query text that appears in the profiler:
SELECT [t0].[ID], [t0].[Name], [t0].[AddressID], ...
FROM [dbo].[Places] AS [t0]
LEFT OUTER JOIN (
    SELECT 1 AS [test], [t1].[AddressID],
        [t1].[StreetLine1], [t1].[StreetLine2],
        [t1].[City], [t1].[Region], [t1].[Country], [t1].[PostalCode]
    FROM [dbo].[Addresses] AS [t1]
    ) AS [t2] ON [t2].[AddressID] = [t0].[AddressID]
WHERE ([t0].[PlaceID] >= @p0) AND ([t0].[PlaceID] <= @p1)
This isn't exactly a very pretty query (your guess is as good as mine as to what that 1 AS [test] is all about), but it's definitely a LEFT JOIN and doesn't exhibit the problem you seem to be having. And this is just using the generated classes; I haven't made any changes.
Note that I also tested this on a dual relationship (i.e. a single Place having two Address references, one nullable, one not), and I get the exact same results. The first (non-nullable) gets turned into an INNER JOIN, and the second gets turned into a LEFT JOIN.
It has to be something in your model, like changing the nullability of the second reference. I know you say it's configured as nullable, but maybe you need to double-check? If it's definitely nullable then I suggest you post your full schema and DBML so somebody can try to reproduce the behaviour that you're seeing.
If you make the secondSalesPersonId field in the database table nullable, LINQ-to-SQL should properly construct the Association object so that the resulting SQL statement will do the LEFT OUTER JOIN.
Since the field is nullable, your problem may be in explicitly declaring dataLoadOptions.LoadWith<>(). I'm running a similar situation in my current project where I have an Order, but the order goes through multiple stages. Each stage corresponds to a separate table with data related to that stage. I simply retrieve the Order, and the appropriate data follows along, if it exists. I don't use the dataLoadOptions at all, and it does what I need it to do. For example, if the Order has a purchase order record, but no invoice record, Order.PurchaseOrder will contain the purchase order data and Order.Invoice will be null. My query looks something like this:
DC.Orders.Where(a => a.Order_ID == id).SingleOrDefault();
I try not to micromanage LINQ to SQL; it does 95% of what I need straight out of the box.
I found this post that discusses the use of DefaultIfEmpty() in order to populate child entities with null if they don't exist. I tried it out with LINQPad on my database and converted that example to lambda syntax (since that's what I use):
ParentTable
    .GroupJoin (
        ChildTable,
        p => p.ParentTable_ID,
        c => c.ChildTable_ID,
        (p, aggregate) => new { p = p, aggregate = aggregate }
    )
    .SelectMany (a => a.aggregate.DefaultIfEmpty (),
        (a, c) => new
        {
            ParentTableEntity = a.p,
            ChildTableEntity = c
        }
    )
From what I can figure out from this statement, the GroupJoin expression relates the parent and child tables, while the SelectMany expression aggregates the related child records. The key appears to be the use of the DefaultIfEmpty, which forces the inclusion of the parent entity record even if there are no related child records. (Thanks for compelling me to dig into this further...I think I may have found some useful stuff to help with a pretty huge report I've got on my pipeline...)
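To make the pattern concrete, here is a small LINQ to Objects sketch of the same GroupJoin / SelectMany / DefaultIfEmpty combination. The Parent and Child classes, their fields, and the Describe method are invented for the example; against a database the provider would translate the same shape into a LEFT OUTER JOIN, but this version only runs in memory.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative stand-ins for the parent and child tables.
public class Parent { public int Id; public string Name; }
public class Child { public int ParentId; public string Name; }

public static class LeftJoinDemo
{
    // Returns one line per parent/child pair, and one line with "(none)"
    // for each parent that has no children at all.
    public static List<string> Describe(IEnumerable<Parent> parents, IEnumerable<Child> children)
    {
        return parents
            .GroupJoin(children,
                p => p.Id,
                c => c.ParentId,
                (p, matches) => new { p, matches })
            // DefaultIfEmpty() yields a single null when a parent has no matches,
            // which is what keeps that parent in the output.
            .SelectMany(x => x.matches.DefaultIfEmpty(),
                (x, c) => x.p.Name + " -> " + (c == null ? "(none)" : c.Name))
            .ToList();
    }

    public static void Main()
    {
        var parents = new[] { new Parent { Id = 1, Name = "A" }, new Parent { Id = 2, Name = "B" } };
        var children = new[] { new Child { ParentId = 1, Name = "a1" } };

        foreach (var line in Describe(parents, children))
        {
            Console.WriteLine(line);
        }
    }
}
```

Without the DefaultIfEmpty() call, parent "B" would drop out of the result, which is the inner-join behaviour described in the question.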
If the goal is to keep it simple, then it looks like you're going to have to reference those salesperson fields directly in your Select() expression. The reason you're having to use LoadWith<>() in the first place is because the tables are not being referenced anywhere in your query statement, so the LINQ engine won't automatically pull that information in.
As an example, given this structure:
MailingList             ListCompany
===========             ===========
List_ID (PK)            ListCompany_ID (PK)
ListCompany_ID (FK)     FullName (string)
I want to get the name of the company associated with a particular mailing list:
MailingLists.Where(a => a.List_ID == 2).Select(a => a.ListCompany.FullName)
If that association has NOT been made, meaning that the ListCompany_ID field in the MailingList table for that record is equal to null, this is the resulting SQL generated by the LINQ engine:
SELECT [t1].[FullName]
FROM [MailingLists] AS [t0]
LEFT OUTER JOIN [ListCompanies] AS [t1] ON [t1].[ListCompany_ID] = [t0].[ListCompany_ID]
WHERE [t0].[List_ID] = @p0
