Can I use LINQ to join two result sets on an ordinal/index #?

I'm trying to use LINQ to Objects with Html Agility Pack to join two result sets on their relative ordinal position. One set is a list of headers, the other is a set of tables, with each table corresponding to one of the header values. Each set has a count of five. I've read the post here which looks very similar, but I can't get it to translate to my purposes.
Here is what I'm using to get the two html node collections:
HtmlNodeCollection ratingsChgsHdrs = htmlDoc.DocumentNode.SelectNodes("//div[@id='calendar-header']");
HtmlNodeCollection ratingsChgsTbls = htmlDoc.DocumentNode.SelectNodes("//table[@class='calendar-table']");
The collection ratingsChgsHdrs contains the header for each of the tables in ratingsChgsTbls, in its InnerText property. The end result I'm looking for is one result set consisting of all of the rows from all five tables, with the header value added as a property to each row. I hope that is clear. Any help would be great.

This might work:
ratingsChgsHdrs.Select((x, i) => new { Header = x, Table = ratingsChgsTbls.ElementAt(i) });
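For what it's worth, the same positional pairing can also be written with Enumerable.Zip, and a SelectMany then gives the flat "one row per table row, tagged with its header" shape the question asks for. The following is only a sketch on plain in-memory lists: the header texts and row strings are hypothetical stand-ins for the InnerText of the real HtmlNode objects.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the two node collections: header texts, and
// each table reduced to a list of row strings.
var headers = new List<string> { "Upgrades", "Downgrades" };
var tables = new List<List<string>>
{
    new List<string> { "AAA", "BBB" },
    new List<string> { "CCC" }
};

// Zip pairs the i-th header with the i-th table; SelectMany then flattens
// every table into its rows, each tagged with the header it belongs to.
var rows = headers
    .Zip(tables, (header, table) => new { header, table })
    .SelectMany(pair => pair.table.Select(row => new { Header = pair.header, Row = row }))
    .ToList();

foreach (var r in rows)
    Console.WriteLine($"{r.Header}: {r.Row}");
```

Zip stops at the shorter sequence, so mismatched counts are silently truncated rather than throwing the way ElementAt(i) would.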


Finding items from a list in an array stored in a DB field

I have a legacy database that has data elements stored as a comma delimited list in a single database field. (I didn't design that, I'm just stuck with it.)
I have a list of strings that I would like to match to any of the individual values in the "array" in the DB field and am not sure how to do this in Linq.
My list:
List<string> items= new List<string>();
The DB field "Products" would contain data something like:
My first pass at the Linq query was:
var results = Order.Where(p => items.Contains(p.Products));
But I know that won't work, because it will only return the records whose field contains only "Item1" or only "Item2". So with the example data above it would return 0 records. I need it to return two records.
Any suggestions?
There is a simple clever trick for searching comma-separated lists. First, add an extra , to the beginning and end of the target value (the product list), and the search value. Then search for that exact string. So for example, you would search ,Item1,Item3,Item4, for ,Item1,. The purpose of this is to prevent false positives, i.e., Item12,Item3 finding a match for Item1, while allowing items at the beginning/end of the list to be properly found.
Then, you can use the LINQ .Any method to check that any item in your list is a match to the product list, like the following:
var results = Order.Where(o => items.Any(i => ("," + o.Products + ",").Contains("," + i + ",")));
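To see why the extra commas matter, here is a small in-memory sketch comparing the wrapped search with a plain substring search. The sample field values are made up (the question's actual data is not shown), and the sketch assumes the stored lists have no spaces around the commas.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Made-up sample values standing in for the list and the "Products" field.
var items = new List<string> { "Item1", "Item2" };
var productFields = new List<string>
{
    "Item1,Item3,Item4",   // contains Item1 at the start
    "Item12,Item3",        // Item12 must NOT count as Item1
    "Item5,Item2"          // contains Item2 at the end
};

// Wrapped search: both the field and the search value get a comma on each side.
var matches = productFields
    .Where(p => items.Any(i => ("," + p + ",").Contains("," + i + ",")))
    .ToList();

// Naive substring search, for comparison: also picks up "Item12,Item3".
var naive = productFields
    .Where(p => items.Any(i => p.Contains(i)))
    .ToList();

Console.WriteLine($"wrapped: {matches.Count}, naive: {naive.Count}");
```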
One way would be to parse the list in the Products field:
var results = Order.Where(o => items.Any(i => o.Products.Split(',').Contains(i)));
But that would parse the string multiple times for each record. You could try pulling back ALL of the records, parsing each record once, then doing the comparison:
var results = from o in Order
let prods = o.Products.Split(',')
where items.Any(i => prods.Contains(i))
select o;

Querying MVC collection with Array

I'm using Entity Framework 4 and LINQ/lambda expressions. I'm sure it's an easy one, but I'm trying to query a collection with an array and get only the records that contain all of the array's values.
Basically, what I'm doing is this:
var records = collection.Where(x => x.classifications.Any(y => Array.Contains(y.ClassificationID))).ToList()
This works in the sense that it returns records that contain any of the array's values, but how do I get only the records that contain all of the values in the array?
Hope that makes sense
I'm marking the comment below as the answer, as I did have to use All in my query to get it to work; however, I also had to rewrite my query slightly. This is what I eventually had:
var records = collection.Where(x=> Array.All(c=> x.Classifications.Select(l=>l.ClassificationID).Contains(c)))
What about using All instead of Any?
var records = collection.Where(x => x.classifications.All(y => Array.Contains(y.ClassificationID)))
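The two All-based queries above are not equivalent, which is presumably why the asker had to rewrite theirs. Here is a minimal in-memory sketch of the difference. The record shape (a name plus an array of classification IDs) and the variable name wanted are hypothetical stand-ins for the entity collection and the array in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var wanted = new[] { 1, 2 };

// Hypothetical records: a name plus the classification IDs attached to it.
var records = new List<(string Name, int[] ClassificationIds)>
{
    ("A", new[] { 1 }),
    ("B", new[] { 1, 2, 3 }),
    ("C", new[] { 4 })
};

// Any: the record has at least one of the wanted IDs.
var anyMatch = records
    .Where(r => r.ClassificationIds.Any(id => wanted.Contains(id)))
    .Select(r => r.Name).ToList();

// All over the record's IDs: every ID on the record is a wanted one.
var recordInsideWanted = records
    .Where(r => r.ClassificationIds.All(id => wanted.Contains(id)))
    .Select(r => r.Name).ToList();

// All over the wanted IDs: the record carries every wanted ID.
var wantedInsideRecord = records
    .Where(r => wanted.All(w => r.ClassificationIds.Contains(w)))
    .Select(r => r.Name).ToList();

Console.WriteLine(string.Join(",", anyMatch));           // A,B
Console.WriteLine(string.Join(",", recordInsideWanted)); // A
Console.WriteLine(string.Join(",", wantedInsideRecord)); // B
```

Only the last form answers "records that contain all of the values in the array"; it corresponds to the query the asker ended up with.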

OpenXML linq query

I'm using OpenXML to open a spreadsheet and loop through the rows of a spreadsheet. I have a linq query that returns all cells within a row. The linq query was ripped straight from a demo on the MSDN.
IEnumerable<String> textValues =
from cell in row.Descendants<Cell>()
where cell.CellValue != null
select (cell.DataType != null
&& cell.DataType.HasValue
&& cell.DataType == CellValues.SharedString
? sharedString.ChildElements[int.Parse(cell.CellValue.InnerText)].InnerText
: cell.CellValue.InnerText);
The linq query is great at returning all cells that have a value, but it doesn't return cells that don't have a value. This in turn makes it impossible to tell which cell is which. Let me explain a little more. Say for instance we have three columns in our spreadsheet: Name, SSN, and Address. The way this linq query works is it only returns those cells that have a value for a given row. So if there is a row of data that has "John", "", "173 Sycamore" then the linq query only returns "John" and "173 Sycamore" in the enumeration, which in turn makes it impossible for me to know if "173 Sycamore" is the SSN or the Address field.
Let me reiterate here: what I need is for all cells to be returned, and not just cells that contain a value.
I've tried to monkey with the LINQ query in every way that I could think of, but I had no luck whatsoever (i.e., removing the where clause isn't the trick). Any help would be appreciated. Thanks!
The OpenXML standard does not define placeholders for cells that don't have data. In other words, its underlying storage in XML is sparse. You could work around this in one of two ways:
Create a list of all "available" or "possible" cells (probably by using a CROSS JOIN type of operation) then "left" joining to the row.Descendants<Cell>() collection to see if the cell reference has a value
Utilize a 3rd party tool such as ClosedXML or EPPlus as a wrapper around the Excel data and query their interfaces, which are much more developer-friendly.
With ClosedXML:
var wb = new XLWorkbook("YourWorkbook.xlsx");
var ws = wb.Worksheet("YourWorksheetName");
var range = ws.RangeUsed();
foreach (var row in range.Rows())
{
    // Do something with the row...
    // ...
    foreach (var cell in row.Cells())
    {
        // Now do something with every cell in the row
        // ...
    }
}
The one way I recommend is to fill in all the null cells with blank data so they will be returned by your linq statement. See this answer for how to do that.
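As a rough illustration of padding out the missing cells, the sketch below rebuilds a dense row from a sparse one by reading the column letters out of each cell reference. It deliberately avoids the OpenXML SDK: the (Reference, Value) tuples are hypothetical stand-ins for what you would read from each cell (in the SDK, the reference would typically come from the cell's CellReference), and ColumnIndex and columnCount are names invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Converts the column letters of a cell reference such as "C7" to a
// zero-based column index (A -> 0, B -> 1, ..., Z -> 25, AA -> 26).
int ColumnIndex(string cellReference)
{
    int index = 0;
    foreach (char c in cellReference.TakeWhile(ch => char.IsLetter(ch)))
        index = index * 26 + (char.ToUpperInvariant(c) - 'A' + 1);
    return index - 1;
}

// Hypothetical sparse row: only cells that carry a value are present,
// mirroring the "John", "", "173 Sycamore" example from the question.
var sparseRow = new List<(string Reference, string Value)>
{
    ("A2", "John"),
    ("C2", "173 Sycamore")
};

int columnCount = 3; // Name, SSN, Address (assumed known up front)

// Start with an empty string in every column, then drop each stored value
// into the slot its reference points at.
var denseRow = Enumerable.Repeat("", columnCount).ToArray();
foreach (var cell in sparseRow)
    denseRow[ColumnIndex(cell.Reference)] = cell.Value;

Console.WriteLine(string.Join("|", denseRow)); // John||173 Sycamore
```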

Improving/Modifying LINQ query

I already have a variable containing some groups. I generated that using the following LINQ query:
var historyGroups = from payee in list
group payee by payee.Payee.Name into groups
orderby groups.Key
select new {PayeeName = groups.Key, List = groups };
Now my historyGroups variable can contain many groups. Each of those groups has a key, which is a string, and the Results View is sorted by that key. Inside each of those groups there is a List corresponding to the key. Inside that List there are elements, and each of those elements is an object of a particular type. One of its fields is of type System.DateTime. I want to sort this internal List by date.
Can anyone help with this? Maybe modify the above query, or write a new query on the historyGroups variable.
It is not clear to me what you want to sort on (the payee type definition is missing as well)
var historyGroups = from payee in list
group payee by payee.Payee.Name into groups
orderby groups.Key
select new {
PayeeName = groups.Key,
List = groups.OrderBy(payee2 => payee2.SomeDateTimeField)
};
Is most straightforward.
If you really want to sort only by date (and not time), use SomeDateTimeField.Date.
Inside that List there are elements, and each of those elements is an object of a particular type. One of its fields is of type System.DateTime.
This leads me to maybe(?) suspect
List = groups.OrderBy(payee2 => payee2.ParticularTypedElement.DateTimeField)
Or perhaps even
List = groups.OrderBy(payee2 => payee2.ObjectsOfParticularType
I hope next time you can clarify the question a bit better, so we don't have to guess that much (and come up with a confusing answer).
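Since the payee type was never posted, here is a self-contained sketch of the first suggestion on made-up data. The tuple fields PayeeName, Date and Amount are assumptions standing in for payee.Payee.Name and the unnamed DateTime field.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Made-up history entries; the real element type is not shown in the question.
var list = new List<(string PayeeName, DateTime Date, decimal Amount)>
{
    ("Zed",  new DateTime(2011, 3, 1),  10m),
    ("Acme", new DateTime(2011, 2, 15), 20m),
    ("Acme", new DateTime(2011, 1, 5),  30m)
};

// Groups are ordered by key; the entries inside each group are ordered by date.
var historyGroups = (from entry in list
                     group entry by entry.PayeeName into g
                     orderby g.Key
                     select new
                     {
                         PayeeName = g.Key,
                         List = g.OrderBy(e => e.Date).ToList()
                     }).ToList();

foreach (var grp in historyGroups)
    Console.WriteLine($"{grp.PayeeName}: {string.Join(", ", grp.List.Select(e => e.Date.ToString("yyyy-MM-dd")))}");
```

Calling ToList() on the inner sequence is optional; it just materializes the sorted entries once instead of re-sorting on every enumeration.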

Querying M:M relationships using Entity Framework

How would I modify the following code:
var result = from p in Cache.Model.Products
from f in p.Flavours
where f.FlavourID == "012541-5-5-5-651"
select p;
So that f.FlavourID is supplied a range of IDs as opposed to just one value, as shown in the above example?
Given the following ERD Model:
Products* => ProdCombinations <= *Flavours
ProdCombinations is a junction/link table and simply has one composite key in there.
Off the top of my head:
string[] ids = new[] { "012541-5-5-5-651", "012541-5-5-5-652", "012541-5-5-5-653" };
var result = from p in Cache.Model.Products
from f in p.Flavours
where ids.Contains(f.FlavourID)
select p;
There are some limitations, but an array of ids has worked for me before. I've only actually tried with SQL Server backend, and my IDs were integers.
As I understand it, LINQ needs to translate your query into SQL, and that is only possible for some expressions. For example, it's not possible with an IEnumerable<SomeClass>, which produces a runtime error, but it is possible with a collection of simple types.
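One thing to watch with the nested from form: a product that has several matching flavours comes back once per match. The sketch below shows this with LINQ to Objects on made-up data (the product names and flavour IDs are hypothetical). I'd expect the same duplication against a database, since the junction-table join also yields one row per match, but I have not verified that against Entity Framework.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var ids = new[] { "F1", "F2" };

// Made-up products, each with the IDs of its flavours.
var products = new List<(string Name, string[] FlavourIds)>
{
    ("Cola",     new[] { "F1", "F2" }),
    ("Lemonade", new[] { "F3" }),
    ("Tonic",    new[] { "F2" })
};

// Nested "from": one result per (product, matching flavour) pair.
var perMatch = (from p in products
                from f in p.FlavourIds
                where ids.Contains(f)
                select p.Name).ToList();

// Any: one result per product that has at least one matching flavour.
var perProduct = products
    .Where(p => p.FlavourIds.Any(f => ids.Contains(f)))
    .Select(p => p.Name)
    .ToList();

Console.WriteLine(string.Join(",", perMatch));   // Cola,Cola,Tonic
Console.WriteLine(string.Join(",", perProduct)); // Cola,Tonic
```

Appending Distinct() to the nested form is another way to collapse the repeats.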
