How do I find a collection of nodes in HtmlAgilityPack using linq to xml? - linq

I want to extract information from various websites. I am using HtmlAgilityPack and Linq to XML. So far I have managed to extract the value from a single node in a website by writing:
var q = document.DocumentNode.DescendantNodes()
.Where(n => n.Name == "img" && n.Id == "GraphicalBoard001")
.FirstOrDefault();
But I am really interested in the whole collection of img elements whose Id starts with "GraphicalBoard". I tried something like:
var q2 = document.DocumentNode.DescendantNodes()
.Where(n => n.Name == "img" && n.Id.Contains("GraphicalBoard"))
.Select...
But it seems that linq doesn't like the Contains-method, since I lose the Select option in intellisense. How can I extract all the img-tags where the Id starts with "GraphicalBoard"?

How can I extract all the img-tags where the Id starts with "GraphicalBoard"?
You had it already; just stop at the call to Where(). The Where() call filters the collection down to the items that satisfy the predicate.
You should, however, write it so that you filter only the img descendants rather than all descendants:
var query = doc.DocumentNode.Descendants("img")
.Where(img => img.Id.StartsWith("GraphicalBoard"));
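To illustrate that Where() already yields the whole collection, here is a minimal sketch that runs without HtmlAgilityPack. The Node class and the FindByIdPrefix helper are hypothetical stand-ins introduced only for this example; they are not part of the library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for HtmlAgilityPack's HtmlNode (assumption, not the real type).
public class Node
{
    public string Name { get; set; }
    public string Id { get; set; }
}

public static class BoardImages
{
    // Returns every img node whose Id starts with the given prefix.
    public static List<Node> FindByIdPrefix(IEnumerable<Node> nodes, string prefix)
    {
        return nodes
            .Where(n => n.Name == "img"
                     && n.Id != null // guard for nodes without an Id in this stand-in type
                     && n.Id.StartsWith(prefix, StringComparison.Ordinal))
            .ToList(); // Where() is lazy; ToList() materializes the matches
    }
}
```

Calling BoardImages.FindByIdPrefix(nodes, "GraphicalBoard") gives a list you can loop over or project further with Select().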

Related

Can this code be converted into a single linq statement?

I'm trying to filter a list within a list from an entity framework entity.
I've managed to get the code working; however, I'm not convinced it's the cleanest way of achieving the goal.
Here's the code I have so far:
foreach (var n1 in tier.MatchNodes)
{
n1.LenderMatchNodes = n1.LenderMatchNodes.Where(x => x.Commission == 0).ToList();
}
Effectively MatchNodes contains a collection of LenderMatchNodes, however I want to return only the nodes where the commission == 0.
Thanks in advance.
Try
tier.MatchNodes.ToList().ForEach(n1=>n1.LenderMatchNodes = n1.LenderMatchNodes.Where(x => x.Commission == 0).ToList());
Try using SelectMany():
var result = dataContext.Table<Tier>()
.Where(some condition to get you the tier)
.SelectMany(tier => tier.MatchNodes)
.SelectMany(node => node.LenderMatchNodes)
.Where(x => x.Commission == 0)
.ToList();
This has the additional benefit that it can be executed as a single SQL query.
If your goal is to actually update the node list in the database, you can still minimize the number of queries using Include() (assuming you're using EF):
var nodes = dataContext.Table<Tier>()
.Where(some condition to get you the tier)
.SelectMany(tier => tier.MatchNodes)
.Include(node => node.LenderMatchNodes) // loads this eagerly
.ToList();
nodes.ForEach(n => n.LenderMatchNodes = n.LenderMatchNodes.Where(x => x.Commission == 0).ToList());
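The two approaches above do slightly different things, which a small in-memory sketch may make clearer. The Tier, MatchNode and LenderMatchNode classes below are assumptions modelled on the names in the question, and the code uses LINQ to Objects only, not Entity Framework.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity shapes, assumed from the property names in the question.
public class LenderMatchNode
{
    public decimal Commission { get; set; }
}

public class MatchNode
{
    public List<LenderMatchNode> LenderMatchNodes { get; set; } = new List<LenderMatchNode>();
}

public class Tier
{
    public List<MatchNode> MatchNodes { get; set; } = new List<MatchNode>();
}

public static class TierQueries
{
    // Flattens all lender nodes into one list; the link to the parent MatchNode is lost.
    public static List<LenderMatchNode> ZeroCommissionFlat(Tier tier)
    {
        return tier.MatchNodes
            .SelectMany(n => n.LenderMatchNodes)
            .Where(x => x.Commission == 0)
            .ToList();
    }

    // Keeps the parent/child structure and replaces each child list with its filtered copy.
    public static void KeepZeroCommission(Tier tier)
    {
        foreach (var node in tier.MatchNodes)
        {
            node.LenderMatchNodes = node.LenderMatchNodes
                .Where(x => x.Commission == 0)
                .ToList();
        }
    }
}
```

Use the flat version when you only need the matching lender nodes, and the in-place version when the rest of the code still navigates from each MatchNode to its children.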

Access a collection via LINQ and set a single member to a new object

I am trying to access the user object in a collection whose Id is equal to "user101" and replace it with another user.
Controller.MyObject.SingleOrDefault(x => x.Id == "user101") = OtherUser();
Thanks in advance.
You can't do it with one LINQ expression.
LINQ extension methods work on enumerables and return values, not assignable locations. If MyObject is a collection, you first have to find the required item and then overwrite it with the new object (moreover, SingleOrDefault() simply returns null if the condition is not satisfied).
You should write something like this (exact code depends on what MyObject is):
var item = Controller.MyObject.SingleOrDefault(x => x.Id == "user101");
if (item != null)
Controller.MyObject[Controller.MyObject.IndexOf(item)] = new OtherUser();
Please note that if you do not really need the check performed by SingleOrDefault() you can simplify the code (and avoid the double search performed in SingleOrDefault() and IndexOf()).
If this is "performance critical" maybe it is better to write an ad-hoc implementation that does this task in one single pass.
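A minimal sketch of such a single-pass version, assuming MyObject is a List&lt;T&gt;. The User class and the ReplaceById helper are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical user type; only the Id property matters for the lookup.
public class User
{
    public string Id { get; set; }
    public string Name { get; set; }
}

public static class UserList
{
    // Replaces the first user whose Id matches; returns false when none is found.
    public static bool ReplaceById(List<User> users, string id, User replacement)
    {
        int index = users.FindIndex(u => u.Id == id); // one pass; -1 when absent
        if (index < 0)
        {
            return false;
        }
        users[index] = replacement;
        return true;
    }
}
```

Unlike SingleOrDefault(), FindIndex() stops at the first match and does not throw when several items share the same Id, so this is only equivalent if the Ids are known to be unique.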
Try it in two lines:
var objectWithId = Controller.MyObject.SingleOrDefault(x => x.Id == "user101");
(objectWithId as WhateverTypeOfObjectOtherUserIs) = OtherUser();

Match on 2 values

I'm trying to figure out how to modify this to match on two values:
var result = _context.FirstOrDefault(c => c.CarId == carId);
I'm not sure how to tack this on. I just want to base it on c.CarId == carId && c.UserId == userId
where carId and userId are incoming params to my method that this LINQ statement resides in. I want to keep this as a lambda expression syntax.
Just do it exactly as you've written it:
var result = _context.FirstOrDefault(c => c.CarId == carId && c.UserId == userId);
There's nothing wrong with that. The lambda expression isn't restricted to comparing a single property.
If you want to learn about LINQ in more detail, I'd start with LINQ to Objects, which is simpler to understand and predict. There are various tutorials around for it, and I have a blog series called Edulinq which examines each operator in detail.

LINQ get parent with most children

I have a list of Parent objects that has a list of Children objects. I need to write a query that would give me the parent that has the most children. The ORM is entity framework, so it should work with that.
Code to start with:
parents.FirstOrDefault(c => c.Children.Max());
Something like that.
I think it should look more like this:
parents.OrderByDescending(p => p.Children.Count()).FirstOrDefault();
Your query is not correct because c.Children.Max() iterates over the children of a single parent and, if they support comparison (e.g. the children are ints), simply returns the largest of them. Your Children objects are most probably not bool, so the code will not even compile, because FirstOrDefault takes a predicate of type
Expression<Func<T, bool>>
You don't need sorting for this:
int maxChildCount = parents.Max(x => x.Children.Count());
var maxParent = parents.FirstOrDefault(p => p.Children.Count() == maxChildCount);
Or as query expression:
var maxParent = (from p in parents
let max = parents.Max(x => x.Children.Count())
where p.Children.Count() == max
select p).FirstOrDefault();
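Both answers can be compared side by side with an in-memory sketch. The Parent and Child classes are assumptions, and this runs as LINQ to Objects; how Entity Framework translates either query is not covered here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model classes, assumed from the question.
public class Child
{
}

public class Parent
{
    public string Name { get; set; }
    public List<Child> Children { get; set; } = new List<Child>();
}

public static class ParentQueries
{
    // Sorts by child count, largest first, and takes the first parent (null if empty).
    public static Parent ByOrdering(List<Parent> parents)
    {
        return parents.OrderByDescending(p => p.Children.Count).FirstOrDefault();
    }

    // Finds the largest child count first, then the first parent that has it.
    public static Parent ByMax(List<Parent> parents)
    {
        if (parents.Count == 0)
        {
            return null; // Max() over an empty sequence of ints throws
        }
        int max = parents.Max(p => p.Children.Count);
        return parents.FirstOrDefault(p => p.Children.Count == max);
    }
}
```

The explicit empty check in ByMax is the main practical difference: the ordering version returns null for an empty list without any extra guard.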

conditional include in linq to entities?

I feel like the following should be possible; I'm just not sure what approach to take.
What I'd like to do is use the Include method to shape my results, i.e. define how far along the object graph to traverse. But I'd like that traversal to be conditional.
Something like:
dealerships
.include( d => d.parts.where(p => p.price < 100.00))
.include( d => d.parts.suppliers.where(s => s.country == "brazil"));
I understand that this is not valid linq, in fact, that it is horribly wrong, but essentially I'm looking for some way to build an expression tree that will return shaped results, equivalent to...
select *
from dealerships as d
outer join parts as p on d.dealerid = p.dealerid
and p.price < 100.00
outer join suppliers as s on p.partid = s.partid
and s.country = 'brazil'
with an emphasis on the join conditions.
I feel like this would be fairly straightforward with eSQL, but my preference would be to build expression trees on the fly.
as always, grateful for any advice or guidance
This should do the trick:
using (TestEntities db = new TestEntities())
{
    var query = from d in db.Dealership
                select new
                {
                    Dealer = d,
                    Parts = d.Part.Where
                    (
                        p => p.Price < 100.0
                             && p.Supplier.Country == "Brazil"
                    ),
                    Suppliers = d.Part.Select(p => p.Supplier)
                };
    var dealers = query.ToArray().Select(o => o.Dealer);
    foreach (var dealer in dealers)
    {
        Console.WriteLine(dealer.Name);
        foreach (var part in dealer.Part)
        {
            Console.WriteLine(" " + part.PartId + ", " + part.Price);
            Console.WriteLine
            (
                " "
                + part.Supplier.Name
                + ", "
                + part.Supplier.Country
            );
        }
    }
}
This code will give you a list of Dealerships each containing a filtered list of parts. Each part references a Supplier. The interesting part is that you have to create the anonymous types in the select in the way shown. Otherwise the Part property of the Dealership objects will be empty.
Also, you have to execute the SQL statement before selecting the dealers from the query. Otherwise the Part property of the dealers will again be empty. That is why I put the ToArray() call in the following line:
var dealers = query.ToArray().Select(o => o.Dealer);
But I agree with Darren that this may not be what the users of your library are expecting.
Are you sure this is what you want? The only reason I ask is, once you add the filter on Parts off of Dealerships, your results are no longer Dealerships. You're dealing in special objects that are, for the most part, very close to Dealerships (with the same properties), but the meaning of the "Parts" property is different. Instead of being a relationship between Dealerships and Parts, it's a filtered relationship.
Or to put it another way, if I pull a dealership out of your results and pass it to a method I wrote, and then in my method I call:
var count = dealership.Parts.Count();
I'm expecting to get the parts, not the filtered parts from Brazil where the price is less than $100.
If you don't use the dealership object to pass the filtered data, it becomes very easy. It becomes as simple as:
var query = from d in dealerships
select new { DealershipName = d.Name,
CheapBrazilProducts = d.Parts.Where(p => p.Price < 100.00 && p.Suppliers.Any(s => s.Country == "brazil")) };
If I just had to get the filtered sets like you asked, I'd probably use the technique I mentioned above, and then use a tool like Automapper to copy the filtered results from my anonymous class to the real class. It's not incredibly elegant, but it should work.
I hope that helps! It was an interesting problem.
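As a rough illustration of projecting the filtered data into a separate type instead of reusing Dealership, here is an in-memory sketch. All class names, DealershipSummary in particular, are hypothetical, each Part is assumed to have a single Supplier as in the first answer, and the query runs as LINQ to Objects; whether a given Entity Framework version translates the same projection is not something this example shows.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model classes, assumed for this sketch.
public class Supplier
{
    public string Country { get; set; }
}

public class Part
{
    public double Price { get; set; }
    public Supplier Supplier { get; set; }
}

public class Dealership
{
    public string Name { get; set; }
    public List<Part> Parts { get; set; } = new List<Part>();
}

// Separate result type, so Dealership.Parts keeps its usual meaning.
public class DealershipSummary
{
    public string DealershipName { get; set; }
    public List<Part> CheapBrazilParts { get; set; }
}

public static class DealershipQueries
{
    // Projects each dealership to its name plus the parts under 100 supplied from Brazil.
    public static List<DealershipSummary> CheapBrazilParts(IEnumerable<Dealership> dealerships)
    {
        return dealerships
            .Select(d => new DealershipSummary
            {
                DealershipName = d.Name,
                CheapBrazilParts = d.Parts
                    .Where(p => p.Price < 100.00
                             && p.Supplier != null
                             && p.Supplier.Country == "Brazil")
                    .ToList()
            })
            .ToList();
    }
}
```

A caller that receives a DealershipSummary cannot mistake the filtered list for the full Parts relationship, which is the concern raised above.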
I know this works with a single Include. I have never tested it with two Includes, but it is worth a try:
dealerships
.Include( d => d.parts)
.Include( d => d.parts.suppliers)
.Where(d => d.parts.All(p => p.price < 100.00) && d.parts.suppliers.All(s => s.country == "brazil"))
Am I missing something, or aren't you just looking for the Any() method?
var query = dealerships.Where(d => d.parts.Any(p => p.price < 100.00) ||
d.parts.suppliers.Any(s => s.country == "brazil"));
Yes, that's what I wanted to do. I think the next release of Data Services will make it possible to do just that with LINQ to REST queries, which would be great. In the meantime I switched to loading the inverse side and Including the related entity. It will be loaded multiple times, but in theory it only has to be loaded once, in the first Include, as in this code:
return this.Context.SearchHistories.Include("Handle")
.Where(sh => sh.SearchTerm.Contains(searchTerm) && sh.Timestamp > minDate && sh.Timestamp < maxDate);
Before, I tried to load, for any Handle, the SearchHistories that matched the logic, but I don't know how to do that using the Include logic you posted. So in the meantime I think a reverse lookup is a not-so-dirty solution.
