Linq datatable to get unique rows and their count - linq

I have a data table like this:
country
China
India
Thailand
India
china
china
Thailand
Hong kong
India
Can I get the output shown below using LINQ?
Country Count
India 3
China 2
Thailand 2
Hong kong 1

As Ben Allred pointed out, what you're likely looking for is the LINQ GroupBy method.
Using query syntax, it may look something like this:
var query = from tuple in table
group tuple by tuple.Country into g
select new { Country = g.Key, Count = g.Count() };
query now contains an IEnumerable collection of anonymous objects which have as members the string Country and the integer Count representing the number of occurrences of that country in the table.
You can now of course iterate over these objects as such:
foreach (var item in query)
{
Console.WriteLine("Country : {0} - Count : {1}", item.Country, item.Count);
}
For more examples, I strongly suggest the 101 LINQ Samples.
If you haven't used LINQ before, it's also worth pointing out that execution is deferred: the query doesn't actually run until you access its items, for example in the foreach statement. If reading from table is expensive and you intend to use the results more than once, call ToList() on query to materialize them into a concrete collection.
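For an actual System.Data.DataTable, the same grouping can be sketched as below. This is a minimal example, not the only way to do it: it assumes the column is named "country", and it assumes that differently cased spellings ("China", "china") should count as one country, which is why a case-insensitive comparer is passed to GroupBy. On .NET Framework, AsEnumerable() and Field<T>() need a reference to System.Data.DataSetExtensions.

```csharp
using System;
using System.Data;
using System.Linq;

// Build a small DataTable shaped like the one in the question.
var table = new DataTable();
table.Columns.Add("country", typeof(string));
foreach (var name in new[] { "China", "India", "Thailand", "India", "china",
                             "china", "Thailand", "Hong kong", "India" })
{
    table.Rows.Add(name);
}

// Group rows by country (ignoring case, an assumption), then sort by count descending.
var counts = table.AsEnumerable()
    .GroupBy(row => row.Field<string>("country"), StringComparer.OrdinalIgnoreCase)
    .Select(g => new { Country = g.Key, Count = g.Count() })
    .OrderByDescending(x => x.Count)
    .ToList(); // materialize once so later passes don't re-run the query

foreach (var item in counts)
{
    Console.WriteLine("{0} {1}", item.Country, item.Count);
}
```

With the case-insensitive comparer, all three spellings of China land in one group; drop the comparer argument if "China" and "china" should be counted separately.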

Related

Which approach performs better when selecting a lot of data: IQueryable vs. for loop (using Entity Framework)?

I am trying to get a list from the database in which each item contains a nested list (using .NET Core and Entity Framework). Assume I have two tables, a header table and a detail table.
Header Table
Detail Table
And I want the result like this:
{
"data":[
{
"Country":"Singapore",
"Hospital_List":[
{
"Hospital_Name":"SG Host A"
},
{
"Hospital_Name":"SG Host A"
}
]
},
{
}
]
}
I only know two ways to get a result like this. The first way: select the country list with an empty hospital list as a List, then loop over that list and query the database again for each country's hospitals.
The second way: select the country list with an empty hospital list as an IQueryable, then select the related hospitals by joining with the Hospital table. So my question is:
Which way should I use to get better performance? And is there any other way?
Please keep in mind that my real tables have many fields and a lot of data.
The for loop will give you the lowest performance, because it issues a separate SQL query for each iteration. Instead, try the following:
from hospital in hospitals
group hospital by hospital.CID into gh
join country in countries
on gh.Key equals country.CID
select new
{
Country = country.Country,
Hospital_List = from h in gh select h
}
Edit:
If your model is set up correctly (with a Country navigation property on the hospital), you can use this code:
from hospital in hospitals
join country in countries
on hospital.Country equals country
group hospital by hospital.CID into gh
select new
{
Country = from h in gh select h.Country.Country,
Hospital_List = from h in gh select h
}
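The target shape (one entry per country, each carrying its hospital list) can also be sketched with a group join ("join ... into"). The example below runs against hypothetical in-memory data, so the CID values, "Malaysia", and the extra hospital names are made up for illustration. Whether Entity Framework translates the same query into a single SQL statement depends on the EF version, so treat this as a starting point to verify against the generated SQL.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory stand-ins for the header (country) and detail (hospital) tables.
var countries = new[]
{
    (CID: 1, Country: "Singapore"),
    (CID: 2, Country: "Malaysia"),
};
var hospitals = new[]
{
    (CID: 1, Hospital_Name: "SG Host A"),
    (CID: 1, Hospital_Name: "SG Host B"),
    (CID: 2, Hospital_Name: "MY Host A"),
};

// "join ... into" is a group join: each country is paired with all of its hospitals.
var data = (from country in countries
            join hospital in hospitals on country.CID equals hospital.CID into gh
            select new
            {
                Country = country.Country,
                Hospital_List = gh.Select(h => new { h.Hospital_Name }).ToList()
            }).ToList();

foreach (var entry in data)
{
    Console.WriteLine("{0}: {1}", entry.Country,
        string.Join(", ", entry.Hospital_List.Select(h => h.Hospital_Name)));
}
```

Unlike the per-country loop, this expresses the whole result as one query, and a country with no hospitals still appears with an empty list.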

Get first record of each entity order by a column

I have a LINQ query that fetches student assessment data, something like
new {x.StudentId, x.StudentAssessmentId, x.AssessmentName, x.SubmittedDate}
Then I perform some operations on this list to get only the last added assessment per student; I find it by taking the max StudentAssessmentId,
so I finally get the last assessment data for all students.
Is there a way to do this directly in the initial query?
I thought about grouping the results by student ID and selecting the max StudentAssessmentId, like
group x.StudentAssessmentId by x.StudentId
select new {x.Key, x.Max()}
This way I get each student with their last StudentAssessmentId, which is what I want, but it only gives me the IDs, while I also want other data such as AssessmentName and SubmittedDate.
Try something like this:
group x.StudentAssessmentId
by new {
x.StudentId,
x.AssessmentName,
x.SubmittedDate }
into g
select new
{
g.Key.StudentId,
g.Key.AssessmentName,
g.Key.SubmittedDate,
StudentAssessmentId = g.Max(),
}
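Note that grouping by StudentId, AssessmentName and SubmittedDate together produces one group per distinct combination, so a student whose assessments differ in name or date can still come back as several rows. An alternative sketch, assuming the goal is exactly one full row per student, is to group by StudentId alone and pick the row with the highest StudentAssessmentId from each group. The sample rows below are hypothetical, and the query runs in memory; how a LINQ provider translates it to SQL depends on the provider.

```csharp
using System;
using System.Linq;

// Hypothetical sample rows shaped like the projection in the question.
var assessments = new[]
{
    (StudentId: 1, StudentAssessmentId: 10, AssessmentName: "Math",    SubmittedDate: new DateTime(2020, 1, 5)),
    (StudentId: 1, StudentAssessmentId: 12, AssessmentName: "Physics", SubmittedDate: new DateTime(2020, 2, 9)),
    (StudentId: 2, StudentAssessmentId: 11, AssessmentName: "Math",    SubmittedDate: new DateTime(2020, 1, 7)),
};

// One group per student; keep the whole row with the largest assessment id.
var latest = assessments
    .GroupBy(x => x.StudentId)
    .Select(g => g.OrderByDescending(x => x.StudentAssessmentId).First())
    .ToList();

foreach (var row in latest)
{
    Console.WriteLine("{0} {1} {2} {3:d}", row.StudentId, row.StudentAssessmentId,
        row.AssessmentName, row.SubmittedDate);
}
```

Because First() returns the whole element, AssessmentName and SubmittedDate come along without being part of the grouping key.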

Linq query returns duplicate results when .Distinct() isn't used - why?

When I use the following Linq query in LinqPad I get 25 results returned:
var result = (from l in LandlordPreferences
where l.Name == "Wants Student" && l.IsSelected == true
join t in Tenants on l.IsSelected equals t.IsStudent
select new { Tenant = t});
result.Dump();
When I add .Distinct() to the end I only get 5 results returned, so, I'm guessing I'm getting 5 instances of each result when the above is used.
I'm new to Linq, so I'm wondering if this is because of a poorly built query? Or is this the way Linq always behaves? Surely not - if I returned 500 rows with .Distinct(), does that mean without it there's 2,500 returned? Would this compromise performance?
It's a poorly built query.
You are joining LandlordPreferences with Tenants on a boolean value instead of a foreign key.
So, most likely, you have 5 selected landlords and 5 tenants that are students. Each student will be returned for each landlord: 5 x 5 = 25. This is a Cartesian product and has nothing to do with LINQ. A similar query in SQL would behave the same.
If you added the landlord to your result (select new { Tenant = t, Landlord = l }), you would see that no two results are actually the same.
If you can't fix the query by joining on a proper key, Distinct is your only option.
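The effect is easy to reproduce in memory. The sketch below uses hypothetical data (five selected landlord preferences and five student tenants, with made-up LandlordId and TenantId fields) and the same boolean join as in the question.

```csharp
using System;
using System.Linq;

// Hypothetical data: 5 selected landlord preferences and 5 student tenants.
var preferences = Enumerable.Range(1, 5)
    .Select(i => (LandlordId: i, Name: "Wants Student", IsSelected: true)).ToArray();
var tenants = Enumerable.Range(1, 5)
    .Select(i => (TenantId: i, IsStudent: true)).ToArray();

// Joining on a boolean matches every selected preference with every student.
var joined = (from l in preferences
              where l.Name == "Wants Student" && l.IsSelected
              join t in tenants on l.IsSelected equals t.IsStudent
              select new { Tenant = t, Landlord = l }).ToList();

Console.WriteLine(joined.Count);                                    // 25
Console.WriteLine(joined.Select(x => x.Tenant).Distinct().Count()); // 5
Console.WriteLine(joined.Distinct().Count());                       // 25: every pair is unique
```

Projecting only the tenant hides the landlord half of each pair, which is why the rows look like duplicates.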

Linq to Entities: complex query getting "average" restaurant rating

So I'm building a restaurant review site for my community. I need to
extract data from the following tables: RESTAURANT, CUISINE, CITY,
PRICE and RATING (customer ratings).
The query should return all restaurants of a selected CUISINE_ID and
return the RESTAURANT_NAME, CUISINE_NAME, CITY_NAME, PRICE_CODE, and it
should average all the reviews' RATING_CODE values and return the
calculated value. I'm fine with returning all the data except the average
rating.
I've only been working with LINQ to Entities for 2 days and LINQ for about
3 weeks, so I'm really a newbie; I'm waiting for my LINQ book to be
delivered from Amazon.com. Your help and guidance would be appreciated!
It should end up looking something like this:
var avgForMatches =
(from r in context.Restaurants
where r.Cuisines.Any(c => c.CuisineName == cuisineName)
where r.Prices.Any(p => p.PriceCode == priceCode)
//... same pattern for other searches.
select r.RatingCode)
.Average();
Read about aggregate methods (including average) within the 101 linq samples - http://msdn.microsoft.com/en-us/vcsharp/aa336747
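If the goal is one average per restaurant rather than a single average over all matches, a group join followed by Average() is one way to sketch it. The example below uses hypothetical in-memory data and invented field names (RestaurantId, CuisineId, RatingCode as an int); the real entity model may expose ratings through a navigation property instead.

```csharp
using System;
using System.Linq;

// Hypothetical stand-ins for the RESTAURANT and RATING tables.
var restaurants = new[]
{
    (RestaurantId: 1, Name: "Luigi's", CuisineId: 7),
    (RestaurantId: 2, Name: "Roma",    CuisineId: 7),
    (RestaurantId: 3, Name: "Sakura",  CuisineId: 9),
};
var ratings = new[]
{
    (RestaurantId: 1, RatingCode: 4),
    (RestaurantId: 1, RatingCode: 5),
    (RestaurantId: 2, RatingCode: 3),
};
int cuisineId = 7;

var results = (from r in restaurants
               where r.CuisineId == cuisineId
               join rating in ratings on r.RestaurantId equals rating.RestaurantId into rs
               select new
               {
                   r.Name,
                   // DefaultIfEmpty(0) keeps Average() from throwing for unrated restaurants.
                   AverageRating = rs.Select(x => (double)x.RatingCode).DefaultIfEmpty(0).Average()
               }).ToList();

foreach (var row in results)
{
    Console.WriteLine("{0}: {1}", row.Name, row.AverageRating);
}
```

Reporting 0 for an unrated restaurant is a design choice made here for simplicity; a nullable average may suit a real site better.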

achieving a complex sort via Linq to Objects

I've been asked to apply conditional sorting to a data set and I'm trying to figure out how to achieve this via LINQ. In this particular domain, purchase orders can be marked as primary or secondary. The exact mechanism used to determine primary/secondary status is rather complex and not germane to the problem at hand.
Consider the data set below.
Purchase Order Ship Date Shipping Address Total
6 1/16/2006 Tallahassee FL 500.45
19.1 2/25/2006 Milwaukee WI 255.69
5.1 4/11/2006 Chicago IL 199.99
8 5/16/2006 Fresno CA 458.22
19 7/3/2006 Seattle WA 151.55
5 5/1/2006 Avery UT 788.52
5.2 8/22/2006 Rice Lake MO 655.00
Secondary POs are those with a decimal number and primary POs are those with an integer number. The requirement I'm dealing with stipulates that when a user chooses to sort on a given column, the sort should only be applied to primary POs. Secondary POs are ignored for the purposes of sorting, but should still be listed below their primary PO in ship date descending order.
For example, let's say a user sorts on Shipping Address ascending. The data would be sorted as follows. Notice that if you ignore the secondary POs, the data is sorted by address ascending (Avery, Fresno, Seattle, Tallahassee).
Purchase Order Ship Date Shipping Address Total
5 5/1/2006 Avery UT 788.52
--5.2 8/22/2006 Rice Lake MO 655.00
--5.1 4/11/2006 Chicago IL 199.99
8 5/16/2006 Fresno CA 458.22
19 7/3/2006 Seattle WA 151.55
--19.1 2/25/2006 Milwaukee WI 255.69
6 1/16/2006 Tallahassee FL 500.45
Is there a way to achieve the desired effect using the OrderBy extension method? Or am I stuck (better off) applying the sort to the two data sets independently and then merging into a single result set?
public IList<PurchaseOrder> ApplySort(bool sortAsc)
{
var primary = purchaseOrders.Where(po => po.IsPrimary)
.OrderBy(po => po.ShippingAddress).ToList();
var secondary = purchaseOrders.Where(po => !po.IsPrimary)
.OrderByDescending(po => po.ShipDate).ToList();
//merge 2 lists somehow so that secondary POs are inserted after their primary
}
Have you seen the ThenBy and ThenByDescending methods?
purchaseOrders.Where(po => po.IsPrimary).OrderBy(po => po.ShippingAddress).ThenByDescending(x=>x.ShipDate).ToList();
I'm not sure this will fit your needs, because I don't quite understand how the final list should look (po.IsPrimary and !po.IsPrimary are confusing me).
The solution for your problem is GroupBy.
First order your object according to selected column:
var ordered = purchaseOrders.OrderBy(po => po.ShippingAddress);
Then you need to group your orders according to the primary order. I assumed the order number is a string, so I created a string IEqualityComparer like so:
class OrderComparer : IEqualityComparer<string>
{
public bool Equals(string x, string y)
{
x = x.Contains('.') ? x.Substring(0, x.IndexOf('.')) : x;
y = y.Contains('.') ? y.Substring(0, y.IndexOf('.')) : y;
return x.Equals(y);
}
public int GetHashCode(string obj)
{
return obj.Contains('.') ? obj.Substring(0, obj.IndexOf('.')).GetHashCode() : obj.GetHashCode();
}
}
and use it to group the orders:
var grouped = ordered.GroupBy(po => po.Order, new OrderComparer());
The result is a tree-like structure ordered by the ShippingAddress column and grouped by the primary order ID.
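To get from such groups to the flat list the question asks for, one option is to sort the groups by their primary PO's address and then flatten each group, primary first and secondaries by ship date descending. The sketch below is one way to do that under a few assumptions: the order number is a string, the part before the '.' identifies the primary PO, every group contains its primary, and the Total column is left out for brevity. It groups by the extracted prefix instead of using a custom comparer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sample data from the question (Total omitted).
var purchaseOrders = new[]
{
    (Order: "6",    ShipDate: new DateTime(2006, 1, 16), ShippingAddress: "Tallahassee FL"),
    (Order: "19.1", ShipDate: new DateTime(2006, 2, 25), ShippingAddress: "Milwaukee WI"),
    (Order: "5.1",  ShipDate: new DateTime(2006, 4, 11), ShippingAddress: "Chicago IL"),
    (Order: "8",    ShipDate: new DateTime(2006, 5, 16), ShippingAddress: "Fresno CA"),
    (Order: "19",   ShipDate: new DateTime(2006, 7, 3),  ShippingAddress: "Seattle WA"),
    (Order: "5",    ShipDate: new DateTime(2006, 5, 1),  ShippingAddress: "Avery UT"),
    (Order: "5.2",  ShipDate: new DateTime(2006, 8, 22), ShippingAddress: "Rice Lake MO"),
};

// The part before the '.' identifies the primary PO (an assumption about the numbering).
Func<string, string> primaryOf = order => order.Split('.')[0];

var sorted = purchaseOrders
    .GroupBy(po => primaryOf(po.Order))
    // Sort the groups by the primary PO's address only.
    .OrderBy(g => g.First(po => po.Order == g.Key).ShippingAddress)
    // Emit the primary, then its secondaries with the newest ship date first.
    .SelectMany(g => g.Where(po => po.Order == g.Key)
        .Concat(g.Where(po => po.Order != g.Key).OrderByDescending(po => po.ShipDate)))
    .ToList();

foreach (var po in sorted)
{
    Console.WriteLine("{0,-5} {1:d} {2}", po.Order, po.ShipDate, po.ShippingAddress);
}
```

For the sample data this yields the order 5, 5.2, 5.1, 8, 19, 19.1, 6, matching the table in the question. Sorting on another column only requires changing the key selector in OrderBy.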

Resources