LINQ - group by a name, but still get the ID

I'm trying to find duplicates in LINQ by a particular column (the name column), but I also want to return the unique ID, since I bind to the ID to display additional information about the row.
I've dug around on Stack Overflow, but can only find ways of finding duplicates of the following kinds:
By the whole object
By a particular property
Getting the number of duplicates
The closest thing I could find was specifying "Key" in my group by, but I'm unsure whether that works.
Ideally I'm hoping to output something that has the ID and the number of duplicates.
Thanks

Assuming you have a people collection:
from p in people
group p by p.Name into g
select new {
    Name = g.Key,
    NumberOfDuplicates = g.Count(),
    IDs = g.Select(x => x.ID)
}
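To check what the grouping should return, here is a rough Python sketch of the same logic, independent of LINQ. The sample rows and the find_duplicates helper are made up for illustration. It also drops names that occur only once, which the query above does not do (in LINQ that would be a `where g.Count() > 1` clause before the select).

```python
from collections import defaultdict

# Hypothetical sample rows; the real data would come from the people collection.
people = [
    {"ID": 1, "Name": "Ann"},
    {"ID": 2, "Name": "Bob"},
    {"ID": 3, "Name": "Ann"},
    {"ID": 4, "Name": "Ann"},
]


def find_duplicates(rows):
    """Group rows by Name and keep only the names that occur more than once."""
    ids_by_name = defaultdict(list)
    for row in rows:
        ids_by_name[row["Name"]].append(row["ID"])
    return [
        {"Name": name, "NumberOfDuplicates": len(ids), "IDs": ids}
        for name, ids in ids_by_name.items()
        if len(ids) > 1  # assumption: single occurrences are not wanted
    ]


duplicates = find_duplicates(people)
print(duplicates)
```

Each result keeps the full list of IDs, so any of them can be used to look up the rest of the row.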

Related

SQL Statement to delete only one row out of duplicates

So I am working in Ruby, and say I have 6 rows in a table of two columns that are exactly identical. In my case, my table "campaign_items" has two columns "campaign_name" and "item." I would like to delete only one row out of the 6 duplicates using a single query. I started with this:
db.exec("DELETE FROM products WHERE campaign_name = '#{camp_name}' AND product_type = 'fleecejacket' AND size = '#{size_array[index]}'")
Which of course deleted all items of that condition. So I found in another question an answer along these lines:
db.exec("DELETE FROM products a WHERE a.ctid <> (SELECT min(b.ctid) FROM products b WHERE a.key = b.key)")
However, this would delete all duplicates except for one. I have not found a way that only deletes a SINGLE row that has duplicates. Is there a delete top query that I am looking for? Thanks in advance.
Edit: I also have a column "id" which is a primary key.
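So I definitely overthought this. All that is needed is the following (assuming db is a PG::Connection from the pg gem, so that exec_params can bind the values instead of interpolating them into the SQL string):
# Pick one of the duplicates; it doesn't matter which.
x = db.exec_params("SELECT id FROM campaign_items WHERE campaign_name = $1 AND item = 'fleecejacket' LIMIT 1", [camp_name])
id = x[0]['id']
# Delete only that row, identified by its primary key.
db.exec_params("DELETE FROM campaign_items WHERE id = $1", [id])
Get the unique id from the first duplicate (since it doesn't matter which one is deleted) and delete the row with that id.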
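The same idea can probably be done in a single statement by picking the id in a subquery. Below is a minimal sketch using Python's built-in sqlite3 module as a stand-in database; the table layout follows the question, while the sample data ("spring") and the delete_one_duplicate helper are invented for the example. Whether the subquery form is accepted depends on the database (MySQL, for one, restricts subqueries on the table being deleted from).

```python
import sqlite3

# In-memory stand-in for the real database; the schema is assumed from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE campaign_items ("
    "id INTEGER PRIMARY KEY, campaign_name TEXT, item TEXT)"
)
conn.executemany(
    "INSERT INTO campaign_items (campaign_name, item) VALUES (?, ?)",
    [("spring", "fleecejacket")] * 6,  # six identical rows
)


def delete_one_duplicate(conn, camp_name, item):
    """Delete exactly one matching row, chosen by its smallest id."""
    cur = conn.execute(
        "DELETE FROM campaign_items WHERE id = ("
        "SELECT MIN(id) FROM campaign_items "
        "WHERE campaign_name = ? AND item = ?)",
        (camp_name, item),
    )
    return cur.rowcount  # number of rows actually deleted


deleted = delete_one_duplicate(conn, "spring", "fleecejacket")
remaining = conn.execute("SELECT COUNT(*) FROM campaign_items").fetchone()[0]
print(deleted, remaining)
```

If nothing matches, the subquery yields NULL and no row is deleted.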

join two tables in linq with special conditions

I hope someone can help me; I am new to LINQ.
I have two tables named tblCart and tblOrderDetail.
I show only some of their fields here to illustrate my problem:
tblCart:
ID,
CartID,
Barcode,
and tblOrderDetail:
ID,
CartID,
IsCompleted
Barcode
When someone saves an order, before they confirm the request, one row is temporarily inserted into tblCart.
Then, if they confirm the request, another row is inserted into tblOrderDetail.
Now I want to hide the rows that have been inserted into tblOrderDetail (showing just the temporary rows that exist only in tblCart).
In other words, if there is a row in tblCart with CartID=1 and at the same time a row with CartID=1 in tblOrderDetail, then I don't want that row.
All in all, I want just the rows that are not in tblOrderDetail, and the field to match on is CartID.
I should mention that I set IsCompleted=true on those rows, so that flag could also be used to exclude the rows we do not want.
I did this:
var cartItems = context.tblCarts
.Join(context.tblSiteOrderDetails,
w => w.CartID,
orderDetail => orderDetail.cartID,
(w,orderDetail) => new{w,orderDetail})
.Where(a=>a.orderDetail.cartID !=a.w.CartID)
.ToList()
However, it doesn't work.
An example:
tblCart:
ID=1
CartID=1213
Barcode=4567
ID=2
CartID=1214
Barcode=4567
ID=3
CartID=1215
Barcode=6576
tblOrderDetail:
ID=2
CartID=1213
Barcode=4567
IsCompleted=true
With this data it should show just the last two rows in tblCart, that is:
ID=2
CartID=1214
Barcode=4567
ID=3
CartID=1215
Barcode=6576
This sounds like a case for WHERE NOT EXISTS in SQL.
Roughly translated, that would be something like this in LINQ:
var cartItems = context.tblCarts.Where(crt => !context.tblSiteOrderDetails.Any(od => od.cartID == crt.CartID));
If you have a navigation property on cart to reference details (I'll assume it's called Details), then:
var results=context.tblCarts.Where(c=>!c.Details.Any(d=>d.IsCompleted));
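To make the expected result concrete, here is a small Python sketch of the same "not exists" filter, run against the sample data from the question. The pending_cart_rows helper is a made-up name; it only illustrates the logic and is not what EF would execute.

```python
# Sample rows copied from the question.
cart = [
    {"ID": 1, "CartID": 1213, "Barcode": 4567},
    {"ID": 2, "CartID": 1214, "Barcode": 4567},
    {"ID": 3, "CartID": 1215, "Barcode": 6576},
]
order_details = [
    {"ID": 2, "CartID": 1213, "Barcode": 4567, "IsCompleted": True},
]


def pending_cart_rows(cart_rows, detail_rows):
    """Keep the cart rows whose CartID has no matching order detail row."""
    ordered_cart_ids = {d["CartID"] for d in detail_rows}
    return [c for c in cart_rows if c["CartID"] not in ordered_cart_ids]


pending = pending_cart_rows(cart, order_details)
print([row["ID"] for row in pending])
```

A plain inner join cannot express this, because it only ever produces rows that have a match on both sides.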

Find Maximum Columns in a grouped row. [using PIG]

I have to find the maximum number of posts created by one person in a given data set, where each post comes with user id, display name, age, comments count, view count, date, score and title.
To get the maximum number of posts, I think we can group by user id. After grouping, I need to find the id that has the most entries. I don't understand how to solve the latter part. Please help.
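Here is an answer based on my understanding of your question.
Try this code:
a = load '<path>' using PigStorage(',') as (userId,displayName,age,commentsCount,viewCount,date,score,title);
b = group a by userId;
-- one row per user with the number of posts
c = foreach b generate group as userId, COUNT(a.title) as posts;
-- sort by post count and keep the user with the most posts
d = order c by posts desc;
e = limit d 1;
dump e;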
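For comparison, the group-and-count step followed by picking the largest count can be sketched in a few lines of Python. The (userId, title) pairs and the top_poster helper are invented for the example; this is only a way to check the logic on a small sample, not a replacement for the Pig job.

```python
from collections import Counter

# Hypothetical (userId, title) pairs standing in for the loaded relation.
posts = [
    ("u1", "A"),
    ("u2", "B"),
    ("u2", "C"),
    ("u3", "D"),
    ("u2", "E"),
]


def top_poster(rows):
    """Count posts per user and return the (userId, count) pair with the most posts."""
    counts = Counter(user_id for user_id, _title in rows)
    # most_common(1) returns a one-element list; this raises IndexError on empty input.
    return counts.most_common(1)[0]


top = top_poster(posts)
print(top)
```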

Get first record of each entity order by a column

I have a LINQ query that fetches student assessment data, something like
new {x.StudentId, x.StudentAssessmentId, x.AssessmentName, x.SubmittedDate}
Then I perform some operations on this list to keep only the last added assessment per student; I find the last assessment by taking the max StudentAssessmentId,
so I finally get the last assessment data for all the students.
Is there a way to do this directly in the initial query?
I thought about grouping the results by StudentId and selecting the max of StudentAssessmentId, like
group x.StudentAssessmentId by x.StudentId
select new {x.Key, x.Max()}
This way I get each student with their last StudentAssessmentId, which is what I want, but it only gives me the assessment ids, while I also want the other data such as AssessmentName, SubmittedDate, etc.
Try something like this:
group x.StudentAssessmentId
by new {
    x.StudentId,
    x.AssessmentName,
    x.SubmittedDate }
into g
select new
{
    g.Key.StudentId,
    g.Key.AssessmentName,
    g.Key.SubmittedDate,
    // an anonymous type member built from a method call needs an explicit name
    StudentAssessmentId = g.Max(),
}
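Note that grouping by StudentId, AssessmentName and SubmittedDate together yields one row per distinct combination, so a student with several assessments would still appear several times. If the goal is exactly one row per student, the logic is "group by StudentId only and keep the whole row with the highest StudentAssessmentId". A rough Python sketch of that logic, with made-up sample data and a hypothetical latest_per_student helper:

```python
# Hypothetical sample rows with the four fields from the question.
assessments = [
    {"StudentId": 1, "StudentAssessmentId": 10,
     "AssessmentName": "Quiz 1", "SubmittedDate": "2020-01-05"},
    {"StudentId": 1, "StudentAssessmentId": 12,
     "AssessmentName": "Quiz 2", "SubmittedDate": "2020-02-01"},
    {"StudentId": 2, "StudentAssessmentId": 11,
     "AssessmentName": "Quiz 1", "SubmittedDate": "2020-01-06"},
]


def latest_per_student(rows):
    """Return {StudentId: row}, keeping the row with the largest StudentAssessmentId."""
    latest = {}
    for row in rows:
        current = latest.get(row["StudentId"])
        if current is None or row["StudentAssessmentId"] > current["StudentAssessmentId"]:
            latest[row["StudentId"]] = row
    return latest


latest = latest_per_student(assessments)
for student_id, row in latest.items():
    print(student_id, row["AssessmentName"], row["SubmittedDate"])
```

Because the whole row is kept, AssessmentName and SubmittedDate come along with the max id.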

Using Linq to bring back last 3,4...n orders for every customer

I have a database with customers orders.
I want to use Linq (to EF) to query the db to bring back the last(most recent) 3,4...n orders for every customer.
Note:
Customer 1 may have just made 12 orders in the last hr; but customer 2 may not have made any since last week.
I can't for the life of me work out how to write the query in LINQ (lambda expressions) to get the data set back.
Any good ideas?
Edit:
Customers and orders is a simplification. The table I am querying is actually a record of outbound messages to various web services. It just seemed easier to describe as customers and orders. The relationship is the same.
I am building a task that checks the last n messages for each web service to see if there were any failures. We want a semi-real-time health status of the web services.
@CoreySunwold
My table Looks a bit like this:
MessageID, WebserviceID, SentTime, Status, Message, Error,
Or from a customer/order context, if it makes it easier:
OrderID, CustomerID, StatusChangedDate, Status, WidgetName, Comments
Edit 2:
I eventually worked out something
(Hat tip to @StephenChung, who basically came up with the exact same thing, but in classic LINQ query syntax)
var q = myTable.Where(d => d.EndTime > DateTime.Now.AddDays(-1))
.GroupBy(g => g.ConfigID)
.Select(g =>new
{
ConfigID = g.Key,
Data = g.OrderByDescending(d => d.EndTime)
.Take(3).Select(s => new
{
s.Status,
s.SentTime
})
}).ToList();
It does take a while to execute. So I am not sure if this is the most efficient expression.
This should give the last 3 orders of each customer (if having orders at all):
from o in db.Orders
group o by o.CustomerID into g
select new {
CustomerID=g.Key,
LastOrders=g.OrderByDescending(o => o.TimeEntered).Take(3).ToList()
}
However, I suspect this will force the database to return the entire Orders table before picking out the last 3 for each customer. Check the SQL generated.
If you need to optimize, you'll have to write the SQL by hand so that it returns only up to the last 3 orders per customer, then turn it into a view.
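As a sanity check on what the grouped query should return, here is a small in-memory Python sketch of "group by customer, sort newest first, take n". The sample orders and the last_orders helper are hypothetical, and this says nothing about the SQL that EF generates.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical orders: customer 1 ordered four times in one morning, customer 2 once, earlier.
orders = [
    {"OrderID": 1, "CustomerID": 1, "TimeEntered": datetime(2020, 1, 1, 9, 0)},
    {"OrderID": 2, "CustomerID": 1, "TimeEntered": datetime(2020, 1, 1, 9, 5)},
    {"OrderID": 3, "CustomerID": 1, "TimeEntered": datetime(2020, 1, 1, 9, 10)},
    {"OrderID": 4, "CustomerID": 1, "TimeEntered": datetime(2020, 1, 1, 9, 15)},
    {"OrderID": 5, "CustomerID": 2, "TimeEntered": datetime(2019, 12, 25, 12, 0)},
]


def last_orders(rows, n):
    """Return {CustomerID: up to n most recent orders, newest first}."""
    by_customer = defaultdict(list)
    for row in rows:
        by_customer[row["CustomerID"]].append(row)
    return {
        customer_id: sorted(group, key=lambda o: o["TimeEntered"], reverse=True)[:n]
        for customer_id, group in by_customer.items()
    }


result = last_orders(orders, 3)
for customer_id, recent in result.items():
    print(customer_id, [o["OrderID"] for o in recent])
```

The cut-off is applied per customer, so a customer with a single old order still shows up with that one order.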
You can use SelectMany for this purpose:
customers.SelectMany(x=>x.orders.OrderByDescending(y=>y.Date).Take(n)).ToList();
How about this? I know it'll work with regular collections but don't know about EF.
yourCollection.OrderByDescending(item=>item.Date).Take(n);
var ordersByCustomer =
db.Customers.Select(c=>c.Orders.OrderByDescending(o=>o.OrderID).Take(n));
This will return the orders grouped by customer.
var lastOrders = orders.Where(x => x.CustomerID == 1).OrderByDescending(x => x.Date).Take(4);
This takes the last 4 orders of a single customer. The specific query depends on your table / entity structure.
By the way, you can read x as an order: get the orders where order.CustomerID equals 1, order them by order.Date descending, and take the first 4 rows.
Somebody might correct me here, but I think doing this in LINQ with a single query is probably very difficult, if not impossible. I would use a stored procedure and something like this:
select
*
from
(
select
o.*
,RANK() OVER (PARTITION BY c.id ORDER BY o.order_time DESC) AS rnk
from
customers c
inner join
orders o
on
o.cust_id = c.id
) ranked
where
rnk <= 10 -- 10 is "n"; the rank has to be filtered outside the subquery
I've not used this syntax for a while so it might not be quite right, but if I understand the question then I think this is the best approach.
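The window-function approach can be tried out quickly with Python's built-in sqlite3 module. This is only a sketch: SQLite stands in for the real database, the schema and sample rows are assumptions based on the column names used above, and window functions need SQLite 3.25 or newer. The rank is computed in a subquery because a window function cannot be filtered in the WHERE clause of the same SELECT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY, cust_id INTEGER, order_time TEXT);
    INSERT INTO customers (id) VALUES (1), (2);
    INSERT INTO orders (cust_id, order_time) VALUES
        (1, '2020-01-01'), (1, '2020-01-02'),
        (1, '2020-01-03'), (1, '2020-01-04'),
        (2, '2019-12-25');
    """
)


def last_n_orders(conn, n):
    """Return (cust_id, order_time) for the n most recent orders of each customer."""
    return conn.execute(
        """
        SELECT cust_id, order_time
        FROM (
            SELECT o.cust_id, o.order_time,
                   RANK() OVER (PARTITION BY o.cust_id
                                ORDER BY o.order_time DESC) AS rnk
            FROM customers c
            JOIN orders o ON o.cust_id = c.id
        ) AS ranked
        WHERE rnk <= ?
        ORDER BY cust_id, order_time DESC
        """,
        (n,),
    ).fetchall()


rows = last_n_orders(conn, 3)
print(rows)
```

RANK() gives tied timestamps the same rank, so ties can return more than n rows per customer; ROW_NUMBER() would cap it at exactly n.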
