Fetching the highest value from CosmosDb table - linq

With the SDK of Azure.Data.Tables I’m trying to write a query that groups the data and fetches the highest value from each group. Is there a way to achieve this?
Currently I’m fetching all the data and executing the following LINQ query:
public class SomeClass
{
public string CompanyName { get; set; }
public long SomeValue { get; set; }
public string ProperyAttribute1 { get; set; }
public string ProperyAttribute2 { get; set; }
public long ProperyAttribute3 { get; set; }
}
List<SomeClass> someList = FetchingDataFromCosmosDbTableStorage(); //fetching all the data
var result = someList.GroupBy(x => x.CompanyName)
.Select(y => y.OrderByDescending(i => i.SomeValue).First())
.ToList();
Instead of filtering all the data in my application I would prefer to write a query to get the same result from CosmosDb Table.
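As far as I know, the Table API only supports filtering and column selection, with no server-side grouping or aggregation, so some client-side step is likely to remain. The sketch below is not an Azure.Data.Tables call. It only shows a single-pass alternative to the GroupBy/OrderByDescending query above, which avoids sorting every group. The class is trimmed to the two properties that matter, the names TopPerCompany and Pick are hypothetical, and CompanyName is assumed to be non-null.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class SomeClass
{
    public string CompanyName { get; set; }
    public long SomeValue { get; set; }
}

public static class TopPerCompany
{
    // Returns, for each CompanyName, the row with the largest SomeValue.
    // The input is scanned once; on ties the first row seen is kept,
    // matching OrderByDescending(...).First() on a stable sort.
    public static List<SomeClass> Pick(IEnumerable<SomeClass> rows)
    {
        var best = new Dictionary<string, SomeClass>();
        foreach (var row in rows)
        {
            if (!best.TryGetValue(row.CompanyName, out var current)
                || row.SomeValue > current.SomeValue)
            {
                best[row.CompanyName] = row;
            }
        }
        return best.Values.ToList();
    }
}
```

Calling TopPerCompany.Pick(someList) would replace the GroupBy query. If the data still has to be downloaded in full, requesting only the columns you need is probably the main saving available on the query side.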

Related

Filter list by grandchild using LINQ

I need to filter a list by the DamageCodeName field in the DamageCode class.
public partial class DamageCategory
{
public string DamageCategoryId { get; set; }
public string CategoryName { get; set; }
}
public partial class DamageGroup
{
public string DamageGroupId { get; set; }
public string DamageCategoryId { get; set; }
public string GroupName { get; set; }
}
public partial class DamageCode
{
public string DamageCodeId { get; set; }
public string DamageGroupId { get; set; }
public string DamageCodeName { get; set; }
}
I pull the records into a list using EF Core 5:
private List<DamageCategory> _DamageCodeList { get; set; } = new();
// ToListAsync returns a Task, so it has to be awaited before assignment
_DamageCodeList = await _contextDB.DamageCategories
.Include(i => i.DamageGroups)
.ThenInclude(d => d.DamageCodes).AsSingleQuery().ToListAsync();
Now I need to filter this list by the DamageCode.DamageCodeName property.
private string _SearchText { get; set; } = "Bubble";
private List<DamageCategory> _CategoryList { get; set; } = new();
_CategoryList = _DamageCodeList.Where(g => g.DamageGroups.SelectMany(c => c.DamageCodes
.Where(w => w.DamageCodeName.ToLower().Contains(_SearchText.ToLower()))).Any()).ToList();
The code above only filters at the DamageCategory level. For each matching category it still brings back all of its DamageGroup records and all of their DamageCode records.
I need the LINQ query to bring back only the DamageCategory, DamageGroup, and DamageCode records for which DamageCode.DamageCodeName.Contains("Bubble") is true.
Here is the SQL that produces the result I need:
SELECT
CT.[DamageCategoryID],
CT.[CategoryName],
DG.[DamageGroupID],
DG.[DamageCategoryID],
DG.[GroupName],
DC.[DamageCodeID],
DC.[DamageGroupID],
DC.[DamageCodeName]
FROM
[dbo].[DamageCategory] AS CT
INNER JOIN [dbo].[DamageGroup] AS DG ON CT.[DamageCategoryID] = DG.[DamageCategoryID]
INNER JOIN [dbo].[DamageCode] AS DC ON DG.[DamageGroupID] = DC.[DamageGroupID]
WHERE
DC.[DamageCodeName] LIKE '%Bubble%'
This is where query syntax shines.
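// Each range variable needs its own name: cat = category, dg = group, dc = code
from cat in _contextDB.DamageCategories
from dg in cat.DamageGroups
from dc in dg.DamageCodes
where dc.DamageCodeName.Contains("Bubble")
select new
{
cat.DamageCategoryId,
cat.CategoryName,
dg.DamageGroupId,
dg.GroupName,
dc.DamageCodeId,
dc.DamageCodeName
// dg.DamageCategoryId and dc.DamageGroupId are omitted: they equal the keys above,
// and an anonymous type cannot contain two members with the same name
}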
The query shape from ... from is the query syntax equivalent of SelectMany.
You use ToLower in your code. That may not be necessary. The query is translated into SQL and if the database field has a case-insensitive collation you don't need ToLower.
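For comparison, here is a minimal in-memory sketch of the same shape written with SelectMany in method syntax. The classes are cut-down stand-ins. The DamageGroups and DamageCodes navigation collections are assumed from the Include call in the question, and the names DamageSearch and Find are hypothetical. The case-insensitive Contains overload is used here for LINQ to Objects only and may not translate in an EF Core query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class DamageCode
{
    public string DamageCodeName { get; set; }
}

public class DamageGroup
{
    public string GroupName { get; set; }
    public List<DamageCode> DamageCodes { get; set; } = new();
}

public class DamageCategory
{
    public string CategoryName { get; set; }
    public List<DamageGroup> DamageGroups { get; set; } = new();
}

public static class DamageSearch
{
    // Flattens category -> group -> code and keeps only the rows whose
    // code name contains the search text (ignoring case).
    public static List<(string Category, string Group, string Code)> Find(
        IEnumerable<DamageCategory> categories, string text)
    {
        return categories
            .SelectMany(cat => cat.DamageGroups, (cat, dg) => new { cat, dg })
            .SelectMany(x => x.dg.DamageCodes, (x, dc) => new { x.cat, x.dg, dc })
            .Where(x => x.dc.DamageCodeName.Contains(text, StringComparison.OrdinalIgnoreCase))
            .Select(x => (Category: x.cat.CategoryName,
                          Group: x.dg.GroupName,
                          Code: x.dc.DamageCodeName))
            .ToList();
    }
}
```

Each `from` clause after the first corresponds to one SelectMany call, and the second lambda carries the outer element along so that it is still available in the final projection.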

Linq SQL EF Core subselect always requests ALL columns for tables in subquery from database

I have an ASP.NET Core 3.1 Web API with EF Core running in front of SQL Server. The front end is Angular.
There is a function where someone is able to select the campaigns that are running for multiple selected clients.
The models involved are:
public partial class Campaign
{
public int CampaignId { get; set; }
public string Code { get; set; }
public string Description { get; set; }
public DateTime StartDate { get; set; }
public DateTime EndDate { get; set; }
public int AdvertiserId { get; set; }
public int SalesHouseId { get; set; }
}
public partial class Customer
{
public int CustomerId { get; set; }
public string CompanyName { get; set; }
public string SearchCode { get; set; }
public string PhoneNumber1 { get; set; }
public string PhoneNumber2 { get; set; }
public string FaxNumber { get; set; }
public string Email { get; set; }
public string CompanyURL { get; set; }
public bool IsOnHold { get; set; }
public DateTime? OnHoldSince { get; set; }
public string OnHoldBy { get; set; }
public .... more fields { get; set; }
}
I use the following LINQ query to pull the campaigns for the selected clients (example):
List<int> selectedClients = new List<int>() { 10414, 19529, 14025 };
var cams = (from cam in _dbContext.Campaigns
join cl in (from cus in _dbContext.Customers.Where(cus =>
selectedClients.Contains(cus.CustomerId))
select new { cus.CustomerId, cus.CompanyName })
on cam.AdvertiserId equals cl.CustomerId
select new { cam.CampaignId, cam.Description, cam.StartDate, cl.CompanyName})
.AsNoTracking()
.ToList();
The result returned is good and all works fine.
But when I checked the SQL that is sent to the server, I noticed that the subquery requests ALL columns:
SELECT [c].[CampaignId], [c].[Description], [c].[StartDate], [t].[CompanyName]
FROM [Campaigns] AS [c]
INNER JOIN (
SELECT [c0].[CustomerId], [c0].[ACEndDate], [c0].[ACStartDate], [c0].[ActiveSince], [c0].[AgencyCommission], [c0].[Balance], [c0].[BalanceDate], [c0].[BankAccount], [c0].[CompanyName], [c0].[CompanyURL], [c0].[CreditLimit], [c0].[DebtorNumber], [c0].[Email], [c0].[EmailInvoice], [c0].[EmailOrderConfirmation], [c0].[FaxNumber], [c0].[FinanciallyResponsibleCustId], [c0].[InactiveSince], [c0].[InvDueDays], [c0].[InvFreqMargin], [c0].[InvoiceFrequency], [c0].[InvoiceInfo], [c0].[IsActive], [c0].[IsAdvertiser], [c0].[IsAdvertisingAgency], [c0].[IsAgency], [c0].[IsApprovedAgency], [c0].[IsDirectAdvertiser], [c0].[IsFinanciallyResponsible], [c0].[IsHolding], [c0].[IsOnHold], [c0].[IsOther], [c0].[IsProductionCompany], [c0].[IsTaxable], [c0].[OnHoldBy], [c0].[OnHoldReason], [c0].[OnHoldSince], [c0].[PaymentMethod], [c0].[PhoneNumber1], [c0].[PhoneNumber2], [c0].[PrintOrderConfirmation], [c0].[PrintZeroInvoices], [c0].[SearchCode], [c0].[SetInactiveBy], [c0].[VATId], [c0].[VATNumber]
FROM [Customers] AS [c0]
WHERE [c0].[CustomerId] IN (10414, 19529, 14025)
) AS [t] ON [c].[AdvertiserId] = [t].[CustomerId]
ALL columns from the table in the subquery are selected. I only need the [CustomerId] and [CompanyName] fields.
I tried to use a DTO instead of the anonymous select in the subquery but that does not help either.
If the subselect consists of more tables, all columns of all tables in that sub select will be requested from the database.
Does anyone have an idea how to limit the columns from the sub-select here?
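One shape that may be worth trying is to drop the nested subquery and join the two sets directly, moving the Contains filter into a where clause. The sketch below runs against in-memory IQueryable sources only. I have not verified what SQL EF Core 3.1 generates for it, so check the logged SQL before relying on it. The cut-down records, the CampaignRow DTO, and the names CampaignQueries and ForClients are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public record Campaign(int CampaignId, string Description, DateTime StartDate, int AdvertiserId);
public record Customer(int CustomerId, string CompanyName);
public record CampaignRow(int CampaignId, string Description, DateTime StartDate, string CompanyName);

public static class CampaignQueries
{
    // Flat join: no derived table, the client filter sits in the where clause,
    // and the projection names only the columns that are actually needed.
    public static List<CampaignRow> ForClients(
        IQueryable<Campaign> campaigns,
        IQueryable<Customer> customers,
        List<int> selectedClients)
    {
        var query =
            from cam in campaigns
            join cus in customers on cam.AdvertiserId equals cus.CustomerId
            where selectedClients.Contains(cus.CustomerId)
            select new CampaignRow(cam.CampaignId, cam.Description, cam.StartDate, cus.CompanyName);

        return query.ToList();
    }
}
```

With a real context the call would look like CampaignQueries.ForClients(_dbContext.Campaigns.AsNoTracking(), _dbContext.Customers.AsNoTracking(), selectedClients). The rows returned should match the original query, since filtering the customers before or after an inner join produces the same result.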

RavenDB using filter with group by

I have a Transaction entity.
I can group by (CustomerCode, CustomerName) and then select CustomerCode and Total(Amount).
That part is easy. But when I also want to filter on AtCreated, I get an error:
Unhandled exception. Raven.Client.Exceptions.InvalidQueryException: Raven.Client.Exceptions.InvalidQueryException: Field 'AtCreated' isn't neither an aggregation operation nor part of the group by key
Query: from Transactions group by CustomerCode, CustomerName where AtCreated >= $p0 select CustomerCode, count() as Total
Parameters: {"p0":"2019-01-01T00:00:00.0000000"}
public class Transactions
{
public string Id { get; set; }
public long TransId { get; set; }
public DateTime AtCreated { get; set; }
public string CustomerCode { get; set; }
public string CustomerName { get; set; }
public string City { get; set; }
public double Amount { get; set; }
public string GXF { get; set; }
}
var transactList = session.Query<Transactions>()
.Where(a=>a.AtCreated >= new DateTime(2019,01,01))
.GroupBy(a => new {a.CustomerCode, a.CustomerName})
.Select(a => new {a.Key.CustomerCode, Total = a.Count()})
.ToList();
How can I group the filtered data?
Thank You.
Create a Map-Reduce Index and then query on it.
https://ravendb.net/docs/article-page/4.2/csharp/indexes/map-reduce-indexes
In the example on that page, you can query on the 'Category' field because it was indexed (meaning it was part of the Map-Reduce index definition).
See short demo examples in:
https://demo.ravendb.net/

Fetch single value with LINQ projection query without using FirstOrDefault

I am using Entity Framework and this is my view model:
public class UserDetailsModel:CityModel
{
public int Id { get; set; }
public string Fullname { get; set; }
}
public class VendorInCategoryModel
{
public int CategoryId { get; set; }
public int VendorId { get; set; }
public virtual CategoryMasterModel CategoryMaster { get; set; }
public virtual UserDetailsModel UserDetails { get; set; }
}
public class CategoryMasterModel
{
public int CategoryId { get; set; }
public string CategoryName { get; set; }
}
This is my query to fetch vendor details along with category details of particular vendor say v001:
UserDetailsModel workerDetails = context.UserDetails.
Where(d => d.Id == _vendorId).
Select(d => new UserDetailsModel
{
Id = d.Id,
Fullname = d.Fullname,
CategoryId = d.VendorInCategory.Select(v => v.CategoryId).FirstOrDefault(),
}).SingleOrDefault();
Here I have used FirstOrDefault to fetch the CategoryId (which is a single value).
But I don't want to use FirstOrDefault, because I have used it in many queries and it gives me the wrong output in some cases.
When I write SingleOrDefault in place of FirstOrDefault, it throws an error telling me to use FirstOrDefault.
So how can I overcome this? Can anybody please help me?
It looks like your outer select may be capable of returning multiple results (e.g. if there is more than one UserDetailsModel with the same Id). If it returns multiple results, your call to .SingleOrDefault() will throw an exception, because it expects either a single result or no results. See LINQ: When to use SingleOrDefault vs. FirstOrDefault() with filtering criteria for more details.
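The difference between the two operators is easy to see on in-memory sequences. This is a minimal LINQ to Objects sketch, not the Entity Framework query from the question, and the names SingleVsFirst and Describe are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SingleVsFirst
{
    // Reports what FirstOrDefault and SingleOrDefault do for the same sequence.
    public static string Describe(IEnumerable<int> categoryIds)
    {
        // FirstOrDefault never throws on length: it returns the first
        // element, or default(int) == 0 for an empty sequence.
        int first = categoryIds.FirstOrDefault();
        try
        {
            // SingleOrDefault returns the default for an empty sequence, the element
            // for exactly one, and throws InvalidOperationException for two or more.
            int single = categoryIds.SingleOrDefault();
            return $"first={first}, single={single}";
        }
        catch (InvalidOperationException)
        {
            return $"first={first}, single threw";
        }
    }
}
```

Describe(new int[0]) returns "first=0, single=0", Describe(new[] { 7 }) returns "first=7, single=7", and Describe(new[] { 7, 9 }) returns "first=7, single threw". So FirstOrDefault can silently pick one of several rows, while SingleOrDefault makes the duplicate visible.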

RavenDB SelectMany not supported

I am trying to find one or more documents in RavenDB based on the values of a child collection.
I have the following classes
public class GoldenDocument
{
public GoldenDocument()
{
LinkedDocuments = new List<LinkedDocument>();
MergeMatchFields = new List<MergeMatchField>();
}
public string Id { get; set; }
public Guid SourceRowId { get; set; }
public List<MergeMatchField> MergeMatchFields { get; set; }
public List<LinkedDocument> LinkedDocuments { get; set; }
}
And the class that is in the collection MergeMatchFields
public class MergeMatchField
{
public string Id { get; set; }
public Guid OriginId { get; set; }
public string Name { get; set; }
public MatchType MatchType { get; set; }
public double MatchPerc { get; set; }
public string Value { get; set; }
}
In a List<MergeFields> mergeFields collection I have values that are not stored in RavenDB yet. These values are compared to the values in a RavenDB document to find out whether it is a possible match, by executing the following query:
using (var session = documentStore.OpenSession())
{
var docs = from gd in session.Query<GoldenDocument>()
from mf in gd.MergeMatchFields
from tf in mergeFields
where mf.Name == tf.Name
&& JaroWinklerCalculator.jaroWinkler(mf.Value, tf.Value) > .90d
&& !string.IsNullOrEmpty(mf.Value)
select gd;
}
I understand that RavenDB does not support SelectMany(), so how would I go about getting the results from the document store?
Create an index for this that would output the values you want to query on.
Note that you can't just execute arbitrary code inside the query the way you do here: JaroWinklerCalculator.jaroWinkler(mf.Value, tf.Value) > .90d
But you can use fuzzy queries, which give you a similar kind of approximate matching.
