var response = client.Search<Timeline>(
    x => x.Query(
            q => q.Bool(
                b => b.Must(queryContainer)))
        .Size(0)
        .Aggregations(a => a
            .DateRange("last_24_hours",
                f => f.Field(n => n.server_time)
                    .Ranges(z => z.From(DateMath.Now.Subtract("24h")).To(DateMath.Now))
                    .Aggregations(
                        agg => agg.DateHistogram("widget_clicked_by_hour",
                            p => p.Field(z => z.server_time)
                                .Interval(DateInterval.Hour)
                                .Format("yyyy-MM-dd hh:mm")
                                .OrderDescending("_key"))))
        )
);
I'm trying to get the items from the widget_clicked_by_hour aggregation, but the NEST .NET library doesn't give me access to the items list, although I can see the list while debugging.
To get the date histogram buckets for each date range bucket, you can do this:
var dateRange = response.Aggs.DateRange("last_24_hours");
foreach (var rangeBucket in dateRange.Buckets)
{
    var dateHistogram = rangeBucket.DateHistogram("widget_clicked_by_hour");
    foreach (var histogramBucket in dateHistogram.Buckets)
    {
        // do something with bucket
    }
}
Since the date histogram aggregation is a sub-aggregation of the date range aggregation, it can be accessed from each bucket in the date range aggregation.
I would suggest 2 things that helped me immensely.
1) I would install the Sense plugin for Chrome:
https://chrome.google.com/webstore/detail/sense-beta/lhjgkmllcaadmopgmanpapmpjgmfcfig?hl=en
This gives you a very user-friendly way to build your Elasticsearch queries and analysis right in the browser.
2) I would look into using the cardinality aggregation:
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-cardinality-aggregation.html
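If you are trying to get a list of items with their counts, note that the cardinality aggregation returns only an approximate count of distinct values; a terms aggregation returns the items themselves along with their document counts (which you can use or ignore).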
I want to fetch some records between two dates using an Elasticsearch query.
First, I check the number of records between the two dates to see whether it is greater than 10000. If it is, I try to fetch them 10000 at a time.
// get count
var result_count = client.Count<TelegramMessageStructure>(s => s
    .AllTypes()
    .AllIndices()
    .Query(q => q
        .DateRange(r => r
            .Field(f => f.messageDate)
            .GreaterThanOrEquals("2018-06-03 00:00:00.000")
            .LessThan("2018-06-03 00:59:00.000")
        )
    )
);
long count = result_count.Count; // count = 27000
It returns 27000, so I want to fetch them 10000 at a time. I use this query to do that:
int MaxMessageCountPerQuery = 10000;
for (int i = 0; i < count; i += MaxMessageCountPerQuery)
{
    client = new ElasticClient(connectionSettings);
    // No change whether the client is renewed or not
    var result = client.Search<TelegramMessageStructure>(s => s
        .AllTypes()
        .AllIndices()
        .MatchAll()
        .From(i)
        .Size(MaxMessageCountPerQuery)
        .Sort(ss => ss.Ascending(p => p.id))
        .Query(q => q
            .DateRange(r => r
                .Field(f => f.messageDate)
                .GreaterThanOrEquals("2018-06-03 00:00:00.000")
                .LessThan("2018-06-03 00:59:00.000")
            )
        )
    );
    // when i=0, result.Documents contains 10000 records; otherwise it has 0
}
In the first round, when i=0, result.Documents contains 10000 records; in every later round it contains 0 records.
What is wrong with this?
Based on this link:
scroll in elastic net-api
Your code should contain the following steps:
1- Search with all the parameters you need plus .Scroll("5m") (I assume From(0) and Size(10000) are set too), and save the response in the result variable.
2- Now you have the first 10000 records (in result.Documents).
3- To receive more records, pass the ScrollId to the Scroll call. (Each call of the code below gives you the next 10000 records.)
var result_new = client.Scroll<TelegramMessageStructure>("10m", result.ScrollId);
In fact, your code should look like this:
int MaxMessageCountPerQuery = 10000;
client = new ElasticClient(connectionSettings);
// No change whether the client is renewed or not
var result = client.Search<TelegramMessageStructure>(s => s
    .AllTypes()
    .AllIndices()
    .MatchAll()
    .From(0) // the loop variable does not exist yet; a scroll starts at the first hit
    .Size(MaxMessageCountPerQuery)
    .Sort(ss => ss.Ascending(p => p.id))
    .Query(q => q
        .DateRange(r => r
            .Field(f => f.messageDate)
            .GreaterThanOrEquals("2018-06-03 00:00:00.000")
            .LessThan("2018-06-03 00:59:00.000")
        )
    )
    .Scroll("5m") // Add this parameter
);
// TODO some code:
// save and use result.Documents
var scrollId = result.ScrollId;
// The first 10000 records are already in result, so start at the second page
for (long i = MaxMessageCountPerQuery; i < result.Total; i += MaxMessageCountPerQuery)
{
    // Add this line to the loop; each call returns the next 10000 records
    var result_new = client.Scroll<TelegramMessageStructure>("10m", scrollId);
    scrollId = result_new.ScrollId; // pass the most recent scroll id to the next call
    // TODO some code:
    // save and use result_new.Documents
}
Elasticsearch sets index.max_result_window to 10000 by default, and the reason is well explained at
https://www.elastic.co/guide/en/elasticsearch/guide/current/pagination.html
To understand why deep paging is problematic, let’s imagine that we
are searching within a single index with five primary shards. When we
request the first page of results (results 1 to 10), each shard
produces its own top 10 results and returns them to the coordinating
node, which then sorts all 50 results in order to select the overall
top 10.
Now imagine that we ask for page 1,000—results 10,001 to 10,010.
Everything works in the same way except that each shard has to produce
its top 10,010 results. The coordinating node then sorts through all
50,050 results and discards 50,040 of them!
You can see that, in a distributed system, the cost of sorting results
grows exponentially the deeper we page. There is a good reason that
web search engines don’t return more than 1,000 results for any query.
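The arithmetic in the quoted passage can be reproduced with a small sketch. The class and method names (DeepPagingCost, ResultsSortedByCoordinator, ResultsDiscarded) are hypothetical; they only model the counting described in the quote and are not part of Elasticsearch or NEST.

```csharp
public static class DeepPagingCost
{
    // Each shard returns its own top (from + size) hits, so the coordinating
    // node has to sort shards * (from + size) hits in total.
    public static long ResultsSortedByCoordinator(int shards, long from, long size)
    {
        return shards * (from + size);
    }

    // Everything except the requested page is thrown away after sorting.
    public static long ResultsDiscarded(int shards, long from, long size)
    {
        return ResultsSortedByCoordinator(shards, from, size) - size;
    }
}
```

With 5 shards, ResultsSortedByCoordinator(5, 0, 10) covers the first page, and ResultsSortedByCoordinator(5, 10000, 10) covers results 10,001 to 10,010 from the quote.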
I am using NEST. The number of buckets returned from the Elasticsearch aggregation is always 10 (the default value), even though the size is set to 10000.
You need to set the size inside the Terms aggregation and not outside of it. Try this:
.Aggregations(a => a
    .Terms("category_agg", st => st
        .Field(o => o.categories.Select(x => x.id))
        .Size(10000)
    )
)
var seasonPlayer = (from SeasonPlayer in db.SeasonPlayerSet
orderby SeasonPlayer.StatisticsPlayer.Average(x => x.STP_timeplay.Ticks) descending
select SeasonPlayer).ToList();
SeasonPlayer has an ICollection of StatisticsPlayer, so I want to get the average time spent on the court, ordered descending by STP_timeplay, which is of type TimeSpan. I can't average STP_timeplay directly because it isn't a decimal, so I tried averaging by Ticks. It throws an exception:
The specified type member 'Ticks' is not supported in LINQ to Entities. Only initializers, entity members, and entity navigation properties are supported.
The problem is that the LINQ to Entities query provider can't translate your LINQ into a SQL query that joins to StatisticsPlayer and averages the time played, grouped by season player; in particular, it has no translation for the TimeSpan.Ticks member.
Given that you appear to be iterating all Season Players, if the number of records isn't too large you could bring this all into memory like so:
var seasonPlayer = db.SeasonPlayerSet
    .Include(sp => sp.StatisticsPlayer)
    .ToList()
    .Select(sp => new { SeasonPlayer = sp, Average = sp.StatisticsPlayer.Average(stp => stp.STP_timeplay.Ticks) })
    .OrderByDescending(sp => sp.Average)
    .Select(sp => sp.SeasonPlayer)
    .ToList();
Try this:
var seasonPlayer = db.SeasonPlayerSet.ToList()
    .OrderByDescending(x => x.StatisticsPlayer
        .Average(z => z.STP_timeplay.Ticks))
    .ToList();
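Once the data is in memory, averaging over Ticks is plain LINQ to Objects. The following is a minimal sketch under assumptions: the two classes are hypothetical stand-ins for the entity classes in the question (the Name property is invented for illustration), and PlayTimeOrdering is not part of any library. It also guards against players with no statistics, since Average throws on an empty sequence.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the entity classes in the question.
public class StatisticsPlayer
{
    public TimeSpan STP_timeplay { get; set; }
}

public class SeasonPlayer
{
    public string Name { get; set; }
    public List<StatisticsPlayer> StatisticsPlayer { get; set; } = new List<StatisticsPlayer>();
}

public static class PlayTimeOrdering
{
    // Orders players by average time on court, longest first.
    // Players without statistics get an average of 0 instead of making Average() throw.
    public static List<SeasonPlayer> ByAveragePlayTimeDescending(IEnumerable<SeasonPlayer> players)
    {
        return players
            .OrderByDescending(p => p.StatisticsPlayer.Any()
                ? p.StatisticsPlayer.Average(s => s.STP_timeplay.Ticks)
                : 0d)
            .ToList();
    }
}
```

If the average itself is needed as a TimeSpan, the double returned by Average can be converted back with TimeSpan.FromTicks after casting to long.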
I need to compute a calculation over aggregates in LINQ, and I hope someone can help.
I have a list of objects with 3 fields (date, saleprice and productcode). For each date (group by date), I need the SUM of saleprice / COUNT of distinct productcode.
I know how to find the SUM alone, but not a calculation that involves another aggregate.
It would be easier to answer your question with some sample code and objects. I'll assume items is your list of objects:
items.GroupBy(obj => obj.Date)
    .Select(g => new
    {
        Date = g.Key.Date,
        Aggregate = g.Sum(obj => obj.SalePrice) / g.Select(obj => obj.ProductCode)
            .Distinct().Count()
    });
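For a version that can be run on its own, here is a minimal sketch. The Sale class and the SalesPerDistinctProduct helper are hypothetical (the question only names the three fields), and SalePrice is assumed to be a decimal so that the division is not truncated. It groups on the date part only, so two sales at different times of the same day land in the same group.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical item type; the question only names the three fields.
public class Sale
{
    public DateTime Date { get; set; }
    public decimal SalePrice { get; set; }
    public string ProductCode { get; set; }
}

public static class SalesPerDistinctProduct
{
    // For each calendar day: sum of sale prices divided by the number of distinct product codes.
    public static Dictionary<DateTime, decimal> Compute(IEnumerable<Sale> items)
    {
        return items
            .GroupBy(s => s.Date.Date) // group on the date part so the time of day doesn't split a day
            .ToDictionary(
                g => g.Key,
                g => g.Sum(s => s.SalePrice) / g.Select(s => s.ProductCode).Distinct().Count());
    }
}
```

Every group contains at least one item, so the distinct count is never zero and the division is safe.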
I am a newbie to LINQ. I am trying to write a LINQ query to get a minimum value from a set of records. I need to use GroupBy, Where, Select and Min in the same query, but I am having issues with the group by clause. Here is the query I wrote:
var data =newTrips.groupby (x => x.TripPath.TripPathLink.Link.Road.Name)
.Where(x => x.TripPath.PathNumber == pathnum)
.Select(x => x.TripPath.TripPathLink.Link.Speed).Min();
I am not able to use group by and where together; it keeps giving an error.
My query should:
Select all the values.
Filter them through the where clause (pathnum).
Group by the road name.
Finally, get the min value.
Can someone tell me what I am doing wrong and how to achieve the desired result?
Thanks,
Pawan
It's a little tricky not knowing the relationships between the data, but I think (without trying it) that this should give you what you want: the minimum speed per road by name. Note that it will result in a collection of anonymous objects with Name and Speed properties.
var data = newTrips.Where(x => x.TripPath.PathNumber == pathnum)
    .Select(x => x.TripPath.TripPathLink.Link)
    .GroupBy(x => x.Road.Name)
    .Select(g => new { Name = g.Key, Speed = g.Min(l => l.Speed) });
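The same filter, group, then aggregate order can be tried on in-memory data. This is only a sketch under assumptions: LinkRecord is a hypothetical flattened type that stands in for the Trip / TripPath / Link object graph from the question, and MinSpeedQuery is an invented helper name.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical flattened record; the real model in the question is a nested object graph.
public class LinkRecord
{
    public string RoadName { get; set; }
    public double Speed { get; set; }
    public int PathNumber { get; set; }
}

public static class MinSpeedQuery
{
    // Where runs first on the individual records, then GroupBy, then Min per group.
    // After GroupBy each element is a whole group, so a record-level filter has to come before it.
    public static Dictionary<string, double> MinSpeedByRoad(IEnumerable<LinkRecord> links, int pathnum)
    {
        return links
            .Where(l => l.PathNumber == pathnum)
            .GroupBy(l => l.RoadName)
            .ToDictionary(g => g.Key, g => g.Min(l => l.Speed));
    }
}
```

The ordering matters because the lambda passed to Where after a GroupBy receives an IGrouping, which has a Key but none of the record's own properties.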
Since I think you want the Trip which has the minimum speed, rather than the speed, and I'm assuming a different data structure, I'll add to tvanfosson's answer:
var pathnum = 1;
var trips = from trip in newTrips
            where trip.TripPath.PathNumber == pathnum
            group trip by trip.TripPath.TripPathLink.Link.Road.Name into g
            let minSpeed = g.Min(t => t.TripPath.TripPathLink.Link.Speed)
            select new {
                Name = g.Key,
                Trip = g.Single(t => t.TripPath.TripPathLink.Link.Speed == minSpeed) };
foreach (var t in trips)
{
    Console.WriteLine("Name = {0}, TripId = {1}", t.Name, t.Trip.TripId);
}