Is there a way to get a total results count when calling Aggregate function?
Note that I'm not using Aggregate function to aggregate results, but as an advanced search query, because Search function does not allow to sort by multiple fields.
RediSearch returns total documents matched count, but I can't find a way to get this number using NRediSearch library.
With NRediSearch
Using NRediSearch, you would need to build and execute aggregation that will run a GROUPBY 0 and the COUNT reducer, say you have a person-idx index and you want to count all the Person documents in Redis:
var client = new Client("person-idx", muxer.GetDatabase());
var result = await client.AggregateAsync(new AggregationBuilder().GroupBy(new List<string>(), new List<Reducer>{Reducers.Count()}));
Console.WriteLine(result.GetResults().First().Values.First());
Will get the count you are looking for.
With Redis.OM
There's a newer library Redis.OM which you can also use to make these aggregations a bit simpler, the same operation would be done with the following:
var peopleAggregations = provider.AggregationSet<Person>();
Console.WriteLine(peopleAggregations.Count());
Related
I'm working with a time-based index storing syslog events.
All the data is coming from different sources (PCs).
Suppose I have this kind of events:
timestamp = 0
source = PC-1
event = event_type_1
timestamp = 1
source = PC-1
event = event_type_1
timestamp = 1
source = PC-2
event = event_type_1
I want to make a query that will retrieve all the distinct value of "source" field for documents where match event = event_type_1
I am expecting to have all exact values (no approximations).
To achieve it I have written a cardinality query with an aggregation specifying the correct size, because I have no prior knowledge of the number of distinct sources. I think this is a expensive work to do as it consumes a lot of memory.
Is there any other alternative to get this done?
Here is requirement I am working on
There are multiple indexes with name content_ssc, content_teal, content_mmy.
These indexes can have common data (co_code is one of the field in the documents of these indexes)
a. content_ssc can have documents with co_code = teal/ssc/mmy
b. content_mmy can have documents with co_code = ssc/mmy
I need to get the data using below condition (this is one of the approach to get the unique data from these indexes)
a. (Index = content_ssc and site_code = ssc) OR (Index = content_mmy and site_code = mmy)
Basically I am getting a duplicate data from these indexes currently so I need any solution which should fetch unique data from these indexes using the above condition.
I have tried using boolean query with multiple indices from this link but it didn't produce unique result.
Please suggest.
You can use distinct query , and you will get unique result
I am beginning with ElasticSearch and really like it, hovewer I am stuck with quite simple scenario.
I am indexing such structure of a Worker:
NAME SURENAME ID AGE SEX NAME_SURENAME BIRTH_DATE
NAME_SURENAME - not analyzed - this field is indexed for grouping purposes
NAME, SURENAME - analyzed
The task is simple - search 5 unique workers sorted by birth_date (unique means the same name and surename, even if they are in different age and are different people)
I read about aggregation queries and as I understand, I can get only aggregations without documents. Unfortunatelly I aggregate by name and surename so I won't have other fields in results in buckets, like for example document ID field at least. But I also read about TopHit aggregation, that it returns document, and i tried it - the second idea below.
I have two ideas
1) Not use aggregations, just search 5 workers, filter duplicates in java and again search workers and filter duplicates in Java till I reach 5 unique results
2) Use aggregations. I event tried it like below, it even works on test data but since it is my first time, please advice, whether it works accidentially or it is done correctly? So generally I thought I could get 5 buckets with one TopHit document. I have no idea how TopHit document is chosen but it seems to work. Below is the code
String searchString = "test";
BoolQueryBuilder query = boolQuery().minimumNumberShouldMatch(1).should(matchQuery("name", searchString).should(matchQuery("surename", searchString));
TermsBuilder terms = AggregationBuilders.terms("namesAgg").size(5);
terms.field("name_surename");
terms.order(Terms.Order.aggregation("birthAgg", false)).subAggregation(AggregationBuilders.max("birthAgg")
.field("birth_date")
.subAggregation(AggregationBuilders.topHits("topHit").setSize(1).addSort("birth_date", SortOrder.DESC));
SearchRequestBuilder searchRequestBuilder = client.prepareSearch("workers")
.addAggregation(terms).setQuery(query).setSize(1).addSort(SortBuilders.fieldSort("birth_date")
.order(SortOrder.DESC));
Terms aggregations = searchRequestBuilder.execute().actionGet().getAggregations().get("namesAgg");
List<Worker> results = new ArrayList<>();
for (Terms.Bucket bucket : aggregations.getBuckets()) {
Optional<Aggregation> first = bucket.getAggregations().asList().stream().filter(aggregation -> aggregation instanceof TopHits).findFirst();
SearchHit searchHitFields = ((TopHits) first.get()).getHits().getHits()[0];
Transformer<SearchHit, Worker> transformer = transformers.get(Worker.class);
Worker transform = transformer.transform(searchHitFields);
results.add(transform);
}
return results;//
I just need to find the count of total records which i get from the mentioned query.
$orderData = $this->mongo_db->get_where('order', array('order_date'=> array('$gte'=>floatval(1337797800000), '$lte'=>floatval(1337970600000))));
I am using alexbilbie's library for codeigniter.
echo count($orderData); will return the number of records from your result
I am using the geoNear commang with mongoid in order to retrive a document collection ordered by distance. I need the distance for each document in the collection which is why I am having to resort to the geoNear command.
Given the following command:
category_ids = ["list", "of", "ids"]
cmd = Hash.new
cmd[:geoNear] = :poi
cmd[:near] = [params[:location][:x], params[:location][:y]]
cmd[:query] = {
"$or" => [
{primary_category_id: {"$in" => category_ids}},
{category_ids: {"$in" => category_ids}}
]
}
cmd[:spherical] = true
cmd[:num] = num
res = Poi.collection.database.command cmd
My problem is that I require the total number of results in the collection. Sure I could just run another query that just counts the number of items that satisfy the query part of the command, however that would be pretty inefficient and also not very extendible as every change I make in the command would have to be reflected in the count query. Just adding a maxDistance would land me in a whole heap of trouble.
Another option would be to go with find and calculate the distance manually but again I would like to avoid that.
So my question is there a clever way of getting the number of documents returned by the command (minus the num) without having to run a separate query or having to calculate the distance manually and go with find.
You can use facet for the same after geoNear use facet one will project the documents and in other you can use group by _id null and use the count in group to count the total number of documents.