Elasticsearch - Search by a field but distinct another field - elasticsearch

What I want is the equivalent of this SQL query:
SELECT distinct fieldA
from DB
where fieldB like '%value%'
What is the equivalent terms aggregation for this query in Elasticsearch?

You can use a wildcard query in conjunction with a terms aggregation to fetch the distinct values of the field.
The following query should return them (the terms aggregation needs fieldA to be an aggregatable field, such as a keyword field):
POST test/_search
{
"query": {
"wildcard": {
"fieldB": {
"value": "*ali*"
}
}
},
"aggs": {
"distinct_fieldA": {
"terms": {
"field": "fieldA",
"size": 10
}
}
}
}
Hope this works
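If the request is built from application code, the same body can be assembled as a plain dictionary. The following is only a minimal sketch: the helper names distinct_values_body and bucket_keys are hypothetical, and "size": 0 is an assumption added here to suppress document hits, since only the buckets are of interest.

```python
import json


def distinct_values_body(filter_field, pattern, distinct_field, size=10):
    """Build a search body: wildcard filter plus a terms aggregation."""
    return {
        # Assumption: hits are not needed, only the aggregation buckets.
        "size": 0,
        "query": {"wildcard": {filter_field: {"value": pattern}}},
        "aggs": {
            "distinct_" + distinct_field: {
                "terms": {"field": distinct_field, "size": size}
            }
        },
    }


def bucket_keys(response, agg_name):
    """Return the distinct values (bucket keys) from a search response dict."""
    return [b["key"] for b in response["aggregations"][agg_name]["buckets"]]


body = distinct_values_body("fieldB", "*ali*", "fieldA")
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of POST test/_search; bucket_keys then pulls the distinct fieldA values out of the parsed response.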

Related

How to limit search results from each index in a multi index search query?

I am using Elasticsearch version 6.3 and I want to make queries across multiple indices. Elasticsearch has support for this: I can give multiple indices as comma-separated values in the URL, with one query in the request body, and also give a size parameter to limit the number of search results returned. However, this limits the size of the overall search results and might lead to no results from some indexes, so instead I want to fetch the first n results from each index.
I tried using the multi search API (_msearch), but with that it seems I have to give the same query and size for every index. That works, but I am not able to get a single aggregation over the entire result. Is there any way to address both issues?
Solution 1:
You're on the right path with the _msearch query. What I would do is to issue one query per index (no aggregations!) with the size you want for that index, as well as another query just for the aggregations, like this:
{ "index": "index1" }
{ "size": 5, "query": { ... }}
{ "index": "index2" }
{ "size": 5, "query": { ... }}
{ "index": "index3" }
{ "size": 5, "query": { ... }}
{ "index": "index1,index2,index3" }
{ "size": 0, "query": { ... }, "aggs": { ... } }
So the first three queries will return document hits from each of the three indexes and the last query will return the aggregation computed on all indexes, but no documents.
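The _msearch body is newline-delimited JSON, with one header line and one body line per search. As a rough sketch of how the request above could be generated (the function name build_msearch and its parameters are made up for illustration):

```python
import json


def build_msearch(indices, query, per_index_size, aggs):
    """Return an NDJSON payload: one sized search per index, then one
    aggregation-only search across all indices."""
    lines = []
    for index in indices:
        lines.append(json.dumps({"index": index}))
        lines.append(json.dumps({"size": per_index_size, "query": query}))
    # Final search: no hits, aggregation computed over every index.
    lines.append(json.dumps({"index": ",".join(indices)}))
    lines.append(json.dumps({"size": 0, "query": query, "aggs": aggs}))
    # The payload must end with a newline character.
    return "\n".join(lines) + "\n"


payload = build_msearch(
    ["index1", "index2", "index3"],
    {"match_all": {}},  # placeholder query for the example
    5,
    {"by_index": {"terms": {"field": "_index"}}},  # placeholder aggregation
)
print(payload)
```

The payload would be posted to the _msearch endpoint with the Content-Type header set to application/x-ndjson; the responses come back in the same order as the searches.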
Solution 2:
Another way to tackle this if you have a small size, is to have a single query in the query part and then aggregate on the index name and retrieve hits from each index using top_hits, like this:
POST index1,index2,index3/_search
{
"size": 0,
"query": { ... },
"aggs": {
"indexes": {
"terms": {
"field": "_index",
"size": 50
},
"aggs": {
"hits": {
"top_hits": {
"size": 5
}
}
}
}
}
}
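With this approach the documents arrive nested inside the aggregation buckets rather than in the top-level hits. A small sketch of unpacking them, assuming the aggregation names "indexes" and "hits" from the query above (hits_by_index is a hypothetical helper and the sample response is invented):

```python
def hits_by_index(response, agg_name="indexes", hits_name="hits"):
    """Map each index name to the _source of its top_hits documents."""
    grouped = {}
    for bucket in response["aggregations"][agg_name]["buckets"]:
        # top_hits nests its documents under <agg name> -> hits -> hits.
        docs = bucket[hits_name]["hits"]["hits"]
        grouped[bucket["key"]] = [doc["_source"] for doc in docs]
    return grouped


# Invented response fragment, trimmed to the fields used above.
sample_response = {
    "aggregations": {
        "indexes": {
            "buckets": [
                {"key": "index1", "doc_count": 2,
                 "hits": {"hits": {"hits": [{"_source": {"a": 1}},
                                            {"_source": {"a": 2}}]}}},
                {"key": "index2", "doc_count": 1,
                 "hits": {"hits": {"hits": [{"_source": {"a": 3}}]}}},
            ]
        }
    }
}

print(hits_by_index(sample_response))
```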

How to filter the results from a composite aggregation?

I want to filter the results of a composite aggregation that contains a top_hits aggregation. I first group my data with top_hits, then use it as a sub-aggregation inside my composite aggregation, which has a single source based on an ID, and I don't know how to filter those grouped results.
I've tried using the filters aggregation, but I'm not sure it can work, since the composite aggregation must be the parent of all other aggregations. I tried different combinations of these aggregations, but none of them gives me the results I want.
{
"size": 0,
"aggs": {
"grouped_data": {
"composite": {
"sources": [
{
"artifact": {
"terms": {
"field": "artifactId.keyword"
}
}
}
],
"size": 20
},
"aggs": {
"top_artifacts_hits": {
"top_hits": {
"size": 1,
"sort": [{
"initialDate": {
"order": "desc"
}
}]
}
}
}
}
}
}
I tried using the query API for filtering, but that is not a good option for me since the filters I want to apply are meant for the grouped results. Putting a query before the main aggregation makes Elasticsearch filter first and then group. I need it the other way around. I'm using ES 6.3 on AWS.
So my documents look something like this:
{
"artifactId": "foo",
"clientId": "bar",
"artifactState": "foozz",
"initialDate": 1559745246
}
What I need to do is to get the last artifactState based on the initialDate for each different artifactId so this is why I'm using top_hits + composite.
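No answer is recorded for this question. One possible workaround, offered only as a sketch and not as a confirmed solution, is to filter the grouped results on the client after the aggregation returns. The helper names and the sample response below are hypothetical; the aggregation names match the query above.

```python
def latest_hit_per_artifact(response, agg_name="grouped_data",
                            hits_name="top_artifacts_hits"):
    """Map each artifactId to the _source of its most recent document."""
    latest = {}
    for bucket in response["aggregations"][agg_name]["buckets"]:
        hits = bucket[hits_name]["hits"]["hits"]
        if hits:
            # The composite source is named "artifact" in the query.
            latest[bucket["key"]["artifact"]] = hits[0]["_source"]
    return latest


def filter_by_state(latest, wanted_state):
    """Keep only artifacts whose latest document has the wanted state."""
    return {k: v for k, v in latest.items()
            if v.get("artifactState") == wanted_state}


# Invented response fragment, trimmed to the fields used above.
sample_response = {
    "aggregations": {
        "grouped_data": {
            "buckets": [
                {"key": {"artifact": "foo"}, "doc_count": 3,
                 "top_artifacts_hits": {"hits": {"hits": [
                     {"_source": {"artifactId": "foo",
                                  "artifactState": "foozz",
                                  "initialDate": 1559745246}}]}}},
                {"key": {"artifact": "baz"}, "doc_count": 1,
                 "top_artifacts_hits": {"hits": {"hits": [
                     {"_source": {"artifactId": "baz",
                                  "artifactState": "other",
                                  "initialDate": 1559745000}}]}}},
            ]
        }
    }
}

latest = latest_hit_per_artifact(sample_response)
print(filter_by_state(latest, "foozz"))
```

The trade-off is that filtering happens after the data leaves the cluster, so every page of composite results would need the same treatment.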

Elastic search, query based on another query

I want to perform a query and then use the results to perform another query.
To be clear:
I want to perform a query and select the IDs, and then use those IDs to find some users, something like
SELECT * FROM posts WHERE user_id IN (SELECT id FROM users WHERE id IN (1,2,3,4,...)) in SQL.
You can filter the results using the query string query DSL.
In the following example, I have filtered the results based on the result_type value
and then grouped the values.
Example:
GET .ml-anomalies-.write-high_request_time/_search
{
"size": 0,
"query": {
"query_string": {
"query": "result_type: model_plot OR result_type:bucket"
}
},
"aggs": {
"NAME": {
"terms": {
"field": "result_type",
"size": 10
}
}
}
}
You can try the same command in the following demo box
https://demo.elastic.co/app/kibana#/dev_tools/console?_g=()
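The answer above filters and groups within a single request. For the two-step pattern the question describes, a common approach is to run the first search, collect the IDs in application code, and feed them into a terms query. The sketch below assumes that approach; the function names are hypothetical, and the user_id field name comes from the SQL in the question.

```python
def ids_from_hits(response):
    """Collect document IDs from the hits of a first search response."""
    return [hit["_id"] for hit in response["hits"]["hits"]]


def posts_by_user_ids_body(user_ids, size=100):
    """Build the second search: posts whose user_id is in the ID list."""
    return {
        "size": size,
        "query": {"terms": {"user_id": user_ids}},
    }


# Invented response from the first (users) search.
users_response = {"hits": {"hits": [{"_id": "1"}, {"_id": "2"}, {"_id": "3"}]}}

second_body = posts_by_user_ids_body(ids_from_hits(users_response))
print(second_body)
```

This costs two round trips, and very long ID lists may need to be split into batches.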

Finding the max date in elastic search query

Can you please help me convert this SQL query to an Elasticsearch query?
SELECT group,MAX(date) as max_date
FROM table
WHERE checks>0
GROUP BY group
Try a query like the one below, sent as an HTTP POST. You can use the max aggregation to get the maximum value, nested inside a terms aggregation that does the job of GROUP BY.
Request:
yourhost:9200/your_index/_search
Request Body:
{
"query": {
"query_string": {
"query": "checks:>0"
}
},
"aggs": {
"groupby_group": {
"terms": {
"field": "group"
},
"aggs": {
"maximum": {
"max": {
"script": "doc['date'].value"
}
}
}
}
}
}
For checks > 0, you could go with the range query as well within the query, which could look like:
"range" : {
"checks" : {
"gt" : 0
}
}
This should get the aggregation working. Because the max aggregation above uses a script, please make sure that scripting is enabled in your elasticsearch.yml before you try querying:
script.inline: on
Hope this helps!
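As an alternative sketch that avoids scripting altogether, the max aggregation can point directly at the date field. This assumes that date is mapped as a date or numeric field and that group is aggregatable; the helper names are hypothetical and the sample response is invented.

```python
def max_date_per_group_body(size=10):
    """Equivalent of: SELECT group, MAX(date) ... WHERE checks > 0 GROUP BY group."""
    return {
        "size": 0,  # only the aggregation is needed
        "query": {"range": {"checks": {"gt": 0}}},
        "aggs": {
            "groupby_group": {
                "terms": {"field": "group", "size": size},
                "aggs": {"max_date": {"max": {"field": "date"}}},
            }
        },
    }


def max_dates(response):
    """Map each group to its maximum date value from the response."""
    buckets = response["aggregations"]["groupby_group"]["buckets"]
    return {b["key"]: b["max_date"]["value"] for b in buckets}


# Invented response fragment for illustration.
sample_response = {
    "aggregations": {
        "groupby_group": {
            "buckets": [
                {"key": "a", "doc_count": 4, "max_date": {"value": 1559745246000.0}},
                {"key": "b", "doc_count": 2, "max_date": {"value": 1559000000000.0}},
            ]
        }
    }
}

print(max_dates(sample_response))
```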

How to query for many facets in single elasticsearch query

I'm looking for a way to query the distribution of the top n values for many object fields in a single query.
My object in Elasticsearch looks like:
obj: {
os: "Android",
device_model: "Samsung Galaxy S II (GT-I9100)",
device_brand: "Samsung",
os_version: "Android-2.3",
country: "BR",
interests: [1,2,3],
behavioral_segment: ["sport", "lifestyle"]
}
The following query returns the distribution of values for a specific field, with the number of occurrences of each value, for UK users only:
curl -XPOST http://<endpoint>/profiles/_search?search_type=count -d '
{
"query": {
"match": {
"country" : "UK"
}
},
"facets": {
"ItemsPerCategoryCount": {
"terms": {
"field": "behavioral_segment"
}
}
}
}'
How can I query for many fields? For example, I would like to get a result for behavioral_segment, device_brand and os in a single query. Is it possible?
In the facets section of the query, you should use the fields parameter.
"facets": {
"ItemsPerCategoryCount": {
"terms": {
"fields": ["behavioral_segment","device_brand"]
}
}
}
That should solve your problem, but note that the counts from the listed fields are combined into a single facet, so it does not guarantee a coherent, separate distribution per field.
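If a separate distribution per field is wanted, another option is to declare one named entry per field in the same request. The sketch below assumes that approach; per_field_terms is a hypothetical helper, and the section parameter exists because later Elasticsearch versions replaced facets with aggregations ("aggs"), whose terms entries have the same shape.

```python
def per_field_terms(fields, section="facets", size=10):
    """Build one named terms entry per field under the given section."""
    return {
        section: {
            name + "_counts": {"terms": {"field": name, "size": size}}
            for name in fields
        }
    }


body = {"query": {"match": {"country": "UK"}}}
body.update(per_field_terms(["behavioral_segment", "device_brand", "os"]))
print(body)
```

Each named entry comes back as its own block in the response, so the three distributions stay separate.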
