Get unique values of a field in Elasticsearch and get all records

I am using Elasticsearch 2.x.
I want the distinct values of a field, but my query returns only 10 values.
How do I change it to return all distinct values?
This is the query I am using:
GET messages-2017.04*/_search
{
"fields": ["_index"],
"query": {
"bool": {
"must":{
"bool":{
"should": [
{
"match": {
"RouteData": {
"query": "Q25B",
"type": "phrase"
}
}
}
]
}
}
}
}
}
I need to get all distinct _index names from the DB.

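You need to use a terms aggregation instead. The top-level size of 0 suppresses the hits, and the size inside terms is the maximum number of distinct values returned (it defaults to 10):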
POST messages-2017.04*/_search
{
"size": 0,
"query": {
"bool": {
"must":{
"bool":{
"should": [
{
"match": {
"RouteData": {
"query": "Q25B",
"type": "phrase"
}
}
}
]
}
}
}
},
"aggs": {
"all_indexes": {
"terms": {
"field": "_index",
"size": 100
}
}
}
}
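If you run this from a script, a minimal Python sketch of the same request could look like the one below. The function names and the sample response are hypothetical and only there for illustration. It also assumes the bool/must/bool/should wrapping around the single match clause can be dropped without changing which documents match, so check that against your own query before relying on it.

```python
import json

def distinct_index_request(field, phrase, max_buckets=100):
    """Build a search body that returns no hits, only one bucket per index."""
    return {
        "size": 0,  # suppress the hits; only the aggregation is needed
        "query": {"match": {field: {"query": phrase, "type": "phrase"}}},
        "aggs": {
            "all_indexes": {
                # raise max_buckets if more distinct indices may match
                "terms": {"field": "_index", "size": max_buckets}
            }
        },
    }

def bucket_keys(response, agg_name="all_indexes"):
    """Return the bucket keys of a terms aggregation from a search response."""
    return [b["key"] for b in response["aggregations"][agg_name]["buckets"]]

# Hypothetical response fragment, for illustration only.
sample_response = {
    "aggregations": {"all_indexes": {"buckets": [
        {"key": "messages-2017.04.01", "doc_count": 12},
        {"key": "messages-2017.04.02", "doc_count": 7},
    ]}}
}

body = distinct_index_request("RouteData", "Q25B")
print(json.dumps(body, indent=2))
print(bucket_keys(sample_response))
```

The distinct values are the bucket keys, not the hits, which is why the hits can be switched off entirely.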

Related

Elasticsearch: how can I perform a "terms" and a "range" query together?

In Elasticsearch, the terms query works well for me when searching for multiple IDs in one query.
My original terms query:
{
"query": {
"terms": {
"Id": ["134","156"]
}
}
}
However, I need to add an extra condition, something like the following (which does not work as written):
{
"query": {
"terms": {
"id": ["163","121","569","579"]
},
"range":{
"age":
{"gt":10}
}
}
}
The "id" array can be long.
You can combine both queries using a bool query:
{
"query": {
"bool": {
"must": [
{
"terms": {
"Id": [
"134",
"156"
]
}
},
{
"range": {
"age": {
"gt": 10
}
}
}
]
}
}
}
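Since the ID list can be long, you will probably build the body in code. Here is a minimal Python sketch under that assumption; `terms_and_range` is a hypothetical helper name, and the field names are the ones from the question.

```python
import json

def terms_and_range(id_field, ids, range_field, gt):
    """Build a bool query that requires both a terms match and a range match."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"terms": {id_field: list(ids)}},       # any of the given IDs
                    {"range": {range_field: {"gt": gt}}},   # and the range condition
                ]
            }
        }
    }

body = terms_and_range("Id", ["134", "156"], "age", 10)
print(json.dumps(body, indent=2))
```

Both clauses sit in the same must array, so a document has to satisfy the terms clause and the range clause to match.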

How to select distinct children in Elasticsearch

I have this query:
"query": {
"bool": {
"filter": {
"has_parent": {
"parent_type": "profiles",
"query": {
"query_string": {
"query": "age:>0 and user:aqwe"
}
}
}
}
}
},
"sort": ["user", {"createdAt": "asc"}]
As a result I get multiple items with the same '_id'. I think this is some kind of problem with the join. How can I edit this query to select distinct items?
If you want to return only unique values, you can use a terms aggregation. In your case it would look like this (size here is the maximum number of unique ids you want to return):
"query": {
"bool": {
"filter": {
"has_parent": {
"parent_type": "profiles",
"query": {
"query_string": {
"query": "age:>0 and user:aqwe"
}
}
}
}
}
},
"aggs": {
"unique": {
"terms": {
"field": "_id",
"size": 100
}
}
},
"sort": ["user", {"createdAt": "asc"}]
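Note that the aggregation gives you the unique ids as buckets, not as sorted hits. If you still need the hits in sort order, one alternative (a different technique from the aggregation above, shown only as a sketch) is to drop the repeated '_id' values on the client side. The helper name and the sample hits are hypothetical.

```python
def dedupe_hits(hits):
    """Keep the first hit for each _id, preserving the order the search returned."""
    seen = set()
    unique = []
    for hit in hits:
        if hit["_id"] not in seen:
            seen.add(hit["_id"])
            unique.append(hit)
    return unique

# Hypothetical hits, for illustration only.
sample_hits = [
    {"_id": "a1", "_source": {"user": "aqwe"}},
    {"_id": "b2", "_source": {"user": "aqwe"}},
    {"_id": "a1", "_source": {"user": "aqwe"}},
]

unique_hits = dedupe_hits(sample_hits)
print([hit["_id"] for hit in unique_hits])
```

This only removes duplicates within the page of hits you fetched, so it does not replace fixing whatever causes the duplicates in the first place.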

Aggregation, Query Context and filter Context not working in Elasticsearch 5.1

I am facing an issue migrating from Elasticsearch 1.5 to 5.1.
The following is my Elasticsearch 1.5 query:
{
"_source":["_id","spotlight"],
"query":{
"filtered":{
"filter":{
"and":[
{"term":{"gender":"female"}},
{"range":{"lastlogindate":{"gte":"2016-10-19 12:39:57"}}}
]
}
}
},
"filter":{
"and":[
{"term":{"maritalstatus":"1"}}
]
},
"sort":[{"member2_dummy7":{"order":"desc"}}],
"size":"0",
"aggs": {
"maritalstatus": {
"filter": {},
"aggs" : {
"filtered_maritalstatus": {"terms":{"field":"maritalstatus","size":5000}}
}
}
}
}
This query gives me the correct doc_count in the aggregations. The doc_count is calculated over the result set returned by the query context and ignores the filter context.
I have written the same query for Elasticsearch 5.1:
{
"_source":["_id","spotlight"],
"query":{
"bool":{
"must":[
{"term":{"gender":"female"}},
{"range":{"lastlogindate":{"gte":"2016-10-19 12:39:57"}}}
],
"filter":{
"bool":{
"must":[
{"term":{"maritalstatus":"1"}}
]
}
}
}
},
"sort":[{"member2_dummy7":{"order":"DESC"}}],
"size":"0",
"aggs": {
"maritalstatus": {
"filter": {},
"aggs" : {
"filtered_maritalstatus": {"terms":{"field":"maritalstatus","size":5000}}
}
}
}
}
But in Elasticsearch 5.1 it returns the wrong doc_count in the aggregation. I think it is applying the filter as part of the query context, and hence the doc_count is wrong. Can someone tell me the correct way to separate query and filter in Elasticsearch 5.1?
Your 1.5 query uses a post filter (the top-level filter), which you have removed in your 5.1 query.
The equivalent query in ES 5.1 is the following (filtered/filter is simply replaced by bool/filter, and the top-level filter is renamed to post_filter):
{
"_source": [
"_id",
"spotlight"
],
"query": {
"bool": {
"filter": [
{
"term": {
"gender": "female"
}
},
{
"range": {
"lastlogindate": {
"gte": "2016-10-19 12:39:57"
}
}
}
]
}
},
"post_filter": {
"term": {
"maritalstatus": "1"
}
},
"sort": [
{
"member2_dummy7": {
"order": "desc"
}
}
],
"size": "0",
"aggs": {
"maritalstatus": {
"filter": {},
"aggs": {
"filtered_maritalstatus": {
"terms": {
"field": "maritalstatus",
"size": 5000
}
}
}
}
}
}
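If you have many such queries to migrate, the two renames described above can be sketched as a small Python function. This is only an illustration under narrow assumptions: `migrate_body` and `unwrap_and` are hypothetical names, and only a top-level `filtered` query and a flat `and` filter are handled (nested `and`/`or`/`not` filters are left untouched).

```python
import copy
import json

def unwrap_and(filter_clause):
    """Turn an {"and": [...]} filter into a plain list of clauses."""
    if isinstance(filter_clause, dict) and set(filter_clause) == {"and"}:
        return list(filter_clause["and"])
    return [filter_clause]

def migrate_body(old_body):
    """Rewrite filtered -> bool/filter and top-level filter -> post_filter."""
    body = copy.deepcopy(old_body)  # leave the caller's body unchanged
    filtered = body.get("query", {}).get("filtered")
    if filtered is not None:
        bool_query = {}
        if "query" in filtered:
            bool_query["must"] = [filtered["query"]]
        if "filter" in filtered:
            bool_query["filter"] = unwrap_and(filtered["filter"])
        body["query"] = {"bool": bool_query}
    if "filter" in body:
        clauses = unwrap_and(body.pop("filter"))
        # a single clause is used directly; several are wrapped in a bool
        body["post_filter"] = (
            clauses[0] if len(clauses) == 1 else {"bool": {"filter": clauses}}
        )
    return body

old_body = {
    "query": {"filtered": {"filter": {"and": [
        {"term": {"gender": "female"}},
        {"range": {"lastlogindate": {"gte": "2016-10-19 12:39:57"}}},
    ]}}},
    "filter": {"and": [{"term": {"maritalstatus": "1"}}]},
    "size": 0,
}

new_body = migrate_body(old_body)
print(json.dumps(new_body, indent=2))
```

The point to keep in mind is that the post filter is applied to the hits only, which is why the aggregation counts stay based on the query alone.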

Elasticsearch aggregation word count excluding stopwords

I'm using Elasticsearch to store my data, and I want to count the words in my documents, but without the stopwords in the result. For example, in my current result 'and' is the top word, and I want to remove it. I currently have 3802 stopwords in my stopword.txt, and I don't want any of them to show up in the aggregation result. How can I do that? My current query:
{
"query": {
"bool": {
"must": [
{
"range": {
"date": {
"gte": "now-0d/d"
}
}
}
]
}
},
"aggs": {
"words": {
"terms": {
"size" : 0,
"field": "text"
}
}
}
}
This is how I would like the query to work:
{
"aggs": {
"filtered": {
"query": {
"bool": {
"must": [
{
"range": {
"date": {
"gte": "now-0d/d"
}
}
}
]
}
},
"filter": {
"my_stop": {
"type": "stop",
"stopwords_path": "/work/projects/stop_words.txt"
}
},
"aggs": {
"words": {
"terms": {
"size" : 0,
"field": "text"
}
}
}
}
}
}
By the way, I have my stopword list in my custom analyzer, but it doesn't work the way I want.
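One possible direction, offered as an untested sketch rather than a confirmed answer: the terms aggregation has an `exclude` parameter, and this assumes your Elasticsearch version accepts an array of exact terms there. The helper name, the explicit bucket size of 100, and the inline stopword lines are hypothetical; in practice the lines would come from your own stopword file.

```python
import json

def words_agg_without_stopwords(stopword_lines, field="text", size=100):
    """Build a terms aggregation that skips the given stopwords."""
    # strip whitespace, drop blank lines, and remove duplicates
    stopwords = sorted({line.strip() for line in stopword_lines if line.strip()})
    return {
        "aggs": {
            "words": {
                "terms": {
                    "field": field,
                    "size": size,
                    # assumption: exclude accepts a list of exact terms
                    "exclude": stopwords,
                }
            }
        }
    }

# Hypothetical stopword lines, standing in for the contents of a stopword file.
sample_lines = ["and\n", "the\n", "\n", "and\n", "of\n"]

agg_body = words_agg_without_stopwords(sample_lines)
print(json.dumps(agg_body, indent=2))
```

With several thousand stopwords the request gets large, so it is worth checking whether this is acceptable for your cluster before adopting it.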

Elasticsearch filtered query: query part being ignored?

I'm building the following search in code. The idea is that it filters down the set of matches and then queries that set, so I can add score based on certain fields. For some reason the filter part works, but whatever I put in the query (for example, below I use a field sdfsdfsdf that does not exist in the index) it still returns everything matching the filter.
Is the syntax wrong?
{
"query":{
"filtered":{
"query":{
"bool":{
"must":{
"match":{
"sdfsdfsdf":{
"query":"4",
"boost":2.0
}
}
}
},
"filter":{
"bool":{
"must":[
{
"terms":{
"_id":[
"55f93ead5df34f1900abc20b",
"55f8ab0226ec4bb216d7c938",
"55dc4e949dcf833308c63d6b"
]
}
},
{
"range":{
"published_date":{
"lte":"now"
}
}
}
],
"must_not":{
"terms":{
"_id":[
"55f0a799acccc28204a5058c"
]
}
}
}
}
}
}
}
}
Your filter is not at the right level. It should not be inside query but at the same level as query, like this:
{
"query": {
"filtered": {
"query": { <--- query and filter at the same level
"bool": {
"must": {
"match": {
"sdfsdfsdf": {
"query": "4",
"boost": 2
}
}
}
}
},
"filter": { <--- query and filter at the same level
"bool": {
"must": [
{
"terms": {
"_id": [
"55f93ead5df34f1900abc20b",
"55f8ab0226ec4bb216d7c938",
"55dc4e949dcf833308c63d6b"
]
}
},
{
"range": {
"published_date": {
"lte": "now"
}
}
}
],
"must_not": {
"terms": {
"_id": [
"55f0a799acccc28204a5058c"
]
}
}
}
}
}
}
}
You need to replace sdfsdfsdf with an existing field name in your type, e.g. title; otherwise I think it will fall back to a match_all query.
"match":{
"title":{
"query": "some text here",
"boost":2.0
}
}
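When the body is built in code, as in the question, a small helper makes it hard to nest the filter in the wrong place. This is a minimal Python sketch; `filtered_query` is a hypothetical helper name, and the clauses are the ones from the answer above with title as the example field.

```python
import json

def filtered_query(query_clause, filter_clause):
    """Build a filtered query with query and filter as siblings."""
    return {
        "query": {
            "filtered": {
                "query": query_clause,    # scored part
                "filter": filter_clause,  # non-scored part, same level as query
            }
        }
    }

query_clause = {
    "bool": {"must": {"match": {"title": {"query": "some text here", "boost": 2.0}}}}
}
filter_clause = {
    "bool": {
        "must": [
            {"terms": {"_id": ["55f93ead5df34f1900abc20b"]}},
            {"range": {"published_date": {"lte": "now"}}},
        ],
        "must_not": {"terms": {"_id": ["55f0a799acccc28204a5058c"]}},
    }
}

body = filtered_query(query_clause, filter_clause)
print(json.dumps(body, indent=2))
```

Because the two clauses are passed in separately, the filter can never end up inside the inner query by accident.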

Resources