Aggregating over _field_names in Elasticsearch 5

I'm trying to aggregate over field names in ES 5 as described in "Elasticsearch aggregation on distinct keys", but the solution described there no longer works.
My goal is to get the keys across all the documents. The mapping is the default one.
Data:
PUT products/product/1
{
"param": {
"field1": "data",
"field2": "data2"
}
}
Query:
GET _search
{
"aggs": {
"params": {
"terms": {
"field": "_field_names",
"include" : "param.*",
"size": 0
}
}
}
}
I get the following error: Fielddata is not supported on field [_field_names] of type [_field_names]

After looking around, it seems the only way to get the unique field names in ES 5.x and later is through the mapping endpoint. Since you cannot aggregate on _field_names, you may need to change your data format slightly, because the mapping endpoint returns every field regardless of nesting.
My own problem was getting the unique keys for various parent/child documents.
I found that if you name your fields in the form prefix.field, the mapping endpoint automatically nests them under the prefix.
PUT products/product/1
{
"param.field1": "data",
"param.field2": "data2",
"other.field3": "data3"
}
GET products/product/_mapping
{
"products": {
"mappings": {
"product": {
"properties": {
"other": {
"properties": {
"field3": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
},
"param": {
"properties": {
"field1": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"field2": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
}
}
}
}
}
Then you can grab the unique fields based on the prefix.
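As a minimal sketch of that last step, the function below pulls the field names under one prefix out of a mapping response that has already been parsed into a Python dict. The function name and the trimmed response are illustrative assumptions; the layout (index, then mappings, then type, then properties) follows the response shown above.

```python
from typing import Any, Dict, List


def fields_under_prefix(mapping: Dict[str, Any], index: str,
                        doc_type: str, prefix: str) -> List[str]:
    """Return the sorted field names nested under `prefix` in a _mapping response."""
    properties = mapping[index]["mappings"][doc_type]["properties"]
    # A prefix that never appeared in any document simply has no entry.
    nested = properties.get(prefix, {}).get("properties", {})
    return sorted(nested)


# Trimmed copy of the mapping response shown above (sub-fields omitted).
response = {
    "products": {
        "mappings": {
            "product": {
                "properties": {
                    "other": {"properties": {"field3": {"type": "text"}}},
                    "param": {
                        "properties": {
                            "field1": {"type": "text"},
                            "field2": {"type": "text"},
                        }
                    },
                }
            }
        }
    }
}

print(fields_under_prefix(response, "products", "product", "param"))
# prints ['field1', 'field2']
```

Note that the mapping lists every field that has ever been indexed, so this gives the keys known to the index, not the keys of the documents matching a particular query.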

This is probably because setting size: 0 is no longer allowed in ES 5. You now have to set a specific size.
POST _search
{
"aggs": {
"params": {
"terms": {
"field": "_field_names",
"include" : "param.*",
"size": 100 <--- change this
}
}
}
}

Related

Update "keyword" to "text" field type of an index for inexact words matching in elasticsearch

{
"myindex": {
"mappings": {
"properties": {
"city": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
}
}
I tried to update it with the PUT request below on the index, but I still get the above output from _mapping:
{
"_doc" : {
"properties" : {
"city" : {"type" : "text"}
}
}
}
I am not able to query with inexact words because its type is "keyword". For the query below, the actual value in the record is "Mumbai":
{
"query": {
"bool": {
"must": {
"match": {
"city": {
"query": "Mumbi",
"minimum_should_match": "10%"
}
}
}
}
}
}
The mapping below (the one shared in the question) stores 'city' as text and 'city.keyword' as a keyword.
{
"myindex": {
"mappings": {
"properties": {
"city": {
"type": "text", // ==========> Store city as text
"fields": {
"keyword": {
"type": "keyword", // =========> store city.keyword as a keyword
"ignore_above": 256
}
}
}
}
}
}
}
Yours is a use case for fuzzy search, not minimum_should_match.
ES Docs for Fuzzy Search: https://www.elastic.co/blog/found-fuzzy-search
Try the query below:
{
"query": {
"match": {
"city": {
"query": "mubai",
"fuzziness": "AUTO"
}
}
}
}
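To see why fuzziness fits here, it may help to count edits by hand. The sketch below is an illustration, not Elasticsearch code: it computes plain Levenshtein distance (Elasticsearch additionally counts a swap of two adjacent characters as one edit by default, which is not modeled here) and mirrors the documented AUTO thresholds of 0 edits for terms of 1 to 2 characters, 1 edit for 3 to 5, and 2 edits for longer terms. The function names are made up for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance: inserts, deletes and substitutions cost 1."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (char_a != char_b)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def auto_fuzziness(term: str) -> int:
    """Edits allowed for a query term under "fuzziness": "AUTO" (default thresholds)."""
    if len(term) <= 2:
        return 0
    if len(term) <= 5:
        return 1
    return 2


# The text field is lowercased at index time, so compare against "mumbai".
indexed = "mumbai"
for query_term in ("mubai", "mumbi"):
    distance = edit_distance(query_term, indexed)
    allowed = auto_fuzziness(query_term)
    print(query_term, distance, allowed, distance <= allowed)
```

Both misspellings are one edit away from the indexed term and are five characters long, so AUTO allows the single edit they need.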
minimum_should_match
Minimum number of clauses that must match for a document to be returned.
It specifies a percentage of the query's clauses, not a percentage of the characters in the string. Go through the minimum_should_match documentation to frame the query so that it returns the expected results. Invalid queries return invalid results.
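A small worked example of the clause arithmetic, as a sketch: the Elasticsearch documentation describes a positive percentage as being applied to the number of optional clauses and rounded down. A match query produces one optional clause per analyzed term, so the single-term query "Mumbi" has exactly one clause. The helper name is hypothetical and only non-negative percentages are covered.

```python
def required_clauses(optional_clauses: int, percent: int) -> int:
    """Clauses that must match for a non-negative percentage, rounded down."""
    if percent < 0:
        raise ValueError("this sketch only covers non-negative percentages")
    return optional_clauses * percent // 100


print(required_clauses(1, 10))   # one term, "10%"
print(required_clauses(4, 75))   # four terms, "75%"
```

With one clause, 10% rounds down to zero required clauses, so the setting does nothing to relax how the term "Mumbi" is compared with "Mumbai".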

Update field in a document based on the condition in Kibana/Elasticsearch

I am trying to update a particular field in documents that match some condition. In SQL terms, I want to do the following:
Update index indexname
set name = "XXXXXX"
where source: file and name : "YYYYYY"
I am using the request below to update the documents, but I am not able to add any condition.
POST indexname/_update_by_query
{
"query": {
"term": {
"name": "XXXXX"
}
}
}
Here is the template I am using:
{
"indexname": {
"mappings": {
"idxname123": {
"_all": {
"enabled": false
},
"properties": {
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"date1": {
"type": "date",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"source": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
}
}
}
Could someone guide me on how to add the condition mentioned above for source and name?
Thanks,
Babu
You can use the query below to do what you are looking for. I'm assuming name and source are fields in your index.
POST <your_index_name>/_update_by_query
{
"script": {
"inline": "ctx._source.name = 'XXXXX'",
"lang": "painless"
},
"query": {
"bool": {
"must": [
{
"term": {
"name": {
"value": "YYYYY"
}
}
},
{
"term": {
"source": {
"value": "file"
}
}
}
]
}
}
}
You can use any of the Full Text Queries or Term Queries inside the Bool Query, whether for searching, updating, or deleting.
Do spend some time going through them.
Note: Use Term Queries only if the field's datatype is keyword. With the mapping in the question, that means name.keyword and source.keyword rather than the analyzed text fields name and source.
Hope this helps!
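If the request body is built from application code, a sketch like the following may be useful. It assumes the name.keyword and source.keyword sub-fields from the question's mapping, passes the new value through script params instead of splicing it into the script text, and keeps the script key configurable: "inline" mirrors the request above, while later Elasticsearch versions renamed that key to "source". The function name is hypothetical and the dict would still have to be sent to <your_index_name>/_update_by_query with whatever client you use.

```python
import json
from typing import Any, Dict


def build_update_by_query(new_name: str, old_name: str, source: str,
                          script_key: str = "inline") -> Dict[str, Any]:
    """Body that sets name=new_name where name==old_name and source==source."""
    return {
        "script": {
            # The value comes from params, so quotes in new_name cannot break the script.
            script_key: "ctx._source.name = params.new_name",
            "lang": "painless",
            "params": {"new_name": new_name},
        },
        "query": {
            "bool": {
                "must": [
                    # Exact matches against the keyword sub-fields (assumed to exist).
                    {"term": {"name.keyword": {"value": old_name}}},
                    {"term": {"source.keyword": {"value": source}}},
                ]
            }
        },
    }


body = build_update_by_query("XXXXXX", "YYYYYY", "file")
print(json.dumps(body, indent=2))
```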

How to make Elasticsearch aggregation only create 1 bucket?

I have an Elasticsearch index which contains a field called "host". I'm trying to send a query to Elasticsearch to get a list of all the unique values of host in the index. This is currently as close as I can get:
{
"size": 0,
"aggs": {
"hosts": {
"terms": {"field": "host"}
}
}
}
Which returns:
"buckets": [
{
"key": "04",
"doc_count": 201
},
{
"key": "cyn",
"doc_count": 201
},
{
"key": "pc",
"doc_count": 201
}
]
However, the actual name of the host is 04-cyn-pc. My understanding is that it is splitting the value into separate terms, so I tried something like this:
{
"properties": {
"host": {
"type": "text",
"fields": {
"raw": {
"type": "text",
"analyzer": "keyword",
"fielddata": true
}
}
}
}
}
But it returns an illegal_argument_exception with "reason": "Mapper for [host.raw] conflicts with existing mapping in other types:\n[mapper [host.raw] has different [index] values, mapper [host.raw] has different [analyzer]]"
As you can probably tell, I'm very new to Elasticsearch, and any help or direction would be awesome. Thanks!
Try this instead:
{
"properties": {
"host": {
"type": "text",
"fields": {
"raw": {
"type": "keyword"
}
}
}
}
}
Elasticsearch automatically indexes string fields as both text and keyword if you do not specify a mapping. In your example, if you do not need the field to be analyzed for full-text search, you should just define the field's type as keyword. That way you avoid the overhead of an analyzed text field. With the mapping below you can solve your problem without changing your agg query.
"properties": {
"host": {
"type": "keyword"
}
}
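The difference between the two field types can be illustrated outside Elasticsearch. The sketch below is only a rough stand-in: the regular expression approximates how the default analyzer splits and lowercases this particular host name, and it is not the real tokenizer. All names are illustrative.

```python
import re
from collections import Counter
from typing import Callable, Dict, Iterable, List


def rough_text_tokens(value: str) -> List[str]:
    """Crude approximation of an analyzed text field: lowercase, split on punctuation."""
    return re.findall(r"[a-z0-9]+", value.lower())


def keyword_tokens(value: str) -> List[str]:
    """A keyword field keeps the whole value as a single term."""
    return [value]


def bucket_counts(values: Iterable[str],
                  tokenize: Callable[[str], List[str]]) -> Dict[str, int]:
    """Count documents per term, the way a terms aggregation builds buckets."""
    counts: Counter = Counter()
    for value in values:
        counts.update(set(tokenize(value)))  # a document counts once per term
    return dict(counts)


hosts = ["04-cyn-pc"] * 201
print(bucket_counts(hosts, rough_text_tokens))
print(bucket_counts(hosts, keyword_tokens))
```

The first call reproduces the three buckets from the question (04, cyn and pc with 201 documents each); the second yields the single 04-cyn-pc bucket that the keyword mapping is meant to produce.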

How do I do a terms aggregation by concatenating two arrays?

I have an Elasticsearch mapping that looks like this:
"product": {
"properties": {
"attributes": {
"type": "keyword",
"normalizer": "lowercase"
},
"skus": {
"type": "nested",
"properties": {
"attributes": {
"type": "keyword",
"normalizer": "lowercase"
}
}
}
}
}
I'm trying to do a terms aggregation on both the field attributes and the field skus.attributes by concatenating them but I haven't figured out how. Both fields are simple string arrays. This is as far as I've gotten:
{
"query": {
"match_all": {}
},
"aggregations": {
"unique_attrs": {
"terms": {
"field": "attributes"
}
}
}
}
Of course, I could reindex my data so that another field contains a concatenation of the values of both fields, but that doesn't seem right.
As mentioned on the Elasticsearch forums (https://discuss.elastic.co/t/combining-nested-and-non-nested-aggregations/82583), the recommendation is to merge them using a copy_to mapping when indexing the data.
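Either way, the goal is one field that holds the values of both arrays. The sketch below does not use copy_to; it shows the client-side variant the question mentions, computing the merged list before the document is indexed. The field name all_attributes and the sample product are hypothetical.

```python
from typing import Any, Dict, List


def combined_attributes(product: Dict[str, Any]) -> List[str]:
    """Merge product-level and SKU-level attributes, dropping duplicates in order."""
    values = list(product.get("attributes", []))
    for sku in product.get("skus", []):
        values.extend(sku.get("attributes", []))
    return list(dict.fromkeys(values))  # dict keys keep first-seen order


product = {
    "attributes": ["Red", "Cotton"],
    "skus": [
        {"attributes": ["Red", "XL"]},
        {"attributes": ["Blue"]},
    ],
}

# Hypothetical extra field that a single terms aggregation could target.
product["all_attributes"] = combined_attributes(product)
print(product["all_attributes"])
# prints ['Red', 'Cotton', 'XL', 'Blue']
```

If all_attributes were mapped like attributes (keyword with the lowercase normalizer), the existing terms aggregation would only need its field changed.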

elasticsearch query child list containing specific value

I am writing a query to return the products that have a specific promotionCode. In my index, each product has the following property indexed:
"offers": [
{
"promotionCode": "MV"
},
{
"promotionCode": "LI"
},
.....
]
My initial thought was that the following would be the answer:
GET alias-live-dev/_search
{
"query": {
"match": {
"offers.promotionCode":"MV"
}
}
}
However, this always returns 0 hits. I am guessing it fails because offers is a list. Could anyone please advise what the right query for this scenario would be? Thanks in advance.
In mapping,
"productId": {
"type": "keyword"
},
"offers": {
"type": "nested",
"properties": {
......
"promotionCode": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
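Since the mapping declares offers as type nested, one likely reason for the zero hits is that fields of nested objects are indexed as separate hidden documents and are only searchable through a nested query. The sketch below builds such a request body; the helper name is hypothetical, and the body would be sent to alias-live-dev/_search as in the question.

```python
import json
from typing import Any, Dict


def nested_match_query(path: str, field: str, value: str) -> Dict[str, Any]:
    """Wrap a match on `path.field` in a nested query scoped to `path`."""
    return {
        "query": {
            "nested": {
                "path": path,
                "query": {
                    # Inner queries still use the full dotted field name.
                    "match": {f"{path}.{field}": value}
                },
            }
        }
    }


body = nested_match_query("offers", "promotionCode", "MV")
print(json.dumps(body, indent=2))
```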
