Elastic search query for name / value pair columns pull - elasticsearch

We have one document in elastic search with multiple sections of name/value pair and we want to fetch value's only based on name column value.
"envelopeData": {
"envelopeName": "Bills",
"details": {
"detail": [
{
"name": "UC_CORP",
"value": "76483"
},
{
"name": "UC_CYCLE",
"value": "V"
}
We are expecting only 76483 as result based on name equals to UC_CORP

If the field envelopeData.details.detail is nested type then you can perform a match query for the desired name on the nested path and can use inner_hits to get just the value.
Map the field envelopeData.details.detail as nested(if not nested):
PUT stackoverflow
{
"mappings": {
"_doc": {
"properties": {
"envelopeData.details.detail": {
"type": "nested"
}
}
}
}
}
then you can perform the following query to get value using inner_hits:
GET stackoverflow/_search
{
"_source": "false",
"query": {
"nested": {
"path": "envelopeData.details.detail",
"query": {
"match": {
"envelopeData.details.detail.name.keyword": "UC_CORP"
}
},
"inner_hits": {
"_source": "envelopeData.details.detail.value"
}
}
}
}
which outputs:
{
"_index": "stackoverflow",
"_type": "_doc",
"_id": "W5GUW2gB3GnGVyg-Sf4T",
"_score": 0.6931472,
"_source": {},
"inner_hits": {
"envelopeData.details.detail": {
"hits": {
"total": 1,
"max_score": 0.6931472,
"hits": [
{
"_index": "stackoverflow",
"_type": "_doc",
"_id": "W5GUW2gB3GnGVyg-Sf4T",
"_nested": {
"field": "envelopeData.details.detail",
"offset": 0
},
"_score": 0.6931472,
"_source": {
"value": "76483" -> Outputs value only
}
}
]
}
}
}
}

Related

elk's elastic search dsl case sensitive

I'm doing an Elasticsearch Query DSL query on ELK such as:
{
"query": {
"wildcard": {
"url.path": {
"value": "*download*",
"boost": 1,
"rewrite": "constant_score"
}
}
}
}
but it seems is case sensitive (so show only info with "download", not "Download" or "DOWNLOAD").
i.e. is case sensitive.
can I disable this? and search case insensitive?
Version used: 7.9.1
The below query will help you perform case-insensitive search as it will fetch results for *download, *Download and *DOWNLOAD. You may replace with your index and with the field you would like to perform this search.
Search Query
GET /<my-index>/_search
{
"query" : {
"bool" : {
"must" : {
"query_string" : {
"query" : "*download",
"fields": ["<field1>"]
}
}
}
}
}
If you wish to perform the same search on multiple fields, you can add the same in list.
Search on multiple fields
GET /<my-index>/_search
{
"query" : {
"bool" : {
"must" : {
"query_string" : {
"query" : "*download",
"fields": ["<field1>","<field2>","field3>"]
}
}
}
}
}
There is a case_insensitive parameter available for wildcard query, but it was introduced in Elasticsearch 7.10.0, so you need to upgrade if you are still on 7.9.1.
If you can upgrade to 7.10.0 or higher:
Ideally, in index mapping field should use wildcard type:
{
"mappings": {
"properties": {
"url.path": {
"type": "wildcard"
}
}
}
}
Then a wildcard query with case insensitivity enabled will find all the variants ("download", "DOWNLOAD", "download", etc)
{
"query": {
"wildcard": {
"url.path": {
"value": "*download*",
"boost": 1,
"rewrite": "constant_score",
"case_insensitive": true
}
}
}
}
If you must remain at 7.9.1:
Define your mapping in such a way that Elasticsearch treats the field contents as lowercase. The following will mimic wildcard type (it's a keyword, so only one token) indexed as lowercase.
{
"mappings": {
"properties": {
"url": {
"type": "text",
"analyzer": "lowercase-keyword"
}
}
},
"settings": {
"analysis": {
"analyzer": {
"lowercase-keyword": {
"type": "custom",
"tokenizer": "keyword",
"filter": "lowercase"
}
}
}
}
}
The query, without the case_insensitive parameter which is unsupported in this version:
{
"query": {
"wildcard": {
"url": {
"value": "*download*",
"boost": 1,
"rewrite": "constant_score"
}
}
}
}
Example results (note that searching for "*download*" and "*DoWnLoAd*" with both work in the same way):
{
"hits": {
"total": {
"value": 3,
"relation": "eq"
},
"max_score": 1.0,
"hits": {
"total": {
"value": 3,
"relation": "eq"
},
"max_score": 1.0,
"hits": [
{
"_index": "my-index",
"_type": "_doc",
"_id": "PtbQe3wByTvslqtrs7Cn",
"_score": 1.0,
"_source": {
"url": "http://example.com/download"
}
},
{
"_index": "my-index",
"_type": "_doc",
"_id": "P9bQe3wByTvslqtrvbDt",
"_score": 1.0,
"_source": {
"url": "http://example.com/Download"
}
},
{
"_index": "my-index",
"_type": "_doc",
"_id": "QNbQe3wByTvslqtrzbDw",
"_score": 1.0,
"_source": {
"url": "http://example.com/DOWNLOAD"
}
}
]
}
}
You can use case_insensitive parameter for wildcard query. This parameter was introduced in 7.10.0 version
Adding a working example with index data, mapping, search query, and search result
Index Mapping:
{
"mappings": {
"properties": {
"url": {
"properties": {
"path": {
"type": "wildcard"
}
}
}
}
}
}
Index Data:
{
"url":{
"path":"xx/download"
}
}
Search Query:
{
"query": {
"wildcard": {
"url.path": {
"value": "*Download*",
"boost": 1,
"rewrite": "constant_score",
"case_insensitive": false
}
}
}
}
Search Result:
No results will be there when you are searching for *Download* or *DOWNLOAD*
Update:
You can use the wildcard query with "case_insensitive": true parameter
Adding a sample index data, search query, and search result
Index Data:
{
"url": {
"path": "download"
}
}
{
"url": {
"path": "DOWNLOAD"
}
}
{
"url": {
"path": "Download"
}
}
Search Query:
{
"query": {
"wildcard": {
"url.path": {
"value": "*DOWNLOAD*",
"boost": 1,
"rewrite": "constant_score",
"case_insensitive": true
}
}
}
}
Search Result:
"hits": [
{
"_index": "67210888",
"_type": "_doc",
"_id": "1",
"_score": 1.0,
"_source": {
"url": {
"path": "download"
}
}
},
{
"_index": "67210888",
"_type": "_doc",
"_id": "2",
"_score": 1.0,
"_source": {
"url": {
"path": "Download"
}
}
},
{
"_index": "67210888",
"_type": "_doc",
"_id": "3",
"_score": 1.0,
"_source": {
"url": {
"path": "DOWNLOAD"
}
}
}
]

How to make flattened sub-field in the nested field in elastic search?

Here, I have a indexed document like:
doc = {
"id": 1,
"content": [
{
"txt": I,
"time": 0,
},
{
"txt": have,
"time": 1,
},
{
"txt": a book,
"time": 2,
},
{
"txt": do not match this block,
"time": 3,
},
]
}
And I want to match "I have a book", and return the matched time: 0,1,2. Is there anyone who knows how to build the index and the query for this situation?
I think the "content.txt" should be flattened but "content.time" should be nested?
want to match "I have a book", and return the matched time: 0,1,2.
Adding a working example with index mapping,search query, and search result
Index Mapping:
{
"mappings": {
"properties": {
"content": {
"type": "nested"
}
}
}
}
Search Query:
{
"query": {
"nested": {
"path": "content",
"query": {
"bool": {
"must": [
{
"match": {
"content.txt": "I have a book"
}
}
]
}
},
"inner_hits": {}
}
}
}
Search Result:
"inner_hits": {
"content": {
"hits": {
"total": {
"value": 3,
"relation": "eq"
},
"max_score": 2.5226097,
"hits": [
{
"_index": "64752029",
"_type": "_doc",
"_id": "1",
"_nested": {
"field": "content",
"offset": 2
},
"_score": 2.5226097,
"_source": {
"txt": "a book",
"time": 2
}
},
{
"_index": "64752029",
"_type": "_doc",
"_id": "1",
"_nested": {
"field": "content",
"offset": 0
},
"_score": 1.5580825,
"_source": {
"txt": "I",
"time": 0
}
},
{
"_index": "64752029",
"_type": "_doc",
"_id": "1",
"_nested": {
"field": "content",
"offset": 1
},
"_score": 1.5580825,
"_source": {
"txt": "have",
"time": 1
}
}
]
}
}
}
}

Elasticsearch exact match query (not fuzzy)

I have the following query over a string field:
const query = {
"query": {
"match_phrase": {
name: "ron"
}
},
"from": 0,
"size": 10
};
these are the names I have in the database
1. "ron"
2. "ron martin"
3. "ron ron"
4. "ron howard"
the result of this query is very fuzzy, all the of the rows are returned instead of row number 1 only.
It's like it is performing "contains" instead of "equals".
Thanks
In your case, all the documents are returning, because all the documents have ron in them.
If you want that only the exact field should match, then you need to add keyword subfield to the name field. This uses the keyword analyzer instead of the standard analyzer (notice the ".keyword" after name field). Try out this below query -
Index Mapping:
{
"mappings": {
"properties": {
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
}
Index Data:
{
"name":"ron"
}
{
"name":"ron martin"
}
{
"name":"ron ron"
}
{
"name":"ron howard"
}
{
"name": "john howard"
}
Search Query:
{
"query": {
"match_phrase": {
"name.keyword": "ron"
}
},
"from": 0,
"size": 10
}
Search Result:
"hits": [
{
"_index": "64982377",
"_type": "_doc",
"_id": "1",
"_score": 1.2039728,
"_source": {
"name": "ron"
}
}
]
Update 1:
Based on the comments below, if you want to search for both exact match and fuzzy match (according to your requirement), then you can use multi_match query.
Search Query:
{
"query": {
"multi_match": {
"query": "howard",
"fields": [
"name",
"name.keyword"
],
"type": "phrase"
}
}
}
Search Result:
"hits": [
{
"_index": "64982377",
"_type": "_doc",
"_id": "4",
"_score": 0.83740485,
"_source": {
"name": "ron howard"
}
},
{
"_index": "64982377",
"_type": "_doc",
"_id": "5",
"_score": 0.83740485,
"_source": {
"name": "john howard"
}
}
]

Elastic Search - weighting base on an attribute

Is there a way in Elastic Search to weight results base on an attribute other than the one used for the search query. For example, we search the field 'name', but all documents that have 'with_pictures' attributed to true weighted higher.
You can use boost on individual fields, that will be boosted automatically — count more towards the relevance score — at query time, with the boost parameter
Adding working example with index data, mapping and search query
Index mapping:
{
"mappings": {
"properties": {
"with_pictures": {
"type": "boolean",
"boost": 2
},
"name": {
"type": "keyword"
}
}
}
}
Index data:
{
"name": "A",
"with_pictures": false
}
{
"name": "A",
"with_pictures": true
}
{
"name": "B",
"with_pictures": true
}
Search Query:
{
"query": {
"bool": {
"minimum_should_match": 1,
"should": [
{
"bool": {
"should": [
{
"term": {
"name": "A"
}
},
{
"term": {
"with_pictures": true
}
}
]
}
}
]
}
}
}
Search Result:
"hits": [
{
"_index": "fd_cb1",
"_type": "_doc",
"_id": "1",
"_score": 1.4100108,
"_source": {
"name": "A",
"with_pictures": true
}
},
{
"_index": "fd_cb1",
"_type": "_doc",
"_id": "3",
"_score": 0.9400072,
"_source": {
"name": "B",
"with_pictures": true
}
},
{
"_index": "fd_cb1",
"_type": "_doc",
"_id": "2",
"_score": 0.4700036,
"_source": {
"name": "A",
"with_pictures": false
}
}
]
Score of documents satisfying both the conditions of name and with_properties is having the highest score. But the document having name: B and with_pictures: true have higher score than name: A and with_pictures: false( because of the boost applied on the with_pictures
You can also refer function score query that allows you to modify the score of documents that are retrieved by a query.

Elasticsearch search for Turkish characters

I have some documents that i am indexing with elasticsearch. But some of the documents are written with upper case and Tukish characters are changed. For example "kürşat" is written as "KURSAT".
I want to find this document by searching "kürşat". How can i do that?
Thanks
Take a look at the asciifolding token filter.
Here is a small example for you to try out in Sense:
Index:
DELETE test
PUT test
{
"settings": {
"analysis": {
"filter": {
"my_ascii_folding": {
"type": "asciifolding",
"preserve_original": true
}
},
"analyzer": {
"turkish_analyzer": {
"tokenizer": "standard",
"filter": [
"lowercase",
"my_ascii_folding"
]
}
}
}
},
"mappings": {
"test": {
"properties": {
"name": {
"type": "string",
"analyzer": "turkish_analyzer"
}
}
}
}
}
POST test/test/1
{
"name": "kürşat"
}
POST test/test/2
{
"name": "KURSAT"
}
Query:
GET test/_search
{
"query": {
"match": {
"name": "kursat"
}
}
}
Response:
"hits": {
"total": 2,
"max_score": 0.30685282,
"hits": [
{
"_index": "test",
"_type": "test",
"_id": "2",
"_score": 0.30685282,
"_source": {
"name": "KURSAT"
}
},
{
"_index": "test",
"_type": "test",
"_id": "1",
"_score": 0.30685282,
"_source": {
"name": "kürşat"
}
}
]
}
Query:
GET test/_search
{
"query": {
"match": {
"name": "kürşat"
}
}
}
Response:
"hits": {
"total": 2,
"max_score": 0.4339554,
"hits": [
{
"_index": "test",
"_type": "test",
"_id": "1",
"_score": 0.4339554,
"_source": {
"name": "kürşat"
}
},
{
"_index": "test",
"_type": "test",
"_id": "2",
"_score": 0.09001608,
"_source": {
"name": "KURSAT"
}
}
]
}
Now the 'preserve_original' flag will make sure that if a user types: 'kürşat', documents with that exact match will be ranked higher than documents that have 'kursat' (Notice the difference in scores for both query responses).
If you want the score to be equal, you can put the flag on false.
Hope I got your problem right!

Resources