Query consecutive words using match_phrase elasticsearch works unexpected - elasticsearch

I have the parameter name as a text:
{
"mappings": {
"_doc": {
"properties": {
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
}
}
}
}
Because of the nature of text type in ElasticSearch, matchs every word on the phrase. That's why in some cases I get the next results:
POST /example-tags/_search
{
"query": {
"match": {
"name": "Jordan Rudess was born in 1956"
}
}
}
// Results
{
"took" : 28,
"timed_out" : false,
"_shards" : {
"total" : 2,
"successful" : 2,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 2.1596613,
"hits" : [
{
"_index" : "example-tags",
"_type" : "_doc",
"_id" : "6101e538bc8ec610aff699e4",
"_score" : 4.1596613,
"_source" : {
"name" : "Jordan Rudess"
}
},
{
"_index" : "example-tags",
"_type" : "_doc",
"_id" : "610123538bc8ec61034ff699e4",
"_score" : 4.1796613,
"_source" : {
"name" : "Alice in Chains"
}
},
]
}
}
As you can see, in the text Jordan Rudess was born in 1956 I get the result Alice in Chains just for the word in. I want to avoid this behaviour.
If I try:
POST /example-tags/_search
{
"query": {
"match_phrase": {
"name": "Dream Theater keyboardist's Jordan Rudess was born in 1956"
}
}
}
// Results
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 0,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
}
}
So, in the past example I was expecting to get the Jordan Rudess tag name but I get empty results.
I need to get the maximum ocurrences in tag.name of consecutive words in a phrase. How can I achieve that?

Related

why elasticsearch can not search a document contains one word?

I am using default settings for one index, follow DSL is how to create the documents and searching.
### create index
PUT /mk_test
{
"settings": {
"number_of_shards": 1,
"number_of_replicas": 1
},
"mappings": {
"_doc": {
"properties": {
"nickName": {
"type": "text"
}
}
}
}
}
### get index
GET /mk_test/_mapping
### create document
POST /mk_test/_doc
{
"nickName": "C.BP"
}
### create document
POST /mk_test/_doc
{
"nickName": "BP"
}
### create document
POST /mk_test/_doc
{
"nickName": "C.B"
}
### create document
POST /mk_test/_doc
{
"nickName": "你好,中国"
}
now I have 4 document in mk_test index,
and I have 2 search query, give me different answers.
I want to query docs contains "中国"
GET /mk_test/_search
{
"query": {
"bool": {
"must": [
{"match_phrase": {"nickName": "中国"}}
]
}
}
}
server responses:
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 1,
"max_score" : 1.5779729,
"hits" : [
{
"_index" : "mk_test",
"_type" : "_doc",
"_id" : "c2gwwX0BTkUG9klh1b8k",
"_score" : 1.5779729,
"_source" : {
"nickName" : "你好,中国"
}
}
]
}
}
I want to query docs contains "BP", I can't get "C.BP",
GET /mk_test/_search
{
"query": {
"bool": {
"must": [
{"match_phrase": {"nickName": "BP"}}
]
}
}
}
server give me only "BP", but "C.BP" not found
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 1,
"max_score" : 1.4599355,
"hits" : [
{
"_index" : "mk_test",
"_type" : "_doc",
"_id" : "TmguwX0BTkUG9klhAJ_S",
"_score" : 1.4599355,
"_source" : {
"nickName" : "BP"
}
}
]
}
}
How can I find both "BP" and "C.BP" ?

elasticsearch reducing result to one column - return only 1 value for each document

I try to reduce the json result of elasticsearch to only the column or columns i suggested to get. Is there any way?
When I use the following command, I get the result nasted into "_source":
{
"from": "0", "size":"2",
"_source":["id"],
"query": {
"match_all": {}
}
}
'
and there is no need for my use case.
I get this result:
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 10000,
"relation" : "gte"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "indexer_1",
"_type" : "type_indexer_1",
"_id" : "38142",
"_score" : 1.0,
"_source" : {
"id" : 38142
}
},
{
"_index" : "indexer_1",
"_type" : "type_indexer_1",
"_id" : "38147",
"_score" : 1.0,
"_source" : {
"id" : 38147
}
}
]
}
}
What I would like to have:
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 10000,
"relation" : "gte"
},
"max_score" : 1.0,
"hits" : [
{
"id" : 38142
},
{
"id" : 38147
}
]
}
}
And this json-result I would love:
{
{
"id" : 38142
},
{
"id" : 38147
}
}
Is there any way out of the box in ES to reduce the result set?
you can filter the output JSON look at the documentation : response filtering
GET /index/_search?filter_path=hits.hits._id
{
"from": "0",
"size":"2",
"_source":["id"],
"query": {
"match_all": {}
}
}

Why does elastic search wildcard query return no results?

Query #1 in Kibana returns results, however Query #2 returns no results. I search for only "bob" and get results, but when searching for "bob smith", no results, even though "Bob Smith" exists in the index. Any reason why?
Query #1: returns results
GET people/_search
{
"query": {
"wildcard" : {
"name" : "*bob*"
}
}
}
Results:
{
"took" : 4,
"timed_out" : false,
"_shards" : {
"total" : 2,
"successful" : 2,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 23,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "people",
"_type" : "_doc",
"_id" : "xxxxx",
"_score" : 1.0,
"_source" : {
"name" : "Bob Smith",
...
Query #2: returns nothing.. why(?)
GET people/_search
{
"query": {
"wildcard" : {
"name" : "*bob* *smith*"
}
}
}
results...nothing
{
"took" : 3,
"timed_out" : false,
"_shards" : {
"total" : 2,
"successful" : 2,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 0,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
}
}
Look like the reason of the empty result is your index mapping. If you use "text" type field, you actually search in the inverted index, mean you search in the token "bob" and token "smith" (standard analyzer) and not in the "Bob Smith". If you want to search in "Bob Smith" as one token, you need to use "keyword" type (maybe with lowercase normalizer, if you want to use not key sensetive search)
For example:
PUT test
{
"settings": {
"analysis": {
"normalizer": {
"lowercase_normalizer": {
"type": "custom",
"char_filter": [],
"filter": [
"lowercase",
"asciifolding"
]
}
}
}
},
"mappings": {
"properties": {
"name": {
"type": "keyword",
"ignore_above": 256,
"normalizer": "lowercase_normalizer"
}
}
}
}
PUT test/_doc/1
{
"name" : "Bob Smith"
}
GET test/_search
{
"query": {
"wildcard": {
"name": "*bob* *Smith*"
}
}
}

No matches when querying Elastic Search

I'm trying to run a query elastic search. When run this query
GET accounts/_search/
{
"query": {
"term": {
"address_line_1": "1000"
}
}
}
I get back multiple records like
"hits" : [
{
"_index" : "accounts",
"_type" : "_doc",
"_id" : "...",
"_score" : 8.355149,
"_source" : {
"state_id" : 35,
"first_name" : "...",
"last_name" : "...",
"middle_name" : "P",
"dob" : "...",
"status" : "ACTIVE",
"address_line_1" : "1000 BROADROCK CT",
"address_line_2" : "",
"address_city" : "PARMA",
"address_zip" : "",
"address_zip_plus_4" : ""
}
},
But when I try to expand it to include the more like below I don't get any matches
GET accounts/_search/
{
"query": {
"term": {
"address_line_1": "1000 B"
}
}
}
The response is
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 0,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
}
}
The term query is looking for exact matches. Your address_line_* fields were most probably indexed with the standard analyzer which lowercase-s all the letters which in turn prevents the query from matching.
So either use
GET accounts/_search/
{
"query": {
"match": { <--
"address_line_1": "1000 B"
}
}
}
which does not really 'care' about B being lower/upper case or adjust your field analyzers such that the capitalization is preserved.

elasticsearch does not return expected returns

I'm complete new on elasticsearch. I tried search API but it's not returning what I expected
What I did
POST /test/_doc/1
{
"name": "Hello World"
}
GET /test/_doc/1
Response:
{
"_index" : "test",
"_type" : "_doc",
"_id" : "1",
"_version" : 5,
"_seq_no" : 28,
"_primary_term" : 1,
"found" : true,
"_source" : {
"name" : "Hello World"
}
}
GET /test/_mapping
Response:
{
"test" : {
"mappings" : {
"properties" : {
"name" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"query" : {
"properties" : {
"term" : {
"properties" : {
"name" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
}
}
}
}
}
}
}
}
GET /test/_search
{
"query": {
"term": {
"name": "Hello"
}
}
}:
{
"took" : 4,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 0,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
}
}
GET /test/_search
{
"query": {
"term": {
"name": "Hello World"
}
}
}
Response:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 0,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
}
}
My elasticsearch version is 7.3.2
The last two search should return me document 1, is that correct? Why does it hit nothing?
Problem is that you have term queries. Term queries are not analysed. Hence Hello didn't match the term hello in your index. Note the case difference.
Unlike full-text queries, term-level queries do not analyze search terms. Instead, term-level queries match the exact terms stored in a field.
Reference
Whereas match queries analyse the search term also.
{
"query": {
"match": {
"name": "Hello"
}
}
}
You can use _analyze to check how your terms are indexed.

Resources