Elasticsearch 6.3 query with a space in a keyword field not returning all documents

I have the following part of a mapping:
"name": {
  "store": "true",
  "type": "keyword"
}
and this query:
{
  "query": {
    "query_string": {
      "query": "+(name:John Doe)",
      "fields": []
    }
  },
  "aggregations": {
    "name": {
      "terms": {
        "field": "name",
        "size": 10
      }
    }
  }
}
The query should return over 100 results, but it only returns a few. If I add quotes around John Doe, like this: \"John Doe\", then it returns all the desired results.
I'm wondering why this happens. Isn't it enough that the field is mapped as keyword, so that John Doe is analyzed as a whole and no quotes need to be added? Also, why would it return fewer items without quotes?
Note: In ES 1.4 the same query seems to work fine (although it is not the same data, to be honest, and it uses facets instead of aggregations).

The documentation for the query_string query clearly states:
If the field is a keyword field the analyzer will create a single term ...
So you don't need to add quotes to your search string. Instead, you need to write your query correctly. Currently your query tries to find the term John in the field name, and the term Doe in all other fields! So you must rewrite your query in one of the following ways:
Add parentheses around your search terms so the query parser can "understand" that all words must be found in the name field:
{
  "query": {
    "query_string": {
      "query": "+(name:(John Doe))",
      "fields": []
    }
  },
  "aggregations": {
    "name": {
      "terms": {
        "field": "name",
        "size": 10
      }
    }
  }
}
Specify the field name in the fields array rather than in the query string:
{
  "query": {
    "query_string": {
      "query": "+(John Doe)",
      "fields": ["name"]
    }
  },
  "aggregations": {
    "name": {
      "terms": {
        "field": "name",
        "size": 10
      }
    }
  }
}
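If you want to see how the parser actually splits the unquoted string, the validate API can return the rewritten Lucene query (a sketch; the index name is a placeholder for your own):

```
GET your_index/_validate/query?explain=true
{
  "query": {
    "query_string": {
      "query": "+(name:John Doe)"
    }
  }
}
```

The explanation in the response should show the second word being matched against the default field rather than against name, which is why the unquoted query returns fewer documents.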

Related

How does multi field mapping work in Elastic Search

I want to support both text search (match query) as well as exact match (term query) on a single field in my elasticsearch index.
Following is the mapping that I have created:
PUT multi_mapping_test/_mapping
{
  "properties": {
    "name": {
      "type": "text",
      "fields": {
        "keyword": {
          "type": "keyword"
        }
      }
    }
  }
}
However, the term query is not behaving as I expect it to (maybe my understanding of it is wrong).
For example, here are a couple of sample documents indexed:
POST multi_mapping_test/_doc
{
  "name": "abc llc"
}
POST multi_mapping_test/_doc
{
  "name": "def llc"
}
The following term query yields no results:
GET multi_mapping_test/_search
{
  "query": {
    "term": {
      "name": {
        "value": "abc llc"
      }
    }
  }
}
Am I doing anything wrong, or is my understanding of exact matches with the term query incorrect?
P.S. The term query works fine when I map the field as keyword type only.
The term query returns documents that contain an exact term in a provided field. Your name field is mapped as text, so its value is analyzed into the separate terms abc and llc; no single term abc llc exists in that field, which is why your query finds nothing. When you're searching for an exact match you should use the keyword sub-field, like the following:
GET multi_mapping_test/_search
{
  "query": {
    "term": {
      "name.keyword": {
        "value": "abc llc"
      }
    }
  }
}
In addition, you can use a bool query to combine both text search (match query) and exact match (term query) in your Elasticsearch index.
GET multi_mapping_test/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "name": "abc llc"
          }
        },
        {
          "term": {
            "name.keyword": {
              "value": "abc llc"
            }
          }
        }
      ],
      "minimum_should_match": 1
    }
  }
}
Note: You can also use the match_bool_prefix query if you need autocomplete functionality.
Details: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-match-bool-prefix-query.html
"abc llc" will get a higher _score than "def llc" because it matches both the match and the term queries.

ElasticSearch phrase search not working for wildcard search

I have a words index and I can't search phrases in Elasticsearch; there are no results. I've checked tons of solutions but I can't apply them to my query.
My mapping looks like this;
PUT /words/_mapping
{
  "properties": {
    "text": {
      "type": "keyword"
    }
  }
}
(if the type is text, everything works as expected)
My Elasticsearch query looks like this:
GET /words/_search
{
  "from": 0,
  "query": {
    "bool": {
      "filter": [
        {
          "bool": {
            "should": [
              {
                "query_string": {
                  "default_field": "text",
                  "default_operator": "AND",
                  "query": "*foo bar baz*"
                }
              }
            ]
          }
        }
      ]
    }
  },
  "size": 100,
  "track_total_hits": true
}
But there is no data; the result table comes back empty (columns: _id, _index, _score, _type, text).
I expect the record that has the value `foo bar baz`:
_id: drDzzYQBu3ncIuw4vn10, _index: words, _score: 0.0, _type: _doc, text: foo bar baz
What is the problem? Could someone help?
"Query_string" is a full text query and will be optimized for text fields only , I suggest you to to use text type instead of keyword for this query.
As you are using AND operator , it will treat as phrase only.
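Note that an existing field's type cannot be changed in place, so switching text from keyword to text means recreating the index and reindexing (a sketch; it assumes the index can be dropped and rebuilt):

```
PUT /words
{
  "mappings": {
    "properties": {
      "text": {
        "type": "text"
      }
    }
  }
}
```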

Elastic Search exact match query with wildcard search for multiple fields

I have an Elasticsearch data store like below, and I need to write a multi-field ES query over these data with exact match and exact match + * (wildcard).
[
  {
    "id": "123",
    "name": "test123 abc bct",
    "externalObj": {
      "id": "abc 123"
    }
  },
  {
    "id": "124",
    "name": "test124 abc bct",
    "externalObj": {
      "id": "abc 124"
    }
  }
]
Currently I have written a query like below:
{
  "query": {
    "query_string": {
      "fields": [
        "name^5",
        "id",
        "externalObj.*"
      ],
      "query": "(test124 abc)",
      "default_operator": "AND"
    }
  }
}
The above query works fine for exact matches, but I also need to get data for partial searches, with the most relevant results scored highest. That doesn't work with this query.
e.g.: "query": "test124 ab"
Can anyone help me out with the above problem?
There are 2 options to achieve what you want. You can choose one of them:
Set default_operator to OR (or simply remove it, since the default value is OR).
{
  "query": {
    "query_string": {
      "fields": [
        "name^5",
        "id",
        "externalObj.*"
      ],
      "query": "test124 a"
    }
  }
}
Change your query to test124 a*:
{
  "query": {
    "query_string": {
      "fields": [
        "name^5",
        "id",
        "externalObj.*"
      ],
      "query": "test124 a*",
      "default_operator": "AND"
    }
  }
}

How to sort elasticsearch results based on number of collapsed items?

I'm using a query with collapse in order to gather documents under a certain person, yet I wish to sort the results based on the number of documents in which the search found a match. This is my query:
GET documents/_search
{
  "_source": {
    "includes": [
      "text"
    ]
  },
  "query": {
    "query_string": {
      "fields": [
        "text"
      ],
      "query": "some text"
    }
  },
  "collapse": {
    "field": "person_id",
    "inner_hits": {
      "name": "top_mathing_docs",
      "_source": {
        "includes": [
          "doc_year",
          "text"
        ]
      }
    }
  }
}
Any suggestions?
Thanks
If I understand correctly, what you require here is to sort the parent documents based on the count of inner_hits, i.e. the count of matching documents per person_id.
That means the _score of the parent documents in the result doesn't matter.
The only way I've found to do this is to make use of the Top Hits Aggregation (see its Field Collapse Example), and below is what your query would look like.
Aggregation query, field-collapse style:
POST <your_index_name>/_search
{
  "size": 0,
  "query": {
    "query_string": {
      "fields": [
        "text"
      ],
      "query": "some text"
    }
  },
  "aggs": {
    "top_person_ids": {
      "terms": {
        "field": "person_id"
      },
      "aggs": {
        "top_tags_hits": {
          "top_hits": {
            "size": 10
          }
        }
      }
    }
  }
}
Note that I'm assuming person_id is of type keyword or a numeric type.
Also, if you look at the query closely, I've specified "size": 0, which means only the aggregation results are returned. The terms aggregation orders its buckets by document count in descending order by default, so the persons with the most matching documents come first, which is exactly the sort order you're after.
Another note: the above aggregation has nothing to do with the Field Collapsing search feature you posted in the question. It's just that using this aggregation, your result can be formatted in a similar way.
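If you prefer to make the count ordering explicit rather than rely on the default, the terms aggregation accepts an order parameter (a fragment showing only the modified aggs section; the rest of the query stays the same):

```
"aggs": {
  "top_person_ids": {
    "terms": {
      "field": "person_id",
      "order": { "_count": "desc" }
    }
  }
}
```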
Let me know if this helps!

Case insensitive query

I'm using Elasticsearch, and I have this field:
"name": {
"type": "string",
"index": "not_analyzed"
},
I run this query to get, for example, all employees with the name "Charles":
GET company_employee/employee/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match_phrase": {
            "name": "Charles"
          }
        }
      ]
    }
  }
}
The issue with this is that I need a case-insensitive search. I need to retrieve all "Charles" documents, even if the value I provide to the query is ChaRleS, charles, CHarles, etc. What do I need to do?
If reindexing is not an option, this leaves altering your query.
Although the regexp query doesn't support case-insensitive matching directly, you can emulate it "manually".
If only the first letter varies in case, you can get by with this:
GET company_employee/employee/_search
{
  "query": {
    "regexp": { "name": "[Cc]harles" }
  }
}
Otherwise, for "true" case-insensitivity:
GET company_employee/employee/_search
{
  "query": {
    "regexp": { "name": "[Cc][Hh][Aa][Rr][Ll][Ee][Ss]" }
  }
}
This is in no way efficient, but it matches your constraint of not altering the index.
Ideally, though, you'd want your name field to be analyzed, i.e. like this:
"name": {
"type": "text",
"analyzer": "standard"
}
With such a mapping all values are lower-cased at index time, so searches become case-insensitive.
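After reindexing with the analyzed mapping, an ordinary match query should then find every casing, since the query text is analyzed the same way as the indexed value (a sketch using the index and field from the question):

```
GET company_employee/employee/_search
{
  "query": {
    "match": { "name": "ChaRleS" }
  }
}
```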
