match_phrase in Elasticsearch not working as expected - elasticsearch

In my Elasticsearch index I have documents that contain a field "fieldname" with the values "abc" and "abc-def". When I use a match_phrase query to search for documents whose fieldname is "abc", it also returns documents with the value "abc-def". However, when I query for "abc-def" it works fine. My query is as follows:
GET my_index/_search
{
"query" : {
"match_phrase" : {"fieldname" : "abc"}
}
}
Can someone please help me understand the issue?

The match_phrase query analyzes the search term using the analyzer configured for the field (if no analyzer is specified, the standard analyzer is used by default).
A match_phrase query finds documents in which all the terms from the search term are present in the field, in the same order.
In your case, "abc-def" gets tokenized to "abc" and "def" (because of the standard analyzer). When you run a match_phrase query for "abc-def", it searches for documents that contain both abc and def in that order (which is why you get only 1 document in the result).
When you search for "abc", it looks for documents that contain abc in the fieldname field (since both documents contain abc, both are returned).
If you want only exactly matching documents in the result, you need to change the way the terms are analyzed.
If you have not explicitly defined any mapping, you need to add .keyword to the fieldname field (notice the ".keyword" after the field name in the query). This targets the keyword sub-field that the default dynamic mapping creates for strings, which indexes the whole value as a single unanalyzed term instead of running it through the standard analyzer.
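To make the reasoning above concrete, here is a minimal Python sketch that imitates the behaviour outside of Elasticsearch. It is only an approximation: standard_like_tokens is a hypothetical stand-in that splits on non-alphanumeric characters and lowercases, which is not the full standard analyzer, and phrase_matches is a hypothetical helper that checks for the query tokens as a consecutive run.

```python
import re


def standard_like_tokens(text):
    # Rough stand-in for the standard analyzer (assumption):
    # split on anything that is not a letter or digit, then lowercase.
    return [t.lower() for t in re.findall(r"[A-Za-z0-9]+", text)]


def keyword_tokens(text):
    # A keyword field keeps the whole value as one unanalyzed term.
    return [text]


def phrase_matches(query, value, analyze):
    # True when the query tokens appear in the value tokens
    # consecutively and in the same order.
    q = analyze(query)
    d = analyze(value)
    n = len(q)
    return n > 0 and any(d[i:i + n] == q for i in range(len(d) - n + 1))


DOCS = ["abc", "abc-def"]


def search(query, analyze):
    return [v for v in DOCS if phrase_matches(query, v, analyze)]


print(search("abc", standard_like_tokens))      # ['abc', 'abc-def']
print(search("abc-def", standard_like_tokens))  # ['abc-def']
print(search("abc", keyword_tokens))            # ['abc']
```

With the analyzed field, "abc" is a token of both values, so both match; with the keyword-style field only the exact value matches.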
Here is a working example with index data, search query and search result:
Index data:
{
"name":"abc-def"
}
{
"name":"abc"
}
Search Query:
{
"query": {
"match_phrase": {
"name.keyword": "abc"
}
}
}
Search Result:
"hits": [
{
"_index": "67394740",
"_type": "_doc",
"_id": "1",
"_score": 0.6931471,
"_source": {
"name": "abc"
}
}
]

Related

How to make _source field dynamic in elasticsearch search template?

When running a search query in Elasticsearch, we define which fields we want in the response:
"_source": ["name", "age"]
When working with search templates, we have to set the _source fields at the time we store the search template in the ES cluster:
"_source": ["name", "age"]
The problem with the search template is that it will always return name and age; to get other fields we have to change the search template accordingly.
Is there any way to pass the fields from the client, so that the response contains only the fields the user asked for?
I have achieved this for a single field. If you do this:
"_source": "{{field}}"
then when searching the index via the template you can do this:
POST index_name/_search/template
{
"id": "template_id",
"params": {
"field": "name"
}
}
This search returns the name field in the response, but I could not find a way to pass the fields as an array (or in another format) so that I can get multiple fields.
Absolutely!!
Your search template should look like this:
"_source": {{#toJson}}fields{{/toJson}}
And then you can call it like this:
POST index_name/_search/template
{
"id": "template_id",
"params": {
"fields": ["name"]
}
}
What it's going to do is to transform the params.fields array into JSON and so the generated query will look like this:
"_source": ["name"]
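As a rough illustration of what the toJson section does, the following Python sketch renders such a template locally. It is not the Mustache engine Elasticsearch uses: render_to_json is a hypothetical helper that handles only the {{#toJson}}name{{/toJson}} form, and the match_all query in the template string is an assumption added so that the rendered body is complete JSON.

```python
import json
import re

# Template source as it might be stored; the match_all query is assumed.
TEMPLATE_SOURCE = (
    '{"_source": {{#toJson}}fields{{/toJson}}, "query": {"match_all": {}}}'
)

TO_JSON = re.compile(r"\{\{#toJson\}\}(\w+)\{\{/toJson\}\}")


def render_to_json(template, params):
    # Replace each toJson section with the JSON encoding of the named param.
    return TO_JSON.sub(lambda m: json.dumps(params[m.group(1)]), template)


rendered = render_to_json(TEMPLATE_SOURCE, {"fields": ["name", "age"]})
body = json.loads(rendered)  # the rendered text is valid JSON
print(body["_source"])       # ['name', 'age']
```

Because the parameter is serialized as a whole, the client can pass any number of fields without the template changing.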

Elastic Search | How to get original search query with corresponding match value

I'm using ElasticSearch as search engine for a human resource database.
The user submits a competence (e.g. 'disruption'), and ElasticSearch returns all users ordered by best match.
I have configured the field 'competences' to use synonyms, so 'innovation' would match 'disruption'.
I want to show the user (who is performing the search) how a particular search result matched the search query. For this I use the Explain API (reference).
The query works as expected and returns an _explanation for each hit.
Details (simplified a bit) for a particular hit could look like the following:
{
description: "weight(Synonym(skills:innovation skills:disruption))",
value: 3.0988
}
Problem: I cannot see what the original search term was in the _explanation. (As illustrated in the example above, I can see that some search query matched 'innovation' or 'disruption', but I need to know which skill the user searched for.)
Question: Is there any way to solve this issue (for example, by passing a custom 'description' with info about the search query to the _explanation)?
Expected Result:
{
description: "weight(Synonym(skills:innovation skills:disruption))",
value: 3.0988,
customDescription: 'innovation'
}
Maybe you can put the original query in the _name field?
As explained in https://qbox.io/blog/elasticsearch-named-queries:
GET /_search
{
"query": {
"query_string" : {
"default_field" : "skills",
"query" : "disruption",
"_name": "disruption"
}
}
}
You can then find the original query in the matched_queries section of the returned hit:
{
"_index": "testindex",
"_type": "employee",
"_id": "2",
"_score": 0.19178301,
"_source": {
"skills": "disruption"
},
"matched_queries": [
"disruption"
]
}
Add explain to this solution and I think it should work fine.
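The suggestion above could be wired together roughly as follows. This is a sketch under assumptions: named_skill_search and searched_terms are hypothetical helper names, the request body is only built (not sent to a cluster), and the sample hit is the one shown in the answer.

```python
import json


def named_skill_search(term, field="skills"):
    # Name the query after the user's original term and ask for explanations.
    return {
        "explain": True,
        "query": {
            "query_string": {
                "default_field": field,
                "query": term,
                "_name": term,
            }
        },
    }


def searched_terms(hit):
    # matched_queries lists the _name of every named query the hit matched.
    return hit.get("matched_queries", [])


body = named_skill_search("disruption")
print(json.dumps(body, indent=2))

# Sample hit copied from the answer above.
hit = {
    "_index": "testindex",
    "_type": "employee",
    "_id": "2",
    "_score": 0.19178301,
    "_source": {"skills": "disruption"},
    "matched_queries": ["disruption"],
}
print(searched_terms(hit))  # ['disruption']
```

The client can then show the value from matched_queries next to the _explanation of the same hit.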

Elasticsearch query to get results irrespective of spaces in search text

I am trying to fetch data from Elasticsearch by matching on a name field. I have the following two records:
{
"_index": "sam_index",
"_type": "doc",
"_id": "key",
"_version": 1,
"_score": 2,
"_source": {
"name": "Sample Name"
}
}
and
{
"_index": "sam_index",
"_type": "doc",
"_id": "key1",
"_version": 1,
"_score": 2,
"_source": {
"name": "Sample Name"
}
}
When I search using texts like sam, sample, Sa, etc., I am able to fetch both records using a match_phrase_prefix query. The query I tried is:
GET sam_index/doc/_search
{
"query": {
"match_phrase_prefix" : {
"name": "sample"
}
}
}
I am not able to fetch the records when I search with the string samplen. I need to search and get results irrespective of spaces in the text. How can I achieve this in Elasticsearch?
First, you need to understand how Elasticsearch works and why it does or does not return a given result.
ES works on token matching. Documents that you index in ES go through the analysis process, and the tokens generated by this process are stored in an inverted index, which is used for searching.
When you run a query, the query also generates search tokens. These can be taken as-is from the search text, in the case of a term query, or produced by the analyzer defined on the search field, in the case of a match query. Hence it is very important to understand the internals of your search query.
It is also very important to understand the mapping of your index: ES uses the standard analyzer by default on text fields.
You can use the Explain API to understand the internals of the query, such as which search tokens are generated by your search query, how documents matched them, and on what basis the score is calculated.
In your case, I created the name field as a text field that uses the word-joined analyzer explained in "Ignore spaces in Elasticsearch", and I was able to get the document containing sample name when searching for samplen.
Let us know if this is what you want to achieve and whether it solves your issue.
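The idea behind matching irrespective of spaces can be illustrated with a small Python sketch. This is not the analyzer from the linked answer; joined and prefix_match_ignoring_spaces are hypothetical helpers that simply apply the same normalization (drop whitespace, lowercase) to the stored value and to the search text before doing a prefix comparison.

```python
def joined(text):
    # Remove all whitespace and lowercase, so "Sample Name" -> "samplename".
    return "".join(text.split()).lower()


def prefix_match_ignoring_spaces(query, value):
    # Both sides are normalized the same way, as index-time and
    # search-time analysis would be in Elasticsearch.
    return joined(value).startswith(joined(query))


names = ["Sample Name"]
for query in ["sam", "sample", "samplen", "sample n"]:
    hits = [n for n in names if prefix_match_ignoring_spaces(query, n)]
    print(query, "->", hits)
```

The point is that "samplen" only becomes a prefix once the space has been removed from the indexed value, which the default standard analyzer does not do.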

Elasticsearch 6.2: terms query requires lowercase input when searching on keyword

I've created an example index, with the following mapping:
{
"_doc": {
"_source": {
"enabled": false
},
"properties": {
"status": { "type": "keyword" }
}
}
}
And indexed a document:
{"status": "CMP"}
When searching the documents with this status with a terms query, I find no results:
{
"query" : {
"terms": { "status": ["CMP"]}
}
}
However, if I make the same query by putting the input in lowercase, I will find my document:
{
"query" : {
"terms": { "status": ["cmp"]}
}
}
Why is that? Since I'm searching on a keyword field, the indexed content should not be analyzed, and an uppercase value should match.
Not any more, @Oliver Charlesworth. In Elasticsearch 6.x you can continue to use the keyword datatype and lowercase your text with a normalizer (doc here). In either case, however, you have to change your index mapping and reindex your documents.
The index and mapping creation and the search were part of a test suite. It turned out that the setup part of the test suite was not executed, so the mapping was not applied to the index.
The index was therefore using the default types instead of the mapped types, resulting in the use of string fields instead of keywords.
After fixing the setup method of the automated tests, the mappings are correctly applied to the index, and the uppercase value "CMP" for status now matches documents.
The symptoms you're seeing shouldn't occur, unless something else is wrong.
A keyword field is not analysed, so your index should contain only CMP. A terms query is not analysed either, so your index is searched only for CMP. Hence there should be a match.
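The difference between the two situations described above (mapping applied versus mapping missing) can be imitated with a short Python sketch. It is a simplification: index_terms is a hypothetical helper, and the "text" branch only splits on non-alphanumeric characters and lowercases, standing in for the analysis a default string field goes through.

```python
import re


def index_terms(value, field_type):
    # Terms that end up in the inverted index for one field value.
    if field_type == "keyword":
        return {value}  # stored as-is, no analysis
    # Assumed approximation of the default analysis on a text field.
    return {t.lower() for t in re.findall(r"[A-Za-z0-9]+", value)}


def terms_query_matches(search_values, value, field_type):
    # A terms query does not analyze its input: exact term comparison.
    indexed = index_terms(value, field_type)
    return any(v in indexed for v in search_values)


print(terms_query_matches(["CMP"], "CMP", "keyword"))  # True
print(terms_query_matches(["CMP"], "CMP", "text"))     # False
print(terms_query_matches(["cmp"], "CMP", "text"))     # True
```

The lowercase-only match is the signature of an analyzed field, which is consistent with the mapping not having been applied.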

Elastic search filter value like "123-325-23243" during aggregation

In an Elasticsearch query, when I try to aggregate on values like 1234-3245-34234-2342, it just returns the key 1234.
Is there any possibility of specifying the property type or a regular expression for it?
Some more explanation :
"aggregations": { "myagg": { "terms": { "field": "did", "size": 50 } } }
When I try it on the data, the values are like ABC-CDEF-DEFG, and after running the script it is not able to aggregate them. It shows the key as only ABC:
"key" : "ABC", "doc_count" : 24069
It can't take the entire key, like ABC-DEF-GHI-fhho.
Check your mapping; I expect you did not define anything for this field in the mapping. In that case you get the standard analyzer for strings. The standard analyzer breaks the value up at the "-", which is why you get the term you mentioned. Make the field not_analyzed and you should get better results.
When I use field.raw, that fixes the issue... https://github.com/elasticsearch/kibana/issues/364
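Why the buckets come out as fragments can be seen in a small Python sketch of a terms aggregation. It is a simplification under stated assumptions: terms_aggregation is a hypothetical helper, and the analyzed branch only splits on non-alphanumeric characters (the real standard analyzer also lowercases, which is left out here to focus on the splitting). The two sample values are the ones from the question.

```python
import re
from collections import Counter


def terms_aggregation(values, analyzed):
    # Count documents per indexed term, like doc_count in a terms aggregation.
    counts = Counter()
    for value in values:
        if analyzed:
            terms = set(re.findall(r"[A-Za-z0-9]+", value))  # split at "-"
        else:
            terms = {value}  # not_analyzed / raw: whole value is one term
        counts.update(terms)
    return dict(counts)


values = ["ABC-CDEF-DEFG", "ABC-DEF-GHI-fhho"]
print(terms_aggregation(values, analyzed=True))
print(terms_aggregation(values, analyzed=False))
```

On the analyzed field the shared fragment ABC collects both documents, while on the unanalyzed field each full identifier gets its own bucket.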

Resources