I search across all fields using Elasticsearch; how do I know which field matched?

PUT my_index/user/1
{
"first_name": "John",
"last_name": "Smith",
"date_of_birth": "1970-10-24"
}
GET my_index/_search
{
"query": {
"match": {
"_all": "john 1970"
}
}
}
In the above example, "john 1970" is searched for across all fields.
Since the indexed document matches on "first_name" and "date_of_birth", it is returned as a result.
How can I tell that it matched on "first_name" and "date_of_birth"?

The thing is that _all is a field into which all values from all other fields are copied at indexing time. Concretely, when you index your document, what ES conceptually sees is this (though the source is not modified to contain _all and _all itself is not stored, just indexed):
{
"first_name": "John",
"last_name": "Smith",
"date_of_birth": "1970-10-24",
"_all": "john smith 1970 10 24"
}
So if you match against _all, the only field that can match is _all itself. There's no way to "reverse-engineer" which field contained which matching value based solely on _all.
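To make that concrete, here is a minimal Python sketch of what happens conceptually. The analyze function is only a rough stand-in for the standard analyzer (an assumption for illustration, not the real implementation), and build_all_terms is a hypothetical helper name:

```python
import re

def analyze(text):
    """Rough stand-in for the standard analyzer: split on non-word characters, lowercase."""
    return [token.lower() for token in re.findall(r"\w+", str(text))]

def build_all_terms(doc):
    """Collect the terms of every field value into one flat list, as _all conceptually does."""
    terms = []
    for value in doc.values():
        terms.extend(analyze(value))
    return terms

doc = {
    "first_name": "John",
    "last_name": "Smith",
    "date_of_birth": "1970-10-24",
}

all_terms = build_all_terms(doc)
print(all_terms)  # ['john', 'smith', '1970', '10', '24']
```

Once the terms are in that flat list, nothing records which field each one came from, which is why a match on _all cannot tell you the original field.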
What you can do, however, is use another feature called highlighting. Since the _all field is not stored, it cannot be highlighted, but the other fields can, so you can highlight which original fields match which values:
{
"query": {
"match": {
"_all": "john 1970"
}
},
"highlight": {
"fields": {
"*": {
"require_field_match": false
}
}
}
}
In the response, you'll see something like this, which shows that first_name matches the query:
"highlight": {
"first_name": [
"<em>John</em>"
]
}
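If you want the list of matching fields on the client side, you can read the keys of each hit's highlight section. A minimal Python sketch, assuming the response has already been parsed into a dict; matched_fields is a hypothetical helper and sample_response is a hand-written, trimmed-down response, not real output:

```python
def matched_fields(response):
    """Map each hit's _id to the sorted list of fields that produced a highlight."""
    result = {}
    for hit in response.get("hits", {}).get("hits", []):
        # Hits without a highlight section simply get an empty list.
        result[hit["_id"]] = sorted(hit.get("highlight", {}).keys())
    return result

# Hand-written sample shaped like a search response (illustrative only).
sample_response = {
    "hits": {
        "hits": [
            {
                "_id": "1",
                "_source": {
                    "first_name": "John",
                    "last_name": "Smith",
                    "date_of_birth": "1970-10-24",
                },
                "highlight": {"first_name": ["<em>John</em>"]},
            }
        ]
    }
}

print(matched_fields(sample_response))  # {'1': ['first_name']}
```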

Related

Merging fields in Elastic Search

I am pretty new to Elasticsearch. I have a dataset with multiple fields such as name, product_info, description, etc. When searching for a document, the search term can come from any of these fields (let us call them the "search core fields").
If I start storing the data in Elasticsearch, should I derive a field that concatenates all the "search core fields", and then index this field alone?
I came across the _all mapping concept and am a little confused. Does it do the same thing?
No, you don't need to create any new field with concatenated terms.
You can just use _all with a match query to search for text in any field.
As for _all: yes, it searches the text from any field.
The _all field was deprecated in ES 6.0 (it can no longer be enabled on indices created in 6.0 or later) and removed in ES 7, so it only works on older versions. The main reason for this is that it used too much storage space.
However, you can define your own all field using the copy_to feature. You basically specify in your mapping which fields should be copied to your custom all field and then you can search on that field.
You can define your mapping like this:
PUT my-index
{
"mappings": {
"properties": {
"name": {
"type": "text",
"copy_to": "custom_all"
},
"product_info": {
"type": "text",
"copy_to": "custom_all"
},
"description": {
"type": "text",
"copy_to": "custom_all"
},
"custom_all": {
"type": "text"
}
}
}
}
PUT my-index/_doc/1
{
"name": "XYZ",
"product_info": "ABC product",
"description": "this product does blablabla"
}
And then you can search on your "all" field like this:
POST my-index/_search
{
"query": {
"match": {
"custom_all": {
"query": "ABC",
"operator": "and"
}
}
}
}
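If you have many fields, you may prefer to generate that mapping rather than write it by hand. A minimal Python sketch, assuming all source fields are of type text; build_copy_to_mapping is a hypothetical helper name, and the resulting dict would still have to be sent to Elasticsearch with whatever client you use:

```python
import json

def build_copy_to_mapping(source_fields, target="custom_all"):
    """Build a mapping body where every source field is copied into one catch-all field."""
    properties = {
        name: {"type": "text", "copy_to": target} for name in source_fields
    }
    # The catch-all field itself is a plain text field.
    properties[target] = {"type": "text"}
    return {"mappings": {"properties": properties}}

mapping = build_copy_to_mapping(["name", "product_info", "description"])
print(json.dumps(mapping, indent=2))
```

The printed JSON has the same shape as the body of the PUT my-index request above.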

Elasticsearch with AND query in DSL

This drives me crazy. I have no clue why Elasticsearch does not return any results.
I indexed the documents like this:
PUT /customer/person-test/1?pretty
{
"name": "John Doe",
"personId": 153,
"houseHoldId": 6191136,
"quarter": "2016_Q1"
}
PUT /customer/person-test/2?pretty
{
"name": "John Doe",
"personId": 153,
"houseHoldId": 6191136,
"quarter": "2016_Q2"
}
and when I query like this, it does not return anything:
GET /customer/person-test/_search
{
"query": {
"bool": {
"must" : [
{
"term": {
"name": "John Doe"
}
},
{
"term": {
"quarter": "2016_Q1"
}
}
]
}
}
}
I copied this query from "A simple AND query with Elasticsearch".
I just want to get the person with "John Doe" AND "2016_Q1". Why does this not work?
You should use match instead of term:
GET /customer/person-test/_search
{
"query": {
"bool": {
"must" : [
{
"match": {
"name": "John Doe"
}
},
{
"match": {
"quarter": "2016_Q1"
}
}
]
}
}
}
Explanation
Why doesn’t the term query match my document ?
String fields can be of type text (treated as full text, like the body
of an email), or keyword (treated as exact values, like an email
address or a zip code). Exact values (like numbers, dates, and
keywords) have the exact value specified in the field added to the
inverted index in order to make them searchable.
However, text fields are analyzed. This means that their values are
first passed through an analyzer to produce a list of terms, which are
then added to the inverted index.
There are many ways to analyze text: the default standard analyzer
drops most punctuation, breaks up text into individual words, and
lower cases them. For instance, the standard analyzer would turn the
string “Quick Brown Fox!” into the terms [quick, brown, fox].
This analysis process makes it possible to search for individual words
within a big block of full text.
The term query looks for the exact term in the field’s inverted
index — it doesn’t know anything about the field’s analyzer. This
makes it useful for looking up values in keyword fields, or in numeric
or date fields. When querying full text fields, use the match query
instead, which understands how the field has been analyzed.
...
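The difference can be illustrated with a small Python simulation. This is only a sketch: analyze is a rough approximation of the standard analyzer (an assumption, not the real tokenizer), and term_query and match_query are hypothetical helpers that mimic the lookup logic described above.

```python
import re

def analyze(text):
    """Rough approximation of the standard analyzer: split into word tokens, lowercase."""
    return [token.lower() for token in re.findall(r"\w+", text)]

def index_field(value):
    """Terms that would end up in the inverted index for an analyzed text field."""
    return set(analyze(value))

def term_query(indexed_terms, value):
    """A term query looks up the value exactly as given, without analysis."""
    return value in indexed_terms

def match_query(indexed_terms, text):
    """A match query analyzes the text first, then looks up the terms (OR by default)."""
    return any(term in indexed_terms for term in analyze(text))

name_terms = index_field("John Doe")        # {'john', 'doe'}
quarter_terms = index_field("2016_Q1")      # {'2016_q1'}

print(term_query(name_terms, "John Doe"))      # False
print(match_query(name_terms, "John Doe"))     # True
print(term_query(quarter_terms, "2016_Q1"))    # False
print(match_query(quarter_terms, "2016_Q1"))   # True
```

The term lookups fail because neither "John Doe" nor "2016_Q1" exists in the index in that exact form; only the lowercased tokens do.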
It's not working because you are using the default standard analyzer for 'name' and 'quarter'.
You have two more options:
1) Change the mapping (this is pre-5.x syntax; on 5.x and later, use "type": "keyword" instead):
"name": {
"type": "string",
"index": "not_analyzed"
},
"quarter": {
"type": "string",
"index": "not_analyzed"
}
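2) Lowercase your values, since the default standard analyzer applies the Lower Case Token Filter. It also splits "John Doe" into the two terms john and doe, so each one needs its own term clause:
{
"query": {
"bool": {
"must" : [
{
"term": {
"name": "john"
}
},
{
"term": {
"name": "doe"
}
},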
{
"term": {
"quarter": "2016_q1"
}
}
]
}
}
}

Elastic search highlight not working

I am new to Elasticsearch and I am trying to highlight the matched keywords:
GET /{index}/_search
{
"query": {
"match": {
"_all": "first"
}
},
"highlight": {
"fields": {
"*": {}
},
"require_field_match": false
}
}
My output is a nested object. I also tried without the "require_field_match" parameter.
You can use one of the two methods mentioned in the link below to search and highlight on all fields:
A field can only be used for highlighting if the original string value
is available, either from the _source field or as a stored field.
The _all field is not present in the _source field and it is not
stored or enabled by default, and so cannot be highlighted. There are
two options. Either store the _all field or highlight the original
fields.
Highlight all fields
You can't produce a highlight on the _all field itself, because it is not stored by default.
The simplest fix is to search in an actual field:
GET /{index}/_search
{
"query": {
"match": {
"title": "first"
}
},
"highlight": {
"fields": {
"title": {}
}
}
}
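If you need to search and highlight several real fields at once, one option is a multi_match query with one highlight entry per field. A minimal Python sketch that only builds the request body; build_highlight_search is a hypothetical helper, and "body" is an assumed second field name that does not come from the question:

```python
import json

def build_highlight_search(text, fields):
    """Build a search body that queries the given fields and highlights each of them."""
    return {
        "query": {"multi_match": {"query": text, "fields": list(fields)}},
        # One empty highlight config per field means default highlighting settings.
        "highlight": {"fields": {field: {} for field in fields}},
    }

# "title" comes from the example above; "body" is an assumed extra field.
search_body = build_highlight_search("first", ["title", "body"])
print(json.dumps(search_body, indent=2))
```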

Is there any way not to return arrays when specifying return fields in an Elasticsearch query?

If I have documents like these:
[
{
"model": "iPhone",
"brand": "Apple"
},
{
"model": "Nexus 5",
"brand": "Google"
}
]
and I run a query that only returns the model field, like this:
{
"fields": ["model"],
"query": {
"term": {
"brand": "apple"
}
}
}
Then each document field is returned within an array like this:
{ "model": ["iPhone"] }
instead of
{ "model": "iPhone" }
How can I avoid that and get the fields in the same format as when the fields query option is not defined?
In the end the answer was pretty easy: you have to use the _source query option instead of fields.
Example:
{
"_source": ["model"],
"query": {
"term": {
"brand": "apple"
}
}
}
This way I get documents in the following format, like in the original one (without the _source option):
{ "model": "iPhone" }
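If you have to keep using the fields option for some reason, another possibility is to unwrap the arrays on the client side. A minimal Python sketch, assuming the fields section of a hit has already been parsed into a dict; unwrap_single_values is a hypothetical helper name:

```python
def unwrap_single_values(fields):
    """Replace single-element lists with their only element; leave everything else as is."""
    return {
        name: values[0] if isinstance(values, list) and len(values) == 1 else values
        for name, values in fields.items()
    }

print(unwrap_single_values({"model": ["iPhone"]}))  # {'model': 'iPhone'}
```

Fields that really hold several values stay as lists, so you do not lose data for multi-valued fields.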
I had the same problem, and indeed (as Wax Cage said) I thought that _source would cause performance problems. I think using both fields and _source solves the problem:
const fields = ['model']
// searchBody is just an illustrative name for the request body
const searchBody = {
fields: fields,
_source: fields,
query: {
term: {
brand: 'apple'
}
}
}

OR query with elasticsearch

I have an index with "name" and "description" fields. I am running a Boolean query against my index. Sometimes the term is present in both the name and description fields; in this case, documents in which both fields contain the search term are scored higher than documents in which only one of them does.
What I want is to score them equally, so that a document with the term in either name or description has the same score as a document with the term in both.
Is it possible?
Here is the example:
{
"name": "xyz",
"description": "abc xyz"
},
{
"name": "abc",
"description": "xyz pqr"
},
{
"name": "xyz",
"description": "abc pqr"
}
If the user searches for the term "xyz", I want all three documents above to have the same score,
as all of them contain the term "xyz" in name, in description, or in both fields.
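You can use a filtered query for this, since filters are not scored. Note that the filtered query was deprecated in Elasticsearch 2.0 and removed in 5.0; on newer versions, use a bool query with a filter clause or a constant_score query instead. The query below searches for the term "xyz":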
POST <index name>/<type>/_search
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"bool": {
"should": [
{
"term": {
"name": "xyz"
}
},
{
"term": {
"description": "xyz"
}
}
]
}
}
}
}
}
I think you can either:
transform your query into a filter. Filters do not affect the score (and are usually faster than queries),
or wrap your query in a "Constant score query" - see: http://www.elasticsearch.org/guide/reference/query-dsl/constant-score-query/
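For the constant score option, the request body could look roughly like what the following Python sketch builds. It only constructs the dict; build_constant_score_query is a hypothetical helper, and the explicit minimum_should_match and boost values are my own choices rather than something from the question:

```python
import json

def build_constant_score_query(term, fields, boost=1.0):
    """Build a body where any field containing the term matches with the same fixed score."""
    return {
        "query": {
            "constant_score": {
                "filter": {
                    "bool": {
                        # One term clause per field; at least one of them has to match.
                        "should": [{"term": {field: term}} for field in fields],
                        "minimum_should_match": 1,
                    }
                },
                "boost": boost,
            }
        }
    }

body = build_constant_score_query("xyz", ["name", "description"])
print(json.dumps(body, indent=2))
```

Every matching document then gets the boost value as its score, whether the term appears in one field or in both.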
