I have a document with a nested field and I'm having some trouble getting highlighting to work. Why am I not getting highlighting when my term query contains pointy brackets (<>)?
We have two fields in a nested mapping containing similar data:
"value": {
"type": "keyword",
"normalizer": "lowercase"
},
"valueWithQualifier": {
"type": "keyword",
"normalizer": "lowercase"
}
The lowercase normalizer uses the filters ["asciifolding", "lowercase"].
The value is generally an alphanumeric string but the valueWithQualifier takes the form value<qualifier>. When I execute a term query on the value field, it generally returns highlighting information. When I execute a term query on the valueWithQualifier field, I never get highlighting info.
{
"query": {
"nested": {
"path": "assoc",
"query": {
"term": {
"assoc.value": "123abc"
}
},
"inner_hits": {
"highlight": {
"fields": {
"assoc.value*": {}
}
}
}
}
}
}
This returns an inner hit with a highlight:
"highlight": {
"assoc.value": [
"<em>123abc</em>"
]
}
However, this query returns the inner_hit but no highlighting:
{
"query": {
"nested": {
"path": "assoc",
"query": {
"term": {
"assoc.valueWithQualifier": "123abc<qual>"
}
},
"inner_hits": {
"highlight": {
"fields": {
"assoc.value*": {}
}
}
}
}
}
}
However, this does return the highlighting (but I'd rather use a term query due to efficiency):
{
"query": {
"nested": {
"path": "assoc",
"query": {
"prefix": {
"assoc.valueWithQualifier": "123abc"
}
},
"inner_hits": {
"highlight": {
"fields": {
"assoc.value*": {}
}
}
}
}
}
}
"highlight": {
"assoc.valueWithQualifier": [
"<em>123abc<qual></em>"
]
}
And before someone asks, I have tried adding "encoder": "html" to the highlight.
It turns out this is a bug that was fixed in ES 6.2 (https://github.com/elastic/elasticsearch/pull/27604).
I am trying to implement a search-as-you-type query inside an array.
This is the structure of the documents:
{
"guid": "6f954d53-df57-47e3-ae9e-cb445bd566d3",
"labels":
[
{
"name": "London",
"lang": "en"
},
{
"name": "Llundain",
"lang": "cy"
},
{
"name": "Lunnainn",
"lang": "gd"
}
]
}
So far, this is what I have come up with:
{
"query": {
"multi_match": {
"fields": ["labels.name"],
"query": name,
"type": "phrase_prefix"
}
}
}
which works exactly as requested.
The problem is that I would also like to search by language.
What I tried is:
{
"query": {
"bool": {
"must": [
{
"multi_match": {
"fields": ["labels.name"],
"query": "london",
"type": "phrase_prefix"
}
},
{
"term": {
"labels.lang": "gd"
}
}
]
}
}
}
but the two clauses can match different elements of the array.
For example, I would like to search only Welsh (cy) labels. That means the city name in my query should match only elements that have "cy" as their "lang" value.
How do I write this kind of query?
Internally, Elasticsearch flattens arrays of JSON objects into multi-value fields, so it can't correlate the lang and name of a specific element in the labels array. If you want this kind of correlation, you'll need to index your documents differently.
The usual way to do this is to use the nested data type with a matching nested query.
The query would end up looking something like this:
{
"query": {
"nested": {
"path": "labels",
"query": {
"bool": {
"must": [
{
"multi_match": {
"fields": ["labels.name"],
"query": "london",
"type": "phrase_prefix"
}
},
{
"term": {
"labels.lang": "gd"
}
}
]
}
}
}
}
}
But note that you'll need to also specify nested mappings for your labels, e.g.:
"properties": {
"labels": {
"type": "nested",
"properties": {
"name": {
"type": "text"
/* you might want to add other mapping-related configuration here */
},
"lang": {
"type": "keyword"
}
}
}
}
Other ways to do this include:
Indexing each label as a separate document, repeating the guid field
Using parent/child documents
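To make the flattening mentioned at the start of this answer more concrete, here is a small toy model in plain Python. It is not Elasticsearch code: the helper names (flatten_object_array, matches_flattened, matches_nested) are invented for illustration, and lowercasing stands in for text analysis. It only sketches why an object array produces a false positive for "london" + "gd" while a nested-style match does not.

```python
# Toy model only: mimics how an object array is flattened into
# multi-value fields, versus how nested objects stay separate.
# This is NOT Elasticsearch code; the function names are made up.

doc = {
    "guid": "6f954d53-df57-47e3-ae9e-cb445bd566d3",
    "labels": [
        {"name": "London", "lang": "en"},
        {"name": "Llundain", "lang": "cy"},
        {"name": "Lunnainn", "lang": "gd"},
    ],
}


def flatten_object_array(document, field):
    """Collapse an array of objects into one list of values per sub-field."""
    flat = {}
    for item in document[field]:
        for key, value in item.items():
            flat.setdefault(f"{field}.{key}", []).append(value.lower())
    return flat


def matches_flattened(document, field, name_prefix, lang):
    """Object-style match: each condition is checked against the whole array."""
    flat = flatten_object_array(document, field)
    name_ok = any(n.startswith(name_prefix) for n in flat[f"{field}.name"])
    lang_ok = lang in flat[f"{field}.lang"]
    return name_ok and lang_ok


def matches_nested(document, field, name_prefix, lang):
    """Nested-style match: both conditions must hold for the same element."""
    return any(
        item["name"].lower().startswith(name_prefix) and item["lang"] == lang
        for item in document[field]
    )


print(flatten_object_array(doc, "labels"))
print(matches_flattened(doc, "labels", "london", "gd"))  # True (false positive)
print(matches_nested(doc, "labels", "london", "gd"))     # False
print(matches_nested(doc, "labels", "llun", "cy"))       # True
```

The flattened version answers "is there some element named london* and some element with lang gd", which is why the original bool query matched across different array elements.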
You should use the nested datatype in your mapping instead of the object datatype. For a detailed explanation, see:
https://www.elastic.co/guide/en/elasticsearch/reference/current/nested.html
So, you should define mapping of your field something like this:
{
"properties": {
"labels": {
"type": "nested",
"properties": {
"name": {
"type": "text"
},
"lang": {
"type": "keyword"
}
}
}
}
}
After this, you can query with a nested query:
{
"query": {
"nested": {
"path": "labels",
"query": {
"bool": {
"must": [
{
"multi_match": {
"fields": ["labels.name"],
"query": "london",
"type": "phrase_prefix"
}
},
{
"term": {
"labels.lang": "gd"
}
}
]
}
}
}
}
}
I have the post_filter below, where I am trying to filter records where the school name is HILL SCHOOL AND which contain a nested child object with name JOY AND section A.
The school name is on the parent object, which holds a list of nested children objects.
All of the above are AND conditions.
But the query doesn't seem to work. Any idea why? And is there a way to combine the two nested queries?
GET /test_school/_search
{
"query": {
"match_all": {}
},
"post_filter": {
"bool": {
"must_not": [
{
"bool": {
"must": [
{
"term": {
"schoolname": {
"value": "HILL SCHOOL"
}
}
},
{
"nested": {
"path": "children",
"query": {
"bool": {
"must": [
{
"match": {
"name": "JACK"
}
}
]
}
}
}
},
{
"term": {
"children.section": {
"value": "A"
}
}
}
]
}
}
]
}
}
}
The schema is as below:
PUT /test_school
{
"mappings": {
"_doc": {
"properties": {
"schoolname": {
"type": "keyword"
},
"children": {
"type": "nested",
"properties": {
"name": {
"type": "keyword",
"index": true
},
"section": {
"type": "keyword",
"index": true
}
}
}
}
}
}
}
Sample data as below:
POST /test_school/_doc
{
"schoolname":"HILL SCHOOL",
"children":{
"name":"JOY",
"section":"A"
}
}
Second record:
POST /test_school/_doc
{
"schoolname":"HILL SCHOOL",
"children":{
"name":"JACK",
"section":"B"
}
}
https://stackoverflow.com/a/17543151/183217 suggests special mapping is needed to work with nested objects. You appear to be falling foul of the "cross object matching" problem.
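Going by the query in the question, two things look likely to be involved: the match on name does not use the full path (children.name), and the term on children.section sits outside the nested query, so it is evaluated against the parent document rather than against a child. Below is a sketch of one possible rewrite, built as a plain Python dict and printed as JSON. build_school_filter is a hypothetical helper, not part of any client library, and the sketch assumes the goal is to keep matching documents (bool filter); if the must_not exclusion in the question was intended, the same clauses could be wrapped in must_not instead.

```python
import json

# Sketch only: builds a search request body as a plain dict.
# build_school_filter is a hypothetical helper, not a client-library function.


def build_school_filter(school, child_name, section):
    """Return a request body requiring the school and ONE child with both values."""
    child_query = {
        "nested": {
            "path": "children",
            "query": {
                "bool": {
                    # Both conditions sit in the same nested query, so they
                    # must be satisfied by the same child object.
                    "filter": [
                        {"term": {"children.name": child_name}},
                        {"term": {"children.section": section}},
                    ]
                }
            },
        }
    }
    return {
        "query": {"match_all": {}},
        "post_filter": {
            "bool": {
                "filter": [
                    {"term": {"schoolname": school}},
                    child_query,
                ]
            }
        },
    }


body = build_school_filter("HILL SCHOOL", "JOY", "A")
print(json.dumps(body, indent=2))
```

Putting both child conditions inside a single nested clause is what combines "the two nested queries": they are then checked against the same child, which avoids the cross-object matching problem.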
I have a nested type field in my mapping. When I use a term query on my nested field, Elasticsearch returns no results, whereas when I change the term query to a match query, it works fine and Elasticsearch returns the expected result.
Here is my mapping; imagine I have only one nested field in my type mapping:
{
"homing.estatefiles": {
"mappings": {
"estatefile": {
"properties": {
"DynamicFields": {
"type": "nested",
"properties": {
"Name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"ValueBool": {
"type": "boolean"
},
"ValueDateTime": {
"type": "date"
},
"ValueInt": {
"type": "long"
}
}
}
}
}
}
}
}
And here is my term query (which returns no result):
{
"from": 50,
"size": 50,
"query": {
"bool": {
"filter": [
{
"nested": {
"query": {
"bool": {
"must": [
{
"term": {
"DynamicFields.Name":{"value":"HasParking"}
}
},
{
"term": {
"DynamicFields.ValueBool": {
"value": true
}
}
}
]
}
},
"path": "DynamicFields"
}
}
]
}
}
}
And here is my query which returns the expected result (after changing the term query to a match query):
{
"from": 50,
"size": 50,
"query": {
"bool": {
"filter": [
{
"nested": {
"query": {
"bool": {
"must": [
{
"match": {
"DynamicFields.Name":"HasParking"
}
},
{
"term": {
"DynamicFields.ValueBool": {
"value": true
}
}
}
]
}
},
"path": "DynamicFields"
}
}
]
}
}
}
This happens because of the capital letters and the way Elasticsearch analyzes text fields.
When you use term, Elasticsearch looks for the exact value you gave; the query value itself is not analyzed.
So far so good, but DynamicFields.Name is a text field, so the value stored in the index went through an analyzer at index time (the standard analyzer, unless you configured another one), which changes the value.
In your case it turned HasParking into hasparking.
The term query then tries to match HasParking against hasparking and of course fails. There is a great explanation in the documentation, in the "Why doesn't the term query match my document" section. With match, your query string goes through the same analyzer before the lookup, which is why you get your result. If you need an exact match, run the term query against the DynamicFields.Name.keyword sub-field from your mapping instead.
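A toy model may make the difference concrete. It assumes a text field is analyzed at index time by a lowercasing analyzer, that term compares the query value without analyzing it, and that match analyzes the query value first. The analyze function below is a rough stand-in for the standard analyzer, not its real implementation, and all names are invented for the example.

```python
import re

# Toy model of a lowercasing analyzer; the real standard analyzer
# does more (Unicode-aware tokenization, for example).


def analyze(text):
    """Split on non-alphanumeric characters and lowercase each token."""
    return [tok.lower() for tok in re.findall(r"[A-Za-z0-9]+", text)]


# A text field stores analyzed tokens; a keyword field stores the raw value.
indexed_text = analyze("HasParking")
indexed_keyword = ["HasParking"]


def term_query(indexed_terms, value):
    """term: the query value is compared as-is, with no analysis."""
    return value in indexed_terms


def match_query(indexed_terms, value):
    """match: the query value is analyzed first, then compared."""
    return any(tok in indexed_terms for tok in analyze(value))


print(indexed_text)                               # ['hasparking']
print(term_query(indexed_text, "HasParking"))     # False
print(match_query(indexed_text, "HasParking"))    # True
print(term_query(indexed_keyword, "HasParking"))  # True
```

The last line corresponds to running term against a keyword sub-field such as DynamicFields.Name.keyword, where the original casing is preserved.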
Mapping of the field that I'm trying to filter on:
"genres": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
There's an entry with these values:
"genres": [
"Animation",
"History"
],
I am trying to build a filter where I input "Animation" and get back all entries that have Animation as one of their genres.
I tried using terms:
GET /test/_search
{
"query": {
"bool": {
"filter": {
"terms": {
"genres": [
"Animation",
"History"
]
}
}
}
}
}
This returned no entries. From what I have read, I need to remap my index and set "index": "not_analyzed"; then it would return some entries.
However, I can get results without a filter, using something like this:
GET /tmdb/_search
{
"query": {
"bool": {
"must" : [
{
"match": {
"genres": "history"
}
},
{
"match": {
"genres": "animation"
}
}
]
}
}
}
This does give me some results, but it only returns entries that have both "animation" AND "history" as their genres.
So my question: do I need to remap my index and add "index": "not_analyzed" to the fields I will filter on, or do I go with the second option (not using filters)?
Edit:
I thought something like this would work, but it's not working as I expected (the "and" operator does not seem to work for me):
GET /test/_search
{
"query": {
"match": {
"genres": {
"query": "animation",
"query": "history",
"operator": "and"
}
}
}
}
Your first query is almost correct. If you query the genres field (i.e. analyzed), you should use a match query instead:
POST /test/_search
{
"query": {
"bool": {
"should": [{
"match": {
"genres": "Animation"
}
},{
"match": {
"genres": "History"
}
}]
}
}
}
If you query the genres.keyword field (i.e. not analyzed), then you can use the terms query:
POST /test/_search
{
"query": {
"bool": {
"filter": {
"terms": {
"genres.keyword": [
"Animation",
"History"
]
}
}
}
}
}
Note: not_analyzed was used in ES 2.x and earlier; starting with ES 5, using the keyword type is the equivalent.
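The switch from must to should in the first query is what changes the result from "both genres" to "either genre". As a rough illustration, here is a toy model of the two bool clause types. The documents and helper names are invented, lowercasing stands in for text analysis, and the should behaviour shown assumes a bool query with no must or filter clauses, where at least one should clause has to match.

```python
# Toy model of bool "should" versus "must" over analyzed genre tokens.
# The documents and helper names are invented for illustration.

docs = {
    1: ["Animation", "History"],
    2: ["Animation"],
    3: ["Drama"],
}

# Lowercased tokens, roughly what a text field would index.
index = {doc_id: {g.lower() for g in genres} for doc_id, genres in docs.items()}


def match(doc_id, value):
    """Mimic a match query: lowercase the input, then look it up."""
    return value.lower() in index[doc_id]


def bool_must(values):
    """Every clause has to match (AND)."""
    return [d for d in index if all(match(d, v) for v in values)]


def bool_should(values):
    """At least one clause has to match (OR), given no must/filter clauses."""
    return [d for d in index if any(match(d, v) for v in values)]


print(bool_must(["Animation", "History"]))    # [1]
print(bool_should(["Animation", "History"]))  # [1, 2]
```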
I recently started working with Elasticsearch 2 and would like to run a query over all text fields.
GET myindex/mydata/_search
{
"query": {
"simple_query_string": {
"query": "Raketenfahrrad"
}
},
"highlight": {
"fields": [ { "*": {} } ]
}
}
The query returns the expected results but without any highlighting. I found that I do get highlighting when I restrict the search fields manually:
{
"query": {
"simple_query_string": {
"query": "Raketenfahrrad",
"fields": ["MainTitle","SubTitle","Author","Content"]
}
},
"highlight": {
"fields": [ { "*": {} } ]
}
}
But that doesn't fit my "search over all fields" requirement and will fail as soon as a new property is added to the mydata type.
Since ES 2.0, highlighting is performed only on the fields that were queried by default. To highlight other fields as well, you have to set the require_field_match option to false. Here is the link to the change.
Try this:
{
"query": {
"simple_query_string": {
"query": "Raketenfahrrad"
}
},
"highlight": {
"fields": {
"*": {}
},
"require_field_match" : false
}
}
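A simplified way to picture what require_field_match does is sketched below in plain Python. This is a toy model, not the highlighter's real logic: the field names come from the question, fields_to_highlight is an invented helper, and it assumes that a query without "fields" runs against a catch-all field (taken here to be _all) that is not one of the document's own fields.

```python
# Toy model of the require_field_match highlight option.
# Field names come from the question; the logic is a simplification.

doc_fields = ["MainTitle", "SubTitle", "Author", "Content"]


def fields_to_highlight(queried_fields, require_field_match=True):
    """Return the document fields a wildcard "*" highlight would consider."""
    if require_field_match:
        # Only fields the query actually targeted are eligible.
        return [f for f in doc_fields if f in queried_fields]
    # Otherwise every field matched by the "*" pattern is eligible.
    return list(doc_fields)


# No "fields" in the query: the search hits the assumed catch-all field only.
print(fields_to_highlight({"_all"}))                             # []
# Explicit fields in the query: those fields can be highlighted.
print(fields_to_highlight({"MainTitle", "Content"}))             # ['MainTitle', 'Content']
# require_field_match disabled: all document fields are candidates again.
print(fields_to_highlight({"_all"}, require_field_match=False))
```

Under that assumption, the first call mirrors the query that returned hits without highlights, and the last call mirrors the request above with require_field_match set to false.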