Elasticsearch - Include fields in highlight excluded in _source

I know that objects excluded from _source in the mapping can still be used in a search query. But I also have a requirement to return the matching terms in the highlight section of the response.
e.g.
I have a mapping like:
{
"mappings": {
"doc": {
"_source": {
"excludes": ["some_nested_object.complex_tags_object"]
},
"properties": {
"some_nested_object": {
"type": "nested"
}
}
}
}
}
Search Query:
GET my_index/_search {
"size": 500,
"query": {
"bool": {
"must": [{
"nested": {
"query": {
"bool": {
"must":
[{
"match_phrase_prefix": {
"some_nested_object.complex_tags_object.name": {
"query": "account"
}
}
}
]
}
},
"path": "some_nested_object"
}
}
]
}
},
"highlight": {
"pre_tags": [
""
],
"post_tags": [
""
],
"fields": {
"some_nested_object.complex_tags_object.name": {}
}
}
}
If I exclude the field at query time instead of in the mapping, the matching terms are returned in the highlight section, but the response is very slow because the object is so large.
So is it possible to include fields marked as excluded in the mapping/doc/_source as part of highlight?

So is it possible to include fields marked as excluded in the mapping/doc/_source as part of highlight?
The short answer to your question unfortunately is no. From the Elasticsearch highlighting documentation:
Highlighting requires the actual content of a field. If the field is not stored (the mapping does not set store to true), the actual _source is loaded and the relevant field is extracted from _source.
You have a few options, each of which involves a compromise:
Include the field in the source again if you absolutely need to support highlighting over it (I appreciate this conflicts with the reasons for excluding it from the source in the first place).
Relax the requirement to support highlighting over this field (a compromise on features).
Implement highlighting for this field outside Elasticsearch (this will probably compromise on the quality of your solution, and perhaps on cost).
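As a rough illustration of the last option, here is a minimal sketch of client-side highlighting. It assumes the application already has the tag name text available from somewhere other than _source, and that single-term prefix matching (similar in spirit to match_phrase_prefix with one word) is good enough. The function name highlight_prefix and the <em> tags are illustrative choices, not part of Elasticsearch.

```python
import re


def highlight_prefix(text, prefix, pre_tag="<em>", post_tag="</em>"):
    """Wrap every word in `text` that starts with `prefix` (case-insensitive)."""
    # \b anchors the match at a word boundary; \w* extends it to the end of the word.
    pattern = re.compile(r"\b" + re.escape(prefix) + r"\w*", re.IGNORECASE)
    return pattern.sub(lambda m: pre_tag + m.group(0) + post_tag, text)


print(highlight_prefix("Accounting and account managers", "account"))
```

This does not reproduce Elasticsearch's analysis chain, so results can differ from what the built-in highlighter would return for the same query.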

Related

Find all entries on a list within Kibana via Elasticsearch Query DSL

Could you please help me with this? My Kibana database within "Discover" contains a list of trades. I now want to find all trades within this DB that have been done in specific instruments (ISIN number). When I add a filter manually and switch to Elasticsearch Query DSL, I find the following:
{
"query": {
"bool": {
"should": [
{
"match_phrase": {
"obdetails.isin": "CH0253592783"
}
},
{
"match_phrase": {
"obdetails.isin": "CH0315622966"
}
},
{
"match_phrase": {
"obdetails.isin": "CH0357659488"
}
}
],
"minimum_should_match": 1
}
}
}
Since I want to check the DB for more than 200 ISINs, this seems inefficient. Is there a way to simply say "show me the trade if it contains one of the following 200 ISINs"?
I already googled and tried this, which did not work:
{
"query": {
"terms": {
"obdetails.isin": [ "CH0357659488", "CH0315622966"],
"boost": 1.0
}
}
}
The query works, but does not show any results.
To summarize: a field of type text is analyzed, which means its value is converted into a list of terms by the configured analyzer rather than being indexed as a single term.
Because the terms query does not analyze its input, the exact values you pass do not match those indexed terms.
Rather than changing the type of the field, you can add a sub-field of type keyword. That way terms queries can be run against the exact value while match queries still work on the analyzed field.
{
"isin": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
The above example adds an extra field called obdetails.isin.keyword, which can be used for terms queries, while match queries can still be used on obdetails.isin.
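For a list of 200 or more ISINs, the request body is easier to generate than to write by hand. The sketch below builds a terms query against the keyword sub-field. The helper name build_isin_terms_query is hypothetical, and it assumes the mapping above has been applied so that obdetails.isin.keyword exists.

```python
import json


def build_isin_terms_query(isins, field="obdetails.isin.keyword"):
    """Build a terms query body that matches documents containing any of `isins`."""
    # De-duplicate while preserving order so the request stays as small as possible.
    unique_isins = list(dict.fromkeys(isins))
    return {"query": {"terms": {field: unique_isins}}}


body = build_isin_terms_query(["CH0253592783", "CH0315622966", "CH0357659488"])
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted into the Kibana Query DSL editor or sent to the _search endpoint by whichever client you use.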

Elasticsearch template in Logstash doesn't apply the mapping and fields can't be sorted

I want to sort data via the Elasticsearch REST client. Below is my template in Logstash:
{
"index_patterns": ["index_name"],
"template": {
"settings": {
"number_of_shards": 1,
"number_of_replicas": 0
},
"mappings": {
"_doc": {
"properties": {
"int_var": {
"type": "keyword"
}
}
}
}
}
}
When I try to search with the request below
{
"size": 100,
"query": {
"bool": {
"must": {
"match": {
"match_field": user_request
}
}
}
},
"sort": [
{"int_var": {"order": "asc"}}
]
}
I've got this error
Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true
How can I solve this? Thanks for answering.
See the Elasticsearch documentation on fielddata for how to enable it, as long as you are aware of the performance impact.
When values are indexed into Elasticsearch, they are tokenized according to the field's data type.
Text fields are analyzed and broken into tokens, roughly at word boundaries such as whitespace. For example, "quick brown fox" creates three tokens: 'quick', 'brown', and 'fox'. A search for any of these three words matches the document.
Keyword fields, on the other hand, index the entire value as a single token. For example, "quick brown fox" is the single token 'quick brown fox'. Searching for anything that is not exactly 'quick brown fox' produces no match.
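The difference between the two field types can be mimicked in a few lines. This is only a rough model: real analyzers do more than lowercase and split on whitespace, and the function names here are made up for illustration.

```python
def text_tokens(value):
    """Rough stand-in for an analyzed text field: lowercase, then split on whitespace."""
    return value.lower().split()


def keyword_tokens(value):
    """A keyword field keeps the whole value as one token."""
    return [value]


def matches(term, tokens):
    """An exact lookup succeeds only if `term` equals one indexed token."""
    return term in tokens


print(matches("brown", text_tokens("quick brown fox")))
print(matches("brown", keyword_tokens("quick brown fox")))
```

Sorting needs one comparable value per document, which is what the single keyword token provides and what a list of text tokens does not.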
Unless you scrubbed your query before you posted it here, you need to modify the field name under match to be the actual field name, like below.
{
"size": 100,
"query": {
"bool": {
"must": {
"match": {
"int_var": "whatever value you are searching for"
}
}
}
},
"sort": [
{"int_var": {"order": "asc"}}
]
}

Search in every field with a fixed parameter

Perhaps this is a basic question, but I need to search in every indexed field while requiring a specific fixed value for another field.
How can I do it?
Currently I have a simple: query( "aValue", array_of_models )
I tried many options without success, for example:
query({
"query": {
"bool": {
"query": "aValue",
"filter": {
"term": {
"published": "true"
}
}
}
}
})
I would prefer not to specify the fields to search in, because I use the same search params for different models.
I found a solution, perhaps it's not optimized but works:
{
"query": {
"bool": {
"should": [
{
"match": {
"_all": "aValue"
}
}
],
"filter": {
"term": {
"published": true
}
}
}
}
}
I'm not sure I understood your intention correctly.
The _all field is enabled by default in the Elasticsearch versions that provide it. So unless you have a special mapping, every indexed field value is added as a text string to the _all field.
You can use the
Query String Query, https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html
Simple Query String Query, https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-simple-query-string-query.html
With a simple query like this, that should work for you.
GET my_index/_search
{
"query": {
"simple_query_string": {
"query": "aValue",
"fields": []
}
}
}
Both query types have parameters that should cover your use case, IMHO.
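To also require the fixed value from the question, the full-text clause can be wrapped in a bool query with a filter. The sketch below is one way to assemble that body. The helper name build_filtered_search is hypothetical, and it leaves out the fields parameter on the assumption that the index's default search field setting is acceptable.

```python
import json


def build_filtered_search(text, published=True):
    """Search all default fields for `text`, restricted to a fixed `published` value."""
    return {
        "query": {
            "bool": {
                # Scored full-text part: no explicit field list is given.
                "must": [{"simple_query_string": {"query": text}}],
                # Non-scoring restriction on the fixed field.
                "filter": [{"term": {"published": published}}],
            }
        }
    }


print(json.dumps(build_filtered_search("aValue"), indent=2))
```

Because the filter clause does not contribute to scoring, the ranking still comes only from the text match.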

How can I score Elasticsearch matches for particular field names higher when using a full text search on _all?

I've set up an index that has many types representing user data such as a ShoppingList, Playlist, etc. Each type has an "identity_id" field for the user's unique identifier. I use the following query to search across all types and fields for a user (for a search function in a website):
GET _search
{
"query": {
"filtered": {
"query": {
"match_phrase_prefix": {
"_all": "awesome"
}
},
"filter": {
"match": {
"identity_id": 1
}
}
}
}
}
My questions are:
Is there a way to give a higher score to matches on fields that have "name" in the field name? For example, the ShoppingList type will have a shopping_list_name field, and I want a match on that to be higher than its other fields.
Is the above way of doing a full text search for a particular user (query then filter) the most efficient way? What about creating an index per user?
How about this query that boosts certain fields:
{
"query": {
"function_score": {
"query": {
"multi_match": {
"query": "awesome",
"fields": [
"*_name",
"field*"
]
}
},
"functions": [
{
"weight": 2,
"filter": {
"multi_match": {
"query": "awesome",
"fields": [
"*_name"
]
}
}
},
{
"weight": 1,
"filter": {
"multi_match": {
"query": "awesome",
"fields": [
"field*"
]
}
}
}
]
}
}
}
The query above boosts (weight: 2) matches on the *_name fields and does not apply any boosting to fields called field*.
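The arithmetic behind this can be sketched as follows, assuming the function_score defaults are left untouched (score_mode and boost_mode both set to multiply). The helper name combined_score is made up for illustration and only models weight functions like the ones in the query above.

```python
from math import prod


def combined_score(query_score, matched_weights):
    """Combine a query score with the weights of the functions whose filter matched."""
    # score_mode "multiply": the weights of all matching functions are multiplied.
    # With no matching function the factor is 1, so the query score is unchanged.
    factor = prod(matched_weights)
    # boost_mode "multiply": the factor is multiplied with the query score.
    return query_score * factor


# Document matching only "field*" versus one matching "*_name" as well.
print(combined_score(1.5, [1]))
print(combined_score(1.5, [2, 1]))
```

Under these assumptions, a document that matches a *_name field ends up with twice the score of an otherwise equal document that matches only field*.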
Is the above way of doing a full text search for a particular user (query then filter) the most efficient way? What about creating an index per user?
That question is more complicated: you also need to consider how many users you have, the hardware resources of the cluster, the structure of the data, the queries used, and so on.

Elastic Search 2.0/2.1 Issue with Highlighter and the Bool Query

I am having an issue with highlighting in Elastic 2.0 and 2.1 - it's returning more information than I think it should.
I am constructing a bool query (the filtered query keyword is deprecated in 2.0+ so I am trying to update my syntax). I am building a must section and a filter section within the query, followed by a request for highlighting information.
The documentation says to use the query either in a query context or a filter context, but the highlighter doesn't seem to make such a distinction.
Here is my fully formed query:
GET /sample04/_search
{
"query": {
"bool": {
"must": [
{
"query": { "query_string": { "query": "east west" } }
}
],
"filter": [
{
"terms": {"OwnerId": ["1", "2","3"]}
}
]
}
},
"highlight": {
"fields": {
"*": { "require_field_match": "false" }
}
}
}
So this query works as expected - we are querying for terms east or west, and we are filtering documents on an Id field that is part of our security requirements, and then I ask for highlighting information.
The downside, however, is that the highlighting information contains a hit for every instance of every value I submitted in my filter (in this case 1, 2 or 3) that matched any value in any field in any part of my document, like this:
"highlight": {
"SomeTextField": [
"North <em>West</em>"
],
"OwnerId": [
"<em>3</em>"
],
"SerialNumber": [
"<em>3</em>-<em>3</em>"
],
"AssociatedValue": [
"<em>3</em>",
"<em>2</em>"
],
"RelatedValue": [
"<em>3</em>",
"<em>3</em>",
"<em>3</em>",
"<em>3</em>",
"<em>3</em>"
]
}
How do I get the highlighter to match my query in the must section, but ignore the filter? I believe it should not highlight matches that come from the filter. The filter values were requested for a SPECIFIC FIELD, yet the highlighter marks them wherever they appear in my document. This seems wrong somehow, but perhaps I am misunderstanding something.
As an FYI, if I set require_field_match to TRUE, then I ONLY get hits that match the filter, and NONE that match the query.
I cannot name specific fields to generate highlighting information for, because we use Elastic in a "search once, find anywhere" model, so I don't know which field a result will come from.
Can you see what I'm doing wrong? It would be greatly appreciated to understand this.
You can use a highlight query for this purpose. Change your highlight part to:
"highlight": {
"fields": {
"*": {
"highlight_query": {
"query_string": {
"query": "east west"
}
}
}
}
}
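To keep the highlight query from drifting out of sync with the main query, the request body can be generated so that both share the same clause. This is a minimal sketch. The helper name build_search is hypothetical, the field name OwnerId comes from the question, and the query_string clause is placed directly in must rather than inside the extra "query" wrapper used in the question.

```python
import json


def build_search(query_text, owner_ids):
    """Build a search body whose highlighter sees only the text query, not the filter."""
    # The same clause is used for scoring and for highlighting.
    text_query = {"query_string": {"query": query_text}}
    return {
        "query": {
            "bool": {
                "must": [text_query],
                "filter": [{"terms": {"OwnerId": owner_ids}}],
            }
        },
        "highlight": {
            "fields": {
                "*": {"highlight_query": text_query}
            }
        },
    }


print(json.dumps(build_search("east west", ["1", "2", "3"]), indent=2))
```

Since the filter values never appear in highlight_query, the highlighter has no reason to mark them in other fields.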
