Elasticsearch template in Logstash doesn't mapping and not able to sort fields - elasticsearch

I want to sort datas via elasticsearch rest client, below is my template in logstash
{
"index_patterns": ["index_name"],
"template": {
"settings": {
"number_of_shards": 1,
"number_of_replicas": 0
},
"mappings": {
"_doc": {
"properties": {
"int_var": {
"type": "keyword"
}
}
}
}
}
}
When I try to reach, with the below code
{
"size": 100,
"query": {
"bool": {
"must": {
"match": {
"match_field": user_request
}
}
}
},
"sort": [
{"int_var": {"order": "asc"}}
]
}
I've got this error
Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true
How can i solve this ? Thanks for answering

Here's the documentation regarding field data and how to enable it as long as you are aware of the performance impacts.
When ingested into Elasticsearch, field values are tokenized based on their data type.
Text fields are broken into tokens delimited by whitespace. I.E. "quick brown fox" creates three tokens: 'quick', 'brown', and 'fox'. If you perform a search for any of these three words, you will generate matches.
Keyword fields, on the other hand, create a single token of the entire value. I.E. "quick brown fox" is a single token, 'quick brown fox'. Searching for anything that is not exactly 'quick brown fox' will generate no matches.
Unless you scrubbed your query before you posted it here, you need to modify the field name under match to be the actual field name, like below.
{
"size": 100,
"query": {
"bool": {
"must": {
"match": {
"int_var": "whatever value you are searching for"
}
}
}
},
"sort": [
{"int_var": {"order": "asc"}}
]
}

Related

Elasticsearch - Include fields in highlight excluded in _source

I know objects marked as excluded in the _source mapping can be included in the search query. But I have a requirement to include matching terms in the highlight section of the response.
e.g.
I have a mapping like:
{
"mappings": {
"doc": {
"_source": {
"excludes": ["some_nested_object.complex_tags_object"]
},
"properties": {
"some_nested_object": {
"type": "nested"
}
}
}
}
}
Search Query:
GET my_index/_search {
"size": 500,
"query": {
"bool": {
"must": [{
"nested": {
"query": {
"bool": {
"must":
[{
"match_phrase_prefix": {
"some_nested_object.complex_tags_object.name": {
"query": "account"
}
}
}
]
}
},
"path": "some_nested_object"
}
}
]
}
},
"highlight": {
"pre_tags": [
""
],
"post_tags": [
""
],
"fields": {
"some_nested_object.complex_tags_object.name": {}
}
}
}
If I don't exclude in the mapping but in the search query at runtime then I am able to return matching terms in the highlight section but the response is very slow due to the large size of the object.
So is it possible to include fields marked as exclude in the mapping/doc/_source as part of highlight?
So is it possible to include fields marked as exclude in the mapping/doc/_source as part of highlight?
The short answer to your question unfortunately is no. From the Elasticsearch highlighting documentation:
Highlighting requires the actual content of a field. If the field is not stored (the mapping does not set store to true), the actual _source is loaded and the relevant field is extracted from _source.
You have a few options, each of which involve compromise:
Include your field back into the source if you absolutely need to support highlighting over it (I appreciate this will conflict with the reasons for excluding it from the source in the first place)
Relax the requirement to support highlighting over this field (compromise on features)
Implement a highlighting feature for this field outside Elasticsearch (probably this will compromise on quality of your solution and perhaps cost)

elasticsearch query string(full text search) which can't be searched

we have a document below. I can't searched with financialmarkets. but it can be searched with industry_icon_financialmarkets.png. Can anyone tell me what is the reason?
content is the text type field.
document:
{
"title":"test",
"content":"industry_icon_financialmarkets.png"
}
Query:
{
"from": 0,
"size": 2,
"query": {
"bool": {
"must": [
{
"query_string": {
"query": "\"industry_icon_financialmarkets.png\""
}
}
]
}
}
}
The default analyzer for text field is standard which won't break industry_icon_financialmarkets into tokens using _ as a delimiter. I would suggest you to use simple analyzer instead which will breaks text into terms whenever it encounters a character which is not a letter.
You can also add sub-field of type keyword to retain the original value.
So the mapping of the field should be:
{
"content": {
"type": "text",
"analyzer": "simple",
"fields": {
"keyword": {
"type": "keyword"
}
}
}
}
At the time of creating index, we should have our own mapping for each fields based on its type to get the expected result.
Mapping
PUT relevance
{"mapping":{"ID":{"type":"long"},"title":
{"type":"keyword","analyzer":"my_analyzer"},
"content":
{"type":"string","analyzer":"my_analyzer","search_analyzer":"my_analyzer"}},
"settings":
{"analysis":
{"analyzer":
{"my_analyzer":
{"tokenizer":"my_tokenizer"}},
"tokenizer":
{"my_tokenizer":
{"type":"ngram","min_gram":3,"max_gram":30,"token_chars":
["letter","digit"]
}
}
},"number_of_shards":5,"number_of_replicas":2
}
}
Then start inserting documents,
POST relevance/_doc/1
{
"name": "1elastic",
"content": "working fine" //replace special characters with space using program before inserting into ES index.
}
Query
GET relevance/_search
{"size":20,"query":{"bool":{"must":[{"match":{"content":
{"query":"fine","fuzziness":1}}}]}}}

Elasticsearch with AND query in DSL

this drives me crazy. I have no clue why this elastic search do not return me value.
I put values with this:
PUT /customer/person-test/1?pretty
{
"name": "John Doe",
"personId": 153,
"houseHoldId": 6191136,
"quarter": "2016_Q1"
}
PUT /customer/person-test/2?pretty
{
"name": "John Doe",
"personId": 153,
"houseHoldId": 6191136,
"quarter": "2016_Q2"
}
and when I query like this, it do not returns me value:
GET /customer/person-test/_search
{
"query": {
"bool": {
"must" : [
{
"term": {
"name": "John Doe"
}
},
{
"term": {
"quarter": "2016_Q1"
}
}
]
}
}
}
this query i copied from A simple AND query with Elasticsearch
I just want to get the person with "John Doe" AND "2016_Q1", why this did not work?
You should use match instead of term :
GET /customer/person-test/_search
{
"query": {
"bool": {
"must" : [
{
"match": {
"name": "John Doe"
}
},
{
"match": {
"quarter": "2016_Q1"
}
}
]
}
}
}
Explanation
Why doesn’t the term query match my document ?
String fields can be of type text (treated as full text, like the body
of an email), or keyword (treated as exact values, like an email
address or a zip code). Exact values (like numbers, dates, and
keywords) have the exact value specified in the field added to the
inverted index in order to make them searchable.
However, text fields are analyzed. This means that their values are
first passed through an analyzer to produce a list of terms, which are
then added to the inverted index.
There are many ways to analyze text: the default standard analyzer
drops most punctuation, breaks up text into individual words, and
lower cases them. For instance, the standard analyzer would turn the
string “Quick Brown Fox!” into the terms [quick, brown, fox].
This analysis process makes it possible to search for individual words
within a big block of full text.
The term query looks for the exact term in the field’s inverted
index — it doesn’t know anything about the field’s analyzer. This
makes it useful for looking up values in keyword fields, or in numeric
or date fields. When querying full text fields, use the match query
instead, which understands how the field has been analyzed.
...
its not working because of u r using default standard analyzer link for 'name' and 'quarter' .
You have two more options :-
1)change mapping :-
"name": {
"type": "string",
"index": "not_analyzed"
},
"quarter": {
"type": "string",
"index": "not_analyzed"
}
2)try this , lowercase your value since by default standard analyzer use Lower Case Token Filter :-
{
"query": {
"bool": {
"must" : [
{
"term": {
"name": "john_doe"
}
},
{
"term": {
"quarter": "2016_q1"
}
}
]
}
}
}

How can I score Elasticsearch matches for particular field names higher when using a full text search on _all?

I've setup an index that has many types representing user data such as a ShoppingList, Playlist, etc. Each type has an "identity_id" field for the user's unique identifier. I use the following query to search across all types and fields for a user (for a search function in a website):
GET _search
{
"query": {
"filtered": {
"query": {
"match_phrase_prefix": {
"_all": "awesome"
}
},
"filter": {
"match": {
"identity_id": 1
}
}
}
}
}
My questions are:
Is there a way to give a higher score to matches on fields that have "name" in the field name? For example, the ShoppingList type will have a shopping_list_name field, and I want a match on that to be higher than its other fields.
Is the above way of doing a full text search for a particular user (query then filter) the most efficient way? What about creating an index per user?
How about this query that boosts certain fields:
{
"query": {
"function_score": {
"query": {
"multi_match": {
"query": "awesome",
"fields": [
"*_name",
"field*"
]
}
},
"functions": [
{
"weight": 2,
"filter": {
"multi_match": {
"query": "awesome",
"fields": [
"*_name"
]
}
}
},
{
"weight": 1,
"filter": {
"multi_match": {
"query": "awesome",
"fields": [
"field*"
]
}
}
}
]
}
}
}
What the query above does is to boost (weigth: 2) the *_name fields query and not do apply any boosting to fields called field*.
Is the above way of doing a full text search for a particular user (query then filter) the most efficient way? What about creating an index per user?
Regarding this ^ question, that's more complicated and you also need to consider how many users you have, the hardware resources the cluster has, structure of data, queries used etc.

Must match multiple values

I have a query that works fine when I need the property of a document
to match just one value.
However I also need to be able to search with must with two values.
So if a banana has id 1 and a lemon has id 2 and I search for yellow
I will get both if I have 1 and 2 in the must clause.
But if i have just 1 I will only get the banana.
{
"from": 0,
"size": 20,
"query": {
"bool": {
"should": [
{ "match":
{ "fruit.color": "yellow" }}
],
"must" : [
{ "match": { "fruit.id" : "1" } }
]
}
}
}
I haven´t found a way to search with two values with must.
is that possible?
If the document "must" be returned only if the id is 1 or 2, that sounds like another should clause. If I'm understanding your question properly, you want documents with either id 1 OR id 2. Additionally, if the color is yellow, give it a higher score.
Here's one way you might achieve what you're looking for:
{
"query": {
"bool": {
"should": {
"match": {
"fruit.color": "yellow"
}
},
"must": {
"bool": {
"should": [
{
"match": {
"fruit.id": "1"
}
},
{
"match": {
"fruit.id": "2"
}
}
]
}
}
}
}
}
Here I put the two match queries in the should clause of a separate bool query. This achieves the OR behavior you are looking for.
Have another look at the Bool Query documentation and take note of the nuances of should. It behaves differently by default depending on whether or not there is a sibling must clause and whether or not the bool query is being executed in filter context.
Another key option that is adjustable and can help you achieve your expected results is the minimum_should_match parameter. Have a look at this documentation page.
Instead of a match query, you could simply try the terms query for ORing between multiple terms.
Match queries are generally used for analyzed fields. For exact matching, you should use term queries
{
"from": 0,
"size": 20,
"query": {
"bool": {
"should": [
{ "match": { "fruit.color": "yellow" } }
],
"must" : [
{ "terms": { "fruit.id": ["1","2"] } }
]
}
}
}
term or terms query is the perfect way to fetch the exact text or id, using match query result in search inside the id or text
Ex:
id = '4'
id = '44'
Search using match query with id = 4 return both 4 & 44 since it matches 4 in both. This is where terms query come into play.
same search using terms query will return 4 only.
So the accepted is absolutely wrong. Use the #Rahul answer. Just one more thing you need to do, Instead of text you need to analyse the field as a keyword
Example for indexing a field both as a text and keyword (mapping is for flat level for nested change it accordingly).
{
"index_patterns": [ "test" ],
"mappings": {
"kb_mapping_doc": {
"_source": {
"enabled": true
},
"properties": {
"id": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword"
}
}
}
}
}
}
}
using #Rahul's answer doesn't worked because you might be analysed as a text.
id - access a text field
id.keyword - access a keyword field
it would be
{
"from": 0,
"size": 20,
"query": {
"bool": {
"should": [{
"match": {
"color": "yellow"
}
}],
"must": [{
"terms": {
"id.keyword": ["1", "2"]
}
}]
}
}
}
So I would say accepted answer will return falsy results Please use #Rahul's answer with the corresponding mapping.

Resources