How to make _source field dynamic in elasticsearch search template? - elasticsearch

While using search query in elastic search we define what fields we required in the response
"_source": ["name", "age"]
And while working with search templates we have to set _source fields value while inserting search template to ES Cluster.
"_source": ["name", "age"]
but the problem with the search template is that it will always return us name and age and to get other fields we have to change our search template accordingly.
Is there any way we can pass search fields from the client so that it will only return fields in response to which the user asked?
I have achieved that just for one field like if you do this
"_source": "{{field}}"
then while search index via template you can do this
POST index_name/_search/template
{
"id": template_id,
"params": {
"field": "name"
}
}
This search query returning the name field in response but I could not find a way to pass it as in array or in another format so I can get multiple fields.

Absolutely!!
Your search template should look like this:
"_source": {{#toJson}}fields{{/toJson}}
And then you can call it like this:
POST index_name/_search/template
{
"id": template_id,
"params": {
"fields": ["name"]
}
}
What it's going to do is to transform the params.fields array into JSON and so the generated query will look like this:
"_source": ["name"]

Related

Calculate field data size and store to other field at indexing time ElasticSearch 7.17

I am looking for a way to store the size of a field (bytes) in a new field of a document.
I.e. when a document is created with a field message that contains the value hello, I want another field message_size_bytes written that in this example has the value 5.
I am aware of the possibilities using _update_by_query and _search using scripting fields, but I have so much data that I do not want to calculate the sizes while querying but at index time.
Is there a possibility to do this using Elasticsearch 7.17 only? I do not have access to the data before it's passed to elasticsearch.
You can use Ingest Pipeline with Script processor.
You can create pipeline using below command:
PUT _ingest/pipeline/calculate_bytes
{
"processors": [
{
"script": {
"description": "Calculate bytes of message field",
"lang": "painless",
"source": """
ctx['message_size_bytes '] = ctx['message'].length();
"""
}
}
]
}
After creating pipeline, you cna use pipeline name while indexing data like below (same you can use in logstash, java or anyother client as well):
POST 74906877/_doc/1?pipeline=calculate_bytes
{
"message":"hello"
}
Result:
"hits": [
{
"_index": "74906877",
"_id": "1",
"_score": 1,
"_source": {
"message": "hello",
"message_size_bytes ": 5
}
}
]

Match_phrase is elastic search not working as expected

In my elastic search I have documents which contains a "fieldname" with values "abc" and "abc-def". When I am using match_phrase query for searching documents with fieldname "abc" it is returning me documents with values "abc-def" as well. However when I am querying for "abc-def" it is working fine. My query is as follows:
Get my_index/_search
{
"query" : {
"match_phrase" : {"fieldname" : "abc"}
}
}
Can someone please help me in understanding the issue?
match_phrase query analyzes the search term based on the analyzer provided for the field (if no analyzer is added then by default standard analyzer is used).
Match phrase query searches for those documents that have all the terms present in the field (from the search term), and the terms must be present in the correct order.
In your case, "abc-def" gets tokenized to "abc" and "def" (because of standard analyzer). Now when you are using match phrase query for "abc-def", this searches for all documents that have both abc and def in the same order. (therefore you are getting only 1 doc in the result)
When searching for "abc", this will search for those documents that have abc in the fieldname field (since both the document contain abc, so both are returned in the result)
If you want to return only exact matching documents in the result, then you need to change the way the terms are analyzed.
If you have not explicitly defined any mapping then you need to add .keyword to the fieldname field. This uses the keyword analyzer instead of the standard analyzer (notice the ".keyword" after the fieldname field).
Adding a working example with index data, mapping, search query and search result
Index data:
{
"name":"abc-def"
}
{
"name":"abc"
}
Search Query:
{
"query": {
"match_phrase": {
"name.keyword": "abc"
}
}
}
Search Result:
"hits": [
{
"_index": "67394740",
"_type": "_doc",
"_id": "1",
"_score": 0.6931471,
"_source": {
"name": "abc"
}
}
]

Elastic Search | How to get original search query with corresponding match value

I'm using ElasticSearch as search engine for a human resource database.
The user submits a competence (f.ex 'disruption'), and ElasticSearch returns all users ordered by best match.
I have configured the field 'competences' to use synonyms, so 'innovation' would match 'disruption'.
I want to show the user (who is performing the search) how a particular search result matched the search query. For this I use the explain api (reference)
The query works as expected and returns an _explanation to each hit.
Details (simplified a bit) for a particular hit could look like the following:
{
description: "weight(Synonym(skills:innovation skills:disruption)),
value: 3.0988
}
Problem: I cannot see what the original search term was in the _explanation. (As illustrated in example above: I can see that some search query matched with 'innovation' or 'disruption', I need to know what the skill the users searched for)
Question: Is there any way to solve this issue (example: parse a custom 'description' with info about the search query tag to the _explanation)?
Expected Result:
{
description: "weight(Synonym(skills:innovation skills:disruption)),
value: 3.0988
customDescription: 'innovation'
}
Maybe you can put the original query in the _name field?
Like explained in https://qbox.io/blog/elasticsearch-named-queries:
GET /_search
{
"query": {
"query_string" : {
"default_field" : "skills",
"query" : "disruption",
"_name": "disruption"
}
}
}
You can then find the proginal query in the matched queries section in the return object:
{
"_index": "testindex",
"_type": "employee",
"_id": "2",
"_score": 0.19178301,
"_source": {
"skills": "disruption"
},
"matched_queries": [
"disruption"
]
}
Add the explain to the solution and i think it would work fine...?

Elasticsearch nested objects with query_string as first class attributes

I'm trying to index a nested field as a first-class attribute in my document so that I can search them using query_string without dot syntax.
For example, if I have a document like
"data": { "name": "Bob" }
instead of searching for data.name:Bob I would like to be able to search for name:Bob
The root of my issue is that we index a jsonb column that may have varying attributes. In some instances the data property may contain a data.business attribute, etc. I would like users to be able to search on these attributes without needing to "dig" into the object.
The data field does not have to be indexed as a nested type unless necessary; I was indexing it as an object previously.
I have tried to leverage the _all field as suggested in this post.
I have also tried to use include_in_parent:true and set the datatype as nested for my data field as suggested in this post.
I have also looked into the inner_hits feature to no avail.
Here's an example of my mapping for the data attribute.
PUT my_index
{
"mappings": {
"my_type": {
"properties": {
"data": {
"type": "object"
}
}
}
}
}
Example document
PUT my_index/_doc/1
{
"data": {
name: "bob",
business: "None of yours"
}
}
And how my query currently looks:
GET my_index/_search
{
"query": {
"query_string": {
"query": "name:bob",
"fields": ["data.*"]
}
}
}
With the current setup I almost get my desired results. I can search on individual properties like data.name:bob and data.business:"None of yours" and get back the correct documents.
However I want to be able to get the exact same results with business:"None of yours" or name:bob.
Thanks in advance for any help!
I figured it out using dynamic templates. For anyone coming across this in the future, here is how I solved the issue:
I used path_match to match the data object (data.*).
Then using copy_to and {name} I dynamically created top-level fields on my parent object.
{
"dynamic_templates":[
{"template_1":
{"mapping":
{"copy_to":"{name}"},
"path_match":"data.*"
}
}
]
}

Elasticsearch query to get results irrespective of spaces in search text

I am trying to fetch data from Elasticsearch matching from a field name. I have following two records
{
"_index": "sam_index",
"_type": "doc",
"_id": "key",
"_version": 1,
"_score": 2,
"_source": {
"name": "Sample Name"
}
}
and
{
"_index": "sam_index",
"_type": "doc",
"_id": "key1",
"_version": 1,
"_score": 2,
"_source": {
"name": "Sample Name"
}
}
When I try to search using texts like sam, sample, Sa, etc, I able fetch both records by using match_phrase_prefix query. The query I tried with match_phrase_prefix is
GET sam_index/doc/_search
{
"query": {
"match_phrase_prefix" : {
"name": "sample"
}
}
}
I am not able to fetch the records when I try to search with string samplen. I need search and get results irrespective of spaces between texts. How can I achieve this in Elasticsearch?
First, you need to understand how Elasticsearch works and why it gives the result and doesn't give the result.
ES works on the token match, Documents which you index in ES goes through the analysis process and creates and stores the tokens generated from this process to inverted index which is used for searching.
Now when you make a query then that query also generates the search tokens, these can be as it is in the search query in case of term query or tokens based on the analyzer defined on the search field in case of match query. Hence it's very important to understand the internals of your search query.
Also, it's very important to understand the mapping of your index, ES uses the standard analyzer by default on the text fields.
You can use the Explain API to understand the internals of the query like which search tokens are generated by your search query, how documents matched to it and on what basis score is calculated.
In your case, I created the name field as text, which uses the word joined analyzer explained in Ignore spaces in Elasticsearch and I was able to get the document which consists of sample name when searched for samplen.
Let us know if you also want to achieve the same and if it solves your issue.

Resources