Copy field to payload - elasticsearch

When using suggesters in Elasticsearch, it is possible to give a payload when indexing a document. Each time the suggester is used, its payload is returned along with the suggestions.
I would like to add the value of a document's id field to the payload. While this is easy to do at index time, I would prefer to handle it in the mapping, because I don't want to change the way I convert documents to JSON.
I tried the following:
POST test
{
  "mappings": {
    "type1": {
      "properties": {
        "id": { "type": "string", "copy_to": ["field1_suggest.payload"] },
        "field1": { "type": "string", "copy_to": ["field1_suggest"] },
        "field1_suggest": { "type": "completion", "payloads": true }
      }
    }
  }
}
POST test/type1/1
{
  "id": "payload",
  "field1": "my value"
}
This fails, since "payload" is not a real field of field1_suggest:
"error": "MapperParsingException[attempt to copy value to non-existing object [field1_suggest.payload]]", "status": 400
How can I automatically include fields in the payload? If it is not possible, I guess I will have to use regular queries to get suggestions for completion...
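For reference, the index-time alternative the question hopes to avoid is to supply the payload inside the suggestion entry itself, using the pre-5.0 completion format (input/payload). A minimal sketch of building such a document body in Python; `build_suggest_doc` is an illustrative helper, not an Elasticsearch API:

```python
import json

def build_suggest_doc(doc_id, field1):
    """Build an index-time document body that carries the id in the
    suggester payload (pre-5.0 completion-suggester entry format)."""
    return {
        "id": doc_id,
        "field1": field1,
        "field1_suggest": {
            "input": [field1],
            # the payload has to be supplied explicitly at index time
            "payload": {"id": doc_id},
        },
    }

body = build_suggest_doc("payload", "my value")
print(json.dumps(body, indent=2))
```

This is exactly the JSON-conversion change the question wants to avoid, but it is the documented way to get a payload back with each suggestion.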

Related

Elastic Search Date Range Query

I am new to Elasticsearch and I am struggling with a date range query. I have to query the records which fall between particular dates. The JSON records pushed into the Elasticsearch database are as follows:
{
  "messageid": "Some message id",
  "subject": "subject",
  "emaildate": "2020-01-01 21:09:24",
  "starttime": "2020-01-02 12:30:00",
  "endtime": "2020-01-02 13:00:00",
  "meetinglocation": "some location",
  "duration": "00:30:00",
  "employeename": "Name",
  "emailid": "abc@xyz.com",
  "employeecode": "141479",
  "username": "username",
  "organizer": "Some name",
  "organizer_email": "cde@xyz.com"
}
I have to query the records whose start time is between "2020-01-02 12:30:00" and "2020-01-10 12:30:00". I have written a query like this:
{
  "query": {
    "bool": {
      "filter": [
        {
          "range": {
            "starttime": {
              "gte": "2020-01-02 12:30:00",
              "lte": "2020-01-10 12:30:00"
            }
          }
        }
      ]
    }
  }
}
This query is not giving the expected results. I assume that the person who pushed the data into the Elasticsearch database at my office did not set a mapping, so Elasticsearch dynamically decided the data type of "starttime" to be "text". Hence I am getting inconsistent results.
I can set the mapping like this :
PUT /meetings
{
  "mappings": {
    "dynamic": false,
    "properties": {
      ...
      "starttime": {
        "type": "date",
        "format": "yyyy-MM-dd HH:mm:ss"
      }
      ...
    }
  }
}
And then the query would work, but I am not allowed to do so (office policies). What alternatives do I have to achieve my task?
Update:
I assumed the data type to be "text", but by default Elasticsearch applies both "text" and "keyword" so that we can implement both full-text and keyword-based searches. Since it is also indexed as "keyword", will this benefit me in any way? I do not have access to a lot of things at the office, which is why I am unable to debug the query; I only have the search API, for which I have to build the query.
GET /meetings/_mapping output:
...
"starttime" : {
  "type" : "text",
  "fields" : {
    "keyword" : {
      "type" : "keyword",
      "ignore_above" : 256
    }
  }
}
...
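A side note on the keyword sub-field shown in that mapping: assuming the values always keep the fixed-width yyyy-MM-dd HH:mm:ss format, their lexicographic order matches chronological order, so a range filter on starttime.keyword (keyword ranges compare terms lexicographically) can act as a stopgap date range when the mapping cannot be changed. A quick Python check of that ordering property:

```python
from datetime import datetime

timestamps = [
    "2020-01-02 12:30:00",
    "2020-01-01 21:09:24",
    "2020-01-10 12:30:00",
    "2020-01-02 13:00:00",
]

# Fixed-width "yyyy-MM-dd HH:mm:ss" strings compare the same way the
# instants they denote do, so plain string comparison is chronological.
parse = lambda t: datetime.strptime(t, "%Y-%m-%d %H:%M:%S")
assert sorted(timestamps) == sorted(timestamps, key=parse)

# The same string comparison a keyword range filter would apply:
in_range = [t for t in timestamps
            if "2020-01-02 12:30:00" <= t <= "2020-01-10 12:30:00"]
print(in_range)  # the three timestamps inside the window
```

This only holds while every value keeps the exact same format; a real date field (as the answers below suggest) is still the robust fix.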
Date range queries will not work on a text field; for that, you have to use a date field.
Since you are working with date values, the best practice is to use the date field type.
I would suggest reindexing your index into another index so that you can change the type of your text field to date.
Step 1: Create index2 using index1's mapping, making sure to change the type of your date field from text to date.
Step 2: Run the Elasticsearch reindex API to copy all your data from index1 to index2. Since you have changed the field type to date, Elasticsearch will now recognize the field as a date:
POST _reindex
{
  "source": { "index": "index1" },
  "dest": { "index": "index2" }
}
Now you can run your normal date range queries on index2.
As @jzzfs suggested, the idea is to add a date sub-field to the starttime field. You first need to modify the mapping like this:
PUT meetings/_mapping
{
  "properties": {
    "starttime": {
      "type": "text",
      "fields": {
        "keyword": {
          "type": "keyword",
          "ignore_above": 256
        },
        "date": {
          "type": "date",
          "format": "yyyy-MM-dd HH:mm:ss"
        }
      }
    }
  }
}
When done, you need to reindex your data using the update by query API so that the starttime.date field gets populated and indexed:
POST meetings/_update_by_query
When the update is done, you'll be able to leverage the starttime.date sub-field in your query:
{
  "query": {
    "bool": {
      "filter": [
        {
          "range": {
            "starttime.date": {
              "gte": "2020-01-02 12:30:00",
              "lte": "2020-01-10 12:30:00"
            }
          }
        }
      ]
    }
  }
}
There are ways of parsing text fields as dates at search time but the overhead is impractical... You could, however, keep the starttime as text by default but make it a multi-field and query it using starttime.as_date, for example.

Elasticsearch: Duplicate properties in a single record

I have to find every document in Elasticsearch that has duplicate properties. My mapping looks something like this:
"type": {
  "properties": {
    "thisProperty": {
      "properties": {
        "id": {
          "type": "keyword"
        },
        "other_id": {
          "type": "keyword"
        }
      }
    }
  }
}
The documents I have to find have a pattern like this:
"thisProperty": [
  {
    "other_id": "123",
    "id": "456"
  },
  {
    "other_id": "123",
    "id": "456"
  },
  {
    "other_id": "4545",
    "id": "789"
  }
]
So, I need to find any document of this type that has repeated property fields. I also cannot search by term, because I do not know what the value of either id field is. So far the API hasn't shown a clear way to do this via a query, and the programmatic approach is possible but cumbersome. Is it possible to get this result set with an Elasticsearch query? If so, how?
(The version of Elasticsearch is 5.3)
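Since no built-in query flags "nested array contains duplicate entries", one workable fallback is the programmatic approach the question mentions: scroll over the documents and test each _source client-side. A minimal sketch of the per-document check in Python (`has_duplicate_entries` is an illustrative helper):

```python
def has_duplicate_entries(this_property):
    """Return True if any (id, other_id) pair occurs more than once
    in a document's thisProperty array."""
    seen = set()
    for entry in this_property:
        key = (entry.get("id"), entry.get("other_id"))
        if key in seen:
            return True
        seen.add(key)
    return False

# The example document from the question:
doc = [
    {"other_id": "123", "id": "456"},
    {"other_id": "123", "id": "456"},
    {"other_id": "4545", "id": "789"},
]
print(has_duplicate_entries(doc))  # -> True
```

Feeding this with a scroll/search over the index keeps all the duplicate logic out of Elasticsearch, at the cost of transferring every candidate document.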

Output field in autocomplete suggestion

When I want to index a document in Elasticsearch, this problem occurs:
message [MapperParsingException[failed to parse]; nested: IllegalArgumentException[unknown field name [output], must be one of [input, weight, contexts]];]
I know that the output field was removed from Elasticsearch in version 5, but why? And what do I have to do to get a single result for my inputs?
The output field was removed from Elasticsearch in version 5; the _source field is now returned with the suggestion. An example is shown below.
Mapping
{
  "user": {
    "properties": {
      "name": {
        "type": "string"
      },
      "suggest": {
        "type": "completion",
        "analyzer": "simple",
        "search_analyzer": "simple"
      }
    }
  }
}
Data
{
  "id": "123",
  "name": "Abc",
  "suggest": {
    "input": "Abc::123"
  },
  "output": "Abc::123"
}
Query
POST http://localhost:9200/user*/_suggest?pretty
{
  "type-suggest": {
    "text": "Abc",
    "completion": {
      "field": "suggest"
    }
  }
}
Elastic mentions the following:
As suggestions are document-oriented, suggestion metadata (e.g. output) should now be specified as a field in the document. The support for specifying output when indexing suggestion entries has been removed. Now suggestion result entry’s text is always the un-analyzed value of the suggestion’s input (same as not specifying output while indexing suggestions in pre-5.0 indices).
Source
Update
I was able to get a single output from multiple inputs in ES 5.1.1. You can find the answer here.
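With output gone, any display value has to be read from the _source returned with each suggestion option. Assuming the 5.x _suggest response shape (suggester name mapping to entries, each with an options array), pulling the sources out client-side might look like this; `extract_sources` and the sample response are illustrative:

```python
def extract_sources(suggest_response, suggester_name="type-suggest"):
    """Collect the _source of every suggestion option from a 5.x
    _suggest response body (response shape is an assumption)."""
    sources = []
    for entry in suggest_response.get(suggester_name, []):
        sources.extend(opt["_source"] for opt in entry.get("options", []))
    return sources

# Hypothetical response for the "Abc" query above:
response = {
    "type-suggest": [
        {"text": "Abc",
         "options": [{"text": "Abc::123",
                      "_source": {"id": "123", "name": "Abc"}}]}
    ]
}
print(extract_sources(response))
```

Whatever single "output" string is needed can then be built from the _source fields (e.g. name and id) instead of the removed output field.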

elasticsearch routing on specific field

Hi, I want to set custom routing on a specific field, "userId", on my ES v2.0, but it is giving me an error and I don't know how to set custom routing on ES v2.0. Please help me out; thanks in advance. Below is the error message I get when creating custom routing on an existing index:
{
  "error": {
    "root_cause": [
      {
        "type": "mapper_parsing_exception",
        "reason": "Mapping definition for [_routing] has unsupported parameters: [path : userId]"
      }
    ],
    "type": "mapper_parsing_exception",
    "reason": "Mapping definition for [_routing] has unsupported parameters: [path : userId]"
  },
  "status": 400
}
In ES 2.0, the _routing.path meta-field has been removed, so now you need to do it like this instead.
In your mapping, you can only specify that routing is required (you cannot specify a path anymore):
PUT my_index
{
  "mappings": {
    "my_type": {
      "_routing": {
        "required": true
      },
      "properties": {
        "name": {
          "type": "string"
        }
      }
    }
  }
}
And then when you index a document, you can specify the routing value in the query string like this:
PUT my_index/my_type/1?routing=bar
{
  "name": "foo"
}
You can still use custom routing based on a field from the data being indexed. You can set up a simple pipeline and then use that pipeline every time you index a document, or you can change the index settings to use the pipeline whenever the index receives a document indexing request.
Read about pipelines here.
Do read above and below that part of the docs for more clarity. Pipelines are not meant for setting custom routing, but they can be used for this purpose. Custom routing was disabled for a good reason: the field to be used can turn out to have null values, leading to unexpected behavior. Hence, take care of that issue yourself.
For routing, here is a sample pipeline PUT:
PUT localhost:9200/_ingest/pipeline/nameRouting
Content-Type: application/json
{
  "description": "Sets the routing value from the name field in the document",
  "processors": [
    {
      "set": {
        "field": "_routing",
        "value": "{{_source.name}}"
      }
    }
  ]
}
The index settings will be :
{
  "settings": {
    "index": {
      "default_pipeline": "nameRouting"
    }
  }
}
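The null-value caveat above can also be guarded against on the client side before a document is ever sent. A minimal Python sketch; `routing_for` is an illustrative helper, not part of any Elasticsearch client:

```python
def routing_for(doc, field="name"):
    """Return the routing value for a document, refusing documents
    whose routing field is missing, null, or empty."""
    value = doc.get(field)
    if value is None or value == "":
        raise ValueError(f"document has no usable routing value in {field!r}")
    return str(value)

print(routing_for({"name": "foo"}))  # -> foo
```

Rejecting such documents up front avoids the unexpected routing behavior the pipeline approach would otherwise silently produce.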

Replacing (Bulk Update) Nested documents in ElasticSearch

I have an Elasticsearch index with vacation rentals (100K+), each including a property with nested documents for availability dates (1,000+ per 'parent' document). Periodically (several times daily), I need to replace the entire set of nested documents for each property (to have fresh availability data per vacation rental property); however, Elasticsearch's default behavior is to merge nested documents.
Here is a snippet of the mapping (availability dates in the "bookingInfo"):
{
  "vacation-rental-properties": {
    "mappings": {
      "property": {
        "dynamic": "false",
        "properties": {
          "bookingInfo": {
            "type": "nested",
            "properties": {
              "avail": { "type": "integer" },
              "datum": { "type": "date", "format": "dateOptionalTime" },
              "in": { "type": "boolean" },
              "min": { "type": "integer" },
              "out": { "type": "boolean" },
              "u": { "type": "integer" }
            }
          }
          // this part left out
        }
      }
    }
  }
}
Unfortunately, our current underlying business logic does not allow us to replace or update parts of the "bookingInfo" nested documents; we need to replace the entire array of nested documents. With the default behavior, updating the 'parent' doc merely adds new nested docs to "bookingInfo" (unless they already exist, in which case they are updated), leaving the index with a lot of old dates that should no longer be there (if they're in the past, they're not bookable anyway).
How do I go about making the update call to ES?
Currently using a bulk call such as (two lines for each doc):
{ "update" : {"_id" : "abcd1234", "_type" : "property", "_index" : "vacation-rental-properties"} }
{ "doc" : {"bookingInfo" : ["all of the documents here"]} }
I have found this question that seems related, and I wonder if the following will work (after first enabling scripts via script.inline: on in the config file for version 1.6+):
curl -XPOST localhost:9200/the-index-and-property-here/_update -d '{
  "script": "ctx._source.bookingInfo = updated_bookingInfo",
  "params": {
    "updated_bookingInfo": {"field": "bookingInfo"}
  }
}'
How do I translate that to a bulk call for the above?
Using ElasticSearch 1.7, this is the way I solved it. I hope it can be of help to someone, as a future reference.
{ "update": { "_id": "abcd1234", "_retry_on_conflict": 3 } }\n
{ "script": { "inline": "ctx._source.bookingInfo = param1", "lang": "js", "params": { "param1": ["All of the nested docs here"] } } }\n
...and so on for each entry in the bulk update call.
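The bulk body is newline-delimited JSON, two lines per document: the update action and the script that overwrites the whole bookingInfo array. Building it programmatically might look like this sketch in Python (`build_bulk_body` is an illustrative helper):

```python
import json

def build_bulk_body(updates, index="vacation-rental-properties",
                    doc_type="property"):
    """Serialize (doc_id, booking_info_list) pairs into the
    two-lines-per-action NDJSON body the bulk API expects, using the
    script that replaces the bookingInfo array wholesale."""
    lines = []
    for doc_id, booking_info in updates:
        lines.append(json.dumps({"update": {
            "_index": index, "_type": doc_type,
            "_id": doc_id, "_retry_on_conflict": 3}}))
        lines.append(json.dumps({"script": {
            "inline": "ctx._source.bookingInfo = param1",
            "lang": "js",
            "params": {"param1": booking_info}}}))
    return "\n".join(lines) + "\n"  # bulk bodies must end with a newline

body = build_bulk_body([("abcd1234", [{"datum": "2015-08-01", "avail": 1}])])
print(body)
```

The resulting string can be POSTed to the _bulk endpoint as-is.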