Elasticsearch | Mapping exclude field with bulk API

I am using the bulk API to create an index and store data. I also want to set a mapping that excludes a field "field1" from the _source. I know this can be done with the Create Index API (reference: https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-source-field.html), but I am using the bulk API. Below is a sample API call:
POST _bulk
{ "index" : { "_index" : "test", "_type" : "testType", "_id" : "1" } }
{ "field1" : "value1" }
Is there a way to add mapping settings while bulk indexing, similar to the code below?
{ "index" : { "_index" : "test", "_type" : "testType", "_id" : "1" },
  "mappings" : {
    "_source" : {
      "excludes" : [ "field1" ]
    }
  }
}
{ "field1" : "value1" }
How can I define a mapping with the bulk API?

It is not possible to define the mapping for a new index while using the bulk API. You have to create your index beforehand and define the mapping then, or you have to define an index template and use an index name in your bulk request that matches that template's pattern.
The following example can be executed via the Dev Tools window in Kibana:
PUT /_index_template/mytemplate
{
  "index_patterns": [ "te*" ],
  "priority": 1,
  "template": {
    "mappings": {
      "_source": {
        "excludes": [ "testexclude" ]
      },
      "properties": {
        "testfield": {
          "type": "keyword"
        }
      }
    }
  }
}
POST _bulk
{ "index" : { "_index" : "test", "_id" : "1" } }
{ "testfield" : "value1", "defaultField" : "asdf", "testexclude": "this shouldn't be in source" }
GET /test/_mapping
You can see from the response that the template was applied to the new test index: testfield has only the keyword type, and the _source excludes setting comes from the template.
{
  "test" : {
    "mappings" : {
      "_source" : {
        "excludes" : [ "testexclude" ]
      },
      "properties" : {
        "defaultField" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "testexclude" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "testfield" : {
          "type" : "keyword"
        }
      }
    }
  }
}
The document is also returned without the excluded field:
GET /test/_doc/1
Response:
{
  "_index" : "test",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 1,
  "_seq_no" : 0,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "defaultField" : "asdf",
    "testfield" : "value1"
  }
}
Hope this answers your question and solves your use-case.
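For completeness, the other option mentioned above — creating the index and its mapping explicitly before the bulk call — would look roughly like this (a sketch using the index and field names from the question):
PUT /test
{
  "mappings": {
    "_source": {
      "excludes": [ "field1" ]
    }
  }
}
POST _bulk
{ "index" : { "_index" : "test", "_id" : "1" } }
{ "field1" : "value1" }
This avoids relying on the template's index pattern matching, at the cost of one extra request per new index.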

Related

Is it possible to extract the stored value of a keyword field when _source is disabled in Elasticsearch 7

I have the following index:
{
  "articles_2022" : {
    "mappings" : {
      "_source" : {
        "enabled" : false
      },
      "properties" : {
        "content" : {
          "type" : "text",
          "norms" : false
        },
        "date" : {
          "type" : "date"
        },
        "feed_canonical" : {
          "type" : "boolean"
        },
        "feed_id" : {
          "type" : "integer"
        },
        "feed_subscribers" : {
          "type" : "integer"
        },
        "language" : {
          "type" : "keyword",
          "doc_values" : false
        },
        "title" : {
          "type" : "text",
          "norms" : false
        },
        "url" : {
          "type" : "keyword",
          "doc_values" : false
        }
      }
    }
  }
}
I have a very specific one-time need and I want to extract the stored values from the url field for all documents. Is this possible with Elasticsearch 7? Thanks!
Since your index mapping defines the url field as keyword type with "doc_values": false, you cannot run a terms aggregation on it.
As far as I understand your question, you only need to get the value of the url field from several documents. For that you can use an exists query.
Adding a working example:
Index Mapping:
PUT idx1
{
  "mappings": {
    "properties": {
      "url": {
        "type": "keyword",
        "doc_values": false
      }
    }
  }
}
Index Data:
POST idx1/_doc/1
{
  "url" : "www.google.com"
}
POST idx1/_doc/2
{
  "url" : "www.youtube.com"
}
Search Query:
POST idx1/_search
{
  "_source": [ "url" ],
  "query": {
    "exists": {
      "field": "url"
    }
  }
}
Search Response:
"hits" : [
  {
    "_index" : "idx1",
    "_type" : "_doc",
    "_id" : "1",
    "_score" : 1.0,
    "_source" : {
      "url" : "www.google.com"
    }
  },
  {
    "_index" : "idx1",
    "_type" : "_doc",
    "_id" : "2",
    "_score" : 1.0,
    "_source" : {
      "url" : "www.youtube.com"
    }
  }
]
Since your mapping has "_source" : { "enabled" : false }, you can add "store": true to the field whose value you want to extract, like this:
PUT indexExample2
{
  "mappings": {
    "_source": {
      "enabled": false
    },
    "properties": {
      "url": {
        "type": "keyword",
        "doc_values": false,
        "store": true
      }
    }
  }
}
Now index some data (thanks @ESCoder for the example data):
POST indexExample2/_doc/1
{
  "url" : "www.google.com"
}
POST indexExample2/_doc/2
{
  "url" : "www.youtube.com"
}
You can extract only the stored field in your search queries, even if _source is disabled.
POST indexExample2/_search
{
  "query": {
    "exists": {
      "field": "url"
    }
  },
  "stored_fields": [ "url" ]
}
This outputs:
"hits" : [
  {
    "_index" : "indexExample2",
    "_type" : "_doc",
    "_id" : "1",
    "_score" : 1.0,
    "fields" : {
      "url" : [ "www.google.com" ]
    }
  },
  {
    "_index" : "indexExample2",
    "_type" : "_doc",
    "_id" : "2",
    "_score" : 1.0,
    "fields" : {
      "url" : [ "www.youtube.com" ]
    }
  }
]

elasticsearch completion suggester returns only one suggestion when more than one matches

I created an index with this mapping:
PUT /test
{
  "mappings": {
    "properties" : {
      "suggest" : {
        "type" : "completion"
      }
    }
  }
}
and put some data into it:
PUT test/_doc/1?refresh
{
  "suggest" : {
    "input": [ "lorem ipsum", "lorem dolor" ]
  }
}
Next, I request suggestions:
POST test/_search?pretty
{
  "_source": false,
  "suggest": {
    "test-suggest" : {
      "prefix" : "lorem",
      "completion" : {
        "field" : "suggest"
      }
    }
  }
}
The problem is that the result contains only one suggestion, "lorem dolor":
{
  ...
  "suggest" : {
    "test-suggest" : [
      {
        "text" : "lorem",
        "offset" : 0,
        "length" : 5,
        "options" : [
          {
            "text" : "lorem dolor",
            "_index" : "test",
            "_type" : "_doc",
            "_id" : "1",
            "_score" : 1.0
          }
        ]
      }
    ]
  }
}
I would like to get all matching suggestions for this document, which are "lorem ipsum" and "lorem dolor". Is that possible with the Elasticsearch completion suggester?

How to rename array values in elastic search

I have an array field in an Elasticsearch document and want to rename all occurrences of a particular value. The index has the following mapping:
{
  "products" : {
    "mappings" : {
      "properties" : {
        "inStock" : {
          "type" : "long"
        },
        "name" : {
          "type" : "keyword"
        },
        "price" : {
          "type" : "double"
        },
        "tags" : {
          "type" : "keyword"
        }
      }
    }
  }
}
{
  "_index" : "products",
  "_type" : "_doc",
  "_id" : "100",
  "_version" : 6,
  "_seq_no" : 5,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "name" : "Product A",
    "price" : 50,
    "inStock" : 5,
    "tags" : [ "tagA", "tagB", "tagC", "tagD", "tagB" ]
  }
}
I am looking for a query that renames tagB to something else (e.g. tagF) in all documents that contain this tag.
You can replace the contents of an array with an update_by_query call and a simple script, as follows:
POST /products/_update_by_query
{
  "query": {
    "term": {
      "tags": {
        "value": "tagB"
      }
    }
  },
  "script": {
    "source": "Collections.replaceAll(ctx._source.tags, params.oldTag, params.newTag);",
    "params": { "oldTag": "tagB", "newTag": "tagF" },
    "lang": "painless"
  }
}
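One note: if other writes hit the index while the update runs, version conflicts abort the operation by default. The standard conflicts query parameter tells Elasticsearch to skip conflicting documents and continue (a sketch, same body as above):
POST /products/_update_by_query?conflicts=proceed
{
  "query": {
    "term": {
      "tags": {
        "value": "tagB"
      }
    }
  },
  "script": {
    "source": "Collections.replaceAll(ctx._source.tags, params.oldTag, params.newTag);",
    "params": { "oldTag": "tagB", "newTag": "tagF" },
    "lang": "painless"
  }
}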

Upsert document such that it would update the particular item in an array field

In Elasticsearch, say I have a document like this:
{
  "inputs": [
    {
      "id": "1234",
      "value": "ABCD"
    },
    {
      "id": "5678",
      "value": "EFGH"
    }
  ]
}
Say I now want to update the value of all items where id is "1234" to "XYZA". How can I do that with a script in Elasticsearch? Can I use a for loop in a script?
Mapping:
{
  "inputs" : {
    "mappings" : {
      "properties" : {
        "inputs" : {
          "type" : "nested",
          "properties" : {
            "id" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "value" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            }
          }
        }
      }
    }
  }
}
Query:
You can use the _update_by_query API. The query part filters the documents and the script updates the field.
1. When inputs is of nested type:
POST inputs/_update_by_query
{
  "script": {
    "source": "for (a in ctx._source['inputs']) { if (a.id == '1234') a.value = params.new_value; }",
    "params": {
      "new_value": "XYZA"
    }
  },
  "query": {
    "nested": {
      "path": "inputs",
      "query": {
        "term": {
          "inputs.id": "1234"
        }
      }
    }
  }
}
2. When inputs is of object type:
POST inputs/_update_by_query
{
  "script": {
    "source": "for (a in ctx._source['inputs']) { if (a.id == '1234') a.value = params.new_value; }",
    "params": {
      "new_value": "XYZA"
    }
  },
  "query": {
    "term": {
      "inputs.id": "1234"
    }
  }
}
Result:
"hits" : [
  {
    "_index" : "inputs",
    "_type" : "_doc",
    "_id" : "3uwrwHEBLcdvQ7OTrUmi",
    "_score" : 1.0,
    "_source" : {
      "inputs" : [
        {
          "id" : "1234",
          "value" : "XYZA"
        },
        {
          "id" : "5678",
          "value" : "EFGH"
        }
      ]
    }
  }
]

How to automatically set an @timestamp value in Elasticsearch 7 docs?

I have a problem with setting up Elasticsearch 7.
My goal is to have an @timestamp field filled in automatically whenever a new doc is created in ES.
I found some answers to similar questions, but they don't apply here because they target a different version.
I tried the _default_ object in the mappings object, but it is no longer supported in ES 7:
"_default_" : {
  "_timestamp" : {
    "enabled" : true,
    "store" : true
  }
}
And I want an @timestamp value generated in this case:
PUT /locations
{
  "mappings": {
    "properties": {
      "location": {
        "type": "geo_point"
      },
      "id": {
        "type": "text"
      }
    }
  }
}
PUT /locations/_doc/1
{
  "location" : "31.387593,121.123446",
  "id" : "xxxxxxxxxxxxxxxxxxxxxx"
}
Expected result:
{
  "@timestamp" : "2019-10-23 10:23:50",
  "location" : "31.387593,121.123446",
  "id" : "xxxxxxxxxxxxxxxxxxxxxx"
}
You can create an ingest pipeline:
PUT _ingest/pipeline/timestamp
{
  "description": "Adds timestamp to documents",
  "processors": [
    {
      "set": {
        "field": "_source.timestamp",
        "value": "{{_ingest.timestamp}}"
      }
    }
  ]
}
And call it while inserting documents:
POST index39/_doc?pipeline=timestamp
{
  "id" : 1
}
Response:
{
  "_index" : "index39",
  "_type" : "_doc",
  "_id" : "KWF6920BpmJq35glEsr3",
  "_score" : 1.0,
  "_source" : {
    "id" : 1,
    "timestamp" : "2019-10-23T07:17:15.639200400Z"
  }
}
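If you don't want to pass ?pipeline=timestamp on every indexing request, you can attach the pipeline to the index as its default (using the index.default_pipeline setting, available since Elasticsearch 6.5), so it runs automatically for every document indexed:
PUT index39/_settings
{
  "index.default_pipeline": "timestamp"
}
After this, a plain POST index39/_doc request goes through the timestamp pipeline without any extra query parameter.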
