Update information in Elastic Search - elasticsearch

I am trying to update an object in ES so that every later query returns the updated info. I have an object like this:
{
"took": 7,
"timed_out": false,
"_shards": {
"total": 768,
"successful": 768,
"failed": 0
},
"hits": {
"total": 456,
"max_score": 1,
"hits": [
{
"_index": "sometype_1",
"_type": "sometype",
"_id": "12312321312312",
"_score": 1,
"_source": {
"readModel": {
"id": "asdfqwerzcxv",
"status": "active",
"hidden": false,
"message": "hello world",
},
"model": {
"id": "asdfqwerzcxv",
"content": {
"objectId": "421421312312",
"content": {
"#type": "text",
"text": "hello world"
}
..... //the rest of the object...
And I want to update the message (part of the read model), so I made something like this:
PUT test/readModel.id/ID123
{
"message" : "hello"
}
But every time I query for ID123 I get the same info, and even worse, the more PUTs I make, the more objects I get back (all with the same info).
Any idea how to do this?

If there's only a single document that you need to update, you can use the Update API with a partial document that mirrors the nesting of the field you want to change, like this:
POST sometype_1/sometype/12312321312312/_update
{
"doc": {
"readModel": {
"message": "hello"
}
}
}
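The partial document passed to the Update API is merged into the existing document rather than replacing it. The following is a minimal plain-Python sketch of such a recursive merge, only to illustrate why sibling fields such as status survive; merge_partial_doc is a hypothetical helper, not part of any Elasticsearch client.

```python
import copy


def merge_partial_doc(source, partial):
    """Recursively merge `partial` into a copy of `source` (objects are merged, other values replaced)."""
    merged = copy.deepcopy(source)
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are objects: merge them key by key.
            merged[key] = merge_partial_doc(merged[key], value)
        else:
            # Scalars, arrays, or new keys: the partial value wins.
            merged[key] = copy.deepcopy(value)
    return merged


# Trimmed copy of the _source from the question.
source = {
    "readModel": {
        "id": "asdfqwerzcxv",
        "status": "active",
        "hidden": False,
        "message": "hello world",
    },
    "model": {"id": "asdfqwerzcxv"},
}

updated = merge_partial_doc(source, {"readModel": {"message": "hello"}})
print(updated["readModel"])
```

Only readModel.message changes in the result; the other readModel fields and the model object are carried over untouched.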
If several documents can have readModel.id: asdfqwerzcxv and you want to update all of them with the same message, then you need to use the Update by query API like this:
POST sometype_1/_update_by_query
{
"script": {
"source": "ctx._source.readModel.message = params.message",
"lang": "painless",
"params": {
"message": "hello"
}
},
"query": {
"match": {
"readModel.id": "asdfqwerzcxv"
}
}
}
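If you build this request from application code, the body can be assembled as a plain dictionary. This is a minimal sketch under stated assumptions: build_update_by_query is a hypothetical helper, and sending the body to sometype_1/_update_by_query is left to whatever HTTP or Elasticsearch client you already use.

```python
import json


def build_update_by_query(match_field, match_value, target_path, new_value):
    """Build an _update_by_query body that sets one field on every matching document."""
    return {
        "script": {
            # target_path is interpolated into the script, so it must come from trusted code.
            "source": f"ctx._source.{target_path} = params.value",
            "lang": "painless",
            # The new value travels in params instead of being pasted into the script text.
            "params": {"value": new_value},
        },
        "query": {"match": {match_field: match_value}},
    }


body = build_update_by_query("readModel.id", "asdfqwerzcxv", "readModel.message", "hello")
print(json.dumps(body, indent=2))
```

Passing the value through params keeps the script text identical between calls and avoids quoting problems when the message contains special characters.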

Related

Source to destination Key Field mapping in Elastic Search

I have an Elasticsearch index with source data coming in the following way:
"_source": {
"email": "smithamber@example.com",
"time": "2022-09-08T13:52:50.347861",
"message": "Pattern thank talk mention. Manage nearly tell beat. Difficult husband feel talk radio however.",
"sIp": "192.168.11.156",
"dIp": "80.254.211.60",
"ts": "2022-09-08T13:52:50"
}
Now I want to dynamically map the @timestamp field [destination key] of the ES doc to the value of time [source key]. For this I am using:
"runtime_mappings": {
"@timestamp": {
"type": "date",
"format": "yyyyMMdd'T'HHmmss.SSSZ",
"script": {
"source": "if (doc[\"time\"].size() == 0) {return} else {return doc[\"time\"].value;}",
"lang": "painless"
}
}
}
However, this does not work. Is there a better way to map a source key field to a destination key field in Elasticsearch? I am also open to a static mapping, set once before creating the index for one kind of source data.
I am looking for the correct syntax for mapping my field.
Edited:
When I add the query -
{ "query": {
"range": {
"@timestamp": {
"gte": "now-5d",
"lte": "now"
}
}
}
}
I see no hits.
{
"took": 20,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 0,
"relation": "eq"
},
"max_score": null,
"hits": []
}
}
However, the same query on the time field returns all the filtered docs.
{
"took": 27,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 10000,
"relation": "gte"
},
"max_score": 1.0,
"hits": [
{
"_index": "topic-indexer-xxx",
"_id": "c28sIYMB0xJUJru8c47O",
"_score": 1.0,
"_source": {
"email": "albertthompson@example.com",
"time": "2022-09-07T15:25:33.672016",
"message": "Candidate future staff ever former run. Like quality personal specific trouble cell money move. Available majority memory model thing TV wrong. Summer anyone light key.",
"sIp": "192.168.103.75",
"dIp": "191.27.68.163"
}
},
....
}
For the mapping I have also tried dynamic templates, but I still get no results when querying the @timestamp field:
{
"dynamic_templates": [
{
"@timestamp": {
"match": "time",
"mapping": {
"type": "date",
"format": "strict_date_optional_time",
"copy_to": "@timestamp"
}
}
}
]
}
With @paulo's response, I just did a little fine-tuning to resolve the issue. The mapping below (as set) works, and I can then run range queries on the @timestamp field:
{
"runtime": {
"@timestamp": {
"type": "date",
"script": {
"source": "if (doc['time'].size() != 0){ emit(doc['time'].value.toEpochMilli());}",
"lang": "painless"
}
}
},
"properties": {
"@timestamp": {
"type": "date"
}
}
}
Tldr;
I feel you got mixed up in your Painless script.
Please find below an example you should be able to reproduce on your side.
Time is already a date on my side; Elasticsearch was able to detect it automatically.
On another note, runtime fields, while very flexible, may lead to performance issues in the long run.
Maybe you should look into an ingest pipeline instead.
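For the ingest pipeline route, a minimal sketch of what such a pipeline body could look like is shown here as a Python dictionary. It uses the date processor with its field, target_field and formats options; build_timestamp_pipeline and the idea of sending the body with PUT _ingest/pipeline/<some-name> are illustrative assumptions, and documents that lack the time field are not handled here.

```python
import json


def build_timestamp_pipeline(source_field="time", target_field="@timestamp"):
    """Build an ingest pipeline body that parses source_field and stores it in target_field."""
    return {
        "description": f"Parse {source_field} into {target_field}",
        "processors": [
            {
                "date": {
                    "field": source_field,
                    "target_field": target_field,
                    # Assumption: the incoming values are ISO 8601 strings,
                    # like "2022-09-08T13:52:50.347861" in the question.
                    "formats": ["ISO8601"],
                }
            }
        ],
    }


pipeline = build_timestamp_pipeline()
print(json.dumps(pipeline, indent=2))
```

Unlike a runtime field, this does the conversion once at index time, so range queries on the target field do not have to run a script per document.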
Solution
POST /73684302/_doc
{
"email": "smithamber@example.com",
"time": "2022-09-08T13:52:50.347861",
"message": "Pattern thank talk mention. Manage nearly tell beat. Difficult husband feel talk radio however.",
"sIp": "192.168.11.156",
"dIp": "80.254.211.60",
"ts": "2022-09-08T13:52:50"
}
POST /73684302/_doc
{
"email": "smithamber@example.com",
"message": "Pattern thank talk mention. Manage nearly tell beat. Difficult husband feel talk radio however.",
"sIp": "192.168.11.156",
"dIp": "80.254.211.60",
"ts": "2022-09-08T13:52:50"
}
GET /73684302/_search
{
"runtime_mappings": {
"@timestamp": {
"type": "date",
"script": {
"source": """
if (doc["time"].size() != 0){
emit(doc["time"].value.toEpochMilli());
}
""",
"lang": "painless"
}
}
},
"_source": false,
"fields": ["@timestamp"]
}
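To make the logic of the runtime script concrete, here is a rough plain-Python equivalent of what it does for each of the two sample documents: emit the time value as epoch milliseconds when the field exists, and emit nothing otherwise. This is only an illustration; emit_epoch_millis is a hypothetical name, and treating the timestamps as UTC is an assumption.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def emit_epoch_millis(doc):
    """Return epoch milliseconds for doc['time'], or None when the field is absent."""
    raw = doc.get("time")
    if raw is None:
        # Mirrors the script emitting no value for this document.
        return None
    # Assumption: the stored value has no offset and is interpreted as UTC.
    parsed = datetime.fromisoformat(raw).replace(tzinfo=timezone.utc)
    # Integer division keeps whole milliseconds and avoids float rounding.
    return (parsed - EPOCH) // timedelta(milliseconds=1)


with_time = {"time": "2022-09-08T13:52:50.347861", "ts": "2022-09-08T13:52:50"}
without_time = {"ts": "2022-09-08T13:52:50"}

print(emit_epoch_millis(with_time))
print(emit_epoch_millis(without_time))
```

The second document gets no value at all, which is why it would not show up in a range query on the runtime field.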

How to make a proper query to select by ID and later update using Elasticsearch?

I am very new to ES and I am trying to figure out some things.
I did a basic query this way:
GET _search
{
"query": {
"match_all": {}
}
}
and I got this...
{
"took": 7,
"timed_out": false,
"_shards": {
"total": 768,
"successful": 768,
"failed": 0
},
"hits": {
"total": 456,
"max_score": 1,
"hits": [
{
"_index": "sometype_1",
"_type": "sometype",
"_id": "12312321312312",
"_score": 1,
"_source": {
"readModel": {
"id": "asdfqwerzcxv",
"status": "active",
"hidden": false
},
"model": {
"id": "asdfqwerzcxv",
"content": {
"objectId": "421421312312",
"message": "hello world",
..... //the rest of the object...
So right now I want to get the object with id asdfqwerzcxv and I did this:
GET _search
{
"query": {
"match" : {
"id" :"asdfqwerzcxv"
}
}
}
But of course it is not working... I also tried to spell out the whole route like:
GET _search
{
"query": {
"match" : {
"_source" :{
"readModel" : {
"id": "asdfqwerzcxv"
}
}
}
}
}
But no luck...
Is there a way to do this? Could someone help me?
Thanks
You need to use the fully qualified field name, i.e. add the readModel. prefix in front of id. Try this:
GET _search
{
"query": {
"match" : {
"readModel.id" :"asdfqwerzcxv"
}
}
}
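A quick way to see why "id" alone finds nothing is to model the dotted field name as a path through the nested _source. This is a plain-Python illustration of the addressing only, not of how Elasticsearch matches internally; get_by_path is a hypothetical helper.

```python
def get_by_path(source, path):
    """Follow a dotted path through nested dicts; return None if any step is missing."""
    current = source
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


# Trimmed copy of the _source from the question.
source = {
    "readModel": {"id": "asdfqwerzcxv", "status": "active", "hidden": False},
    "model": {"id": "asdfqwerzcxv"},
}

print(get_by_path(source, "readModel.id"))
print(get_by_path(source, "id"))
```

There is no top-level id in the document, so only the full path readModel.id (or model.id) resolves to a value.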

Why does the Elasticsearch filter not give any results whereas the Kibana dashboard does?

I am querying Elasticsearch using Sense. When using a range filter on a field, I get empty hits, but I am able to get results using the Kibana dashboard. Why is the filter not working? My query:
GET _search
{
"query": {
"bool": {
"must": [
{"match": {"field_name1": "value1"}},
{"match": {"file_name2": "value2"}}
]
}
},
"filter": { <- not working (no data, but gets data from kibana)
"range": {
"@timestamp": {
"gte": "2017-02-18"
}
}
},
"sort": [
{
"@timestamp": {
"order": "desc",
"ignore_unmapped" : true
}
}
]
}
In the Kibana dashboard, when I set the time range it adds time:(from:'2017-02-18T10:19:08.680Z',mode:absolute,to:'2017-02-19T10:19:08.680Z') and I am able to see results. The dashboard also adds some other things, like metadata and a filter with negate, but I think they do the same as my query. Only the time part seems to be different. So why the difference, and is my query correct? The sample URL:
https://elasticsearch/app/kibana#/discover?
_g=(refreshInterval:(display:Off,pause:!f,value:0),time:(from:'2017-02-18T09:23:41.044Z',mode:absolute,to:'2017-02-19T09:23:41.044Z'))
&_a=(columns:!(description,id),filters:!(('$state':(store:appState),meta:(alias:!n,disabled:!f,index:index-value,key:field_name1,negate:!f,value:value1),query:(match:(field_name2:(query:value2,type:phrase))))),index:index-value,interval:auto,query:(query_string:(analyze_wildcard:!t,query:'*')),sort:!('#timestamp',desc),uiState:(),vis:(aggs:!((params:(field:field_name2,orderBy:'2',size:20),schema:segment,type:terms),(id:'2',schema:metric,type:count)),type:histogram))
&indexPattern=index-value&type=histogram
Thanks.
Sample json response:
{
"took": some_number,
"timed_out": false,
"_shards": {
"total": some_number,
"successful": some_number,
"failed": 0
},
"hits": {
"total": some_number,
"max_score": null,
"hits": [
{
"_index": "index-name",
"_type": "log-1",
"_id": "alphanum",
"_score": null,
"_source": {
"headers": "header-string",
"query_string": "query-string",
"server_variables": "server-variables",
"cookies": "cookies",
"extra_data": "some extra stuff",
"exception_data_obj": {
"stack_trace": "",
"source": "",
"message": "success",
"additional_data": ""
},
"some_id": "211FA1F1-F312-1234-B539-F7AAE23EAA2F",
"level": "Warn",
"description": "Success",
"@timestamp": "2017-01-20T01:33:27.303Z",
"field1": "value1",
"field2": "value2",
"key": {
"key.field1": "key.value1",
"key.field2": "key.value2"
},
"@by": "app-name",
"environment": "env-name"
},
"sort": [
1484876007303
]
},
{}
]
}
}
It's not the same query: in the Sense query you require a match on both field1 and field2 (the must clauses), but in Kibana you didn't.

Get specific fields from index in elasticsearch

I have an index in Elasticsearch.
Sample structure:
{
"Article": "Article7645674712",
"Genre": "Genre92231455",
"relationDesc": [
"Article",
"Genre"
],
"org": "user",
"dateCreated": {
"date": "08/05/2015",
"time": "16:22 IST"
},
"dateModified": "08/05/2015"
}
From this index I want to retrieve selected fields: org and dateModified.
I want a result like this:
{
"took": 265,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 28,
"max_score": 1,
"hits": [
{
"_index": "couchrecords",
"_type": "couchbaseDocument",
"_id": "3",
"_score": 1,
"_source": {
"doc": {
"org": "user",
"dateModified": "08/05/2015"
}
}
},
{
"_index": "couchrecords",
"_type": "couchbaseDocument",
"_id": "4",
"_score": 1,
"_source": {
"doc": {
"org": "user",
"dateModified": "10/05/2015"
}
}
}
]
}
}
How do I query Elasticsearch to get only specific selected fields?
You can retrieve only a specific set of fields in the result hits using the _source parameter like this:
curl -XGET localhost:9200/couchrecords/couchbaseDocument/_search?_source=org,dateModified
Or in this format:
curl -XPOST localhost:9200/couchrecords/couchbaseDocument/_search -d '{
"_source": ["doc.org", "doc.dateModified"], <---- you just need to add this
"query": {
"match_all":{} <----- or whatever query you have
}
}'
That's easy. Consider any query of this format:
{
"query": {
...
}
}
You'll just need to add the fields field to your query, which in your case results in the following:
{
"query": {
...
},
"fields" : ["org","dateModified"]
}
Or, using source filtering with _source:
{
"_source" : ["org","dateModified"],
"query": {
...
}
}
Check the Elasticsearch source filtering documentation.
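To picture what source filtering returns, here is a small plain-Python emulation that keeps only the listed dotted paths of a document. It is a simplified sketch: filter_source is a hypothetical helper, and wildcard patterns and excludes are not modeled.

```python
_MISSING = object()  # sentinel so that stored None values are still returned


def filter_source(source, includes):
    """Return a new dict holding only the dotted paths listed in `includes`."""
    filtered = {}
    for path in includes:
        parts = path.split(".")
        value = source
        for part in parts:
            value = value.get(part, _MISSING) if isinstance(value, dict) else _MISSING
            if value is _MISSING:
                break
        if value is _MISSING:
            continue  # paths that do not exist are skipped
        target = filtered
        for part in parts[:-1]:
            target = target.setdefault(part, {})
        target[parts[-1]] = value
    return filtered


# The sample document from the question.
doc = {
    "Article": "Article7645674712",
    "Genre": "Genre92231455",
    "relationDesc": ["Article", "Genre"],
    "org": "user",
    "dateCreated": {"date": "08/05/2015", "time": "16:22 IST"},
    "dateModified": "08/05/2015",
}

print(filter_source(doc, ["org", "dateModified"]))
print(filter_source(doc, ["dateCreated.date"]))
```

Nested paths keep their enclosing object, so dateCreated.date comes back as an object containing only date.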

How to use _timestamp in a scripted update

I was trying to come up with an elegant answer to this question and ran into an unexpected problem. The basic idea is to update a document based on its current timestamp. Seems straightforward enough, but I seem to be missing something. At the bottom of the Update API page, the ES docs say:
It also allows to update the ttl of a document using ctx._ttl and timestamp using ctx._timestamp. Note that if the timestamp is not updated and not extracted from the _source it will be set to the update date.
The ES documentation is often enigmatic at best, especially when it comes to scripting, but I took this to mean that I could use the _timestamp field in an update script.
So I set up a simple index with a timestamp:
PUT /test_index
{
"settings": {
"number_of_shards": 1,
"number_of_replicas": 0
},
"mappings": {
"doc": {
"_timestamp": {
"enabled": true,
"store": true,
"path": "doc_date",
"format" : "YYYY-MM-dd"
},
"properties": {
"doc_date": {
"type": "date",
"format" : "YYYY-MM-dd"
},
"doc_text": {
"type": "string"
}
}
}
}
}
and added some docs:
POST /test_index/_bulk
{"index":{"_index":"test_index","_type":"doc","_id":1}}
{"doc_text":"doc1", "doc_date":"2015-2-5"}
{"index":{"_index":"test_index","_type":"doc","_id":2}}
{"doc_text":"doc2", "doc_date":"2015-2-10"}
{"index":{"_index":"test_index","_type":"doc","_id":3}}
{"doc_text":"doc3", "doc_date":"2015-2-15"}
If I query for the first doc, I get back what I expect:
POST /test_index/_search
{
"query": {
"match": {
"doc_text": "doc1"
}
},
"fields": [
"_timestamp",
"_source"
]
}
...
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 1.4054651,
"hits": [
{
"_index": "test_index",
"_type": "doc",
"_id": "1",
"_score": 1.4054651,
"_source": {
"doc_text": "doc1",
"doc_date": "2015-2-5"
},
"fields": {
"_timestamp": 1423094400000
}
}
]
}
}
So far so good. Now I want to conditionally update the first doc, based on its timestamp. First I tried this, and got an error:
POST /test_index/doc/1/_update
{
"script": "if(ctx._timestamp < new_ts){ctx._source.doc_date=new_date;ctx._source.doc_text=new_text}",
"params": {
"new_ts": 1423526400000,
"new_date": "2015-2-10",
"new_text": "doc1-updated"
}
}
...
{
"error": "ElasticsearchIllegalArgumentException[failed to execute script]; nested: PropertyAccessException[[Error: could not access: _timestamp; in class: java.util.HashMap]\n[Near : {... if(ctx._timestamp < new_ts){ctx._ ....}]\n ^\n[Line: 1, Column: 4]]; ",
"status": 400
}
Then I tried this:
POST /test_index/doc/1/_update
{
"script": "if(ctx[\"_timestamp\"] < new_ts){ctx._source.doc_date=new_date;ctx._source.doc_text=new_text}",
"params": {
"new_ts": 1423526400000,
"new_date": "2015-2-10",
"new_text": "doc1-updated"
}
}
...
{
"_index": "test_index",
"_type": "doc",
"_id": "1",
"_version": 2
}
I didn't get an error, but the update didn't happen:
POST /test_index/_search
{
"query": {
"match": {
"doc_text": "doc1"
}
},
"fields": [
"_timestamp",
"_source"
]
}
...
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 1.287682,
"hits": [
{
"_index": "test_index",
"_type": "doc",
"_id": "1",
"_score": 1.287682,
"_source": {
"doc_text": "doc1",
"doc_date": "2015-2-5"
},
"fields": {
"_timestamp": 1423094400000
}
}
]
}
}
Just out of curiosity, I inverted the conditional:
POST /test_index/doc/1/_update
{
"script": "if(ctx[\"_timestamp\"] > new_ts){ctx._source.doc_date=new_date;ctx._source.doc_text=new_text}",
"params": {
"new_ts": 1423526400000,
"new_date": "2015-2-10",
"new_text": "doc1-updated"
}
}
with the same result: no update.
Okay, so as a sanity check I tried to set the timestamp, and got an error:
POST /test_index/doc/1/_update
{
"script": "ctx._source.doc_date=new_date;ctx._source.doc_text=new_text;ctx._timestamp=new_ts",
"params": {
"new_ts": 1423526400000,
"new_date": "2015-2-10",
"new_text": "doc1-updated"
}
}
...
{
"error": "ClassCastException[java.lang.Long cannot be cast to java.lang.String]",
"status": 500
}
I also tried it with "ctx[\"_timestamp\"]=new_ts;", and got the same error.
So it seems that the _timestamp field is not available to the script, even though the documentation says it is. What am I doing wrong?
I also tried updating without the conditional or updating the timestamp, and it worked as expected.
I used Elasticsearch version 1.3.4 (with dynamic scripting enabled, obviously), running on an Ubuntu 12 VM.
Here is the code I used to set this up:
http://sense.qbox.io/gist/ca2b3c6b84572e5f87d57d22f8c38252fa4ee216
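For reference, the comparison that the update scripts above are meant to perform can be written out in plain Python. This does not reproduce Elasticsearch's behaviour or explain why the update was skipped; it only spells out the intended logic and checks that the stored _timestamp value corresponds to doc_date at midnight UTC. The function names are hypothetical.

```python
from datetime import datetime, timezone


def date_to_epoch_millis(date_text):
    """Convert a date such as '2015-2-5' to epoch milliseconds at midnight UTC."""
    parsed = datetime.strptime(date_text, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return int(parsed.timestamp()) * 1000


def conditional_update(doc, timestamp, new_ts, new_date, new_text):
    """Return (document, changed); the document is replaced only if it is older than new_ts."""
    if timestamp < new_ts:
        return {**doc, "doc_date": new_date, "doc_text": new_text}, True
    return doc, False


doc1 = {"doc_text": "doc1", "doc_date": "2015-2-5"}
current_ts = date_to_epoch_millis(doc1["doc_date"])   # value reported as _timestamp
new_ts = date_to_epoch_millis("2015-2-10")            # value passed as params.new_ts

updated, changed = conditional_update(doc1, current_ts, new_ts, "2015-2-10", "doc1-updated")
print(current_ts, new_ts, changed, updated)
```

With these inputs the first script (the `<` comparison) is the one expected to apply the update, since the stored timestamp is five days earlier than new_ts.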

Resources