Update records in ElasticSearch - elasticsearch

I'd like to update the logdate field for ALL records in a specific index. From what I have read so far, it seems that this is not possible. Am I correct?
Here's a sample of a document:
{
"_index": "logstash-01-2015",
"_type": "ufdb",
"_id": "AU__EvrALg15uxY1Wxf9",
"_score": 1,
"_source": {
"message": "2015-08-14 06:50:05 [31946] PASS level2 10.249.10.70 level2 ads http://ad.360yield.com/unpixel.... GET",
"@version": "1",
"@timestamp": "2015-09-24T11:17:57.389Z",
"type": "ufdb",
"file": "/usr/local/ufdbguard/logs/ufdbguardd.log",
"host": "PROXY-DEV",
"offset": "3983281700",
"logdate": "2015-08-14T04:50:05.000Z",
"status": "PASS",
"group": "level2",
"clientip": "10.249.10.70",
"category": "ads",
"url": "http://ad.360yield.com/unpixel....",
"method": "GET",
"tags": [
"_grokparsefailure"
]
}
}

You are correct, that is not possible.
There has been an open issue requesting Update by Query for a long time, and I'm not sure it is going to be implemented anytime soon, since it is very problematic for the underlying Lucene engine: Lucene cannot modify a document in place, so every matching document has to be deleted and reindexed.
An Update by Query plugin is available on GitHub, but it is experimental and I have never tried it.
UPDATE 2018-05-02
The original answer is quite old. Update By Query is now supported.
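As a rough illustration of what such a request could look like for the question above, here is a minimal Python sketch that only builds the method, path, and JSON body for the `_update_by_query` endpoint and prints them. The helper name `build_update_by_query` and the new logdate value are assumptions made for illustration; sending the request (for example from Kibana Dev Tools or an HTTP client) is left out.

```python
import json


def build_update_by_query(index, logdate):
    """Return (method, path, body) for an Update By Query request.

    Hypothetical helper: it only assembles the request, it does not send it.
    """
    body = {
        # match_all selects every document in the index
        "query": {"match_all": {}},
        "script": {
            "lang": "painless",
            # Overwrite the logdate field with the value passed in params
            "source": "ctx._source.logdate = params.logdate",
            "params": {"logdate": logdate},
        },
    }
    return "POST", "/" + index + "/_update_by_query", body


# Example value only; the question does not say what logdate should become.
method, path, body = build_update_by_query(
    "logstash-01-2015", "2015-09-25T12:20:00.000Z"
)
print(method, path)
print(json.dumps(body, indent=2))
```

Passing the value through `params` rather than concatenating it into the script source keeps the script text constant, which lets Elasticsearch cache the compiled script.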

You can use the partial update API.
To test it, I created a trivial index:
PUT /test_index
Then created a document:
PUT /test_index/doc/1
{
"message": "2015-08-14 06:50:05 [31946] PASS level2 10.249.10.70 level2 ads http://ad.360yield.com/unpixel.... GET",
"@version": "1",
"@timestamp": "2015-09-24T11:17:57.389Z",
"type": "ufdb",
"file": "/usr/local/ufdbguard/logs/ufdbguardd.log",
"host": "PROXY-DEV",
"offset": "3983281700",
"logdate": "2015-08-14T04:50:05.000Z",
"status": "PASS",
"group": "level2",
"clientip": "10.249.10.70",
"category": "ads",
"url": "http://ad.360yield.com/unpixel....",
"method": "GET",
"tags": [
"_grokparsefailure"
]
}
Now I can do a partial update on the document with:
POST /test_index/doc/1/_update
{
"doc": {
"logdate": "2015-09-25T12:20:00.000Z"
}
}
If I retrieve the document:
GET /test_index/doc/1
I will see that the logdate property has been updated:
{
"_index": "test_index",
"_type": "doc",
"_id": "1",
"_version": 2,
"found": true,
"_source": {
"message": "2015-08-14 06:50:05 [31946] PASS level2 10.249.10.70 level2 ads http://ad.360yield.com/unpixel.... GET",
"@version": "1",
"@timestamp": "2015-09-24T11:17:57.389Z",
"type": "ufdb",
"file": "/usr/local/ufdbguard/logs/ufdbguardd.log",
"host": "PROXY-DEV",
"offset": "3983281700",
"logdate": "2015-09-25T12:20:00.000Z",
"status": "PASS",
"group": "level2",
"clientip": "10.249.10.70",
"category": "ads",
"url": "http://ad.360yield.com/unpixel....",
"method": "GET",
"tags": [
"_grokparsefailure"
]
}
}
Here is the code I used to test it:
http://sense.qbox.io/gist/236bf271df6d867f5f0c87eacab592e41d3095cf

Related

Elasticsearch - Delete query among nested object

I'm new to Elasticsearch, and I cannot find a Delete query.
Here is an example of a document in myIndex:
{
"_index": "myIndex",
"_type": "_doc",
"_id": "IPc5kn8Bq7SuVr5qM9dq",
"_score": 1,
"_source": {
"code": "1234567",
"matches": [
{
"hostname": "hostnameA.com",
"url": "https://www.hostnameA.com/...."
},
{
"hostname": "hostnameB.com",
"url": "https://www.hostnameB.com/...."
},
{
"hostname": "hostnameC.com",
"url": "https://www.hostnameC.com/...."
},
{
"hostname": "hostnameD.com",
"url": "https://www.hostnameD.com/...."
}
]
}
}
Let's say this index contains 10k documents.
I would like a query that removes every item from the matches array whose hostname is equal to hostnameC.com, while keeping all the others.
Does anyone have an idea that could help me?
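One possible approach, sketched here under assumptions rather than taken from an answer to this question, is an Update By Query request whose Painless script calls removeIf on the array. The Python below only builds that request and mirrors the script's effect on a plain dict so the logic can be checked locally. The helper names are hypothetical, and the index name is written in lowercase because Elasticsearch index names must be lowercase.

```python
import json

# Painless script: drop matching entries; if the array is missing or nothing
# was removed, mark the operation as a no-op so the document is not rewritten.
SCRIPT = (
    "if (ctx._source.matches == null || "
    "!ctx._source.matches.removeIf(m -> m.hostname == params.hostname)) "
    "{ ctx.op = 'noop' }"
)


def build_request(index, hostname):
    """Return (method, path, body) for the Update By Query call (not sent)."""
    body = {
        "script": {
            "lang": "painless",
            "source": SCRIPT,
            "params": {"hostname": hostname},
        }
    }
    return "POST", "/" + index + "/_update_by_query", body


def remove_matches(source, hostname):
    """Pure-Python equivalent of the script's effect on one _source dict."""
    result = dict(source)
    result["matches"] = [
        m for m in source.get("matches", []) if m.get("hostname") != hostname
    ]
    return result


sample = {
    "code": "1234567",
    "matches": [
        {"hostname": "hostnameA.com"},
        {"hostname": "hostnameC.com"},
        {"hostname": "hostnameD.com"},
    ],
}
print(json.dumps(build_request("myindex", "hostnameC.com")[2], indent=2))
print([m["hostname"] for m in remove_matches(sample, "hostnameC.com")["matches"]])
```

With no query in the body, the request visits every document in the index; adding a query that selects only documents containing the hostname would reduce the work, but its exact form depends on how matches is mapped.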

Kibana Visualization from multiple Elastic Search Indexes

I need to find the number of mobile applications registered by a customer. The Elasticsearch indices are designed as shown below (mobile apps in one index, customers in a second, and the association between them in a third). When I created a Kibana index pattern covering all three indices, it did not provide a meaningful set of fields to query.
mobile_users
{
"_index": "mobile_users",
"_type": "_doc",
"_id": "mobileuser_id1",
"_score": 1,
"_source": {
"userid": "mobileuser_id1",
"name": "jack",
"username": "jtest",
"identifiers": [ ],
"contactEmails": [ ],
"creationDate": "2020-09-29 09:18:36 GMT",
"lastUpdated": 1601371117354,
"isSuspended": false,
"authStrategyIds": [ ],
"subscription": false
}
}
mobile_applications
{
"_index": "mobile_applications",
"_type": "_doc",
"_id": "mobileapp_id1",
"_source": {
"appDefinition": {
"info": {
"version": "1.0",
"title": "TEST.MobileAPP"
},
"AppDisplayName": "TEST.MobileAPP1.0",
"appName": "TEST.MobileAPP",
"appVersion": "1.0",
"maturityState": "Test",
"isActive": false,
"owner": "mobileappowner",
"creationDate": "2020-09-24 11:21:44 GMT",
"lastModified": "2020-10-13 11:58:22 GMT",
"id": "mobileapp_id1"
}
}
}
registered_mobile_applications
{
"_index": "registered_mobile_applications",
"_type": "_doc",
"_id": "mobileuser_id1",
"_version": 1,
"_score": 1,
"_source": {
"applicationId": "mobileuser_id1",
"mobileappIds": [
"mobileapp_id1", "mobileapp_id2"
],
"lastUpdated": 1601371117929
}
}
Can you advise if there is any way to get the count of registered applications for the given customer?
It's Elasticsearch, not Elastic Search :)
Given that your document structures are dramatically different from each other, it's not surprising that you can't get much meaning from a single index pattern.
However, there is no native way to count the values of an array in a document in Kibana. You could create a scripted field to do it, or add the count as a separate field during ingestion.
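The ingestion-time option could look roughly like the Python sketch below, which adds a count next to the array before the document is indexed. The field name `mobileappCount` and the helper `with_app_count` are made up for illustration; the actual ingestion code and client are not shown in the question.

```python
def with_app_count(doc):
    """Return a copy of a registered_mobile_applications document with an
    extra numeric field holding the number of registered app ids.

    `mobileappCount` is a hypothetical field name chosen for this sketch.
    """
    enriched = dict(doc)
    enriched["mobileappCount"] = len(doc.get("mobileappIds", []))
    return enriched


registration = {
    "applicationId": "mobileuser_id1",
    "mobileappIds": ["mobileapp_id1", "mobileapp_id2"],
    "lastUpdated": 1601371117929,
}
print(with_app_count(registration)["mobileappCount"])
```

Because the count is stored as an ordinary numeric field, Kibana can aggregate on it directly without a scripted field.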

How to remove a field from json field in Elastic Search

I would like to remove member2 from members. I saw script
ctx._source.list_data.removeIf{list_item -> list_item.list_id == remove_id}
for a list, but it does not work in my case. Is that possible?
{
"_index": "test",
"_type": "test",
"_id": "5",
"_score": 1.0,
"_source": {
"id": "1",
"description": "desc",
"name": "ss",
"members": {
"member1": {
"id": "2",
"role": "owner"
},
"member2": {
"role": "owner",
"id": "3"
}
}
}
}
You can use the update API:
POST test/_update/5
{
"script": "ctx._source.members.remove('member2')"
}
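removeIf is for lists. Your members field is an object (a map), so you need to use remove instead. To remove member2 only when its id matches a given value, add a condition (the script must stay on a single line to be valid JSON):
POST test/_update/5
{
"script": "if (ctx._source.members.member2 != null && ctx._source.members.member2.id == '3') { ctx._source.members.remove('member2') }"
}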

ElasticSearch NEST Reindex, edit name fields

I have an index with nested objects, something like:
"_index": "originindex",
"_source": {
"message": "",
"environment": "",
"nestedObj": {
"field1": "field1",
"field2": 1 },
"anotherfield": 1}
And I want to reindex it into something like:
"_index": "newindex",
"_source": {
"message": "",
"nestedObj-field1":"field1",
"nestedObj-field2": 1 ,
"anotherfield": 1}
I'm new to all of this. I'm using NEST on .NET 4.5, which offers a Reindex API, but I don't know how to use it for this purpose.
Thank you!
POST _reindex
{
"source": {
"index": "originindex"
},
"dest": {
"index": "newindex"
},
"script": {
"source": "ctx._source['nestedObj-field1'] = ctx._source.nestedObj.field1; ctx._source['nestedObj-field2'] = ctx._source.nestedObj.field2; ctx._source.remove('nestedObj');"
}
}
Just make sure your mappings are in place on the dest index before you execute this.
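To sanity-check the intended transformation before running a reindex, the flattening can be mirrored on a plain dict. This is only a local Python sketch of the desired result, not NEST code and not something Elasticsearch runs; the function name `flatten_nested` is hypothetical.

```python
def flatten_nested(source, key="nestedObj", sep="-"):
    """Return a copy of `source` where the object stored under `key` is
    replaced by top-level fields named '<key><sep><subfield>'."""
    # Copy every top-level field except the nested object itself
    flat = {k: v for k, v in source.items() if k != key}
    # Promote each sub-field to the top level under the combined name
    for name, value in source.get(key, {}).items():
        flat[key + sep + name] = value
    return flat


origin = {
    "message": "",
    "environment": "",
    "nestedObj": {"field1": "field1", "field2": 1},
    "anotherfield": 1,
}
print(flatten_nested(origin))
```

Note that this sketch keeps the environment field; if the destination documents should not contain it, it has to be removed explicitly.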

Exclude only 1 field from indexing

I need help with Elasticsearch field indexing.
I'm sending JSON documents to ES. Only one field contains XML, and I never search on it.
I would like to exclude ONLY this field from indexing. The field is: rawXml
Please let me know if you need more details from me.
Is this possible?
This is the JSON I'm sending now:
{
"_index": "logstash-2015.05.05",
"_type": "XXXXX",
"_id": "XXXXXX",
"_score": null,
"_source": {
"@version": 1,
"@timestamp": "2015-05-05T00:00:11.489Z",
"host": "XXXXXXXXX",
"application_name": "XXXXXXX",
"logger_name": "XXXXXXXXX",
"level": "DEBUG",
"thread_name": "XXXXXXX",
"name": "XXXXXXXX",
"rawXml": "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n<XXXXX xmlns=\"http://XXXXXXXX\" Version=\"14.000\">\n <POS>\n <Source DKNumber=\"XXXXXX\"/>\n </POS>\n <Unique>\n <ID PCC=\"XXXXX\" GDS=\"XXXXX\" NAME=\"XXXXXX\" TYPE=\"XXXXXX\"/>\n </Unique>\n</OpenHotelHubSessionCreateRQ>\n",
"type": "XXXXXXXX",
"message": "Event: XXXXXXX",
"requestContext": {
"wml-locale": "en_AU",
"accountId": XXXXXXX,
"agentName": "XXXXXXX",
"deviceType": "XXXXXX",
"cwtTraveller": {
"clientTopGuid": "XXXXX",
"travellerGuid": "XXXXXXX",
"clientTopTravelerTypeGuid": "XXXXXX",
"clientSubGuid": "XXXXXXXX"
},
"providerSessionId": null,
"variantId": X
},
"path": "XXXXXXXXXXX",
"hostname": "XXXXXXXXX"
},
"sort": [
1430784011489
]
}
