In Painless, remove value from array - elasticsearch

In Splunk SPL, it's easy to remove a value from an array....
| eval Account_Name = mvindex(Account_Name, 0)
Windows security logs put the machine name in array(0) of the account name field;
array(1) contains the actual executing account name.
I need to do the same thing as the mvindex function in Painless.
I find lots of hits when searching for this but haven't found anything that works. There must be a simple way to remove an array value.

Is this what you are looking for?
POST sample_index/_doc
{
"Account_Name": [
"machine-name",
"account-name"
]
}
POST sample_index/_update_by_query
{
"query": {
"match_all": {}
},
"script": {
"source": "ctx._source['Account_Name'].remove(0)",
"lang": "painless"
}
}
GET sample_index/_search
The result of the search:
{
"took": 892,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": 1,
"hits": [
{
"_index": "sample_index",
"_id": "t3RIY4YBnUNkT6fHnBrI",
"_score": 1,
"_source": {
"Account_Name": [
"account-name"
]
}
}
]
}
}
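To make it clearer what the script does to each document, here is a minimal Python sketch of the same list operation on a plain dict. The helper name drop_first_value and the guard are assumptions added for illustration and are not part of the answer above. The unguarded Painless call would likely fail on documents where the field is missing or is not an array, so a similar check may be worth adding to the script.

```python
def drop_first_value(source, field="Account_Name"):
    """Mimic ctx._source[field].remove(0) on a plain dict, with a guard.

    Hypothetical helper for illustration only.
    """
    values = source.get(field)
    # Only touch real, non-empty lists; leave everything else unchanged.
    if isinstance(values, list) and values:
        values.pop(0)
    return source


doc = {"Account_Name": ["machine-name", "account-name"]}
print(drop_first_value(doc))  # {'Account_Name': ['account-name']}
```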

Related

ElasticSearch: Exact match on Keyword datatype field with array of values

In ElasticSearch, I have a mapping for an email field and title field as given below:
{
"person": {
"mappings": {
"_doc": {
"email": {
"type": "keyword",
"boost": 80
},
"title": {
"type": "text",
"boost": 70
}
}
}
}
}
Each person can have more than one email address and title. So, I'm storing the values in arrays.
I use query_string to search for persons with an email address and/or title. Email address needs to match exactly.
I have indexed a document with the following data. Calling GET person/_search in Kibana will yield the following document in the result.
{
"took": 0,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 1,
"hits": [
{
"_index": "person",
"_type": "_doc",
"_id": "101",
"_score": 1,
"_source": {
"title": """["Actor", "Hero", "Model"]""",
"email": """["jdepp#hotmail.com", "johnny#hollywood.com", "jdepp#gmail.com", "johnny.depp#yahoo.com"]""",
"SEARCH_ENTITY": "PERSON"
}
}
]
}
}
Now when I add an email search parameter, I don't get the document back in the result. Remember that email is of type keyword.
Request:
GET person/_search
{
"query" : {
"query_string" : {
"query" : "SEARCH_ENTITY:PERSON AND (email: (johnny.depp#yahoo.com))"
}
}
}
Response:
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 0,
"max_score": null,
"hits": []
}
}
But the same kind of query works for the title field, which is of type text.
Request:
GET person/_search
{
"query" : {
"query_string" : {
"query" : "SEARCH_ENTITY:PERSON AND (title: ((actor)))"
}
}
}
Response:
{
"took": 3,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 20.137747,
"hits": [
{
"_index": "person",
"_type": "_doc",
"_id": "101",
"_score": 20.137747,
"_source": {
"ID": "101",
"title": """["Actor", "Hero", "Model"]""",
"email": """["jdepp#hotmail.com", "johnny#hollywood.com", "jdepp#gmail.com", "johnny.depp#yahoo.com"]"""
}
}
]
}
}
Can someone tell me what I need to do to make this work for the email field, which is of type keyword?
Note: If I store only one email address without using an array, it works fine.
Thanks.
Make sure you parse the JSON array strings in title and email into real arrays before you index your docs, like so:
POST person/_doc/101
{
"title": [
"Actor",
"Hero",
"Model"
],
"email": [
"jdepp#hotmail.com",
"johnny#hollywood.com",
"jdepp#gmail.com",
"johnny.depp#yahoo.com"
],
"SEARCH_ENTITY": "PERSON"
}
Nothing needs to be changed about the mapping -- just the field values.
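If the documents are built in application code, the parsing step could look roughly like the Python sketch below. The function name parse_array_fields and the shortened sample document are assumptions for illustration. The idea is only that each stringified array is turned into a real list before the document is sent to Elasticsearch.

```python
import json


def parse_array_fields(doc, fields=("title", "email")):
    """Return a copy of doc where JSON array strings become real lists.

    Hypothetical helper; json.loads raises ValueError if a string
    is not valid JSON, which is left unhandled here on purpose.
    """
    parsed = dict(doc)
    for name in fields:
        value = parsed.get(name)
        if isinstance(value, str):
            parsed[name] = json.loads(value)
    return parsed


raw = {
    "title": '["Actor", "Hero", "Model"]',
    "email": '["jdepp#hotmail.com", "johnny.depp#yahoo.com"]',
    "SEARCH_ENTITY": "PERSON",
}
doc = parse_array_fields(raw)
print(doc["email"])  # ['jdepp#hotmail.com', 'johnny.depp#yahoo.com']
```

The resulting dict can then be indexed as the request body shown above, so each email address becomes its own keyword value.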

Elasticsearch range query not working as expected

I am trying to fetch data by applying a range on a date type field ("timeA" in this case).
My query is:
{
"query": {
"bool": {
"must": [
{
"match_phrase": {
"name": "A"
}
},
{
"range": {
"timeA": {
"lte": 9999
}
}
}
]
}
}
}
I don't have any data less than 1558891800000 in the timeA field.
So the expected output should be:
{
"took": 0,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 0,
"max_score": null,
"hits": []
}
}
But the actual output I'm getting is:
{
"took": 0,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 1.287682,
"hits": [
{
"_index": "checktimestamp",
"_type": "doc",
"_id": "AWr4sdJv_fFf5JZrQhXl",
"_score": 1.287682,
"_source": {
"name": "A",
"timeA": 1558899000000,
"timeLocal": "27-1AM"
}
}
]
}
}
The type of the timeA field is date.
My elasticsearch version is 5.6.10 and Kibana version is 5.6.10.
Please suggest what the problem is here and how I can resolve it.
Thanks in advance.
Elasticsearch parses the 4 digits as a year, meaning the query matches documents with a year less than or equal to 9999, which I'm assuming is all your data.
To avoid this, you need to define a strict format for your date field in your mapping; this will not allow a "yyyy" format to sneak in.
Alternatively, don't use numbers with fewer than 5 digits in those queries.
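As a rough sketch of the first suggestion, the Python snippet below builds an index-creation body that restricts timeA to the built-in epoch_millis format. The function name is hypothetical, and the type name "doc" is taken from the search response in the question. Note that the mapping of an existing field cannot be changed in place, so this would have to be applied to a new index and the data reindexed.

```python
import json


def strict_epoch_mapping(doc_type="doc", field="timeA"):
    """Build an index body whose date field only accepts epoch milliseconds.

    Hypothetical helper; the body uses the typed mapping layout of
    Elasticsearch 5.x, matching the version mentioned in the question.
    """
    return {
        "mappings": {
            doc_type: {
                "properties": {
                    field: {"type": "date", "format": "epoch_millis"}
                }
            }
        }
    }


# Print the body that would be sent when creating the new index.
print(json.dumps(strict_epoch_mapping(), indent=2))
```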

Append to array in Elasticsearch

I am currently struggling a bit with how to append a value to an array in Elasticsearch.
The document looks something like this:
{
"took": 11,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 1,
"hits": [
{
"_index": "iethreads",
"_type": "thread",
"_id": "AVRk6WRMU5h_y_zwo4s0",
"_score": 1,
"fields": {
"links": [
"[\"https://somelink123.net/thread-714222&page=1\", \"https://somelink123.net/thread-714222&page=2\", \"https://somelink123.net/thread-714222&page=3\", \"https://somelink123.net/thread-714222&page=4\"]"
]
}
}
]
}
}
Then I run the following update query:
POST _update
{
"script" : "ctx._source.links+=new_posts",
"params" : {
"new_posts":"blabliblub"
}
}
and I get this:
{
"took": 11,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 1,
"hits": [
{
"_index": "iethreads",
"_type": "thread",
"_id": "AVRk6WRMU5h_y_zwo4s0",
"_score": 1,
"fields": {
"links": [
"[\"https://somelink123.net/thread-714222&page=1\", \"https://somelink123.net/thread-714222&page=2\", \"https://somelink123.net/thread-714222&page=3\", \"https://somelink123.net/thread-714222&page=4\"]blabliblub"
]
}
}
]
}
}
So to me it looks like the array is treated as a string and the new value is simply appended to that string, which is not what I want.
How would I append "blabliblub" as a new element to the array?
It seems your links field actually holds a single string element rather than an array of links. For your update to succeed, your structure must look like this:
"fields": {
"links": [
"https://somelink123.net/thread-714222&page=1",
"https://somelink123.net/thread-714222&page=2",
"https://somelink123.net/thread-714222&page=3",
"https://somelink123.net/thread-714222&page=4"
]
}
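If the existing documents need to be repaired from client code, one possible approach is sketched below in Python. The function name append_link and the shortened sample data are assumptions for illustration. The sketch unpacks any element that is itself a JSON array string and only then appends the new value.

```python
import json


def append_link(links, new_link):
    """Flatten stringified JSON arrays in links, then append new_link.

    Hypothetical helper; an element is treated as a stringified array
    only if it starts with "[".
    """
    repaired = []
    for item in links:
        if isinstance(item, str) and item.lstrip().startswith("["):
            repaired.extend(json.loads(item))
        else:
            repaired.append(item)
    repaired.append(new_link)
    return repaired


stored = [
    '["https://somelink123.net/thread-714222&page=1", '
    '"https://somelink123.net/thread-714222&page=2"]'
]
print(append_link(stored, "blabliblub"))
```

The returned list has one element per link plus the new value, which is the shape the update script expects.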

How to filter out elements from an array that doesn’t match the query?

Stack Overflow won't let me post that much example code, so I put it on Gist.
I have this index
with this mapping.
Here is a sample document I insert into the newly created mapping,
and this is my query:
GET products/paramSuggestions/_search
{
"size": 10,
"query": {
"filtered": {
"query": {
"match": {
"paramName": {
"query": "col",
"operator": "and"
}
}
}
}
}
}
This is the unwanted result I get from the previous query:
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.33217794,
"hits": [
{
"_index": "products",
"_type": "paramSuggestions",
"_id": "1",
"_score": 0.33217794,
"_source": {
"productName": "iphone 6",
"params": [
{
"paramName": "color",
"value": "white"
},
{
"paramName": "capacity",
"value": "32GB"
}
]
}
}
]
}
}
And finally, this is what I want the query result to look like:
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.33217794,
"hits": [
{
"_index": "products",
"_type": "paramSuggestions",
"_id": "1",
"_score": 0.33217794,
"_source": {
"productName": "iphone 6",
"params": [
{
"paramName": "color",
"value": "white"
}
]
}
}
]
}
}
What should the query look like to achieve the wanted result, where the array field is filtered down to the items that match the query? In other words, all non-matching array items should not appear in the final result.
The final result is the _source document that you indexed. There is no feature that lets you mask field elements of your document out of the Elasticsearch response.
That said, depending on your goal, you can look into how Highlighters and Suggesters identify result terms matching the query, or possibly, roll-your-own client-side masking using info returned from setting "explain": true in your query.
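A roll-your-own client-side mask could be as simple as the Python sketch below. The function name mask_params is hypothetical, and the case-insensitive prefix test is an assumption that stands in for whatever analyzer the real mapping uses, since the mapping itself is not shown here.

```python
def mask_params(hit, prefix):
    """Return a copy of a search hit keeping only params that match prefix.

    Hypothetical helper; the original hit is left unmodified.
    """
    source = dict(hit["_source"])
    wanted = prefix.lower()
    source["params"] = [
        p for p in source.get("params", [])
        if p.get("paramName", "").lower().startswith(wanted)
    ]
    return {**hit, "_source": source}


hit = {
    "_id": "1",
    "_source": {
        "productName": "iphone 6",
        "params": [
            {"paramName": "color", "value": "white"},
            {"paramName": "capacity", "value": "32GB"},
        ],
    },
}
print(mask_params(hit, "col")["_source"]["params"])
# [{'paramName': 'color', 'value': 'white'}]
```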

ElasticSearch Scripting: check if array contains a value

Let's say I have created a document like this:
PUT idx/type/1
{
"the_field": [1,2,3]
}
I can retrieve my document using GET /idx/type/1:
{
"_index": "idx",
"_type": "type",
"_id": "1",
"_version": 1,
"found": true,
"_source": {
"the_field": [
1,
2,
3
]
}
}
Now, I want to check if the field "the_field" contains the value 2.
I know I can use a term clause, but I need to check this using a filter script, so I tried:
POST /idx/typ/_search
{
"query": {
"match_all": {}
},
"filter": {
"script": {
"script": "doc['the_field'].values.contains(2)"
}
}
}
and get no results:
{
"took": 6,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 0,
"max_score": null,
"hits": []
}
}
To test whether my MVEL script syntax is right, I tried doing this:
POST /idx/type/_search
{
"query": {
"match_all": {}
},
"filter": {
"script": {
"script": "[1,2,3].contains(3)"
}
}
}
and get the right results:
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 1,
"hits": [
{
"_index": "idx",
"_type": "type",
"_id": "1",
"_score": 1,
"_source": {
"the_field": [
1,
2,
3
]
}
}
]
}
}
What am I doing wrong?
I think doc['the_field'].values should return [1,2,3], shouldn't it? If so, my code should work.
Can anybody help me?
Thank you!
UPDATE
When I replace all the [1, 2, 3] in my code with ["a", "b", "c"], it works. Any idea why?
It is working with "a", "b", "c" because the_field is being stored in Elasticsearch by default as a string and not an integer. You can validate by checking the mapping with:
$ curl -XGET 'http://localhost:9200/idx/type/_mapping'
The following should set the appropriate field type:
$ curl -XPUT 'http://localhost:9200/idx/type/_mapping' -d '
{
"type" : {
"properties" : {
"the_field" : {"type" : "integer" }
}
}
}'
Update the mapping, re-index your data and see if that works. Please see the PUT Mapping API for additional guidance if needed.
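If the values really do reach the script as strings, as this answer suggests, then a membership test against the number 2 cannot succeed, because the comparison is type-sensitive. The Python sketch below only illustrates that general point. It does not reproduce how Elasticsearch stores or loads field values.

```python
def contains(values, needle):
    """Type-sensitive membership test, comparable to a list contains check."""
    return needle in values


stored_as_strings = ["1", "2", "3"]
stored_as_integers = [1, 2, 3]

print(contains(stored_as_strings, 2))    # False: a number never equals a string
print(contains(stored_as_strings, "2"))  # True
print(contains(stored_as_integers, 2))   # True
```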
