Elasticsearch: adding temporary field to result with Groovy

I have a large configuration of queries and filters, which works fine.
Now I'm adding a new script filter in Groovy, which also works fine:
doc['age'].value >= 18;
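A script filter like this sits in the query roughly as follows (the surrounding structure here is illustrative, not the actual configuration):
{
  "query": {
    "bool": {
      "filter": {
        "script": {
          "script": {
            "inline": "doc['age'].value >= 18",
            "lang": "groovy"
          }
        }
      }
    }
  }
}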
But I'm wondering how to do the following with Groovy:
add a temporary boolean field to the current document, as in the example below.
Example document in my result:
{
  "name": "foo",
  "age": 20
}
But I want the result of the script filter added to my result, like so:
{
  "name": "foo",
  "age": 20,
  "age_ok": true
}
age_ok is not indexed, but set by the Groovy filter.

AFAIK, you can't inject the result of a script filter into the search result, but you can use script fields to inject scripted data. You'll have to duplicate part of the script.
From the documentation:
{
  "query" : {
    ...
  },
  "script_fields" : {
    "test1" : {
      "script" : "doc['my_field_name'].value * 2"
    },
    "test2" : {
      "script" : {
        "inline": "doc['my_field_name'].value * factor",
        "params" : {
          "factor" : 2.0
        }
      }
    }
  }
}
See https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-script-fields.html
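Applied to the question, a minimal sketch duplicates the filter's logic in a script field named age_ok (the field name and the surrounding query are assumptions for illustration):
{
  "query" : {
    "bool" : {
      "filter" : {
        "script" : {
          "script" : {
            "inline" : "doc['age'].value >= 18",
            "lang" : "groovy"
          }
        }
      }
    }
  },
  "script_fields" : {
    "age_ok" : {
      "script" : {
        "inline" : "doc['age'].value >= 18",
        "lang" : "groovy"
      }
    }
  }
}
Note that script fields come back under fields in each hit rather than inside _source, so the client reads hit.fields.age_ok instead of a top-level age_ok property.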

Related

Trigger an action for each hit of Elasticsearch query in Kibana Monitor

Is it possible to trigger an action for each hit of a given query in a Kibana Monitor? I would like to use a foreach loop to do this, as demonstrated here. However, it's unclear how to implement this on the Kibana Monitor page. The page has an input field for Trigger Conditions, but I'm unsure how to format the foreach within it, or whether this is supported at all.
Consider using Elasticsearch Watcher (requires at least a Gold license): https://www.elastic.co/guide/en/elasticsearch/reference/current/how-watcher-works.html
Watcher runs on a configured interval and performs a query against your indices. You define a condition (e.g., the number of hits is greater than 5); when it evaluates to true, an action is performed. Elasticsearch allows multiple actions. For example, you can use a webhook and receive the data from the last watcher run (you can also use the watcher API to transform the data). If you don't have a Gold license, you can mimic watcher behavior with a script or program that calls the Elasticsearch Search API (see the sketch after the example below).
Here is a simple example of a watcher that checks an index named test every minute and sends a webhook with the entire search context when there is at least one matching document.
{
  "trigger" : {
    "schedule" : { "interval" : "1m" }
  },
  "input" : {
    "search" : {
      "request" : {
        "indices" : [ "test" ],
        "body" : {
          "query" : {
            "bool" : {
              "must" : {
                "range" : {
                  "updatedAt" : {
                    "gte" : "now-1m"
                  }
                }
              }
            }
          }
        }
      }
    }
  },
  "condition" : {
    "compare" : { "ctx.payload.hits.total" : { "gt" : 0 } }
  },
  "actions" : {
    "sample_webhook" : {
      "webhook" : {
        "method" : "POST",
        "url" : "http://b4022015b928.ngrok.io/UtilsService/api/elasticHandler/watcher",
        "body" : "{{#toJson}}ctx.payload{{/toJson}}",
        "auth" : {
          "basic" : {
            "user" : "user",
            "password" : "pass"
          }
        }
      }
    }
  }
}
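Without a Gold license, the same check can be approximated by running the watcher's search yourself on a schedule (for example from cron) and calling the webhook when there are hits; the query body is the same one the watcher uses (index name illustrative):
GET test/_search
{
  "query" : {
    "bool" : {
      "must" : {
        "range" : {
          "updatedAt" : {
            "gte" : "now-1m"
          }
        }
      }
    }
  }
}
If hits.total in the response is greater than 0, the script performs the action itself, e.g. POSTs the payload to the webhook.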
An alternative is to use Kibana Alerts and Actions:
https://www.elastic.co/guide/en/kibana/current/alerting-getting-started.html
This feature is slightly different from Watcher, but it likewise lets you perform actions based on a query against Elasticsearch. It is part of Kibana, as opposed to Watcher, which is part of Elasticsearch (though Watcher is accessible from Kibana Stack Management).

Is there a way to update a document with a Painless script without changing the order of unaffected fields?

I'm using Elasticsearch's Update by Query API to update some documents with a Painless script like this (the actual query is more complicated):
POST ts-scenarios/_update_by_query?routing=test
{
  "query": {
    "term": { "routing": { "value": "test" } }
  },
  "script": {
    "source": """ctx._source.tagIDs = ["5T8QLHIBB_kDC9Ugho68"]"""
  }
}
This works, except that upon reindexing, other fields get reordered, including some classes which are automatically (de)serialized using JSON.NET's type handling. That means a document with the following source before the update:
{
  "routing" : "testsuite",
  "activities" : [
    {
      "$type" : "Test.Models.SomeActivity, Test"
    },
    {
      "$type" : "Test.Models.AnotherActivity, Test",
      "CustomParameter" : 1,
      "CustomSetting" : false
    }
  ]
}
ends up as
{
  "routing" : "testsuite",
  "activities" : [
    {
      "$type" : "Test.Models.SomeActivity, Test"
    },
    {
      "CustomParameter" : 1,
      "CustomSetting" : false,
      "$type" : "Test.Models.AnotherActivity, Test"
    }
  ],
  "tagIDs" : [
    "5T8QLHIBB_kDC9Ugho68"
  ]
}
which JSON.NET can't deserialize. Is there a way I can tell the script (or the Update by Query API) not to change the order of those other fields?
In case it matters, I'm using Elasticsearch OSS version 7.6.1 on macOS. I haven't checked whether an Ingest pipeline would work here, as I'm not familiar with them.
(It turns out I can make the deserialization more flexible by setting JSON.NET's MetadataPropertyHandling property to ReadAhead, as mentioned here. That works, but as mentioned it may hurt performance, and there might be other situations where field order matters. Technically it shouldn't; JSON isn't XML, but there are always edge cases where it does.)

Problems accessing _source fields with a dot in the name when creating a Slack action for Elasticsearch Watcher

I am trying to create a Slack action with a dynamic attachment. My _source looks like this:
{
  "user.url": "https://api.github.com/users/...",
  "user.gists_url": "https://api.github.com/users/.../gists{/gist_id}",
  "user.repos_url": "https://api.github.com/users/.../repos",
  "date": "2018-04-27T14:34:10Z",
  "user.followers_url": "https://api.github.com/users/.../followers",
  "user.following_url": "https://api.github.com/users/.../following{/other_user}",
  "user.id": 123456,
  "user.avatar_url": "https://avatars0.githubusercontent.com/u/123456?v=4",
  "user.events_url": "https://api.github.com/users/.../events{/privacy}",
  "user.site_admin": false,
  "user.html_url": "https://github.com/...",
  "user.starred_url": "https://api.github.com/users/.../starred{/owner}{/repo}",
  "user.received_events_url": "https://api.github.com/users/.../received_events",
  "metric": "stars",
  "user.login": "...",
  "user.type": "User",
  "user.subscriptions_url": "https://api.github.com/users/.../subscriptions",
  "user.organizations_url": "https://api.github.com/users/.../orgs",
  "user.gravatar_id": ""
}
and here is my Slack action
"actions": {
"notify-slack": {
"throttle_period_in_millis": 240000,
"slack": {
"account": "monitoring",
"message": {
"from": "Elasticsearch Watcher",
"to": [
"#watcher"
],
"text": "We have {{ctx.payload.new.hits.total}} new stars! And {{ctx.payload.old.hits.total}} in total.",
"dynamic_attachments" : {
"list_path" : "ctx.payload.new.hits.hits",
"attachment_template" : {
"title" : "{{_source.[\"user.login\"]}}",
"text" : "Users Count: {{count}}",
"color" : "{{color}}"
}
}
}
}
}
I can't seem to figure out how to access my _source fields since they have dots in them. I have tried:
"{{_source.[\"user.login\"]}}"
"{{_source.user.login}}"
"{{_source.[user.login]}}"
"{{_source.['user.login']}}"
The answer to my question is that you can't directly access _source keys that contain dots using mustache; you must first transform your data.
Update:
I was able to get this working by using a transform to build a new object. Mustache might not be able to access fields with dots in their names, but Painless can! I added this transform to my slack object:
"transform" : {
"script" : {
"source" : "['items': ctx.payload.new.hits.hits.collect(user -> ['userName': user._source['user.login']])]",
"lang" : "painless"
}
}
and now, in the Slack action's dynamic attachments, I can access the items array:
"dynamic_attachments" : {
"list_path" : "ctx.payload.items",
"attachment_template" : {
"title" : "{{userName}}",
"text" : "{{_source}}"
}
}
Old answer:
According to this, Watcher uses Mustache, and according to this, Mustache can't access fields with dots in their names.

Invalid JSON while pushing a search template to Elasticsearch

I am developing a webapp for which I push my search templates to ES during startup and use them to build the search queries at runtime. I have a requirement where I don't know in advance the number of filters to be applied. I created a search template like this:
{
  "filters" : {
    {{#toJson}}
    clauses
    {{/toJson}}
  }
}
And the search will be made like this:
GET _search/template
{
  "id": "template-id",
  "params": {
    "clauses": {
      "filters" : {
        { "match": { "user" : "foo" } },
        { "match": { "user" : "bar" } }
      }
    }
  }
}
which will render the result as:
{
  "filters" : {
    "filters" : {
      "match" : {
        "user" : "foo"
      }
    },
    {
      "match" : {
        "user" : "bar"
      }
    }
  }
}
as suggested by the ES documentation:
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-template.html
But since this is invalid JSON, it doesn't allow me to push the template to ES.
My template works well when I use it as a stored template in elastic-home/config/scripts, but I want to manage my templates from Java and push all of them during startup only.
Can anyone help?
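One way this is commonly handled (a sketch assuming a reasonably recent ES version; the template id and body are illustrative) is to register the template through the _scripts endpoint with the mustache source wrapped in a single JSON string, so the stored request itself is valid JSON even though the rendered body is assembled by mustache:
POST _scripts/template-id
{
  "script": {
    "lang": "mustache",
    "source": "{ \"query\": { \"bool\": { \"filter\": {{#toJson}}clauses{{/toJson}} } } }"
  }
}
The clauses parameter is then passed as a JSON array in params, and toJson serializes it into the string before the rendered body is parsed.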

Spring Data MongoDB aggregation match

After asking a question to understand the MongoDB aggregation framework a bit better, I finally found a way to do the aggregation I need (thanks to a Stack Exchange user).
So basically, here is a document from my collection:
{
  "_id" : ObjectId("s4dcsd5s4d6c54s6d"),
  "items" : [
    {
      type : "TYPE_1",
      text : "blablabla"
    },
    {
      type : "TYPE_2",
      text : "blablabla"
    },
    {
      type : "TYPE_3",
      text : "blablabla"
    },
    {
      type : "TYPE_1",
      text : "blablabla"
    },
    {
      type : "TYPE_2",
      text : "blablabla"
    },
    {
      type : "TYPE_1",
      text : "blablabla"
    }
  ]
}
The idea is to filter the elements of my collection, excluding TYPE_2 and TYPE_3. In fact, I have more than 30 types, 6 of which are not allowed, but I made this example for simplicity.
So the aggregation command on the command line is this one:
db.history.aggregate([
  { $match: { _id: ObjectId("s4dcsd5s4d6c54s6d") } },
  { $unwind: '$items' },
  { $match: { 'items.type': { '$nin': [ "TYPE_2", "TYPE_3" ] } } },
  { $limit: 10 }
]);
With this I am able to retrieve the 10 item elements of this document that do not match TYPE_2 or TYPE_3.
However, when I use Spring Data, there is no output. I looked at some examples to build mine, but it's still not working.
So I did:
Aggregation aggregation = newAggregation(
    match(Criteria.where("id").is(myID)),
    unwind("items"),
    match(Criteria.where("items.type").nin(ignoreditemstype)),
    limit(3),
    skip(offsetLong)
);
AggregationResults<PersonnalHistory> results = mongAccess.getOperation().aggregate(
    aggregation, "history", PersonnalHistory.class);
PersonnalHistory is annotated with @Document(collection = "history") and its id with the @Id annotation.
ignoreditemstype is a list containing TYPE_2 and TYPE_3.
Here is what I have in the toString method of aggregation:
{
  "aggregate" : "__collection__",
  "pipeline" : [
    { "$match" : { "id" : "s4dcsd5s4d6c54s6d" } },
    { "$unwind" : "$items" },
    { "$match" : { "items.type" : { "$nin" : [ "TYPE_2", "TYPE_3" ] } } },
    { "$limit" : 3 },
    { "$skip" : 0 }
  ]
}
I tried a lot of things (to get at least an answer :)), like removing the id criterion or the nin:
aggregation = newAggregation(
    unwind("items"),
    match(Criteria.where("items.type").nin(ignoreditemstype)),
    limit(3),
    skip(offsetLong)
);
aggregation = newAggregation(
    match(Criteria.where("id").is(myid)),
    unwind("items")
);
For information, when I do a simple query like:
query.addCriteria(Criteria.where("id").is(myID));
my document is returned. However, it has thousands of items, so I just want the first 15 (in fact, the first 15 are the 15 most recently added).
Do you see what I am doing wrong?
It looks like you are passing a plain String while an ObjectId is expected:
Aggregation aggregation = newAggregation(
    match(Criteria.where("_id").is(new ObjectId(myID))),
    unwind("items"),
    match(Criteria.where("items.type").nin(ignoreditemstype)),
    limit(3),
    skip(offsetLong)
);
As to why it works with a simple query: my answer would be that the Spring Data driver is not that mature, at least not with the aggregation pipeline.
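With that change, the serialized pipeline matches on a real ObjectId instead of a plain string; in shell-style notation it should look roughly like this (a sketch based on the toString output above):
{
  "aggregate" : "__collection__",
  "pipeline" : [
    { "$match" : { "_id" : ObjectId("s4dcsd5s4d6c54s6d") } },
    { "$unwind" : "$items" },
    { "$match" : { "items.type" : { "$nin" : [ "TYPE_2", "TYPE_3" ] } } },
    { "$limit" : 3 },
    { "$skip" : 0 }
  ]
}
Note the field is _id here, not id: in a plain Query, Spring Data maps the @Id property to _id and converts the String automatically, which is why the simple query worked, whereas the aggregation $match takes the criteria as-is.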
