elasticsearch bulk delete by custom field values

I'm building an app with Elasticsearch (5.4), and everything was going well until I tried to delete several documents by field value. My x-ndjson body looks like this:
{ "delete" : {} }
{ "id" : "109991" }
{ "delete" : {} }
{ "id" : "109992" }
{ "delete" : {} }
{ "id" : "109993" }
<- empty line
I am POSTing it to http://localhost:9200/someindex/sometype/_bulk, but it responds with "Malformed action/metadata line [2], expected START_OBJECT or END_OBJECT but found [VALUE_NUMBER]".
Note that "id" is my own custom field, not the _id.
Is something missing from my request?
Thank you.

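The bulk API's delete action can only delete by _id, which goes in the action line itself, and it is not followed by a source line. That is why line 2 is rejected as a malformed action line. To delete by a custom field, you need to use Delete By Query:
POST index/_delete_by_query
{
  "query": {
    "terms": {
      "id": [
        109991,
        109992
      ]
    }
  }
}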
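If the list of values is produced in application code, the request body can be assembled programmatically. The following is a minimal sketch in Python; the helper name build_delete_by_query is made up for illustration, and the resulting body would still have to be POSTed to index/_delete_by_query with whatever HTTP client you use.

```python
import json


def build_delete_by_query(field, values):
    """Build a _delete_by_query body that matches documents whose
    `field` equals any of `values` (a terms query)."""
    return {"query": {"terms": {field: list(values)}}}


# "id" is the custom field from the question, not the document _id.
body = build_delete_by_query("id", [109991, 109992, 109993])
print(json.dumps(body, indent=2))
```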

Related

Problems accessing _source fields with a dot in the name when creating Slack action for Elasticsearch Watcher

I am trying to create a Slack action with a dynamic attachment. My _source looks like this:
{
"user.url": "https://api.github.com/users/...",
"user.gists_url": "https://api.github.com/users/.../gists{/gist_id}",
"user.repos_url": "https://api.github.com/users/.../repos",
"date": "2018-04-27T14:34:10Z",
"user.followers_url": "https://api.github.com/users/.../followers",
"user.following_url": "https://api.github.com/users/.../following{/other_user}",
"user.id": 123456,
"user.avatar_url": "https://avatars0.githubusercontent.com/u/123456?v=4",
"user.events_url": "https://api.github.com/users/.../events{/privacy}",
"user.site_admin": false,
"user.html_url": "https://github.com/...",
"user.starred_url": "https://api.github.com/users/.../starred{/owner}{/repo}",
"user.received_events_url": "https://api.github.com/users/.../received_events",
"metric": "stars",
"user.login": "...",
"user.type": "User",
"user.subscriptions_url": "https://api.github.com/users/.../subscriptions",
"user.organizations_url": "https://api.github.com/users/.../orgs",
"user.gravatar_id": ""
}
and here is my Slack action
"actions": {
"notify-slack": {
"throttle_period_in_millis": 240000,
"slack": {
"account": "monitoring",
"message": {
"from": "Elasticsearch Watcher",
"to": [
"#watcher"
],
"text": "We have {{ctx.payload.new.hits.total}} new stars! And {{ctx.payload.old.hits.total}} in total.",
"dynamic_attachments" : {
"list_path" : "ctx.payload.new.hits.hits",
"attachment_template" : {
"title" : "{{_source.[\"user.login\"]}}",
"text" : "Users Count: {{count}}",
"color" : "{{color}}"
}
}
}
}
}
I can't seem to figure out how to access my _source fields since they have dots in them. I have tried:
"{{_source.[\"user.login\"]}}"
"{{_source.user.login}}"
"{{_source.[user.login]}}"
"{{_source.['user.login']}}"

The answer to my question is that you can't access _source keys with dots in them directly using Mustache; you must first transform your data.
Update:
I was able to get this working by using a transform to build a new object. Mustache might not be able to access fields with dots in their names, but Painless can! I added this transform to my slack object:
"transform" : {
"script" : {
"source" : "['items': ctx.payload.new.hits.hits.collect(user -> ['userName': user._source['user.login']])]",
"lang" : "painless"
}
}
and now in the slack action dynamic attachments, I can access the items array:
"dynamic_attachments" : {
"list_path" : "ctx.payload.items",
"attachment_template" : {
"title" : "{{userName}}",
"text" : "{{_source}}"
}
}
Old Answer:
Watcher templates use Mustache, and Mustache can't access fields with dots in their names.
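To make the reshaping done by the Painless transform above easier to follow, here is a rough Python equivalent. It is only an illustration of the data flow, not something Watcher runs; the sample hits and the function name to_items are invented for the example.

```python
def to_items(hits):
    """Mirror the Painless transform: copy the dotted key "user.login"
    from each hit's _source into a plain "userName" key."""
    return {"items": [{"userName": hit["_source"]["user.login"]} for hit in hits]}


# Made-up search hits shaped like ctx.payload.new.hits.hits.
hits = [
    {"_source": {"user.login": "alice", "metric": "stars"}},
    {"_source": {"user.login": "bob", "metric": "stars"}},
]

payload = to_items(hits)
print(payload)
```

After this step each item exposes userName as an ordinary key, which is why {{userName}} resolves in the attachment template.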

Invalid json while pushing search template to ElasticSearch

I am developing a webapp that pushes its search templates to ES during startup and uses them to build Elasticsearch queries at runtime. One requirement is that the number of filters to apply is not known in advance, so I created a search template like this:
{
"filters" : {
{{#toJson}}
clauses
{{/toJson}}"
}
}
The search is then made like this:
GET _search/template
{
"id": "template-id",
"params": {
"clauses": {
"filters" : {
{ "match": { "user" : "foo" } },
{ "match": { "user" : "bar" } }
}
}
}
which renders as:
{
"filters":{
"filters":{
"match" : {
"user" : "foo"
}
},
{
"match" : {
"user" : "bar"
}
}
}
}
as suggested by the ES documentation:
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-template.html
But since the template itself is not valid JSON, I cannot push it to ES.
The template works well when I use it as a stored template in elastic-home/config/scripts, but I want to manage my templates from Java and push all of them at startup.
Can anyone help?
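One common way around this is to send the template as a string value, so that the request body stays valid JSON even though the template text is not. The sketch below shows the idea in Python using only the standard library. The template text is a simplified stand-in for the one in the question, and the {"script": {"lang": "mustache", "source": ...}} wrapper is an assumption about the stored-script API of newer releases; older versions expect a different body, so check the documentation for your version.

```python
import json

# Mustache template text; the toJson section makes it invalid as plain JSON.
template_source = '{ "filters": {{#toJson}}clauses{{/toJson}} }'


def is_valid_json(text):
    """Return True if `text` parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except ValueError:
        return False


# Assumed wrapper format: the template travels as an escaped string.
body = {"script": {"lang": "mustache", "source": template_source}}
request_body = json.dumps(body)

print(is_valid_json(template_source))
print(is_valid_json(request_body))
```

The same approach carries over to Java: serialize the template text as a JSON string field instead of embedding it as a nested object.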

Springdata mongodb aggregation match

After asking a question to understand the aggregation framework in MongoDB a bit better, I finally found a way to do the aggregation I need (thanks to a StackExchange user).
Here is a document from my collection:
{
"_id" : ObjectId("s4dcsd5s4d6c54s6d"),
"items" : [
{
type : "TYPE_1",
text : "blablabla"
},
{
type : "TYPE_2",
text : "blablabla"
},
{
type : "TYPE_3",
text : "blablabla"
},
{
type : "TYPE_1",
text : "blablabla"
},
{
type : "TYPE_2",
text : "blablabla"
},
{
type : "TYPE_1",
text : "blablabla"
}
]
}
The idea is to return only some of the items in a document (skipping TYPE_2 and TYPE_3). In reality I have more than 30 types, 6 of which are not allowed, but I simplified this example.
The aggregation command in the shell is:
db.history.aggregate([{
$match: {
_id: ObjectId("s4dcsd5s4d6c54s6d")
}
}, {
$unwind: '$items'
}, {
$match: {
'items.type': { '$nin': [ "TYPE_2" , "TYPE_3"] }
}
},
{ $limit: 10 }
]);
With this I am able to retrieve 10 items of this document that are not of type TYPE_2 or TYPE_3.
However, when I use Spring Data there is no output. I looked at the examples to build my own, but it still does not work.
So I did:
Aggregation aggregation = newAggregation(
match(Criteria.where("id").is(myID)),
unwind("items"),
match(Criteria.where("items.type").nin(ignoreditemstype)),
limit(3),
skip(offsetLong)
);
AggregationResults<PersonnalHistory> results = mongAccess.getOperation().aggregate(query,
"items", PersonnalHistory.class);
PersonnalHistory is annotated with @Document(collection = "history"), and its id field with @Id.
ignoreditemstype is a list containing TYPE_2 and TYPE_3.
Here is what the toString method of the aggregation returns:
{
"aggregate" : "__collection__" ,
"pipeline" : [
{ "$match": { "id" : "s4dcsd5s4d6c54s6d"} },
{ "$unwind": "$items"},
{ "$match": { "items.type": { "$nin" : [ "TYPE_2" , "TYPE_3" ] } } },
{ "$limit" : 3},
{ "$skip" : 0 }
]
}
I tried a lot of things (to get at least some result :) ), such as removing the id match or the nin:
aggregation = newAggregation(
unwind("items"),
match(Criteria.where("items.type").nin(ignoreditemstype)),
limit(3),
skip(offsetLong)
);
aggregation = newAggregation(
match(Criteria.where("id").is(myid)),
unwind("items")
);
For information, when I run a simple query like:
query.addCriteria(Criteria.where("id").is(myID));
my document is returned. However, it has thousands of items, so I only want the first 15 (in fact, the first 15 are the 15 most recently added).
Can you see what I am doing wrong?

It looks like you are passing a plain String where an ObjectId is expected:
Aggregation aggregation = newAggregation(
match(Criteria.where("_id").is(new ObjectId(myID))),
unwind("items"),
match(Criteria.where("items.type").nin(ignoreditemstype)),
limit(3),
skip(offsetLong)
);
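As for why it works with a simple query: a likely explanation is that the simple query path converts the id property to an ObjectId _id for you, while the aggregation pipeline passes the criteria through as written (the toString output in the question shows "id" with a plain string). Spring Data's MongoDB support is not that mature yet, at least not for the aggregation pipeline.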

elasticsearch - how to query relations

I have the current structure:
localhost:9200/objects/content
{
id:1,
author:{
name:"john"
},
body:"abc"
}
localhost:9200/objects/reaction
{
content_id:1
message:'I like it'
}
How can I query for all reactions to contents written by "john"?
This means a query on reactions that checks whether the content referenced by content_id has a given author.

This Elasticsearch blog post describes how to manage relationships inside Elasticsearch.
You will need to set a parent mapping between your reactions and your content.
{
"reaction" : {
"_parent" : {
"type" : "content"
}
}
}
You will then index your reaction as a child of content id 1:
curl -XPOST 'localhost:9200/objects/reaction?parent=1' -H 'Content-Type: application/json' -d '
{
  "message": "I like it"
}'
You can then use a Has Parent Query to retrieve all reactions to authors named john:
{
  "has_parent" : {
    "parent_type" : "content",
    "query" : {
      "term" : {
        "author.name" : "john"
      }
    }
  }
}

I would recommend adding some redundancy to your model:
localhost:9200/objects/reaction
{
  content_id: 1,
  message: 'I like it',
  author_name: 'john'
}
This will increase the size of the index, of course. On the other hand, the query for all reactions to contents written by "john" becomes simple and fast.
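With the author name copied onto each reaction, the lookup reduces to a single term query on the reaction documents. A minimal Python sketch of building that query body follows; the helper name reactions_by_author is hypothetical, and it assumes author_name is indexed as an exact (not analyzed) value so that a term query matches it.

```python
import json


def reactions_by_author(author_name):
    """Build a search body that finds reactions by the denormalized
    author_name field, with no parent/child lookup involved."""
    return {"query": {"term": {"author_name": author_name}}}


search_body = reactions_by_author("john")
print(json.dumps(search_body, indent=2))
```

The trade-off is that author_name has to be rewritten on every reaction if an author is ever renamed.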

Query elasticsearch to find docs that don't have a key

I have logs like:
{
"a":"XXX",
"b":"YYY",
"token":"acquired"
}
I also have logs that do not have the token key set. Kibana's terms panel indicates that they exist by showing them as Missing fields (3047). How can I query for all docs that do not have the token key set?

You can query in ES for missing fields:
From: http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/_dealing_with_null_values.html
GET /my_index/posts/_search
{
  "query" : {
    "filtered" : {
      "filter": {
        "missing" : { "field" : "token" }
      }
    }
  }
}
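Note that the filtered query and the missing filter were deprecated in Elasticsearch 2.x and removed in 5.0. On newer versions the same search is usually written as a bool query with an exists clause under must_not. A minimal Python sketch of that body, with a made-up helper name, could look like this:

```python
import json


def missing_field_query(field):
    """Build a search body matching documents that have no value
    for `field`, using bool/must_not/exists."""
    return {"query": {"bool": {"must_not": {"exists": {"field": field}}}}}


query_body = missing_field_query("token")
print(json.dumps(query_body, indent=2))
```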
