How to generate multiple messages from a single message in Spring XD? - spring-xd

How can I achieve this with Spring XD?
Input message:
{"key" : "temp", "key1" : "a b c"}
Output messages (my requirement):
{"key" : "temp", "key1" : "a"}
{"key" : "temp", "key1" : "b"}
{"key" : "temp", "key1" : "c"}
[Note: I tried to use a splitter, but the splitter takes the whole payload as input.]

It's probably easiest to create a custom splitter module.
You could do it with a chain of transformers, then a splitter, then more transformers, but that would be rather convoluted:
jsonToMap -> save key in a header -> transform the payload to its key1 value -> split on space
-> transform back to a map -> add the key entry back in -> mapToJson
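
The splitting logic such a custom module would need can be sketched as follows. This is only a language-neutral illustration in Python, not Spring XD code; the function name split_message and its split_key parameter are made up for this example, and a real module would implement the same idea as a Spring Integration splitter.

```python
import json


def split_message(raw, split_key="key1"):
    """Return one JSON string per whitespace-separated token in raw[split_key].

    All other fields of the input message are copied unchanged into
    every output message.
    """
    message = json.loads(raw)
    tokens = str(message[split_key]).split()
    results = []
    for token in tokens:
        piece = dict(message)      # shallow copy keeps the other keys
        piece[split_key] = token   # replace the combined value with one token
        results.append(json.dumps(piece))
    return results


outputs = split_message('{"key" : "temp", "key1" : "a b c"}')
for line in outputs:
    print(line)
```

Each output message keeps "key" and carries exactly one of the space-separated values in "key1".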

Related

How could I remove items from another search?

In Elasticsearch we run two searches: one for exact matches and another for non-exact matches.
When we search for input = dev, the exact search returns this item:
{"_id" : "users-USER#1-name",
"_source" : {
"pk" : "USER#1",
"entity" : "users",
"field" : "name",
"input" : "dev"
}}
Then we run a second search for the non-exact results and get this item:
{"_id" : "users-USER#1-description",
"_source" : {
"pk" : "USER#1",
"entity" : "users",
"field" : "name",
"input" : "Dev1"
}}
We want to remove from the second (non-exact) search every item whose pk was already returned by the first (exact) search.
I'd greatly appreciate any ideas.
For example, in the first search we got this item:
"_id" : "users-USER#1-name"
"pk" : "USER#1"
Since the first search returned this item, we want to remove every item with the same pk from the second search.
So the second search result would be empty:
empty
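
One way to read this requirement is sketched below, in Python. It shows two options under stated assumptions: filtering the second result set on the client, or wrapping the second query in a bool query with a must_not terms clause on pk. The helper names and the {"match": {"input": "dev"}} query are hypothetical, since the original queries are not shown, and the terms clause assumes pk is indexed as an exact-value (keyword) field.

```python
def exclude_seen_pks(exact_hits, fuzzy_hits):
    """Client-side option: drop fuzzy hits whose pk appeared in the exact hits."""
    seen = {hit["_source"]["pk"] for hit in exact_hits}
    return [hit for hit in fuzzy_hits if hit["_source"]["pk"] not in seen]


def with_pk_exclusion(query, pks):
    """Server-side option: wrap a query so hits with the given pks are excluded.

    Assumes "pk" is mapped as a keyword-style field so that terms matches exactly.
    """
    return {
        "query": {
            "bool": {
                "must": [query],
                "must_not": [{"terms": {"pk": sorted(pks)}}],
            }
        }
    }


exact_hits = [
    {"_id": "users-USER#1-name",
     "_source": {"pk": "USER#1", "entity": "users", "field": "name", "input": "dev"}},
]
fuzzy_hits = [
    {"_id": "users-USER#1-description",
     "_source": {"pk": "USER#1", "entity": "users", "field": "name", "input": "Dev1"}},
]

remaining = exclude_seen_pks(exact_hits, fuzzy_hits)

# Hypothetical non-exact query; substitute whatever the second search really uses.
second_search_body = with_pk_exclusion({"match": {"input": "dev"}}, {"USER#1"})
```

With the sample items from the question, remaining is an empty list, which matches the expected result.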

Remove a document from Mongo based on the value of a subdocument

I am very new to Mongo but I have SQL experience so I am trying to wrap my head around this concept. I am attempting to remove a whole document based on the result of a subdocument.
The document/row looks close to the following:
{
"_id" : ObjectId("5a7e04e3809303035bf6437a"),
"receivedTime" : ISODate("2018-02-09T20:30:27.118Z"),
"status" : "NORMALIZED",
"originalHeaders" : {
"name" : "My Alert Name",
"description" : null,
"version" : 0,
"severity" : 3
},
"partOfIncident" : false
}
I want to remove all documents that have the name = "My Alert Name". I have been trying something like the following by calling it from a bash script. This is the command after variable substitution has been performed:
++ mongo admin -u admin -p password --eval 'db.getSiblingDB("database_name").collection.deleteMany({originalHeaders: {name: "I ALERT EVERYTHING"} })'
After calling it, nothing is removed. Any pointers on how to accomplish my end goal would be greatly appreciated. I suppose it is possible to run through a find and save all of the node _id to run for deletion but that sounds terribly inefficient.
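When matching on a nested field you need to use dot notation. A filter of the form {originalHeaders: {name: "..."}} only matches documents whose originalHeaders subdocument is exactly {name: "..."}, with no other fields, so it finds nothing here.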
db.collection_name.deleteMany( { "originalHeaders.name" : "My Alert Name" } )
This will delete all documents where originalHeaders.name = "My Alert Name"
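
The difference between the two filter styles can be illustrated with a small in-memory model. This Python sketch is not MongoDB code and only approximates its matching rules (real MongoDB also takes field order into account for exact subdocument matches and has special handling for arrays); both function names are made up for the illustration.

```python
def matches_exact_subdocument(doc, field, expected):
    """Model of {field: {...}}: the whole subdocument must equal `expected`."""
    return doc.get(field) == expected


def matches_dot_path(doc, path, expected):
    """Model of {"a.b": value}: walk the path and compare only the leaf value."""
    current = doc
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return False
        current = current[part]
    return current == expected


alert = {
    "status": "NORMALIZED",
    "originalHeaders": {
        "name": "My Alert Name",
        "description": None,
        "version": 0,
        "severity": 3,
    },
    "partOfIncident": False,
}

# The subdocument has extra fields, so the exact form does not match.
exact = matches_exact_subdocument(alert, "originalHeaders", {"name": "My Alert Name"})
# The dotted form only looks at originalHeaders.name, so it matches.
dotted = matches_dot_path(alert, "originalHeaders.name", "My Alert Name")
print(exact, dotted)
```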

Count of elements on kibana visualization

I have inserted the JSON records below into my Elasticsearch index. How do I get the count of all the elements in the "devices" array so that the count can be visualized on a Kibana dashboard?
With a filter: the device count should be displayed as "4" for the SAMPLE application and "2" for the SAMPLE2 application.
Without a filter: the device count should be displayed as "6".
{
"status" : "SUCCESS",
"request" : ["ABC"],
"applicationName" : "SAMPLE",
"endTime" : 1478772517736,
"devices" : ["d1","d2","d3","d4"]
}
,
{
"status" : "FAILED",
"request" : ["EDF"],
"applicationName" : "SAMPLE2",
"endTime" : 1478772517736,
"devices" : ["d5","d12"]
}
You should create a scripted field in Kibana to get the length of the array. Your script could look something like this:
doc['devices'].values.size()
OR
doc['devices'].values.length
Then you can build a Data Table visualization that shows the array count per applicationName by using a terms aggregation. Or you could apply filters such as:
applicationName:"SAMPLE"
applicationName:"SAMPLE2"
which will display the array count for the given filter criteria. This SO post could be helpful.
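
As a sanity check on the numbers the dashboard should show, the expected counts can be computed from the two sample records. This is a plain Python sketch of the arithmetic, not Kibana or Elasticsearch code; device_counts is a name made up for the example.

```python
from collections import Counter

records = [
    {"status": "SUCCESS", "request": ["ABC"], "applicationName": "SAMPLE",
     "endTime": 1478772517736, "devices": ["d1", "d2", "d3", "d4"]},
    {"status": "FAILED", "request": ["EDF"], "applicationName": "SAMPLE2",
     "endTime": 1478772517736, "devices": ["d5", "d12"]},
]


def device_counts(docs):
    """Sum the length of the devices array per applicationName."""
    counts = Counter()
    for doc in docs:
        counts[doc["applicationName"]] += len(doc["devices"])
    return counts


counts = device_counts(records)
total = sum(counts.values())
print(dict(counts), total)
```

The per-application sums correspond to the filtered view and the total to the unfiltered view described in the question.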

Elasticsearch - Getting multiple documents with multiple custom offsets and size 1

Currently, I use the Elasticsearch Multi Search API to fetch multiple documents with the same query but different "from" offsets, each with size 1. Is there a better way to do this that would give better performance?
An example of the query I am currently using:
{"index" : "test"}
{"query" : {"term" : { "user" : "Kimchy" }}, "from" : a, "size" : 1}
{}
{"query" : {"term" : { "user" : "Kimchy" }}, "from" : b, "size" : 1}
{}
{"query" : {"term" : { "user" : "Kimchy" }}, "from" : c, "size" : 1}
{}
{"query" : {"term" : { "user" : "Kimchy" }}, "from" : d, "size" : 1}
{}
{"query" : {"term" : { "user" : "Kimchy" }}, "from" : e, "size" : 1}
....
where a, b, c, d, e are parameters supplied at query time.
If I understand you correctly, a, b, c, d, e will all be numbers, right? So you basically want to ask Elasticsearch for, say, the 3rd, 4th, and 7th documents that match a specific query?
I'm not sure it is the best way to do things, but it would certainly be faster to find the smallest and largest numbers among a through e, then query with "from : smallest" and "size : largest - smallest + 1" (the + 1 is needed so that the document at the largest offset is included). Then go through the results that ES returns yourself to pick out the specific documents.
Every time you run a from/size query, Elasticsearch has to collect all the hits before that offset anyway, so you are currently redoing essentially the same search over and over.
This approach does get sketchy if there is a large gap between your smallest and largest numbers, though, since you may end up pulling back thousands of documents.
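
The single-window idea above can be sketched like this in Python. The helper names window_for and pick are made up for the example, and the list all_hits merely stands in for the ordered result of the query so the logic can be run without a cluster.

```python
def window_for(offsets):
    """Compute one from/size window that covers every requested offset."""
    lo, hi = min(offsets), max(offsets)
    return {"from": lo, "size": hi - lo + 1}


def pick(hits, offsets):
    """Select the requested offsets out of the hits returned for the window.

    `hits` is assumed to start at min(offsets); offsets beyond the end of
    the result (fewer matches than expected) are skipped.
    """
    lo = min(offsets)
    return [hits[o - lo] for o in offsets if o - lo < len(hits)]


offsets = [3, 4, 7]
window = window_for(offsets)

# Stand-in for the full, ordered result of the query.
all_hits = [f"doc{i}" for i in range(20)]
returned = all_hits[window["from"]:window["from"] + window["size"]]

selected = pick(returned, offsets)
print(window, selected)
```

Whether this beats the Multi Search approach depends on how far apart the offsets are, as noted above.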

MongoDB complex find

I need to grab the top 3 results for each of the 8 users. Currently I am looping through each user and making 8 calls to the db. Is there a way to structure the query to pull the same 8x3 dataset in a single db pull?
selected_users = users.sample(8)
# One query per user: the top 3 tweets by score
selected_users.each do |user|
  cursor = status_store.find({'user' => user}, {:fields => params}).sort('score', -1).limit(3)
  # do something with cursor
end
The collection I am pulling from looks like the one below. Each user can have an unbounded number of tweets, so I have not embedded them within a user document.
{
"_id" : ObjectId("51e92cc8e1ce7219e40003eb"),
"id_str" : "57915476419948544",
"score" : 904,
"text" : "Yesterday we had a bald eagle on the show. Oddly enough, he was in the country illegally.",
"timestamp" : "19/07/2013 08:10",
"user" : {
"id_str" : "115485051",
"name" : "Conan O'Brien",
"screen_name" : "ConanOBrien",
"description" : "The voice of the people. Sorry, people."
}
}
Thanks in advance.
Yes, you can do this using the aggregation framework.
Another way would be to keep track of the top 3 scores in the user documents. Whether this is faster depends on how often you write scores versus how often you read the top scores per user.
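
A possible shape for such an aggregation is sketched below in Python as a pipeline definition, together with an in-memory function that mirrors what it is meant to compute. Several things here are assumptions: users are matched on user.id_str rather than on the whole user subdocument, $slice inside $project needs a MongoDB version that supports it, the function names are made up, and the sample tweets are invented test data. Pushing every tweet per user before slicing may also be memory-heavy for very active users.

```python
def top_n_pipeline(user_ids, n=3):
    """Build an aggregation pipeline: top n tweets by score for each user id."""
    return [
        {"$match": {"user.id_str": {"$in": list(user_ids)}}},
        {"$sort": {"score": -1}},
        {"$group": {"_id": "$user.id_str", "tweets": {"$push": "$$ROOT"}}},
        {"$project": {"tweets": {"$slice": ["$tweets", n]}}},
    ]


def top_n_per_user(tweets, n=3):
    """In-memory equivalent of the pipeline, useful for checking the logic."""
    grouped = {}
    for tweet in sorted(tweets, key=lambda t: t["score"], reverse=True):
        bucket = grouped.setdefault(tweet["user"]["id_str"], [])
        if len(bucket) < n:
            bucket.append(tweet)
    return grouped


# Invented sample data, only to exercise the function.
sample = [
    {"score": 5, "user": {"id_str": "u1"}},
    {"score": 9, "user": {"id_str": "u1"}},
    {"score": 1, "user": {"id_str": "u1"}},
    {"score": 7, "user": {"id_str": "u1"}},
    {"score": 3, "user": {"id_str": "u2"}},
]

pipeline = top_n_pipeline(["u1", "u2"])
top = top_n_per_user(sample)
print({uid: [t["score"] for t in ts] for uid, ts in top.items()})
```

The pipeline would be passed to the collection's aggregate call in whichever driver is in use; the in-memory version exists only so the grouping logic can be verified without a database.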
