Kibana/Elastic Query on multiple terms in same array element

In my Elasticsearch index I have documents which contain an array of uniform elements, like this:
Document 1:
"listOfElements": {
"entries": [{
"key1": "value1",
"int1": 4,
"key2": "value2"
}, {
"key1": "value1",
"int1": 7,
"key2": "value2"
}
]
}
Document 2:
"listOfElements": {
"entries": [{
"key1": "value1",
"int1": 5,
"key2": "value2"
}, {
"key1": "value1",
"int1": 7,
"key2": "value2"
}
]
}
Now I want to create a query that returns all documents which have, e.g., key1:value1 AND int1:4 in the same entry element.
However, if I only query for "key1:value1 AND int1:4", I obviously get all documents that have key1:value1 and all that have int1:4, so I would get both documents from the example above.
Is there any way to query for multiple fields that have to be in the same array element?
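For reference, the flat query described above could be written as a bool query roughly like this (a sketch only; the index name is a placeholder and the exact field paths depend on your mapping):

GET /my-index/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "listOfElements.entries.key1": "value1" } },
        { "match": { "listOfElements.entries.int1": 4 } }
      ]
    }
  }
}

With a default object mapping Elasticsearch flattens the entries array, so both conditions are evaluated against the document as a whole rather than against a single array element, which is why both example documents come back.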

Related

jq: mapping all values of an object

Say I have an object like so:
{
"key1": "value1",
"key2": "value2",
"key3": "value3"
}
I want to use jq to convert this to:
{
"key1": {
"innerkey": "value1"
},
"key2": {
"innerkey": "value2"
},
"key3": {
"innerkey": "value3"
}
}
i.e., I want to apply a mapping to every value in the object that converts $value to {"innerkey": $value}. How can I achieve this with jq?
It's literally called map_values. Use it like this:
map_values({innerkey:.})
You could also use the fact that iterating over an object iterates over its values, so you can update those values on the object in place:
.[] |= {innerkey:.}
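Both filters should produce the same result; running them from the shell on the sample object, for example:

$ echo '{"key1":"value1","key2":"value2","key3":"value3"}' | jq -c 'map_values({innerkey: .})'
{"key1":{"innerkey":"value1"},"key2":{"innerkey":"value2"},"key3":{"innerkey":"value3"}}

$ echo '{"key1":"value1","key2":"value2","key3":"value3"}' | jq -c '.[] |= {innerkey: .}'
{"key1":{"innerkey":"value1"},"key2":{"innerkey":"value2"},"key3":{"innerkey":"value3"}}

(The -c flag just prints compact output; drop it for pretty-printed JSON.)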

How can we paginate a big hash?

I have a pure Ruby hash like the following one:
"1875": {
"child1": {
"field1": 1875,
"field2": "Test1"
},
"child2": {
"field1": "value1",
"field2": "value2"
}
},
"1959": {
"child1": {
"field1": 1875,
"field2": "Test1"
},
"child2": {
"field1": "value1",
"field2": "value2"
}
}
I have so many keys that follow the above structure that I want to paginate it.
I have tried the following code:
@records = @records.to_a.paginate(page: params[:page], per_page: 5)
But it returns all of the elements as [key, value] arrays, like this:
["1875", {
"child1": {
"field1": 1875,
"field2": "Test1"
},
"child2": {
"field1": "value1",
"field2": "value2"
}
}
]
["1959", {
"child1": {
"field1": 1875,
"field2": "Test1"
},
"child2": {
"field1": "value1",
"field2": "value2"
}
}
]
First of all, note that a Hash is a dictionary-like collection and order shouldn't matter. So if you need pagination, a hash is most likely the wrong data structure and you should use something like an array instead.
@records.to_a.paginate(page: params[:page], per_page: 5) returns an array because you are converting the hash to an array with to_a. Depending on what you are using the hash/pagination for, this may be enough. For example, to display the returned records, assuming there is a function for printing a child:
<% @records.to_a.paginate(page: params[:page], per_page: 5).each do |key, value| %>
  <h1><%= key %></h1>
  <p><%= print_child(value) %></p>
<% end %>
If you really want a hash, you can convert the array back into a hash:
Hash[@records.to_a.paginate(page: params[:page], per_page: 5)]
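As a minimal, self-contained sketch of that round trip (assuming the will_paginate gem, whose will_paginate/array extension adds #paginate to plain arrays):

require 'will_paginate/array'  # adds #paginate to Array

# hypothetical data mirroring the structure from the question
records = {
  "1875" => { "child1" => { "field1" => 1875, "field2" => "Test1" } },
  "1959" => { "child1" => { "field1" => 1875, "field2" => "Test1" } }
}

# Hash -> array of [key, value] pairs -> one page -> back to a Hash
page = records.to_a.paginate(page: 1, per_page: 5)
paginated = Hash[page]
# paginated is again a Hash, containing only the entries of the requested page

Note that this still treats the hash as an ordered list of entries, which is why an array is usually the better fit here.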

Pipeline aggregations in Elasticsearch

I am working with Elasticsearch aggregations and have a question about how to do a pipeline-style aggregation. I have three top-level fields in my ES documents:
documentId, list1, list2
Example:
Here are the documents I have:
document 1:
{
"documentId":"1",
"list1":
[
{
"key": "key1",
"value": "value11"
}
],
"list2":
[
{
"key": "key2",
"value": "value21"
}
...
]
}
document 2:
{
"documentId":"2",
"list1":
[
{
"key": "key1",
"value": "value11"
}
],
"list2":
[
{
"key": "key2",
"value": "value21"
}
...
]
}
document 3:
{
"documentId":"3",
"list1":
[
{
"key": "key1",
"value": "value12"
}
],
"list2":
[
{
"key": "key2",
"value": "value21"
}
...
]
}
To summarize:
document1 and document2 have the same set of values for key1 and key2 (except that their ids differ, so they are treated as two separate documents).
document3 has the same value for key2 as document1 and document2, but its value for key1 differs from document1 and document2.
I want to run a terms aggregation on the keys of the list1 field, whose output should feed into a terms aggregation on list2.
So, for the above example, the overall output I want is:
value21: 2
(one count corresponding to value11 in key1 and a second count corresponding to value12 in key1)
and NOT
value21: 3 (two counts corresponding to value11 in key1 and a third count corresponding to value12 in key1).
Is there any simple way of doing this?
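For what it's worth, one way to get that kind of distinct count is a terms aggregation on list2 with a cardinality sub-aggregation on list1. This is only a sketch: the index name is a placeholder and it assumes both lists use the default object mapping with keyword sub-fields (not the nested type):

POST /my-index/_search
{
  "size": 0,
  "aggs": {
    "list2_values": {
      "terms": { "field": "list2.value.keyword" },
      "aggs": {
        "distinct_list1_values": {
          "cardinality": { "field": "list1.value.keyword" }
        }
      }
    }
  }
}

For the three example documents the value21 bucket would still show a doc_count of 3, but distinct_list1_values would report 2, which is the number asked for above.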

Separate multiple events in logstash input into separate documents in elasticsearch index

Input in Logstash:
{
"Teacher": {
"Name": "Mary",
"age": 20,
},
"Student": [
{
"Name": "Tim",
"age"12
},
{
"Name": "Eric",
"age":13
}
]
}
I need to filter this input using Logstash so that three separate documents are sent to Elasticsearch.
doc1: {
"Name": "ABC",
"age": 20,
}
doc2: {
"Name": "Tim",
"age"12
}
doc 3:
{
"Name": "Eric",
"age":13
}
I tried the split, mutate, and ruby filters but did not get the desired result. Could someone help me separate these into separate documents in the Elasticsearch index?
Since you want a separate event for 'Mary', use the clone filter to create two events. Delete the 'Student' array from one copy to be left with just 'Mary'.
In the second clone, using the split filter will give you different events for 'Tim' and 'Eric'.
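A rough, untested sketch of that filter chain, assuming the field names from the input above and the legacy clone-filter behaviour of setting the copy's type to the clone name:

filter {
  clone {
    # creates one copy of the event with its type set to "students"
    clones => ["students"]
  }

  if [type] == "students" {
    # the copy keeps only the Student array; split then fans it out
    # into one event per student (you may also want to move the
    # [Student] sub-fields to the top level afterwards)
    mutate { remove_field => ["Teacher"] }
    split  { field => "Student" }
  } else {
    # the original event keeps only the Teacher object
    mutate { remove_field => ["Student"] }
  }
}

Depending on the Logstash version and ECS compatibility settings, the clone filter may tag the copy instead of setting type, in which case the conditional needs adjusting.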

RethinkDB: Equivalent for "select where field not in (items)"

I have a table that looks like this:
[
{ "name": "Alpha", "values": {
"someProperty": 1
}},
{ "name": "Beta", "values": {
"someProperty": 2
}},
{ "name": "Gamma", "values": {
"someProperty": 3
}}
]
I want to select all records where someProperty is not in some array of values (e.g., all records where someProperty not in [1, 2]). I want to get back complete records, not just the values of someProperty.
How should I do this with RethinkDB?
In Python it would be:
table.filter(lambda doc: r.not_(r.expr([1, 2]).contains(doc["someProperty"])))
If the array comes from a subquery and you don't want to evaluate it multiple times:
subquery.do(lambda array:
table.filter(lambda doc: r.not_(array.contains(doc["someProperty"]))))
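Put together as a runnable sketch with the Python driver (table name and connection details are placeholders; note the nested access to values.someProperty so it matches the sample documents above):

import rethinkdb as r  # newer driver versions use: from rethinkdb import RethinkDB; r = RethinkDB()

conn = r.connect("localhost", 28015)

# every complete record whose values.someProperty is NOT 1 or 2
cursor = r.table("items").filter(
    lambda doc: r.not_(r.expr([1, 2]).contains(doc["values"]["someProperty"]))
).run(conn)

for row in cursor:
    print(row)  # for the sample table, only the "Gamma" record remains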
