Is there a better way to write a MongoDB query in Go? - go

Hi, I have a somewhat complex aggregate query that I must write with mgo, but I got really dazed working it out halfway :-(. Is there a better way to do this?
Here is a console aggregate command that I have tested and that works (it buckets clktime into 5-minute windows via $subtract/$mod and counts the events in each window):
db.event.aggregate([{$match:{clktime:{$gt:1425289561}}},{$group:{"_id":{$subtract:["$clktime",{$mod:["$clktime", 60*5]}]}, count:{$sum:1}}}])
And here is what I have got so far:
c.Pipe([]bson.M{bson.M{"$match": bson.M{"clktime": bson.M{"$gt": 1425289561}}}, bson.M{"$group": bson.M{"_id": bson.M{"$subtract": []bson.M{bson.M{"$clktime"}, bson.M{"$mod": []bson.M{bson.M{"$clktime"}, bson.M{60 * 5}}}}}}, "count": bson.M{"$sum": 1}}})
The compiler says there is a missing key in a map literal, but I can't find where. I thought human beings didn't deserve this; I am so desperate T_T.
Is there a better, more humane way to do this?

Use []interface{} for the mixed-type arrays in the pipeline; note also that the operator is "$gt", not "gt":

x := []bson.M{
	{"$match": bson.M{"clktime": bson.M{"$gt": 1425289561}}},
	{"$group": bson.M{
		"_id": bson.M{"$subtract": []interface{}{
			"$clktime",
			bson.M{"$mod": []interface{}{"$clktime", 60 * 5}},
		}},
		"count": bson.M{"$sum": 1},
	}},
}

Yes, there is a better way. Break your code into multiple lines and use a comma after the last element of each map or array literal. Then the code will be formatted automatically by gofmt, and you will also get readable error messages that indicate the offending line.
package main

type M map[string]M

var x = M{
	"a": M{
		"b": M{},
		"c": M{},
	},
}
By the way, look at this part: bson.M{"$clktime"}, bson.M{60 * 5}}. Those map literals contain values with no keys, which is exactly where the "missing key in map literal" error comes from; they should be the plain values "$clktime" and 60 * 5.

Finally, I have worked this out, following the suggestion @Grzegorz gave (to split the query construction across multiple lines for readability) and @Morty's advice (to use []interface{} for the arrays in the query command). Here is what I got, which works:
q := []bson.M{
	bson.M{
		"$match": bson.M{
			"clktime": bson.M{
				"$gt": 1425289561,
			},
		},
	},
	bson.M{
		"$group": bson.M{
			"_id": bson.M{
				"$subtract": []interface{}{
					"$clktime",
					bson.M{
						"$mod": []interface{}{
							"$clktime",
							60 * 5,
						},
					},
				},
			},
			"count": bson.M{"$sum": 1},
		},
	},
}
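For completeness, a minimal sketch of running this pipeline with mgo and reading the buckets back; the bucket struct is an assumption, and c is the *mgo.Collection for the event collection as in the question:

// bucket holds one group from the pipeline: _id is the start of the
// 5-minute window and count is the number of events that fell into it.
type bucket struct {
	ID    int64 `bson:"_id"`
	Count int   `bson:"count"`
}

var buckets []bucket
// Pipe builds the aggregation from q above; All runs it and decodes every result.
if err := c.Pipe(q).All(&buckets); err != nil {
	log.Fatal(err)
}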
Hope it will be helpful for other people who run into a similar issue.

Related

Reorder object hierarchy and group by time in JSONata

Although I'm not a total JSONata noob, I'm having a hard time finding an elegant solution to the following desired transformation. The starting point is a set of time-series data in a format like this:
{
  "series1": {
    "data": [
      {"time": "2022-01-01T00:00:00Z", "value": 22},
      {"time": "2022-01-02T00:00:00Z", "value": 23}
    ]
  },
  "series2": {
    "data": [
      {"time": "2022-01-01T00:00:00Z", "value": 220},
      {"time": "2022-01-02T00:00:00Z", "value": 230}
    ]
  }
}
I need to "flip the hierarchy" and group these datapoints by timestamp into an array of objects, as follows:
[
  {
    "time": "2022-01-01T00:00:00Z",
    "series1": 22,
    "series2": 220
  },
  {
    "time": "2022-01-02T00:00:00Z",
    "series1": 23,
    "series2": 230
  }
]
I currently have this working with the expression
$each($, function($v, $s) {
  [$v.data.{
    'series': $s,
    'time': $.time,
    'value': $.value
  }]
}).*{
  `time`: {
    `series`: value
  }
}
~> $each(function($v, $t) {
  $merge([
    $v,
    {'time': $t}
  ])
})
(playground link: https://try.jsonata.org/8CaggujJk)
...and...I can't help but feel that there must be a better way!
For reference, my current expression basically does this in three consecutive steps:
1. The first $each() function splits the original object into an array of datapoints, each carrying its series name, timestamp, and value.
2. A grouping operator makes time a key and gathers all values for a given timestamp together.
3. A second $each() function transforms the object back into an array of objects where time is a value rather than a key, merging the time key-value alongside the series values.
I've seen some wonderfully elegant solutions to similar problems on here, but am not sure how to approach this in a better way. Any tips appreciated!
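As a cross-check of the intended semantics, here is a minimal Go sketch of the same flip-and-group transformation (not JSONata; the type names and the flip function are illustrative assumptions, not part of the question):

import "sort"

type point struct {
	Time  string `json:"time"`
	Value int    `json:"value"`
}

type series struct {
	Data []point `json:"data"`
}

// flip groups every datapoint by its timestamp and flattens each series
// name into a key on the grouped row, mirroring the three JSONata steps.
func flip(input map[string]series) []map[string]interface{} {
	byTime := map[string]map[string]interface{}{}
	for name, s := range input {
		for _, p := range s.Data {
			row, ok := byTime[p.Time]
			if !ok {
				row = map[string]interface{}{"time": p.Time}
				byTime[p.Time] = row
			}
			row[name] = p.Value
		}
	}
	out := make([]map[string]interface{}, 0, len(byTime))
	for _, row := range byTime {
		out = append(out, row)
	}
	// Go maps iterate in random order, so sort rows by timestamp for stable output.
	sort.Slice(out, func(i, j int) bool {
		return out[i]["time"].(string) < out[j]["time"].(string)
	})
	return out
}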

Elastic Ingest Pipeline split field and create a nested field

Dear friendly helpers,
I have an index that is fed by a database via Kafka. Now this database holds a field that aggregates a couple of pieces of information like so: key/value; key/value; (don't ask for the reason, I have no idea who designed it like that or why ;-) )
93/4; 34/12;
It can be empty, or it can hold 1..n key/value pairs.
I want to use an ingest pipeline and ideally have a "nested" field which holds all the values that are in that field.
Probably like this:
{"categories":
{ "93": 7,
"82": 4
}
}
The use case is the following: we want to visualize the sum of a filtered number of these categories (they tell me how many minutes a specific process took longer) and relate them in ranges.
Example: I filter categories x, y, z and then group how many documents for the day had no delay, which had a delay of up to 5 minutes, and which had a delay between 5 and 15 minutes.
I have tried to get the fields neatly separated with the kv processor and wanted to work from there, but I guess it was a completely wrong approach.
"kv": {
"field": "IncomingField",
"field_split": ";",
"value_split": "/",
"target_field": "delays",
"ignore_missing": true,
"trim_key": "\\s",
"trim_value": "\\s",
"ignore_failure": true
}
When I test the pipeline, it seems OK:
"delays": {
"62": "3",
"86": "2"
}
but there are two things that don't work.
1. I can't know upfront how many of these combinations I have, so converting the values from string to int in the same pipeline is an issue.
2. When I want to create a Kibana index pattern, I end up with many fields like delays.82 and delays.82.keyword, which does not make sense at all for the use case, as I can't filter (get only the sum of delays where the key is one of x, y, z) or aggregate.
I have looked into other processors (dot_expander) but can't really get my head around how to get this working.
I hope my question is clear (my English skills are lacking, sorry) and that someone can point me in the right direction.
Thank you very much!
You should rather structure them as an array of objects with shared accessors, for instance:
[ {key: 93, value: 7}, ...]
That way, you'll be able to aggregate on categories.key and categories.value.
So this means iterating the categories' entrySet() using a custom script processor like so:
POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "description": "extracts k/v pairs",
    "processors": [
      {
        "script": {
          "source": """
            def categories = ctx.categories;
            def kv_pairs = new ArrayList();
            for (def pair : categories.entrySet()) {
              def k = pair.getKey();
              def v = pair.getValue();
              kv_pairs.add(["key": k, "value": v]);
            }
            ctx.categories = kv_pairs;
          """
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "categories": {
          "82": 4,
          "93": 7
        }
      }
    }
  ]
}
P.S.: Do make sure your categories field is mapped as nested, because otherwise you'll lose the connections between the keys and the values (an effect also known as flattening).
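For reference, here is a sketch of such a nested mapping plus the kind of filtered sum the use case describes; the index name and the selected category keys are assumptions:

PUT my-index
{
  "mappings": {
    "properties": {
      "categories": {
        "type": "nested",
        "properties": {
          "key": { "type": "keyword" },
          "value": { "type": "integer" }
        }
      }
    }
  }
}

GET my-index/_search
{
  "size": 0,
  "aggs": {
    "by_category": {
      "nested": { "path": "categories" },
      "aggs": {
        "selected": {
          "filter": { "terms": { "categories.key": ["93", "82"] } },
          "aggs": {
            "total_delay": { "sum": { "field": "categories.value" } }
          }
        }
      }
    }
  }
}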

Elasticsearch compare long sequence strings with fuzzy query

I have two long String sequences that are similar:
C50FD711C2C43287351892A4D82F44B055F048C46D2C54197AC1D1E921F11E6699C4057C4B93907518E6DCA51A672D3D3E419160DAE276CB7716D11B94D8C3BB2E4A591329B7AF973D17A7F9336342FFAAFD4D
and
C50FD711C2C43287351892A4D820B5EAC5F048C1E67CAC197AC1D1E921F11C3623C1DCD6493907518E6DCA18CD71016E7FD1160DAE276CB7716D11B94A6B762E4A591329B7AF973D17A7F9336342FFAAFD4D
Their edit distance is 41.
I would like to find strings that are similar to each other. I started a query like this:
GET my_index/_type/_search
{
  "query": {
    "fuzzy": {
      "sequence.keyword": {
        "value": "C50FD711C2C43287351892A4D820B5EAC5F048C1E67CAC197AC1D1E921F11C3623C1DCD6493907518E6DCA18CD71016E7FD1160DAE276CB7716D11B94A6B762E4A591329B7AF973D17A7F9336342FFAAFD4D",
        "boost": 1.0,
        "fuzziness": 50,
        "prefix_length": 10,
        "max_expansions": 200
      }
    }
  }
}
I tried with both sequence.keyword and sequence; the field is indexed both as type text and as type keyword. However, the query did not find the other, similar sequence string in my index. Why?
The answer is pretty simple: the maximum edit distance that is allowed is 2 (as can be seen in the source code for the Fuzziness class).
You can try this with a simpler value: if you index AAAAAA and then search for AAABBB with fuzziness: 3, you'll get nothing.
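Expressed as a query, that sanity check looks like the following sketch (the field name is borrowed from the question; assume the toy document AAAAAA has been indexed into it):

GET my_index/_search
{
  "query": {
    "fuzzy": {
      "sequence.keyword": {
        "value": "AAABBB",
        "fuzziness": 3
      }
    }
  }
}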

Elasticsearch Go nested query

I'm using olivere's elastic Go library to run Elastic queries - https://godoc.org/github.com/olivere/elastic#NestedQuery
The data I'm trying to query on looks like this:
"_source": {
"field1": "randVal1",
"field2": "randVal2",
"nestedfield": {
"ind1": "val1"
}
}
I'm trying to run a query on the nestedfield using the NestedQuery call from the Elastic Go library like so:
aquery := elastic.NewTermQuery("ind1", "val1")
query := elastic.NestedQuery("nestedfield", aquery)
But I get an error stating:
too many arguments to conversion to NestedQuery
I'm trying to retrieve all documents where the ind1 of nestedfield is val1. Would appreciate any help in constructing this query.
EDIT:
I changed it to NewNestedQuery and now it doesn't give that error. However, it is not returning any results, even though that document exists in the index and I am able to query on the non-nested fields.
I tried this:
aquery := elastic.NewTermQuery("ind1", "val1")
query := elastic.NewNestedQuery("nestedfield", aquery)
And this:
query := elastic.NewNestedQuery("nestedfield", elastic.NewMatchQuery("nestedfield.ind1", "val1"))
But they both give 0 results. Any idea what I'm doing wrong?
EDIT #2
The mapping is:
"field1": { "type": "string" },
"field2": { "type": "string" },
"nestedfield": {
"type": "nested"
}
What eventually worked was this:
query := elastic.NewMatchQuery("nestedfield.ind1", "val1")
I was able to add additional fields to 'nestedfield' and do queries like:
query := elastic.NewBoolQuery().Filter(elastic.NewMatchQuery("nestedfield.ind1", "val1"), elastic.NewMatchQuery("nestedfield.ind2", "val2"))
Looks like that should be:
q := elastic.NewTermQuery("nestedfield.ind1", value)
nq := elastic.NewNestedQuery("nestedfield", q)
NestedQuery is a type, not a function.
NewTermQuery needs to take a value from the JSON, not a const string; you'll need to parse your source JSON to get the value for ind1.
Edited to fix NewTermQuery too, as per the comments below. If that still doesn't work, provide the full code you're using to parse the source and the error you get, as there isn't enough detail here to guess at the problem.
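For what it's worth, a minimal sketch of executing the nested query with olivere/elastic; the index name and the client setup are assumptions:

// client is assumed to come from elastic.NewClient(...).
q := elastic.NewNestedQuery("nestedfield",
	elastic.NewTermQuery("nestedfield.ind1", "val1"))

res, err := client.Search().
	Index("myindex"). // hypothetical index name
	Query(q).
	Do(context.Background())
if err != nil {
	log.Fatal(err)
}
fmt.Printf("found %d documents\n", res.TotalHits())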

RethinkDB simple pluck from nested array

I am new to RethinkDB and have looked here and elsewhere for the answer to this. I have found several things close, but still can't seem to figure out what seems like it should be simple. I have a query:
r.db('common').table("counters").filter({org: 'myorg'}).pluck('counters').run()
That gives the following results:
{
  "counters": [
    {
      "aid": 0,
      "pid": 1000,
      "rid": 0
    }
  ]
}
What I want is to pluck or somehow get a specific counter (e.g. pid). I tried counter[0].pid, counters.pid, and a few others, but can't quite seem to find the magic bullet. From what I did find, I suspect this may involve a function, but I am not sure where it should go. Any help is appreciated, and if you dup this, please make sure it's an exact dup and not just something close. Thanks!
OK, had to change the array to an object:
{
  "counters": {
    "aid": 0,
    "pid": 1000,
    "rid": 0
  }
}
... then use get(). This works:
r.db('common').table("counters").get('12345-1234-54321-6666-f0dac0b6b68e')('counters')('pid')
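For the original array shape, a ReQL sketch along these lines should also work without changing the document structure (untested; nth(0) picks the first counter object out of the array):

r.db('common').table('counters').filter({org: 'myorg'}).map(function(doc) {
  return doc('counters').nth(0)('pid');
}).run()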
