Looping through JSON using sh + jq [duplicate]

I like to filter json files using jq:
jq . some.json
Given the json containing an array of objects:
{
  "theList": [
    {
      "id": 1,
      "name": "Horst"
    },
    {
      "id": 2,
      "name": "Fritz"
    },
    {
      "id": 3,
      "name": "Walter"
    },
    {
      "id": 4,
      "name": "Gerhart"
    },
    {
      "id": 5,
      "name": "Harmut"
    }
  ]
}
I want to filter that list to show only the elements whose id is 2 or 4, so the expected output is:
{
  "id": 2,
  "name": "Fritz"
},
{
  "id": 4,
  "name": "Gerhart"
}
How do I filter the JSON using jq? I have played around with select and map, but didn't get either of them to work, e.g.:
$ jq '.theList[] | select(.id == 2) or select(.id == 4)' array.json
true

From the docs:
jq '.[] | select(.id == "second")'
Input [{"id": "first", "val": 1}, {"id": "second", "val": 2}]
Output {"id": "second", "val": 2}
I think you can do something like this:
jq '.theList[] | select(.id == 2 or .id == 4)' array.json
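For readers more comfortable outside jq, here is a rough Python equivalent of that filter (a sketch only; array.json is the file from the question):

import json

# Mirrors: jq '.theList[] | select(.id == 2 or .id == 4)' array.json
with open("array.json") as f:
    doc = json.load(f)

for element in doc["theList"]:
    if element["id"] == 2 or element["id"] == 4:
        print(json.dumps(element, indent=2))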

You could use select within map.
.theList | map(select(.id == (2, 4)))
Or more compact:
[ .theList[] | select(.id == (2, 4)) ]
Written that way, though, it is a little inefficient, since the expression is duplicated for every value being compared. It will be more efficient, and possibly more readable, written this way:
[ .theList[] | select(any(2, 4; . == .id)) ]

Using select(.id == (2, 4)) here is generally inefficient (see below).
If your jq has IN/1, then it can be used to achieve a more efficient solution:
.theList[] | select(.id | IN(2,4))
If your jq does not have IN/1, then you can define it as follows:
def IN(s): first(select(s == .)) // false;
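Either way, IN amounts to a plain membership test. For comparison, a short Python sketch of the same idea (again using array.json from the question): each element is checked once against a set rather than once per candidate value.

import json

# Mirrors .theList[] | select(.id | IN(2,4)): one membership test per element.
with open("array.json") as f:
    doc = json.load(f)

wanted = {2, 4}
for element in doc["theList"]:
    if element["id"] in wanted:
        print(json.dumps(element, indent=2))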
Efficiency
One way to see the inefficiency is to use debug. The following expression, for example, results in 10 calls to debug, whereas only 9 checks for equality are actually needed:
.theList[] | select( (.id == (2,4)) | debug )
["DEBUG:",false]
["DEBUG:",false]
["DEBUG:",true]
{
  "id": 2,
  "name": "Fritz"
}
["DEBUG:",false]
["DEBUG:",false]
["DEBUG:",false]
["DEBUG:",false]
["DEBUG:",true]
{
  "id": 4,
  "name": "Gerhart"
}
["DEBUG:",false]
["DEBUG:",false]
index/1
In principle, using index/1 should be efficient, but as of this writing (October 2017), its implementation, though fast (it is written in C), is inefficient.

Here is a solution using indices:
.theList | [ .[map(.id)|indices(2,4)[]] ]

Related

How to retrieve the value of key in array with jq? [duplicate]


OPA Rego issues counting

I am trying to write a rule but am running into an issue. I managed to extract the following as my input:
myData:= [{"Key": "use", "Value": "1"}, {"Key": "use", "Value": "2"}, {"Key": "att1", "Value": "3"}]
I am trying to count the number of times a key with the value use appears. However, when I do:
p := {keep| keep:= myData[_]; myData.Key == "use"}
I assumed this would create a listing of all I would like to keep but the playground errors with:
1 error occurred: policy.rego:24: rego_type_error: undefined ref: data.play.myData.Key
data.play.myData.Key
I hoped I could list them in p and then do count(p) > 1 to check if more than one is listed.
In your set comprehension for p, you're iterating over the objects in myData, assigning each element to keep. Then, you assert something on myData.Key. I think what you're looking for is
p := {keep| keep := myData[_]; keep.Key == "use"}
Be aware that it's a set comprehension, so p would be the same for these two inputs:
myData:= [{"Key": "use", "Value": "1"}]
myData:= [{"Key": "use", "Value": "1"}, {"Key": "use", "Value": "1"}]
You could use an array comprehension (p := [ keep | keep := ... ]) if that's not what you want.
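As a rough Python analogy (using the Value field for comparison, since Python dicts aren't hashable): a set comprehension collapses duplicate matches, while a list comprehension keeps them, which is what matters if you want to count.

my_data = [{"Key": "use", "Value": "1"}, {"Key": "use", "Value": "1"}]

as_set = {d["Value"] for d in my_data if d["Key"] == "use"}   # {'1'}       -> len 1
as_list = [d["Value"] for d in my_data if d["Key"] == "use"]  # ['1', '1']  -> len 2
print(len(as_set), len(as_list))  # 1 2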

Sort a list of dictionaries by a specific key within a nested list of dictionaries

I have a list of dictionaries with each entry having the following structure
{
  "id": 0,
  "type": "notification",
  "name": "jane doe",
  "loc": {
    "lat": 38.8239,
    "long": 104.7001
  },
  "data": [
    {
      "type": "test",
      "time": "Fri Aug 13 09:17:16 2021",
      "df": 80000000,
      "db": 1000000,
      "tp": 92
    },
    {
      "type": "real",
      "time": "Sat Aug 14 09:21:30 2021",
      "df": 70000000,
      "db": 2000000,
      "tp:": 97
    }
  ]
}
I need to be able to sort this list by any of these keys: name, type, time, tp and return it in memory.
I understand how to sort by top-level keys, e.g. sorted(json_list, key=lambda k: k['name']), or even nested keys, for instance by lat: sorted(json_list, key=lambda k: k['loc']['lat']).
So currently I have a function that works for the case of sorting by name.
def sort_by(self, param, rev=False):
    if param == NAME:
        self.json_list = sorted(self.json_list, key=lambda k: k[param], reverse=rev)
    else:
        # need help here
I'm having trouble sorting by type, time, and tp. Notice that the data key is also a list of dictionaries. I would like to leverage existing methods built into the standard library if possible. I can provide more clarification if necessary.
Update:
def sort_by(self, param, rev=False):
    if param == NAME:
        self.json_list = sorted(self.json_list, key=lambda k: k[param], reverse=rev)
    else:
        self.json_list = sorted(self.json_list, key=lambda k: k['data'][0][param], reverse=rev)
    return self.json_list
This works fine if there is only one item in the data list.
If json_list[i]['data'] (for each i) only contains one dict, then the following should work; otherwise modifications are required.
sorted(json_list, key=lambda k: (
    k['name'], k['data'][0]['type'], k['data'][0]['time'], k['data'][0]['tp']
))
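If data can hold more than one entry, one possible modification is to take the key from the first entry that has it. This is only a sketch, not a drop-in answer; sort_by_nested is a hypothetical helper, and json_list is the list of dicts from the question.

def sort_by_nested(json_list, param, rev=False):
    # Sorts by a top-level key when present, otherwise by the first "data"
    # entry that contains the key; records lacking it everywhere sort last.
    def key(record):
        if param in record:
            return (0, record[param])
        values = [d[param] for d in record['data'] if param in d]
        return (0, values[0]) if values else (1, None)
    return sorted(json_list, key=key, reverse=rev)

# e.g. sort_by_nested(json_list, 'tp') or sort_by_nested(json_list, 'name', rev=True)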

How to pass arguments inside select and test function?

I've this JSON data, extracted from qbittorrent:
[
{
"hash": "333333333333333333333333333",
"name": "testtosearchcaseinsensitive",
"magnet_uri": "magnet:somedata",
"size": 1243989552,
"progress": 1.0,
"dlspeed": 0,
"upspeed": 0,
"priority": 0,
"num_seeds": 0,
"num_complete": 2,
"num_leechs": 0,
"num_incomplete": 32,
"ratio": 0.0,
"eta": "1.01:11:52",
"state": "stalledUP",
"seq_dl": false,
"f_l_piece_prio": false,
"category": "category",
"tags": "",
"super_seeding": false,
"force_start": false,
"save_path": "/data/path/",
"added_on": 1567358333,
"completion_on": 1567366287,
"tracker": "somedata",
"dl_limit": null,
"up_limit": null,
"downloaded": 1244073666,
"uploaded": 0,
"downloaded_session": 0,
"uploaded_session": 0,
"amount_left": 0,
"completed": 1243989552,
"ratio_limit": 1.0,
"seen_complete": 1567408837,
"last_activity": 1567366979,
"time_active": "1.01:00:41",
"auto_tmm": true,
"total_size": 1243989552,
"max_ratio": 1,
"max_seeding_time": 2880,
"seeding_time_limit": 2880
},
{
"hash": "44444444444444",
"name": "dontmatch",
"magnet_uri": "magnet:somedata",
"size": 2996838603,
"progress": 1.0,
"dlspeed": 0,
"upspeed": 0,
"priority": 0,
"num_seeds": 0,
"num_complete": 12,
"num_leechs": 0,
"num_incomplete": 0,
"ratio": 0.06452786606740063,
"eta": "100.00:00:00",
"state": "stalledUP",
"seq_dl": false,
"f_l_piece_prio": false,
"category": "category",
"tags": "",
"super_seeding": false,
"force_start": false,
"save_path": "/data/path/",
"added_on": 1566420155,
"completion_on": 1566424710,
"tracker": "some data",
"dl_limit": null,
"up_limit": null,
"downloaded": 0,
"uploaded": 193379600,
"downloaded_session": 0,
"uploaded_session": 0,
"amount_left": 0,
"completed": 2996838603,
"ratio_limit": -2.0,
"seen_complete": 4294967295,
"last_activity": 1566811636,
"time_active": "10.23:07:42",
"auto_tmm": true,
"total_size": 2996838603,
"max_ratio": -1,
"max_seeding_time": -1,
"seeding_time_limit": -2
}
]
So I want to match all the data where the name contains some text; in Bash I wrote this, but I can't make it work.
Some declarations to start (actually I pass the data via arguments, so I use $1):
TXTIWANT="test"
MYJSONDATA= Here I put my JSON data
Then the jq expression that doesn't work for me is this:
RESULTS=$(echo "$MYJSONDATA" | jq --raw-output --arg TOSEARCH "$TXTIWANT" '.[] | select(.name|test("$TOSEARCH.";"i")) .name')
But I always get an error, or all the data, I think because $TOSEARCH is not expanded.
Maybe there's a better way to search for a string inside a value?
What am I doing wrong?
The right syntax for variable (or filter) interpolation with jq looks like this:
"foo \(filter_or_var) bar"
In your case:
jq --raw-output --arg TOSEARCH "$TXTIWANT" '.[] | select(.name | test("\($TOSEARCH)."; "i")) | .name'
side-note: By convention, environment variables (PAGER, EDITOR, ...) and internal shell variables (SHELL, BASH_VERSION, ...) are capitalized. All other variable names should be lower case.
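The same --arg mechanism also works when jq is driven from another program, which sidesteps shell-quoting problems entirely. A sketch in Python (torrents.json is a hypothetical file holding the JSON shown above):

import subprocess

text_i_want = "test"

with open("torrents.json") as f:   # hypothetical file containing the JSON from the question
    data = f.read()

# --arg exposes the Python string inside the jq program as $TOSEARCH,
# so nothing has to be spliced into the jq source text.
result = subprocess.run(
    ["jq", "--raw-output", "--arg", "TOSEARCH", text_i_want,
     r'.[] | select(.name | test("\($TOSEARCH)."; "i")) | .name'],
    input=data, capture_output=True, text=True, check=True,
)
print(result.stdout)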
If (as suggested by the name TXTIWANT and by the example, as well as by the wording of the question) the value of "$TXTIWANT" is supposed to be literal text, then using test is problematic, as test will search for a regular expression.
Since it is not clear from the question why you are adding a period to TOSEARCH, in the remainder of this first section, I will ignore whatever requirement you have in mind regarding that.
So if you simply want to find the .name values that contain $TXTIWANT literally (ignoring case), then you could convert both .name and the value of $TXTIWANT to the same case, and then check for containment.
In jq, ignoring the mysterious ".", this could be done like so:
jq --raw-output --arg TOSEARCH "$TXTIWANT" '
($TOSEARCH|ascii_upcase) as $up
| .[]
| .name
| select(ascii_upcase|index($up))'
Search for non-terminal occurrence of $TXTIWANT ignoring case
If the "." signifies there must be an additional character after $TXTIWANT, then you could just add another select as follows:
($TOSEARCH|length) as $len
| ($TOSEARCH|ascii_upcase) as $up
| .[]
| .name
| (ascii_upcase|index($up)) as $ix
| select($ix)
| select($ix + $len < length)
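If it helps to see the same two checks outside jq, here is a small Python sketch (txt_i_want stands in for $TXTIWANT):

txt_i_want = "test"

def matches(name):
    # Case-insensitive containment, plus "at least one more character after
    # the match" -- the role the "." plays in the original regex.
    ix = name.upper().find(txt_i_want.upper())
    return ix != -1 and ix + len(txt_i_want) < len(name)

print(matches("testtosearchcaseinsensitive"))  # True
print(matches("dontmatch"))                    # False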

Golang - nested map doesn't support indexing on inner level, while outer is fine

I'm almost a Go newbie, and for the first time I have to ask a question about it, about a problem with interfaces, types and maps.
So, my starting point is a database query that retrieves an object like this one:
+-------------+---------------+----------+------------+
| category_id | category_name | group_id | group_name |
+-------------+---------------+----------+------------+
| 1           | Category1     | 1        | Group1     |
| 1           | Category1     | 2        | Group2     |
| 1           | Category1     | 3        | Group3     |
| 2           | Category2     | 4        | Group4     |
| 2           | Category2     | 5        | Group5     |
+-------------+---------------+----------+------------+
and my final goal is to have a JSON object with the groups that belong to the same category nested under that category, like this one:
{
  "id": 1,
  "name": "category1",
  "groups": [
    {
      "id": 1,
      "name": "Group1"
    },
    {
      "id": 2,
      "name": "Group2"
    },
    {
      "id": 3,
      "name": "Group3"
    }
  ]
},
{
  "id": 2,
  "name": "Category2",
  "groups": [
    {
      "id": 4,
      "name": "Group4"
    },
    {
      "id": 5,
      "name": "Group5"
    }
  ]
}
I don't want to use multiple queries, because this is just a part of the final query; I used just two fields to keep the example clear. In my current situation I have 5 levels...
So I created a struct that should be used on all levels of my object, that implements an interface:
type NestedMapObjs interface {
    getOrderedKeys() []int
}
and the type that implements this interface, which needs to be a map keyed by int so that elements can be appended to the correct map:
type BuilderMapObjs map[int]NestedMapObj
where NestedMapObj is:
type NestedMapObj struct {
    ID        int
    Name      *string
    NestedObj NestedMapObjs
}
So, in the method that builds the map object I want, I have no problem adding the first level of my object (the category), but I found some problems on the second level, the group one. In particular, this is my function that adds a new row:
func (m BuilderMapObjs) addNewRow(scanned audienceBuilderScannedObject) error {
    if _, ok := m[scanned.CategoryID]; !ok {
        var innerObjs BuilderMapObjs
        innerObjs = make(BuilderMapObjs, 0)
        m[scanned.CategoryID] = NestedMapObj{
            ID:        scanned.CategoryID,
            Name:      &scanned.CategoryName,
            NestedObj: innerObjs,
        }
    }
    if _, ok := m[scanned.CategoryID].NestedObj[scanned.GroupID]; !ok {
        m[scanned.CategoryID].NestedObj[scanned.GroupID] = NestedMapObj{
            ID:   scanned.GroupID,
            Name: &scanned.GroupName,
        }
    }
    return nil
}
(I know, I can refactor and make this code more readable, but this is not the point now...)
The problem is when I try to get the inner object by its key, and when I try to add it. This line:
m[scanned.CategoryID].NestedObj[scanned.GroupID]
produces this error: invalid operation: m[scanned.CategoryID].NestedObj[scanned.GroupID] (type NestedMapObjs does not support indexing)
Actually, I just found that with a better implementation, adding two more methods to the interface (getIndex and addToIndex), I fixed the problem, but I'd like to understand it.
Why do I get an error on the inner object and not on the outer one?
Thanks for reading until this point!
