jq Compare two files and output the difference in text format - shell

I have 2 files
file_one.json
{
"releases": [
{
"name": "bpm",
"version": "1.1.5"
},
{
"name": "haproxy",
"version": "9.8.0"
},
{
"name": "test",
"version": "10"
}
]
}
and file_two.json
{
"releases": [
{
"name": "bpm",
"version": "1.1.6"
},
{
"name": "haproxy",
"version": "9.8.1"
},
{
"name": "test",
"version": "10"
}
]
}
In file 2 the versions were changed and I need to echo the new changes.
I have used the following command to see the changes:
diff -C 2 <(jq -S . file_one.json) <(jq -S . file_two.json)
But then I need to format the output to text like this:
The new versions are:
bpm 1.1.6
haproxy 9.8.1

You may be able to use the following jq command:
jq --slurp -r 'map(.releases) | add
| group_by(.name)
| map(unique | select(length > 1) | max_by(.version))
| map("\(.name) : \(.version)") | join("\n")'
file_one.json file_two.json
It first merges the two releases arrays, groups the elements by name, then deduplicates each resulting array, removes the arrays with a single element (the versions that were identical between the two files), maps each remaining array to its greatest element (by version) and finally formats those for display.
A few particularities that might make this solution incorrect for your use:
it doesn't only report version upgrades, but also version downgrades. However, it always returns the greatest version, regardless of which file contains it.
the version comparison is alphabetic. It's okay with your sample, but it can fail for multi-digit versions (e.g. 1.1.5 is considered greater than 1.1.20 because 5 > 2). This could be fixed, but might not be problematic depending on your versioning scheme.
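If the versions are strictly numeric and dot-separated (an assumption; a suffix like -beta1 would break tonumber), the alphabetic comparison can be fixed by splitting each version into numbers before comparing. A self-contained sketch, recreating the question's files:

```shell
# Recreate the two input files from the question.
cat > file_one.json <<'EOF'
{"releases":[{"name":"bpm","version":"1.1.5"},{"name":"haproxy","version":"9.8.0"},{"name":"test","version":"10"}]}
EOF
cat > file_two.json <<'EOF'
{"releases":[{"name":"bpm","version":"1.1.6"},{"name":"haproxy","version":"9.8.1"},{"name":"test","version":"10"}]}
EOF

# Split on "." and convert each component to a number, so jq compares
# versions component-wise: [1,1,20] sorts above [1,1,5].
jq --slurp -r 'map(.releases) | add
  | group_by(.name)
  | map(unique | select(length > 1)
        | max_by(.version | split(".") | map(tonumber)))
  | map("\(.name) : \(.version)") | join("\n")' \
  file_one.json file_two.json
```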
Edit following your updated request in the comments: the following jq command will output the versions changed between the first file and the second. It nicely handles downgrades and somewhat handles products that have appeared or disappeared in the second file (although it always shows the change as version --> null, whether the product appeared or disappeared).
jq --slurp -r 'map(.releases) | add
| group_by(.name)
| map(select(.[0].version != .[1].version))
| map ("\(.[0].name) : \(.[0].version) --> \(.[1].version)")
| join("\n")' file_one.json file_two.json
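To get exactly the text asked for in the question ("The new versions are:" followed by name and version), here is a sketch building on the same group_by approach. It assumes every name appears in both files and relies on group_by being stable, so .[0] comes from the first file and .[1] from the second:

```shell
# Recreate the two input files from the question.
cat > file_one.json <<'EOF'
{"releases":[{"name":"bpm","version":"1.1.5"},{"name":"haproxy","version":"9.8.0"},{"name":"test","version":"10"}]}
EOF
cat > file_two.json <<'EOF'
{"releases":[{"name":"bpm","version":"1.1.6"},{"name":"haproxy","version":"9.8.1"},{"name":"test","version":"10"}]}
EOF

# Keep only groups whose versions differ, print the file_two version,
# and prepend the requested header line.
jq --slurp -r 'map(.releases) | add
  | group_by(.name)
  | map(select(.[0].version != .[1].version) | "\(.[1].name) \(.[1].version)")
  | ["The new versions are:"] + . | join("\n")' file_one.json file_two.json
```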


How to extract a value by searching for two words in different lines and getting the value of second one

How to search for a word and, once it's found, save a specific value from the next line in a variable.
The JSON below is only a small part of the file.
Because this specific file's JSON structure is inconsistent and subject to change over time, it needs to be done via search tools like grep, sed or awk.
However, the parameters below will always be the same:
search for the word next
get the next line below it
extract everything after page_token= up to, but not including, the closing "
store it in a variable to be used
test.txt:
"link": [
{
"relation": "search",
"url": "aaa/ww/rrrrrrrrr/aaaaaaaaa/ffffffff/ccccccc/dddd/?token=gggggggg3444"
},
{
"relation": "next",
"url": "aaa/ww/rrrrrrrrr/aaaaaaaaa/ffffffff/ccccccc/dddd/?&_page_token=121_%_#212absa23bababa121212121212121"
},
]
so the desired output in this case is:
PAGE_TOKEN="121_%_#212absa23bababa121212121212121"
my attempt:
PAGE_TOKEN=$(cat test.txt| grep "next" | sed 's/^.*: *//;q')
No luck.
This might work for you (GNU sed):
sed -En '/next/{n;s/.*(page_token=)([^"]*).*/\U\1\E"\2"/p}' file
This is essentially a filtering operation, hence the use of the -n option.
Find a line containing next, fetch the next line, format as required and print the result.
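If your grep is GNU grep built with PCRE support, a similar filtering sketch uses -A1 to grab the line after the match and \K to discard everything before the token:

```shell
# Recreate the input fragment from the question.
cat > test.txt <<'EOF'
"link": [
{
"relation": "search",
"url": "aaa/ww/rrrrrrrrr/aaaaaaaaa/ffffffff/ccccccc/dddd/?token=gggggggg3444"
},
{
"relation": "next",
"url": "aaa/ww/rrrrrrrrr/aaaaaaaaa/ffffffff/ccccccc/dddd/?&_page_token=121_%_#212absa23bababa121212121212121"
},
]
EOF

# -A1 prints the matching line plus the line after it; -oP with \K
# keeps only what follows page_token= up to the closing quote.
PAGE_TOKEN=$(grep -A1 '"next"' test.txt | grep -oP 'page_token=\K[^"]*')
echo "$PAGE_TOKEN"
```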
Presuming your input is valid JSON, one option is to use:
cat test.json
[{
"relation": "search",
"url": "aaa/ww/rrrrrrrrr/aaaaaaaaa/ffffffff/ccccccc/dddd/?token=gggggggg3444"
},
{
"relation": "next",
"url": "aaa/ww/rrrrrrrrr/aaaaaaaaa/ffffffff/ccccccc/dddd/?&_page_token=121_%_#212absa23bababa121212121212121"
}
]
PAGE_TOKEN=$(jq -r '.[] | select(.relation=="next") | .url | gsub(".*=";"")' test.json)
echo "$PAGE_TOKEN"
121_%_#212absa23bababa121212121212121
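A slightly more targeted jq variant is a sketch using capture, which extracts a named group starting right after page_token= instead of stripping everything up to the last = sign:

```shell
# Recreate the valid-JSON version of the input.
cat > test.json <<'EOF'
[{
"relation": "search",
"url": "aaa/ww/rrrrrrrrr/aaaaaaaaa/ffffffff/ccccccc/dddd/?token=gggggggg3444"
},
{
"relation": "next",
"url": "aaa/ww/rrrrrrrrr/aaaaaaaaa/ffffffff/ccccccc/dddd/?&_page_token=121_%_#212absa23bababa121212121212121"
}
]
EOF

# capture returns an object of named groups; .tok is everything after
# page_token= (jq sees the raw string, so there are no quotes to strip).
PAGE_TOKEN=$(jq -r '.[] | select(.relation=="next") | .url
                    | capture("page_token=(?<tok>.*)").tok' test.json)
echo "$PAGE_TOKEN"
```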

looping through json object using jq

I have a JSON file which I obtained using a curl command, and it looks as below.
{
"Storages": [
{
"Creation": "2020-04-21T14:01:54",
"Modified": "2020-04-21T14:01:54",
"Volume": "/dev/null",
"id": 10000,
"version": "20190925-230722"
},
{
"Creation": "2020-04-22T14:01:54",
"Modified": "2020-04-22T14:01:54",
"Volume": "/opt/home",
"id": 10001,
"version": "22a-20190925-230722"
},
{
"Creation": "2020-04-23T14:01:54",
"Modified": "2020-04-23T14:01:54",
"Volume": "/home/abcd",
"id": 10003,
"version": "21c-20190925-230722"
}
]
}
Now I need to loop through the array and get the id and Volume values into 2 variables if the version starts with 21a. No need to form another JSON.
For educational purposes, here's a jq command that does both the things you want, but in 2 separate steps:
jq -r 'del(.Storages[] | select(.version | startswith("21a") | not))
.Storages[] | {id, Volume}'
The first part (del(.Storages[] | select(.version | startswith("21a") | not))) filters out the array elements whose version does not start with 21a. The second part (.Storages[] | {id, Volume}) drills in and extracts the specific fields you need.
You can use startswith builtin function such as
jq -r '.Storages[] | select(.version | startswith("21a")) | {id, Volume}'
Edit: Assuming the JSON is stored in a file (Storages.json), you can assign the results to shell variables such as
$ readarray -t vars < <( jq -r '.Storages[] | select(.version|startswith("21a"))| .id, .Volume' Storages.json )
and display those variables as
$ declare -p vars
declare -a vars='([0]="10003" [1]="/home/abcd")'
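To land the two values in separate named variables rather than one array, here is a sketch using @tsv and read. It assumes bash (for the process substitution) and exactly one matching element:

```shell
# Minimal Storages.json with one matching element (trimmed from the question).
cat > Storages.json <<'EOF'
{"Storages":[
  {"Volume":"/dev/null","id":10000,"version":"20190925-230722"},
  {"Volume":"/home/abcd","id":10003,"version":"21a-20190925-230722"}
]}
EOF

# @tsv emits the two fields tab-separated; read splits them on the tab,
# which is safe even if the Volume path contains spaces.
IFS=$'\t' read -r id volume < <(
  jq -r '.Storages[] | select(.version | startswith("21a"))
         | [.id, .Volume] | @tsv' Storages.json
)
echo "$id $volume"
```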

jq bash issue with filtering with CONTAINS

I have an issue where I am trying to filter records with CONTAINS, but it won't accept a variable that has spaces in it. I am including the JSON and the calls, and I explain what works and which one does not. I have looked high and low and tried many ways (hundreds, taking into account the double quotes, escaped, not escaped, with, without), but no luck. Can someone take a look and point me to something that might help?
JSON used to test
_metadatadashjson='{ "meta": { "provisionedExternalId": "" }, "dashboard": { "liveNow": false, "panels": [ { "collapsed": false, "title": "Gyrex Thread Count Gauges", "type": "row", "targets": [ { "expr": "jvm_threads_current{instance=\"192.1.50.22:8055\",job=\"prometheus_gyrex\"}", "refId": "B" } ] }, { "datasource": "Prometheus_16_Docker", "targets": [ { "exemplar": true, "expr": "jvm_threads_current{instance=\"10.32.0.4:8055\",job=\"prometheus_gyrex\"}" } ], "title": ".16 : 3279", "type": "gauge" }, { "description": "", "targets": [ { "expr": "jvm_threads_current{instance=\"10.32.0.7:8055\",job=\"prometheus_gyrex\"}", "refId": "B" } ], "title": ".16 : 3288", "type": "graph" }, { "description": "", "targets": [ { "expr": "jvm_threads_current{instance=\"192.168.2.16:3288\",job=\"prometheus_gyrex\"}", "refId": "C" } ], "title": ".16 : 3288", "type": "graph" } ], "version": 55 }}'
Set the string to search for in key "expr"
exprStrSearch="10.32.0.4:8055"
This works returns one record
echo "${_metadatadashjson}" | jq -r --arg EXPRSTRSEARCH "$exprStrSearch" '.dashboard.panels[] | select(.targets[].expr | contains($EXPRSTRSEARCH)) | .targets[].expr'
This works no problem returns two records.
echo "${_metadatadashjson}" | jq -r --arg EXPRSTRSEARCH "$exprStrSearch" '.dashboard.panels[] | select(.targets[].expr | contains("10.32.0.4:8055", "10.32.0.7:8055")) | .targets[].expr'
Change the value to include a space and another string
exprStrSearch="10.32.0.4:8055 10.32.0.7:8055"
Does not work.
echo "${_metadatadashjson}" | jq -r --arg EXPRSTRSEARCH "$exprStrSearch" '.dashboard.panels[] | select(.targets[].expr | contains($EXPRSTRSEARCH)) | .targets[].expr'
None of your data contains "10.32.0.4:8055 10.32.0.7:8055".
You could pass multiple strings to contains(), using a bash array:
strings=("10.32.0.4:8055" "10.32.0.7:8055")
echo "${_metadatadashjson}" |
jq -r --args '.dashboard.panels[] | select(.targets[].expr | contains($ARGS.positional[])) | .targets[].expr' "${strings[@]}"
But contains will evaluate to true once for each match, i.e. if one expr contained both strings, it would be selected (and printed) twice.
With test, that won't happen. Here's how you can add the |s between multiple strings, and pass them in a single jq variable (as well as escape all the dots):
strings=("10.32.0.4:8055" "10.32.0.7:8055")
IFS=\|
echo "${_metadatadashjson}" |
jq -r --arg str "${strings[*]//./\\.}" '.dashboard.panels[] | select(.targets[].expr | test($str)) | .targets[].expr'
Both examples print this:
jvm_threads_current{instance="10.32.0.4:8055",job="prometheus_gyrex"}
jvm_threads_current{instance="10.32.0.7:8055",job="prometheus_gyrex"}
Update: I forgot to escape the dots for test. I edited the test example so that all the dots get escaped (with a single backslash). It's regex, so (unescaped) dots will match any character. The contains example matches the strings literally (not regex).
The problem is that the string with the space in it does not in fact occur in the given JSON. It's not too clear what you are trying to do but please note that contains is not symmetric:
"a" | contains("a b")
evaluates to false.
If you intended to write a boolean search criterion, you could use a boolean expression, or use jq's regular expression machinery, e.g.
test("10.32.0.4:8055|10.32.0.7:8055")
or probably even better:
test("\"(10[.]32[.]0[.]4:8055|10[.]32[.]0[.]7:8055)\"")

Select highest version value from JSON array

I have a JSON result and I need to search for a specific value and get it from the array.
For example, here is my JSON; I need to search among the 1.15 versions for the one with the highest patch version inside the validNodeVersions array. So here I want to retrieve the value 1.15.12-gke.20, which is the highest of the 1.15 versions in the array. Can somebody please help with this?
Basically I am always looking to pick the highest patch release for any given version; for 1.15 it is 1.15.12-gke.20.
gcloud container get-server-config --format json
{
"channels": [
{
"channel": "REGULAR",
"defaultVersion": "1.17.9-gke.1504",
"validVersions": [
"1.17.9-gke.6300",
"1.17.9-gke.1504"
]
},
{
"channel": "STABLE",
"defaultVersion": "1.16.13-gke.401",
"validVersions": [
"1.16.13-gke.401",
"1.15.12-gke.20"
]
}
],
"defaultClusterVersion": "1.16.13-gke.401",
"defaultImageType": "COS",
"validImageTypes": [
"UBUNTU",
"UBUNTU_CONTAINERD"
],
"validMasterVersions": [
"1.17.12-gke.500",
"1.14.10-gke.50"
],
"validNodeVersions": [
"1.17.12-gke.500",
"1.16.8-gke.12",
"1.15.12-gke.20",
"1.15.12-gke.17",
"1.15.12-gke.16",
"1.15.12-gke.13",
"1.15.12-gke.9",
"1.15.12-gke.6",
"1.15.12-gke.3",
"1.15.12-gke.2",
"1.15.11-gke.17",
"1.15.11-gke.15",
"1.15.11-gke.13",
"1.15.11-gke.12",
"1.15.11-gke.11",
"1.15.11-gke.9",
"1.15.11-gke.5",
"1.15.11-gke.3",
"1.15.11-gke.1",
"1.15.9-gke.26",
"1.15.8-gke.3",
"1.15.7-gke.23",
"1.15.4-gke.22",
"1.14.10-gke.0",
"1.14.9-gke.0"
]
}
It is trickier to match the regex, sort by version or do anything similar inside jq. The GNU sort command has a nice parameter, -V, that stands for version sorting, so here is a simple way to do this, also without any awk or field splitting:
jq -r '.validNodeVersions[]' file.json | grep "^1\.15" | sort -V | tail -1
1.15.12-gke.20
jq does a simple extraction of the values here, grep filters those values by version prefix, and after sorting by version we take the last one, i.e. the highest.
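For completeness, the same selection can be done purely in jq. This is a sketch assuming all node versions have the form x.y.z-gke.n: capture the four numeric components and let max_by compare them numerically:

```shell
# Trimmed sample of the question's validNodeVersions (assumption: the
# full list behaves the same way).
cat > server-config.json <<'EOF'
{"validNodeVersions":[
  "1.17.12-gke.500",
  "1.16.8-gke.12",
  "1.15.12-gke.20",
  "1.15.12-gke.9",
  "1.15.9-gke.26",
  "1.15.4-gke.22"
]}
EOF

# Select the 1.15 versions, then compare [major, minor, patch, gke]
# as numbers, so 1.15.12-gke.9 sorts above 1.15.9-gke.26.
jq -r '[.validNodeVersions[] | select(startswith("1.15."))]
  | max_by(capture("(?<a>\\d+)\\.(?<b>\\d+)\\.(?<c>\\d+)-gke\\.(?<d>\\d+)")
           | [.a, .b, .c, .d] | map(tonumber))' server-config.json
```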

Trying to iterate operations on a file with awk and sed

I've got a line that pulls out the number of times the word severity appears after the word vulnerabilities in a file
please don't laugh too hard:
cat <file> | sed '1,/vulnerabilities/d' | grep -c '"severity": 4'
This comes back with a count of the "severity" : 4 matches in the file. I can't seem to iterate this across other files.
I have 100 or so files in the form bleeblah-082017. Where bleeblah can be different lengths and words. I'm having issues on how to easily iterate from one file above to get results from each individually.
I would usually have used an awk line to iterate through the list, but I can't seem to find any examples to meld awk and sed.
Would anyone have any ideas on how to perform the task above over many files and return a results per file?
Thanks
Davey
I have a file that has a bunch of entries such as:
{
"count": 6,
"plugin_family": "Misc.",
"plugin_id": 7467253,
"plugin_name": "Blah",
"severity": 4,
"severity_index": 1,
"vuln_index": 13
I'd like to extract the number of times "severity": 4 appears after the word vulnerabilities in each file. The output would be 10.
Some more of the input file.
"notes": null,
"remediations": {
"num_cves": 20,
"num_hosts": 6,
"num_impacted_hosts": 2,
"num_remediated_cves": 6,
"remediations": [
{
"hosts": 2,
"remediation": "Apache HTTP Server httpOnly Cookie Information Disclosure: Upgrade to Apache version 2.0.65 / 2.2.22 or later.",
"value": "f950f3ddf554d7ea2bda868d54e2b639",
"vulns": 4
},
{
"hosts": 2,
"remediation": "Oracle Application Express (Apex) CVE-2012-1708: Upgrade Application Express to at least version 4.1.1.",
"value": "2c07a93fee3b201a9c380e59fa102ccc",
"vulns": 2
}
]
},
"vulnerabilities": [
{
"count": 6,
"plugin_family": "Misc.",
"plugin_id": 71049,
"plugin_name": "SSH Weak MAC Algorithms Enabled",
"severity": 1,
"severity_index": 0,
"vuln_index": 15
},
{
"count": 6,
"plugin_family": "Misc.",
"plugin_id": 70658,
"plugin_name": "SSH Server CBC Mode Ciphers Enabled",
"severity": 1,
"severity_index": 1,
"vuln_index": 13
},
{
"count": 2,
"plugin_family": "Web Servers",
"plugin_id": 64713,
"plugin_name": "Oracle Application Express (Apex) CVE-2012-1708",
"severity": 2,
"severity_index": 2,
"vuln_index": 12
},
Each of these files is from a vulnerability scan extracted via my scanner API. Essentially the word severity is all over the place in the different sections (hosts, vulns, etc). I want to extract from each scan file the number of times the pattern appears after the word vulnerabilities (which only appears once in each file). Open to using Perl, Python or whatever to achieve this; I was just more familiar with shell scripting for manipulating these text-type files in the past.
Parsing .json data with sed or awk is fraught with potential pitfalls. I recommend using a format-aware tool like jq to query the data you want. In this case, you can do something like
jq '{(input_filename): [.vulnerabilities[] | select(.severity == 4)] | length}' *.json
This should produce output something like
{
"bleeblah-201708.json": 4
}
{
"bleeblah-201709.json": 11
}
Use jq for parsing JSON on the command line; it is the standard tool. Parsing JSON with text-based tools like sed is very fragile, since it relies on the order of elements and the formatting of the JSON documents, neither of which is guaranteed or part of the JSON standard.
What you are looking for is the following command:
jq '[.vulnerabilities[]|select(.severity==4)]|length' file.json
If you want to run it for multiple files, use find:
find FOLDER -name 'PATTERN.json' -print \
-exec jq '[.vulnerabilities[]|select(.severity==4)]|length' {} +
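The find variant above prints the filename and count on separate lines; a single jq invocation can also print one "name: count" line per file (a sketch; input_filename names the input currently being read):

```shell
# Two small sample files shaped like the question's scans (assumed data).
cat > scan-a.json <<'EOF'
{"vulnerabilities":[{"severity":4},{"severity":1},{"severity":4}]}
EOF
cat > scan-b.json <<'EOF'
{"vulnerabilities":[{"severity":2}]}
EOF

# One line of output per input file: the name, then the count of
# severity-4 entries in that file's vulnerabilities array.
jq -r '"\(input_filename): \([.vulnerabilities[] | select(.severity == 4)] | length)"' \
  scan-a.json scan-b.json
```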
I have made the following two example files, assuming that they can represent what you have. Note the occurrence of the search text both before "vulnerabilities" and after it, with a different number of occurrences after it in each file.
From your code I assume that the search string will appear at most once on a line, and that the matching lines are counted.
blableh-082017:
"severity" : 4
"severity" : 4
vulnerabilities
"severity" : 4
"severity" : 4
bleeblah-082017:
"severity" : 4
"severity" : 4
vulnerabilities
"severity" : 4
"severity" : 4
"severity" : 4
Here is my proposal, using find in addition to sed and grep, also using sh to achieve the desired piping inside -exec.
find . -iname "*-082017" -print -exec sh -c "sed 1,/vulnerabilities/d {} | grep -c '\"severity\" : 4'" \;
Output (hoping a name line and a count line are OK; otherwise another sed invocation could reformat it for you):
./blableh-082017
2
./bleeblah-082017
3
Details:
use find to process multiple files and get each file name into the output,
in spite of sed's lack of support for that
use basically your code to do the cutting via sed and the counting via grep
give the filename to sed as a parameter, instead of via a pipe from cat
use sh within -exec to achieve piping
(answer by devnull to How to use pipe within -exec in find)
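When find's extra features aren't needed, a plain shell loop over the glob is an alternative sketch of the same sed/grep pipeline, printing the filename and count on one line:

```shell
# Recreate the two sample files from above.
printf '%s\n' '"severity" : 4' '"severity" : 4' 'vulnerabilities' \
              '"severity" : 4' '"severity" : 4' > blableh-082017
printf '%s\n' '"severity" : 4' '"severity" : 4' 'vulnerabilities' \
              '"severity" : 4' '"severity" : 4' '"severity" : 4' > bleeblah-082017

# For each file: drop everything up to the vulnerabilities line,
# count the remaining matches, print "filename count".
for f in *-082017; do
  printf '%s %s\n' "$f" "$(sed '1,/vulnerabilities/d' "$f" | grep -c '"severity" : 4')"
done
```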
Environment:
GNU sed version 4.2.1
GNU bash, version 3.1.23(1)-release (i686-pc-msys)
GNU grep 2.5.4
find (GNU findutils) 4.4.2