I have a file like the one below. As you can see, there are a few lines of content between curly braces. Since there are multiple groups of opening and closing curly braces, I want to extract the content between the braces ({ and }) for each group separately.
Sample file:
{
"/tmp/©ƒ-4bf57ed2-velero/velero/templates/crds.yaml": [
],
"/tmp/velero-4bf57ed2-velero/velero/templates/deployment.yaml": [
],
"/tmp/velero-4bf57ed2-velero/velero/templates/restic-daemonset.yaml": [
],
"/tmp/velero-4bf57ed2-velero/velero/templates/secret.yaml": [
]
}
{
"/tmp/autoscaler-fb12fa7a-cluster-autoscaler/cluster-autoscaler/templates/deployment.yaml": [
".spec.replicas: '2' != '0'",
],
"/tmp/autoscaler-fb12fa7a-cluster-autoscaler/cluster-autoscaler/templates/servicemonitor.yaml": [
"error: the server doesn't have a resource type \"ServiceMonitor\"\n"
]
}
{
"/tmp/metrics-server-1960953a-metrics-server-certs/raw/templates/resources.yaml": [
"error: the server doesn't have a resource type \"Issuer\"\n",
"error: the server doesn't have a resource type \"Certificate\"\n"
]
}
Expected result: three separate data chunks, each containing the content between one pair of curly braces.
Could someone help me here?
If you have a sequence of valid JSON objects, you can use jq to easily and robustly process them:
Given file.jsons:
{
"/tmp/©ƒ-4bf57ed2-velero/velero/templates/crds.yaml": [ ""
],
"/tmp/velero-4bf57ed2-velero/velero/templates/deployment.yaml": [ ""
],
"/tmp/velero-4bf57ed2-velero/velero/templates/restic-daemonset.yaml": [ ""
],
"/tmp/velero-4bf57ed2-velero/velero/templates/secret.yaml": [ ""
]
}
{
"/tmp/autoscaler-fb12fa7a-cluster-autoscaler/cluster-autoscaler/templates/deployment.yaml": [
".spec.replicas: '2' != '0'"
],
"/tmp/autoscaler-fb12fa7a-cluster-autoscaler/cluster-autoscaler/templates/servicemonitor.yaml": [
"error: the server doesn't have a resource type \"ServiceMonitor\"\n"
]
}
{
"/tmp/metrics-server-1960953a-metrics-server-certs/raw/templates/resources.yaml": [
"error: the server doesn't have a resource type \"Issuer\"\n",
"error: the server doesn't have a resource type \"Certificate\"\n"
]
}
You can, for example, reformat each object onto a single line:
$ jq -s -r 'map(@json) | join("\n")' < file.jsons
{"/tmp/©ƒ-4bf57ed2-velero/velero/templates/crds.yaml":[""],"/tmp/velero-4bf57ed2-velero/velero/templates/deployment.yaml":[""],"/tmp/velero-4bf57ed2-velero/velero/templates/restic-daemonset.yaml":[""],"/tmp/velero-4bf57ed2-velero/velero/templates/secret.yaml":[""]}
{"/tmp/autoscaler-fb12fa7a-cluster-autoscaler/cluster-autoscaler/templates/deployment.yaml":[".spec.replicas: '2' != '0'"],"/tmp/autoscaler-fb12fa7a-cluster-autoscaler/cluster-autoscaler/templates/servicemonitor.yaml":["error: the server doesn't have a resource type \"ServiceMonitor\"\n"]}
{"/tmp/metrics-server-1960953a-metrics-server-certs/raw/templates/resources.yaml":["error: the server doesn't have a resource type \"Issuer\"\n","error: the server doesn't have a resource type \"Certificate\"\n"]}
Now you can process it line by line without having to worry about matching up curly braces.
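A shorter equivalent of the filter above is jq -c ., which also prints one object per line. Either way, a plain shell loop can then post-process each object; a minimal sketch, assuming bash and the file.jsons name from above (the keys[] filter is just a placeholder):
$ jq -c . file.jsons | while IFS= read -r obj; do
    # each $obj is one complete JSON object on a single line;
    # hand it back to jq (or any other tool) for further processing
    jq -r 'keys[]' <<< "$obj"
  done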
Thank you for your suggestion, but the above jq does not work for every JSON payload. For example, for the payload below it gives an error:
{
"/tmp/ingress-dae7bd30-ingress-internet/nginx-ingress/templates/controller-deployment.yaml": [
".spec.replicas: '2' != '3'",
],
"/tmp/ingress-dae7bd30-ingress-internet/nginx-ingress/templates/controller-metrics-service.yaml": [
".spec.clusterIP: '' != '10.3.24.53'"
],
"/tmp/ingress-dae7bd30-ingress-internet/nginx-ingress/templates/controller-service.yaml": [
".spec.clusterIP: '' != '10.3.115.118'"
],
"/tmp/ingress-dae7bd30-ingress-internet/nginx-ingress/templates/controller-stats-service.yaml": [
".spec.clusterIP: '' != '10.3.115.30'"
],
"/tmp/ingress-dae7bd30-ingress-internet/nginx-ingress/templates/default-backend-deployment.yaml": [
]
}
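The error here comes from the trailing commas (e.g. after ".spec.replicas: '2' != '3'"), which make the payload invalid JSON, so jq's parser rejects it. One possible workaround, sketched under the assumption that GNU sed is available (for -z) and that no quoted string ends in a comma directly before a closing bracket, is to strip those commas before handing the data to jq:
# delete a comma that is followed only by whitespace and a closing ] or }
$ sed -Ez 's/,([[:space:]]*[]}])/\1/g' file.jsons | jq -c .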
My requirement is to generate a Terraform variables file based on the environment. I'm trying to generate it using a bash/shell script, but I'm having difficulty converting the output to Terraform's HCL language (I can't use JSON because I'm manipulating the values further in a Terraform module).
Current API output (which needs to be converted into HCL):
'cluster_name1' 'REGION1' 'Volume_Size1' 'Instance_Size1'
'cluster_name2' 'REGION2' 'Volume_Size2' 'Instance_Size2'
'cluster_name3' 'REGION3' 'Volume_Size3' 'Instance_Size3'
{...}
Output in CSV format:
"cluster_name1","REGION1","Volume_Size1","Instance_Size1"
"cluster_name2","REGION2","Volume_Size2","Instance_Size2"
"cluster_name3","REGION3","Volume_Size3","Instance_Size3"
{...}
Required format:
variable "cluster_configuration" {
default = [
{
"cluster_name" : "cluster_name1"
"cluster_region" : "REGION1"
"instance_db_size" : "Volume_Size1"
"instance_size" : "Instance_Size1"
},
{
"cluster_name" : "cluster_name2"
"cluster_region" : "REGION2"
"instance_db_size" : "Volume_Size2"
"instance_size" : "Instance_Size2"
},
{
"cluster_name" : "cluster_name3"
"cluster_region" : "REGION3"
"instance_db_size" : "Volume_Size3"
"instance_size" : "Instance_Size3"
},
{....}
]
}
My Terraform code, just for reference:
locals {
dbconfig = [
for db in var.cluster_configuration : [{
instance_name = db.cluster_name
db_size = db.instance_db_size
instance_size = db.instance_size
cluster_region = db.cluster_region
}
]
]
}
I have tried with awk and sed but have had no luck so far.
Terraform can process JSON data if you define the objects correctly. You might be able to use jq to format the given CSV data into appropriate JSON for Terraform to consume. If I recall correctly, the filter would be something like
# This *very* much assumes that the quotes aren't actually
# quoting fields that contain a real comma. jq is not suitable
# for robustly parsing CSV data.
[
inputs |
split(",") |
map(ltrimstr("\"")) |
map(rtrimstr("\"")) |
{
cluster_name: .[0],
cluster_region: .[1],
instance_db_size: .[2],
instance_size: .[3]
}
] | {variable: {cluster_configuration: {default: .}}}
(There's probably room for improvement.) Assuming you save that to a file like api.jq and your API output is in output.csv, then
$ jq -nRf api.jq output.csv
{
"variable": {
"cluster_configuration": {
"default": [
{
"cluster_name": "cluster_name1",
"cluster_region": "REGION1",
"instance_db_size": "Volume_Size1",
"instance_size": "Instance_Size1"
},
{
"cluster_name": "cluster_name2",
"cluster_region": "REGION2",
"instance_db_size": "Volume_Size2",
"instance_size": "Instance_Size2"
},
{
"cluster_name": "cluster_name3",
"cluster_region": "REGION3",
"instance_db_size": "Volume_Size3",
"instance_size": "Instance_Size3"
}
]
}
}
}
It might be simpler to pick the language of your choice with a proper CSV parser, though, to generate the JSON.
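Alternatively, since Terraform also reads JSON-syntax configuration from files ending in .tf.json, you may not need HCL at all; a sketch, reusing the api.jq and output.csv names from above:
# A top-level {"variable": ...} object in a .tf.json file is the JSON form
# of an HCL variable block, so Terraform can load the generated file as-is.
$ jq -nRf api.jq output.csv > cluster_configuration.tf.json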
A solution using jq. The content is as requested, though the exact formatting shown in the question is not reproduced:
INPUT="
'cluster_name1' 'REGION1' 'Volume_Size1' 'Instance_Size1'
'cluster_name2' 'REGION2' 'Volume_Size2' 'Instance_Size2'
'cluster_name3' 'REGION3' 'Volume_Size3' 'Instance_Size3'
"
jq -srR '
split("\n") | # split lines
map(split(" ") | # split fields
select(any) | # remove empty lines
map(.[1:-1]) | # remove enclosing quotes
{
cluster_name: .[0],
cluster_region: .[1],
instance_db_size: .[2],
instance_size: .[3]
}) |
"variable \"cluster_configuration\" {",
" default = ",
.,
"}"
' <<< "$INPUT"
Output
variable "cluster_configuration" {
default =
[
{
"cluster_name": "cluster_name1",
"cluster_region": "REGION1",
"instance_db_size": "Volume_Size1",
"instance_size": "Instance_Size1"
},
{
"cluster_name": "cluster_name2",
"cluster_region": "REGION2",
"instance_db_size": "Volume_Size2",
"instance_size": "Instance_Size2"
},
{
"cluster_name": "cluster_name3",
"cluster_region": "REGION3",
"instance_db_size": "Volume_Size3",
"instance_size": "Instance_Size3"
}
]
}
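If the exact formatting matters, one option is to pipe the generated block through terraform fmt, which reads from stdin when given - (this assumes the terraform CLI is installed; HCL2 does accept the "key": value object syntax emitted above):
$ jq -srR -f to_hcl.jq <<< "$INPUT" | terraform fmt -
Here to_hcl.jq is a hypothetical file holding the filter above.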
I'm using Argo Events and Argo Workflows for my CI/CD chain, which works pretty neatly. But I'm having some trouble setting up the data filter for the GitHub webhook payloads of my monorepo.
I'm trying to let the sensor trigger the defined workflow only if files were changed in a certain subpath. The payload contains three fields: added, removed, and modified. These list the files that were changed in the commit (webhook-events-and-payloads#push).
The paths I'm searching for are service/jobs/* and service/common*/*.
The filter I defined is:
- path: "[commits.#.modified.#(%\"*service*\")#,commits.#.added.#(%\"*service*\")#,commits.#.removed.#(%\"*service*\")#]"
type: string
value:
- "(\bservice/jobs\b)|(\bservice/common*)"
I validated my filter in a tiny Go script, since gjson is what Argo Events uses to apply the data filter.
package main
import (
"github.com/tidwall/gjson"
"regexp"
)
const json = `{
"commits": [
{
"added": [
],
"removed": [
],
"modified": [
"service/job-manager/README.md"
]
},
{
"added": [
],
"removed": [
"service/joby/something.md"
],
"modified": [
"service/job-manager/something.md"
]
},
{
"added": [
],
"removed": [
"service/joby/something.md"
],
"modified": [
"service/joby/someother.md"
]
}
],
"head_commit": {
"added": [
"service/job-manager/something.md"
],
"removed": [
"service/joby/something.md"
],
"modified": [
"service/job-manager/README.md"
]
}
}`
func main() {
value := gjson.Get(json, "[commits.#.modified.#(%\"*service*\")#,commits.#.added.#(%\"*service*\")#,commits.#.removed.#(%\"*service*\")#]")
println(value.String())
matched, _ := regexp.MatchString(`(\bservice/job-manager\b)|(\bservice/common*)`, value.String())
println(matched) // string is contained?
}
The script gives me the results I expect, but with the same webhook payload the workflow is not triggered once the data filter is added to the sensor.
Does anyone have any ideas?
UPDATE:
Thanks for the hint about including body. in the paths.
I ended up setting the filters:
- path: "[body.commits.#.modified.#()#,body.commits.#.added.#()#,body.commits.#.removed.#()#]"
type: string
value:
- ".*service/jobs.*"
- ".*service/common.*"
The path should start with body.
In value, special characters should be escaped with \\.
So the data filter should be
- path: "[body.commits.#.modified.#(%\"*service*\")#,body.commits.#.added.#(%\"*service*\")#,body.commits.#.removed.#(%\"*service*\")#]"
type: string
value:
- "(\\bservice/jobs\\b)|(\\bservice/common*)"
The content is:
{
"properties" : {
"CloudSanityPassed" : [ "true" ],
"GITCOMMIT" : [ "test1" ],
"buildNumber" : [ "54" ],
"jobName" : [ "InveergDB-UI" ]
},
"uri" : "http://ergctory:8081/aergergory/api/storage/test-reergerglease-reergpo/cergom/cloergud/waf/ergregBUI/1ergerggregSHOT/ergregerg-34.zip"
}
I use this command
.[] | ."CloudSanityPassed" | .[]
And I get this message
jq: error (at <stdin>:8): Cannot index string with string "CloudSanityPassed"
"true"
exit status 5
I get what I want (the "true" value), but there is an error in the output. Could you explain how to avoid it and why it happens?
According to the jq manual, .[] returns all the values of an object when applied to an object.
So you get two values, one for "properties" and the other for "uri":
{
"CloudSanityPassed": [
"true"
],
"GITCOMMIT": [
"test1"
],
"buildNumber": [
"54"
],
"jobName": [
"InveergDB-UI"
]
}
"http://ergctory:8081/aergergory/api/storage/test-reergerglease-reergpo/cergom/cloergud/waf/ergregBUI/1ergerggregSHOT/ergregerg-34.zip"
jq then tries to apply the ."CloudSanityPassed" operator to each of those values.
Since the former is an object (a dictionary, aka hash), ."CloudSanityPassed" applies and yields the value ["true"]; the latter, however, is a plain string, to which ."CloudSanityPassed" cannot be applied, so jq reports an error at that point.
Maybe the command you want is just .properties.CloudSanityPassed.
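Applied to the JSON above (saved as a hypothetical file.json), that gives the value without any error:
$ jq '.properties.CloudSanityPassed' file.json
[
  "true"
]
$ jq -r '.properties.CloudSanityPassed[0]' file.json
true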
In my case jq '[.[] | group_by(.foo)]' gave the error, but jq '[.[]] | group_by(.foo)' worked.
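The difference: group_by(.foo) needs the whole array as its input. .[] | group_by(.foo) first streams the elements out and then tries to group each element individually, while [.[]] keeps them collected. A small illustration with made-up data:
$ echo '[{"foo":1},{"foo":2},{"foo":1}]' | jq -c '[.[]] | group_by(.foo)'
[[{"foo":1},{"foo":1}],[{"foo":2}]]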
I noticed that Google accepts transliteration and IME requests in any language through the URL:
https://inputtools.google.com/request?text=$&itc=$&num=$\
&cp=0&cs=1&ie=utf-8&oe=utf-8&app=test
where each $ is one of the variables below; this works for any language and text.
For example, French (try it):
var text = "ca me plait",
itc = "fr-t-i0-und",
num = 10;
// Result:
[
"SUCCESS",
[
[
"ca me plait",
[
"ça me plaît"
]
]
]
]
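For reference, the French example can be reproduced from the command line; this sketch simply URL-encodes the text value and reuses the query parameters from the URL above:
$ curl 'https://inputtools.google.com/request?text=ca%20me%20plait&itc=fr-t-i0-und&num=10&cp=0&cs=1&ie=utf-8&oe=utf-8&app=test'
["SUCCESS",[["ca me plait",["ça me plaît"]]]]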
Or, Mandarin (try it):
var text = "shide",
itc = "zh-t-i0-pinyin",
num = 5;
// Result:
[
"SUCCESS",
[
[
"shide",
[
"使得",
"似的",
"是的",
"实德",
"似地"
],
[],
{
"annotation": [
"shi de",
"shi de",
"shi de",
"shi de",
"shi de"
]
}
]
]
]
All languages work and return great suggestions. The thing is, I can't find documentation for this anywhere on the web, although it clearly looks like an API. Does anyone know if there is an official Google client, or whether they're okay with raw, unauthenticated requests?
It's used, perhaps unofficially, by plugins like jQuery.chineseIME.js, but I would appreciate any official usage information.
Whatever. I created my own plugin that uses it for Chinese, and can be extended easily: https://bitbucket.org/purohit/jquery.intlkeyboard.js.
I am trying to parse through a JSON response for customer data (names and emails) and construct a CSV file with matching column headings.
For some reason, every time I run this code I get a CSV file with all the first names in one cell (no separation between the names, just one string of names appended together), and the same for the last names. The following code does not include emails (I'll worry about that later).
Code:
def self.fetch_emails
access_token ||= AssistlyArticle.remote_setup
cust_response = access_token.get("https://blah.desk.com/api/v1/customers.json")
cust_ids = JSON.parse(cust_response.body)["results"].map{|w| w["customer"]["id"].to_i}
FasterCSV.open("/Users/default/file.csv", "wb") do |csv|
# header row
csv << ["First name", "Last Name"]
# data rows
cust_ids.each do |cust_firstname|
json = JSON.parse(cust_response.body)["results"]
csv << [json.map{|x| x["customer"]["first_name"]}, json.map{|x| x["customer"]["last_name"]}]
end
end
end
Output:
First Name | Last Name
JohnJillJamesBill SearsStevensSethBing
and so on...
Desired Output:
First Name | Last Name
John | Sears
Jill | Stevens
James | Seth
Bill | Bing
Sample JSON:
{
"page":1,
"count":20,
"total":541,
"results":
[
{
"customer":
{
"custom_test":null,
"addresses":
[
{
"address":
{
"region":"NY",
"city":"Commack",
"location":"67 Harned Road,
Commack,
NY 11725,
USA",
"created_at":"2009-12-22T16:21:23-05:00",
"street_2":null,
"country":"US",
"updated_at":"2009-12-22T16:32:37-05:00",
"postalcode":"11725",
"street":"67 Harned Road",
"lng":"-73.196225",
"customer_contact_type":"home",
"lat":"40.716894"
}
}
],
"phones":
[
],
"last_name":"Suriel",
"custom_order":"4",
"first_name":"Jeremy",
"custom_t2":"",
"custom_i":"",
"custom_t3":null,
"custom_t":"",
"emails":
[
{
"email":
{
"verified_at":"2009-11-27T21:41:11-05:00",
"created_at":"2009-11-27T21:40:55-05:00",
"updated_at":"2009-11-27T21:41:11-05:00",
"customer_contact_type":"home",
"email":"jeremysuriel+twitter#gmail.com"
}
}
],
"id":8,
"twitters":
[
{
"twitter":
{
"profile_image_url":"http://a3.twimg.com...",
"created_at":"2009-11-25T10:35:56-05:00",
"updated_at":"2010-05-29T22:41:55-04:00",
"twitter_user_id":12267802,
"followers_count":93,
"verified":false,
"login":"jrmey"
}
}
]
}
},
{
"customer":
{
"custom_test":null,
"addresses":
[
],
"phones":
[
],
"last_name":"",
"custom_order":null,
"first_name":"jeremy#example.com",
"custom_t2":null,
"custom_i":null,
"custom_t3":null,
"custom_t":null,
"emails":
[
{
"email":
{
"verified_at":null,
"created_at":"2009-12-05T20:39:00-05:00",
"updated_at":"2009-12-05T20:39:00-05:00",
"customer_contact_type":"home",
"email":"jeremy#example.com"
}
}
],
"id":27,
"twitters":
[
null
]
}
}
]
}
Is there a better way to use FasterCSV to achieve this? I assumed that << would add a new row each time, but that doesn't seem to be happening. I would appreciate any help!
You've got it all tangled up somehow; you're parsing the JSON too many times (and inside a loop!). Let's make it simpler:
# parse the response once, outside the CSV loop
customers = JSON.parse(cust_response.body)["results"].map { |x| x["customer"] }
customers.each do |c|
  csv << [c["first_name"], c["last_name"]]
end
Also, 'wb' is the wrong mode for CSV; use just 'w'.