{
"id": "a1234567-89ab-cdef-0123-456789abcdef",
"properties": {
...
"my_id": "c1234567-89ab-cdef-0123-456789abcdef",
...
}
Given the above in a file, I want to be able to perform a match (including the 4 leading spaces) on my_id and then append a new line "my_value": "abcd",. The desired output would look like this:
{
"id": "a1234567-89ab-cdef-0123-456789abcdef",
"properties": {
...
"my_id": "c1234567-89ab-cdef-0123-456789abcdef",
"my_value": "abcd",
...
}
Using examples online, I'm unable to get the command to work. Here is an example of something I have tried: sed '/.*"my_id".*/a "my_value": "abcd",' test.json, for which I receive the following error: command a expects \ followed by text.
What is the correct way to structure this command?
Using any awk:
$ awk -v new='"my_value": "abcd",' '{print} sub(/"my_id":.*/,""){print $0 new}' file
{
"id": "a1234567-89ab-cdef-0123-456789abcdef",
"properties": {
...
"my_id": "c1234567-89ab-cdef-0123-456789abcdef",
"my_value": "abcd",
...
}
The above prints the new line using whatever indent the existing "my_id" line has; it doesn't assume or hard-code any indent, e.g. 4 blanks.
I'm using this:
sub(/"my_id":.*/,""){print $0 new}
instead of the briefer:
sub(/"my_id":.*/,new)
so it won't break if the new string contains any backreference chars such as &.
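For comparison, here is a minimal Python sketch of the same idea: copy the indent of the matching line and append the new entry by plain concatenation, so characters like & are never interpreted. The function name insert_after_key and the sample text are made up for illustration and are not part of the question.

```python
import re

def insert_after_key(text, key, new_entry):
    """Insert new_entry on its own line after the line holding "key":,
    reusing that line's indent."""
    out = []
    pattern = re.compile(r'^(\s*)"' + re.escape(key) + r'":')
    for line in text.splitlines():
        out.append(line)
        m = pattern.match(line)
        if m:
            # Plain string concatenation: & or \ in new_entry stay literal,
            # because no replacement-string parsing is involved.
            out.append(m.group(1) + new_entry)
    return "\n".join(out) + "\n"

# Hypothetical sample input, shortened from the question.
sample = '''{
  "id": "a1",
  "properties": {
    "my_id": "c1",
    "other": 1
  }
}
'''
print(insert_after_key(sample, "my_id", '"my_value": "a&b",'), end="")
```

Like the awk version, this treats the file as text rather than parsing it as JSON, so it also works on fragments that are not valid JSON on their own.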
awk procedure with passed argument for insert value
The following awk procedure allows for 'abcd' to be passed as an argument for insertion (allowing it to be set in a bash script if required).
awk -v insertVal="abcd" '/"my_id":/{$0=$0"\n    \"my_value\": \""insertVal"\","} {print}' dat.txt
explanation
The required insertion string ('abcd' in this case) is passed as an argument using the -v variable switch followed by a variable name and value: insertVal="abcd".
The first awk action block has a pattern condition to only act on lines containing the target-line string (in this case "my_id":). When a line with that pattern is found, the line is extended with a new line mark \n, the required four spaces to start the next line, the specified key named "my_value", and the value associated with the key, passed by argument as the variable named insertVal ("abcd"), and the final , character. Note the need to escape the " quotes to render them.
The final awk block prints the current line (whether or not it was modified).
test
The procedure was tested on Mac Terminal using GNU Awk 5.2.0.
The output generated (from the input data saved to a file named dat.txt) is:
{
"id": "a1234567-89ab-cdef-0123-456789abcdef",
"properties": {
...
"my_id": "c1234567-89ab-cdef-0123-456789abcdef",
"my_value": "abcd",
...
}
Using sed
$ sed -e '/my_id/{p;s/id.*"/value": "abcd"/' -e '}' input_file
{
"id": "a1234567-89ab-cdef-0123-456789abcdef",
"properties": {
...
"my_id": "c1234567-89ab-cdef-0123-456789abcdef",
"my_value": "abcd",
...
}
With your shown samples and attempts, please try the following GNU awk code, where newVal is an awk variable holding the new entry. It uses GNU awk's match function with the regex (.*\n)([ \t]*)("my_id": "[^"]*",)(.*), which creates 4 capturing groups and saves their values into an array named arr. The second group holds the indent of the "my_id" line, so it is printed again in front of newVal to put the new entry on its own line with the same indent.
awk -v newVal='"my_value": "abcd",' -v RS= '
match($0,/(.*\n)([ \t]*)("my_id": "[^"]*",)(.*)/,arr){
  print arr[1] arr[2] arr[3] "\n" arr[2] newVal arr[4]
}
' Input_file
This might work for you (GNU sed):
sed '/"my_id".*/p;s//"my_value": "abcd",/' file
Match on "my_id" and print that line, then substitute the matched text (the empty pattern reuses the same regex) to produce the additional line.
I have a test.txt file in this format
{
"user": "sthapa",
"ticket": "LIN-5867_3",
"start_date": "2018-03-16",
"end_date": "2018-03-16",
"demo_nos": [692],
"service_names": [
"service1",
"service2",
"service3",
"service4",
"service5",
"service6",
"service7",
"service8",
"service9"
]
}
I need to look for a key called demo_nos and print the number of elements it holds.
For example, the file above has "demo_nos": [692], which means there is only one demo number. Similarly, if it had "demo_nos": [692,300], the count would be 2.
What shell script can I write to fetch and print that count?
The output should say demo nos = 1 or 2, depending on the values inside the [].
I.e. I have a variable in my shell script called market_nos, which should hold its count.
The gold standard for manipulating JSON data from the command line is jq:
$ jq '.demo_nos | length' test.txt
1
.demo_nos returns the value associated with the demo_nos key in the object, and that array is piped to the length function which does the obvious.
I'm assuming you have python and the file is JSON :)
$ cat some.json
{
"user": "sthapa",
"ticket": "LIN-5867_3",
"start_date": "2018-03-16",
"end_date": "2018-03-16",
"demo_nos": [692],
"service_names": [
"service1",
"service2",
"service3",
"service4",
"service5",
"service6",
"service7",
"service8",
"service9"
]
}
$ python -c 'import sys,json; print(len(json.load(sys.stdin)["demo_nos"]))' < some.json
1
Not the most elegant solution but this should do it
cat test.txt | grep -o -P 'demo_nos.{0,200}' | cut -d'[' -f2 | cut -d']' -f1 | awk -F',' '{ print NF }'
Please note that this is a quick and dirty solution that treats the input as raw text and does not take the JSON structure into account. In exceptional cases where the "demo_nos" string also appears elsewhere in the file, the output from the command above might be incorrect.
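To illustrate that caveat, here is a small Python sketch that parses the JSON instead of scanning the text, so an extra occurrence of the string inside another value does not affect the count. The function name count_items and the sample document are assumptions made up for this example.

```python
import json

def count_items(json_text, key):
    """Return the length of the array stored under key in the top-level object."""
    return len(json.loads(json_text)[key])

# Hypothetical document in which "demo_nos" also appears inside a string value.
doc = '{"note": "demo_nos is listed below", "demo_nos": [692, 300]}'
print("demo nos =", count_items(doc, "demo_nos"))
```

In a shell script the printed value could be captured into a variable with command substitution.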
I have a curl command which generates json output. I want to add a few characters in generated file to be able to process it further.
Command:
curl -sN --negotiate -u foo:bar "http://hostname/db/tbl_name/" >> db.json
This runs inside a for loop, once for each db and tbl_name combination. Hence it ends up generating a number of JSON outputs (one for each table) concatenated together without any delimiter.
Output looks like :
{"columns":[{"name":"tbl_id","type":"varchar(50)"},{"name":"cret_timestmp","type":"timestamp"},{"name":"updt_timestmp","type":"timestamp"},{"name":"frst_nm","type":"varchar(50)"},{"name":"last_nm","type":"varchar(50)"},{"name":"acct_num","type":"varchar(15)"},{"name":"r_num","type":"varchar(15)"},{"name":"pid","type":"decimal(15,0)"},{"name":"ami_id","type":"varchar(30)"},{"name":"ssn","type":"varchar(9)"},{"name":"client_id","type":"varchar(30)"},{"name":"client_nm","type":"varchar(100)"},{"name":"info","type":"timestamp"},{"name":"rmx","type":"varchar(10)"},{"name":"id","type":"decimal(12,0)"},{"name":"ingest_timestamp","type":"string"},{"name":"incr_ingest_timestamp","type":"string"}],"database":"db_i","table":"db_tbl"}{"columns":[{"name":"key","type":"varchar(15)"},{"name":"foo_cd","type":"varchar(10)"},{"name":"foo_nm","type":"varchar(56)"},{"name":"tmc_regn_cd","type":"varchar(10)"},{"name":"tmc_mrkt_cd","type":"varchar(20)"},{"name":"mrkt_grp","type":"varchar(30)"},{"name":"ingest_timestamp","type":"string"},{"name":"incr_ingest_timestamp","type":"string"}],"database":"db_i","table":"ss_mv"}{"columns":[{"name":"bar_src_name","type":"string"},{"name":"bar_ent_name","type":"string"},{"name":"from_src","type":"string"},{"name":"reload","type":"string"},{"name":"column_mismatch","type":"string"},{"name":"xx_src_name","type":"string"},{"name":"xx_ent_name","type":"string"}],"database":"db_i","table":"test_table"}
The desired output starts with [ and ends with ]. I also want a "," between the end of one JSON and the beginning of the next, where the column list starts.
For example, if the curl command runs against 3 tables as shown above, the three generated JSONs should be combined like this:
[{json1},{json2},{json3}]
The numbers 1, 2, 3, etc. correspond to the different tables that the curl command in the for loop runs against for a particular db. Their JSON should end up in one file in the desired format,
instead of what I'm currently getting:
{json1}{json2}{json3}
In the output pasted above, JSON 1 is :
{"columns":[{"name":"tbl_id","type":"varchar(50)"},{"name":"cret_timestmp","type":"timestamp"},{"name":"updt_timestmp","type":"timestamp"},{"name":"frst_nm","type":"varchar(50)"},{"name":"last_nm","type":"varchar(50)"},{"name":"acct_num","type":"varchar(15)"},{"name":"r_num","type":"varchar(15)"},{"name":"pid","type":"decimal(15,0)"},{"name":"ami_id","type":"varchar(30)"},{"name":"ssn","type":"varchar(9)"},{"name":"client_id","type":"varchar(30)"},{"name":"client_nm","type":"varchar(100)"},{"name":"info","type":"timestamp"},{"name":"rmx","type":"varchar(10)"},{"name":"id","type":"decimal(12,0)"},{"name":"ingest_timestamp","type":"string"},
{"name":"incr_ingest_timestamp","type":"string"}],"database":"db_i","table":"db_tbl"}
JSON 2 is :
{"columns":[{"name":"key","type":"varchar(15)"},{"name":"foo_cd","type":"varchar(10)"},{"name":"foo_nm","type":"varchar(56)"},{"name":"tmc_regn_cd","type":"varchar(10)"},{"name":"tmc_mrkt_cd","type":"varchar(20)"},{"name":"mrkt_grp","type":"varchar(30)"},{"name":"ingest_timestamp","type":"string"},{"name":"incr_ingest_timestamp","type":"string"}],"database":"db_i","table":"ss_mv"}
JSON 3 is :
{"columns":[{"name":"bar_src_name","type":"string"},{"name":"bar_ent_name","type":"string"},{"name":"from_src","type":"string"},{"name":"reload","type":"string"},{"name":"column_mismatch","type":"string"},{"name":"xx_src_name","type":"string"},{"name":"xx_ent_name","type":"string"}],"database":"db_i","table":"test_table"}
I hope the requirement is clear, thanks in advance, looking to achieve this via bash.
Use jq -s.
--slurp/-s: Instead of running the filter for each JSON object in the input, read the entire input stream into a large array and run the filter just once.
Here's an example:
$ cat file.json
{ "key": "value1" }
{ "key": "value2" }
{ "key":
"value3"}{"key": "value4"}
$ jq -s . < file.json
[
{
"key": "value1"
},
{
"key": "value2"
},
{
"key": "value3"
},
{
"key": "value4"
}
]
I'm not sure if I got it correctly, but I think you are looking for something like
echo "[$(cat *.json | paste -sd ',')]" > result.json
This works by creating a string that starts with [ and ends with ], and in the middle, there are the contents of the json files concatenated (cat) and separated by commas (with the help of paste). That string is echoed and written to a new file.
Presuming input in valid JSONL format (one JSON document per line of input), you can embed a Python script inside your bash script:
slurpjson_py='
import json, sys
json.dump([json.loads(line.strip()) for line in sys.stdin], sys.stdout, indent=4)
sys.stdout.write("\n")
'
slurpjson() { python -c "$slurpjson_py" "$@"; }
If called as:
slurpjson <<EOF
{ "first": "document", "starting": "here" }
{ "second": "document", "ending": "here" }
EOF
...output is correctly:
[
{
"starting": "here",
"first": "document"
},
{
"second": "document",
"ending": "here"
}
]
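The output in the question has no newline between documents, so the one-document-per-line assumption may not hold there. A possible variation, sketched here under that assumption, uses json.JSONDecoder.raw_decode to read documents back to back from a single string. The function name slurp_concatenated and the shortened sample input are made up for illustration.

```python
import json

def slurp_concatenated(text):
    """Parse back-to-back JSON documents (with or without whitespace
    between them) into a list."""
    decoder = json.JSONDecoder()
    docs = []
    pos = 0
    n = len(text)
    while True:
        # raw_decode does not skip leading whitespace, so skip it here.
        while pos < n and text[pos].isspace():
            pos += 1
        if pos >= n:
            break
        # Returns the decoded object and the index where it ended.
        obj, pos = decoder.raw_decode(text, pos)
        docs.append(obj)
    return docs

# Hypothetical input: two documents with no delimiter, a third after a newline.
raw = '{"table":"db_tbl"}{"table":"ss_mv"}\n{"table":"test_table"}'
print(json.dumps(slurp_concatenated(raw)))
```

This reads the whole input into memory, which should be fine for table metadata of the size shown in the question.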
I managed to achieve this by running the curl command and adding a "," at the end of every line using
sed 's/$/,/'
and then removing the last "," and adding the opening and closing [] using:
for i in *; do sed '$ s/.$//' "$i" | awk '{print "["$0"]"}' > "$json_dir/$i"; done
I have this text file:
{
"name": "",
"auth": true,
"username": "rtorrent",
"password": "d5275b68305438499f9660b38980d6cef7ea97001efe873328de1d76838bc5bd15c99df8b432ba6fdcacbff82e3f3c4829d34589cf43236468d0d0b0a3500c1e"
}
Now, I want to be able to replace the d5275b68305438499f9660b38980d6cef7ea97001efe873328de1d76838bc5bd15c99df8b432ba6fdcacbff82e3f3c4829d34589cf43236468d0d0b0a3500c1e using sed, for example. (The string always has the exact same length, but the values can be different.)
I've tried this using sed:
sed -i 5s/./new-string/18 file.json
That basically replaces text on the 5th line, starting at position 18. I want to be able to replace the text starting exactly at position 18 and up to position 154, strictly what's inside the "". The command above will cut the ", at the end of the file, and if it's run multiple times, the string becomes longer each time.
Any help is really appreciated.
You can use for example awk for it:
$ awk -v var="new_string" 'NR==5{print substr($0,1,17) var substr($0,146);next}1' file
{
"name": "",
"auth": true,
"username": "rtorrent",
"password": "new_string"
}
but there are better tools for changing a value in a JSON, jq for example:
$ jq '.password="new_string"' file
{
"name": "",
"auth": true,
"username": "rtorrent",
"password": "new_string"
}
Edit: When passing a shell variable $var to awk and jq:
$ var="new_string"
$ awk -v var="$var" 'NR==5{print substr($0,1,17) var substr($0,146);next}1' file
and
$ jq --arg var "$var" '.password=$var' file
Edit2: There is always sed:
$ sed -i "5s/\"[^\"]*\"/\"$var\"/2" file
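If Python is available, the same key-based replacement can be sketched without counting character positions. This is only an illustration: the function name set_password and the shortened sample value are assumptions, and the output is re-serialized with a 2-space indent rather than preserving the original layout.

```python
import json

def set_password(json_text, new_value):
    """Return json_text with the top-level "password" value replaced."""
    data = json.loads(json_text)
    data["password"] = new_value
    # Re-serialize; key order is kept, whitespace is normalized.
    return json.dumps(data, indent=2) + "\n"

# Hypothetical input with a shortened password value.
original = '{"name": "", "auth": true, "username": "rtorrent", "password": "d5275b"}'
print(set_password(original, "new_string"), end="")
```

Because the value is addressed by key, running it repeatedly does not make the string grow, whatever the length of the old or new value.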
I have a file with the following content:
{
"user_id1": "171295",
"timeStamp": "2017-03-06 19:16:58.000"
},,
{
"user_id1": "149821",
"timeStamp": "2017-03-08 12:50:47.000"
},,
{
"user_id1": "184767",
"timeStamp": "2017-03-08 19:55:25.000"
},,
{
"user_id1": "146364",
"timeStamp": "2017-03-12 23:48:48.000"
},
]
I want to replace all instances of },, with }, in bash using sed. How do I do this?
This is one of the many ways you can do it:
sed 's/},,$/},/g' yourfile.txt
The $ anchors the match to the end of the line, so only the commas at the end of a line are matched. The -i option allows you to edit the file in place:
sed -i 's/},,$/},/g' yourfile.txt
sed 's/},,/},/g' < in > out
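For reference, the end-of-line-anchored substitution can also be sketched in Python with re.sub and the MULTILINE flag, which makes $ match at the end of every line. The function name collapse_double_commas and the shortened sample are made up for this example.

```python
import re

def collapse_double_commas(text):
    """Replace '},,' at the end of a line with '},'."""
    # MULTILINE lets $ match before each newline, not only at the end of text.
    return re.sub(r"\},,$", "},", text, flags=re.MULTILINE)

# Hypothetical sample, shortened from the question.
sample = '{\n"user_id1": "171295"\n},,\n{\n"user_id1": "146364"\n},\n]\n'
print(collapse_double_commas(sample), end="")
```

As with the anchored sed command, a },, that is followed by other text on the same line is left alone.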