I'm trying to parse a big JSON file which I receive using curl.
By following this answer I was able to parse this file:
$ cat test.json
{"items": [{"id": 110, "date1": 1590590723, "date2": 1590110000, "name": "somename"}]}
using this command:
TZ=Europe/Kyiv jq -r '.[] | .[] | .name + "; " + (.date1|strftime("%B %d %Y %I:%M%p")) + "; " + (.date2|strftime("%B %d %Y %I:%M%p"))' test.json
Output is:
somename; May 27 2020 02:45PM; May 22 2020 01:13AM
But when I try to parse this second file using the same command:
$ cat test2.json
{"items": [{"id": 110, "date1": 1590590723, "date2": null, "name": "somename"}]}
Output is:
jq: error (at test2.json:1): strftime/1 requires parsed datetime inputs
I could replace those null values with some valid values using sed before parsing. But maybe there is a better way to skip (ignore) those values, leaving nulls in the output:
somename; May 27 2020 02:45PM; null
You could tweak your jq program so that it reads:
def tod: if type=="number" then strftime("%B %d %Y %I:%M%p") else tostring end;
.[] | .[] | .name + "; " + (.date1|tod) + "; " + (.date2|tod)
An alternative would be:
def tod: (tonumber? | strftime("%B %d %Y %I:%M%p")) // null;
.[] | .[] | "\(.name); \(.date1|tod); \(.date2|tod)"
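With test2.json, both variants now degrade gracefully. For instance, running the first one (same files and TZ setting as above):
TZ=Europe/Kyiv jq -r '
  def tod: if type=="number" then strftime("%B %d %Y %I:%M%p") else tostring end;
  .[] | .[] | .name + "; " + (.date1|tod) + "; " + (.date2|tod)
' test2.json
outputs:
somename; May 27 2020 02:45PM; null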
I'm processing this log file:
2021-03-21 20:06:45; ABC; 531.54
2021-03-21 20:06:47; DEF; 136. 81
2021-03-21 20:06:51; GHI; 222.34
I was wondering whether it's possible to use awk to create a filter for the file, so that the only lines printed are the ones whose dates are later than the date given to the script as an argument.
I run the script as:
./script -a 2021-03-21 20:06:46
And expect the output to be:
2021-03-21 20:06:47; DEF; 136. 81
2021-03-21 20:06:51; GHI; 222.34
How can this be achieved?
If GNU Awk, which supports the mktime() function, is available, please try the following:
#!/bin/bash
dy=$1 # e.g. "2021-03-21"
tm=$2 # e.g. "20:06:46"
awk -F ";" -v dy="$dy" -v tm="$tm" ' # pass bash arguments to awk
BEGIN { gsub("-", " ", dy); gsub(":", " ", tm); given = mktime(dy " " tm) }
# convert the passed day&time to the seconds since the epoch
{
str = $1; gsub("[-:]", " ", str) # extract the timestamp out of the log line
sec = mktime(str) # convert it to the seconds since the epoch
if (sec > given) print # compare with the given day&time
}
' file.log
Save the script above as a file, say script, add the executable permission with chmod a+x script, then invoke with something like ./script 2021-03-21 20:06:46.
The output will be:
2021-03-21 20:06:47; DEF; 136. 81
2021-03-21 20:06:51; GHI; 222.34
[Alternative]
Even without the mktime() function, you can just say:
awk -F ";" -v dy="$1" -v tm="$2" '
$1 > dy " " tm
' file.log
which will output the same result. This works because the given date and time string can be compared in dictionary order: the zero-padded YYYY-MM-DD HH:MM:SS layout sorts lexicographically the same way it sorts chronologically.
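A quick sanity check of the string comparison (awk prints 1 when a comparison is true):
awk 'BEGIN { print ("2021-03-21 20:06:47" > "2021-03-21 20:06:46") }'
1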
I have a string and a variable. I want to use this variable inside the string, so that it takes the value before being converted to a string:
current_process_id = 222
ruby_command = %q(ps -x | awk '{if($5~"ruby" && $1!= %d ){printf("Killing ruby process: %s \n",$1);}};')
puts ruby_command
I tried:
current_process_id = 222
ruby_command = %q(ps -x | awk '{if($5~"ruby" && $1!= %d ){printf("Killing ruby process: %s \n",$1);}};') % [current_process_id]
puts ruby_command
But this gives an error:
main.rb:2:in `%': too few arguments (ArgumentError)
I also tried:
awk_check = %q(ps -x | awk '{if) + "(" + %q($5~"ruby" && $1!=)
print_and_kill = %q({printf("Killing ruby process: %s \n",$1);{system("kill -9 "$1)};}};')
ruby_process_command = awk_check + current_process_id.to_s + ")" + print_and_kill
puts ruby_process_command
This works fine for me, but the way I did it is not clean.
I'm looking for a cleaner way to do it.
In your ruby_command variable, you have declared two format directives (%d and %s), whereas you only pass one value, [current_process_id]. You need to pass a second value for %s as well.
Change your code to:
current_process_id = 222
ruby_command = %q(ps -x | awk '{if($5~"ruby" && $1!= %d ){printf("Killing ruby process: %s \n",$1);}};') % [current_process_id,current_process_id.to_s]
puts ruby_command
Output:
ruby_command
=> "ps -x | awk '{if($5~\"ruby\" && $1!= 222 ){printf(\"Killing ruby process: 222 \\n\",$1);}};'"
If you don't want to substitute a value there and just want a literal %s in the output, you can escape it with %% (note the doubled backslash: inside %Q, a bare \n would become a real newline and break the awk program):
ruby_command = %Q(ps -x | awk '{if($5~"ruby" && $1!= %d ){printf("Killing ruby process: %%s \\n",$1);}};') % [current_process_id]
Output:
ruby_command
=> "ps -x | awk '{if($5~\"ruby\" && $1!= 222 ){printf(\"Killing ruby process: %s \n\",$1);}};'"
In my directory, I have multiple nifti files (e.g., WIP944_mp2rage-0.75iso_TR5.nii) from my MRI scanner, accompanied by text files (e.g., WIP944_mp2rage-0.75iso_TR5_info.txt) containing information on the acquisition parameters (e.g., "Series description: WIP944_mp2rage-0.75iso_TR5_INV1_PHS_ND"). Based on these parameters (e.g., INV1_PHS_ND), I need to change the nifti file name, which is echoed in $niftibase. I used grep to do this. When echoing all variables individually, it gives me what I want, but when I try to concatenate them into one filename, the variables are mixed together instead of being delimited by a dot.
I tried multiple forms of sed to cut away potentially invisible characters and identified the source of the problem: the "INV1_PHS_ND" part of 'series description' gives me trouble. This is the $struct component, potentially because this part varies in how many fields are extracted: sometimes 3 (INV1_PHS_ND), but it can also be 2 (INV1_ND). When I introduce this variable into the filename, everything goes haywire.
for infofile in ${PWD}/*.txt; do
    # General characteristics of subjects (i.e., date of session, group number, and subject number)
    reco=$(grep -A0 "Series description:" ${infofile} | cut -d ' ' -f 3 | cut -d '_' -f 1)
    date=$(grep -A0 "Series date:" ${infofile} | cut -c 16-21)
    group=$(grep -A0 "Subject:" ${infofile} | cut -d '^' -f 2 | cut -d '_' -f 1)
    number=$(grep -A0 "Subject:" ${infofile} | cut -d '^' -f 2 | cut -d '_' -f 2)
    ScanNr=$(grep -A0 "Series number:" ${infofile} | cut -d ' ' -f 3)

    # Change name if reco has structural prefix
    if [[ $reco = *WIP944* ]]; then
        struct=$(grep -A0 "Series description: WIP944" ${infofile} | cut -d '_' -f 4,5,6)
        niftibase=$(basename $infofile _info.txt).nii
        #echo ${subStudy}.struct.${date}.${group}.${protocol}.${paradigm}.nii
        echo ${subStudy}.struct.${struct}.${date}.${group}.${protocol}${number}.${paradigm}.n${ScanNr}.nii
        #mv ${niftibase} ${subStudy}.struct.${struct}.${date}.${group}.${protocol}${number}.${paradigm}.n${ScanNr}.nii
    fi
done
This gives me output like this:
.niit47.n4lot.Noc002
.niit47.n5lot.Noc002D
.niit47.n6lot.Noc002
.niit47.n8lot.Noc002
.niit47.n9lot.Noc002
.niit47.n10ot.Noc002
.niit47.n11ot.Noc002D
for all 7 WIP944 files. However, it needs to look like this:
H1.struct.INV2_PHS_ND.190523.Pilot.Noc001.Heat47.n11.nii, where H1, Noc, and Heat47 are loaded in from a setup file.
EDIT: I tried to use awk in the following way:
reco=$(awk 'FNR==8 {print;exit}' $infofile | cut -d ' ' -f 3 | cut -d '_' -f 1)
date=$(awk 'FNR==2 {print;exit}' $infofile | cut -c 15-21)
group=$(awk 'FNR==6 {print;exit}' $infofile | cut -d '^' -f 2 | cut -d '_' -f 1 )
number=$(awk 'FNR==6 {print;exit}' $infofile | cut -d '^' -f 2 | cut -d '_' -f 2)
ScanNr=$(awk 'FNR==14 {print;exit}' $infofile | cut -d ' ' -f 3)
which again gave me the correct output when echoing the variables individually, but not when I tried to combine them: .niit47.n11022_PHS_ND.
I used echo "$struct" | tr -dc '[:print:]' | od -c to see if there were hidden characters due to line endings, which resulted in:
0000000 I N V 2 _ P H S _ N D
0000013
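Note that tr -dc '[:print:]' deletes any carriage return before od sees it, so this check cannot reveal DOS line endings. Dropping the tr stage should expose the \r, if present (sketched output, assuming the CRLF endings diagnosed in the answer below):
echo "$struct" | od -c
0000000   I   N   V   2   _   P   H   S   _   N   D  \r  \n
0000015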
EDIT: This is what the text file looks like:
Series UID: 1.3.12.2.1107.5.2.34.18923.2019052316005066316714852.0.0.0
Study date: 20190523
Study time: 153529.718000
Series date: 20190523
Series time: 160111.750000
Subject: MDC-0153,pilot_003^pilot_003
Subject birth date: 19970226
Series description: WIP944_mp2rage-0.75iso_TR5_INV1_PHS_ND
Image type: ORIGINAL\PRIMARY\P\ND
Manufacturer: SIEMENS
Model name: Investigational_Device_7T
Software version: syngo MR B17
Study id: 1
Series number: 5
Repetition time (ms): 5000
Echo time[1] (ms): 2.51
Inversion time (ms): 900
Flip angle: 7
Number of averages: 1
Slice thickness (mm): 0.75
Slice spacing (mm):
Image columns: 320
Image rows: 320
Phase encoding direction: ROW
Voxel size x (mm): 0.75
Voxel size y (mm): 0.75
Number of volumes: 1
Number of slices: 240
Number of files: 240
Number of frames: 0
Slice duration (ms) : 0
Orientation: sag
PixelBandwidth: 248
I have one of these for each nifti file. subStudy is hardcoded in a setup file, which is loaded in prior to running the for loop. When I echo it, it shows the correct value. I need to change the names of multiple files with a specific prefix, which is stored in $reco.
As confirmed in comments, the input files have DOS carriage returns, which are basically invalid in Unix files. Also, you should pay attention to proper quoting.
As a general overhaul, I would recommend replacing the entire Bash script with a simple Awk script, which is both simpler and more idiomatic.
for infofile in ./*.txt; do    # no need to use ${PWD}
    # Pre-filter with a simple grep; skip files that don't mention WIP944
    grep -q '^Series description: [^ _]*WIP944' "$infofile" || continue
    # Still here? Means we want to rename
    suffix="$(awk -F : '
        BEGIN {
            n = split("Series description:Series date:Subject:Series number", f, /:/)
            for (i = 1; i <= n; i++) want[f[i]] = 1   # fields we care about
        }
        { sub(/\r$/, "") }   # get rid of pesky DOS carriage return
        $1 in want { x[$1] = substr($0, length($1) + 3) }   # value after ": "
        END {
            m = split(x["Series description"], t, /_/)
            reco = t[1]   # WIP944 prefix; already checked by the grep above
            struct = t[4]   # join fields 4..m so INV1_PHS_ND and INV1_ND both work
            for (i = 5; i <= m; i++) struct = struct "_" t[i]
            date = substr(x["Series date"], 3, 6)   # 20190523 -> 190523
            split(x["Subject"], t, /\^/); split(t[2], tt, /_/); group = tt[1]
            number = tt[2]
            ScanNr = x["Series number"]
            ### FIXME: protocol and paradigm are still undefined
            print struct "." date "." group "." protocol number "." paradigm ".n" ScanNr
        }' "$infofile")"
    niftibase="${infofile%_info.txt}.nii"
    echo mv "$niftibase" "$subStudy.struct.$suffix.nii"
done
This probably still requires some tweaking (at least "protocol" and "paradigm" are still undefined). Once it seems to print the correct values, you can remove the echo before mv and have it actually rename files for you.
(Probably still better to test on a copy of your real data files first!)
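If you'd rather repair the files once instead of stripping the carriage return on every awk run, a one-time cleanup pass also works (a sketch using plain tr; dos2unix, if installed, does the same job):
for f in ./*.txt; do
    tr -d '\r' < "$f" > "$f.tmp" && mv "$f.tmp" "$f"   # CRLF -> LF, in place
done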
I have a JSON object x and a variable requiredValue:
let requiredValue = 5;
let x = [
{"score":1},
{"score":2},
{"score":3}
]
Using jq, I first want to extract all score values and then check whether any score value in the object is greater than or equal to requiredValue.
Here is what I tried
jq -r '.[].score | join(",") | contains([requiredValue])'
If requiredValue is 5, the jq query should return false; if requiredValue is 2, it should return true.
If you split your inputs into two valid JSON documents, rather than having a JavaScript input which is not valid JSON, you could do the following:
requiredValue=5
x='
[{"score":1},
{"score":2},
{"score":3}]
'
jq -n \
--argjson requiredValue "$requiredValue" \
--argjson x "$x" '
[$x[].score | select(. >= $requiredValue)] | any
'
The following has been tested with bash:
requiredValue=5
x='[
{"score":1},
{"score":2},
{"score":3}
]'
jq --argjson requiredValue $requiredValue '
any(.[].score; . >= $requiredValue)' <<< "$x"
The result:
false
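For the question's other case, requiredValue of 2, the same invocation returns true, since the scores 2 and 3 both satisfy . >= 2:
requiredValue=2
jq --argjson requiredValue "$requiredValue" '
any(.[].score; . >= $requiredValue)' <<< "$x"
true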
I have the following sample of dmesg:
throttled log output.
57458] bar 3: test 2 on bar 8 is available
[ 19.696163] bar 1403: test on bar 1405 is available
[ 19.696167] foo: [ 19.696168] bar 3: test 5 on bar 1405 is available
[ 19.696178] foo: [ 19.696179] bar 1403: test 5 on bar 1405 is available
[ 20.928730] foo: [ 20.928733] bar 1403: test on bar 1408 is available
[ 20.928742] foo: [ 20.928745] bar 3: test on bar 1408 is available
[ 24.878861] foo: [ 25.878861] foo: [ 25.878863] bar 1403: bar 802 is present
I would like to convert all timestamps in each line to a human-readable format ("%d/%m/%Y %H:%M:%S").
Notes:
This system does not have dmesg -T, nor does it have perl installed.
I would prefer a solution w/ sed or awk, but python is also an option.
I've found a few solutions to this problem, but none quite does what I need, nor do I know how to modify them to my needs.
awk -F"]" '{"cat /proc/uptime | cut -d \" \" -f 1" | getline st;a=substr( $1,2, length($1) - 1);print strftime("%d/%m/%Y %H:%M:%S",systime()-st+a)" "$0}'
Or
sed -n 's/\]//;s/\[//;s/\([^.]\)\.\([^ ]*\)\(.*\)/\1\n\3/p' | while read first; do read second; first=`date +"%d/%m/%Y %H:%M:%S" --date="#$(($seconds - $base + $first))"`; printf "[%s] %s\n" "$first" "$second"; done
There's also a Python script here, but it outputs some errors which I don't understand at all.
Thanks!
The following code simulates the output of dmesg -T. It is inline awk within a shell command and can be stored as a standalone script or shell function:
awk -v UPTIME="$( cut -d' ' -f1 /proc/uptime )" '
BEGIN {
    STARTTIME = systime() - UPTIME
}
match($0, /^\[[^\[\]]*\]/) {
    s = substr($0, 2, RLENGTH - 2) + STARTTIME
    s = strftime("%a %b %d %H:%M:%S %Y", s)
    sub(/^\[[^\[\]]*\]/, "[" s "]", $0)
    print
}
'
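Note that print only runs inside the match block, so lines without a leading [timestamp] (like the throttled log output. line in the sample) are dropped. If those should pass through unchanged, a complementary catch-all rule can be appended inside the awk program:
!/^\[[^\[\]]*\]/ { print }   # pass through lines with no leading [timestamp]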
It doesn't guarantee the precision that dmesg -T provides, but it brings the results reasonably close.
This is a bit touch-and-go, but it should at least give you something to work with:
awk '
BEGIN {
    # Read uptime from /proc/uptime once and use it to calculate the
    # system start time (doing this per line would re-run getline on an
    # exhausted pipe and never update st anyway)
    "cat /proc/uptime | cut -d \" \" -f 1" | getline st
    starttime = systime() - st
}
{
    # tail will be the part of the line that still requires processing
    tail = $0
    # while we find matches
    while ((start = match(tail, /\[[^[]*\]/)) != 0) {
        # pick the timestamp from the match
        s = substr(tail, start + 1, RLENGTH - 2)
        # shorten the tail accordingly
        tail = substr(tail, start + RLENGTH)
        # format the time to our preference
        t = strftime("%d/%m/%Y %H:%M:%S", starttime + s)
        # substitute it into the original line. [] are replaced with || so
        # the match is not re-replaced in the next iteration.
        sub(/\[[^[]*\]/, "|" t "|", $0)
    }
    # When all matches have been replaced, print the line.
    print $0
}' foo.txt
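As a minor simplification (assuming /proc/uptime is directly readable), awk can read the uptime itself in the BEGIN block instead of shelling out to cat and cut:
BEGIN {
    getline line < "/proc/uptime"   # e.g. "12345.67 45678.90"
    split(line, u, " ")
    starttime = systime() - u[1]    # seconds since the epoch at boot
}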