BASH - Parse strings with special characters - bash
Goal: I'm attempting to create an interactive version of docker ps. Basically, have each line be a "menu" such that a user can: start, stop, ssh, etc.
Example:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
1. bf4a9c7de6bf app_1 "docker-php-entryp..." 7 days ago Up About an hour 443/tcp, 0.0.0.0:80->80/tcp, 9000/tcp app_1
2. 26195f0764ce app_2 "sh /var/www/html/..." 10 days ago Up About an hour 443/tcp, 127.0.0.1:8000->80/tcp app_2
Upon choosing (1/2, etc) there will be an options menu to perform various actions on the selected container.
Problem: I can't seem to figure out how to parse each line of the docker ps output so that I'll have the Container ID and the other values as array elements.
The code so far:
list=`docker ps`
IFS=$'\n' array=($list)
for index in ${!array[@]}
do
declare -a 'a=('"${array[index]}"')'
printf "%s\n" "${a[@]}"
done
The result:
CONTAINER
ID
IMAGE
COMMAND
CREATED
STATUS
PORTS
NAMES
/usr/bin/dockersh: array assign: line 9: syntax error near unexpected token `>'
/usr/bin/dockersh: array assign: line 9: `bf4a9c7de6bf app_1 "docker-php-entryp..." 7 days ago Up About an hour 443/tcp, 0.0.0.0:80->80/tcp, 9000/tcp app_1'
It looks like you've got a few issues with the quoting, maybe try:
list=$(docker ps)
IFS=$'\n' array=($list)
for index in "${!array[@]}"
do
declare -a a=("${array[index]}")
printf "%s\n" "${a[@]}"
done
Without proper quoting your string will likely be re-split. Consider checking your shell scripts at shellcheck.net, as it usually gives good hints about bad syntax.
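Another option is to sidestep column parsing entirely: `docker ps --format` lets you pick the fields and a separator you control. The sketch below uses a hardcoded sample line in place of real docker output (so it runs without Docker); the chosen fields are just an illustration.

```shell
# A line as it might come from:
#   docker ps --format '{{.ID}}\t{{.Image}}\t{{.Names}}'
# Hardcoded here so the sketch runs without Docker.
line=$'bf4a9c7de6bf\tapp_1\tapp_1'

# read -a splits the line on IFS (a tab here) into an array
IFS=$'\t' read -r -a fields <<< "$line"

printf 'ID=%s IMAGE=%s NAME=%s\n' "${fields[0]}" "${fields[1]}" "${fields[2]}"
# → ID=bf4a9c7de6bf IMAGE=app_1 NAME=app_1
```

Because you pick the separator yourself, there is no guessing where a fixed-width column ends.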
If you want an associative array that acts as a matrix, with every docker ps field accessible by row/column, you can use awk to insert a | separator between the fields, then load the result into a single associative array and build the matrix according to the number of columns you expect (e.g. 7):
#!/bin/bash
IFS=$'|'
data=$(docker ps -a | awk '
function rtrim(s) { sub(/[ \t\r\n]+$/, "", s); return s }
{
if (NR == 1) {
head[1] = index($0, "CONTAINER ID")
head[2] = index($0, "IMAGE")
head[3] = index($0, "COMMAND")
head[4] = index($0, "CREATED")
head[5] = index($0, "STATUS")
head[6] = index($0, "PORTS")
head[7] = index($0, "NAMES")
}
else {
    for (i = 1; i < 8; i++) {
        if (i != 7) {
            printf "%s", rtrim(substr($0, head[i], head[i+1] - 1 - head[i])) "|"
        } else {
            printf "%s", rtrim(substr($0, head[i], 100)) "|"
        }
    }
    print ""
}
}')
arr=($data)
max_column=7
row=0
column=0
declare -A matrix
for index in "${!arr[@]}"
do
matrix[$row,$column]=$(echo "${arr[index]}" | tr -d '\n')
column=$((column+1))
if [ $((column%max_column)) == 0 ]; then
row=$((row+1))
column=0
fi
done
echo "first container ID is : ${matrix[0,0]}"
echo "second container ID is : ${matrix[1,0]}"
echo "third container NAME is : ${matrix[2,6]}"
In the awk part, the aim is to insert a | character between the fields so the data can be loaded into an array with | as the delimiter.
Since the field content is aligned with the field title, we store the position of each field name in the head array and extract each field by trimming up to the next field's position.
The matrix is then built according to the maximum column count (7), and each cell can be accessed easily as ${matrix[row,column]}.
Usual story: don't read data with a for loop unless you know exactly the format and how to control it:

while IFS= read -r line
do
    array+=("$line")
done < <(docker ps)

(Note IFS= rather than IFS="\n": an empty IFS preserves each line verbatim, while "\n" would actually set IFS to the two literal characters \ and n.)
Personally, I would remove the numbers from the start of the lines (1., 2., etc.), because then you can feed the list to select, which numbers the entries for you; the chosen number can then be used to reference the relevant item.
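A minimal sketch of that select idea, with the container names hardcoded (in real use they could come from docker ps); the choose function name and the <<< "1" canned answer are just for demonstration:

```shell
# select prints a numbered menu (to stderr) and reads the user's choice
choose() {
    select name in "$@"; do
        echo "picked: $name"
        break
    done
}

# Non-interactive demo: feed "1" as the user's answer
choose app_1 app_2 <<< "1"   # → picked: app_1
```

In a real menu you would loop after the choice, offering start/stop/ssh actions on the selected container.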
Related
shell script subtract fields from pairs of lines
Suppose I have the following file:

stub-foo-start: 10
stub-foo-stop: 15
stub-bar-start: 3
stub-bar-stop: 7
stub-car-start: 21
stub-car-stop: 51
# ...
# EOF at the end

with the goal of writing a script which would append to it like so:

stub-foo-start: 10
stub-foo-stop: 15
stub-bar-start: 3
stub-bar-stop: 7
stub-car-start: 21
stub-car-stop: 51
# ...
# appended:
stub-foo: 5 # 5 = stop(15) - start(10)
stub-bar: 4 # and so on...
stub-car: 30
# ...
# new EOF

The format is exactly this sequential pairing of start and stop tags (stop being the closing one), with no nesting in between. What is the recommended approach to writing such a script using awk and/or sed? Mostly, what I've tried is grepping lines and storing them in variables, but that seemed to overcomplicate things and trail off. Any advice or helpful links are welcome. (Most tutorials I found on shell scripting were illustrative at best.)
A naive implementation in plain bash:

#!/bin/bash
while read -r start && read -r stop; do
    printf '%s: %d\n' "${start%-*}" $(( ${stop##*:} - ${start##*:} ))
done < file

This assumes pairs are contiguous and that there are no interlaced or nested pairs.
Using GNU awk:

awk -F '[ -]' '{ map[$2][$3]=$4; print } END { for (i in map) print i": "(map[i]["stop:"]-map[i]["start:"])" // ("map[i]["stop:"]"-"map[i]["start:"]")" }' file

Explanation:

awk -F '[ -]' '{      # Set the field delimiter to space or "-"
    map[$2][$3]=$4    # Two-dimensional array: second and third fields as indexes, fourth field as the value
    print             # Print the line
}
END {
    for (i in map) {
        # Loop through the array and print the data in the required format
        print i": "(map[i]["stop:"]-map[i]["start:"])" // ("map[i]["stop:"]"-"map[i]["start:"]")"
    }
}' file
UNIX: cut inside if
I have a simple search script where, based on the user's options, it will search in a certain column of a file. The file looks similar to passwd:

openvpn:x:990:986:OpenVPN:/etc/openvpn:/sbin/nologin
chrony:x:989:984::/var/lib/chrony:/sbin/nologin
rpcuser:x:29:29:RPC Service User:/var/lib/nfs:/sbin/nologin
nfsnobody:x:65534:65534:Anonymous NFS User:/var/lib/nfs:/sbin/nologin
radvd:x:75:75:radvd user:/:/sbin/nologin

Now the function, based on the user's options, will search in different columns of the file. For example, -1 "searchedword" -2 "secondword" will search in the first column for "searchedword" and in the second column for "secondword". The function looks like this:

while [ $# -gt 0 ]; do
    case "$1" in
        -1|--one) c=1 ;;
        -2|--two) c=2 ;;
        -3|--three) c=3 ;;
        ...
    esac

The c variable holds the number of the column where I want to search:

cat data | if [ "$( cut -f $c -d ':' )" == "$2" ]; then cut -d: -f 1-7 >> result; fi

Here I try to select the right column and compare it to the second option, which is in this case "searchedword", and then copy the whole line into the result file. But it doesn't work: it copies nothing into the result file. Does anyone know where the problem is? Thanks for answers. (At the end of the script I use shift; shift to get the next two options.)
I suggest using awk for this task, as awk is a better tool for processing delimited columns and rows. Consider this awk command, where we pass the search column numbers and their corresponding search values in two different strings, cols and vals:

awk -v cols='1:3' -v vals='rpcuser:29' '
BEGIN {
    FS = OFS = ":"            # set input/output field separator as :
    nc = split(cols, c, /:/)  # split column numbers by :
    split(vals, v, /:/)       # split values by :
}
{
    p = 1                     # initialize p as 1
    for (i = 1; i <= nc; i++) # iterate the search cols/vals
        if ($c[i] !~ v[i]) {  # if any match fails, set p=0
            p = 0
            break
        }
    # finally the value of p decides whether a row is printed or not
}
p' file

Output:

rpcuser:x:29:29:RPC Service User:/var/lib/nfs:/sbin/nologin
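For comparison, a pure-bash sketch of the same idea: split each line on : and test one column. The column number c, the search term, and the here-doc data are hypothetical stand-ins for the real file:

```shell
c=3 term="29"   # hypothetical: search column 3 (1-based) for "29"

while IFS= read -r pline; do
    IFS=: read -r -a f <<< "$pline"      # split the line on : into an array
    if [ "${f[c-1]}" = "$term" ]; then   # bash evaluates c-1 inside the index
        printf '%s\n' "$pline"
    fi
done <<'EOF'
openvpn:x:990:986:OpenVPN:/etc/openvpn:/sbin/nologin
rpcuser:x:29:29:RPC Service User:/var/lib/nfs:/sbin/nologin
radvd:x:75:75:radvd user:/:/sbin/nologin
EOF
```

This prints only the rpcuser line; redirecting the printf to a file reproduces the `>> result` behaviour from the question.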
Merging rows in .csv in order
After analysis of brain scans I ended up with around 1000 .csv files, one for each scan. I've merged them into one, in order (by subject ID and date). My problem is that some subjects had two or more consecutive scans and some had only one. The database now looks like this:

ID, CC_area, CC_perimeter, CC_circularity
024_S_0985, 407.00, 192.15, 0.138530 //first scan of A
024_S_0985, 437.50, 204.80, 0.131074 //second scan of A
024_S_0985, 400.75, 198.80, 0.127420 //third scan of A
024_S_1063, 544.50, 214.34, 0.148939 //first and only scan of B
024_S_1171, 654.75, 240.33, 0.142453 //first scan of C
024_S_1171, 659.50, 242.21, 0.141269 //second scan of C
...

But I want it to look like this:

ID, CC_area, CC_perimeter, CC_circularity, CC_area2, CC_perimeter2, CC_circularity2, CC_area3, CC_perimeter3, CC_circularity3, ..., CC_circularity6
024_S_0985, 407.00, 192.15, 0.138530, 437.50, 204.80, 0.131074, 400.75, 198.80, 0.127420, ... ,
024_S_1063, 544.50, 214.34, 0.148939,,,,,, ...,
024_S_1171, 654.75, 240.33, 0.142453, 659.50, 242.21, 0.141269,,, ... ,
...

What is important is that the order of the data must not be changed, and the number of rows for one ID is not known (it varies from 1 to 6). (So first the columns of scan 1, then scan 2, etc.) Could you help me, or provide a solution, using bash? I am not experienced in programming and I have lost hope that I could do it myself.
You can combine the lines with the same initial index using a normal while read loop and then acting on 3 conditions: (1) whether it is the first line following the header; (2) whether the current index is equal to the last; and (3) whether the current index differs from the last. There are a number of ways to approach this, but a short bash script could look like the following:

#!/bin/bash

fn="${1:-/dev/stdin}"         ## accept filename or stdin

[ -r "$fn" ] || {             ## validate file is readable
    printf "error: file not found: '%s'\n" "$fn"
    exit 1
}

declare -i cnt=0              ## flag for 1st iteration

while read -r line; do        ## for each line in file
    ## read header, print & continue
    [ "${line//,*/}" = ID ] && printf "%s\n" "$line" && continue

    line="${line%% //*}"      ## strip trailing "//first scan of A" comment
    idx=${line//,*/}          ## parse subject index from line
    line="${line#*, }"        ## strip index

    if [ "$cnt" -eq 0 ]; then         ## if first line - print
        printf "%s, %s" "$idx" "$line"
        ((cnt++))
    elif [ "$idx" = "$lidx" ]; then   ## if indexes equal, append
        printf ", %s" "$line"
    else                              ## else, newline & print
        printf "\n%s, %s" "$idx" "$line"
    fi

    lidx=$idx                 ## save last index
done <"$fn"

printf "\n"

Input

$ cat dat/cmbcsv.dat
ID, CC_area, CC_perimeter, CC_circularity
024_S_0985, 407.00, 192.15, 0.138530 //first scan of A
024_S_0985, 437.50, 204.80, 0.131074 //second scan of A
024_S_0985, 400.75, 198.80, 0.127420 //third scan of A
024_S_1063, 544.50, 214.34, 0.148939 //first and only scan of B
024_S_1171, 654.75, 240.33, 0.142453 //first scan of C
024_S_1171, 659.50, 242.21, 0.141269 //second scan of C

Output

$ bash cmbcsv.sh dat/cmbcsv.dat
ID, CC_area, CC_perimeter, CC_circularity
024_S_0985, 407.00, 192.15, 0.138530, 437.50, 204.80, 0.131074, 400.75, 198.80, 0.127420
024_S_1063, 544.50, 214.34, 0.148939
024_S_1171, 654.75, 240.33, 0.142453, 659.50, 242.21, 0.141269

Note: I didn't know whether you needed all the additional commas or ellipses or if they were just there to show there could be more of the same index (e.g. ,,...,). You can easily add them if need be.
Well, if you know which scan belongs to which person you can add an extra column, like patient name or ID, but I guess that only works if you have the original information about how many scans there are per person.
Adding two decimal variables and assigning values in bash
We have been asked to parse a csv file and perform some operations based upon the data in it. I am trying to find the maximum of the addition of two numbers which I get from the csv file, that is the last and second-to-last numbers, which are decimals. Following is my code:

#!/bin/bash
#this file was created on 09/03/2014
#Author = Shashank Pangam
OLDIFS=$IFS
IFS=","
maxTransport=0
while read year month hydro geo solar wind fuel1 biomassL biomassC totalRenew fuel2 biodieselT biomassT
do
    while [ $year -eq 2012 ]
    do
        currentTransport=$(echo "$biodieselT+$biomassT" | bc)
        echo $currentTransport
        if (( $(echo "$currentTransport > $maxTransport" | bc -l) )); then
            $maxTransport = $currentTransport
            echo $maxTransport
        fi
    done
    echo -e "Maximum amount of energy consumed by the Transportation sector for year 2012 : $maxTransport"
done < $1

and the following is my csv file:

2012,January,2.614,0.356,0.006,0.021,114.362,14.128,1.308,66.74,196.539,199.536,81.791,
2012,February,2.286,0.333,0.007,0.017,107.388,13.952,1.304,61.277,183.921,186.564,81.545,
2012,March,0.356,0.009,0.02,108.268,15.588,1.404,63.444,188.705,191.318,87.827,11.187,
2012,April,,0.344,0.012,0.019,103.627,14.229,1.381,60.683,179.919,181.993,86.339,11.518,
2012,May,,0.356,0.012,0.01,109.644,13.789,1.473,63.611,188.517,190.913,92.087,12.09,
2012,June,,0.344,0.013,0.013,108.116,13.012,1.434,61.056,183.618,185.65,89.673,12.461,
2012,July,,0.356,0.017,0.008,112.426,14.035,1.403,58.057,185.921,187.61,87.707,10.464,
2012,August,0.356,0.016,0.008,113.64,14.01,1.513,60.011,189.174,190.999,94.592,11.14,
2012,September,1.513,0.344,0.015,0.01,110.84,13.435,1.324,56.047,181.647,183.528,82.814,
2012,October,1.83,0.356,0.012,0.02,111.544,15.597,1.462,57.365,185.969,188.186,91.42,
2012,November,2.022,0.344,0.01,0.014,111.808,15.594,1.326,56.793,185.521,187.911,82.919,
2012,December,1.77,0.356,0.007,0.022,116.416,15.873,1.368,58.741,192.398,194.552,85.526,
2013,January,3.021,0.357,0.007,0.018,114.601,15.309,1.334,57.31,188.553,191.956,83.415,
2013,February,3.285,0.322,0.012,0.023,102.499,13.658,1.246,52.05,169.452,173.094,77.914,
2013,March,0.357,0.016,0.025,111.594,14.538,1.419,59.096,186.646,189.884,88.713,11.938,
2013,April,,0.345,0.018,0.03,103.602,14.446,1.437,59.057,178.542,181.342,89.867,12.184,
2013,May,,0.357,0.02,0.032,108.113,14.452,1.497,62.606,186.668,190.117,93.634,13.166,
2013,June,,0.345,0.021,0.028,109.162,14.597,1.47,61.563,186.792,189.994,91.894,14.501,
2013,July,,0.357,0.018,0.024,119.154,15.018,1.45,62.037,197.659,201.027,90.689,14.523,
2013,August,0.357,0.022,0.02,113.177,15.014,1.44,60.682,190.313,192.949,90.065,13.28,
2013,September,2.185,0.345,0.021,0.026,106.912,14.367,1.411,58.901,181.591,184.168,88.254,
2013,October,2.171,0.357,0.02,0.029,109.123,15.158,1.483,64.509,190.273,192.849,92.748

The following is the error I get:

./calculator.sh: line 16: 0: command not found
0
268.109

I don't understand why echo $currentTransport returns 0, while in the comparison it works and assigns a value to maxTransport, yet throws the error for the same line. Thanks in advance.
Instead of this:

$maxTransport = $currentTransport

Try this:

maxTransport=$currentTransport

The $ in front of a variable gives its contents. By removing the $, the actual variable location of maxTransport is used instead as the destination for the contents of currentTransport.
Bash script that analyzes report files
I have the following bash script which I will use to analyze all report files in the current directory:

#!/bin/bash

# methods
analyzeStructuralErrors()
{
    # do something with $1
}

# main
reportFiles=`find $PWD -name "*_report*.txt"`

for f in $reportFiles
do
    echo "Processing $f"
    analyzeStructuralErrors $f
done

My report files are formatted as such:

Error Code for Issue X - Description Text - Number of errors.
col1_name,col2_name,col3_name,col4_name,col5_name,col6_name
1143-1-1411-247-1-72953-1
1143-2-1411-247-436-72953-1
2211-1-1888-204-442-22222-1

Error Code for Issue Y - Description Text - Number of errors.
col1_name,col2_name,col3_name,col4_name,col5_name,col6_name
Other data
.
.
.

I'm looking for a way to go through each file and aggregate the report data. In the above example, we have two unique issues of type X, which I would like to handle in analyzeStructuralErrors. Other types of issues can be ignored in this routine. Can anyone offer advice on how to do this? Basically, I want to read each line until I hit the next error and put that data into some kind of data structure.
Below is a working awk implementation that uses its pseudo-multidimensional arrays. I've included sample output to show you how it looks. I took the liberty of adding a "Count" column to denote how many times a certain issue was hit for a given error code.

#!/bin/bash

awk '
/Error Code for Issue/ {
    errCode[currCode=$5] = $5
}
/^ +[0-9-]+$/ {
    split($0, tmpArr, "-")
    error[errCode[currCode], tmpArr[1]]++
}
END {
    for (code in errCode) {
        printf("Error Code: %s\n", code)
        for (item in error) {
            split(item, subscr, SUBSEP)
            if (subscr[1] == code) {
                printf("\tIssue: %s\tCount: %s\n", subscr[2], error[item])
            }
        }
    }
}
' *_report*.txt

Output

$ ./report.awk
Error Code: B
    Issue: 1212    Count: 3
Error Code: X
    Issue: 2211    Count: 1
    Issue: 1143    Count: 2
Error Code: Y
    Issue: 2961    Count: 1
    Issue: 6666    Count: 1
    Issue: 5555    Count: 2
    Issue: 5911    Count: 1
    Issue: 4949    Count: 1
Error Code: Z
    Issue: 2222    Count: 1
    Issue: 1111    Count: 1
    Issue: 2323    Count: 2
    Issue: 3333    Count: 1
    Issue: 1212    Count: 1
As suggested by Dave Jarvis, awk will handle this better than bash: it is fairly easy to learn and likely available wherever bash is. I've never had to look farther than The AWK Manual. It would make things easier if you used a consistent field separator for both the list of column names and the data; perhaps you could do some pre-processing in a bash script using sed before feeding the files to awk. Anyway, take a look at multi-dimensional arrays and reading multiple lines in the manual.
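That sed pre-processing step might look like this: normalizing the dash-separated data rows to the same comma separator the header line uses (the substitution pattern is an assumption about the report format):

```shell
# Turn "1143-1-1411-247-1-72953-1" into a comma-separated row so it
# matches the comma-separated header line.
echo '1143-1-1411-247-1-72953-1' | sed 's/-/,/g'
# → 1143,1,1411,247,1,72953,1
```

With every row using the same separator, the awk side only needs a single -F ',' instead of guessing per-line formats.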
Bash has one-dimensional arrays that are indexed by integers. Bash 4 adds associative arrays. That's it for data structures. AWK has one-dimensional associative arrays and fakes its way through two-dimensional arrays. If you need a data structure more advanced than that, you'll need to use Python, for example, or some other language.

That said, here's a rough outline of how you might parse the data you've shown:

#!/bin/bash

# methods
analyzeStructuralErrors()
{
    local f=$1
    local Xpat="Error Code for Issue X"
    local notXpat="Error Code for Issue [^X]"
    while read -r line
    do
        if [[ $line =~ $Xpat ]]
        then
            flag=true
        elif [[ $line =~ $notXpat ]]
        then
            flag=false
        elif $flag && [[ $line =~ , ]]
        then
            # columns could be overwritten if there's more than one X section
            IFS=, read -ra columns <<< "$line"
        elif $flag && [[ $line =~ - ]]
        then
            issues+=("$line")
        else
            echo "unrecognized data line"
            echo "$line"
        fi
    done < "$f"
    for issue in "${issues[@]}"
    do
        IFS=- read -ra array <<< "$issue"
        # do something with ${array[0]}, ${array[1]}, etc.
        # or iterate
        for field in "${array[@]}"
        do
            : # do something with $field
        done
    done
}

# main
find . -name "*_report*.txt" | while read -r f
do
    echo "Processing $f"
    analyzeStructuralErrors "$f"
done