UNIX: cut inside if - bash

I have a simple search script, where based on user's options it will search in certain column of a file.
The file looks similar to passwd
openvpn:x:990:986:OpenVPN:/etc/openvpn:/sbin/nologin
chrony:x:989:984::/var/lib/chrony:/sbin/nologin
rpcuser:x:29:29:RPC Service User:/var/lib/nfs:/sbin/nologin
nfsnobody:x:65534:65534:Anonymous NFS User:/var/lib/nfs:/sbin/nologin
radvd:x:75:75:radvd user:/:/sbin/nologin
now the function based on user's option will search in different columns of the file. For example
-1 "searchedword" -2 "secondword"
will search in the first column for "searchedword" and in the second column for "secondword"
The function looks like this:
while [ $# -gt 0 ]; do
case "$1" in
-1|--one)
c=1
;;
-2|--two)
c=2
;;
-3|--three)
c=3
;;
...
esac
In the c variable is the number of the column where I want to search.
cat data | if [ "$( cut -f $c -d ':' )" == "$2" ]; then cut -d: -f 1-7 >> result; fi
Now I have something like this, where I try to select the right column and compare it to the second option, which is in this case "searchedword" and then copy the whole column into the result file. But it doesn't work. It doesn't copy anything into the result file.
Does anyone know where is the problem?
Thanks for answers
(At the end of the script I use:
shift
shift
to get the next two options)

I suggest using awk for this task as awk is better tool for processing delimited columns and rows.
Consider this awk command where we pass search column numbers their corresponding search values in 2 different strings cols and vals to awk command:
awk -v cols='1:3' -v vals='rpcuser:29' 'BEGIN {
FS=OFS=":" # set input/output field separator as :
nc = split(cols, c, /:/) # split column # by :
split(vals, v, /:/) # split values by :
}
{
p=1 # initialize p as 1
for(i=1; i<=nc; i++) # iterate the search cols/vals and set p=0
if ($c[i] !~ v[i]) { # if any match fails
p=0
break
} # finally value of p decides if a row is printing or not
} p' file
Output:
rpcuser:x:29:29:RPC Service User:/var/lib/nfs:/sbin/nologin

Related

How to make that variable $f defined how much "Freq" will be printed from column number 3?

I need a help with my bash script. I've problem with code:
for v in $(seq 1 $f)); do echo $(grep "Freq" freq.log) | awk '{print$3}')
because this comands printed $f times column number 3 instead should be printed $f values of "Freq" from column number 3.
It's look like
enter image description here
Should be like
enter image description here
I don't know how make that variable $f defined how much "Freq" will be printed from column number 3. In this file I've plenty expressions of "Freq" but I need just $f.
For sure I pasted all content of script:
#!/bin/bash
e=$(grep "atomic number" freq.log | tail -1 | awk '{print$2}')
echo "Liczba atomow znajdujacyh sie w podanej czasteczce wynosi: $e"
f=$(bc <<< "($e*3-6)/3")
echo "Liczba wartosci Freq, ktore wczyta skrypt to $f"
for v in $(seq 1 $f); do
echo "$(grep "Freq" freq.log | awk '{print$3}')"
done
Sample input data file; geometry optimization calculations in GAUSSIAN
A A A
Frequencies -- 182.1477 202.8948 227.7144
Red. masses -- 6.6528 8.2622 6.3837
Frc consts -- 0.1300 0.2004 0.1950
IR Inten -- 0.8602 0.4870 1.2090
NAtoms= 35 NActive= 35 NUniq= 35 SFac= 1.00D+00 NAtFMM= 60 NAOKFM=F Big=F
Here is your bash script converted to a single awk script:
awk script script.awk
/atomic number/{ # for each line matching regEx pattern "atomic number"
e = $2; # store current 2nd field in variable e
}
/Freq/{ # for each line matching regEx pattern "Freq"
freqArr[fr++]=$3; # add 3rd field to array freqArr, increment array counter fr
}
END { # on complete scanning input file
print "Liczba atomow znajdujacyh sie w podanej czasteczce wynosi: " e;
f = ( ((e * 3) - 6) / 3 ); # claculate vairable f
print "Liczba wartosci Freq, ktore wczyta skrypt to " f;
for (currFreq in freqArr) { # scan all element freqArr
if (currFreq == f) # if currFreq equals f
freqCount++; # increment freqCount coutner
}
print freqCount;
}
run command
awk -f script.awk freq.log

How to find the max value in array without using sort cmd in shell script

I have a array=(4,2,8,9,1,0) and I don't want to sort the array to find the highest number in the array because I need to get the index value of the highest number as it is, so I can use it for further reference.
Expected output:
9 index value => 3
Can somebody help me to achieve this?
Slight variation with a loop using the ternary conditional operator and no assumptions about range of values:
arr=(4 2 8 9 1 0)
max=${arr[0]}
maxIdx=0
for ((i = 1; i < ${#arr[#]}; ++i)); do
maxIdx=$((arr[i] > max ? i : maxIdx))
max=$((arr[i] > max ? arr[i] : max))
done
printf '%s index => values %s\n' "$maxIdx" "$max"
The only assumption is that array indices are contiguous. If they aren't, it becomes a little more complex:
arr=([1]=4 [3]=2 [5]=8 [7]=9 [9]=1 [11]=0)
indices=("${!arr[#]}")
maxIdx=${indices[0]}
max=${arr[maxIdx]}
for i in "${indices[#]:1}"; do
((arr[i] <= max)) && continue
maxIdx=$i
max=${arr[i]}
done
printf '%s index => values %s\n' "$maxIdx" "$max"
This first gets the indices into a separate array and sets the initial maximum to the value corresponding to the first index; then, it iterates over the indices, skipping the first one (the :1 notation), checks if the current element is a new maximum, and if it is, stores the index and the maximum.
Without using sort, you can use a simple loop in shell. Here is a sample bash code:
#!/usr/bin/env bash
array=(4 2 8 9 1 0)
for i in "${!array[#]}"; do
[[ -z $max ]] || (( ${array[i]} > $max )) && { max="${array[i]}"; maxind=$i; }
done
echo "max=$max, maxind=$maxind"
max=9, maxind=3
arr=(4 2 8 9 1 0)
paste <(printf "%s\n" "${arr[#]}") <(seq 0 $((${#arr[#]} - 1)) ) |
sort -k1,1 |
tail -n1 |
sed 's/\t/ index value => /'
Print each array element on a newline with printf
Print array indexes with seq
Join both streams using paste
Numerically sort the lines using the first fields (ie. array value) sort
Print the last line tail -n1
The array value and result is separated by a tab. Substitute tab with the output string you want using sed. One could use ex. cut -d, -f2 to get only the index or use read a b <( ... ) to read the numbers into variables, etc.
Using Perl
$ export data=4,2,8,9,1,0
$ echo $data | perl -ne ' map{$i++; if($_>$x) {$x=$_;$id=$i} } split(","); print "max=$x", " index=",--${id},"\n" '
max=9 index=3
$

Sorting strings from array takes a long time

Reading a text file into an array, extracting elements and sorting them is taking a very long time.
The text file is ffmpeg console output for R128 audio analysis. I need to get the highest M and S values. Example:
[Parsed_ebur128_0 # 0x7fd32a60caa0] t: 4.49998 M: -22.2 S: -29.9 I: -27.0 LUFS LRA: 9.8 LU FTPK: -12.4 dBFS TPK: -9.7 dBFS
[Parsed_ebur128_0 # 0x7fd32a60caa0] t: 4.69998 M: -22.5 S: -28.6 I: -25.9 LUFS LRA: 11.3 LU FTPK: -12.7 dBFS TPK: -9.7 dBFS
The text file can be hundreds or thousands of lines long depending on the duration of the audio file being analysed
I want to find the highest M (-22.2) and S Values (-28.6) and assign them to variables M and S
This is what I am using currently:
ARRAY=()
while read LINE
do
ARRAY+=("$LINE")
done < $tempDir/text.txt
for LINE in "${ARRAY[#]}"
do
echo "$LINE" | sed -n ‘/B:/p' | sed 's/S:.*//' | sed -n -e 's/^.*M://p' | sed -n -e 's/-//p' >>/$tempDir/R128M.txt
done
for LINE in "${ARRAY[#]}"
do
echo "$LINE" | sed -n '/M:/p' | sed 's/I:.*//' | sed -n -e 's/^.*S://p' | sed -n -e 's/-//p' >>$tempDir/R128S.txt
done
cat $tempDir/R128M.txt
M=( $(sort $tempDir/R128M.txt) )
cat $tempDir/R128S.txt
S=( $(sort $tempDir/R128S.txt) )
Is there a faster way of doing this?
Rather than reading in the whole file in memory, writing bits of it out to separate file, and reading those in again, just parse it and pick out the largest values:
$ awk '$7 > m || m == "" { m = $7 } $9 > s || s == "" { s = $9 } END { print m, s }' data
-22.2 -28.6
In your data, field 7 and 9 contains the values of M and S. The awk script will update its m and s variables if it finds larger values in these fields and then print the largest found at the end. The m == "" and s == "" are needed to trigger initialization of the values if no values has been read yet.
Another way with awk, which may look cleaner:
$ awk 'FNR == 1 { m = $7; s = $9; next } $7 > m { m = $7 } $9 > s { s = $9 } END { print m, s }' data
To assign them to M and S in the shell:
$ declare $( awk 'FNR == 1 { m = $7; s = $9; next } $7 > m { m = $7 } $9 > s { s = $9 } END { printf("M=%f S=%f\n", m, s) }' data )
$ echo $M $S
-22.200000 -28.600000
Adjust the printf() format to use %s instead of %f if you want the original strings instead of float values, or set the number of decimals you might want with, e.g., %.2f in place of %f.
First of all, three-process pipe is a bit redundant for a single value extraction, especially taking into account you reinstantiate it anew for every line.
Next, you save all the values into a file and then sort that file, while all you need is the maximum value. You can easily find it during the very first (value extraction) loop, for additional O(N) running time, instead of I/O and sorting with all the I/O overhead and O(NlogN) sorting expenses. See ARITHMETIC EXPANSION and conditional expressions in bash manual.

Sorting and printing a file in bash UNIX

I have a file with a bunch of paths that look like so:
7 /usr/file1564
7 /usr/file2212
6 /usr/file3542
I am trying to use sort to pull out and print the path(s) with the most occurrences. Here it what I have so far:
cat temp| sort | uniq -c | sort -rk1 > temp
I am unsure how to only print the highest occurrences. I also want my output to be printed like this:
7 1564
7 2212
7 being the total number of occurrences and the other numbers being the file numbers at the end of the name. I am rather new to bash scripting so any help would be greatly appreciated!
To emit only the first line of output (with the highest number, since you're doing a reverse numeric sort immediately prior), pipe through head -n1.
To remove all content which is not either a number or whitespace, pipe through tr -cd '0-9[:space:]'.
To filter for only the values with the highest number, allowing there to be more than one:
{
read firstnum name && printf '%s\t%s\n' "$firstnum" "$name"
while read -r num name; do
[[ $num = $firstnum ]] || break
printf '%s\t%s\n' "$num" "$name"
done
} < temp
If you want to avoid sort and you are allowed to use awk, then you can do this:
awk '{
if($1>maxcnt) {s=$1" "substr($2,10,4); maxcnt=$1} else
if($1==maxcnt) {s=s "\n"$1" "substr($2,10,4)}} END{print s}' \
temp

Find nth row using AWK and assign them to a variable

Okay, I have two files: one is baseline and the other is a generated report. I have to validate a specific string in both the files match, it is not just a single word see example below:
.
.
name os ksd
56633223223
some text..................
some text..................
My search criteria here is to find unique number such as "56633223223" and retrieve above 1 line and below 3 lines, i can do that on both the basefile and the report, and then compare if they match. In whole i need shell script for this.
Since the strings above and below are unique but the line count varies, I had put it in a file called "actlist":
56633223223 1 5
56633223224 1 6
56633223225 1 3
.
.
Now from below "Rcount" I get how many iterations to be performed, and in each iteration i have to get ith row and see if the word count is 3, if it is then take those values into variable form and use something like this
I'm stuck at the below, which command to be used. I'm thinking of using AWK but if there is anything better please advise. Here's some pseudo-code showing what I'm trying to do:
xxxxx=/root/xxx/xxxxxxx
Rcount=`wc -l $xxxxx | awk -F " " '{print $1}'`
i=1
while ((i <= Rcount))
do
record=_________________'(Awk command to retrieve ith(1st) record (of $xxxx),
wcount=_________________'(Awk command to count the number of words in $record)
(( i=i+1 ))
done
Note: record, wcount values are later printed to a log file.
Sounds like you're looking for something like this:
#!/bin/bash
while read -r word1 word2 word3 junk; do
if [[ -n "$word1" && -n "$word2" && -n "$word3" && -z "$junk" ]]; then
echo "all good"
else
echo "error"
fi
done < /root/shravan/actlist
This will go through each line of your input file, assigning the three columns to word1, word2 and word3. The -n tests that read hasn't assigned an empty value to each variable. The -z checks that there are only three columns, so $junk is empty.
I PROMISE you you are going about this all wrong. To find words in file1 and search for those words in file2 and file3 is just:
awk '
NR==FNR{ for (i=1;i<=NF;i++) words[$i]; next }
{ for (word in words) if ($0 ~ word) print FILENAME, word }
' file1 file2 file3
or similar (assuming a simple grep -f file1 file2 file3 isn't adequate). It DOES NOT involve shell loops to call awk to pull out strings to save in shell variables to pass to other shell commands, etc, etc.
So far all you're doing is asking us to help you implement part of what you think is the solution to your problem, but we're struggling to do that because what you're asking for doesn't make sense as part of any kind of reasonable solution to what it sounds like your problem is so it's hard to suggest anything sensible.
If you tells us what you are trying to do AS A WHOLE with sample input and expected output for your whole process then we can help you.
We don't seem to be getting anywhere so let's try a stab at the kind of solution I think you might want and then take it from there.
Look at these 2 files "old" and "new" side by side (line numbers added by the cat -n):
$ paste old new | cat -n
1 a b
2 b 56633223223
3 56633223223 c
4 c d
5 d h
6 e 56633223225
7 f i
8 g Z
9 h k
10 56633223225 l
11 i
12 j
13 k
14 l
Now lets take this "actlist":
$ cat actlist
56633223223 1 2
56633223225 1 3
and run this awk command on all 3 of the above files (yes, I know it could be briefer, more efficient, etc. but favoring simplicity and clarity for now):
$ cat tst.awk
ARGIND==1 {
numPre[$1] = $2
numSuc[$1] = $3
}
ARGIND==2 {
oldLine[FNR] = $0
if ($0 in numPre) {
oldHitFnr[$0] = FNR
}
}
ARGIND==3 {
newLine[FNR] = $0
if ($0 in numPre) {
newHitFnr[$0] = FNR
}
}
END {
for (str in numPre) {
if ( str in oldHitFnr ) {
if ( str in newHitFnr ) {
for (i=-numPre[str]; i<=numSuc[str]; i++) {
oldFnr = oldHitFnr[str] + i
newFnr = newHitFnr[str] + i
if (oldLine[oldFnr] != newLine[newFnr]) {
print str, "mismatch at old line", oldFnr, "new line", newFnr
print "\t" oldLine[oldFnr], "vs", newLine[newFnr]
}
}
}
else {
print str, "is present in old file but not new file"
}
}
else if (str in newHitFnr) {
print str, "is present in new file but not old file"
}
}
}
.
$ awk -f tst.awk actlist old new
56633223225 mismatch at old line 12 new line 8
j vs Z
It's outputing that result because the 2nd line after 56633223225 is j in file "old" but Z in file "new" and the file "actlist" said the 2 files had to be common from one line before until 3 lines after that pattern.
Is that what you're trying to do? The above uses GNU awk for ARGIND but the workaround is trivial for other awks.
Use the below code:
awk '{if (NF == 3) { word1=$1; word2=$2; word3=$3; print "Words are:" word1, word2, word3} else {print "Line", NR, "is having", NF, "Words" }}' filename.txt
I have given the solution as per the requirement.
awk '{ # awk starts from here and read a file line by line
if (NF == 3) # It will check if current line is having 3 fields. NF represents number of fields in current line
{ word1=$1; # If current line is having exact 3 fields then 1st field will be assigned to word1 variable
word2=$2; # 2nd field will be assigned to word2 variable
word3=$3; # 3rd field will be assigned to word3 variable
print word1, word2, word3} # It will print all 3 fields
}' filename.txt >> output.txt # THese 3 fields will be redirected to a file which can be used for further processing.
This is as per the requirement, but there are many other ways of doing this but it was asked using awk.

Resources