how to awk pattern as variable and loop the result? - shell

I assign a keyword to a variable and need to filter a file with awk using that variable, then loop over the results. The file has millions of lines.
I have tried the code below.
DEVICE="DEV2"
while read -r line
do
echo $line
X_keyword=`echo $line | cut -d ',' -f 2 | grep -w "X" | cut -d '=' -f2`
echo $X_keyword
done <<< "$(grep -w $DEVICE $config)"
log="Dev2_PRT.log"
while read -r file
do
VALUE=`echo $file | cut -d '|' -f 1`
HEADER=`echo $VALUE | cut -c 1-4`
echo $file
if [[ $HEADER = 'PTR:' ]]; then
VALUE=`echo $file | cut -d '|' -f 4`
echo $VALUE
XCOORD+=($VALUE)
((X++))
fi
done <<< "awk /$X_keyword/ $log"
Expected result:
The log file contains many lines like these:
PTR:1|2|3|4|X_keyword
PTR:1|2|3|4|Y_rest .....
Filter the lines that contain X_keyword and get field 4.

Unfortunately your shell script is simply the wrong approach to this problem (see https://unix.stackexchange.com/q/169716/133219 for some of the reasons why) so you should set it aside and start over.
To demonstrate the solution, let's create a sample input file:
$ seq 10 | tee file
1
2
3
4
5
6
7
8
9
10
and a shell variable to hold a regexp that's a character list of the chars 5, 6, or 7:
$ var='[567]'
Now, given the above input, here is the solution for how to g/re/p pattern as variable and count how many results:
$ awk -v re="$var" '$0~re{print; c++} END{print "---" ORS c+0}' file
5
6
7
---
3
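Applied to the log format in the question, the same -v idea could look like the sketch below. The helper name ptr_field4 and the temporary sample file are made up for illustration, and the sample lines are just the two shown in the question, so treat it as a starting point rather than a finished solution:

```shell
# ptr_field4 is a hypothetical helper: print field 4 of every "PTR:" line
# that matches the regexp passed as the first argument.
ptr_field4() {
    awk -F'|' -v re="$1" '/^PTR:/ && $0 ~ re { print $4 }' "$2"
}

# Sample data modelled on the two log lines shown in the question.
log=$(mktemp)
printf '%s\n' 'PTR:1|2|3|4|X_keyword' 'PTR:1|2|3|4|Y_rest' > "$log"

# Loop over the matches; awk reads the log once, however large it is.
ptr_field4 'X_keyword' "$log" | while IFS= read -r value; do
    echo "X coordinate: $value"
done
rm -f "$log"
```

Note that awk processes backslash escapes in a value passed with -v, so a regexp containing backslashes needs them doubled.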
If that's not all you need then please edit your question to clarify your requirements and provide concise, testable sample input and expected output.

Related

command not found on shell script run [duplicate]

I have a script that compares heap memory:
#!/bin/bash
$a=$(jcmd `jps -l | grep com.adobe.coldfusion.bootstrap.Bootstrap | cut -f1 -d ' '` GC.heap_info | awk 'NR==2 {print $5}')
$b=1000000K
if [[ $a -ge $b ]]
then
echo "The heapmemory used is greater."
else
echo "The heapmemory used is small."
fi
My question: when I execute this script with ./testscriptforheap.sh the output is correct, but I get "command not found" for lines 2 and 3 and cannot figure out why.
> ./testscriptforheap.sh: line 2: =1603644K: command not found
> ./testscriptforheap.sh: line 3: =1000000K: command not found
The heapmemory used is greater.
Assignments in bash (and also in sh) don't use the $ in front of them:
a=$(jcmd `jps -l | grep com.adobe.coldfusion.bootstrap.Bootstrap | cut -f1 -d ' '` GC.heap_info | awk 'NR==2 {print $5}')
b=1000000K
Your variable assignment is incorrect, try this:
#!/bin/bash
a=$(jcmd "$(jps -l | grep com.adobe.coldfusion.bootstrap.Bootstrap | cut -f1 -d ' ')" GC.heap_info | awk 'NR==2 {print $5}')
b=1000000K
if [[ $a -ge $b ]]
then
echo "The heapmemory used is greater."
else
echo "The heapmemory used is small."
fi
...also, the bash -ge operator compares two integers.
From what I can gather:
When you issue the jcmd nnnn GC.heap_info command, the sixth field of the second record in the output holds the used heap memory.
It has the form nnnnnK, so to compare just the numbers you could pipe it through sed to strip everything that is not a digit (sed -r 's/[^0-9]//g').
Then you could remove the K from b=1000000K.
So, no harm in trying...
#!/bin/bash
a=$( \
jcmd `jps -l | grep com.adobe.coldfusion.bootstrap.Bootstrap | cut -f1 -d ' '` GC.heap_info | awk 'NR==2 {print $6}' | sed -r 's/[^0-9]//g' \
)
b=1000000
if [[ $a -ge $b ]]
then
echo "The heapmemory used is greater."
else
echo "The heapmemory used is small."
fi
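If you would rather not spawn sed, the unit letter can also be stripped with parameter expansion. This is only a sketch: heap_at_least is a made-up helper (not part of jcmd), and 1603644K is simply the value that shows up in the error message in the question.

```shell
# heap_at_least is a hypothetical helper: strip one trailing unit letter
# from each argument, then compare what is left as integers.
heap_at_least() {
    used=${1%[A-Za-z]}     # "1603644K" -> "1603644"
    limit=${2%[A-Za-z]}    # "1000000K" -> "1000000"
    [ "$used" -ge "$limit" ]
}

if heap_at_least 1603644K 1000000K; then
    echo "The heapmemory used is greater."
else
    echo "The heapmemory used is small."
fi
```

This assumes both values use the same unit; it does not convert between K, M and G.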

Extract number in every line of TSV file

I have a file with tab-separated-values and also with blank spaces like this:
! (desambiguación) http://es.dbpedia.org/resource/!_(desambiguación) 5
! (álbum) http://es.dbpedia.org/resource/!_(álbum_de_Trippie_Redd) 2
!! http://es.dbpedia.org/resource/!! 4
$9.99 http://es.dbpedia.org/resource/$9.99 6
Tomlinson http://es.dbpedia.org/resource/(10108)_Tomlinson 20
102 Miriam http://es.dbpedia.org/resource/(102)_Miriam 2
2003 QQ47 http://es.dbpedia.org/resource/(143649)_2003_QQ47 2
I want to extract the last number of every line:
5
2
4
6
20
2
2
For that, I have done this:
while read line;
do
NUMBER=$(echo $line | cut -f 3 -d ' ')
echo $NUMBER
done < $PAIRCOUNTS_FILE
The main problem is that some lines have more spaces than others, and cut doesn't work for me with the default delimiter (tab). I don't know why, maybe because I am using WSL.
I have tried cut with several options, but none of them works:
NUMBER=$(echo $line | cut -f 3 -d ' ')
NUMBER=$(echo $line | cut -f 4 -d ' ')
NUMBER=$(echo $line | cut -f 2)
NUMBER=$(echo $line | cut -f 3)
Hope you can help me with this. Thanks in advance.
I want to extract the last number of every line:
You could use grep
grep -Eo '[[:digit:]]+$' file
Or mapfile aka readarray which is a bash4+ feature.
mapfile -t array < file
printf '%s\n' "${array[@]##* }"
You can use awk:
awk '{print $NF}' file
With cut (if it is truly TAB separated and 3 fields per line):
cut -f3 file
If you have some variable number of fields per line, use rev|cut|rev to get the last field:
rev file | cut -f1 | rev
Or with pure Bash and parameter expansion:
while IFS= read -r line; do
last=${line##*$'\t'} # $'\t' is a TAB: strip everything up to and including the last one
printf "%s\n" "$last";
done <file
Or, read into a bash array and echo the last field:
while IFS=$'\t' read -r -a arr; do
echo "${arr[${#arr[@]}-1]}"
done <file
If you have a mixture of tabs and spaces you can do what usually is a mistake and break a Bash variable on white spaces in general (tabs and spaces) into an array:
while IFS= read -r line; do
arr=($line) # break on either tab or space without quotes
echo "${arr[${#arr[@]}-1]}"
done <file
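As for why cut -f 3 failed in the question's loop: one likely reason (assuming the file really is tab-separated) is the unquoted $line, not WSL. A minimal sketch using one line in the question's format:

```shell
# One line in the question's format: name<TAB>URL<TAB>count.
line=$(printf '2003 QQ47\thttp://es.dbpedia.org/resource/(143649)_2003_QQ47\t2')

# Unquoted: the shell splits $line on spaces and tabs, and echo rejoins the
# words with single spaces, so cut finds no tab and prints the whole line.
echo $line | cut -f3

# Quoted: the tabs survive and cut -f3 returns the count.
printf '%s\n' "$line" | cut -f3
```

The first command prints the entire line with the tabs turned into spaces, the second prints 2.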

bash scripting to add users

I created a bash script to read information such as username, group etc. from a text file and create users based on it in Linux. The code seems to function properly and creates the users as desired, but the user information in the last line of the text file always gets misinterpreted. Even if I delete that line, the line that then becomes last gets misinterpreted, i.e. the text is read wrongly.
`
#!/bin/bash
userfile="users.txt"
IFS=$'\n'
if [ ! -f "$userfile" ]
then
echo "File does not exist. Specify a valid file and try again. "
exit
fi
groups=(`cut -f 4 "$userfile" | sed 's/ //'`)
fullnames=(`cut -f 1 "$userfile" | sed 's/,//' | sed 's/"//g'`)
username1=(`cut -f 1 "$userfile" |sed 's/,//' | sed 's/"//' | tr [A-Z] [a-z] | awk '{print substr($2,1,1) substr($3,1,1) substr($1,1,1)}'`)
username2=(`cut -f 4 "$userfile" | tr [A-Z] [a-z] | awk '{print substr($1,1,1)}'`)
i=0
n=${#username1[@]}
for (( q=0; q<n; q++ ))
do
usernames[$q]=${username1[$q]}"${username2[$q]}"
done
declare -a usernames
x=0
created=0
for user in ${usernames[*]}
do
adduser -c ${fullnames[$x]} -p 123456789 -f 15 -m -d /home/${groups[$x]}/$user -K LOGIN_RETRIES=3 -K PASS_MAX_DAYS=30 -K PASS_WARN_AGE=3 -N -s /bin/bash $user 2> /dev/null
usermod -g ${groups[$x]} $user
chage -d 0 $user
let created=$created+1
x=$x+1
echo -e "User $user created "
done
echo "$created Users created"
`
#!/bin/bash
userfile="./users.txt"; # <-- Config
while read line; do
# FULL NAME
# Capture all between quotes as full name
fullname=$(printf '%s' "${line}" | sed 's/^"\(.*\)".*/\1/')
# Remove spaces and punctuations???:
fullname=$(printf '%s' "${fullname}" | tr -d '[:punct:][:blank:]')
# Right-side names:
partb=$(printf '%s' "${line}" | sed "s/^\".*\"//g")
# CODE 1, capture second field
code1=$(printf '%s' "${partb}" | cut -f 2 )
# CODE 2, capture third field
code2=$(printf '%s' "${partb}" | cut -f 3 )
# GROUP, capture fourth field
group=$(printf '%s' "${partb}" | cut -f 4 )
# Print only for report (printf, because plain echo does not expand \n in bash)
printf 'fullname: %s\n code 1: %s\n code 2: %s\n group: %s\n\n' "${fullname}" "${code1}" "${code2}" "${group}"
done <${userfile}
These may be the fields that you want; they are now in variables that you can manipulate: $fullname, $code1, $code2 and $group.
The failure you observed may also be due to a misplaced quotation mark or to the line breaks in the text file; in the attached screenshot I can see one missing quote.
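On the line-break point: a while read loop like the one above skips a final line that has no trailing newline. Whether that is what happened with your file is only a guess, but it is cheap to rule out. A small sketch (both helper names are made up):

```shell
# Both helpers are hypothetical; each prints how many lines its loop processed.
count_plain() {
    n=0
    while IFS= read -r line; do n=$((n + 1)); done < "$1"
    echo "$n"
}
count_with_guard() {
    n=0
    # read fails on a final line without a newline but still fills $line,
    # so the extra test lets the loop body run for that line too.
    while IFS= read -r line || [ -n "$line" ]; do n=$((n + 1)); done < "$1"
    echo "$n"
}

f=$(mktemp)
printf 'first\nsecond' > "$f"    # no newline after "second"
count_plain "$f"                 # prints 1
count_with_guard "$f"            # prints 2
rm -f "$f"
```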

bash calculations with numbers from files

I am trying to do a simple thing:
Get the second number in the line with the second occurrence of the word TER, lower it by one, and process it further. The tr -s ' ' is there because the file is delimited not by tabs but by varying amounts of whitespace.
My script:
first_res_atombumb= grep 'TER' tata_sbox_cuda.pdb | head -n 2 | tail -1 |tr -s ' '| cut -f 2 -d ' '
echo $((first_res_atombumb-1))
but this only returns:
255
-1
Of course I want to have 254.
Adding | tr -d '\n' does not help either. What on earth is going on? I have already asked several people at work and no one seems to know.
The lines in question look like this:
TER 128 DA3 4
TER 255 DA3 8
and if I run grep 'TER' tata_sbox_cuda.pdb | head -n 2 | tail -1 | tr -s ' '| cut -f 2 -d ' ' on the command line I get what I expect, just 255
With bash, I'd write
n_ter=0
while read -a words; do
if [[ ${words[0]} == TER ]] && (( ++n_ter == 2 )); then
echo $(( ${words[1]} - 1 ))
fi
done < file
but I'd use awk
awk '$1 == "TER" && ++n == 2 {print $2 - 1}' file
The problem with your code: you forgot to use the $() command substitution syntax
first_res_atombumb= grep 'TER' tata_sbox_cuda.pdb | head -n 2 | tail -1 |tr -s ' '| cut -f 2 -d ' '
# .................^...............................................................................^
echo $((first_res_atombumb-1))
You're setting the variable to an empty string in the environment of the grep command. Then, since you're not capturing the output of that pipeline, "255" is printed to the terminal. Because the variable is unset in your current shell, you get echo $((-1))
All you need is:
first_res_atombumb=$(grep 'TER' tata_sbox_cuda.pdb | head -n 2 | tail -1 |tr -s ' '| cut -f 2 -d ' ')
# .................^^...............................................................................^
But I'd still use awk.
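To see the difference between the two forms in isolation, here is a small sketch in which printf '255\n' stands in for the grep pipeline (that substitution is mine, the variable name is from the question):

```shell
unset first_res_atombumb

# Prefix form: the assignment only applies to the environment of printf,
# whose output goes straight to the terminal.
first_res_atombumb= printf '255\n'
echo "after prefix form: ${first_res_atombumb-not set}"

# Command substitution captures the output in the variable instead.
first_res_atombumb=$(printf '255\n')
echo $((first_res_atombumb - 1))
```

This prints 255, then "after prefix form: not set", then 254.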
If I understand your problem correctly you can solve it using AWK:
awk 'BEGIN{v=0} $1 == "TER" {v++;if (v==2) {print $2-1 ;exit}}' tata_sbox_cuda.pdb
Explanation:
BEGIN{v=0} declares the variable and sets it to zero.
$1 == "TER" runs the block in {} only for lines whose first field is TER.
{v++;if (v==2) {print $2-1 ;exit}} increments v and checks whether it is 2; if so, it subtracts 1 from the second field, prints the result, and exits (which speeds up processing because the remaining lines are skipped).

Reading a file in a shell script and selecting a section of the line

This is probably pretty basic. I want to read in an occurrence file.
Then the program should find all occurrences of "CallTilEdb" in the file Hendelse.logg:
CallTilEdb 8
CallCustomer 9
CallTilEdb 4
CustomerChk 10
CustomerChk 15
CallTilEdb 16
and sum up the right column. For this case it would be 8 + 4 + 16, so the output I want is 28.
I'm not sure how to do this, and this is as far as I have gotten with vistid.sh:
#!/bin/bash
declare -t filename=hendelse.logg
declare -t occurance="$1"
declare -i sumTime=0
while read -r line
do
if [ "$occurance" = $(cut -f1 line) ] #line 10
then
sumTime+=$(cut -f2 line)
fi
done < "$filename"
so the execution in terminal would be
vistid.sh CallTilEdb
but the error I get now is:
/home/user/bin/vistid.sh: line 10: [: unary operator expected
You have a nice approach, but you could use awk to do the same thing much faster:
$ awk -v par="CallTilEdb" '$1==par {sum+=$2} END {print sum+0}' hendelse.logg
28
It may look a bit weird if you haven't used awk so far, but here is what it does:
-v par="CallTilEdb" provide an argument to awk, so that we can use par as a variable in the script. You could also do -v par="$1" if you want to use a variable provided to the script as parameter.
$1==par {sum+=$2} this means: if the first field is the same as the content of the variable par, then add the second column's value into the counter sum.
END {print sum+0} this means: once the whole file has been processed, print the content of sum. The +0 makes awk print 0 in case sum was never set, that is, if nothing was found.
In case you really want to make it with bash, you can use read with two parameters, so that you don't have to make use of cut to handle the values, together with some arithmetic operations to sum the values:
#!/bin/bash
declare -t filename=hendelse.logg
declare -t occurance="$1"
declare -i sumTime=0
while read -r name value # read both values with -r for safety
do
if [ "$occurance" == "$name" ]; then # string comparison
((sumTime+=$value)) # sum
fi
done < "$filename"
echo "sum: $sumTime"
So that it works like this:
$ ./vistid.sh CallTilEdb
sum: 28
$ ./vistid.sh CustomerChk
sum: 25
First of all, you need to change the way you call cut, so that it reads the contents of the variable rather than a file named line:
$( echo "$line" | cut -f1 )
Line 10 is missing that evaluation as well:
if [ "$occurance" = "$( echo "$line" | cut -f1 )" ]
You can then sum by doing:
sumTime=$(( sumTime + $( echo "$line" | cut -f2 ) ))
But you can also use a different approach and put the line values in an array, the final script will look like:
#!/bin/bash
declare -t filename=prova
declare -t occurance="$1"
declare -i sumTime=0
while read -a line
do
if [ "$occurance" = "${line[0]}" ]
then
sumTime=$(( sumTime + ${line[1]} ))
fi
done < "$filename"
echo $sumTime
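The "unary operator expected" message in the question comes from an empty, unquoted command substitution: cut -f1 line looks for a file called line, fails, and leaves [ with nothing to the right of the =. A small sketch of what quoting changes (field is a made-up variable standing in for the failed substitution):

```shell
field=""    # stands in for what the failed $(cut -f1 line) expands to

# Unquoted, the empty value disappears and [ is left with: "CallTilEdb" =
[ "CallTilEdb" = $field ] 2>/dev/null && unquoted_status=0 || unquoted_status=$?
echo "unquoted: $unquoted_status"    # bash reports a test error (status 2)

# Quoted, [ gets an empty third argument and just answers "not equal".
[ "CallTilEdb" = "$field" ] && quoted_status=0 || quoted_status=$?
echo "quoted: $quoted_status"        # 1
```

Quoting both sides of the comparison is what keeps an empty field from turning into a syntax error.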
For the reference,
id="CallTilEdb"
file="Hendelse.logg"
sum=$(echo "0 $(sed -n "s/^$id[^0-9]*\([0-9]*\)/\1 +/p" < "$file") p" | dc)
echo SUM: $sum
prints
SUM: 28
sed extracts the numbers from the lines starting with the given id, such as CallTilEdb,
and prints them in the format number +
echo prepares a string such as 0 8 + 4 + 16 + p, which is the calculation in RPN format
dc does the calculation
another variant:
sum=$(sed -n "s/^$id[^0-9]*\([0-9]*\)/\1/p" < "$file" | paste -sd+ - | bc)
#or
sum=$(grep -oP "^$id\D*\K\d+" < "$file" | paste -sd+ - | bc)
sed (or grep) extracts and prints only the numbers
paste makes a string like number+number+number (-d+ sets the delimiter)
bc does the calculation
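If bc is not installed, the joined string can be handed to shell arithmetic instead. That is a swapped-in technique, not the pipeline above, and it assumes plain decimal integers without leading zeros; sum_for_id is a made-up helper name:

```shell
# sum_for_id is a hypothetical helper; it assumes plain decimal integers.
sum_for_id() {
    id=$1
    file=$2
    # Extract the numbers, then join them with "+", e.g. "8+4+16".
    expr_str=$(sed -n "s/^$id[^0-9]*\([0-9]*\)/\1/p" "$file" | paste -sd+ -)
    # Let the shell evaluate the expression; fall back to 0 if nothing matched.
    echo $(( ${expr_str:-0} ))
}

# Sample data copied from the question.
f=$(mktemp)
printf '%s\n' 'CallTilEdb 8' 'CallCustomer 9' 'CallTilEdb 4' \
    'CustomerChk 10' 'CustomerChk 15' 'CallTilEdb 16' > "$f"
sum_for_id CallTilEdb "$f"    # prints 28
rm -f "$f"
```

As with the sed variants above, the id is pasted into the regexp unescaped, so it should not contain regexp metacharacters.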
or perl
sum=$(perl -slanE '$s+=$F[1] if /^$id/}{say $s' -- -id="$id" "$file")
sum=$(ID="CallTilEdb" perl -lanE '$s+=$F[1] if /^$ENV{ID}/}{say $s' "$file")
Awk translation to script:
#!/bin/bash
declare -t filename=hendelse.logg
declare -t occurance="$1"
declare -i sumTime=0
sumTime=$(awk -v entry="$occurance" '
$1==entry{time+=$NF+0}
END{print time+0}' "$filename")

Resources