Unix ksh - To print the PID number and repeat count - shell

Log file abc.log
PID:6543 ……
…………………
PID:4325 ……
……………………
PID:6543 ……
Log file xyz.log
PID:8888 ……
…………………
PID:9992 ……
……………………
PID:6543 ……
Note: The PID numbers can repeat in a file. And also one PID number can appear in multiple log files.
This question is asked in an interview today that I have to return output with each PID number and the count of each PID number logged today.
Here is the script that I have written. Can you confirm if that would work or not. The interviewer didn't said if my answer is correct or not. Can someone review this for me. They want me to print each unique PID:number and its count with tab space like PID:5674 10
— If today’s and previous day’s log files are in same folder
#!/bin/ksh
cd /A/B/
for a in `ls -lrt | grep "Mar 24" | awk '{print $9}'`; — list of files generated today
do
grep "^PID:" $a | cut -d " " f1 >> /tmp/abc.log — saving first column which look like PID:23456
done
for b in `cat /tmp/abc.log | sort -u`;
do
x=grep $b /tmp/abc.log | grep -v grep | wc;
echo $b" "$x — will print like PID:23456 56(count)
done
#!/bin/ksh
— If today’s log files are in different folder
cd /A/B/
for a in `ls /A/B/*.log`
do
grep "^PID:" $a | cut -d " " f1 >> /tmp/abc.log
done
for b in `cat /tmp/abc.log | sort -u`;
do
x=grep $b /tmp/abc.log | grep -v grep | wc;
echo $b" "$x
done

You could use awk
grep -h 'PID:' *.log > all_pids.log
#put only PID lines to a file.This can also be done
#with pattern in awk and multiple files. I'm just separating it for clarity
awk '
{ a[$1]++ } #increments by 1 for corresponding PID
END {
for (i in a) {
printf "%s %s\n", i, a[i];
}
}
' all_pids.log

Related

How to get the second word of a string in shell?

I want to get the size of the directory's content. If I use the command line I can get like this:
ls -l | head -n 1 | awk '{ print $2 }'
So the output is the size of the directory's content:
16816
But I want to do this in a bash script:
x="ls -l DBw | head -n 1 | awk '{ print $2 }'"
while sleep 1; do
y=$(eval "$x");
echo $y
done
But the output of this script is the full line:
Total 16816
Why is this happening and how can I get just the second word?
x="ls -l DBw | head -n 1 | awk '{ print $2 }'"
It's happening because $2 is inside double quotes and so it's interpreted immediately. You could escape it with \$2, but better yet:
Don't store code in variables. Use a function.
x() {
ls -l DBw | head -n 1 | awk '{ print $2 }'
}
Then you can call x many times.
while sleep 1; do
x
done
That said, it's better not to parse the output of ls in the first place. Use du to compute the size of a directory:
$ du -sh /dir
1.3M /dir
$ du -sh /dir | cut -f1
1.3M

Get full path name of file and its size using awk

I want to get the file names followed by their size for all files having size in MB or GB. I have done this much so far :
LIST=$(ls -lh -d -1 $PWD/{*,} | awk '{ print $9":"$5 }')
for i in $LIST
do
if [[ $( echo "$i" | cut -f2 -d: | egrep "M|G" | wc -l) -ne 0 ]]
# egrep not working, only finds M
then
echo "$i" >> bigfiles
fi
done
What I am getting is :
amit#C0deDaedalus:~$ test/findbig
/home/amit/Batch:3.8M
/home/amit/Black:3.6M
What I want is :
amit#C0deDaedalus:~$ test/findbig
/home/amit/Batch File Programming.pdf:3.8M
/home/amit/Black Panther - Legend Has It ( Instrumental ).opus:3.6M
Basically, everything is working fine except filenames that I get are not complete. Only first word is shown. I can't figure out whether there is something wrong with logic or syntax but I think it has something to do with awk.
So, How do I get the full path names of files (having spaces in between) in the output ?
I have tried the loop trick in awk, but don't know how to get both of the columns to fit in.
You can use read and the convenient occurrence of the filename at the right-side of the ls -l listing. read puts all the "extra" fields into the final variable:
function f_getfields
{
local perm lnk uname grp size d1 d2 d3 filename
while read perm lnk uname grp size d1 d2 d3 filename
do
echo "$filename $size"
done < <(ls -l)
}
f_getfields
The problem is due to the spaces in your file names. The for loop uses spaces as delimeter. Therefore the first item in your list will be "/home/amit/Batch", second item "File" and so on.
You can use while loop instead of for, something like :
ls -lh -d -1 $PWD/{*,} | awk '{ print $9":"$5 }' | while read LINE
do
echo ${LINE}
# do your stuff here
done
As an aside, if your only intention is to find out large files, you may want to check out disk usage command :
$ du -a | sort -rn | head

Bash script read specifc value from files of an entire folder

I have a problem creating a script that reads specific value from all the files of an entire folder
I have a number of email files in a directory and I need to extract from each file, 2 specific values.
After that I have to put them into a new file that looks like that:
--------------
To: value1
value2
--------------
This is what I want to do, but I don't know how to create the script:
# I am putting the name of the files into a temp file
`ls -l | awk '{print $9 }' >tmpfile`
# use for the name of a file
`date=`date +"%T"
# The first specific value from file (phone number)
var1=`cat tmpfile | grep "To: 0" | awk '{print $2 }' | cut -b -10 `
# The second specific value from file(subject)
var2=cat file | grep Subject | awk '{print $2$3$4$5$6$7$8$9$10 }'
# Put the first value in a new file on the first row
echo "To: 4"$var1"" > sms-$date
# Put the second value in the same file on the second row
echo ""$var2"" >>sms-$date
.......
and do the same for every file in the directory
I tried using while and for functions but I couldn't finalize the script
Thank You
I've made a few changes to your script, hopefully they will be useful to you:
#!/bin/bash
for file in *; do
var1=$(awk '/To: 0/ {print substr($2,0,10)}' "$file")
var2=$(awk '/Subject/ {for (i=2; i<=10; ++i) s=s$i; print s}' "$file")
outfile="sms-"$(date +"%T")
i=0
while [ -f "$outfile" ]; do outfile="sms-$date-"$((i++)); done
echo "To: 4$var1" > "$outfile"
echo "$var2" >> "$outfile"
done
The for loop just goes through every file in the folder that you run the script from.
I have added added an additional suffix $i to the end of the file name. If no file with the same date already exists, then the file will be created without the suffix. Otherwise the value of $i will keep increasing until there is no file with the same name.
I'm using $( ) rather than backticks, this is just a personal preference but it can be clearer in my opinion, especially when there are other quotes about.
There's not usually any need to pipe the output of grep to awk. You can do the search in awk using the / / syntax.
I have removed the cut -b -10 and replaced it with substr($2, 0, 10), which prints the first 10 characters from column 2.
It's not much shorter but I used a loop rather than the $2$3..., I think it looks a bit neater.
There's no need for all the extra " in the two output lines.
I sugest to try the following:
#!/bin/sh
RESULT_FILE=sms-`date +"%T"`
DIR=.
fgrep -l 'To: 0' "$DIR" | while read FILE; do
var1=`fgrep 'To: 0' "$FILE" | awk '{print $2 }' | cut -b -10`
var2=`fgrep 'Subject' "$FILE" | awk '{print $2$3$4$5$6$7$8$9$10 }'`
echo "To: 4$var1" >>"$RESULT_FIL"
echo "$var2" >>"$RESULT_FIL"
done

BASH script - print sorted contents from all files in directory with no rep's

In the current directory there are files with names of the form "gradesXXX" (where XXX is a course number) which look like this:
ID GRADE (this line is not contained in the files)
123456789 56
213495873 84
098342362 77
. .
. .
. .
I want to write a BASH script that prints all the IDs that have a grade above a certain number, which is given as the first parameter to said script.
The requirements are that an ID must be printed once at most, and that no intermediate files are used.
I was guided to use two scripts - the first with length of one line, and the second with length of up to six lines (not including the "#!" line).
I'm quite lost with this one so any suggestions will be appreciated.
Cheers.
The answer I was looking for was
// internal script
#!/bin/bash
while read line; do
line_split=( $line )
if (( ${line_split[1]} > $1 )); then
echo ${line_split[0]}
fi
done
// external script
#!/bin/bash
cat grades* | sort -r -n -k 1 | internalScript $1 | cut -f1 -d" " | uniq
OK, a simple solution.
cat grades[0-9][0-9][0-9] | sort -nurk 2 | while read ID GRADE ; do if [ $GRADE -lt 60 ] ; then break ; fi ; echo $ID ; done | sort -u
I'm not sure why two scripts should be necessary. All in a script:
#!/bin/bash
threshold=$1
cat grades[0-9][0-9][0-9] | sort -nurk 2 | while read ID GRADE ; do if [ $GRADE -lt $threshold ] ; then break ; fi ; echo $ID ; done | sort -u
We first cat all the grade files, the sort them by grade in reverse order. The while loop breaks if grade is below threshold, so that only lines with higher grades get their ID printed. sort -u makes sure that every ID is sent only once.
You can use awk:
awk '{ if ($2 > 70) print $1 }' grades777
It prints the first column of every line which seconds column is greater than 70. If you need to change the threshold:
N=71
awk '{ if ($2 > '$N') print $1 }' grades777
That ' are required to pass shell variables in AWK. To work with all grade??? files in the current directory and remove duplicated lines:
awk '{ if ($2 > '$N') print $1 }' grades??? | sort -u
A simple one-line solution.
Yet another solution:
cat grades[0-9][0-9][0-9] | awk -v MAX=70 '{ if ($2 > MAX) foo[$1]=1 }END{for (id in foo) print id }'
Append | sort -n after that if you want the IDs in sorted order.
In pure bash :
N=60
for file in /path/*; do
while read id grade; do ((grade > N)) && echo "$id"; done < "$file"
done
OUTPUT
213495873
098342362

Dynamic Patch Counter for Shell Script

I am developing a script on a Solaris 10 SPARC machine to calculate how many patches got installed successfully during a patch delivery. I would like to display to the user:
(X) of 33 patches were successfully installed
I would like my script to output dynamically replacing the "X" so the user knows there is activity occurring; sort of like a counter. I am able to show counts, but only on a new line. How can I make the brackets update dynamically as the script performs its checks? Don't worry about the "pass/fail" ... I am mainly concerned with making my output update in the bracket.
for x in `cat ${PATCHLIST}`
do
if ( showrev -p $x | grep $x > /dev/null 2>&1 ); then
touch /tmp/patchcheck/* | echo "pass" >> /tmp/patchcheck/$x
wc /tmp/patchcheck/* | tail -1 | awk '{print $1}'
else
touch /tmp/patchcheck/* | echo "fail" >> /tmp/patchcheck/$x
wc /tmp/patchcheck/* | tail -1 | awk '{print $1}'
fi
done
The usual way to do that is to emit a \r carriage return (CR) at some point and to omit the \n newline or line feed (LF) at the end of the line. Since you're using awk, you can try:
awk '{printf "\r%s", $1} END {print ""}'
For most lines, it outputs a carriage return and the data in field 1 (without a newline at the end). At the end of the input, it prints an empty string followed by a newline.
One other possibility is that you should place the awk script outside your for loop:
for x in `cat ${PATCHLIST}`
do
if ( showrev -p $x | grep $x > /dev/null 2>&1 ); then
touch /tmp/patchcheck/* | echo "pass" >> /tmp/patchcheck/$x
wc /tmp/patchcheck/* | tail -1
else
touch /tmp/patchcheck/* | echo "fail" >> /tmp/patchcheck/$x
wc /tmp/patchcheck/* | tail -1
fi
done | awk '{ printf "\r%s", $1} END { print "" }'
I'm not sure but I think you can apply similar streamlining to the rest of the repetitious code in the script:
for x in `cat ${PATCHLIST}`
do
if showrev -p $x | grep -s $x
then echo "pass"
else echo "fail"
fi >> /tmp/patchcheck/$x
wc /tmp/patchcheck/* | tail -1
done | awk '{ printf "\r%s", $1} END { print "" }'
This eliminates the touch (which doesn't seem to do much), and especially not when the empty output of touch is piped to echo which ignores its standard input. It eliminates the sub-shell in the if line; it uses the -s option of grep to keep it quiet.
I'm still a bit dubious about the wc line. I think you're looking to count the number of files, in effect, since each file should contain one line (pass or fail), unless you listed some patch twice in the file identified by ${PATCHLIST}. In which case, I'd probably use:
for x in `cat ${PATCHLIST}`
do
if showrev -p $x | grep -s $x
then echo "pass"
else echo "fail"
fi >> /tmp/patchcheck/$x
ls /tmp/patchcheck | wc -l
done | awk '{ printf "\r%s", $1} END { print "" }'
This lists the files in /tmp/patchcheck and counts the number of lines output. It means you could simply print $0 in the awk script since $0 and $1 are the same. To the extent efficiency matters (not a lot), this is more efficient because ls only scans a directory, rather than having wc open each file. But it is more particularly a more accurate description of what you are trying to do. If you later want to count the passes, you can use:
for x in `cat ${PATCHLIST}`
do
if showrev -p $x | grep -s $x
then echo "pass"
else echo "fail"
fi >> /tmp/patchcheck/$x
grep '^pass$' /tmp/patchcheck/* | wc -l
done | awk '{ printf "\r%s", $1} END { print "" }'
Of course, this goes back to reading each file, but you're getting more refined information out of it now (and that's the penalty for the more refined information).
Here is how I got my patch installation script working the way I wanted:
while read pkgline
do
patchadd -d ${pkgline} >> /var/log/patch_install.log 2>&1
# Create audit file for progress indicator
for x in ${pkgline}
do
if ( showrev -p ${x} | grep -i ${x} > /dev/null 2>&1 ); then
echo "${x}" >> /tmp/pass
else
echo "${x}" >> /tmp/fail
fi
done
# Progress indicator
for y in `wc -l /tmp/pass | awk '{print $1}'`
do
printf "\r${y} out of `wc -l /patchdir/master | awk '{print $1}'` packages installed for `hostname`. Last patch installed: (${pkgline})"
done
done < /patchdir/master

Resources