Infinite loop in bash - bash

I have written the following command to loop over a set of strings in the second column of my file, sort on column 11 for each string, then take the second and eleventh columns and count the number of unique occurrences. It is very simple, but it seems to enter an infinite loop and I can't see why. I would appreciate your help very much.
for item in $(cat file.txt | cut -f2 -d " "| uniq)
do
sort -k11,11 file.txt | cut -f2,11 -d " " | uniq -c | sort -k2,2 > output
done

There's no infinite loop here, but it is a very silly loop (that takes a long time to run, while not accomplishing the script's stated purpose). Let's look at how one might accomplish that purpose more sanely:
Using a temporary file for counts.txt to avoid needing to rerun the sort, cut and uniq steps on each iteration:
sort -k11,11 file.txt | cut -f2,11 -d " " | uniq -c >counts.txt
while read -r item; do
fgrep -e " ${item}" counts.txt
done < <(cut -f2 -d' ' <file.txt | uniq)
Even better, using bash 4 associative arrays and no temporary file:
# reads counts into an array
declare -A counts=( )
while read -r count item; do
counts[$item]=$count
done < <(sort -k11,11 file.txt | cut -f2,11 -d " " | sort | uniq -c)
# reads counts back out
while read -r item; do
echo "$item ${counts[$item]}"
done < <(cat file.txt | cut -f2 -d " "| sort | uniq)
...that said, that's only if you want to use sort for ordering on pulling data back out. If you don't need to do that, the latter part could be replaced as such:
# read counts back out
for item in "${!counts[@]}"; do
echo "$item ${counts[$item]}"
done
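If you do want sorted output but would rather not reread file.txt, a small sketch (using the same counts array as above) is to sort the array's own keys instead:
# iterate the keys in sorted order, without touching file.txt again
while read -r item; do
    echo "$item ${counts[$item]}"
done < <(printf '%s\n' "${!counts[@]}" | sort)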

Related

Print Unique Values while using Do-While loop

I have a file named textfile.txt like below:
a 1 xxx
b 1 yyy
c 2 zzz
d 2 aaa
e 3 bbb
f 3 ccc
I am trying to filter the second column down to the unique values in it. I have the code below:
while read LINE
do
compname=`echo ${LINE} | cut -d' ' -f2 | uniq`
echo -e "${compname}"
done < textfile.txt
It is displaying:
1
1
2
2
3
3
But I am looking for an output like:
1
2
3
I also tried another command: echo ${LINE} | cut -d' ' -f2 | sort -u | uniq
but still did not get the expected output.
Can anyone help me?
There's no need to loop, sort -u already processes the whole input.
cut -d' ' -f2 textfile.txt | sort -u
Maybe you wanted to get the output in the original order, showing the first occurrence only? You can use an associative array to remember which values have been already seen:
#! /bin/bash
declare -A seen
while read x ; do
[[ ${seen[$x]} ]] || printf '%s\n' "$x"
seen[$x]=1
done < <(cut -d' ' -f2 textfile.txt)
For the last occurrence only, change the last line to
done < <(cut -d' ' -f2 textfile.txt | tac) | tac
(i.e. the last occurrence is the first occurrence in the reversed order)
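For illustration, with a hypothetical file demo.txt whose second column runs 1 2 1 3, the last-occurrence variant behaves like this:
#!/bin/bash
# made-up sample data; the second column is 1 2 1 3
printf '%s\n' 'a 1 x' 'b 2 y' 'c 1 z' 'd 3 w' > demo.txt
declare -A seen
while read x ; do
    [[ ${seen[$x]} ]] || printf '%s\n' "$x"
    seen[$x]=1
done < <(cut -d' ' -f2 demo.txt | tac) | tac
# prints 2, 1, 3: each value appears at the position of its last occurrence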
Just pipe the output of the loop to sort -u. There's no need for cut; the read command can handle this type of splitting.
while read -r _ compname _; do
echo "$compname"
done < textfile.txt | sort -u
Try moving the sort -u or sort | uniq after the done statement like this:
while read LINE;
do
compname=$(echo ${LINE} | cut -d' ' -f2)
echo "${compname}"
done < textfile.txt | sort -u

Finding unique occurrences in a csv based on a certain field in shell

I have a file emails.csv:
>cat emails.csv
1,joe,joe@gmail.com,32
2,jim,jim@hotmail.fr,23
3,steve,steve_smith@temporary.com.br,45
4,joseph,joseph@protonmail.com,23
5,jim,jim29@bluewin.ch,29
6,hilary,hilary@bluewin.ch,32
I want to keep only the first entry when I find another entry with the same last field (age) - unique entries based on the last field. The output that I want is:
1,joe,joe@gmail.com,32
2,jim,jim@hotmail.fr,23
3,steve,steve_smith@temporary.com.br,45
5,jim,jim29@bluewin.ch,29
The following script is able to do the filtering:
> cut -d, -f4 emails.csv |
> while read age1;
> do line=1;continue_loop=1 cut -d, -f4 emails.csv | while read age;
> do if [[ $age1 == $((age)) ]] && [[ $continue_loop == $1 ]];
> then cat emails.csv | head -n $line | tail -n 1;
> continue_loop=0; fi;
> let line++;
> done;
> done | sort
However, I am looking for a solution that doesn't require two loops, as this seems a bit overcomplicated.
sort -t, -k4 emails.csv | sed -e 's/,/ /g' | uniq -f3 | sed -e 's/ /,/g'
But it seems another language like Perl or Python would help you write a more stable and less ugly solution.
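That said, if you would rather stay with the standard Unix tools, a single awk invocation avoids the looping entirely (a sketch, assuming the age is always the last comma-separated field):
awk -F, '!seen[$NF]++' emails.csv
This prints only the first line seen for each distinct last field and keeps the original order, which matches the desired output above.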

How to process values from for loop in shell script

I have the below for loop in a shell script:
#!/bin/bash
#Get the year
curr_year=$(date +"%Y")
FILE_NAME=/test/codebase/wt.properties
key=wt.cache.master.slaveHosts=
prop_value=""
getproperty(){
prop_key=$1
prop_value=`cat ${FILE_NAME} | grep ${prop_key} | cut -d'=' -f2`
}
#echo ${prop_value}
getproperty ${key}
#echo "Key = ${key}; Value="${prop_value}
arr=( $prop_value )
for i in "${arr[@]}"; do
echo $i | head -n1 | cut -d "." -f1
done
The output I am getting is as below.
test1
test2
test3
I want to take the test2 value from the results above and use it in the script below, in place of 'ABCD':
grep test12345 /home/ptc/storage/'ABCD'/apache/$curr_year/logs/access.log* | grep GET > /tmp/test.access.txt
I tried all the options but could not succeed, as I am new to shell scripting.
Ignoring the many bugs elsewhere and focusing on the one piece of code you say you want to change:
for i in "${arr[@]}"; do
val=$(echo "$i" | head -n1 | cut -d "." -f1)
grep test12345 /dev/null "/home/ptc/storage/$val/apache/$curr_year/logs/access.log"* \
| grep GET
done > /tmp/test.access.txt
Notes:
Always quote your expansions. "$i", "/path/with/$val/"*, etc. (The * should not be quoted on the assumption that you want it to be expanded).
for i in $prop_value would have the exact same (buggy) behavior; using arr buys you nothing. If you want using arr to increase correctness, populate it correctly: read -r -a arr <<<"$prop_value" (see the sketch after these notes).
The redirection is moved outside the loop -- that way the second iteration through the loop doesn't overwrite the file written by the first one.
The extra /dev/null passed to grep ensures that its behavior is consistent regardless of the number of matches; otherwise, it would display filenames only if more than one matching log file existed, and not otherwise.
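Putting those notes together, a minimal sketch of the corrected loop (the prop_value string here is made up for illustration; in the real script it comes from getproperty, and curr_year is set at the top of the script):
# hypothetical hostnames; the real value is read from wt.properties
prop_value="test1.example.com test2.example.com test3.example.com"
read -r -a arr <<<"$prop_value"   # split on whitespace into array elements, with no glob expansion
for i in "${arr[@]}"; do
    val=${i%%.*}                  # keep everything before the first dot, like cut -d "." -f1
    grep test12345 /dev/null "/home/ptc/storage/$val/apache/$curr_year/logs/access.log"* | grep GET
done > /tmp/test.access.txt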

BASH - echo issues, won't print anything but the while read argument

We are having a weird issue.
We have these lines:
while read line2; do
echo $line2
done < $1 | `echo grep '.*|.*|.*|.*|.*|.*|.*|.*'` | sort -nbsk1 | cut -d "|" -f1 | uniq -d
This prints what it should print. But when we change the echo to:
while read line2; do
echo "Hello World"
done < $1 | `echo grep '.*|.*|.*|.*|.*|.*|.*|.*'` | sort -nbsk1 | cut -d "|" -f1 | uniq -d
It won't print anything; we get the same result for anything other than $line2.
What's even weirder:
echo " $line2 Hello"
prints the line2 variable, whereas
echo "Hello $line2"
prints nothing.
I have tried the same with printf, same results.
Any suggestions?
What you've written is equivalent to the following shell code:
cat $1 |
while read line2; do
echo $line2
done |
`echo grep '.*|.*|.*|.*|.*|.*|.*|.*'` |
sort -nbsk1 |
cut -d "|" -f1 |
uniq -d
The while read loop takes the contents of file $1 and echoes them, which does nothing other than remove leading and trailing spaces and replace internal spaces with a single space. If you replace the echo $line2 line with echo "Hello World", that string is clearly not going to match the grep command that the output of the loop is being passed through, so producing no output is unsurprising.
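You can see that first point in isolation; the second line here is made up just to have something in the expected pipe-delimited format:
echo 'Hello World' | grep '.*|.*|.*|.*|.*|.*|.*|.*'      # no output: the line has no literal | characters
echo 'a|b|c|d|e|f|g|h' | grep '.*|.*|.*|.*|.*|.*|.*|.*'  # prints the line: it contains the seven pipes the pattern requires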
When you change the echo line to echo " $line2 Hello", you tack "Hello" onto the end of the input line, which then matches the grep command and gets sliced off the end of the string with the cut command, so it makes sense that it would have essentially no ultimate effect.
If you change the echo line to echo "Hello $line2", any number at the beginning of the line becomes invisible to the sort -ns, which makes your sort call essentially a no-op. This is probably why you're not seeing anything in this situation, although you probably would see something if two identical lines appeared in the input one after the other. (In my testing on my machine, I see one such line because I happen to have two identical lines in succession in my test case.)
It's not exactly clear what you're trying to do since the while loop is almost a no-op. It's possible what you want to do is something more like this:
grep '.*|.*|.*|.*|.*|.*|.*|.*' < $1 |
sort -nbsk1 |
cut -d "|" -f1 |
uniq -d |
while read line2; do
echo $line2
done
... but I'm only speculating at this point.

Speeding up echo in ksh

I've got the code below working in ksh, but the job takes a while to run when generating .tmp1; it's slow in the echo $LINE | cut -f 2,4 -d " " >> [file] command, and I don't know why.
I'm guessing it's due to the echo, but I'm not sure, and I don't know how to rewrite it to speed it up.
echo "Generating on zTempDay$count.tmp"
while read LINE
do
#Use Cut to trim down to right colums
#cut -b 11-26 $LINE
#mac= cut -b 39-52 $LINE
#vlan= cut -b 62 $LINE
#This line pegs out the CPU - want to know why
echo $LINE | cut -f 2,4 -d " " >> zTempDay$count.tmp1
update_spinner
done < zTempDay$count.tmp
#Remove 'Incomplete' Enteries
#numOfIncomplete=grep "Incomplete" zTempDay$count.tmp1 | wc -l
sed -e "/Incomplete/d" zTempDay$count.tmp1 > zTempDay$count.tmp2
#Use sort to sort by MAC
#Use uniq to remove duplicates
sort +1 -2 zTempDay$count.tmp2 | uniq -f 1 > zTempDay$count.tmp3
#Format Nicely
tr ' ' '\t' < zTempDay$count.tmp3 > zTempDay$count.tmp4
##Want to put a poper progress bar in if program remains slow
#dialog --gauge "Formatting Data: Please wait" 10 70 0
#bc 100*$count/$maxDaysInMonth
Example Data
Internet 10.174.199.193 - 8843.e1a3.1b40 ARPA Vlan####
Internet 10.1.103.206 110 f4ce.46bd.e2e8 ARPA Vlan####
Intended Product (using a tab between IP and MAC)
10.174.199.193 8843.e1a3.1b40
10.1.103.206 f4ce.46bd.e2e8
awk '/Incomplete/ {next} ;
{print $2 "\t" $4}' zTempDay01.tmp | sort +1 -2 | uniq -f 1 > outfile
works like a charm thanks to Shellter's help. Thank you! :)
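For the record, the original loop is slow because every iteration forks a new cut process (plus a subshell for the echo side of the pipe), while awk reads the whole file in a single process. If a pure-shell fallback were ever needed, a sketch in the spirit of the read-based splitting shown earlier (assuming the wanted values are always the 2nd and 4th whitespace-separated columns, as in the sample data):
while read -r junk1 ip junk2 mac junk3; do
    printf '%s\t%s\n' "$ip" "$mac" >> zTempDay$count.tmp1
    update_spinner
done < zTempDay$count.tmp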
