Unix bash scripting sort - bash

I need help calculating and displaying the largest and the average of a group of input numbers.
The program should accept a group of numbers, each of which can be up to 3 digits.
For example, input of 246, 321, 16, 10, 12345, 4, 274 and 0 should result in 321 as the largest and an average of 145, with an error message indicating that 12345 is invalid.
Any ideas how to sort in bash? Sorry, I am not a developer at this level; any help is appreciated :)

I see that you asked for a Bash solution, but since you also tagged it unix, I suggest a pure awk solution (awk is just ideal for such problems):
awk '
{
    if (length($1) <= 3 && $1 ~ /^[0-9]+$/) {
        if ($1 > MAX) { MAX = $1 }
        SUM += $1
        N++
        print $1, N, SUM
    } else {
        print "Illegal Input " $1
    }
}
END {
    print "Average: " SUM / N
    print "Max: " MAX
}
' < <(echo -e "246\n321\n16\n10\n12345\n4\n274\n0")
prints
246 1 246
321 2 567
16 3 583
10 4 593
Illegal Input 12345
4 5 597
274 6 871
0 7 871
Average: 124.429
Max: 321
However, I cannot work out why the above input should yield 145 as the average.
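My best guess (an assumption on my part, not stated in the question): 145 would be the result of treating the trailing 0 as an end-of-input marker rather than as a value, so the sum stays 871 but the count drops to 6, and 871 / 6 ≈ 145.17. A minimal variation of the script above under that assumption:
awk '
$1 == 0 { exit }                       # assumption: 0 terminates the input and is not counted
length($1) <= 3 && $1 ~ /^[0-9]+$/ {
    if ($1 > MAX) { MAX = $1 }
    SUM += $1
    N++
    next
}
{ print "Illegal Input " $1 }
END {
    print "Average: " SUM / N          # 871 / 6 = 145.167
    print "Max: " MAX
}
' < <(echo -e "246\n321\n16\n10\n12345\n4\n274\n0")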

tmpfile=$(mktemp)
while read line ; do
    if [[ $line =~ ^[0-9]{1,3}$ ]] ; then
        # valid input
        if [ $line == "0" ] ; then
            break
        fi
        echo $line >> $tmpfile
    else
        echo "ERROR: invalid input '$line'"
    fi
done
awk '{ tot += $1; if (max < $1) max = $1; }
     END { print tot / NR; print max; }' $tmpfile
rm $tmpfile
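For reference, one way to feed it the sample input (assuming the snippet above is saved as stats.sh; the name is mine):
printf '%s\n' 246 321 16 10 12345 4 274 0 | bash stats.sh
This should report the invalid-input error for 12345, then print 145.167 and 321: the 0 stops the loop before it is written to the temp file, so it is excluded from the average.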

A piped coreutils option with bc:
echo 246 321 16 10 12345 4 274 0 \
 | grep -oE '\b[0-9]{1,3}\b' \
 | tee >(sort -n | tail -n1 > /tmp/max) \
 | tr '\n' ' ' \
 | tee >(sed 's/ $//; s/ \+/+/g' > /tmp/add) \
       >(wc -w > /tmp/len) > /dev/null
printf "Max: %d, Avg: %.2f\n" \
    $(< /tmp/max) \
    $( (echo -n '('; cat /tmp/add; echo -n ')/'; cat /tmp/len) | bc -l)
Output:
Max: 321, Avg: 124.43
grep enforces the number format constraint.
sort finds the max, as suggested by chepner.
sed and wc generate the sum and the divisor.
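For the sample input, /tmp/add ends up holding 246+321+16+10+4+274+0 and /tmp/len holds 7 (the 0 passes the 1-3 digit filter), so the bc step is effectively:
$ echo '(246+321+16+10+4+274+0)/7' | bc -l
124.42857142857142857142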
Note that this generates 3 temporary files: /tmp/{max,add,len}, so you might want to use mktemp and/or delete them afterwards:
rm /tmp/{max,add,len}
Edit
Stick this into the front of the pipe if you want to know about invalid input:
tee >(tr ' ' '\n' \
    | grep -vE '\b.{1,3}\b' \
    | sed 's/^/Invalid input: /' > /tmp/err)
And do cat /tmp/err after the printf.

Related

How can I use 'echo' output as an operand for the 'seq' command within a terminal?

I have an exercise where I need to sum every digit up to a given number, like this:
Suppose I have the number 12; I need to do 1+2+3+4+5+6+7+8+9+1+0+1+1+1+2.
(Numbers past 9 are split into their separate digits, e.g. 11 = 1+1, 234 = 2+3+4, etc.)
I know I can just use:
seq -s '' 12
which outputs 123456789101112, and then add them all together with '+' in between and pipe that to 'bc'. BUT I have to specifically do:
echo 12 | ...
as the first step (because the online IDE fills it in as the unchangeable first step for every test case), and when I do this I start to have problems with seq.
I tried
echo 12 | seq -s '' $1
### or just ###
echo 12 | seq -s ''
but can't get it to work, as this just gives back a missing operand error for seq (because I'm in a terminal, not a script, and the '12' isn't assigned to $1, I assume). Any recommendations on how to avoid this, how to get seq to interpret the 12 from echo as its operand, or alternative ways to go?
seq -s '' $(cat)
full solution:
echo "12" | seq -s '' $(cat) | sed 's/./&+/g; s/$/0/' | bc
Or
echo 12 | { echo $(( $({ seq -s '' $(< /dev/stdin); echo; } | sed -E 's/([[:digit:]])/\1+/g; s/$/0/') )); }
without sed:
d=$(echo 12 | { seq -s '' $(< /dev/stdin); echo; }); echo $(( "${d//?/&+}0" ))
echo 12 | awk '{
    cnt=0
    for(i=1;i<=$1;i++) {
        cnt+=i
        printf("%s%s",i,i<$1?"+":"=")
    }
    print cnt
}'
Prints:
1+2+3+4+5+6+7+8+9+10+11+12=78
If it is supposed to be just the digits added up:
echo 12 | awk '{s=""
    for(i=1;i<=$1;i++) s=s i
    split(s,ch,"")
    for(i=1;i<=length(ch); i++) cnt+=ch[i]
    print cnt
}'
51
Or a POSIX pipeline:
$ echo 12 | seq -s '' "$(cat)" | sed -E 's/([0-9])/\1+/g; s/$/0/' | bc
51

How can I echo a line once, then keep the rest the way they are, in Unix bash?

I have the following command:
(for i in 'cut -d "," -f1 file.csv | uniq`; do var =`grep -c $i file.csv';if (($var > 1 )); then echo " you have the following repeated numbers" $i ; fi ; done)
The output that I get is: You have the following repeated numbers 455
You have the following repeated numbers 879
You have the following repeated numbers 741
what I want is the following output:
you have the following repeated numbers:
455
879
741
Try moving the echo of the header line before the for-loop:
(echo " you have the following repeated numbers"; for i in $(cut -d "," -f1 file.csv | uniq); do var=$(grep -c $i file.csv); if (( $var > 1 )); then echo $i; fi; done)
Or only print the header once:
(header=" you have the following repeated numbers\n"; for i in $(cut -d "," -f1 file.csv | uniq); do var=$(grep -c $i file.csv); if (( $var > 1 )); then echo -e "$header$i"; header=""; fi; done)
Well, here's what I came to:
1) generated input for testing
for x in {1..35},aa,bb ; do echo $x ; done > file.csv
for x in {21..48},aa,bb ; do echo $x ; done >> file.csv
for x in {32..63},aa,bb ; do echo $x ; done >> file.csv
unsort file.csv > new.txt ; mv new.txt file.csv
2) your line (with the syntax errors corrected)
dtpwmbp:~ pwadas$ for i in $(cut -d "," -f1 file.csv | uniq);
do var=`grep -c $i file.csv`; if [ "$var" -ge 1 ] ;
then echo " you have the following repeated numbers" $i ; fi ; done | head -n 10
you have the following repeated numbers 8
you have the following repeated numbers 41
you have the following repeated numbers 18
you have the following repeated numbers 34
you have the following repeated numbers 3
you have the following repeated numbers 53
you have the following repeated numbers 32
you have the following repeated numbers 33
you have the following repeated numbers 19
you have the following repeated numbers 7
dtpwmbp:~ pwadas$
3) my line:
dtpwmbp:~ pwadas$ echo "you have the following repeated numbers:";
for i in $(cut -d "," -f1 file.csv | uniq); do var=`grep -c $i file.csv`;
if [ "$var" -ge 1 ] ; then echo $i ; fi ; done | head -n 10
you have the following repeated numbers:
8
41
18
34
3
53
32
33
19
7
dtpwmbp:~ pwadas$
I added quotes, changed the if() to a [..] expression, and finally moved the description sentence out of the loop. The number of occurrences tested is the digit next to the "-ge" condition. If it is "1", then numbers which appear once or more are printed. Note that with this expression, if the file contains e.g. the numbers
8
12
48
then "8" is listed in the output as appearing twice. With "-ge 2", if no numbers appear more than once, no output (except the heading) is printed.

How to combine columns that have the same headers within 1 file using Awk or Bash

I would like to know how to combine columns with duplicate headers in a file using bash/sed/awk.
x y x y
s1 3 4 6 10
s2 3 9 10 7
s3 7 1 3 2
to:
x y
s1 9 14
s2 13 16
s3 10 3
$ cat file
x y x y
s1 3 4 6 10
s2 3 9 10 7
s3 7 1 3 2
$ cat tst.awk
NR==1 {
    for (i=1;i<=NF;i++) {
        flds[$i] = flds[$i] " " i+1
    }
    printf "%-3s",""
    for (hdr in flds) {
        printf "%3s",hdr
    }
    print ""
    next
}
{
    printf "%-3s",$1
    for (hdr in flds) {
        n = split(flds[hdr],fldNrs)
        sum = 0
        for (i=1; i<=n; i++) {
            sum += $(fldNrs[i])
        }
        printf "%3d",sum
    }
    print ""
}
$ awk -f tst.awk file
x y
s1 9 14
s2 13 16
s3 10 3
$ time awk -f ./tst.awk file
x y
s1 9 14
s2 13 16
s3 10 3
real 0m0.265s
user 0m0.030s
sys 0m0.108s
Adjust the printf lines in the obvious ways for different output formatting if you like.
Here's the bash equivalent, in response to the comments elsethread. Do NOT use this; the awk solution is the right one. This is just to show how you would write it in bash IF you wanted to do that for some inexplicable reason:
$ cat tst.sh
declare -A flds
while IFS= read -r rec
do
    lineNr=$(( lineNr + 1 ))
    set -- $rec
    if (( lineNr == 1 ))
    then
        fldNr=1
        for fld
        do
            fldNr=$(( fldNr + 1 ))
            flds[$fld]+=" $fldNr"
        done
        printf "%-3s" ""
        for hdr in "${!flds[@]}"
        do
            printf "%3s" "$hdr"
        done
        printf "\n"
    else
        printf "%-3s" "$1"
        for hdr in "${!flds[@]}"
        do
            fldNrs=( ${flds[$hdr]} )
            sum=0
            for fldNr in "${fldNrs[@]}"
            do
                eval val="\$$fldNr"
                sum=$(( sum + val ))
            done
            printf "%3d" "$sum"
        done
        printf "\n"
    fi
done < "$1"
$
$ time ./tst.sh file
x y
s1 9 14
s2 13 16
s3 10 3
real 0m0.062s
user 0m0.031s
sys 0m0.046s
Note that it runs in roughly the same order of magnitude duration as the awk script (see the comments elsethread). Caveat: I never write bash scripts for processing text files, so I'm not claiming the above bash script is perfect, just showing how to approach it in bash for comparison with the other script in this thread that I claimed should be rewritten!
This is not a one-liner. You can do it using Bash v4, Bash's associative arrays, and some shell tools.
Execute the script below with the name of the file to process as a parameter:
bash script_below.sh your_file
Here is the script:
declare -A coltofield
headerdone=0
# Take the first line of the input file and extract all fields
# and their position. Start with position value 2 because of the
# format of the following lines
while read line; do
    colnum=$(echo $line | cut -d "=" -f 1)
    field=$(echo $line | cut -d "=" -f 2)
    coltofield[$colnum]=$field
done < <(head -n 1 $1 | sed -e 's/^[[:space:]]*//;' -e 's/[[:space:]]*$//;' -e 's/[[:space:]]\+/\n/g;' | nl -v 2 -n ln | sed -e 's/[[:space:]]\+/=/g;')
# Read the rest of the file starting with the second line
while read line; do
    declare -A computation
    declare varname
    # Turn the line into key=value pairs. The key is the position of
    # the value in the line
    while read value; do
        vcolnum=$(echo $value | cut -d "=" -f 1)
        vvalue=$(echo $value | cut -d "=" -f 2)
        # The first value is the line variable name
        # (s1, s2)
        if [[ $vcolnum == "1" ]]; then
            varname=$vvalue
            continue
        fi
        # Get the name of the field by the column
        # position
        field=${coltofield[$vcolnum]}
        # Add the value to the current sum for this field
        computation[$field]=$((computation[$field]+${vvalue}))
    done < <(echo $line | sed -e 's/^[[:space:]]*//;' -e 's/[[:space:]]*$//;' -e 's/[[:space:]]\+/\n/g;' | nl -n ln | sed -e 's/[[:space:]]\+/=/g;')
    if [[ $headerdone == "0" ]]; then
        echo -e -n "\t"
        for key in ${!computation[@]}; do echo -n -e "$key\t" ; done; echo
        headerdone=1
    fi
    echo -n -e "$varname\t"
    for value in ${computation[@]}; do echo -n -e "$value\t"; done; echo
    computation=()
done < <(tail -n +2 $1)
Yet another AWK alternative:
$ cat f
x y x y
s1 3 4 6 10
s2 3 9 10 7
s3 7 1 3 2
$ cat f.awk
BEGIN {
    OFS="\t";
}
NR==1 {
    # need a header for the 1st column
    for(f=NF; f>=1; --f)
        $(f+1) = $f;
    $1="";
    for(f=1; f<=NF; ++f)
        fld2hdr[f]=$f;
}
{
    for(f=1; f<=NF; ++f)
        if($f ~ /^[0-9]/)
            colValues[fld2hdr[f]]+=$f;
        else
            colValues[fld2hdr[f]]=$f;
    for (i in colValues)
        row = row colValues[i] OFS;
    print row;
    split("", colValues);
    row=""
}
$ awk -f f.awk f
x y
s1 9 14
s2 13 16
s3 10 3
$ awk 'BEGIN{print " x y"} a=$2+$4, b=$3+$5 {print $1, a, b}' file
x y
s1 9 14
s2 13 16
s3 10 3
No doubt there is a better way to display the heading, but my awk is a little sketchy.
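One way to avoid hard-coding the heading (a variation of mine on the same idea, still assuming exactly the two duplicated x/y columns):
awk 'NR==1 {print "", $1, $2; next} {print $1, $2+$4, $3+$5}' file
Here print "" just emits a leading blank field before the two header names, so the header row gets the same indent treatment as the data rows.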
Here's a Perl solution, just for fun:
cat table.txt | perl -e'@h=grep{$_}split/\s+/,<>;while(@l=grep{$_}split/\s+/,<>){for$i(1..$#l){$t{$l[0]}{$h[$i-1]}+=$l[$i]}};printf " %s\n",(join" ",sort keys%{$t{(keys%t)[0]}});for$h(sort keys%t){printf"$h %s\n",(join " ",map{sprintf"%2d",$_}@{$t{$h}}{sort keys%{$t{$h}}})};'

How can I align the columns of tables in Bash?

I want to format text as a table. I tried echoing with a '\t' separator, but it was misaligned.
Desired output:
a very long string..........    112232432    anotherfield
a smaller string                123124343    anotherfield
Use the column command:
column -t -s' ' filename
printf is great, but people forget about it.
$ for num in 1 10 100 1000 10000 100000 1000000; do printf "%10s %s\n" $num "foobar"; done
         1 foobar
        10 foobar
       100 foobar
      1000 foobar
     10000 foobar
    100000 foobar
   1000000 foobar
$ for((i=0;i<array_size;i++));
do
    printf "%10s %10d %10s\n" "${stringarray[$i]}" "${numberarray[$i]}" "${anotherfieldarray[$i]}"
done
Notice I used %10s for strings. %s is the important part: it tells it to use a string. The 10 in the middle says the field should be 10 characters wide. %d is for numbers (digits).
See man 1 printf for more info.
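A minus sign left-justifies the field, which is usually what you want for a leading text column (a small illustration of mine; the 28 matches the longest string below):
$ printf "%-28s %10d %s\n" "a very long string.........." 112232432 anotherfield
a very long string..........  112232432 anotherfield
$ printf "%-28s %10d %s\n" "a smaller string" 123124343 anotherfield
a smaller string              123124343 anotherfield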
function printTable()
{
    local -r delimiter="${1}"
    local -r data="$(removeEmptyLines "${2}")"
    if [[ "${delimiter}" != '' && "$(isEmptyString "${data}")" = 'false' ]]
    then
        local -r numberOfLines="$(wc -l <<< "${data}")"
        if [[ "${numberOfLines}" -gt '0' ]]
        then
            local table=''
            local i=1
            for ((i = 1; i <= "${numberOfLines}"; i = i + 1))
            do
                local line=''
                line="$(sed "${i}q;d" <<< "${data}")"
                local numberOfColumns='0'
                numberOfColumns="$(awk -F "${delimiter}" '{print NF}' <<< "${line}")"
                # Add Line Delimiter
                if [[ "${i}" -eq '1' ]]
                then
                    table="${table}$(printf '%s#+' "$(repeatString '#+' "${numberOfColumns}")")"
                fi
                # Add Header Or Body
                table="${table}\n"
                local j=1
                for ((j = 1; j <= "${numberOfColumns}"; j = j + 1))
                do
                    table="${table}$(printf '#| %s' "$(cut -d "${delimiter}" -f "${j}" <<< "${line}")")"
                done
                table="${table}#|\n"
                # Add Line Delimiter
                if [[ "${i}" -eq '1' ]] || [[ "${numberOfLines}" -gt '1' && "${i}" -eq "${numberOfLines}" ]]
                then
                    table="${table}$(printf '%s#+' "$(repeatString '#+' "${numberOfColumns}")")"
                fi
            done
            if [[ "$(isEmptyString "${table}")" = 'false' ]]
            then
                echo -e "${table}" | column -s '#' -t | awk '/^\+/{gsub(" ", "-", $0)}1'
            fi
        fi
    fi
}

function removeEmptyLines()
{
    local -r content="${1}"
    echo -e "${content}" | sed '/^\s*$/d'
}

function repeatString()
{
    local -r string="${1}"
    local -r numberToRepeat="${2}"
    if [[ "${string}" != '' && "${numberToRepeat}" =~ ^[1-9][0-9]*$ ]]
    then
        local -r result="$(printf "%${numberToRepeat}s")"
        echo -e "${result// /${string}}"
    fi
}

function isEmptyString()
{
    local -r string="${1}"
    if [[ "$(trimString "${string}")" = '' ]]
    then
        echo 'true' && return 0
    fi
    echo 'false' && return 1
}

function trimString()
{
    local -r string="${1}"
    sed 's,^[[:blank:]]*,,' <<< "${string}" | sed 's,[[:blank:]]*$,,'
}
SAMPLE RUNS
$ cat data-1.txt
HEADER 1,HEADER 2,HEADER 3
$ printTable ',' "$(cat data-1.txt)"
+-----------+-----------+-----------+
| HEADER 1 | HEADER 2 | HEADER 3 |
+-----------+-----------+-----------+
$ cat data-2.txt
HEADER 1,HEADER 2,HEADER 3
data 1,data 2,data 3
$ printTable ',' "$(cat data-2.txt)"
+-----------+-----------+-----------+
| HEADER 1 | HEADER 2 | HEADER 3 |
+-----------+-----------+-----------+
| data 1 | data 2 | data 3 |
+-----------+-----------+-----------+
$ cat data-3.txt
HEADER 1,HEADER 2,HEADER 3
data 1,data 2,data 3
data 4,data 5,data 6
$ printTable ',' "$(cat data-3.txt)"
+-----------+-----------+-----------+
| HEADER 1 | HEADER 2 | HEADER 3 |
+-----------+-----------+-----------+
| data 1 | data 2 | data 3 |
| data 4 | data 5 | data 6 |
+-----------+-----------+-----------+
$ cat data-4.txt
HEADER
data
$ printTable ',' "$(cat data-4.txt)"
+---------+
| HEADER |
+---------+
| data |
+---------+
$ cat data-5.txt
HEADER
data 1
data 2
$ printTable ',' "$(cat data-5.txt)"
+---------+
| HEADER |
+---------+
| data 1 |
| data 2 |
+---------+
REF LIB at: https://github.com/gdbtek/linux-cookbooks/blob/master/libraries/util.bash
To get exactly the output you want, you need to format the file like this:
a very long string..........\t 112232432\t anotherfield\n
a smaller string\t 123124343\t anotherfield\n
And then using:
$ column -t -s $'\t' FILE
a very long string..........  112232432  anotherfield
a smaller string              123124343  anotherfield
It's easier than you might think.
If you are working with a semicolon-separated file that has a header, too:
$ (head -n1 file.csv && sort file.csv | grep -v <header>) | column -s";" -t
If you are working with an array (using tab as separator):
for((i=0;i<array_size;i++));
do
echo stringarray[$i] $'\t' numberarray[$i] $'\t' anotherfieldarray[$i] >> tmp_file.csv
done;
cat file.csv | column -t
awk solution that deals with stdin
Since column is not POSIX, maybe this is:
mycolumn() (
    file="${1:--}"
    if [ "$file" = - ]; then
        file="$(mktemp)"
        cat > "${file}"
    fi
    awk '
    FNR == 1 { if (NR == FNR) next }
    NR == FNR {
        for (i = 1; i <= NF; i++) {
            l = length($i)
            if (w[i] < l)
                w[i] = l
        }
        next
    }
    {
        for (i = 1; i <= NF; i++)
            printf "%*s", w[i] + (i > 1 ? 1 : 0), $i
        print ""
    }
    ' "$file" "$file"
    if [ "$1" = - ]; then
        rm "$file"
    fi
)
Test:
printf '12 1234 1
12345678 1 123
1234 123456 123456
' > file
Test commands:
mycolumn file
mycolumn <file
mycolumn - <file
Output for all:
      12   1234      1
12345678      1    123
    1234 123456 123456
See also:
Using awk to align columns in text file?
AWK: go through the file twice, doing different tasks
I am not sure where you were running this, but the code you posted would not produce the output you gave, at least not in the Bash version that I'm familiar with.
Try this instead:
stringarray=('test' 'some thing' 'very long long long string' 'blah')
numberarray=(1 22 7777 8888888888)
anotherfieldarray=('other' 'mixed' 456 'data')
array_size=4
for((i=0;i<array_size;i++))
do
echo ${stringarray[$i]} $'\x1d' ${numberarray[$i]} $'\x1d' ${anotherfieldarray[$i]}
done | column -t -s$'\x1d'
Note that I'm using the group separator character (0x1D) instead of tab, because if you are getting these arrays from a file, they might contain tabs.
Just in case someone wants to do that in PHP, I posted a gist on GitHub:
https://gist.github.com/redestructa/2a7691e7f3ae69ec5161220c99e2d1b3
Simply call:
$output = $tablePrinter->printLinesIntoArray($items, ['title', 'chilProp2']);
You may need to adapt the code if you are using a PHP version older than 7.2.
After that, call echo or writeLine depending on your environment.
The code below has been tested and does exactly what is requested in the original question.
Parameters:
%30s: a column of 30 characters, text right-aligned.
%10d: integer notation; %10s will also work.
stringarray[0]="a very long string.........."
# 28Char (max length for this column)
numberarray[0]=1122324333
# 10digits (max length for this column)
anotherfield[0]="anotherfield"
# 12Char (max length for this column)
stringarray[1]="a smaller string....."
numberarray[1]=123124343
anotherfield[1]="anotherfield"
printf "%30s %10d %13s" "${stringarray[0]}" ${numberarray[0]} "${anotherfield[0]}"
printf "\n"
printf "%30s %10d %13s" "${stringarray[1]}" ${numberarray[1]} "${anotherfield[1]}"
# a var string with spaces has to be quoted
printf "\n Next line will fail \n"
printf "%30s %10d %13s" ${stringarray[0]} ${numberarray[0]} "${anotherfield[0]}"
  a very long string.......... 1122324333  anotherfield
         a smaller string.....  123124343  anotherfield
column -t skips empty fields when a line starts with a delimiter character or when there are two or more consecutive delimiter characters:
$ printf %s\\n a,b,c a,,c ,b,c|column -s, -t
a  b  c
a  c
b  c
Therefore I use this awk function instead (it requires gawk because it uses arrays of arrays):
$ tab(){ awk '{if(NF>m)m=NF;for(i=1;i<=NF;i++){a[NR][i]=$i;l=length($i);if(l>b[i])b[i]=l}}END{for(h in a){for(i=1;i<=m;i++)printf("%-"(b[i]+n)"s",a[h][i]);print""}}' n="${2-1}" "${1+FS=$1}"|sed 's/ *$//';}
$ printf %s\\n a,b,c a,,c ,b,c|tab ,
a b c
a   c
  b c
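The same function laid out over multiple lines, purely for readability (my re-indentation of the one-liner above, behavior unchanged):
tab() {
    awk '
    {
        if (NF > m) m = NF                  # widest row seen so far
        for (i = 1; i <= NF; i++) {
            a[NR][i] = $i                   # store every cell (gawk arrays of arrays)
            l = length($i)
            if (l > b[i]) b[i] = l          # track the width of each column
        }
    }
    END {
        for (h in a) {
            for (i = 1; i <= m; i++)
                printf("%-" (b[i] + n) "s", a[h][i])
            print ""
        }
    }' n="${2-1}" "${1+FS=$1}" | sed 's/ *$//'
}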
If your data doesn't contain the equals sign ("=") anywhere, you can use it as a shell-friendly delimiter for column without having to escape anything.
By modifying FS to be either a tab ("\t") with any amount of spaces (" ") or tabs ("\t") on either side of it, or a contiguous run of 2 or more spaces, the input data is also allowed to contain any number of single spaces within each field:
echo "${inputdata2}" |
mawk NF=NF OFS== FS=' + |[ \t]*\t[ \t]*' |
column -s= -t
a very long string..........  112232432  anotherfield
a smaller string              123124343  anotherfield
If the data does contain the equals sign, use a combination separator that is next to impossible to find in typical data:
gawk -e NF=NF OFS='\301\372\5' FS=' + |[ \t]*\t[ \t]*' |
LC_ALL=C column -s$'\301\372\5' -t
a very long string..........  112232432  anotherfield
a smaller string              123124343  anotherfield
And if your data only has 2 columns and you have a ballpark sense of how wide the first field is, you can use this \r trick for nice on-screen formatting (but those don't become runs of spaces if you need to send it down the pipe):
# each \t is 8-spaces at console terminal
mawk NF=2 FS=' + |[ \t]*\t[ \t]*' OFS='\r\t\t\t\t'
a very long string.......... 112232432
a smaller string 123124343

How to properly parse this scenario in a simple bash script?

I have a file where each key-value pair takes a new line. There is a possibility of having multiple values for each key. I want to return a list of all pairs that have a "special key", where "special" is defined as some function.
For example, if "special" is defined as a key that somewhere has a value of 100:
A 100
B 400
A hello
B world
C 100
I would return
A 100
A hello
C 100
How to do this in bash?
#!/bin/bash
special=100
awk -v s=$special '
{
    a[$1,$2]
    if($2 ~ s)
        k[$1]
}
END {
    for(key in k)
        for(pair in a) {
            split(pair,b,SUBSEP)
            if(b[1] == key)
                print b[1],b[2]
        }
}' ./infile
Proof of Concept
$ special=100; echo -e "A 100\nB 400\nA hello\nB world\nC 100" | awk -v s=$special '{a[$1,$2];if($2 ~ s)k[$1]}END{for(key in k)for(pair in a){split(pair,b,SUBSEP); if(b[1] == key)print b[1],b[2]}}'
A hello
A 100
C 100
This would also work:
id=`grep "\<$special\>$" yourfile | sed -e "s/$special//"`
[ -z "$id" ] || grep "^$id" yourfile
Returns:
If special=100
A 100
A hello
C 100
If special="hello"
A 100
A hello
If special="A"
(nothing)
If special="ello"
(nothing)
Notes
Drop the \<\> if you want a partial match.
Add | uniq at the end if there is a possibility of multiple occurrences of the same pair (A 100, A 100, ...) but you don't want that in your output.
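To illustrate with special=100 (my walk-through of the two lines above): the first grep keeps only the lines ending in the special value, and sed then strips that value, leaving the matching keys (each with a trailing space) in $id:
$ grep "\<100\>$" yourfile | sed -e "s/100//"
A 
C 
Since $id contains a newline, the second grep treats each of its lines as a separate pattern, so it prints every pair whose key is A or C.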
***** script *****
#!/bin/bash
grep " $1" data.txt | cut -d ' ' -f1 | grep -f /dev/fd/0 data.txt
result:
./test.sh 100
A 100
A hello
C 100
***** inline *****
the first grep must contain the 'special' preceded by a space ' ':
grep " 100" data.txt | cut -d ' ' -f1 | grep -f /dev/fd/0 data.txt
A 100
A hello
C 100
awk -v special="100" '$2==special{a[$1]}($1 in a)' file
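Note that this single pass only prints a pair once its key has already been marked, so it relies on a marking line (here the one with 100) appearing before the other pairs for that key. A two-pass variant (my sketch) works regardless of order:
awk -v special="100" 'NR==FNR { if ($2 == special) a[$1]; next } ($1 in a)' file file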
Whew! My bash was incredibly rusty! Hope this helps:
FILE=$1
IFS=$'\n' # Internal Field Separator, so as to avoid splitting on whitespace
FIND="100"
KEEP=""
for line in `cat $FILE`; do
    key=`echo $line | cut -d \ -f1`;
    value=`echo $line | cut -d \ -f2`;
    echo "$key = $value"
    if [ "$value" == "$FIND" ]; then
        KEEP="$key $KEEP"
    fi
done
echo "Keys to keep: $KEEP"
# You can now do whatever you want with those keys.
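To actually print the matching pairs rather than just the keys, something like this could follow (my continuation, reusing $KEEP and $FILE from the loop above):
IFS=' '                      # restore space splitting; the keys in $KEEP are space-separated
for k in $KEEP; do
    grep "^$k " "$FILE"      # every pair whose key is one of the kept keys
done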
