How do I convert a table of data into list entries - shell

Let's say the table data below is stored in a variable called "data":
1 apple 50 Mary
2 banana 40 Lily
3 orange 34 Jack
#
for i in ${data}
do
echo $i
done
# Expected Output
1 apple 50 Mary
2 banana 40 Lily
3 orange 34 Jack
How do I convert the above table of data into three entries so that I can iterate over it with a for loop and print each entry in exactly the same format?
Any suggestions?

Thanks, @Darkman, for this "https://stackoverflow.com/questions/11393817/read-lines-from-a-file-into-a-bash-array" link; it helped in my case.
# The line below is what makes it work
# IFS=$'\r\n' GLOBIGNORE='*'
data="1 apple 50 Mary
2 banana 40 Lily
3 orange 34 Jack"
IFS=$'\r\n' GLOBIGNORE='*'
for i in ${data}
do
echo $i
echo "test"
done
# Resulted Output
1 apple 50 Mary
test
2 banana 40 Lily
test
3 orange 34 Jack
test
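As a side note, a minimal sketch of an alternative in the spirit of the linked question, assuming bash 4+ for readarray (the array name "lines" is just illustrative): read the variable into an array line by line instead of changing IFS globally.
# split "data" on newlines into an array, then loop over the entries
readarray -t lines <<< "${data}"
for i in "${lines[@]}"
do
echo "$i"
done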

Related

Adding data from an array to a new column in a file using bash [duplicate]

So I have name, age, and city data:
name=(Alex Barbara Connor Daniel Matt Peter Stan)
age=(22 23 55 32 21 8 89)
city=(London Manchester Rome Alberta Naples Detroit Amsterdam)
and I want to set this up as 3-column data with the headings Name, Age, and City. I can easily get the first column using
touch info.txt
echo "Name Age City" > info.txt
for n in ${name[@]}; do
echo $n >> info.txt
done
but I can't figure out how to get the rest of the data, and I can't find anything on how to add different data as a new column.
Any help would be greatly appreciated, thank you.
Try something like this:
name=(Alex Barbara Connor Daniel Matt Peter Stan)
age=(22 23 55 32 21 8 89)
city=(London Manchester Rome Alberta Naples Detroit Amsterdam)
touch info.txt
echo "Name Age City" > info.txt
for n in $(seq 0 6); do
echo ${name[$n]} ${age[$n]} ${city[$n]} >> info.txt
done
Output in info.txt:
Name Age City
Alex 22 London
Barbara 23 Manchester
Connor 55 Rome
Daniel 32 Alberta
Matt 21 Naples
Peter 8 Detroit
Stan 89 Amsterdam
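One small refinement (not part of the answer above): $(seq 0 6) hardcodes the array length, so a sketch that loops over the actual indices instead might look like this.
for n in "${!name[@]}"; do
echo "${name[$n]} ${age[$n]} ${city[$n]}" >> info.txt
done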
JoseLinares solved your problem. For your information, here is a solution with the paste command whose purpose is exactly that: putting data from different sources in separate columns.
$ printf 'Name\tAge\tCity\n'
$ paste <(printf '%s\n' "${name[@]}") \
<(printf '%3d\n' "${age[@]}") \
<(printf '%s\n' "${city[@]}")
Name Age City
Alex 22 London
Barbara 23 Manchester
Connor 55 Rome
Daniel 32 Alberta
Matt 21 Naples
Peter 8 Detroit
Stan 89 Amsterdam
You can fix a specific width for each column (here 20 is used)
name=(Alex Barbara Connor Daniel Matt Peter Stan)
age=(22 23 55 32 21 8 89)
city=(London Manchester Rome Alberta Naples Detroit Amsterdam)
for i in "${!name[#]}"; do
printf "%-20s %-20s %-20s\n" "${name[i]}" "${age[i]}" "${city[i]}"
done
Output:
Alex 22 London
Barbara 23 Manchester
Connor 55 Rome
Daniel 32 Alberta
Matt 21 Naples
Peter 8 Detroit
Stan 89 Amsterdam
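To tie this back to info.txt from the original question, a usage sketch (same arrays assumed) that writes the header plus the aligned rows to the file:
{
printf "%-20s %-20s %-20s\n" "Name" "Age" "City"
for i in "${!name[@]}"; do
printf "%-20s %-20s %-20s\n" "${name[i]}" "${age[i]}" "${city[i]}"
done
} > info.txt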

Shell script to sum columns associated with a name

I have a file with thousands of numbers in column 1, and each sequence of these numbers is associated with a single person. Would someone have any idea how I can create a shell script to sum column 1 for a specific person, e.g.:
John is 10+20+30+50 = 110
The output of the script would be: John 110, and so on and so forth.
I have tried with while, for, etc., but I can't associate the sum with the person :(
Example of the file:
10 John
20 John
30 John
50 John
10 Paul
10 Paul
20 Paul
20 Paul
20 Robert
30 Robert
30 Robert
60 Robert
80 Robert
40 Robert
40 Robert
40 Robert
15 Mike
30 Mike
One awk solution that prints averages to 2 decimal places and orders output by name:
awk '
{ total[$2]+=$1
count[$2]++
}
END { PROCINFO["sorted_in"]="#ind_str_asc"
for ( i in total )
printf "%-10s %5d / %-5d = %5.2f\n", i, total[i], count[i], total[i]/count[i]
}
' numbers.dat
This generates:
John 110 / 4 = 27.50
Mike 45 / 2 = 22.50
Paul 60 / 4 = 15.00
Robert 340 / 8 = 42.50
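As a side note, PROCINFO["sorted_in"] is a GNU awk (gawk) feature. A minimal sketch of a more portable variant, assuming any POSIX awk, is to drop that line and sort the output afterwards:
awk '
{ total[$2]+=$1
count[$2]++
}
END { for ( i in total )
printf "%-10s %5d / %-5d = %5.2f\n", i, total[i], count[i], total[i]/count[i]
}
' numbers.dat | sort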
awk '{ map[$2]+=$1 } END { for (i in map) { print i" "map[i] } }' file
Using awk, create an array with the name as the first index and a running total of the values for each name. At the end, print the names and totals.
Thanks a lot Raman, it worked... do you happen to know if it would be possible to perform a calculation in the same awk to get the average for each one? For example, John is 10+20+30+50 = 110, 110 / 4 = 27
Assumptions:
data resides in a file named numbers.dat
we'll store totals and counts in arrays but calculate averages simply for display (OP can decide if averages should also be stored in an array)
One bash solution using a couple of associative arrays to keep track of our numbers:
unset total count
declare -A total count
while read -r number name
do
(( total[${name}] += $number))
(( count[${name}] ++ ))
done < numbers.dat
typeset -p total count
This generates:
declare -A total=([Mike]="45" [Robert]="340" [John]="110" [Paul]="60" )
declare -A count=([Mike]="2" [Robert]="8" [John]="4" [Paul]="4" )
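If only the name/total pairs from the original question are wanted (e.g. John 110), a small sketch using the same total array:
for i in "${!total[@]}"
do
printf "%s %d\n" "${i}" "${total[${i}]}"
done | sort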
If we want integer based averages (ie, no decimal places):
for i in "${!total[@]}"
do
printf "%-10s %5d / %-5d = %5d\n" "${i}" "${total[${i}]}" "${count[${i}]}" $(( ${total[${i}]} / ${count[${i}]} ))
done
This generates:
Mike 45 / 2 = 22
Robert 340 / 8 = 42
John 110 / 4 = 27
Paul 60 / 4 = 15
If we want the averages to include, say, 2 decimal places:
for i in "${!total[@]}"
do
printf "%-10s %5d / %-5d = %5.2f\n" "${i}" "${total[${i}]}" "${count[${i}]}" $( bc <<< "scale=2;${total[${i}]} / ${count[${i}]}" )
done
This generates:
Mike 45 / 2 = 22.50
Robert 340 / 8 = 42.50
John 110 / 4 = 27.50
Paul 60 / 4 = 15.00
Output sorted by name:
for i in "${!total[@]}"
do
printf "%-10s %5d / %-5d = %5.2f\n" "${i}" "${total[${i}]}" "${count[${i}]}" $( bc <<< "scale=2;${total[${i}]} / ${count[${i}]}" )
done | sort
This generates:
John 110 / 4 = 27.50
Mike 45 / 2 = 22.50
Paul 60 / 4 = 15.00
Robert 340 / 8 = 42.50

Combining 2 or more `grep -A` output

This is my sample data
Apple 13
Apple 37
Apple 341
Apple 27B
Apple 99
Banana 00
Banana 988
Banana 507
Banana 11
Banana 11A
I would like to get the output like this
Apple 13
Apple 37
Apple 341
Banana 00
Banana 988
The problem is that I can only use grep with the -A 2 switch one time:
root@Ubuntu:/tmp# grep -A 2 'e 1' data.txt
Apple 13
Apple 37
Apple 341
root@Ubuntu:/tmp#
And another grep with -A 1:
root@Ubuntu:/tmp# grep -A 1 'a 0' data.txt
Banana 00
Banana 988
root@Ubuntu:/tmp#
I've been trying to use egrep but I did not get the output that I wanted.
root@Ubuntu:/tmp# egrep 'e 1|a 0' data.txt
Apple 13
Banana 00
root@Ubuntu:/tmp#
I would like to get 2 more lines after Apple 13 and 1 more line after Banana 00.
Please advise.
With GNU sed:
sed -n '/e 1/{N;N;p}; /a 0/{N;p}' file
Output:
Apple 13
Apple 37
Apple 341
Banana 00
Banana 988
See: man sed
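For readability, here is the same command with each piece annotated (the command itself is unchanged):
# -n             suppress automatic printing of every line
# /e 1/{N;N;p}   on a line matching "e 1": append the next two lines to the
#                pattern space (N twice), then print all three together
# /a 0/{N;p}     on a line matching "a 0": append the next line, then print both
sed -n '/e 1/{N;N;p}; /a 0/{N;p}' file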
I'd recommend awk or sed to solve this kind of problem.
Using awk:
$ awk ' /e 1/{i=1; t=3} /a 0/{i=1; t=2} i++<=t' file
Apple 13
Apple 37
Apple 341
Banana 00
Banana 988
i : iterator
t : threshold
/e 1/{i=1; t=3} : if the line contains e 1, then set i=1 and t=3.
t=3 because 3 lines in total need to be printed (including the matched line)
/a 0/{i=1; t=2} : if the line contains a 0, then set i=1 and t=2
i++<=t : if this is true, print the line; i++ increments i after each check
I don't have any idea how to solve this with a single command. If you don't mind using a couple of commands, please use the following:
grep -A 2 "e 1" test.txt && grep -A 1 "a 0" test.txt

Filtering Input files

So I'm trying to filter 'duplicate' results from a file.
I've a file that looks like:
7 14 35 35 4 23
23 53 85 27 49 1
35 4 23 27 49 1
....
which I can mentally divide into item 1 and item 2: item 1 is the first 3 numbers on each line and item 2 is the last 3 numbers on each line.
I've also got a list of 'items':
7 14 35
23 53 85
35 4 23
27 49 1
...
At a certain point in the file, let's say line number 3 (this number is arbitrary, just an example), the 'items' can be separated. Let's say lines 1 and 2 are red and lines 3 and 4 are blue.
I want to make sure that in my original file there are no red-red or blue-blue combinations, only red-blue or blue-red, while retaining the original numbers.
So ideally the file would go from:
7 14 35 35 4 23 (red blue)
23 53 85 27 49 1 (red blue)
35 4 23 27 49 1 (blue blue)
....
to
7 14 35 35 4 23 (red blue)
23 53 85 27 49 1 (red blue)
....
I'm having trouble thinking of a good (or any) way to do it.
Any help is appreciated.
EDIT:
A filtering script I have that grabs lines if they contain blue or red items:
#!/bin/bash
while read name; do
grep "$name" Twoitems
done < Itemblue > filtered
while read name2; do
grep "$name2" filtered
done < Itemred > double_filtered
EDIT2:
Example input and item files:
This is pretty easy using grep with option -f.
First of all, generate four 'pattern' files out of your items file.
I am using AWK here, but you might as well use Perl or what not.
Following your example, I put the 'split' between lines 2 and 3; adjust as necessary.
awk 'NR <= 2 {print "^" $0 " "}' items.txt > starts_red.txt
awk 'NR <= 2 {print " " $0 "$"}' items.txt > ends_red.txt
awk 'NR >= 3 {print "^" $0 " "}' items.txt > starts_blue.txt
awk 'NR >= 3 {print " " $0 "$"}' items.txt > ends_blue.txt
Next, use a grep pipeline using the pattern files (option -f) to filter the appropriate lines from the input file.
grep -f starts_red.txt input.txt | grep -f ends_blue.txt > red_blue.txt
grep -f starts_blue.txt input.txt | grep -f ends_red.txt > blue_red.txt
Finally, concatenate the two output files.
Of course, you might as well use >> to let the second grep pipeline append its output to the output of the first.
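For completeness, that last step could look like this (the output filename is just an example):
cat red_blue.txt blue_red.txt > filtered.txt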
Let's say file1's contents are
7 14 35 35 4 23
23 53 85 27 49 1
35 4 23 27 49 1
and file2 contents are
7 14 35
23 53 85
35 4 23
27 49 1
Then, you can use a hash to map line numbers to colors based on your cutoff. Using that hash, you can compare the two halves of each line in the first file (split on the third space) and check whether they have different colors.
I suppose you want something like the script below. Feel free to modify it according to your requirements.
#!/usr/bin/perl
use strict;
use warnings;
#declare a global hash to keep track of line and colors
my %color;
#open both the files
open my $fh1, '<', 'file1' or die "unable to open file1: $! \n";
open my $fh2, '<', 'file2' or die "unable to open file2: $! \n";
#iterate over the second file and store the lines as
#red or blue in hash based on line nos
while(<$fh2>){
chomp;
if($. <= 2){
$color{$_}="red";
}
else{
$color{$_}="blue";
}
}
#close second file
close($fh2);
#iterate over first file
while(<$fh1>){
chomp;
#split the line on 3rd space
my ($part1,$part2)=split /(?:\d+\s){3}\K/;
#remove trailing spaces present
$part1=~s/\s+$//;
#print if $part1 and $part2 do not belong to the same color
print "$_\n" if($color{$part1} ne $color{$part2});
}
#close first file
close($fh1);

how to summarize data based on a field in a row

In bash, how can I read in a large .csv file and summarize the data? I need to get totals for each person.
example input:
joey 4
joey 3
joey 4
joey 6
paul 7
paul 3
paul 1
paul 4
trevor 5
trevor 6
henry 7
mark 8
mark 9
tom 0
It should end up like this:
joey 17
paul 15
trevor 11
henry 7
mark 17
tom 0
list=$(awk '{print $1}' file | uniq)   # "file" here stands for your input file
It gives you something like this:
joey
paul
trevor
henry
mark
tom
Now let's make two for loops:
for i in $list
do
counter=0
for j in $(grep "^$i " file | awk '{print $2}')
do
counter=$((counter + j))
done
echo "$i $counter"
done
The first loop goes over the names and the inner one totals the results for each name. I guess it should work, and it's quite an easy way.
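A much shorter route, along the lines of the awk answers earlier on this page (the filename file is assumed), is to let awk accumulate the totals directly:
awk '{ sum[$1] += $2 } END { for (name in sum) print name, sum[name] }' file
Pipe it through sort if a fixed output order is needed.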
