Let's say the table data below is stored in a variable called "data":
1 apple 50 Mary
2 banana 40 Lily
3 orange 34 Jack
for i in ${data}
do
echo $i
done
# Expected Output
1 apple 50 Mary
2 banana 40 Lily
3 orange 34 Jack
How do I convert the above table of data into three entries so that I can iterate over it using a for loop and print each single entry in the exact same format?
Any suggestions?
Thanks, #Darkman, for this "https://stackoverflow.com/questions/11393817/read-lines-from-a-file-into-a-bash-array" link; it helped in my case.
# The line below does the trick:
# IFS=$'\r\n' GLOBIGNORE='*'
# Setting IFS to $'\r\n' makes the shell split the expansion on newlines (and carriage
# returns) instead of on spaces, and GLOBIGNORE='*' effectively disables glob expansion
# of the resulting words.
data="1 apple 50 Mary
2 banana 40 Lily
3 orange 34 Jack"
IFS=$'\r\n' GLOBIGNORE='*'
for i in ${data}
do
echo $i
echo "test"
done
# Resulting Output
1 apple 50 Mary
test
2 banana 40 Lily
test
3 orange 34 Jack
test
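For reference, a common alternative that avoids changing IFS and GLOBIGNORE for the rest of the script is a while read loop over a here-string; a minimal sketch using the same data variable:
while IFS= read -r line
do
    echo "$line"    # one table row per iteration, spacing preserved
done <<< "${data}"
read -r prevents backslash interpretation, and the empty IFS keeps leading and trailing whitespace intact.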
I have a set of text files in a folder, like so:
a.txt
1
2
3
4
5
b.txt
1000
1001
1002
1003
1004
.. and so on (assume a fixed number of rows, but an unknown number of text files). What I am looking for is a results file holding the row-by-row sum across all files:
result.txt
1001
1003
1005
1007
1009
How do I go about achieving this in bash, without using Python etc.?
Using awk
Try:
$ awk '{a[FNR]+=$0} END{for(i=1;i<=FNR;i++)print a[i]}' *.txt
1001
1003
1005
1007
1009
How it works:
a[FNR]+=$0
For every line read, we add the value of that line, $0, to the partial sum a[FNR], where a is an array and FNR is the line number in the current file.
END{for(i=1;i<=FNR;i++)print a[i]}
After all the files have been read in, this prints out the sum for each line number. (In the END block, FNR still holds the line count of the last file read, which is fine here because every file has the same number of rows.)
Using paste and bc
$ paste -d+ *.txt | bc
1001
1003
1005
1007
1009
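How it works: paste -d+ joins corresponding lines of the files with a + delimiter, turning each row into an arithmetic expression, which bc then evaluates. With the example files above, the intermediate output is:
$ paste -d+ a.txt b.txt
1+1000
2+1001
3+1002
4+1003
5+1004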
Using the $ regex I can match the last position of each line, but if I have the following:
12345
23456
34567
I need to add a space so it becomes
1234 5
2345 6
3456 7
Thanks!
$ sed 's/.$/ &/' file
1234 5
2345 6
3456 7
Here .$ matches the last character of each line, and & in the replacement stands for the matched text, so the substitution inserts a space in front of it.
$ gawk -v FIELDWIDTHS='4 1' '{$1=$1}1' file
1234 5
2345 6
3456 7
With gawk, FIELDWIDTHS='4 1' splits each record into a 4-character field and a 1-character field; the assignment $1=$1 rebuilds the record with the default output field separator (a space), and the trailing 1 prints it. Note that this assumes every line is exactly 5 characters wide.
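If you want to stay in pure bash, a minimal sketch using parameter expansion (assuming the input is in a file named file):
while IFS= read -r line
do
    # ${line%?} drops the last character; ${line: -1} is the last character
    # (the space before -1 matters, or it would be parsed as a default-value expansion)
    printf '%s %s\n' "${line%?}" "${line: -1}"
done < file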
In bash, how can I read in a large .csv file and summarize the data? I need to get totals for each person.
example input:
joey 4
joey 3
joey 4
joey 6
paul 7
paul 3
paul 1
paul 4
trevor 5
trevor 6
henry 7
mark 8
mark 9
tom 0
It should end up like this in the end:
joey 17
paul 15
trevor 11
henry 7
mark 17
tom 0
list=$(awk '{print $1}' file | uniq)
Assuming the data is in a file named file, it gives you something like this (uniq suffices here because the input is grouped by name):
joey
paul
trevor
henry
mark
tom
Now let's make two for loops:
for i in $list
do
    counter=0
    for j in $(grep "^$i " file | awk '{print $2}')
    do
        counter=$((counter + j))
    done
    echo "$i $counter"
done
The first loop goes over the names and the second one sums the values for each name. The counter is reset to 0 for every name, $((...)) does the arithmetic, and anchoring the grep on "^$i " (with a trailing space) prevents partial name matches.
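For comparison, the same summary in a single pass with awk, sketched under the same assumption that the data is in a file named file (note that the output order of awk's for..in loop is unspecified):
awk '{sum[$1] += $2} END {for (name in sum) print name, sum[name]}' file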
cat Error00
4 0 375
4 2001 21
4 2002 20
cat Error01
4 0 465
4 2001 12
4 2002 40
4 2016 1
I want the output as below:
4 0 375 465
4 2001 21 12
4 2002 20 40
4 2016 - 1
I am using the query below. The problem here is that I am not able to handle the grep for the two fields together, because there is a space between them.
Please suggest how to get rid of this.
keylist=$(awk '{print $1,$2}' Error0[0-1] | sort | uniq)
for key in ${keylist} ; do
echo ${key}
val_a=$(grep "^${key}" Error00 | awk '{print $3}') ;val_a=${val_a:---}
val_b=$(grep "^${key}" Error01 | awk '{print $1,$2}') ; val_b=${val_b:--- --}
echo $key ${val_a} >>testreport
done
I am getting the output as below:
4 375 465
0
4 21 12
2001
4 20 20
2002
4 - 1
2016
A single awk one-liner can handle this easily:
awk 'FNR==NR{a[$1,$2]=$3;next}{print $1,$2,(a[$1,$2]?a[$1,$2]:"-"),$3}' err0 err1
4 0 375 465
4 2001 21 12
4 2002 20 40
4 2016 - 1
How it works: FNR==NR is true only while the first file is being read; for those lines the third field is stored in the array a, keyed by the first two fields, and next skips to the following line. For every line of the second file, the two key fields are printed, then the stored value from the first file (or - if that key never appeared), then the line's own third field.
For formatted output you can use printf instead of print, as Jonathan Leffler suggests:
printf "%s %-6s %-6s %s\n",$1,$2,(a[$1,$2]?a[$1,$2]:"-"),$3
4 0 375 465
4 2001 21 12
4 2002 20 40
4 2016 - 1
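Spelled out in full, the printf variant of the one-liner above reads:
awk 'FNR==NR{a[$1,$2]=$3;next}{printf "%s %-6s %-6s %s\n",$1,$2,(a[$1,$2]?a[$1,$2]:"-"),$3}' err0 err1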
However, a more general solution is to pipe the result through column -t for nicely aligned table output:
awk '{....}' err0 err1 | column -t
4 0 375 465
4 2001 21 12
4 2002 20 40
4 2016 - 1
grep is not really the right tool for this job. You can either play with awk or Perl (or Python, or …), or you can use join. However, join only joins on a single column at a time, and you appear to need to join on two columns. So, we're going to have to massage the data so that it will work with join. I'm about to assume you're using bash and so have process substitution available. You can do the job without, but it is fiddlier and involves temporary files (and traps to clean them up, etc).
The key to the join will be to replace the blank between the first two columns with a colon (or any other convenient character; control-A would work fine too), then join the files on column 1. The inputs must be sorted, and in the output the colon must be replaced with a blank again.
$ join -o 0,1.2,2.2 -a 1 -a 2 -e '-' \
> <(sed 's/ */:/' Error00 | sort) \
> <(sed 's/ */:/' Error01 | sort) |
> sed 's/:/ /'
4 0 375 465
4 2001 21 12
4 2002 20 40
4 2016 - 1
$
The 's/ */:/' operation replaces the first sequence of one or more blanks with a colon; the input data has two blanks between the 4 and the 0 in the first line of Error00. The input to join must be in sorted order of the joining field, here the first field. The output is the join field, the second column of Error00 and the second column of Error01 (remembering that means the second column after the first two have been fused by the colon). If there's an unmatched line in the first file, generate an output line (-a 1); ditto for the second file; and for the missing fields, insert a dash (-e '-'). The final sed removes the colon that was added.
If you want the data formatted, pipe it through awk.
$ join -o 0,1.2,2.2 -a 1 -a 2 -e '-' \
> <(sed 's/ */:/' Error00 | sort) \
> <(sed 's/ */:/' Error01 | sort) |
> sed 's/:/ /' |
> awk '{printf("%s %-6s %-6s %s\n", $1, $2, $3, $4)}'
4 0 375 465
4 2001 21 12
4 2002 20 40
4 2016 - 1
$