Adding data from an array to a new column in a file using bash [duplicate] - bash

I have name, age, and city data:
name=(Alex Barbara Connor Daniel Matt Peter Stan)
age=(22 23 55 32 21 8 89)
city=(London Manchester Rome Alberta Naples Detroit Amsterdam)
I want to write this to a file as three columns with the headings Name, Age, and City. I can easily get the first column using:
touch info.txt
echo "Name Age City" > info.txt
for n in "${name[@]}"; do
echo "$n" >> info.txt
done
but I can't figure out how to get the rest of the data, and I can't find anything on how to add different data as a new column.
Any help would be greatly appreciated, thank you.

Try something like this:
name=(Alex Barbara Connor Daniel Matt Peter Stan)
age=(22 23 55 32 21 8 89)
city=(London Manchester Rome Alberta Naples Detroit Amsterdam)
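# The > redirection creates (or truncates) info.txt, so touch is not needed
echo "Name Age City" > info.txt
# Indices 0..6 cover the seven elements of each array
for n in $(seq 0 6); do
echo "${name[$n]} ${age[$n]} ${city[$n]}" >> info.txt
done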
Output in info.txt:
Name Age City
Alex 22 London
Barbara 23 Manchester
Connor 55 Rome
Daniel 32 Alberta
Matt 21 Naples
Peter 8 Detroit
Stan 89 Amsterdam

JoseLinares solved your problem. For your information, here is a solution with the paste command whose purpose is exactly that: putting data from different sources in separate columns.
$ printf 'Name\tAge\tCity\n'
$ paste <(printf '%s\n' "${name[@]}") \
<(printf '%3d\n' "${age[@]}") \
<(printf '%s\n' "${city[@]}")
Name Age City
Alex 22 London
Barbara 23 Manchester
Connor 55 Rome
Daniel 32 Alberta
Matt 21 Naples
Peter 8 Detroit
Stan 89 Amsterdam
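To get the same table into info.txt, as the question asks, the header and the paste output can share a single redirection. This is a minimal sketch, assuming bash (process substitution is not available in plain sh); the columns are tab-separated because that is paste's default delimiter.

```shell
#!/bin/bash
name=(Alex Barbara Connor Daniel Matt Peter Stan)
age=(22 23 55 32 21 8 89)
city=(London Manchester Rome Alberta Naples Detroit Amsterdam)

# Group the header and the body so info.txt is opened only once
{
  printf 'Name\tAge\tCity\n'
  paste <(printf '%s\n' "${name[@]}") \
        <(printf '%s\n' "${age[@]}") \
        <(printf '%s\n' "${city[@]}")
} > info.txt
```

Grouping the commands in braces avoids the repeated >> appends used in the loop version.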

You can fix a specific width for each column (here 20 is used)
name=(Alex Barbara Connor Daniel Matt Peter Stan)
age=(22 23 55 32 21 8 89)
city=(London Manchester Rome Alberta Naples Detroit Amsterdam)
for i in "${!name[@]}"; do
printf "%-20s %-20s %-20s\n" "${name[i]}" "${age[i]}" "${city[i]}"
done
Output:
Alex                 22                   London
Barbara              23                   Manchester
Connor               55                   Rome
Daniel               32                   Alberta
Matt                 21                   Naples
Peter                8                    Detroit
Stan                 89                   Amsterdam
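If a hard-coded width of 20 is too wide or too narrow, the width can be derived from the data instead. The following is a sketch only: max_len is a hypothetical helper introduced here for illustration, not part of the answer above, and the header row is added on the assumption that it should line up with the data.

```shell
#!/bin/bash
name=(Alex Barbara Connor Daniel Matt Peter Stan)
age=(22 23 55 32 21 8 89)
city=(London Manchester Rome Alberta Naples Detroit Amsterdam)

# max_len (hypothetical helper): print the length of the longest argument
max_len() {
  local max=0 s
  for s in "$@"; do
    if (( ${#s} > max )); then
      max=${#s}
    fi
  done
  printf '%d\n' "$max"
}

# Column widths include the heading so the header row lines up too
w1=$(max_len Name "${name[@]}")
w2=$(max_len Age "${age[@]}")

# The last column needs no padding
table=$(
  printf "%-${w1}s %-${w2}s %s\n" Name Age City
  for i in "${!name[@]}"; do
    printf "%-${w1}s %-${w2}s %s\n" "${name[i]}" "${age[i]}" "${city[i]}"
  done
)
printf '%s\n' "$table"
```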


How do I convert a table of data into list entries

Let's say the table below is stored in a variable called "data":
1 apple 50 Mary
2 banana 40 Lily
3 orange 34 Jack
#
for i in ${data}
do
echo $i
done
# Expected Output
1 apple 50 Mary
2 banana 40 Lily
3 orange 34 Jack
How do I convert the above table of data into three entries so that I can iterate over it with a for loop and print each entry in exactly the same format?
Any suggestions?
Thanks, @Darkman, for the link "https://stackoverflow.com/questions/11393817/read-lines-from-a-file-into-a-bash-array"; it solved my problem.
# The line below is what helps:
# IFS=$'\r\n' GLOBIGNORE='*'
data="1 apple 50 Mary
2 banana 40 Lily
3 orange 34 Jack"
IFS=$'\r\n' GLOBIGNORE='*'
for i in ${data}
do
echo $i
echo "test"
done
# Resulted Output
1 apple 50 Mary
test
2 banana 40 Lily
test
3 orange 34 Jack
test
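Setting IFS and GLOBIGNORE as above changes them for the rest of the script. A sketch of an alternative that leaves both untouched is to read the variable line by line into an array; the array name entries is an assumption made for this example.

```shell
#!/bin/bash
data="1 apple 50 Mary
2 banana 40 Lily
3 orange 34 Jack"

# IFS= applies only to the read command, and -r keeps backslashes literal
entries=()
while IFS= read -r line; do
  entries+=("$line")
done <<< "$data"

# Each array element is one complete row, spaces included
for entry in "${entries[@]}"; do
  echo "$entry"
  echo "test"
done
```

Quoting "${entries[@]}" and "$entry" is what keeps each row as a single word.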

AWK or SED Replace space between alphabets in a particular column [closed]

I have an infile as below:
infile:
INM00042170 28.2500 74.9167 290.0 CHURU 2015 2019 2273
INM00042182 28.5833 77.2000 211.0 NEW DELHI/SAFDARJUNG 1930 2019 67874
INXUAE05462 28.6300 77.2000 216.0 NEW DELHI 1938 1942 2068
INXUAE05822 25.7700 87.5200 40.0 PURNEA 1933 1933 179
INXUAE05832 31.0800 77.1800 2130.0 SHIMLA 1926 1928 728
PKM00041640 31.5500 74.3333 214.0 LAHORE CITY 1960 2019 22915
I want to replace the space between two words by an underscore in column 5 (example: NEW DELHI becomes NEW_DELHI). I want output as below.
outfile:
INM00042170 28.2500 74.9167 290.0 CHURU 2015 2019 2273
INM00042182 28.5833 77.2000 211.0 NEW_DELHI/SAFDARJUNG 1930 2019 67874
INXUAE05462 28.6300 77.2000 216.0 NEW_DELHI 1938 1942 2068
INXUAE05822 25.7700 87.5200 40.0 PURNEA 1933 1933 179
INXUAE05832 31.0800 77.1800 2130.0 SHIMLA 1926 1928 728
PKM00041640 31.5500 74.3333 214.0 LAHORE_CITY 1960 2019 22915
Thank you
#!/bin/bash
# Join fields 5 and 6 with an underscore and drop the results where
# field 6 is a number (single-word names). What remains is the list of
# new names (with underscore) for all cities that need to be replaced.
NEW_NAMES=( $(awk '{print $5 "_" $6}' infile | grep -vE "_[0-9]") )
# Iterate over all new names
for NEW_NAME in "${NEW_NAMES[@]}"; do
OLD_NAME=$(echo "$NEW_NAME" | tr '_' ' ')
# Replace in the file; | is the delimiter because a name may contain /
sed -i "s|${OLD_NAME}|${NEW_NAME}|g" infile
done
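The same result can be sketched in a single awk pass. This relies on an assumption read off the sample, not stated in the question: every row has four leading fields and three trailing numeric fields, so the name is whatever lies between them. The file names sample_infile and sample_outfile are hypothetical, and runs of spaces are collapsed to one space in the output.

```shell
#!/bin/sh
# Three rows from the question, written to a hypothetical sample file
cat > sample_infile <<'EOF'
INM00042170 28.2500 74.9167 290.0 CHURU 2015 2019 2273
INM00042182 28.5833 77.2000 211.0 NEW DELHI/SAFDARJUNG 1930 2019 67874
PKM00041640 31.5500 74.3333 214.0 LAHORE CITY 1960 2019 22915
EOF

awk '{
  # Fields 1-4 are fixed; the name starts at field 5
  out = $1 " " $2 " " $3 " " $4 " " $5
  # Any further name words (up to field NF-3) are joined with "_"
  for (i = 6; i <= NF - 3; i++) out = out "_" $i
  # The last three fields are the two years and the count
  for (i = NF - 2; i <= NF; i++) out = out " " $i
  print out
}' sample_infile > sample_outfile

cat sample_outfile
```

Unlike the sed loop, this does not modify the input file and handles names of more than two words.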

reading a file into an array in bash

Here is my code
#!bin/bash
IFS=$'\r\n'
GLOBIGNORE='*'
command eval
'array=($(<'$1'))'
sorted=($(sort <<<"${array[*]}"))
for ((i = -1; i <= ${array[-25]}; i--)); do
echo "${array[i]}" | awk -F "/| " '{print $2}'
done
I keep getting an error that says "line 5: array=($(<)): command not found"
This is my problem.
As a whole my code should read in a file as a command line argument, sort the elements, then print out column 2 of the last 25 lines. I haven't been able to test this far so if there's a problem there too any help would be appreciated.
This is some of what the file contains:
290729 123456
79076 12345
76789 123456789
59462 password
49952 iloveyou
33291 princess
21725 1234567
20901 rockyou
20553 12345678
16648 abc123
16227 nicole
15308 daniel
15163 babygirl
14726 monkey
14331 lovely
14103 jessica
13984 654321
13981 michael
13488 ashley
13456 qwerty
13272 111111
13134 iloveu
13028 000000
12714 michelle
11761 tigger
11489 sunshine
11289 chocolate
11112 password1
10836 soccer
10755 anthony
10731 friends
10560 butterfly
10547 purple
10508 angel
10167 jordan
9764 liverpool
9708 justin
9704 loveme
9610 fuckyou
9516 123123
9462 football
9310 secret
9153 andrea
9053 carlos
8976 jennifer
8960 joshua
8756 bubbles
8676 1234567890
8667 superman
8631 hannah
8537 amanda
8499 loveyou
8462 pretty
8404 basketball
8360 andrew
8310 angels
8285 tweety
8269 flower
8025 playboy
7901 hello
7866 elizabeth
7792 hottie
7766 tinkerbell
7735 charlie
7717 samantha
7654 barbie
7645 chelsea
7564 lovers
7536 teamo
7518 jasmine
7500 brandon
7419 666666
7333 shadow
7301 melissa
7241 eminem
7222 matthew
In Linux you can simply do a
sort -nbr file_to_sort | head -n 25 | awk '{print $2}'
read in a file as a command line argument, sort the elements, then
print out column 2 of the last 25 lines.
From that description of the problem, I suggest:
#! /bin/sh
sort -bn "$1" | tail -n 25 | awk '{print $2}'
As a rule, use the shell to operate on filenames, and never use the
shell to operate on data. Utilities like sort and awk are far
faster and more powerful than the shell when it comes to processing a
file.
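The pipeline can be wrapped with a little argument checking. The sketch below is illustrative only: the function name last_column2, the optional line count, and the sample file name sample_counts.txt are all assumptions, not part of the answers above.

```shell
#!/bin/sh
# last_column2 (hypothetical name): sort FILE numerically and print
# column 2 of the last COUNT lines (default 25)
last_column2() {
  file=$1
  count=${2:-25}
  if [ ! -r "$file" ]; then
    echo "cannot read file: $file" >&2
    return 1
  fi
  sort -n "$file" | tail -n "$count" | awk '{print $2}'
}

# Demo on three lines taken from the question's sample data
printf '%s\n' '79076 12345' '290729 123456' '59462 password' > sample_counts.txt
result=$(last_column2 sample_counts.txt 2)
printf '%s\n' "$result"
```

In a real script the last lines would be replaced by a call such as last_column2 "$1".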

Getting the sum of the output in Unix [duplicate]

I am trying to get the sum of my output in the bash shell. The one constraint is that I may only use awk for this.
This is the code I am using to get the output:
awk '{print substr($7, 9, 4)}' emp.txt
This is the output I am getting: (output omitted)
7606
6498
7947
4044
1657
3872
4834
8463
9280
2789
9104
This is how I am trying to sum the numbers:
awk '(s = s + substr($7, 9, 4)) {print s}' emp.txt
The problem is that it does not give me just the total (which should be 9942686) but prints the running sum after every line (as shown below).
(output omitted)
9890696
9898643
9902687
9904344
9908216
9913050
9921513
9930793
9933582
9942686
Am I using the code the wrong way? Or is there another way of doing this with awk?
Here is the sample file I am working on:
Brynlee Watkins F 55 Married 2016 778-555-6498 62861
Malcolm Curry M 24 Married 2016 604-555-7947 54647
Aylin Blake F 45 Married 2015 236-555-4044 80817
Mckinley Hodges F 50 Married 2015 604-555-1657 46316
Rylan Dorsey F 51 Married 2017 778-555-3872 77160
Taylor Clarke M 23 Married 2015 604-555-4834 46624
Vivaan Hooper M 26 Married 2016 778-555-8463 80010
Gibson Rowland M 42 Married 2017 236-555-9280 59874
Alyson Mahoney F 51 Single 2017 778-555-2789 71394
Catalina Frazier F 53 Married 2016 604-555-9104 79364
EDIT: I want the sum of only the numbers that repeat in the output. Say the repeating numbers are 4826 and 0028, each appearing twice. I want the sum of just these numbers, counting every repetition individually (so four values are added). The desired output for these four numbers is 9708.
Will Duffy M 33 Single 2017 236-555-4826 47394
Nolan Reed M 27 Single 2015 604-555-0028 46622
Anya Horn F 54 Married 2017 236-555-4826 73270
Cynthia Davenport F 29 Married 2015 778-555-0028 59687
Oscar Medina M 43 Married 2016 778-555-7864 73688
Angelina Herrera F 37 Married 2017 604-555-7910 82061
Peyton Reyes F 35 Married 2017 236-555-8046 51920
END { print s }
Since you only need the total sum printed once, do it under the END pattern.
awk '{s = s + substr($7, 9, 4)} END {print s}' emp.txt
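The difference between the two forms can be seen side by side on three rows from the question. This is a small sketch; emp_sample.txt is a hypothetical file name used so the real emp.txt is not overwritten.

```shell
#!/bin/sh
cat > emp_sample.txt <<'EOF'
Brynlee Watkins F 55 Married 2016 778-555-6498 62861
Malcolm Curry M 24 Married 2016 604-555-7947 54647
Aylin Blake F 45 Married 2015 236-555-4044 80817
EOF

# Pattern form: the assignment is the condition, so the action runs
# (and prints) on every line where the running sum is non-zero
running=$(awk '(s = s + substr($7, 9, 4)) {print s}' emp_sample.txt)

# END form: accumulate silently, print once after the last line
total=$(awk '{s += substr($7, 9, 4)} END {print s}' emp_sample.txt)

printf 'running sums:\n%s\n' "$running"
printf 'total: %s\n' "$total"
```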
Try the following awk command. It always takes the last group of digits after a -:
awk -F' |-' '{sum+=$(NF-1)} END{print sum}' Input_file
EDIT:
awk -F' |-' '
{
++a[$(NF-1)];
b[$(NF-1)]=b[$(NF-1)]?b[$(NF-1)]+$(NF-1):$(NF-1)
}
END{
for(i in a){
if(a[i]>1){
print i,b[i]}
}}
' Input_file
Output will be as follows:
4826 9652
0028 56
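The question asks for one grand total (9708) and not a per-number breakdown. A sketch that adds the per-number sums together in the END block follows; it uses the question's own substr extraction, and emp_sample.txt is a hypothetical file name.

```shell
#!/bin/sh
cat > emp_sample.txt <<'EOF'
Will Duffy M 33 Single 2017 236-555-4826 47394
Nolan Reed M 27 Single 2015 604-555-0028 46622
Anya Horn F 54 Married 2017 236-555-4826 73270
Cynthia Davenport F 29 Married 2015 778-555-0028 59687
Oscar Medina M 43 Married 2016 778-555-7864 73688
EOF

total=$(awk '{
  n = substr($7, 9, 4)   # last four digits of the phone number
  count[n]++             # how many times this number occurs
  sum[n] += n            # sum over all of its occurrences
}
END {
  # Add up only the numbers that occurred more than once
  for (k in count)
    if (count[k] > 1) total += sum[k]
  print total + 0
}' emp_sample.txt)

echo "$total"
```

The + 0 makes awk print 0 and not an empty line when nothing repeats.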

how to summarize data based on a field in a row

In bash, how can I read in a large .csv file and summarize the data? I need to get totals for each person.
example input:
joey 4
joey 3
joey 4
joey 6
paul 7
paul 3
paul 1
paul 4
trevor 5
trevor 6
henry 7
mark 8
mark 9
tom 0
It should end up like this in the end:
joey 17
paul 15
trevor 11
henry 7
mark 17
tom 2
list=$(awk '{print $1}' input.txt | uniq)
Here input.txt stands for the file holding your example input. This gives you something like:
joey
paul
trevor
henry
mark
tom
Now let's write two for loops:
for i in $list
do
counter=0
for j in $(grep "^$i " input.txt | awk '{print $2}')
do
counter=$((counter + j))
done
echo "$i $counter"
done
The outer loop goes through the names and the inner one adds up the values for each name. It is simple, but it rereads the file once per name, so it will be slow on a large file.
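For a large file, the totals can be collected in one pass with an awk associative array. This is a sketch under the assumption that the fields are separated by whitespace, as in the example input (a comma-separated file would need -F,); scores.txt is a hypothetical file name.

```shell
#!/bin/sh
# A few rows from the example input
printf '%s\n' 'joey 4' 'joey 3' 'paul 7' 'paul 3' 'tom 0' > scores.txt

summary=$(awk '{
  # Remember each name the first time it appears, to keep input order
  if (!($1 in total)) order[++n] = $1
  total[$1] += $2
}
END {
  for (i = 1; i <= n; i++) print order[i], total[order[i]]
}' scores.txt)

printf '%s\n' "$summary"
```

The order array is there because awk's for (k in total) does not guarantee any particular order.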
