Using bash/sed, I am trying to search for a matching string and, when a match is found, append the corresponding value to the end of the matching line.
Two lists:
[linuxbox tmp]$ cat lista
a 23
c 4
e 55
b 2
f 44
d 74
[linuxbox tmp]$ cat listb
a 3
e 34
c 84
b 1
f 500
d 666666
#!/bin/bash
rm -rf listc
cat listb |while read rec
do
var1="$(echo $rec | awk '{ print $1 }')"
var2="$(echo $rec | awk '{ print $2 }')"
if egrep "^$var1" lista; then
sed "/^$var1/ s/$/ $var2/1" lista >> listc
fi
done
When I run it I get:
[linuxbox tmp]$ ./blah.sh
a 23
e 55
c 4
b 2
f 44
d 74
[linuxbox tmp]$ cat listc
a 23 3
c 4
e 55
b 2
f 44
d 74
a 23
c 4
e 55 34
b 2
f 44
d 74
a 23
c 4 84
e 55
b 2
f 44
d 74
a 23
c 4
e 55
b 2 1
f 44
d 74
a 23
c 4
e 55
b 2
f 44 500
d 74
a 23
c 4
e 55
b 2
f 44
d 74 666666
The output I'm trying to get is:
a 23 3
e 55 34
c 4 84
b 2 1
f 44 500
d 74 666666
What am I doing wrong here? Is there a better way to accomplish this?
Thank you in advance.
If you don't mind getting a sorted output:
join <(sort lista) <(sort listb)
One way using awk:
awk 'FNR==NR { array[$1]=$2; next } { if ($1 in array) print $1, array[$1], $2 }' lista listb
Results:
a 23 3
e 55 34
c 4 84
b 2 1
f 44 500
d 74 666666
Based on your input files (no duplicate keys in a single file), the following will do the trick:
>> for key in $(awk '{print $1}' lista) ; do
+> echo $key $(awk -vK=$key '$1==K{$1="";print}' lista listb)
+> done
a 23 3
c 4 84
e 55 34
b 2 1
f 44 500
d 74 666666
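As for what goes wrong in the original loop: sed prints every line of lista on each pass, changed or not, so listc collects six full copies of the file. A minimal sketch of a fixed loop follows, assuming the keys contain no regex metacharacters and are separated from the value by a single space. The scratch directory and the listc_out variable are only there for illustration.

```shell
# Sample data copied from the question, written to a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' 'a 23' 'c 4' 'e 55' 'b 2' 'f 44' 'd 74' > "$workdir/lista"
printf '%s\n' 'a 3' 'e 34' 'c 84' 'b 1' 'f 500' 'd 666666' > "$workdir/listb"

# read splits each record into key and value, so no echo | awk is needed.
while read -r key val; do
    # -n suppresses sed's default output; the p flag prints only the
    # line that was actually changed. The space after $key keeps key "a"
    # from also matching a line such as "ab 1".
    sed -n "/^$key / s/\$/ $val/p" "$workdir/lista"
done < "$workdir/listb" > "$workdir/listc"

listc_out=$(cat "$workdir/listc")
echo "$listc_out"
```

The output follows the order of listb, which is the order asked for in the question.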
I have two text files (TSV format), each with 240 columns and 100 lines. I would like to interleave their columns (column 1 of the first file, column 1 of the second, column 2 of the first, and so on) into one file with 480 columns and 100 lines. How could I achieve this with standard command line tools in Linux?
Example (in case of a single line) :
FileA:
1 2 3 4 5 ・・・
FileB:
001 002 003 004 005 ・・・
Expected Result:
1 001 2 002 3 003 ・・・
just awk with "getline"
==> file1 <==
a b c d e f g h i j k l m
n o p q r s t u v w x y z
==> file2 <==
1 2 3 4 5 6 7 8 9 10 11 12 13
14 15 16 17 18 19 20 21 22 23 24 25 26
$ awk '{split($0,f1);
getline < "file2";
for(i=1;i<=NF;i++) printf "%s%s%s%s", f1[i], OFS, $i, (i==NF?ORS:OFS)}' file1
a 1 b 2 c 3 d 4 e 5 f 6 g 7 h 8 i 9 j 10 k 11 l 12 m 13
n 14 o 15 p 16 q 17 r 18 s 19 t 20 u 21 v 22 w 23 x 24 y 25 z 26
if space is not the required output delimiter set OFS accordingly...
ps. getline use is normally discouraged for any non-trivial script, and usually should be avoided by beginners. See here for example for more explanation.
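One of the usual getline pitfalls is ignoring its return value. Below is a sketch of the same approach with that check added, reading the second file's line into a variable so that $0 and NF of the current record stay untouched. The small sample files and the names other, line and interleaved are made up for the example.

```shell
workdir=$(mktemp -d)
printf '%s\n' 'a b c' 'd e f' > "$workdir/file1"
printf '%s\n' '1 2 3' '4 5 6' > "$workdir/file2"

interleaved=$(awk -v other="$workdir/file2" '
{
    # getline returns 1 on success, 0 at end of file, -1 on error.
    if ((getline line < other) <= 0) {
        print "no matching line in " other > "/dev/stderr"
        exit 1
    }
    split(line, f2)              # fields of the line read from the second file
    for (i = 1; i <= NF; i++)
        printf "%s%s%s%s", $i, OFS, f2[i], (i == NF ? ORS : OFS)
}' "$workdir/file1")

echo "$interleaved"
```

If the second file is shorter than the first, this stops with an error instead of silently reusing stale fields.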
paste + awk solution:
Sample file1:
a b c d e f g h i j k l m n o p q r s t u v w x y z
a b c d e f g h i j k l m n o p q r s t u v w x y z
Sample file2:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
paste file1 file2 \
| awk '{ len=NF/2;
for (i=1; i<=len; i++)
printf "%s %s%s", $i, $(i+len),(i==len? ORS:OFS)
}'
The output:
a 1 b 2 c 3 d 4 e 5 f 6 g 7 h 8 i 9 j 10 k 11 l 12 m 13 n 14 o 15 p 16 q 17 r 18 s 19 t 20 u 21 v 22 w 23 x 24 y 25 z 26
a 1 b 2 c 3 d 4 e 5 f 6 g 7 h 8 i 9 j 10 k 11 l 12 m 13 n 14 o 15 p 16 q 17 r 18 s 19 t 20 u 21 v 22 w 23 x 24 y 25 z 26
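Since the question mentions TSV, here is a sketch of the same paste + awk idea with tab as both input and output separator, so fields that contain spaces stay intact. It assumes both files have the same number of columns on every line; the three-column sample files and the tsv variable are invented for the example.

```shell
workdir=$(mktemp -d)
printf 'a\tb\tc\n' > "$workdir/file1"
printf '1\t2\t3\n' > "$workdir/file2"

# paste joins the two lines with a tab, so the first half of the fields
# comes from file1 and the second half from file2.
tsv=$(paste "$workdir/file1" "$workdir/file2" |
    awk -F'\t' -v OFS='\t' '{
        len = NF / 2
        for (i = 1; i <= len; i++)
            printf "%s%s%s%s", $i, OFS, $(i + len), (i == len ? ORS : OFS)
    }')

printf '%s\n' "$tsv"
```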
Use bash to make some dummy files that match the spec, along with some letter-line suffixes to tell them apart:
for f in {A..z} {A..j} ; do echo $( seq -f '%g'"$f" 240 ) ; done > FileA
for f in {z..A} {j..A} ; do echo $( seq -f '%03.3g'"$f" 240 ) ; done > FileB
Use bash, paste and xargs:
paste -d' ' <(tr ' ' '\n' < FileA) <(tr ' ' '\n' < FileB) | xargs -L 240 echo
Since the output of that is a bit unwieldy, show the first ten lines, with only the first six and last five columns:
paste -d' ' <(tr ' ' '\n' < FileA) <(tr ' ' '\n' < FileB) | xargs -L 240 echo |
head | cut -d' ' -f1-6,476-480
1A 001z 2A 002z 3A 003z 238z 239A 239z 240A 240z
1B 001y 2B 002y 3B 003y 238y 239B 239y 240B 240y
1C 001x 2C 002x 3C 003x 238x 239C 239x 240C 240x
1D 001w 2D 002w 3D 003w 238w 239D 239w 240D 240w
1E 001v 2E 002v 3E 003v 238v 239E 239v 240E 240v
1F 001u 2F 002u 3F 003u 238u 239F 239u 240F 240u
1G 001t 2G 002t 3G 003t 238t 239G 239t 240G 240t
1H 001s 2H 002s 3H 003s 238s 239H 239s 240H 240s
1I 001r 2I 002r 3I 003r 238r 239I 239r 240I 240r
1J 001q 2J 002q 3J 003q 238q 239J 239q 240J 240q
I'm currently working on my karaoke files and I see a lot of non-capitalized words.
The .txt files are structured as key-value pairs and I was wondering how to capitalize the first letter of every word in the value.
Example txt:
#TITLE:fire and Water
#ARTIST:Some band
#CREATOR:yunho
#LANGUAGE:Korean
#EDITION:UAS
#MP3:2NE1 - Fire.mp3
#COVER:2NE1 - Fire.jpg
#VIDEO:2NE1 - Fire.avi
#VIDEOGAP:11.6
#BPM:595
#GAP:3860
F -4 4 16 I
F 2 4 16 go
F 8 6 16 by
F 16 4 16 the
F 22 6 16 name
F 30 4 16 of
F 36 10 16 C
F 46 10 16 L
F 58 6 16 of
F 66 5 16 2
F 71 3 16 N
F 74 4 16 E
F 78 18 16 1
I'd like to capitalize the words after the keys TITLE, ARTIST, LANGUAGE and EDITION,
so for the example txt:
#TITLE:**F**ire **A**nd **W**ater
#ARTIST:**S**ome **B**and
#CREATOR:yunho
#LANGUAGE:**K**orean
#EDITION:**U**AS
#MP3:2NE1 - Fire.mp3
#COVER:2NE1 - Fire.jpg
#VIDEO:2NE1 - Fire.avi
#VIDEOGAP:11.6
#BPM:595
#GAP:3860
F -4 4 16 I
F 2 4 16 go
F 8 6 16 by
F 16 4 16 the
F 22 6 16 name
F 30 4 16 of
F 36 10 16 C
F 46 10 16 L
F 58 6 16 of
F 66 5 16 2
F 71 3 16 N
F 74 4 16 E
F 78 18 16 1
Another thing is that I have loads of these txt files, all in their own directories. I want to run the command from the parent directory, recursively, for all *.txt files.
Example directories:
Library/Some Band/Some Band - Some Song/some txt file.txt
Library/Some Band2/Some Band2 - Some Song/sometxtfile.txt
Library/Some Band3/Some Band3 - Some Song/some3333 txt file.txt
I've tried to do so with find . -name '*.txt' -exec sed -i command {} +
but I got stuck on the search and replace with sed. Anyone care to help me out?
You can use this GNU sed command to uppercase the first letter of each word on the matching lines:
sed -E '/^#(TITLE|ARTIST|LANGUAGE|EDITION):/s/\b([a-z])/\u\1/g' file
#TITLE:Fire And Water
#ARTIST:Some Band
#CREATOR:yunho
#LANGUAGE:Korean
#EDITION:UAS
#MP3:2NE1 - Fire.mp3
#COVER:2NE1 - Fire.jpg
#VIDEO:2NE1 - Fire.avi
#VIDEOGAP:11.6
#BPM:595
#GAP:3860
F -4 4 16 I
F 2 4 16 go
F 8 6 16 by
F 16 4 16 the
F 22 6 16 name
F 30 4 16 of
F 36 10 16 C
F 46 10 16 L
F 58 6 16 of
F 66 5 16 2
F 71 3 16 N
F 74 4 16 E
F 78 18 16 1
For find + sed command use:
find . -name '*.txt' -exec \
sed -E -i '/^#(TITLE|ARTIST|LANGUAGE|EDITION):/s/\b([a-z])/\u\1/g' {} +
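Because -i rewrites every file in place, it may be worth trying the command on a scratch copy first and keeping backups. The sketch below assumes GNU sed (for \b, \u and the -i.bak suffix form); the directory layout and file name are taken from the question, the shortened file contents are illustrative.

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/Library/Some Band/Some Band - Some Song"
song="$workdir/Library/Some Band/Some Band - Some Song/some txt file.txt"
printf '%s\n' '#TITLE:fire and Water' '#ARTIST:Some band' \
    '#CREATOR:yunho' 'F 2 4 16 go' > "$song"

# -i.bak edits in place but leaves the original beside it as *.txt.bak.
find "$workdir" -name '*.txt' -exec \
    sed -E -i.bak '/^#(TITLE|ARTIST|LANGUAGE|EDITION):/s/\b([a-z])/\u\1/g' {} +

cat "$song"
# diff exits non-zero when the files differ, hence the || true.
diff "$song.bak" "$song" || true
```

Once the result looks right, the .bak files can be removed with find . -name '*.txt.bak' -delete.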
I am trying to extract data from two files with a common column but I am unable to fetch the required data.
File1
A B C D E F G
Dec 3 abc 10 2B 21 OK
Dec 1 %xyZ 09 3F 09 NOK
Dec 5 mnp 89 R5 11 OK
File2
H I
abc 10
xyz 00
pqr 45
I am able to get the output columns A B C D E F G, but I am unable to add column I between the D and E columns.
Trial 1:
awk 'FNR==1{next}
NR==FNR{a[$1]=$2; next}
{k=$3; sub(/^\%/,"",k)} k in a{print $1,$2,$3,$4,a[$2],$5,$6,$7; delete a[k]}
END{for(k in a) print k,a[k] > "unmatched"}' File2 File1 > matched
Required output:
matched:
A B C D I E F G
Dec 3 abc 10 10 2B 21 OK
Dec 1 %xyZ 09 00 3F 09 NOK
unmatched :
H I
pqr 45
Could you please help me get this output? Thank you.
Be careful that you have an upper case Z in file1. I put it to lower case in my test --- if it's not a typo it's another small detail to deal with.
$ awk 'FNR==1 {next}
NR==FNR {a[$1]=$2; next}
{k=$3; sub(/^\%/,"",k)}
k in a {print $1,$2,$3,$4,a[k],$5,$6,$7; delete a[k]}
END {for(k in a) print k,a[k] > "unmatched"}' File2 File1 > matched
$ cat matched
Dec 3 abc 10 10 2B 21 OK
Dec 1 %xyz 09 00 3F 09 NOK
$ cat unmatched
pqr 45
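If the upper case Z is not a typo, one way to handle it is to compare lower-cased keys on both sides while still printing the original field. This is only a sketch of that idea: the unmatched path passed in with -v is a choice made for this example, and keys written to unmatched come out lower-cased.

```shell
workdir=$(mktemp -d)
printf '%s\n' 'A B C D E F G' 'Dec 3 abc 10 2B 21 OK' \
    'Dec 1 %xyZ 09 3F 09 NOK' 'Dec 5 mnp 89 R5 11 OK' > "$workdir/File1"
printf '%s\n' 'H I' 'abc 10' 'xyz 00' 'pqr 45' > "$workdir/File2"

awk -v unmatched="$workdir/unmatched" '
FNR==1  {next}                              # skip the header of each file
NR==FNR {a[tolower($1)]=$2; next}           # File2: lower-cased key -> value
        {k=tolower($3); sub(/^%/,"",k)}     # File1: normalise the key the same way
k in a  {print $1,$2,$3,$4,a[k],$5,$6,$7; delete a[k]}
END     {for (k in a) print k, a[k] > unmatched}
' "$workdir/File2" "$workdir/File1" > "$workdir/matched"

cat "$workdir/matched"
```

The third column is printed as it appears in File1, so %xyZ keeps its original spelling in matched.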
File 2 with four columns:
$ cat a
A B C D E F G
Dec 3 abc 10 2B 21 OK
Dec 1 %xyz 09 3F 09 NOK
Dec 5 mnp 89 R5 11 OK
$ cat b
H I J K
abc 10 j1 k1
xyz 00 j2 k2
pqr 45 j3 k3
$ cat x.awk
FNR==1 {next}
NR==FNR {a[$1]=$0; next}
{k=$3; sub(/^\%/,"",k)}
k in a {
split(a[k], b)
print $1,$2,b[2],$3,b[3],b[4],$4,$5,$6,$7; delete a[k]
}
END {for(k in a) print a[k] > "unmatched"}
$ awk -f x.awk b a
Dec 3 10 abc j1 k1 10 2B 21 OK
Dec 1 00 %xyz j2 k2 09 3F 09 NOK
$ cat unmatched
pqr 45 j3 k3
I'm trying to use an AWK one-liner that takes a bash variable. The problem is that the variable is either printed more than once or causes an error. The bash variable looks like this (the echo line is only there to show its contents):
echo "$time"
49.80
63.4
61
60.4
61
The AWK line I'm trying to use is this:
awk -v time="$time" -F, '{print $1,$2,$3,$4,$5=time}' file
The output is this:
SantaClara 6/7/2015 D 4 49.80
63.4
61
60.4
61
SantaClara 5/29/2015 D 5 49.80
63.4
61
60.4
61
SantaClara 5/21/2015 D 5 49.80
63.4
61
60.4
61
SantaClara 4/29/2015 D 5 49.80
63.4
61
60.4
61
SantaClara 4/22/2015 D 5 49.80
63.4
61
60.4
61
And I'm looking for this:
SantaClara 6/7/2015 D 4 49.80
SantaClara 5/29/2015 D 5 63.40
SantaClara 5/21/2015 D 5 61
SantaClara 4/29/2015 D 5 60.4
SantaClara 4/22/2015 D 5 61
I've tried several variations on the AWK line and have only gotten errors. What am I doing wrong?
Use paste instead:
$ paste file <(echo "$time")
Use the -d switch if you want a specific delimiter (tab is the default).
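For instance, a sketch with a single space as the delimiter. It assumes the lines of file are already space-separated as in the output shown in the question, and it feeds the variable through stdin (the - operand) so it also works in shells without process substitution. The three sample lines and the combined variable are just for illustration.

```shell
time=$(printf '%s\n' 49.80 63.4 61)
workdir=$(mktemp -d)
printf '%s\n' 'SantaClara 6/7/2015 D 4' 'SantaClara 5/29/2015 D 5' \
    'SantaClara 5/21/2015 D 5' > "$workdir/file"

# "-" tells paste to take its second column from standard input.
combined=$(echo "$time" | paste -d' ' "$workdir/file" -)
echo "$combined"
```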
In case you want an awk only solution:
awk -F= 'FNR==NR{a[++i]=$0;next} {print $0, a[FNR]}' <(echo "$time") file
SantaClara 6/7/2015 D 4 49.80
SantaClara 5/29/2015 D 5 63.4
SantaClara 5/21/2015 D 5 61
SantaClara 4/29/2015 D 5 60.4
SantaClara 4/22/2015 D 5 61
I want to build a single file from two different files, keeping only the rows whose ID is in a given list of values.
For example:
File 1
ID G1 G2 G3 G4 G5
E1 1 5 0 Inf 1
E2 2 6 4 0 9
E3 4 5 7 8 10
E4 2 8 3 1 1
E5 6 7 8 0 9
E6 12 34 5 6 11
E7 15 7 18 29 34
E8 0 5 23 16 7
E3 3 32 4 18 12
..........
File 2
ID C1 C2
E1 A B
E2 C A
E3 B D
E4 A D
E3 C D
E5 B C
E6 D B
E7 C A
E8 B A
..........
If I have the list of values E1, E5, E7, E3; then
Output should be printed to a file:
ID G1 G2 G3 G4 G5 C1 C2
E1 1 5 0 Inf 1 A B
E5 6 7 8 0 9 B C
E7 15 7 18 29 34 C A
E3 4 5 7 8 10 B D
E3 3 32 4 18 12 C D
How can I extract this with a bash command?
If the input files are file1 and file2 and the output file is file3, then this should work. Note that it pairs the lines by position; it does not match on the ID or filter by the list:
len=$(wc -l < file1)
for i in $(seq 1 "$len"); do
  as=$(sed -n "${i}p" file1)
  sd=$(sed -n "${i}p" file2)
  echo "$as $sd" >> file3
done
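To match on the ID and keep only the listed values, awk is probably a better fit than a line-by-line loop. The sketch below rests on two assumptions: the list is passed as a space-separated string in a variable called ids (a name made up here), and repeated IDs such as E3 are paired in order of appearance in each file. The sample data is a trimmed copy of the question's files.

```shell
workdir=$(mktemp -d)
cat > "$workdir/file1" <<'EOF'
ID G1 G2 G3 G4 G5
E1 1 5 0 Inf 1
E2 2 6 4 0 9
E3 4 5 7 8 10
E5 6 7 8 0 9
E7 15 7 18 29 34
E3 3 32 4 18 12
EOF
cat > "$workdir/file2" <<'EOF'
ID C1 C2
E1 A B
E2 C A
E3 B D
E3 C D
E5 B C
E7 C A
EOF

awk -v ids="E1 E5 E7 E3" '
# file2 (read first): store the C columns for the n-th occurrence of each ID.
NR == FNR { c[$1, ++seen2[$1]] = $2 OFS $3; next }
# file1 header: print it with the header columns of file2 appended.
FNR == 1  { print $0, c["ID", 1]; next }
# file1 data: store the whole row for the n-th occurrence of each ID.
          { g[$1, ++seen1[$1]] = $0 }
END {
    n = split(ids, want, " ")
    for (i = 1; i <= n; i++)                      # keep the order of the list
        for (j = 1; j <= seen1[want[i]]; j++)     # every occurrence of that ID
            if ((want[i], j) in c)
                print g[want[i], j], c[want[i], j]
}' "$workdir/file2" "$workdir/file1" > "$workdir/file3"

cat "$workdir/file3"
```

Rows come out in the order of the list rather than the order of the input files, and an ID missing from either file is simply skipped.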