Add a specific string at the end of each line - bash

I have a mainfile with 4 columns, such as:
a b c d
e f g h
i j k l
In another file, I have one line of text for each line of the mainfile, which I want to add to the mainfile as a new column, such as:
a b c d x
e f g h y
i j k l z
Is this possible in bash? I can only add the same string to the end of each line.

There are two ways to do this:
1) paste file1 file2
2) Iterate over both files, combine them line by line, and write the result to a new file
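A minimal sketch of both approaches, assuming the two files are passed as arguments. The function names are made up for illustration. Note that paste joins with a tab unless -d is given, so -d ' ' is used here to get the space-separated result shown in the question.

```shell
# Approach 1: let paste glue the lines together, separated by a single space.
append_column_paste() {
    paste -d ' ' "$1" "$2"
}

# Approach 2: read both files in lockstep on separate file descriptors.
# The loop stops as soon as either file runs out of lines.
append_column_loop() {
    while IFS= read -r left <&3 && IFS= read -r right <&4; do
        printf '%s %s\n' "$left" "$right"
    done 3<"$1" 4<"$2"
}
```

For example, append_column_paste file1 file2 > newfile writes the combined lines to a new file.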

You could use GNU parallel for that:
fe-laptop-m:test fe$ cat first
a b c d
e f g h
i j k l
fe-laptop-m:test fe$ cat second
x
y
z
fe-laptop-m:test fe$ parallel echo ::::+ first second
a b c d x
e f g h y
i j k l z
Did I understand correctly what you are trying to achieve?

This might work for you (GNU sed):
sed -E 's#(^.*) .*#/^\1/s/$/ &/#' file2 | sed -f - file1
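Create a sed script from file2: each line of file2 becomes a command whose regexp (everything up to the last space of that line) is matched against the lines of file1, and when a line matches, the contents of that file2 line are appended to it. As written, the substitution only fires on file2 lines that contain a space.
N.B. This is independent of the order and length of file1.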

You can try using pr
pr -mts' ' file1 file2

Related

Combining 2 lines together but "interlaced"

I have 2 lines from an output as follow:
a b c
x y z
I would like to pipe both lines from the last command into a script that would combine them "interlaced", like this:
a x b y c z
The solution should work for a random number of columns from the output, such as:
a b c d e
x y z x y
Should result in:
a x b y c z d x e y
So far, I have tried using awk, perl, sed, etc... but without success. All I can do, is to put the output into one line, but it won't be "interlaced":
$ echo -e 'a b c\nx y z' | tr '\n' ' ' | sed 's/$/\n/'
a b c x y z
Keep fields of odd numbered records in an array, and update the fields of even numbered records using it. This will interlace each pair of successive lines in input.
prog | awk 'NR%2{split($0,a);next} {for(i in a)$i=(a[i] OFS $i)} 1'
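The one-liner above can be wrapped in a small function with each step commented. This is only a sketch: the function name interlace and the array name odd are made up for illustration, and the logic is the same as in the answer.

```shell
# interlace: read pairs of lines on stdin and weave their fields together.
interlace() {
    awk '
        NR % 2 { split($0, odd); next }          # odd record: remember its fields
        { for (i in odd) $i = odd[i] OFS $i }    # even record: prefix each field
        1                                        # print the rebuilt record
    '
}
```

Usage: printf 'a b c\nx y z\n' | interlace prints a x b y c z.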
Here's a 3 step solution:
$ # get one argument per line
$ printf 'a b c\nx y z' | xargs -n1
a
b
c
x
y
z
$ # split numbers of lines by 2 and combine them side by side
$ printf 'a b c\nx y z' | xargs -n1 | pr -2ts' '
a x
b y
c z
$ # combine all input lines into single line
$ printf 'a b c\nx y z' | xargs -n1 | pr -2ts' ' | paste -sd' '
a x b y c z
$ printf 'a b c d e\nx y z 1 2' | xargs -n1 | pr -2ts' ' | paste -sd' '
a x b y c z d 1 e 2
Could you please try the following? It joins every 2 lines in "interlaced" fashion.
awk '
FNR%2!=0 && FNR>1{                  # a new pair starts: flush the previous one
  for(j=1;j<=n;j++){
    printf("%s%s",a[j],j==n?ORS:OFS)
  }
  delete a                          # clear only after the whole pair is printed
}
{
  n=NF                              # remember the width of the current pair
  for(i=1;i<=NF;i++){
    a[i]=((i in a)?a[i] OFS:"")$i
  }
}
END{
  for(j=1;j<=n;j++){
    printf("%s%s",a[j],j==n?ORS:OFS)
  }
}' Input_file
Here is a simple awk script
script.awk
NR == 1 {split($0,inArr1)} # read fields from the 1st line into inArr1
NR == 2 {split($0,inArr2); # read fields from the 2nd line into inArr2
for (i = 1; i <= NF; i++) printf("%s%s%s%s", inArr1[i], OFS, inArr2[i], (i < NF ? OFS : ORS)); # output interlaced fields, ending the line after the last pair
}
input.txt
a b c d e
x y z x y
running:
awk -f script.awk input.txt
output:
a x b y c z d x e y
Multiline awk solution:
interlaced.awk
{
    a[NR] = $0
}
END {
    n = split(a[1], b)
    split(a[2], c)
    # loop by index so the fields come out in order
    for (i = 1; i <= n; i++) {
        printf "%s%s %s", (i==1 ? "" : OFS), b[i], c[i]
    }
    printf "%s", ORS
}
Run it like this:
foo_program | awk -f interlaced.awk
Perl will do the job. It was invented for this type of task.
echo -e 'a b c\nx y z' | \
perl -MList::MoreUtils=mesh -e \
'@f=mesh @{[split " ", <>]}, @{[split " ", <>]}; print "@f"'
 
a x b y c z
You can of course print out the meshed output any way you want.
Check out http://metacpan.org/pod/List::MoreUtils#mesh
You could even make it into a shell function for easy use:
function meshy {
perl -MList::MoreUtils=mesh -e \
'@f=mesh @{[split " ", <>]}, @{[split " ", <>]}; print "@f"'
}
$ echo -e 'X Y Z W\nx y z w' |meshy
X x Y y Z z W w
$
Ain't Perl grand?
This might work for you (GNU sed):
sed -E 'N;H;x;:a;s/\n(\S+\s+)(.*\n)(\S+\s+)/\1\3\n\2/;ta;s/\n//;s// /;h;z;x' file
Process two lines at a time. Append the two lines in the pattern space to the hold space, which introduces a newline in front of them. Using pattern matching and back references, nibble away at the front of each of the two lines and place the pairs at the front. Eventually the pattern matching fails; then remove the first newline and replace the second with a space. Copy the amended line to the hold space, empty the pattern space and swap the two, so that the hold space is clean for the next pair of lines (if any) and the result is printed.

Matching contents of one file with another and returning second column

So I have two txt files
file1.txt
s
j
z
z
e
and file2.txt
s h
f a
j e
k m
z l
d p
e o
What I want to do is match the first letter of file1 with the first letter of file2 and return the second column of file2. For example, the expected output would be:
h
e
l
l
o
I'm trying to use join file1.txt file2.txt, but that just prints out the entire second file. I'm not sure how to fix this. Thank you.
This is an awk classic:
$ awk 'NR==FNR{a[$1]=$2;next}{print a[$1]}' file2 file1
h
e
l
l
o
Explained:
$ awk '
NR==FNR { # processing the first argument, file2
a[$1]=$2 # hash records: first field is the key, second field is the value
next
} { # processing the second argument, file1
print a[$1] # output the value stored for this key
}' file2 file1
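A small variation, sketched under the assumption that some keys in file1 might be missing from file2: the one-liner above prints an empty line for such keys, while the version below skips them. The function name lookup_second_column and the array name value are hypothetical.

```shell
# lookup_second_column KEYS TABLE
#   KEYS:  one key per line
#   TABLE: two columns, key then value
lookup_second_column() {
    awk '
        NR == FNR { value[$1] = $2; next }    # first argument (TABLE): build the map
        ($1 in value) { print value[$1] }     # second argument (KEYS): print known keys only
    ' "$2" "$1"
}
```

Usage: lookup_second_column file1.txt file2.txt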

converting four columns to two using linux commands

I am wondering how one could merge four columns into two in the following manner (using the awk command, or other possible commands).
For example,
Old:
A B C D
E F G H
I J K L
M N O P
.
.
.
New:
A B
C D
E F
G H
I J
K L
M N
O P
.
.
Thanks so much!
That's actually quite easy with awk, as per the following transcript:
pax> cat inputFile
A B C D
E F G H
pax> awk '{printf "%s %s\n%s %s\n", $1, $2, $3, $4}' <inputFile
A B
C D
E F
G H
How about using xargs here? Could you please try the following.
xargs -n 2 < Input_file
Output will be as follows.
A B
C D
E F
G H
I J
K L
M N
O P
with GNU sed
$ sed 's/ /\n/2' file
This replaces the 2nd space on each line with a newline.
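If GNU sed is not available, or the input has more than four columns, the same reshaping can be sketched in awk by walking the fields two at a time. This is an alternative to the answers above, not a replacement for them; the function name two_per_line is made up for illustration.

```shell
# two_per_line: read whitespace-separated columns on stdin, print them two per line.
two_per_line() {
    awk '{
        for (i = 1; i <= NF; i += 2) {
            if (i < NF) print $i, $(i + 1)   # a full pair
            else        print $i             # odd field count: last field stands alone
        }
    }'
}
```

Usage: two_per_line < inputFile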

sed to insert a text line after first match only & remove n lines after second match using sed only

For the first question, for example:
A
B
C
B
D
I need to insert E after the FIRST MATCH of B:
A
B
E
C
B
D
For the second question, for example:
A
B
C
B
D
E
F
I need to remove only D and E, the 2 lines after the second pattern match:
A
B
C
B
F
This might work for you (GNU sed):
sed -e '/B/!b;x;s/^/x/;/^x\{1\}$/{x;aE' -e 'b};x' file
sed -e '/B/!b;x;s/^/x/;/^x\{2\}$/{x;n;N;d};x' file
Both these solutions can be split into three parts:
1) Focus on a particular regexp
2) Counting
3) Conditional on the above
If the regexp is not true, continue as normal.
If the regexp is true, count it by appending a character (x) to the hold space for each occurrence.
Conditional on the count (1 in the first solution, 2 in the second), carry out an action.
In the first solution: append a line containing E.
In the second solution:
1) print the current line and read the next one
2) append the following line
3) delete the pattern space
If the conditional is not true, continue as normal.
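The question asks for sed only, but the same "match, count, act on the count" idea is perhaps easier to see when written in awk, where the counter is an ordinary variable instead of a run of x characters in the hold space. The following is a sketch for comparison, not the sed solution itself; the function and variable names are made up for illustration.

```shell
# Q1: print E after the first line matching B.
insert_after_first() {
    awk '
        { print }
        /B/ && ++count == 1 { print "E" }    # act only on the first match
    '
}

# Q2: drop the two lines that follow the second line matching B.
drop_after_second() {
    awk '
        skip > 0 { skip--; next }            # still inside the window to drop
        { print }
        /B/ && ++count == 2 { skip = 2 }     # second match: drop the next 2 lines
    '
}
```

Usage: insert_after_first < file and drop_after_second < file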
N.B. the first solution can be shortened using ranges:
sed '0,/B/!b;//aE' file
or, for versions of sed that do not support the GNU extension 0,address:
sed -e '/B/{aE' -e ':a;n;ba}' file
Implementation done with
sed --version
sed (GNU sed) 4.2.2
Q1:
$ more input
A
B
C
B
D
B
D
sed:
$ sed -n -e '/^B$/!{p};/^B$/{p;x;/^$/{a E' -e '}}' input
A
B
E
C
B
D
B
D
Q2:
$ more input2
A
B
C
B
D
E
F
sed:
$ sed -n '/^B$/{p;H};/^B$/!{x;/B\nB1\?$/{s/.*/&1/;x;b;};x;p}' input2
A
B
C
B
F

Merging two outputs in shell script

I have the output of 2 commands, like:
Output of the first command:
A B
C D
E F
G H
Output of the second command:
I J
K L
M B
I want to merge both outputs, and if a value in the second column is the same in both outputs, I'll take the entry from the 1st output.
So , my output should be
A B
C D
E F
G H
I J
K L
// not taking (M B) since B is already there in the first output's entry (A B), so preference is given to the first output
Can I do this using a shell script? Is there any command for it?
You can use awk:
awk 'FNR==NR{a[$2];print;next} !($2 in a)' file1 file2
A B
C D
E F
G H
I J
K L
If the order of entries is not important, you can sort on the 2nd column and keep only unique entries:
sort -u -k2 file1 file2
Both -u and -k are specified in the POSIX standard.
This wouldn't work if there are repeated entries in the 2nd column of file1, since those would be collapsed as well.
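The awk answer can be wrapped in a function that takes two file names, as in the hedged sketch below. The function name merge_prefer_first and the array name seen are hypothetical. Unlike the sort variant, it keeps every line of the first input, including repeated second-column values, and preserves the input order.

```shell
# merge_prefer_first FIRST SECOND
#   Print all of FIRST, then only those lines of SECOND whose
#   2nd column does not already occur in FIRST.
merge_prefer_first() {
    awk '
        FNR == NR { seen[$2]; print; next }   # first file: record column 2, print every line
        !($2 in seen)                         # second file: print only unseen column-2 values
    ' "$1" "$2"
}
```

Since the question starts from the output of two commands rather than two files, bash process substitution can feed them in directly: merge_prefer_first <(first_cmd) <(second_cmd), where first_cmd and second_cmd stand for the two commands.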

Resources