Combining 2 lines together but "interlaced" - bash

I have 2 lines from an output as follows:
a b c
x y z
I would like to pipe both lines from the last command into a script that would combine them "interlaced", like this:
a x b y c z
The solution should work for a random number of columns from the output, such as:
a b c d e
x y z x y
Should result in:
a x b y c z d x e y
So far I have tried awk, perl, sed, etc., without success. All I can manage is to put the output on one line, but it is not "interlaced":
$ echo -e 'a b c\nx y z' | tr '\n' ' ' | sed 's/$/\n/'
a b c x y z

Keep fields of odd numbered records in an array, and update the fields of even numbered records using it. This will interlace each pair of successive lines in input.
prog | awk 'NR%2{split($0,a);next} {for(i in a)$i=(a[i] OFS $i)} 1'
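As a quick check of the one-liner above, here is a minimal sketch that wraps it in a shell function and feeds it the longer sample from the question. The function name interlace_pairs is hypothetical, chosen only for illustration.

```shell
# interlace_pairs is a hypothetical wrapper name; the awk program is the one-liner above.
interlace_pairs() {
  awk 'NR%2{split($0,a);next} {for(i in a)$i=(a[i] OFS $i)} 1'
}

# Sample input from the question (two lines of five columns each).
printf 'a b c d e\nx y z x y\n' | interlace_pairs
```

Because split() clears the array before refilling it, each odd-numbered line starts from a clean slate, so the same function can be fed several pairs of lines in a row.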

Here's a 3 step solution:
$ # get one argument per line
$ printf 'a b c\nx y z' | xargs -n1
a
b
c
x
y
z
$ # split numbers of lines by 2 and combine them side by side
$ printf 'a b c\nx y z' | xargs -n1 | pr -2ts' '
a x
b y
c z
$ # combine all input lines into single line
$ printf 'a b c\nx y z' | xargs -n1 | pr -2ts' ' | paste -sd' '
a x b y c z
$ printf 'a b c d e\nx y z 1 2' | xargs -n1 | pr -2ts' ' | paste -sd' '
a x b y c z d 1 e 2

Could you please try the following? It joins every 2 lines in "interlaced" fashion.
awk '
FNR%2!=0 && FNR>1{
  for(j=1;j<=n;j++){
    printf("%s%s",a[j],j==n?ORS:OFS)
  }
  delete a   # reset only after the whole pair has been printed
}
{
  for(i=1;i<=NF;i++){
    a[i]=((i in a)?a[i] OFS:"")$i
  }
  n=NF       # remember the field count of the pair being collected
}
END{
  for(j=1;j<=n;j++){
    printf("%s%s",a[j],j==n?ORS:OFS)
  }
}' Input_file

Here is a simple awk script
script.awk
NR == 1 {split($0,inArr1)} # read fields from 1st line into inArr1
NR == 2 {split($0,inArr2); # read fields from 2nd line into inArr2
for (i = 1; i <= NF; i++) printf("%s%s%s%s", inArr1[i], OFS, inArr2[i], (i < NF ? OFS : ORS)); # output interlaced fields; ORS after the last pair terminates the line
}
input.txt
a b c d e
x y z x y
running:
awk -f script.awk input.txt
output:
a x b y c z d x e y

Multiline awk solution:
interlaced.awk
{
a[NR] = $0
}
END {
n = split(a[1], b)
split(a[2], c)
for (i = 1; i <= n; i++) {
printf "%s%s%s%s", (i == 1 ? "" : OFS), b[i], OFS, c[i]
}
print ""
}
Run it like this:
foo_program | awk -f interlaced.awk

Perl will do the job. It was invented for this type of task.
echo -e 'a b c\nx y z' | \
perl -MList::MoreUtils=mesh -e \
'@f=mesh @{[split " ", <>]}, @{[split " ", <>]}; print "@f"'
 
a x b y c z
You can of course print out the meshed output any way you want.
Check out http://metacpan.org/pod/List::MoreUtils#mesh
You could even make it into a shell function for easy use:
function meshy {
perl -MList::MoreUtils=mesh -e \
'@f=mesh @{[split " ", <>]}, @{[split " ", <>]}; print "@f"'
}
$ echo -e 'X Y Z W\nx y z w' |meshy
X x Y y Z z W w
$
Ain't Perl grand?

This might work for you (GNU sed):
sed -E 'N;H;x;:a;s/\n(\S+\s+)(.*\n)(\S+\s+)/\1\3\n\2/;ta;s/\n//;s// /;h;z;x' file
Process two lines at a time. Append the two lines in the pattern space to the hold space, which introduces a newline in front of them. Using pattern matching and back references, nibble away at the front of each of the two lines and move each pair to the front. When the pattern finally fails to match, remove the first newline and replace the second with a space. Copy the amended line to the hold space, clean up the pattern space ready for the next pair of lines (if any), and print.
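The same "two lines at a time" idea can also be sketched with bash arrays instead of sed. This is a different technique from the answer above, shown only for comparison. The function name interlace_lines is hypothetical, and the sketch assumes that both lines of a pair have the same number of whitespace-separated fields and that the input ends with a newline.

```shell
# interlace_lines is a hypothetical helper: it reads stdin two lines at a time
# and prints each pair with its whitespace-separated fields interleaved.
interlace_lines() {
  local -a first second out
  local i
  while read -ra first && read -ra second; do
    out=()
    for i in "${!first[@]}"; do
      # Take field i of the first line, then field i of the second line.
      out+=("${first[i]}" "${second[i]}")
    done
    printf '%s\n' "${out[*]}"
  done
}

printf 'a b c\nx y z\n' | interlace_lines
```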

Related

Bash - Sort a list of strings

Would you please show me how I can sort the following list (ascending order, A to Z), or a list in general, with Bash?
I have been trying but still could not get the expected results:
my_list='a z t b e c'
The result should be a list as well, as I will use it in a select loop.
my_list='a b c e t z'
Thanks for your help!
You can use xargs twice along with a built in sort command to accomplish this.
$ my_list='a z t b e c'
$ my_list=$(echo $my_list | xargs -n1 | sort | xargs)
$ echo $my_list
a b c e t z
If you permit using the sort program (rather than program a sorting algorithm in bash) the answer could be like this:
my_list='a z t b e c'
echo "$my_list" | tr ' ' '\n' | sort | tr '\n' ' '
The result: a b c e t z
Arrays are more suitable to store a list of things:
list=(a z t b "item with spaces" c)
sorted=()
while IFS= read -rd '' item; do
sorted+=("$item")
done < <(printf '%s\0' "${list[@]}" | sort -z)
With bash 4.4 you can utilize readarray -d:
list=(a z t b "item with spaces" c)
readarray -td '' sorted < <(printf '%s\0' "${list[@]}" | sort -z)
To use the array to create a simple menu with select:
select item in "${sorted[@]}"; do
# do something
done
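Putting the sorted array and the select loop together, a minimal sketch might look like the following. The function name choose_sorted is hypothetical; the sketch assumes bash 4.4 or later (for readarray -d) and a sort that supports -z, as in the snippets above.

```shell
# choose_sorted is a hypothetical helper: it sorts its arguments and
# offers them in a select menu, printing the chosen item on stdout.
choose_sorted() {
  local -a sorted
  local item
  # NUL-delimited so that items containing spaces survive the sort.
  readarray -td '' sorted < <(printf '%s\0' "$@" | sort -z)
  select item in "${sorted[@]}"; do
    if [[ -n $item ]]; then   # item is empty when the reply is not a valid menu number
      printf '%s\n' "$item"
      break
    fi
  done
}
```

Called as choose_sorted a z t b "item with spaces" c, it shows the numbered menu on standard error and reads the reply from standard input, so the selection can also be captured with command substitution.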
Using GNU awk and controlling array traversal order with PROCINFO["sorted_in"]:
$ echo -n $my_list |
awk 'BEGIN {
RS=ORS=" " # space as record separator
PROCINFO["sorted_in"]="@val_str_asc" # array value used as order
}
{
a[NR]=$0 # hash on NR to a
}
END {
for(i in a) # in given order
print a[i] # output values in a
print "\n" # happy ending
}'
a b c e t z
You can do this
my_list=($(sort < <(echo 'a z t b e c'|tr ' ' '\n') | tr '\n' ' ' | sed 's/ $//\
'))
This will create my_list which is an array.

Unix awk command to return all matching lines

I have a file which looks like the below -
A
B
C
D
E
-----
A
B
C
D
C
---
X
Y
A
B
XEC
---
When the fifth row of each block is/contains E, I want the previous 4 lines to be returned. I wrote the below command but it is buggy
awk '{a[NR]=$0} $0~s {f=NR} END {print a[f-4]; print a[f-6]; print a[f-8];}' s="E" file.txt
But it is returning only the last match. I want all the matched lines to be returned.
For the above entries, the output needs to be
A
B
C
D
---
X
Y
A
B
Is there any other way to achieve this?
Using gawk (a multi-character RS is a GNU awk extension and is not portable to every awk):
awk -v RS='\n\n[-]+\n\n*' -v FS="\n" '$5 ~ /E/{printf "%s\n%s\n%s\n%s\n---\n",$1,$2,$3,$4}' inputfile
A
B
C
D
---
X
Y
A
B
---
I am not sure exactly what output you want: do you really need the --- separator followed by a newline?
Using tac and awk, you can try the following.
Print the N records after some regexp:
awk -v n=4 'c&&c--;/regexp/{c=n}' <input_file>
Print the N records before some regexp:
tac <input_file> | awk -v n=4 'c&&c--;/regexp/{c=n}' | tac
In the pipeline above, the first tac reverses the file, n is the number of lines to print once the regexp is found, and the final tac reverses the result again.
Input
$ cat infile
A
B
C
D
E
-----
A
B
C
D
C
---
X
Y
A
B
XEC
---
When n=4
$ tac infile | awk -v n=4 'c&&c--;/E/{c=n}' | tac
A
B
C
D
X
Y
A
B
When n=2
$ tac infile | awk -v n=2 'c&&c--;/E/{c=n}' | tac
C
D
A
B
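The tac-based pipeline reverses the input twice. As an alternative, the same "N lines before a match" result can be sketched in a single awk pass that keeps a sliding buffer of the last n lines. This is a different technique from the answer above, and the function name lines_before is hypothetical.

```shell
# lines_before N REGEXP: hypothetical helper that prints the N lines
# preceding every line of stdin that matches REGEXP.
lines_before() {
  awk -v n="$1" -v re="$2" '
    # On a match, print whatever is buffered from the previous n lines.
    $0 ~ re { for (i = NR - n; i < NR; i++) if (i in buf) print buf[i] }
    # Remember the current line and forget the one that fell out of the window.
    { buf[NR] = $0; delete buf[NR - n] }
  '
}
```

With the sample file, lines_before 4 E < infile prints the same eight lines as the n=4 run shown above. Only n lines are held in memory at any time.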

Extract lines having same second column but different third column

I have a file having strings in 3 columns as below.
a b x
a b y
a b z
a c x
a d y
I want to extract all the lines having same second column but different third column. The output I am expecting for the above example is
a b x
a b y
a b z
I tried uniq -f2 and sort -u -k2, but neither works as I expect. Any suggestions, please?
awk '
seen[$2]++ {
if (!seen[$2,$3]++) {
printf "%s%s\n", first[$2], $0
}
delete first[$2]
next
}
{ first[$2] = $0 ORS }
' file
a b x
a b y
a b z
Note that the above will work in any awk, for any values in your input file, does not retain the whole of the input file in memory, doesn't rely on any external tools for pre/post processing, and will produce the output lines in exactly the same order they appeared in the input.
awk to the rescue!
Need to make sure all records are unique first
$ sort file | uniq |
awk '{c[$2]++; a[$2]=a[$2]?a[$2]RS$0:$0}
END{for(k in a) if(c[k]>1) print a[k]}'
a b x
a b y
a b z
Explanation: keep the counter of second field occurrences and aggregate the records. At the end print the records for which the counter is greater than one.
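The counting idea can also be sketched without the sort | uniq step by reading the file twice: the first pass counts distinct third-column values per second column, and the second pass prints the lines whose key has more than one. This is an illustrative variant rather than either answer's method; the function name distinct_third is hypothetical, and it needs a regular file (not a pipe) because the file is read twice.

```shell
# distinct_third FILE: hypothetical helper that prints the lines of FILE whose
# second column occurs with more than one distinct third column.
distinct_third() {
  awk '
    # First pass: count each (col2, col3) combination once per col2.
    NR == FNR { if (!seen[$2, $3]++) count[$2]++; next }
    # Second pass: keep lines whose col2 has several distinct col3 values.
    count[$2] > 1
  ' "$1" "$1"
}
```

Output order follows the input file, and exact duplicate lines are printed as often as they occur.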

How to repeat lines in bash and paste with different columns?

Is there a short way in bash to repeat the first line of a file as often as needed to paste it with another file, in the manner of a Kronecker product (for the mathematicians among you)?
What I mean is, I have a file A:
a
b
c
and a file B:
x
y
z
and I want to merge them as follows:
a x
a y
a z
b x
b y
b z
c x
c y
c z
I could probably write a script, read the files line by line and loop over them, but I am wondering if there is a short one-line command that could do the same job. I can't think of one and, as you can see, I am also lacking some keywords to search for. :-D
Thanks in advance.
You can use this one-liner awk command:
awk 'FNR==NR{a[++n]=$0; next} {for(i=1; i<=n; i++) print $0, a[i]}' file2 file1
a x
a y
a z
b x
b y
b z
c x
c y
c z
Breakdown:
NR == FNR { # While processing the first file in the list
a[++n]=$0 # store the row in array 'a' under an incrementing index
next # move to next record
}
{ # while processing the second file
for(i=1; i<=n; i++) # iterate over the array a
print $0, a[i] # print current row and array element
}
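To try the breakdown above end to end, here is a minimal sketch that wraps the one-liner in a function taking the two files in their natural order. The function name cross_product is hypothetical.

```shell
# cross_product FILE_A FILE_B: hypothetical wrapper around the one-liner above.
# FILE_B is passed to awk first so that its rows are buffered in the array,
# then every row of FILE_A is paired with each buffered row.
cross_product() {
  awk 'FNR==NR{a[++n]=$0; next} {for(i=1; i<=n; i++) print $0, a[i]}' "$2" "$1"
}
```

For the question's files this would be called as cross_product A B, which should give the nine lines listed above.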
alternative to awk
join <(sed 's/^/_\t/' file1) <(sed 's/^/_\t/' file2) | cut -d' ' -f2-
add a fake key for join to have all records of file1 to match all records of file2, trim afterwards

complex line copying&modifying on-the-fly with grep or sed

Is there a way to do the following with either grep or sed: read each line of a file, copy it twice, and modify each copy?
Original line:
X Y Z
A B C
New lines:
Y M X
Y M Z
B M A
B M C
where X, Y, Z, M are all integers, and M is a fixed integer (i.e. 2) we inject while copying! I suppose a solution (if any) will be so complex that people (including me) will start bleeding after seeing it!
$ awk -v M=2 '{print $2,M,$1; print $2,M,$3;}' file
Y 2 X
Y 2 Z
B 2 A
B 2 C
How it works
-v M=2
This defines the variable M to have value 2.
print $2,M,$1
This prints the second column, followed by M, followed by the first column.
print $2,M,$3
This prints the second column, followed by M, followed by the third column.
Extended Version
Suppose that we want to handle an arbitrary number of columns in which we print all columns between first and last, followed by M, followed by the first, and then print all columns between first and last, followed by M, followed by the last. In this case, use:
awk -v M=2 '{for (i=2;i<NF;i++)printf "%s ",$i; print M,$1; for (i=2;i<NF;i++)printf "%s ",$i; print M,$NF;}' file
As an example, consider this input file:
$ cat file2
X Y1 Y2 Z
A B1 B2 C
The above produces:
$ awk -v M=2 '{for (i=2;i<NF;i++)printf "%s ",$i; print M,$1; for (i=2;i<NF;i++)printf "%s ",$i; print M,$NF;}' file2
Y1 Y2 2 X
Y1 Y2 2 Z
B1 B2 2 A
B1 B2 2 C
The key change to the code is the addition of the following command:
for (i=2;i<NF;i++)printf "%s "
This command prints all columns from i=2, the column after the first, up to i=NF-1, the column before the last. The code is otherwise similar.
Sure; with M written out as the fixed value 2, you can write (GNU sed, which accepts \n in the replacement):
sed 's/\(.*\) \(.*\) \(.*\)/\2 2 \1\n\2 2 \3/' file
With bash builtin commands:
m=2; while read a b c; do echo "$b $m $a"; echo "$b $m $c"; done < file
Output:
Y 2 X
Y 2 Z
B 2 A
B 2 C

Resources