Print each two column together from a matrix

I have a matrix:
$ cat ifile.txt
2 3 4 5 10 0 2 2 0 1 0 0 0 1
0 3 4 6 2 0 2 0 0 0 0 1 2 3
0 0 0 2 3 0 3 0 3 1 2 3 1 0
It has 14 columns in total, i.e. A1 B1 A2 B2 A3 B3 A4 B4 A5 B5 A6 B6 A7 B7. The odd-numbered columns correspond to A and the even-numbered columns correspond to B.
I would like to print all A values in one column and all B values in another. So my desired file looks like:
$ cat ofile.txt
2 3
0 3
0 0
4 5
4 6
0 2
10 0
2 0
3 0
2 0
0 0
0 3
....
I can do this manually in the following way, but I am looking for an easier way to do it.
for c in 1 3 5 7 9 11 13; do
    awk -v c="$c" '{printf "%5s %5s\n", $c, $(c+1)}' ifile.txt > "A$c.txt"
done
cat A1.txt A3.txt A5.txt A7.txt A9.txt A11.txt A13.txt > ofile.txt

$ cat tst.awk
{
    for ( i=1; i<=NF; i++ ) {
        a[NR,i] = $i
    }
}
END {
    for ( i=1; i<=NF; i+=2 ) {
        for ( j=1; j<=NR; j++ ) {
            print a[j,i], a[j,i+1]
        }
    }
}

$ awk -f tst.awk file
2 3
0 3
0 0
4 5
4 6
0 2
10 0
2 0
3 0
2 2
2 0
3 0
0 1
0 0
3 1
0 0
0 1
2 3
0 1
2 3
1 0
If you want to generalize for more than 2 output columns:
$ cat tst.awk
BEGIN { n=(n ? n : 2) }
{
    for ( i=1; i<=NF; i++ ) {
        a[NR,i] = $i
    }
}
END {
    for ( i=1; i<=NF; i+=n ) {
        for ( j=1; j<=NR; j++ ) {
            for ( k=1; k<=n; k++ ) {
                printf "%s%s", a[j,i+k-1], (k<n ? OFS : ORS)
            }
        }
    }
}

$ awk -v n=2 -f tst.awk file
2 3
0 3
0 0
4 5
4 6
0 2
10 0
2 0
3 0
2 2
2 0
3 0
0 1
0 0
3 1
0 0
0 1
2 3
0 1
2 3
1 0
$ awk -v n=7 -f tst.awk file
2 3 4 5 10 0 2
0 3 4 6 2 0 2
0 0 0 2 3 0 3
2 0 1 0 0 0 1
0 0 0 0 1 2 3
0 3 1 2 3 1 0
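When the number of input columns is not a multiple of n, the script above prints empty fields for the columns that are missing from the last group. The following is a minimal sketch of one way to guard against that, not part of the original answer: the function name regroup is hypothetical, and it assumes whitespace-separated input small enough to hold in memory.

```shell
# regroup N [FILE...]: print columns N at a time, one group after another.
# Hypothetical helper; a short last group is printed without padding.
regroup() {
    n=$1
    shift
    awk -v n="$n" '
    {
        for (i = 1; i <= NF; i++) a[NR, i] = $i
        if (NF > nf) nf = NF          # remember the widest row
    }
    END {
        for (i = 1; i <= nf; i += n) {
            for (j = 1; j <= NR; j++) {
                line = ""
                # stop at the last real column instead of padding
                for (k = i; k < i + n && k <= nf; k++)
                    line = line (k > i ? OFS : "") a[j, k]
                print line
            }
        }
    }' "$@"
}
```

It could be called as regroup 2 ifile.txt, or at the end of a pipeline with no file argument.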

Related

numeric vs alphanumeric sort on ubuntu 18.04.2

I am getting some strange behavior from the sort utility on Ubuntu 18.04.2. Here is the sequence of commands I issued. How can I ensure a numeric sort on all the columns? Columns 1, 2, 3 and 4 should each be in order.
$ cat zz
0 0 0 0
0 1 0 0
1 0 0 0
1 1 0 0
1 1 1 0
1 1 1 1
2 2 2 2
10 10 10 10
1 1 10 1
1 1 100 1
$ cat zz | sort
0 0 0 0
0 1 0 0
1 0 0 0
10 10 10 10
1 1 0 0
1 1 1 0
1 1 100 1
1 1 10 1
1 1 1 1
2 2 2 2
$ cat zz | sort -n
0 0 0 0
0 1 0 0
1 0 0 0
1 1 0 0
1 1 1 0
1 1 100 1
1 1 10 1
1 1 1 1
2 2 2 2
10 10 10 10
$ cat zz | sort -n -k1,3
0 0 0 0
0 1 0 0
1 0 0 0
1 1 0 0
1 1 1 0
1 1 100 1
1 1 10 1
1 1 1 1
2 2 2 2
10 10 10 10
Desired output (with numeric sorting):
0 0 0 0
0 1 0 0
1 0 0 0
1 1 0 0
1 1 1 0
1 1 1 1
1 1 10 1
1 1 100 1
2 2 2 2
10 10 10 10
What options should I use in sort to get my desired output, i.e. sorted in numeric order?
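One possible approach, offered as a sketch rather than a definitive answer, is to give sort one numeric key per column. With -n -k1,3 the key is the single string spanning fields 1 to 3, and -n only reads the leading number of that string, so ties are left to the locale-dependent whole-line comparison. The function name numeric_sort4 is hypothetical and assumes exactly four whitespace-separated columns.

```shell
# Sort rows numerically on column 1, then 2, then 3, then 4.
# Each -kN,Nn limits the key to field N alone and compares it as a number.
numeric_sort4() {
    sort -k1,1n -k2,2n -k3,3n -k4,4n "$@"
}
```

For the file in the question this would be run as numeric_sort4 zz.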

Renaming file based on a value in a tsv file

My input is a tsv file with 5 columns. It has the column names 'Position', 'A', 'B' and so on, which repeat every now and then in the tsv. How can I split this tsv file so that each output file has one set of column headers and the data underneath, but not the next set of column headers?
Input:
Position A B C D Seg2
1 9 0 0 0 0
2 0 0 16 0 0
3 0 19 0 0 0
4 0 0 18 0 0
Position A B C D Seg1
1 9 0 0 0 1
2 0 0 22 0 0
3 0 19 0 0 0
4 0 0 19 0 0
5 39 0 0 0 0
6 43 0 0 0 0
The ideal output would be the above split into two tsv files, one named Seg1.tsv and the other Seg2.tsv.
What I have:
awk '/Position/{x="F"++i;}{print > x;}' file.tsv
How can I modify the above to rename the files?

You should just derive the filename from the last column:
awk '/Position/{x=$6".tsv"}{print > x;}' file.tsv
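If the input has many header blocks, some awk implementations can run out of open file descriptors. A minimal sketch of a variant that closes each output file before opening the next is shown here. The function name split_by_header is hypothetical, and the sketch assumes that every segment name appears only once (a repeated name would overwrite the earlier file) and that the name is the last whitespace-separated field of the header line.

```shell
# Split a file into <segment>.tsv files, one per "Position ..." header block.
split_by_header() {
    awk '
    /^Position/ {
        if (out != "") close(out)   # release the previous output file
        out = $NF ".tsv"            # name the new file after the last header field
    }
    out != "" { print > out }       # lines before the first header are skipped
    ' "$@"
}
```

It could be run as split_by_header file.tsv in the directory where the output files should be written.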

how to record properties of other variables in stata

I have to generate variables entry_1, entry_2 and entry_3 which will adopt the value 1 if id_i for that particular month had entry=1.
Example.
id month entry entry_1 entry_2 entry_3
1 1 1 1 0 0
1 2 0 0 0 0
1 3 0 0 1 1
1 4 0 0 0 0
2 1 0 1 0 0
2 2 0 0 0 0
2 3 1 0 1 1
2 4 0 0 0 0
3 1 0 1 0 0
3 2 0 0 0 0
3 3 1 0 1 1
3 4 0 0 0 0
Would anyone be so kind as to propose how to implement a loop to do this?
I am thinking of something like this:
forvalues i=1(1)3 {
gen entry`i'=0
replace entry`i'=1 if on that particular month id=`i' had entry=1
}

You could do something like this (although your data don't quite look right for the question you're asking):
forvalues i = 1/3 {
gen entry_`i' = id == `i' & entry == 1
}
This generates a dummy variable entry_i for each i in the forvalues loop where entry_i = 1 if id is i and entry is 1, and 0 otherwise.

The code can be simplified down to at most one loop.
clear
input id month entry entry_1 entry_2 entry_3
1 1 1 1 0 0
1 2 0 0 0 0
1 3 0 0 1 1
1 4 0 0 0 0
2 1 0 1 0 0
2 2 0 0 0 0
2 3 1 0 1 1
2 4 0 0 0 0
3 1 0 1 0 0
3 2 0 0 0 0
3 3 1 0 1 1
3 4 0 0 0 0
end
forval j = 1/4 {
egen entry`j' = total(entry & id == `j'), by(month)
}
list id month entry entry? , sepby(id)
     +--------------------------------------------------------+
     | id   month   entry   entry1   entry2   entry3   entry4 |
     |--------------------------------------------------------|
  1. |  1       1       1        1        0        0        0 |
  2. |  1       2       0        0        0        0        0 |
  3. |  1       3       0        0        1        1        0 |
  4. |  1       4       0        0        0        0        0 |
     |--------------------------------------------------------|
  5. |  2       1       0        1        0        0        0 |
  6. |  2       2       0        0        0        0        0 |
  7. |  2       3       1        0        1        1        0 |
  8. |  2       4       0        0        0        0        0 |
     |--------------------------------------------------------|
  9. |  3       1       0        1        0        0        0 |
 10. |  3       2       0        0        0        0        0 |
 11. |  3       3       1        0        1        1        0 |
 12. |  3       4       0        0        0        0        0 |
     +--------------------------------------------------------+

Bash: Pipe output into a table

I have a program that prints out the following:
bash-3.2$ ./drawgrid
0
1 1 0
1 1 0
0 0 0
1
0 1 1
0 1 1
0 0 0
2
0 0 0
1 1 0
1 1 0
3
0 0 0
0 1 1
0 1 1
Is it possible to pipe the output of this command so that all the 3x3 matrices (together with their numbers) are displayed in a table, for example a 2x2 table like this?
0 1
1 1 0 0 1 1
1 1 0 0 1 1
0 0 0 0 0 0
2 3
0 0 0 0 0 0
1 1 0 0 1 1
1 1 0 0 1 1
I tried searching, and came across the column command, but I did not figure it out.
Thank you

You can use pr -2T to get the following output, which is close to what you expected:
0 2
1 1 0 0 0 0
1 1 0 1 1 0
0 0 0 1 1 0
1 3
0 1 1 0 0 0
0 1 1 0 1 1
0 0 0 0 1 1

You could use an awk script:
NF == 1 {
    if ($NF % 2 == 0) {
        delete line
        line[1]=$1
        f=1
    } else {
        print line[1]"\t"$1
        f=0
    }
    n=1
}
NF > 1 {
    n++
    if (f)
        line[n]=$0
    else
        print line[n]"\t"$0
}
And pipe to it like so:
$ ./drawgrid | awk -f 2x2.awk
0 1
1 1 0 0 1 1
1 1 0 0 1 1
0 0 0 0 0 0
2 3
0 0 0 0 0 0
1 1 0 0 1 1
1 1 0 0 1 1

You can get exactly what you expect with a short bash script and a little thought about array indexing:
#!/bin/bash
declare -a idx
declare -a acont
declare -i cnt=0
declare -i offset=0
while IFS=$'\n'; read -r line ; do
    [ ${#line} -eq 1 ] && { idx+=( $line ); ((cnt++)); }
    [ ${#line} -gt 1 ] && { acont+=( $line );((cnt++)); }
done
for ((i = 0; i < ${#idx[@]}; i+=2)); do
    printf "%4s%8s\n" ${idx[i]} ${idx[i+1]}
    for ((j = offset; j < offset + 3; j++)); do
        printf " %8s%8s\n" ${acont[j]} ${acont[j+3]}
    done
    offset=$((j + 3))
done
exit 0
Output
$ bash array_cols.sh <dat/cols.txt
0 1
1 1 0 0 1 1
1 1 0 0 1 1
0 0 0 0 0 0
2 3
0 0 0 0 0 0
1 1 0 0 1 1
1 1 0 0 1 1

Sort each column independently

cat sanger.* | tr '\-ACGT' '01234' | sed -e 's/\([[:digit:]]\)/\1 /g'
1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 1 0 0 1 1 1 1 0
0 2 2 0 0 0 0 2 2 2 2 0 2 0 0 0 0 0 2 2 2 0 2 0 0 0 0 0 0 0 2
0 0 0 0 0 0 3 0 0 0 0 3 0 0 3 0 0 3 0 0 0 0 0 0 3 0 0 0 0 0 0
0 0 0 4 4 0 0 0 0 0 0 0 0 4 0 4 0 0 0 0 0 0 0 0 0 4 0 0 0 0 0
This is my current output. Now I want to sort each column independently, so that all the numbers end up on the same line.
How can I do that?

I am not sorting here, but extracting the non-0 digits.
Here is an awk filter that "updates" each field with the only (actually, the latest) non-"0" content it sees:
# short version
awk '/./ { if ( NF > maxNF ) { maxNF=NF }
for(i=1;i<=NF;i++) { if ( $i!="0" ) { result[i]=$i } }
}
END { for(i=1;i<=maxNF;i++) { printf "%s ",result[i] } }
'
# expanded version (i.e. the same as above, with different indentation to help reading)
awk '/./ { if ( NF > maxNF )
{ maxNF=NF }
for(i=1;i<=NF;i++)
{ if ( $i!="0" )
{ result[i]=$i }
}
}
END { for(i=1;i<=maxNF;i++)
{ printf "%s ",result[i]
}
}
'
So if I paste your posted result into that filter:
echo "
1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 1 0 0 1 1 1 1 0
0 2 2 0 0 0 0 2 2 2 2 0 2 0 0 0 0 0 2 2 2 0 2 0 0 0 0 0 0 0 2
0 0 0 0 0 0 3 0 0 0 0 3 0 0 3 0 0 3 0 0 0 0 0 0 3 0 0 0 0 0 0
0 0 0 4 4 0 0 0 0 0 0 0 0 4 0 4 0 0 0 0 0 0 0 0 0 4 0 0 0 0 0
" | awk '/./ { if ( NF > maxNF ) { maxNF=NF }
for(i=1;i<=NF;i++) { if ( $i!="0" ) { result[i]=$i } }
}
END { for(i=1;i<=maxNF;i++) { printf "%s ",result[i] } }
'
it outputs:
1 2 2 4 4 1 3 2 2 2 2 3 2 4 3 4 1 3 2 2 2 1 2 1 3 4 1 1 1 1 2
(Note: there is an extra " " at the end of the output.)
A note of warning, however: very old versions of the original awk (and maybe some nawk versions) are limited to 99 fields. This is rarely encountered nowadays, and if you use GNU awk you will be fine.
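If the trailing space matters, the same idea can be written so that the separator goes between fields only. This is a minimal sketch under the same assumptions as the filter above (whitespace-separated fields, "0" meaning empty). The function name merge_nonzero is hypothetical, and printing "0" for a column that never holds a non-zero value is a choice made here, not something the original filter does.

```shell
# Collapse several rows into one, keeping the latest non-"0" value per column.
merge_nonzero() {
    awk '
    NF {
        if (NF > maxNF) maxNF = NF
        for (i = 1; i <= NF; i++)
            if ($i != "0") result[i] = $i
    }
    END {
        # OFS between fields, ORS after the last one: no trailing space
        for (i = 1; i <= maxNF; i++)
            printf "%s%s", ((i in result) ? result[i] : "0"), (i < maxNF ? OFS : ORS)
    }
    ' "$@"
}
```

It can be placed at the end of the original pipeline in place of the inline awk command.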
