Bash text file editing/modifying - bash

I have a text file that I am trying to modify. I am taking the input file that has lines of the form of
(y+1/4,-x+1/2,z+3/4)
and trying to change it to
0 1 0 -1 0 0 0 0 1 1 / 4 1 / 2 3 / 4
I currently can get to this point
0 1 0 1/4 -1 0 0 1/2 0 0 1 3/4
using
#!/bin/bash
filename="227.dat"
sed -i 's/(/ /g' $filename
sed -i 's/)//g' $filename
sed -i 's/,/ /g' $filename
sed -i 's/-x/-1 0 0/g' $filename
sed -i 's/x/ 1 0 0/g' $filename
sed -i 's/-y/ 0 -1 0/g' $filename
sed -i 's/y/ 0 1 0/g' $filename
sed -i 's/-z/ 0 0 -1/g' $filename
sed -i 's/z/ 0 0 1/g' $filename
sed -i '/+/! s/$/ 0 \/ 1 0 \/ 1 0 \/ 1/' $filename
while ((i++)); read -r line; do
if [[ $line == *[+]* ]]
then
sed -i 's/+/ /g' $filename
echo $i
fi
done < "$filename"
The reason for the echo $i was to see that it correctly gives the line number and I thought perhaps I could use it for commands on those specific lines. I am doing this conversion as the code we use in creating crystal structures needs the vector notation with fractions at the end, not the x,y,z notation. I already know this is not the "prettiest" or simplest solution, but I am very new to all of this and it's what I have been able to piece together so far. Any suggestions?

Here's an approach that may simplify the parsing. Read each line into an array using IFS set to all possible delimiters and characters you don't care about:
while IFS=$'\(\)+,' read -ra line; do
    for i in 1 3 5; do
        case "${line[$i]}" in
            x) printf "%s\t%s\t%s\t" 1 0 0 ;;
            y) printf "%s\t%s\t%s\t" 0 1 0 ;;
            z) printf "%s\t%s\t%s\t" 0 0 1 ;;
            -x) printf "%s\t%s\t%s\t" -1 0 0 ;;
            -y) printf "%s\t%s\t%s\t" 0 -1 0 ;;
            -z) printf "%s\t%s\t%s\t" 0 0 -1 ;;
        esac
    done
    for i in 2 4 6; do
        printf "%s\t" "${line[$i]}"
    done
    echo
done < "$filename"
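The same splitting idea can also be sketched with awk, which makes it easier to cope with lines that have no translation part. This is only a minimal sketch, not the method used above: the function name symop_to_matrix is hypothetical, it assumes every component is a single signed x, y or z optionally followed by +p/q, and it writes a missing translation as 0 / 1 because that is what the sed line in the question appends.

```shell
#!/bin/sh
# Hypothetical helper: reads lines such as (y+1/4,-x+1/2,z+3/4) on stdin.
# Assumption: each component is [-]x|y|z, optionally followed by +p/q.
symop_to_matrix() {
    awk '
    /^[[:space:]]*$/ { next }              # skip blank lines
    {
        gsub(/[() ]/, "")                  # drop parentheses and stray spaces
        n = split($0, comp, ",")           # one component per matrix row
        rows = ""; shifts = ""
        for (k = 1; k <= n; k++) {
            term = comp[k]
            shift = "0/1"                  # assumed default when there is no +p/q
            p = index(term, "+")
            if (p > 0) {
                shift = substr(term, p + 1)
                term = substr(term, 1, p - 1)
            }
            sign = 1
            if (substr(term, 1, 1) == "-") {
                sign = -1
                term = substr(term, 2)
            }
            a = (term == "x") ? sign : 0
            b = (term == "y") ? sign : 0
            c = (term == "z") ? sign : 0
            rows = rows a " " b " " c " "
            sub("/", " / ", shift)         # 1/4 becomes 1 / 4
            shifts = shifts shift " "
        }
        sub(/ $/, "", shifts)              # no trailing space
        print rows shifts
    }'
}
```

Usage would be something like symop_to_matrix < 227.dat > 227.out, which leaves the input file untouched. Negative translations or components that mix several variables are not handled by this sketch.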

#!/usr/bin/env bash
filename="227.dat"
re='[(]y[+]([[:digit:]/]+),-x[+]([[:digit:]/]+),z[+]([[:digit:]/]+)[)]';
while IFS= read -r line; do
    if [[ $line =~ $re ]]; then
        printf '\t%s' \
            0 1 0 \
            -1 0 0 \
            0 0 1 \
            "${BASH_REMATCH[1]}" \
            "${BASH_REMATCH[2]}" \
            "${BASH_REMATCH[3]}";
        printf '\n';
    else
        echo "ERROR: $line does not match $re" 1>&2;
    fi;
done <"$filename"
...which, given your input, returns:
0 1 0 -1 0 0 0 0 1 1/4 1/2 3/4
...which as far as I can tell is correct.
A more complex approach, making unfounded extrapolations (given the lack of detail and exemplars in the question itself), might look like:
#!/usr/bin/env bash
while IFS='(),' read -a pieces; do
    declare -A vars=( [x]=1 [y]=1 [z]=1 [x_sigil]='' [y_sigil]='' [z_sigil]='' )
    for piece in "${pieces[@]}"; do
        #               1   2      3  4
        if [[ $piece =~ (-?)([xyz])([+]([[:digit:]/]+))? ]]; then
            if [[ ${BASH_REMATCH[4]} ]]; then # only if there *are* digits
                vars[${BASH_REMATCH[2]}]=${BASH_REMATCH[4]} # ...then store them.
            fi
            vars[${BASH_REMATCH[2]}_sigil]=${BASH_REMATCH[1]} # store - if applicable
        fi
    done
    printf '\t%s' \
        "0" "${vars[x_sigil]}1" 0 \
        "${vars[y_sigil]}1" 0 0 \
        0 0 "${vars[z_sigil]}1" \
        "${vars[y]}" "${vars[x]}" "${vars[z]}"
    printf '\n'
done
Given the sample inputs provided in a comment on this answer, output is:
0 1 0 1 0 0 0 0 1 1 1 1
0 1 0 1 0 0 0 0 1 1 1 1
0 1 0 1 0 0 0 0 1 1 1 1
0 1 0 1 0 0 0 0 -1 3/4 1/4 1/2
0 1 0 -1 0 0 0 0 1 1/2 3/4 1/4
0 -1 0 1 0 0 0 0 1 1/4 1/2 3/4
0 -1 0 -1 0 0 0 0 -1 1 1 1
0 -1 0 -1 0 0 0 0 -1 1 1 1
0 -1 0 -1 0 0 0 0 -1 1 1 1
0 -1 0 -1 0 0 0 0 1 1/4 3/4 1/2
0 -1 0 1 0 0 0 0 -1 1/2 1/4 3/4
0 1 0 -1 0 0 0 0 -1 3/4 1/2 1/4
0 -1 0 -1 0 0 0 0 1 1/4 3/4 1/2
0 -1 0 -1 0 0 0 0 1 1/4 3/4 1/2
0 -1 0 -1 0 0 0 0 1 1/4 3/4 1/2
0 -1 0 -1 0 0 0 0 -1 1 1 1
0 -1 0 1 0 0 0 0 1 1/4 1/2 3/4
0 1 0 -1 0 0 0 0 1 1/2 3/4 1/4
0 1 0 1 0 0 0 0 -1 3/4 1/4 1/2
0 1 0 1 0 0 0 0 -1 3/4 1/4 1/2
0 1 0 1 0 0 0 0 -1 3/4 1/4 1/2
0 1 0 1 0 0 0 0 1 1 1 1
0 1 0 -1 0 0 0 0 -1 3/4 1/2 1/4
0 -1 0 1 0 0 0 0 -1 1/2 1/4 3/4
0 -1 0 1 0 0 0 0 -1 1/2 1/4 3/4
0 -1 0 1 0 0 0 0 -1 1/2 1/4 3/4
0 -1 0 1 0 0 0 0 -1 1/2 1/4 3/4
0 -1 0 1 0 0 0 0 1 1/4 1/2 3/4

Related

Parse error when building my matrix, do not understand why

I'm trying to build a matrix out of linear equations but for some reason I keep getting a parse error in my matrix when previously I did not.
CoM=[K1*(abs(Q1).^r) K2*(abs(Q2).^r) -(K3*(abs(Q3).^r)) -(K4*(abs(Q4).^r)); K3*(abs(Q3).^r) -(K5*(abs(Q5).^r)) -(K6*(abs(Q6).^r) -(K7*(abs(Q7).^r) ; 1 -1 0 0 ; 1 1 -1 0 ; 1 -1 0 0 ; 1 -1 0 0 ; 1 1 -1 0];
^
error: parse error near line 1 of file '____'\WNA2loop.m
syntax error
CoM=[K1*(abs(Q1).^r) K2*(abs(Q2).^r) -(K3*(abs(Q3).^r)) -(K4*(abs(Q4).^r)); K3*(abs(Q3).^r) -(K5*(abs(Q5).^r)) -(K6*(abs(Q6).^r) -(K7*(abs(Q7).^r) ; 1 -1 0 0 ; 1 1 -1 0 ; 1 -1 0 0 ; 1 -1 0 0 ; 1 1 -1 0];
^
The error is reported right where I put the caret, but when I take that small part of the matrix and run it on its own, it works:
[1 -1 0 0 ; 1 1 -1 0 ; 1 -1 0 0 ; 1 -1 0 0 ; 1 1 -1 0]
ans = 1 -1 0 0
1 1 -1 0
1 -1 0 0
1 -1 0 0
1 1 -1 0
K1 to K7 and Q1 to Q7, as well as r, are just variables that the user enters through the input function. It worked before, but now it just won't budge. Could someone please provide assistance on this?

Bash: Pipe output into a table

I have a program that prints out the following:
bash-3.2$ ./drawgrid
0
1 1 0
1 1 0
0 0 0
1
0 1 1
0 1 1
0 0 0
2
0 0 0
1 1 0
1 1 0
3
0 0 0
0 1 1
0 1 1
Is it possible to pipe the output of this command such that I get all the 3x3 matrices (together with their number) displayed on a table, for example a 2x2 like this?
0 1
1 1 0 0 1 1
1 1 0 0 1 1
0 0 0 0 0 0
2 3
0 0 0 0 0 0
1 1 0 0 1 1
1 1 0 0 1 1
I tried searching, and came across the column command, but I did not figure it out.
Thank you
You can use pr -2T to get the following output, which is close to what you expected:
0 2
1 1 0 0 0 0
1 1 0 1 1 0
0 0 0 1 1 0
1 3
0 1 1 0 0 0
0 1 1 0 1 1
0 0 0 0 1 1
You could use an awk script:
NF == 1 {
    if ($NF % 2 == 0) {
        delete line
        line[1]=$1
        f=1
    } else {
        print line[1]"\t"$1
        f=0
    }
    n=1
}
NF > 1 {
    n++
    if (f)
        line[n]=$0
    else
        print line[n]"\t"$0
}
And pipe to it like so:
$ ./drawgrid | awk -f 2x2.awk
0 1
1 1 0 0 1 1
1 1 0 0 1 1
0 0 0 0 0 0
2 3
0 0 0 0 0 0
1 1 0 0 1 1
1 1 0 0 1 1
You can get exactly what you expect with a short bash script and a little array index thought:
#!/bin/bash
declare -a idx
declare -a acont
declare -i cnt=0
declare -i offset=0
while IFS=$'\n'; read -r line ; do
    [ ${#line} -eq 1 ] && { idx+=( $line ); ((cnt++)); }
    [ ${#line} -gt 1 ] && { acont+=( $line );((cnt++)); }
done
for ((i = 0; i < ${#idx[@]}; i+=2)); do
    printf "%4s%8s\n" ${idx[i]} ${idx[i+1]}
    for ((j = offset; j < offset + 3; j++)); do
        printf " %8s%8s\n" ${acont[j]} ${acont[j+3]}
    done
    offset=$((j + 3))
done
exit 0
Output
$ bash array_cols.sh <dat/cols.txt
0 1
1 1 0 0 1 1
1 1 0 0 1 1
0 0 0 0 0 0
2 3
0 0 0 0 0 0
1 1 0 0 1 1
1 1 0 0 1 1

Use an awk loop to subset a file

I have a file with lots of pieces of information that I want to split on the first column.
Example (example.gen):
1 rs3094315 752566 A G 0 1 0 1 0 0 1 0 0 0 1 0 0 1
1 rs2094315 752999 A G 0 1 0 1 0 0 1 0 0 0 1 0 0 1
2 rs3044315 759996 A G 0 1 0 1 0 0 1 0 0 0 1 0 0 1
2 rs3054375 799966 A G 0 1 0 1 0 0 1 0 0 0 1 0 0 1
2 rs3094375 999566 A G 0 1 0 1 0 0 1 0 0 0 1 0 0 1
3 rs3078315 799866 A G 0 1 0 1 0 0 1 0 0 0 1 0 0 1
3 rs4054315 759986 A G 0 1 0 1 0 0 1 0 0 0 1 0 0 1
4 rs4900215 752998 A G 0 1 0 1 0 0 1 0 0 0 1 0 0 1
4 rs5094315 759886 A G 0 1 0 1 0 0 1 0 0 0 1 0 0 1
4 rs6094315 798866 A G 0 1 0 1 0 0 1 0 0 0 1 0 0 1
Desired output:
Chr1.gen
1 rs3094315 752566 A G 0 1 0 1 0 0 1 0 0 0 1 0 0 1
1 rs2094315 752999 A G 0 1 0 1 0 0 1 0 0 0 1 0 0 1
Chr2.gen
2 rs3044315 759996 A G 0 1 0 1 0 0 1 0 0 0 1 0 0 1
2 rs3054375 799966 A G 0 1 0 1 0 0 1 0 0 0 1 0 0 1
2 rs3094375 999566 A G 0 1 0 1 0 0 1 0 0 0 1 0 0 1
Chr3.gen
3 rs3078315 799866 A G 0 1 0 1 0 0 1 0 0 0 1 0 0 1
3 rs4054315 759986 A G 0 1 0 1 0 0 1 0 0 0 1 0 0 1
Chr4.gen
4 rs4900215 752998 A G 0 1 0 1 0 0 1 0 0 0 1 0 0 1
4 rs5094315 759886 A G 0 1 0 1 0 0 1 0 0 0 1 0 0 1
4 rs6094315 798866 A G 0 1 0 1 0 0 1 0 0 0 1 0 0 1
I've tried to do this with the following shell scripts, but it doesn't work - I can't work out how to get awk to recognise a variable defined outside the awk script itself.
First script attempt (no awk loop):
for i in {1..23}
do
awk '{$1 = $i}' example.gen > Chr$i.gen
done
Second script attempt (with awk loop):
for i in {1..23}
do
awk '{for (i = 1; i <= 23; i++) $1 = $i}' example.gen > Chr$i.gen
done
I'm sure it's probably quite basic, but I just can't work it out...
Thank you!
With awk:
awk '{print > "Chr"$1".gen"}' file
It prints every line and redirects it to a file whose name is built from the line itself: "Chr" + first_column + ".gen".
With your sample input it creates 4 files. For example the 4th is:
$ cat Chr4.gen
4 rs4900215 752998 A G 0 1 0 1 0 0 1 0 0 0 1 0 0 1
4 rs5094315 759886 A G 0 1 0 1 0 0 1 0 0 0 1 0 0 1
4 rs6094315 798866 A G 0 1 0 1 0 0 1 0 0 0 1 0 0 1
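The one-pass approach can be wrapped up as in the sketch below. The function name split_by_chromosome is hypothetical. The file name is built in a variable first because, as far as I know, an unparenthesized concatenation after > is not handled the same way by every awk implementation.

```shell
#!/bin/sh
# Hypothetical wrapper around the one-pass split; the Chr<N>.gen naming
# follows the question. Output files are created in the current directory.
split_by_chromosome() {
    # $1 = input file
    awk '{
        out = "Chr" $1 ".gen"   # file name derived from the first column
        print > out             # awk keeps each file open, so lines are appended
    }' "$1"
}
```

Called as split_by_chromosome example.gen, this reads the input once and only creates files for chromosome numbers that actually occur.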
First, use @fedorqui's answer, as that is best. But to understand the mistake you made with your first attempt (which was close), read on.
Your first attempt failed because you put the test inside the action (in the braces), not preceding it. The minimal fix:
awk "\$1 == $i" example.gen > Chr$i.gen
This uses double quotes to allow the value of i to be seen by the awk script, but that requires you to then escape the dollar sign for $1 so that you don't substitute the value of the shell's first positional argument. Cleaner but longer:
awk -v i=$i '$1 == i' example.gen > Chr$i.gen
This creates a variable i inside the awk script with the same value as the shell's i variable.
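Put back into a loop, the -v form might look like the sketch below. The function name extract_chromosomes and the awk variable name chr are made up for illustration, and a plain while loop stands in for the {1..23} brace expansion so that it also runs under a plain sh.

```shell
#!/bin/sh
# Hypothetical helper: one awk run per chromosome number.
extract_chromosomes() {
    # $1 = input file, $2 = highest chromosome number to look for
    i=1
    while [ "$i" -le "$2" ]; do
        # -v copies the shell value into an awk variable named chr;
        # the pattern $1 == chr sits outside any braces, so it selects lines
        awk -v chr="$i" '$1 == chr' "$1" > "Chr$i.gen"
        i=$((i + 1))
    done
}
```

Unlike the one-pass version, this reads the input once per chromosome and leaves an empty file behind for every number that does not occur.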

C/C++ wavelet library that also returns the NxN wavelet matrix

I am looking for a C++ library for Discrete Wavelet Transform (DWT) which can also return
the NxN DWT matrix of the transform.
There was a similar question opened here
Looking for a good C/C++ wavelet library for signal processing
but I am looking for something more specific as you can see.
It would be more helpful if the library is under some non-GPL license that lets me use it in proprietary software (LGPL, MPL, BSD etc.)
Thanks in advance
The reason why this matrix is never computed is that it is very inefficient to compute the DWT using it. The FWT approach is much faster.
For a signal of length 16 and a 3-level haar transform, I found that this matrix in matlab
>> h=[1 1];
>> g=[1 -1];
>> m1=[[ones(1,8) zeros(1,8); ...
zeros(1,8) ones(1,8); ...
1 1 1 1 -1 -1 -1 -1 zeros(1,8); ...
zeros(1,8) 1 1 1 1 -1 -1 -1 -1]/sqrt(8); ...
[1 1 -1 -1 zeros(1,12); ...
zeros(1,4) 1 1 -1 -1 zeros(1,8); ...
zeros(1,8) 1 1 -1 -1 zeros(1,4); ...
zeros(1,12) 1 1 -1 -1]/sqrt(4); ...
[g zeros(1,14); ...
zeros(1,2) g zeros(1,12); ...
zeros(1,4) g zeros(1,10); ...
zeros(1,6) g zeros(1,8); ...
zeros(1,8) g zeros(1,6); ...
zeros(1,10) g zeros(1,4); ...
zeros(1,12) g zeros(1,2); ...
zeros(1,14) g]/sqrt(2)]
m1 =
A A A A A A A A 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 A A A A A A A A
A A A A -A -A -A -A 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 A A A A -A -A -A -A
B B -B -B 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 B B -B -B 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 B B -B -B 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 B B -B -B
C -C 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 C -C 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 C -C 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 C -C 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 C -C 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 C -C 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 C -C 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 C -C
where A=1/sqrt(8), B=1/sqrt(4) and C=1/sqrt(2), corresponds to the FWT.
That shows you how to build your matrix from the filters. Start with the bottom half of the matrix: a matrix of zeroes in which filter g is placed 2 steps further along in every row. Then make the filter twice as wide and repeat, only now shifting 4 steps at a time. Repeat this until you reach the highest level of decomposition, then finally put in the approximation filter at the same width (here, 8).
Just as a check:
>> signal=1:16; % ramp
>> [h g]=daubcqf(2); % Haar coefficients from the Rice wavelet toolbox
>> fwt(h,signal,3) % fwt code by Jeffrey Kantor
>> m1*signal' % should produce the same vector
Hope that helps you write it in C++. It is not difficult (a bit of bookkeeping) but, as said, no one uses it because efficient algorithms do not need it.
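To make the bookkeeping concrete, here is a minimal sketch of the same row layout, written as a small awk program inside a shell function rather than in C++. The function name haar_matrix is hypothetical. It covers only the Haar case described above, assumes the signal length is a power of two that is at least 2^levels, and prints entries rounded to four decimals.

```shell
#!/bin/sh
# Hypothetical helper: prints the Haar DWT matrix row by row.
# $1 = signal length (assumed a power of two), $2 = number of levels
haar_matrix() {
    awk -v n="$1" -v levels="$2" '
    BEGIN {
        w = 2 ^ levels
        # approximation rows: w equal entries, shifted by w per row
        for (r = 0; r < n / w; r++)
            emit(r * w, w, w, n)
        # detail rows, coarsest level first; the width halves at each finer level
        for (; w >= 2; w /= 2)
            for (r = 0; r < n / w; r++)
                emit(r * w, w, w / 2, n)
    }
    # one row: columns start..start+width-1 are nonzero, the first npos positive
    function emit(start, width, npos, total,    c, v, line) {
        line = ""
        for (c = 0; c < total; c++) {
            v = 0
            if (c >= start && c < start + width)
                v = (c < start + npos ? 1 : -1) / sqrt(width)
            line = line sprintf("%s%.4f", (c ? " " : ""), v)
        }
        print line
    }'
}
```

haar_matrix 16 3 prints 16 rows in the same order as m1 above: two approximation rows, then the detail rows of width 8, 4 and 2.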

How to insert whitespace between characters of words in a specific field in a file

I have a file containing 100000 lines like this
1 0110100010010101
2 1000010010111001
3 1000011001111000
10 1011110000111110
123 0001000000100001
I would like to know how can I display efficiently just the second field by adding whitespaces between characters.
0 1 1 0 1 0 0 0 1 0 0 1 0 1 0 1
1 0 0 0 0 1 0 0 1 0 1 1 1 0 0 1
1 0 0 0 0 1 1 0 0 1 1 1 1 0 0 0
1 0 1 1 1 1 0 0 0 0 1 1 1 1 1 0
0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 1
One solution would be to get the second column with awk and then add the whitespace using sed. But as the file is very long, I would like to avoid pipes, so I'm wondering whether I can do this using awk alone.
Thanks in advance
is this ok?
awk '{gsub(/./,"& ",$2);print $2}' yourFile
example
kent$ echo "1 0110100010010101
2 1000010010111001
3 1000011001111000"|awk '{gsub(/./,"& ",$2);print $2}'
0 1 1 0 1 0 0 0 1 0 0 1 0 1 0 1
1 0 0 0 0 1 0 0 1 0 1 1 1 0 0 1
1 0 0 0 0 1 1 0 0 1 1 1 1 0 0 0
update
more than 2 digits in 1st column won't work? I didn't get it:
kent$ echo "133 0110100010010101
233 1000010010111001
333 1000011001111000"|awk '{gsub(/./,"& ",$2);print $2}'
0 1 1 0 1 0 0 0 1 0 0 1 0 1 0 1
1 0 0 0 0 1 0 0 1 0 1 1 1 0 0 1
1 0 0 0 0 1 1 0 0 1 1 1 1 0 0 0
gsub(/./,"& ", $2)
1 /./ match any single character
2 "& " & here means the matched string, in this case, each character
3 $2 column 2
so it means: replace each character in the 2nd column with the character itself + " ".
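The substitution explained above leaves one extra space after the last character. A minimal sketch that also trims it is shown here; the function name space_out_second_field is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads "id bits" lines on stdin and prints the bits
# separated by single spaces.
space_out_second_field() {
    awk '{
        gsub(/./, "& ", $2)   # put a space after every character of field 2
        sub(/ $/, "", $2)     # drop the space added after the last character
        print $2
    }'
}
```

It reads from standard input, so space_out_second_field < yourFile processes the file in a single awk run without any pipe.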
One way using only awk:
awk '{ gsub( /./, "& ", $2 ); print $2; }' infile
That yields:
0 1 1 0 1 0 0 0 1 0 0 1 0 1 0 1
1 0 0 0 0 1 0 0 1 0 1 1 1 0 0 1
1 0 0 0 0 1 1 0 0 1 1 1 1 0 0 0
EDIT: Kent and I gave the same implementation, so, for this answer to be a bit more useful, I will add the sed one:
sed -e 's/^[^ ]* *//; s/./& /g' infile
Just adding a sed alternative:
sed -e 's/^[^ ]* *//;s/./& /g;s/ $//' file
Three commands:
Remove the first field and the spaces after it from the start of the line
Replace every character with itself followed by a space
(Optional) Remove the trailing space at the end of the line
sed solution.
sed 's/.* //;s/\(.\)/\1 /g'
It adds an extra space at the end of each line. Add ;s/ $// to the expression to remove it.
This might work for you (GNU sed):
sed 's/^\S*\s*//;s/\B/ /g' /file
