I am trying to combine permutations of some .wav files.
There are 6 variations of each of 4 instruments. Each generated track should have one of each instrument. If my math is right, there should be 24 unique permutations.
The files are named like:
beat_1.wav, beat_2.wav ...
bass_1.wav, bass_2.wav ...
chord_1.wav, chord_2.wav ...
melody_1.wav, melody_2.wav ...
I've tried to combine them with
sox -m {beat,bass,chord,melody}_{1..6}.wav out_{1..24}.wav
but regardless of what range of values I use for the out_n.wav file, sox gives this error
immediately:
sox FAIL formats: can't open input file `out_23.wav': No such file or directory
The number in out_23.wav is always one lower than whatever range I specify.
I'm open to using tools other than sox and bash, provided I can generate all the tracks in one command/program (I don't want to do it by hand in Audacity, for example).
If you replace sox with echo you will see the command you constructed is not really permutating the way you want:
$ echo sox -m {beat,bass,chord,melody}_{1..6}.wav out_{1..24}.wav
sox -m beat_1.wav beat_2.wav beat_3.wav beat_4.wav beat_5.wav beat_6.wav bass_1.wav bass_2.wav bass_3.wav bass_4.wav bass_5.wav bass_6.wav chord_1.wav chord_2.wav chord_3.wav chord_4.wav chord_5.wav chord_6.wav melody_1.wav melody_2.wav melody_3.wav melody_4.wav melody_5.wav melody_6.wav out_1.wav out_2.wav out_3.wav out_4.wav out_5.wav out_6.wav out_7.wav out_8.wav out_9.wav out_10.wav out_11.wav out_12.wav out_13.wav out_14.wav out_15.wav out_16.wav out_17.wav out_18.wav out_19.wav out_20.wav out_21.wav out_22.wav out_23.wav out_24.wav
The brace expansion produces all 24 input file names, as required, but it also puts all 24 output names on the same command line. According to the sox documentation, every file name except the last is treated as an input, so out_1.wav ... out_23.wav are treated as inputs too, not outputs, and those files do not exist yet. So, you have a logic problem.
If you want to permutate through all 24 combinations, one at a time, I recommend a for loop, e.g.
i=0
for f in {beat,bass,chord,melody}_{1..6}.wav
do
((i++))
echo "Input: $f Output: out_${i}.wav"
done
Which outputs:
Input: beat_1.wav Output: out_1.wav
Input: beat_2.wav Output: out_2.wav
Input: beat_3.wav Output: out_3.wav
Input: beat_4.wav Output: out_4.wav
Input: beat_5.wav Output: out_5.wav
Input: beat_6.wav Output: out_6.wav
Input: bass_1.wav Output: out_7.wav
Input: bass_2.wav Output: out_8.wav
Input: bass_3.wav Output: out_9.wav
Input: bass_4.wav Output: out_10.wav
Input: bass_5.wav Output: out_11.wav
Input: bass_6.wav Output: out_12.wav
Input: chord_1.wav Output: out_13.wav
Input: chord_2.wav Output: out_14.wav
Input: chord_3.wav Output: out_15.wav
Input: chord_4.wav Output: out_16.wav
Input: chord_5.wav Output: out_17.wav
Input: chord_6.wav Output: out_18.wav
Input: melody_1.wav Output: out_19.wav
Input: melody_2.wav Output: out_20.wav
Input: melody_3.wav Output: out_21.wav
Input: melody_4.wav Output: out_22.wav
Input: melody_5.wav Output: out_23.wav
Input: melody_6.wav Output: out_24.wav
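Note that the loop above visits each of the 24 files once; it does not mix anything yet. If the goal is one file of each instrument per track, picking one of 6 variations for each of 4 instruments gives 6*6*6*6 = 1296 combinations rather than 24. A minimal sketch with nested loops, assuming the file names from the question and using a hypothetical helper name list_mixes, which only prints the sox commands as a dry run:

```shell
#!/usr/bin/env bash
# Dry run: print one sox command per combination instead of executing it.
# list_mixes is a hypothetical helper name, not part of sox.
list_mixes() {
  local n=0 beat bass chord melody
  for beat in beat_{1..6}.wav; do
    for bass in bass_{1..6}.wav; do
      for chord in chord_{1..6}.wav; do
        for melody in melody_{1..6}.wav; do
          n=$((n + 1))
          # one variation of each instrument, mixed into a numbered output
          echo sox -m "$beat" "$bass" "$chord" "$melody" "out_${n}.wav"
        done
      done
    done
  done
}

# Show the first two generated commands
list_mixes | sed -n '1,2p'
```

Once the printed commands look right, dropping the echo makes the loop run sox for real.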
I have a list of files with file names that contain a substring of 6 numbers that represents HHMMSS, HH: 2 digits hour, MM: 2 digits minutes, SS: 2 digits seconds.
If the list of files is ordered, the increments should be in steps of 30 minutes, that is, the first substring should be 000000, followed by 003000, 010000, 013000, ..., 233000.
I want to check that no file is missing iterating the list of files and checking that neither of these substrings is missing. My approach:
string_check=000000
for file in "${file_list[@]}"; do
if [[ ${file:22:6} == $string_check ]]; then
echo "Ok"
else
echo "Problem: an hour (file) is missing"
exit 99
fi
string_check=$((string_check+3000)) #this is the key line
done
The second-to-last line is the key. The result should be formatted to 6 digits, which I know how to do, but I want to add time like a clock, or, more specifically, with arithmetic modulo 60. How can that be done?
Assumptions:
all 6-digit strings are of the format xx[03]000 (i.e., the minutes must be exactly 00 or 30, with no seconds)
if there are strings like xx1529 ... these will be ignored (see 2nd half of answer - use of comm - to address OP's comment about these types of strings being an error)
Instead of trying to do a bunch of mod 60 math for the MM (minutes) portion of the string, we can use a sequence generator to generate all the desired strings:
$ for string_check in {00..23}{00,30}00; do echo $string_check; done
000000
003000
010000
013000
... snip ...
230000
233000
While OP should be able to add this to the current code, I'm thinking we might go one step further and look at pre-parsing all of the filenames, pulling the 6-digit strings into an associative array (ie, the 6-digit strings act as the indexes), eg:
unset myarray
declare -A myarray
for file in "${file_list[@]}"
do
myarray[${file:22:6}]+=" ${file}" # in case multiple files have same 6-digit string
done
Using the sequence generator as the driver of our logic, we can pull this together like such:
for string_check in {00..23}{00,30}00
do
[[ -z "${myarray[${string_check}]}" ]] &&
echo "Problem: (file) '${string_check}' is missing"
done
NOTE: OP can decide if the process should finish checking all strings or if it should exit on the first missing string (per OP's current code).
One idea for using comm to compare the 2 lists of strings:
# display sequence generated strings that do not exist in the array:
comm -23 <(printf "%s\n" {00..23}{00,30}00) <(printf "%s\n" "${!myarray[@]}" | sort)
# OP has commented that strings not like 'xx[03]000' should generate an error;
# display strings (extracted from file names) that do not exist in the sequence
comm -13 <(printf "%s\n" {00..23}{00,30}00) <(printf "%s\n" "${!myarray[@]}" | sort)
Where:
comm -23 - display only the lines from the first 'file' that do not exist in the second 'file' (ie, missing sequences of the format xx[03]000)
comm -13 - display only the lines from the second 'file' that do not exist in the first 'file' (ie, filenames with strings not of the format xx[03]000)
These lists could then be used as input to a loop, or passed to xargs, for additional processing as needed; keeping in mind the comm -13 output will display the indices of the array, while the associated contents of the array will contain the name of the original file(s) from which the 6-digit string was derived.
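For completeness, the clock-style increment the question asks about can also be done with integer arithmetic. The following is only a sketch: next_slot is a hypothetical helper name, and it assumes the input is always a 6-digit HHMMSS string. It converts the time to minutes since midnight, adds 30, and wraps around at 24 hours.

```shell
#!/usr/bin/env bash
# next_slot is a hypothetical helper: given HHMMSS, print the slot 30 minutes later.
next_slot() {
  local t=$1
  # 10# forces base 10, so values such as 08 or 09 are not parsed as octal
  local h=$((10#${t:0:2})) m=$((10#${t:2:2})) s=$((10#${t:4:2}))
  # minutes since midnight, wrapping around after 23:30
  local total=$(( (h * 60 + m + 30) % (24 * 60) ))
  printf '%02d%02d%02d\n' "$((total / 60))" "$((total % 60))" "$s"
}

next_slot 003000   # prints 010000
next_slot 233000   # prints 000000
```

In the loop from the question, the key line would then become string_check=$(next_slot "$string_check").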
This is easy to do in a POSIX shell, using only built-ins:
#!/usr/bin/env sh
# Print an x for each glob matched file, and store result in string_check
string_check=$(printf '%.0sx' ./*[0-2][0-9][03]000*)
# Now string_check length reflects the number of matches
if [ ${#string_check} -eq 48 ]; then
echo "Ok"
else
echo "Problem: an hour (file) is missing"
exit 99
fi
Alternatively:
#!/usr/bin/env sh
if [ "$(printf '%.0sx' ./*[0-2][0-9][03]000*)" \
= 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx' ]; then
echo "Ok"
else
echo "Problem: an hour (file) is missing"
exit 99
fi
I'm trying to make a list with a simple bash looping
I want this:
000000
000001
000002
They give me this:
0
1
2
My shell code:
countBEG="000000"
countEND="999999"
while [ $countBEG != $countEND ]
do
echo "$countBEG"
countBEG=$[$countBEG +1]
done
Change your echo to use printf, where you can specify format for left padding.
printf "%06d\n" "$countBEG"
This sets 6 as fixed length of the output, using zeros to fill empty spaces.
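A minimal sketch of the whole loop with this change, assuming the counter is kept as a plain integer and padded only when it is printed. The function name padded_range is hypothetical, and the small range is just for illustration:

```shell
#!/usr/bin/env bash
# padded_range is a hypothetical helper: print START..END inclusive, zero-padded to 6 digits.
padded_range() {
  local i=$1 end=$2
  while [ "$i" -le "$end" ]; do
    printf '%06d\n' "$i"   # pad only when printing
    i=$((i + 1))           # the counter itself stays a plain integer
  done
}

padded_range 0 2
```

Keeping the variable unpadded avoids the shell reading a value with leading zeros, such as 000008, as an invalid octal number.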
You're looking for:
seq -w "$countBEG" "$countEND"
The -w option does the padding.
The following command will produce the desired output (no need for the loop) :
printf '%06d\n' {0..999999}
Explanation :
{0..999999} is expanded by bash to the sequence of 0 to 999999
the format string '%06d\n' tells printf to display the number it is given as argument padded to 6 digits and followed by a linefeed
printf repeats this output if it is given more arguments than is defined in its format specification
Say my stream is x*N lines long, where x is the number of records and N is the number of columns per record, and is output column-wise. For example, x=2, N=3:
1
2
Alice
Bob
London
New York
How can I join every line, modulo the number of records, back into columns:
1 Alice London
2 Bob New York
If I use paste, with N -s, I get the transposed output. I could use split, with the -l option equal to N, then recombine the pieces afterwards with paste, but I'd like to do it within the stream without spitting out temporary files all over the place.
Is there an "easy" solution (i.e., rather than invoking something like awk)? I'm thinking there may be some magic join solution, but I can't see it...
EDIT Another example, when x=5 and N=3:
1
2
3
4
5
a
b
c
d
e
alpha
beta
gamma
delta
epsilon
Expected output:
1 a alpha
2 b beta
3 c gamma
4 d delta
5 e epsilon
You are looking for pr to "columnate" the stream:
pr -T -s$'\t' -3 <<'END_STREAM'
1
2
Alice
Bob
London
New York
END_STREAM
1 Alice London
2 Bob New York
pr is in coreutils.
Most systems should include a tool called pr, intended to print files. It's part of POSIX.1 so it's almost certainly on any system you'll use.
$ pr -3 -t < inp1
1 a alpha
2 b beta
3 c gamma
4 d delta
5 e epsilon
Or if you prefer,
$ pr -3 -t -s, < inp1
1,a,alpha
2,b,beta
3,c,gamma
4,d,delta
5,e,epsilon
or
$ pr -3 -t -w 20 < inp1
1 a alpha
2 b beta
3 c gamma
4 d delta
5 e epsilo
Check the POSIX specification of pr for standard usage information, or man pr for the options specific to your operating system.
In order to reliably process the input you need to either know the number of columns in the output file or the number of lines in the output file. If you just know the number of columns, you'd need to read the input file twice.
Hackish coreutils solution
# If you know the number of output columns but not the number of
# output lines in advance, you can calculate the latter using wc -l
ocols=3 # number of output columns (example value)
olines=$(( $(wc -l < file) / ocols ))
# Split the file by the number of output lines
split -l "${olines}" file FOO # FOO is a prefix. Choose a better one
paste FOO*
AWK solutions
If you know the number of output columns in advance you can use this awk script:
convert.awk:
BEGIN {
# Split the file into one big record where fields are separated
# by newlines (awk strings need double quotes)
RS=""
FS="\n"
}
FNR==NR {
# We are reading the file twice (see invocation below)
# When reading it the first time we store the number
# of fields (lines) in the variable n because we need it
# when processing the file, then skip to the second pass
# so that the output is not printed twice.
n=NF
next
}
{
# n / c is the number of output lines
# For every output line ...
for(i=0;i<n/c;i++) {
# ... print the columns belonging to it
for(ii=1+i;ii<=NF;ii+=n/c) {
printf "%s ", $ii
}
print "" # Adds a newline
}
}
and call it like this:
awk -vc=3 -f convert.awk file file # Twice the same file
If you know the number of output lines in advance you can use the following awk script:
convert.awk:
BEGIN {
# Split the file into one big record where fields are separated
# by newlines (awk strings need double quotes)
RS=""
FS="\n"
}
}
{
# x is the number of output lines and has been passed to the
# script. For each line in output
for(i=0;i<x;i++){
# ... print the columns belonging to it
for(ii=i+1;ii<=NF;ii+=x){
printf "%s ",$ii
}
print "" # Adds a newline
}
}
And call it like this:
awk -vx=2 -f convert.awk file
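Since the whole input has to be buffered in either case, a single pass that stores every line in an array also works directly on a stream, without reading anything twice. This is only a sketch: columnize is a hypothetical helper name, and it assumes the number of input lines is an exact multiple of the column count.

```shell
#!/usr/bin/env bash
# columnize is a hypothetical helper: $1 = number of output columns; reads lines from stdin.
columnize() {
  awk -v c="$1" '
    { line[NR] = $0 }                 # buffer every input line
    END {
      rows = NR / c                   # number of records (output lines)
      for (i = 1; i <= rows; i++) {
        out = line[i]
        # the j-th column of record i sits j*rows lines further down
        for (j = 1; j < c; j++) out = out " " line[i + j * rows]
        print out
      }
    }'
}

printf '%s\n' 1 2 Alice Bob London 'New York' | columnize 3
```

For the sample input this prints "1 Alice London" and "2 Bob New York".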
I have a large directory of data files which I am in the process of manipulating to get them in a desired format. They each begin and end 15 lines too soon, meaning I need to strip the first 15 lines off one file and paste them to the end of the previous file in the sequence.
To begin, I have written the following code to separate the relevant data into easy chunks:
#!/bin/bash
destination='media/user/directory/'
for file1 in `ls $destination*.ascii`
do
echo $file1
file2="${file1}.end"
file3="${file1}.snip"
sed -e '16,$d' $file1 > $file2
sed -e '1,15d' $file1 > $file3
done
This worked perfectly, so the next step is the world's simplest cat command:
cat $file3 $file2 > outfile
However, what I need to do is to stitch each file2 to the file3 of the previous file in the sequence.
The files are all sequential over time:
*_20090412T235945_20090413T235944_* ### April 13
*_20090413T235945_20090414T235944_* ### April 14
So I need to take the 15 lines snipped off the April 14 example above and paste it to the end of the April 13 example.
This doesn't have to be part of the original code, in fact it would be probably best if it weren't. I was just hoping someone would be able to help me get this going.
Thanks in advance! If there is anything I have been unclear about and needs further explanation please let me know.
"I need to strip the first 15 lines off one file and paste them to the end of the previous file in the sequence."
If I understand what you want correctly, it can be done with one line of code:
awk 'NR==1 || FNR==16{close(f); f=FILENAME ".new"} {print>f}' file1 file2 file3
When this has run, the files file1.new, file2.new, and file3.new will be in the new form with the lines transferred. Of course, you are not limited to three files: you may specify as many as you like on the command line.
Example
To keep our example short, let's just strip the first 2 lines instead of 15. Consider these test files:
$ cat file1
1
2
3
$ cat file2
4
5
6
7
8
$ cat file3
9
10
11
12
13
14
15
Here is the result of running our command:
$ awk 'NR==1 || FNR==3{close(f); f=FILENAME ".new"} {print>f}' file1 file2 file3
$ cat file1.new
1
2
3
4
5
$ cat file2.new
6
7
8
9
10
$ cat file3.new
11
12
13
14
15
As you can see, the first two lines of each file have been transferred to the preceding file.
How it works
awk implicitly reads each file line-by-line. The job of our code is to choose which new file a line should be written to based on its line number. The variable f will contain the name of the file that we are writing to.
NR==1 || FNR==16{close(f); f=FILENAME ".new"}
When we are reading the first line of the first file, NR==1, or when we are reading the 16th line of whatever file we are on, FNR==16, we close the previous output file and update f to be the name of the current file with .new added to the end.
For the short example, which transferred 2 lines instead of 15, we used the same code but with FNR==16 replaced with FNR==3.
print>f
This prints the current line to file f.
(If this was a shell script, we would use >>. This is not a shell script. This is awk.)
Using a glob to specify the file names
destination='media/user/directory/'
awk 'NR==1 || FNR==16{close(f); f=FILENAME ".new"} {print>f}' "$destination"*.ascii
Your task is not that difficult at all. You want to gather a list of all _end files in the directory (using a for loop and globbing, NOT looping on the results of ls). Once you have all the _end files, you simply parse the dates out of each name using parameter expansion with substring removal, say into d1 and d2 for date1 and date2 in:
stuff_20090413T235945_20090414T235944_end
      |  d1  |        |  d2  |
then you simply subtract 1 from d1 into say date0 or d0 and then construct a previous filename out of d0 and d1 using _snip instead of _end. Then just test for the existence of the previous _snip filename, and if it exists, paste your info from the current _end file to the previous _snip file. e.g.
#!/bin/bash
for i in *end; do ## find all _end files
d1="${i#*stuff_}" ## isolate first date in filename
d1="${d1%%T*}"
d2="${i%T*}" ## isolate second date
d2="${d2##*_}"
d0=$(date -d "$d1 -1 day" +%Y%m%d) ## previous day (GNU date); a plain d1 - 1 breaks at month boundaries
prev="${i/$d1/$d0}" ## create previous 'snip' filename
prev="${prev/$d2/$d1}"
prev="${prev%end}snip"
if [ -f "$prev" ] ## test that prev snip file exists
then
printf "paste to : %s\n" "$prev"
printf " from : %s\n\n" "$i"
fi
done
Test Input Files
$ ls -1
stuff_20090413T235945_20090414T235944_end
stuff_20090413T235945_20090414T235944_snip
stuff_20090414T235945_20090415T235944_end
stuff_20090414T235945_20090415T235944_snip
stuff_20090415T235945_20090416T235944_end
stuff_20090415T235945_20090416T235944_snip
stuff_20090416T235945_20090417T235944_end
stuff_20090416T235945_20090417T235944_snip
stuff_20090417T235945_20090418T235944_end
stuff_20090417T235945_20090418T235944_snip
stuff_20090418T235945_20090419T235944_end
stuff_20090418T235945_20090419T235944_snip
Example Use/Output
$ bash endsnip.sh
paste to : stuff_20090413T235945_20090414T235944_snip
from : stuff_20090414T235945_20090415T235944_end
paste to : stuff_20090414T235945_20090415T235944_snip
from : stuff_20090415T235945_20090416T235944_end
paste to : stuff_20090415T235945_20090416T235944_snip
from : stuff_20090416T235945_20090417T235944_end
paste to : stuff_20090416T235945_20090417T235944_snip
from : stuff_20090417T235945_20090418T235944_end
paste to : stuff_20090417T235945_20090418T235944_snip
from : stuff_20090418T235945_20090419T235944_end
(of course replace stuff_ with your actual prefix)
Let me know if you have questions.
You could store the previous $file3 value in a variable, and use a -n check to skip the first iteration, where there is no previous file yet:
#!/bin/bash
destination='media/user/directory/'
prev=""
for file1 in "$destination"*.ascii
do
echo "$file1"
file2="${file1}.end"
file3="${file1}.snip"
sed -e '16,$d' "$file1" > "$file2"
sed -e '1,15d' "$file1" > "$file3"
# Stitch the current head onto the previous snip; one output per file
# (the .joined suffix is only an example name)
if [ -n "$prev" ]; then
cat "$prev" "$file2" > "${prev%.snip}.joined"
fi
prev=$file3
done
There are several files named TESTFILE which are located in directories ~/main1/sub1, ~/main1/sub2, ~/main1/sub3, ..., ~/main2/sub1, ~/main2/sub2, ... ~/mainX/subY where mainX is the main folder and subY are the subfolders inside the main folder. The TESTFILE file for each main folder-subfolder has the same pattern, but the data in each is unique.
Now here's what I want to do:
I want to read a specific number in the TESTFILE for each ~/mainX/subY.
I want to create a text file where every line has the following format [mainX][space][subY][space][value read from TESTFILE]
Some information about TESTFILE and the data I want to get:
It is an OSZICAR file from VASP, a DFT program
The number of lines in OSZICAR varies in different folder-subfolder combination
The information I want to get is always located in the last two lines of the file
The last two lines always look like this:
DAV: 2 -0.942521930239E+01 0.27889E-09 -0.79991E-13 864 0.312E-06
10 F= -.94252193E+01 E0= -.94252193E+01 d E =-.717252E-07
Or in general, the last two lines pattern is:
DAV: a b c d e f
g F= h E0= i d E = j
where DAV:, F=, E0= and d E = are the parts that do not change, and a, g, h, i and j are the values that I want to get
Some information about main folder mainX and sub-folder subY:
The names of the folders mainX and subY are all real numbers.
How I want the output to be:
Suppose mainX={0.12, 0.20, 0.34, 0.7} and subY={1.10, 2.30, 4.50, 1.00, 2.78}, and the last two lines of ~/0.12/1.10/OSZICAR is the example above, my output file should contain:
0.12 1.10 2 10 -.94252193E+01 -.94252193E+01 -.717252E-07
...
0.7 2.30 2 10 -.94252193E+01 -.94252193E+01 -.717252E-07
...
mainX subY a g h i j
How do I do this in the simplest way possible? I'm reading grep, awk, sed and I'm very overwhelmed.
You could do this using some for loops in bash:
for m in ~/main*/; do
main=$(basename "$m")
for s in "$m"sub*/; do
sub=$(basename "$s")
num=$(tail -n2 "${s}TESTFILE" | awk -F'[ =]+' 'NR==1{s=$2;next}{print s,$1,$3,$5,$8}')
echo "$main $sub $num"
done
done > output_file
I have modified the command to extract the data from your file. It uses tail to read the last two lines of the file. The lines are passed to awk, where they are split into fields using any number of spaces and = signs together as the field separator. The second field from the first of the two lines is saved to the variable s. next skips to the next line, then the columns that you are interested in are printed.
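To see how the field numbers line up, the awk part can be tried on the two sample lines from the question. This is a sketch only: extract_vals is a hypothetical helper name, and the extra sub() call is an assumption that real OSZICAR files may indent these lines, which would otherwise shift every field by one.

```shell
#!/usr/bin/env bash
# extract_vals is a hypothetical helper: reads the last two lines of an OSZICAR on stdin.
extract_vals() {
  awk -F'[ =]+' '
    { sub(/^ +/, "") }               # drop leading spaces so field numbers stay stable
    NR == 1 { a = $2; next }         # first line: remember the value after DAV:
    { print a, $1, $3, $5, $8 }      # second line: g, h (F=), i (E0=), j (d E =)
  '
}

printf '%s\n' \
  'DAV: 2 -0.942521930239E+01 0.27889E-09 -0.79991E-13 864 0.312E-06' \
  '10 F= -.94252193E+01 E0= -.94252193E+01 d E =-.717252E-07' |
  extract_vals
```

For the sample lines this prints 2 10 -.94252193E+01 -.94252193E+01 -.717252E-07, which is the a g h i j part of the desired output line.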
Your question is not very clear - specifically on how to extract the value from TESTFILE, but this is something like what you want:
#!/bin/bash
for X in {1..100}; do
for Y in {1..100}; do
directory="main${X}/sub${Y}"
echo Checking $directory
if [ -f "${directory}/TESTFILE" ]; then
something=$(grep something "${directory}/TESTFILE")
echo main${X} sub${Y} $something
fi
done
done