Robust algorithm for chromatic instrument tuner? [closed] - algorithm

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 4 years ago.
What is the most robust algorithm for a chromatic instrument tuner?
I am trying to write an instrument tuner. I have tried the following two algorithms:
FFT to create a Welch periodogram, then detect the peak frequency
A simple autocorrelation (http://en.wikipedia.org/wiki/Autocorrelation)
I encountered the following basic problems:
Accuracy 1: with the FFT, the relation between sample rate, recording length, and bin size is fixed. This means I need to record 1-2 seconds of data to get an accuracy of a few cents, which is not exactly what I would call real-time.
Accuracy 2: autocorrelation works a bit better. To get the needed accuracy of a few cents I had to introduce linear interpolation of samples.
Robustness: with a guitar I see a lot of overtones. Some overtones are actually stronger than the fundamental produced by the string. I could not find a robust way to identify the string actually being played.
Still, any cheap electronic tuner is more robust than my implementation.
How are those tuners implemented?

You can interpolate FFTs as well, and you can often use the higher harmonics for increased precision. You need to know a little bit about the harmonics of the sound the instrument produces, and it's easier if you can assume you're less than half an octave off target, but even without that, the fundamental frequency is usually much stronger than the first subharmonic, and is not that far below the strongest harmonic. A simple heuristic should let you pick the fundamental frequency.
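For instance, here is a minimal sketch of the FFT-interpolation idea (not necessarily what any commercial tuner does): fit a parabola through the log-magnitude spectrum around the strongest bin to get sub-bin accuracy. The window length, Hann window, and 440 Hz test tone are arbitrary choices.

```python
import numpy as np

def fft_peak_frequency(signal, sample_rate):
    """Estimate the dominant frequency with sub-bin accuracy by fitting a
    parabola through the log-magnitude spectrum around the strongest bin."""
    windowed = signal * np.hanning(len(signal))
    spectrum = np.abs(np.fft.rfft(windowed))
    k = int(np.argmax(spectrum))
    if k == 0 or k == len(spectrum) - 1:
        return k * sample_rate / len(signal)
    # Parabolic interpolation on the log magnitudes of the peak and its neighbours.
    a, b, c = np.log(spectrum[k - 1:k + 2] + 1e-12)
    offset = 0.5 * (a - c) / (a - 2 * b + c)     # fractional bin offset in (-0.5, 0.5)
    return (k + offset) * sample_rate / len(signal)

# Example: a 440 Hz tone sampled at 44.1 kHz for ~93 ms (4096 samples).
sr = 44100
t = np.arange(4096) / sr
print(fft_peak_frequency(np.sin(2 * np.pi * 440 * t), sr))   # ~440.0
```

This gives accuracy well below one bin from a short window; picking the fundamental among strong overtones still needs the kind of heuristic described above.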
I doubt that the autocorrelation method will work all that robustly across instruments, but you should get a series of self-similarity scores that is highest when you're offset by one fundamental period. If you offset by two periods, you should get the same score again (to within noise and the differential damping of the different harmonics).

There's a pretty cool algorithm called bitstream autocorrelation. It doesn't take many CPU cycles, and it's very accurate. You basically find all the zero-crossing points and save them as a binary string, then run autocorrelation on that string. It's fast because you can use XOR instead of floating-point multiplication.
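A conceptual sketch of that idea (not the reference implementation, which packs the bits into machine words and uses XOR plus popcount; the lag range in the example encodes an assumed 150-500 Hz pitch range, which sidesteps the octave ambiguity a real tuner has to resolve explicitly):

```python
import numpy as np

def bitstream_period(signal, min_lag, max_lag):
    """Conceptual bitstream autocorrelation: binarise the signal by its sign
    (i.e. its zero crossings), then score each lag by XOR-ing the bit string
    with a shifted copy of itself and counting agreements."""
    bits = np.asarray(signal) >= 0.0             # one bit per sample
    n = len(bits)
    best_lag, best_score = min_lag, -1.0
    for lag in range(min_lag, min(max_lag, n // 2)):
        mismatches = np.count_nonzero(bits[:n - lag] ^ bits[lag:])
        score = 1.0 - mismatches / (n - lag)     # 1.0 = perfect self-similarity
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag                              # period estimate in samples

# Example: 220 Hz tone at 44.1 kHz, lag range corresponding to 150-500 Hz.
sr = 44100
t = np.arange(8192) / sr
sig = np.sin(2 * np.pi * 220 * t) + 0.05 * np.random.randn(len(t))
print(sr / bitstream_period(sig, sr // 500, sr // 150))   # ~220 Hz
```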

Related

What is the intuition behind flipping filters during convolution? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 2 years ago.
I saw that, while using the conv2d function of Theano, the filters were flipped both vertically and horizontally. Why is that? And does it matter in the case of a Convolutional Neural Network?
Because that is how convolution is defined mathematically. Without flipping the filter, the operation is called cross-correlation. The advantage of convolution is that it has nicer mathematical properties (for example, it is commutative and associative).
However, in the context of a Convolutional Neural Network it doesn't matter whether you use convolution or cross-correlation; they are equivalent. This is because the filter weights are learned during training, i.e. they are updated to minimize a cost function. In a CNN that uses cross-correlation, the learned filters will be equal to the flipped learned filters of a CNN that uses convolution (assuming exactly the same training conditions for both: same initialization, inputs, number of epochs, etc.). So the outputs of the two CNNs will be the same for the same inputs.
The cross-correlation operation is slightly more intuitive and simpler to implement (because no flipping is performed), which is probably why frameworks like TensorFlow and PyTorch use it instead of actual convolution (they still call it convolution, though, probably for historical reasons or to stay consistent in terminology with frameworks that use actual convolution).
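A quick sketch of the equivalence with SciPy (the image and kernel values are arbitrary): convolution is cross-correlation with the kernel flipped both vertically and horizontally.

```python
import numpy as np
from scipy.signal import convolve2d, correlate2d

image = np.random.rand(5, 5)
kernel = np.array([[1., 2., 3.],
                   [0., 1., 0.],
                   [3., 2., 1.]])

conv = convolve2d(image, kernel, mode='valid')
# Cross-correlating with the kernel flipped in both dimensions reproduces
# the convolution exactly.
corr_flipped = correlate2d(image, kernel[::-1, ::-1], mode='valid')

print(np.allclose(conv, corr_flipped))   # True
```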
To add to the answer above: even though we say we are using the "convolution" operation in CNNs, we actually use cross-correlation. The two operations produce different outputs, and I did this exercise to see the difference for myself.
When I researched the topic further, I found that CNNs are generally considered to originate from the Neocognitron paper, which used true convolution in its kernel operation. Later CNN implementations and most deep learning libraries use correlation filtering instead of convolutional filtering, but we kept the name Convolutional Neural Networks, since the complexity and performance of the algorithm remain essentially the same.
If you want a detailed article and intuition on how both differ, please take a look at this: https://towardsdatascience.com/does-cnn-represent-convolutional-neural-networks-or-correlational-neural-networks-76c1625c14bd
Let me know if this helps. :)

Real running time calculation of matrix multiplication [closed]

Closed. This question needs debugging details. It is not currently accepting answers.
Closed 6 years ago.
I would like to calculate approximately the running time of a matrix multiplication problem. Below are my assumptions:
No parallel programming
A 2 GHz CPU
A square matrix of size n
An O(n^3) algorithm
For example, suppose that n = 1000. Approximately how much time should I expect squaring this matrix to take under the above assumptions?
Thanks.
This depends terribly on the algorithm and the CPU. Even without parallelization, there's a lot of freedom in how the same steps map onto a CPU, and there are differences (in the clock cycles needed for various operations) even between CPUs of the same family. Don't forget, either, that modern CPUs add some instruction-level parallelism of their own. Compiler optimization will make a difference by reordering memory accesses and branches, and will likely convert instructions to vectorized ones even if you didn't ask for that. Depending on further factors it may also matter whether your matrices sit at a fixed location in memory or are accessed through a pointer, and whether they are allocated with a fixed size or each row/column dynamically. Don't forget about memory caching, page invalidations, and operating-system scheduling either, as I did in previous versions of this answer.
If this is for your own rough estimate or for a "typical" case, you won't go far wrong by just writing the program, running it under your specific conditions (as discussed above) many times for n = 1000, and averaging the results.
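For example, a rough benchmark sketch along those lines (numpy's matmul stands in for "the program" here; note it delegates to an optimized BLAS rather than a textbook triple loop, which is exactly the kind of real-world detail discussed above):

```python
import time
import numpy as np

def benchmark_matmul(n=1000, repeats=5):
    """Time an n x n matrix product a few times and report the average.

    The naive operation count is n^3 multiply-adds (1e9 for n = 1000); at
    2 GHz and one operation per cycle that would suggest roughly 0.5 s, but a
    real measurement can differ by an order of magnitude in either direction
    (SIMD, caches, memory bandwidth, the BLAS implementation, ...).
    """
    a = np.random.rand(n, n)
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        np.matmul(a, a)                 # "the square of this matrix"
        timings.append(time.perf_counter() - start)
    return sum(timings) / len(timings)

print(f"average over 5 runs: {benchmark_matmul():.3f} s")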
If you want a lot of hard work for a worse result, you can actually do what you probably meant to do in your original question yourself:
see what instructions your specific compiler produces for your specific algorithm under your specific conditions and with specific optimization settings (like here)
pick your specific processor and find its latency table for every instruction that's there,
add them up per iteration and multiply by 1000^3,
divide by the clock frequency.
Seriously, it's not worth the effort; a benchmark is faster, clearer, and more precise anyway (the instruction-counting approach does not account for what happens in the branch predictor, hyperthreading, memory caching, and other architectural details). If you want it as an exercise, I'll leave that to you.

Genetic Programming - algorithms for population initialization [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 years ago.
Can someone provide me some pointers on population initialization algorithms for genetic programming?
I already know about Grow, Full, and Ramped half-and-half (taken from "A Field Guide to Genetic Programming"), and I have seen one newer algorithm, Two Fast Tree-Creation (haven't read the paper yet).
The initial population plays an important role in heuristic algorithms such as GAs, as it helps decrease the time those algorithms need to reach an acceptable result. Furthermore, it may influence the quality of the final answer given by evolutionary algorithms. (http://arxiv.org/pdf/1406.4518.pdf)
Since you already know about Koza's different population methods, also remember that none of these algorithms is 100% random (nor can it be, being an algorithm), so to some degree you can predict what the next generated value will be.
Another method you could potentially use is uniform initialisation (refer to the free PDF of "A Field Guide to Genetic Programming"). The motivation is that, once a population is created, parts of the initial syntax trees can be lost within a few generations due to crossover and selection. Langdon (2000) came up with the idea of a ramped uniform distribution, which effectively lets the user specify the range of sizes a tree may have; if a tree is generated in the search space that doesn't fall within that size range, it is automatically discarded, regardless of its fitness value. From there, the ramped uniform distribution creates an equal number of trees across the range you have chosen, all of which are random, unique combinations of the functions and terminal values you are using (again, refer to "A Field Guide to Genetic Programming" for more detail).
This method can be quite useful for sampling when the desired solutions are asymmetric rather than symmetric (which is what ramped half-and-half deals with).
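For reference, a compact sketch of the Grow/Full generators behind ramped half-and-half (the function and terminal sets are placeholders; a size filter in the spirit of the ramped uniform scheme above would simply discard trees whose size falls outside the desired range):

```python
import random

FUNCTIONS = {'+': 2, '-': 2, '*': 2}     # symbol -> arity (placeholder set)
TERMINALS = ['x', 'y', '1']

def gen_tree(max_depth, method='grow'):
    """Build a random expression tree as nested tuples.
    'full' expands every branch to max_depth; 'grow' may stop early."""
    if max_depth == 0 or (method == 'grow' and random.random() < 0.3):
        return random.choice(TERMINALS)
    func = random.choice(list(FUNCTIONS))
    return (func,) + tuple(gen_tree(max_depth - 1, method)
                           for _ in range(FUNCTIONS[func]))

def ramped_half_and_half(pop_size, min_depth=2, max_depth=6):
    """Half the trees via 'grow', half via 'full', with depths ramped over a range."""
    depths = list(range(min_depth, max_depth + 1))
    population = []
    for i in range(pop_size):
        method = 'grow' if i % 2 == 0 else 'full'
        population.append(gen_tree(depths[i % len(depths)], method))
    return population

print(ramped_half_and_half(4))
```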
Other recommended reading on population initialisation:
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.50.962&rep=rep1&type=pdf
I think that will depend on the problem you want to solve. For example, I'm working on a TSP, and my initial population is generated using a simple greedy technique. Sometimes you need to create only feasible solutions, so you have to build a mechanism for doing that. Usually you will find papers about your specific problem and how to create initial solutions for it. Hope this helps.
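For instance, a minimal sketch of that kind of greedy (nearest-neighbour) seeding for a TSP, with a random placeholder distance matrix and the rest of the population filled in randomly:

```python
import random

def greedy_tour(dist, start=0):
    """Nearest-neighbour construction: repeatedly visit the closest unvisited city."""
    n = len(dist)
    tour, unvisited = [start], set(range(n)) - {start}
    while unvisited:
        current = tour[-1]
        tour.append(min(unvisited, key=lambda city: dist[current][city]))
        unvisited.remove(tour[-1])
    return tour

# Seed part of the initial population greedily, the rest randomly.
n = 10
dist = [[abs(i - j) + random.random() for j in range(n)] for i in range(n)]
population = [greedy_tour(dist, start=s) for s in range(5)]
population += [random.sample(range(n), n) for _ in range(5)]
print(population[0])
```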

Algorithm to determine if audio is music [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
I'm not entirely sure this is the correct stack exchange subsite to post this question to, but...
I'm looking for an algorithm that I can use to determine with a decent amount of certainty if a given piece of audio is music or not. Just a boolean result is fine, I don't need to know the key, bpm or anything like that, I just need to be able to determine if it appears to be music (as opposed to speech). Programming language is irrelevant, but I'll end up converting it to Python.
In a phrase, Fourier analysis. Look at the power of different frequencies over time. Here's speech, and here's violin playing. The former shows dramatic changes with every syllable; the 'flow' is very disjoint and could be picked up by an algorithm which took the derivative of the different frequency bands as a function of time. In paradigmatic music, on the other hand, the transitions are much smoother and the tones are purer (less 'blur' in the graph). See also the 'spectrogram' wikipedia page.
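A rough sketch of that derivative-of-the-spectrogram idea (the frame size, hop, and decision threshold are arbitrary assumptions; a real classifier would calibrate the threshold on labelled data):

```python
import numpy as np

def mean_spectral_flux(signal, frame=2048, hop=512):
    """Average frame-to-frame change of the magnitude spectrum; speech tends to
    show larger, more abrupt changes than sustained musical tones."""
    window = np.hanning(frame)
    frames = [signal[i:i + frame] * window
              for i in range(0, len(signal) - frame, hop)]
    mags = [np.abs(np.fft.rfft(f)) for f in frames]
    flux = [np.mean((mags[i] - mags[i - 1]) ** 2) for i in range(1, len(mags))]
    return float(np.mean(flux))

def looks_like_music(signal, threshold=1.0):
    """Crude boolean decision; the threshold is a stand-in that would need
    calibration on labelled speech and music clips."""
    normalised = signal / (np.max(np.abs(signal)) + 1e-12)
    return mean_spectral_flux(normalised) < threshold

# Usage sketch (hypothetical file): sr, audio = scipy.io.wavfile.read("clip.wav")
# print(looks_like_music(audio.astype(float)))
```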
What you could do is set up a few Karplus-Strong resonance rings, run the audio through them, and just monitor the level of energy in each ring.
If it is Western music, it is pretty much all tuned to 12-TET, i.e. a logarithmic 12-tone scale based around concert pitch A4 = 440 Hz.
So just pick 3 or 4 notes roughly equally spaced through the octave, e.g. C5 (omit C# D D#), E5 (omit F F# G), G#5 (omit A A# B).
At least one of those rings will be flaring regularly -- whichever key the music is in, it's probably going to hit one of those notes quite a lot.
Ideally you would do it for a bunch of notes, but if you need this in real time it can get a bit heavy feeding your audio simultaneously into 50 rings.
Alternatively, you could use a pitch detector, catalogue the detected pitches, and look at the ratios of noteAfreq to noteBfreq (equivalently, differences of their logs) to see whether they arrange themselves into low-order fractions like 3:4, to within say 0.5%. But I don't think anyone has built a decent polyphonic pitch detector -- it is practically impossible.
Melodyne might have pulled it off
If it's just a vocal signal you can e-mail me.
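For what it's worth, here is a rough sketch of the energy-per-note idea, using Goertzel filters as a stand-in for the Karplus-Strong rings (a different filter, but used here for the same purpose: measuring energy near a few chosen 12-TET note frequencies over time):

```python
import numpy as np

def goertzel_power(frame, freq, sr):
    """Power of `frame` at a single frequency (Goertzel algorithm)."""
    k = 2 * np.pi * freq / sr
    coeff = 2 * np.cos(k)
    s_prev = s_prev2 = 0.0
    for x in frame:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2

# A few 12-TET note frequencies around C5/E5/G#5, as suggested above.
NOTE_FREQS = {'C5': 523.25, 'E5': 659.26, 'G#5': 830.61}

def note_energy_track(signal, sr, frame=4096, hop=2048):
    """Per-frame energy near each chosen note; music tends to make some of
    these bands 'flare' repeatedly, while speech or noise does so far less."""
    tracks = {name: [] for name in NOTE_FREQS}
    for i in range(0, len(signal) - frame, hop):
        chunk = signal[i:i + frame]
        for name, f in NOTE_FREQS.items():
            tracks[name].append(goertzel_power(chunk, f, sr))
    return tracks
```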
For some reason this question has attracted a large number of really bad answers.
Use pyAudioAnalysis. Also, Google "audio feature analysis".
On its surface, this sounds like a hard problem, but there's been an explosion of great work on classifiers in the past 20 years, so many well-documented solutions exist. Most classifiers today can figure this out with an error rate of only a few percent. Some classifiers can even figure out what genre of music it is.
Most current algorithms for this break down into extracting a set of statistical representations of the input audio (features) and then automatically classifying the input based on previous training data.
pyAudioAnalysis is one library for extracting these features and then training a kNN or other model on them. There are many comparable libraries, such as Essentia for C++ (which also has Python bindings).
An Introduction to Audio Content Analysis: Applications in Signal Processing and Music Informatics is a good introductory book.
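A bare-bones sketch of that features-plus-classifier pipeline, using two hand-rolled features and scikit-learn's kNN rather than pyAudioAnalysis (a real system would use MFCCs and a much richer feature set; the training data is only indicated in comments and would have to come from labelled clips):

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def clip_features(signal, frame=2048, hop=1024):
    """Two crude per-clip features: mean zero-crossing rate and mean spectral flux."""
    frames = [signal[i:i + frame] for i in range(0, len(signal) - frame, hop)]
    zcr = [np.mean(np.abs(np.diff(np.sign(f)))) / 2 for f in frames]
    mags = [np.abs(np.fft.rfft(f * np.hanning(frame))) for f in frames]
    flux = [np.mean((mags[i] - mags[i - 1]) ** 2) for i in range(1, len(mags))]
    return [float(np.mean(zcr)), float(np.mean(flux))]

# Hypothetical training data: lists of labelled music and speech recordings.
# X = [clip_features(clip) for clip in train_clips]
# clf = KNeighborsClassifier(n_neighbors=5).fit(X, labels)   # labels: "music"/"speech"
# print(clf.predict([clip_features(unknown_clip)]))
```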
Look for a small "first differential" over a sequence of FFTs in the range of musical tones (i.e. run 1024-sample chunks through an FFT, then plot chunk1-chunk0, chunk2-chunk1, ...). As a first approximation, this should be enough to detect simple cases.
This is the sort of algorithm that could be tweaked forever, even in genre-specific ways. Music itself is generally periodic as well, so you could come up with a way to run FFTs over the FFTs. And the idea of looking for a consistent twelfth-root-of-two spacing between the prominent frequencies sounds really plausible.
I bet you were hoping to find this sitting in an free Python library for you to simply drop a file into. :-)

Simple tool for shipping dates estimates without uncertainty [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 8 years ago.
What I'm looking for is very simple: I want a tool that computes the calculated shipping date (as opposed to one estimated from confidence intervals) given a list of tasks, each with a total estimate and current progress, without introducing further uncertainty, since I want to handle that externally.
I want it to take workday durations, user-entered holidays, and so on into account.
I know FogBugz's Evidence-Based Scheduling does something very close to that, but I would like it without the statistical aspect and the associated confidence intervals. I'm aware this is a drastic simplification and that statistical estimation is the essence of EBS, but I'm not looking for a subjective discussion here; I just want to be able to access this simple piece of information (the supposedly exact shipping date) at any given time during the project without having to calculate it myself.
So I'm looking for one of three things: 1) a way to customize FogBugz (6.0) to show the information I want alongside the confidence intervals, 2) a way to customize FogBugz to set estimate uncertainty to zero, or 3) another (free) tool that does exactly what I want.
EDIT: By "supposedly exact" or "calculated", I don't mean with respect to what is actually going to happen, that would indeed be trying to predict the future. I mean with respect to the information that was input, together with its obvious uncertainty. In that case, I guess estimates for individual tasks should be more seen as spending limits or upper bounds. The information I would like to be able to compute is really very simple : if everything goes exactly as specified, where does it take us ? Then, with information about how the estimates were made, such as the ability of each individual developper to make good estimates, I can derive the confidence interval. EBS does this automatically and, undoubtebly, very well which is why I use it. What I would like is to obtain is one more little piece of information, ie the same starting point EBS uses and try to play with my own asumptions as to how the statistical estimation should be made.
FogBugz will show you the sum of estimates at the bottom of the LIST page, labelled "Total estimated time remaining". This is the raw sum of estimates, without any EBS calculations.
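If you do want to compute that raw "calculated" date yourself, here is a minimal sketch with numpy's business-day helpers (the hours-per-day figure, the dates, and the holiday list are placeholder assumptions you would supply):

```python
import numpy as np

def calculated_ship_date(start, remaining_hours, hours_per_day=8, holidays=()):
    """Sum the remaining estimates, convert to workdays, and walk forward over
    business days, skipping weekends and the supplied holidays. No uncertainty
    is modelled: if every estimate is exactly right, this is the ship date."""
    workdays = int(np.ceil(sum(remaining_hours) / hours_per_day))
    return np.busday_offset(np.datetime64(start), workdays,
                            roll='forward', holidays=list(holidays))

# Example: three remaining tasks (32 hours total) starting on a Friday,
# with one holiday in the way.
print(calculated_ship_date('2013-03-01', [16, 6, 10],
                           holidays=['2013-03-06']))   # -> 2013-03-08
```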
You can't predict the future. So any calculated shipping date can only be a guess. That guess depends on the confidence intervals around each individual number that went into it. This is a matter of definition -- even though you may not like it.
You may want to have a "100% confident" date, but such a thing (by definition) cannot exist. You cannot have an uncertainty of zero unless you want a date infinitely far in the future. It's the nature of statistics: the distribution is actually infinite, but data is considerably more likely to cluster around the mean.
The only thing you can do is pick a really big confidence interval (99.7%). You are free to ignore the supporting statistical facts about the confidence interval and pretend it has zero uncertainty. For all practical purposes 0.3% uncertainty is small enough that you're not going to be unhappy with that date.
However, all statistically-based predictions of the future must have uncertainty. It's a law.
