Let's say some researchers have figured out a way to analyze data and have developed an algorithm for it. At this point, the algorithm is only described in a book, using lots of mathematical formulas.
Now the algorithm needs to be implemented in software. The developer can read the formulas and start translating, e.g., Sum(f(x)) for x = 1..n (TeX doesn't seem to be allowed here) into a for loop.
Depending on how the developer converts the formula into code, there might be overflows or truncation in floating-point operations. If the developer does not know much about real-world input values, unit tests might not detect those issues. However, in some cases this can be avoided just by reordering the terms or simplifying the expression.
I wonder who is responsible for the precision of the output. Is it the mathematician or the developer? The mathematician might not know enough about computer number formats, while the developer might not know enough about mathematics to restructure the formula.
A simple example:
Take the binomial coefficient "n choose k", which translates to n! / (k! (n-k)!).
A simple implementation would probably use the factorial function and then input the numbers directly (pseudo code):
result = fac(n) / (fac(k) * fac(n-k))
This can lead to overflows for larger n. Knowing that, one could divide n! by k! first and do (pseudo code):
result = 1
for (i = k+1 to n) result *= i
result = result / fac(n-k)
which is a) faster because it needs fewer calculations and b) far less prone to overflow, because the intermediate product n!/k! is much smaller than n!.
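For illustration, here is a runnable Python sketch of both variants (the function names binom_naive and binom_incremental are mine). Python integers never overflow, but the bit lengths of the intermediate values show which variant would overflow first in a fixed-width integer type:

    from math import factorial as fac

    def binom_naive(n, k):
        # n! / (k! (n-k)!): the intermediate fac(n) is enormous for large n
        return fac(n) // (fac(k) * fac(n - k))

    def binom_incremental(n, k):
        # Multiply and divide alternately so the intermediate value never
        # grows far beyond the final result C(n, k).
        k = min(k, n - k)
        result = 1
        for i in range(1, k + 1):
            # exact: the product of i consecutive integers is divisible by i!
            result = result * (n - k + i) // i
        return result

    n, k = 62, 31
    print(binom_naive(n, k) == binom_incremental(n, k))   # True
    print(fac(n).bit_length())                            # hundreds of bits: far beyond a 64-bit integer
    print(binom_incremental(n, k).bit_length())           # ~59 bits: still fits a 64-bit integer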
This science is called numerical analysis: http://en.wikipedia.org/wiki/Numerical_analysis
In my understanding, the analysis is on the mathematician's side, but it is the responsibility of the programmer to know that the problem exists and to look for the correct, well-known solutions (like using a Runge-Kutta integrator instead of a simple Euler one).
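As a hedged illustration of that last point, here is a minimal Python sketch (mine, not the answerer's) comparing a simple Euler step with a classical fourth-order Runge-Kutta step on the test equation y' = -y:

    import math

    def euler_step(f, t, y, h):
        # first-order method: local error O(h^2)
        return y + h * f(t, y)

    def rk4_step(f, t, y, h):
        # classical fourth-order Runge-Kutta: local error O(h^5)
        k1 = f(t, y)
        k2 = f(t + h / 2, y + h / 2 * k1)
        k3 = f(t + h / 2, y + h / 2 * k2)
        k4 = f(t + h, y + h * k3)
        return y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

    f = lambda t, y: -y              # y' = -y, exact solution y(t) = exp(-t)
    h, steps = 0.1, 10
    y_euler = y_rk4 = 1.0
    for i in range(steps):
        t = i * h
        y_euler = euler_step(f, t, y_euler, h)
        y_rk4 = rk4_step(f, t, y_rk4, h)

    exact = math.exp(-1.0)
    print(abs(y_euler - exact))      # roughly 2e-2
    print(abs(y_rk4 - exact))        # roughly 3e-7: far smaller error at the same step size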
Short answer: developer.
An algorithm (or just a formula) manipulates arbitrary-precision real numbers as pure mathematical objects.
Code (based on the formula) works on real hardware and must overcome its limitations (which depend on that hardware) by using more complex code.
Example: the formula f(x, y) = x * y may lead to very complex source code (if x and y are 64-bit floating-point numbers and your hardware is an 8-bit microcontroller without FPU support and without an integer MUL instruction).
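To make that concrete, here is a rough Python sketch (mine, not from the answer) of the shift-and-add fallback a compiler or runtime library has to use when the hardware has no multiply instruction; a full soft-float multiply for 64-bit doubles is far longer still:

    def mul_shift_add(a: int, b: int) -> int:
        # Multiply two non-negative integers using only shifts and adds,
        # the way it must be done on a CPU without a MUL instruction.
        result = 0
        while b:
            if b & 1:        # lowest bit of b set -> add the current shifted a
                result += a
            a <<= 1          # a * 2
            b >>= 1          # move on to the next bit of b
        return result

    print(mul_shift_add(1234, 5678) == 1234 * 5678)   # True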
To all HE experts out there:
I want to implement a matrix-vector multiplication with very large matrices (600000 x 55). Currently I am able to perform HE operations like addition, multiplication, inner product, etc. with small inputs. When I try to apply these operations to larger inputs, I get errors like "invalid next size (normal)", or I run out of main memory until the OS kills the process (exit code 9).
Do you have any recommendations or examples of how to achieve an efficient implementation of a matrix-vector multiplication or something similar? (Using BFV and CKKS.)
PS: I am using the PALISADE library, but if you have better suggestions like SEAL or HElib I would happily use them as well.
CKKS, which is also available in PALISADE, would be a much better option for your scenario as it supports approximate (floating-point-like) arithmetic and does not require high precision (large plaintext modulus). BFV performs all operations exactly (mod plaintext modulus). You would have to use a really large plaintext modulus to make sure your result does not wrap around the plaintext modulus. This gets much worse as you increase the depth, e.g., two chained multiplications.
For matrix-vector multiplication, you could use the techniques described in https://eprint.iacr.org/2019/223, https://eprint.iacr.org/2018/254, and the supplemental information of https://eprint.iacr.org/2020/563. The main idea is to choose the right encoding and take advantage of SIMD packing. You would work with a power-of-two vector size and could pack the matrix either as 64xY (multiple rows) per ciphertext or a part of each row per ciphertext, depending on which one is more efficient.
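Here is a plain-NumPy sketch of the packing idea (no HE library calls; the slot count, block width, and names are mine): pad the 55-entry rows to a power of two, pack several rows into one slot vector, multiply elementwise by the replicated input vector, and collapse each block with the usual rotate-and-sum trick (np.roll stands in for a ciphertext rotation):

    import numpy as np

    ROW_WIDTH = 64                    # next power of two above 55 columns
    SLOTS = 4096                      # assumed SIMD slot count of one ciphertext
    ROWS_PER_CT = SLOTS // ROW_WIDTH  # 64 matrix rows per ciphertext

    rng = np.random.default_rng(0)
    A = rng.standard_normal((ROWS_PER_CT, 55))   # one ciphertext's worth of rows
    v = rng.standard_normal(55)

    # Pack: pad each row to ROW_WIDTH and lay the rows out contiguously in the slots.
    A_pad = np.zeros((ROWS_PER_CT, ROW_WIDTH))
    A_pad[:, :55] = A
    packed_matrix = A_pad.reshape(SLOTS)
    packed_vector = np.tile(np.pad(v, (0, ROW_WIDTH - 55)), ROWS_PER_CT)

    # One elementwise multiplication, then log2(ROW_WIDTH) rotations to sum each block.
    prod = packed_matrix * packed_vector
    shift = ROW_WIDTH // 2
    while shift >= 1:
        prod = prod + np.roll(prod, -shift)       # rotate-and-sum within each block
        shift //= 2

    result = prod[::ROW_WIDTH]                    # first slot of each block = one dot product
    print(np.allclose(result, A @ v))             # True

The real 600000-row matrix would be split across many such ciphertexts, and with CKKS the results are approximate rather than exact.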
I would like to estimate, approximately, the running time of a matrix multiplication problem. Below are my assumptions:
No parallel programming
A 2 GHz CPU
A square matrix of size n
An O(n^3) algorithm
For example, suppose that n = 1000. How much time (approximately) should I expect squaring this matrix to take under the above assumptions?
Thanks.
This depends terribly on the algorithm and the CPU. Even without parallelization, there is a lot of freedom in how the same steps are represented on a CPU, and there are differences (in clock cycles needed for various operations) between different CPUs of the same family, too. Don't forget, either, that modern CPUs add some instruction-level parallelism of their own. Compiler optimization will make a difference by reordering memory accesses and branches, and will likely convert instructions to vectorized ones even if you didn't ask for that. Depending on further factors it may also matter whether your matrices sit at a fixed location in memory or are accessed through a pointer, and whether they are allocated with a fixed size or each row/column dynamically. Don't forget about memory caching, page invalidations, and operating-system scheduling, as I did in previous versions of my answer.
If this is for your own rough estimate or for a "typical" case, you won't go far wrong by just writing the program, running it under your specific conditions (as discussed above) many times for n = 1000, and calculating the average.
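A minimal benchmarking sketch in Python (mine; the naive triple loop stands in for whatever O(n^3) implementation you actually use): time a few smaller sizes and extrapolate with the cubic growth instead of waiting for n = 1000 in an interpreted language:

    import time

    def matmul_naive(A, B):
        # Textbook O(n^3) triple loop (i, k, j order for slightly better locality).
        n = len(A)
        C = [[0.0] * n for _ in range(n)]
        for i in range(n):
            row_c = C[i]
            for k in range(n):
                a_ik = A[i][k]
                row_b = B[k]
                for j in range(n):
                    row_c[j] += a_ik * row_b[j]
        return C

    def time_square(n, reps=3):
        A = [[float(i + j) for j in range(n)] for i in range(n)]
        best = float("inf")
        for _ in range(reps):
            t0 = time.perf_counter()
            matmul_naive(A, A)
            best = min(best, time.perf_counter() - t0)
        return best

    # Extrapolate with t(n) ~ c * n^3; the constant c is what the benchmark measures.
    for n in (100, 200, 300):
        t = time_square(n)
        print(f"n={n}: {t:.2f} s, extrapolated to n=1000: {t * (1000 / n) ** 3:.0f} s")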
If you want to put in a lot of hard work for a worse result, you can actually do what you probably meant to do in your original question yourself:
see what instructions your specific compiler produces for your specific algorithm under your specific conditions and with specific optimization settings (like here)
pick your specific processor and find its latency table for every instruction that's there,
add them up per iteration and multiply by 1000^3,
divide by the clock frequency.
Seriously, it's not worth the effort; a benchmark is faster, clearer, and more precise anyway (as this approach does not account for what happens in the branch predictor, hyperthreading, memory caching, and other architectural details). If you want an exercise, I'll leave that to you.
I'm looking for pointers to guide me in the right direction in the construction of an algorithm.
The situation is simple: there are multiple bits of information that may indicate an individual's geo location. For instance, recent IP addresses, the TLD of an email address, or explicitly provided information such as a town or postal code.
These bits of information may or may not be present, and they have varying levels of accuracy (a postal code is more accurate than a national TLD) and reliability (an IP address may be more reliable than a postal code, even though the postal code is more accurate). Also, information may suffer from aging.
I'm looking to create an algorithm that attempts to determine the most likely location based on this information. I've got several ideas on how to solve this, mostly involving pre-determined, calculated scores for accuracy and reliability, but it's pretty easy to poke holes in that approach.
Are there any algorithms that handle this particular problem or similar ones? Perhaps algorithms that deal with data reliability/accuracy in general, or actual statistical data on the reliability/accuracy of geo-information?
You want to find the most likely location L, given some piece of Information I. That is, you want to maximize the conditional probability
P(L|I) -> max
Because this function P(L|I) is hard to estimate, one typically applies Bayes' theorem here:
P(L|I) = P(I|L)*P(L) / P(I)
The denominator P(I) is the probability of that information I. Since this information is fixed, this term is constant and not of interest for finding the maximum above. P(L) is the unconditional probability of a certain location. Something like the population density at this place might be a good estimate for that. Finally, you need a model for P(I|L), the probability of getting I given location L. For multiple pieces of information this would be the product of the individual probabilities:
P(I|L) = P(I1|L)*P(I2|L)*...
This works if the individual pieces I1, I2, ... are conditionally independent given the location L, which seems to be the case here. As an example, the likelihood of a certain postal code and the likelihood of some cell tower are generally strongly correlated, but as soon as we assume a specific location L the postal code does not influence the likelihood of a cell tower anymore.
Those individual probabilities P(I1|L), ... represent the reliability and accuracy of the information and must be provided externally. You have to come up with some assumptions here. As a general rule, when in doubt you had better be pessimistic about the reliability and accuracy of the information.
If you are too pessimistic, your result will be somewhat off, but if you are too optimistic, your result can go completely wrong very quickly. Another thing you need to keep in mind is the feasibility of the maximization: a very accurate model for P(I1|L) is useless if the effort to find the maximum becomes too high. Generally, picking smooth functions for the models simplifies the optimization in the end.
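A minimal sketch of that MAP estimate in Python, assuming a small discrete set of candidate locations and hand-made likelihood tables (all names and numbers below are invented for illustration):

    from math import prod

    # Candidate locations with a prior P(L), e.g. proportional to population.
    prior = {"Amsterdam": 0.50, "Rotterdam": 0.35, "Utrecht": 0.15}

    # One table of P(I_k | L) per piece of evidence; the values encode how
    # reliable/accurate each piece of information is assumed to be.
    likelihoods = [
        {"Amsterdam": 0.05, "Rotterdam": 0.80, "Utrecht": 0.15},  # IP geolocates near Rotterdam
        {"Amsterdam": 0.60, "Rotterdam": 0.25, "Utrecht": 0.15},  # postal code says Amsterdam
    ]

    def map_location(prior, likelihoods):
        # argmax_L P(L) * prod_k P(I_k | L); the constant denominator P(I) is dropped.
        scores = {loc: prior[loc] * prod(t[loc] for t in likelihoods) for loc in prior}
        return max(scores, key=scores.get), scores

    best, scores = map_location(prior, likelihoods)
    print(best)     # the most likely location for this invented evidence
    print(scores)   # unnormalized posterior scores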
I understand that using floating point makes algorithms faster and lets them use less storage space, and that these would have been critical features for software running on the hardware of previous decades, but is this still an important feature? If calculations were done with exact rational arithmetic, there would be no rounding errors at all, which would simplify many algorithms, as you would no longer have to worry about catastrophic cancellation or anything like that.
Floating point is much faster than arbitrary-precision and symbolic packages, and 12-16 significant figures is usually plenty for demanding science/engineering applications where non-integral computations are relevant.
The programming language ABC used rational numbers (x / y where x and y were integers) wherever possible.
Sometimes calculations would become very slow because the numerator and denominator had become very big.
So it turns out that it's a bad idea if you don't put some kind of limit on the numerator and denominator.
In the vast majority of computations, the size of the numbers required to compute answers exactly would quickly grow beyond the point where the computation is worth the effort, and in many calculations it would grow beyond the point where exact calculation is practical at all. Consider even something like a simple third-order IIR filter: with typical coefficients taken from measured (floating-point) values, a dozen iterations already produce exact denominators hundreds of bits long, and running the algorithm for a few thousand iterations (hardly an unusual workload) makes every number, and therefore every further operation, enormously larger and slower.
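You can watch this happen with Python's exact fractions module; the filter coefficients below are arbitrary (converted from doubles, so they start out with roughly 53-bit denominators), and the exact denominator then grows with every further iteration:

    from fractions import Fraction

    # An arbitrary third-order IIR filter, run in exact rational arithmetic.
    # Converting the double coefficients to Fractions gives them ~53-bit denominators.
    a = [Fraction(0.4), Fraction(-0.3), Fraction(0.15)]
    x = Fraction(1)                 # constant input signal
    y = [Fraction(0)] * 3           # output history

    for step in range(1, 101):
        y_new = x + a[0] * y[0] + a[1] * y[1] + a[2] * y[2]
        y = [y_new, y[0], y[1]]
        if step in (12, 50, 100):
            print(step, y_new.denominator.bit_length(), "bits in the exact denominator")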
Many numerical algorithms still require fixed-precision numbers in order to perform well enough. Such calculations can be implemented in hardware because the numbers fit entirely in registers, whereas arbitrary-precision calculations must be implemented in software, and there is a massive performance difference between the two. Ask anybody who crunches numbers for a living whether they'd be OK with things running X times slower, and they will probably say "no, that's completely unworkable."
Also, I think you'll find that having arbitrary precision is impractical and even impossible. For example, the number of decimal places can grow fast enough that you'll want to drop some. And then you're back to square one: rounded number problems!
Finally, sometimes the digits beyond a certain precision do not matter anyway. For example, generally the number of significant digits should reflect the level of experimental uncertainty.
So, which algorithms do you have in mind?
Traditionally, integer arithmetic is easier and cheaper to implement in hardware (it uses less space on the die, so you can fit more units on there). Especially when you go into the DSP segment, this can make a lot of difference.
Who knows the most robust algorithm for a chromatic instrument tuner?
I am trying to write an instrument tuner. I have tried the following two algorithms:
An FFT to create a Welch periodogram and then detect the peak frequency
A simple autocorrelation (http://en.wikipedia.org/wiki/Autocorrelation)
I encountered the following basic problems:
Accuracy 1: in an FFT, the relation between sample rate, recording length, and bin size is fixed. This means that I need to record 1-2 seconds of data to get an accuracy of a few cents. This is not exactly what I would call realtime.
Accuracy 2: autocorrelation works a bit better. To get the needed accuracy of a few cents I had to introduce linear interpolation of samples.
Robustness: in the case of a guitar I see a lot of overtones. Some overtones are actually stronger than the fundamental produced by the string. I could not find a robust way to detect which string was played.
Still, any cheap electronic tuner works more robustly than my implementation.
How are those tuners implemented?
You can interpolate FFTs as well, and you can often use the higher harmonics for increased precision. You need to know a little bit about the harmonics the instrument produces, and it's easier if you can assume you're less than half an octave off target, but even without that, the fundamental frequency is usually much stronger than the first subharmonic and is not that far below the primary harmonic. A simple heuristic should let you pick the fundamental frequency.
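As a hedged sketch of that interpolation idea (mine, not the answerer's code): find the peak FFT bin, then refine it with a parabolic fit on the log magnitude of the neighbouring bins:

    import numpy as np

    def refined_peak_hz(signal, sample_rate):
        # Dominant frequency: FFT peak bin plus parabolic interpolation.
        windowed = signal * np.hanning(len(signal))
        spectrum = np.abs(np.fft.rfft(windowed))
        k = int(np.argmax(spectrum[1:])) + 1                # skip the DC bin
        a, b, c = np.log(spectrum[k - 1 : k + 2] + 1e-12)   # three bins around the peak
        delta = 0.5 * (a - c) / (a - 2 * b + c)             # fractional bin offset in [-0.5, 0.5]
        return (k + delta) * sample_rate / len(signal)

    # Quick check with a synthetic 440.7 Hz tone, 0.2 s at 44.1 kHz.
    sr = 44100
    t = np.arange(int(0.2 * sr)) / sr
    tone = np.sin(2 * np.pi * 440.7 * t)
    print(refined_peak_hz(tone, sr))   # close to 440.7 Hz, even though the raw bin width is 5 Hz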
I doubt that the autocorrelation method will work all that robustly across instruments, but you should get a series of self-similarity scores that is highest when you're offset by one fundamental period. If you offset by two periods, you should get the same score again (to within noise and differential damping of the different harmonics).
There's a pretty cool algorithm called Bitstream Autocorrelation. It doesn't take many CPU cycles, and it's very accurate. You basically find all the zero-crossing points and save them as a binary string. Then you run autocorrelation on the string. It's fast because you can use XOR instead of floating-point multiplication.
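A toy Python sketch of that idea (mine, and much simplified compared to the published bitstream-autocorrelation algorithm): turn the signal into a bitstream of its sign, then count mismatches (XOR) at each lag and take the lag with the lowest mismatch rate as the period:

    import numpy as np

    def bitstream_period(signal, min_lag, max_lag):
        # Estimate the period in samples by XOR-correlating the sign bitstream.
        bits = signal > 0.0                      # one bit per sample: is the waveform positive?
        n = len(bits)
        best_lag, best_rate = min_lag, 1.0
        for lag in range(min_lag, max_lag):
            diff = bits[:n - lag] ^ bits[lag:]   # XOR: 1 where the shifted streams disagree
            rate = np.count_nonzero(diff) / (n - lag)
            if rate < best_rate:
                best_lag, best_rate = lag, rate
        return best_lag

    # Quick check with a synthetic 110 Hz tone plus one overtone, at 44.1 kHz.
    sr = 44100
    t = np.arange(4096) / sr
    tone = np.sin(2 * np.pi * 110 * t) + 0.3 * np.sin(2 * np.pi * 220 * t)
    lag = bitstream_period(tone, sr // 1000, sr // 50)   # search roughly 50 Hz to 1 kHz
    print(sr / lag)                                      # close to 110 Hz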