How to know the compute time needed?

I am working on a project and would like to know if there's a way to roughly estimate the time required for the process to finish.
My very rough estimate of how many operations it would take is somewhere between 10^70 and 10^100. I know these are huge numbers, and I'm almost certain that my computer won't be able to finish in a short time period (or ever). How can I work out how long it would take? (This applies to any project; mine is more of an example.)

Something with that many operations is certainly looping through things, possibly in a multi-level nested fashion.
If you can find the run time of any of the loops, multiply it by the number of times it loops to get your answer.
Say one loop has 10^10 operations, or 10 billion operations. Say this takes 10 minutes. Well, you can do the math from here.
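A minimal sketch of that approach in Python, assuming the work per iteration is roughly constant: time a small sample of iterations, then scale linearly to the full count. The function `inner_step` is a hypothetical stand-in for the body of your innermost loop, and the sample size and total count are made up for illustration.

```python
import time

def estimate_total_seconds(work, sample_iterations, total_iterations):
    """Time `sample_iterations` calls of `work` and extrapolate linearly."""
    start = time.perf_counter()
    for _ in range(sample_iterations):
        work()
    elapsed = time.perf_counter() - start
    # Average cost of one iteration, scaled up to the full run.
    return elapsed / sample_iterations * total_iterations

def inner_step():
    # Hypothetical stand-in for the body of the innermost loop.
    sum(i * i for i in range(100))

# Time 10,000 iterations, then extrapolate to 10^10 of them.
seconds = estimate_total_seconds(inner_step, 10_000, 10**10)
print(f"estimated total: {seconds / 86_400:.1f} days")
```

The extrapolation only holds if later iterations cost about the same as the sampled ones; caching effects or input-dependent work can throw it off.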

Here you can find nicely explained material on the topic. This is a short version of the corresponding book chapter ("Algorithms 4th Edition" by Robert Sedgewick and Kevin Wayne, chapter "Analysis of Algorithms").

As a rule of thumb, modern computers should be able to do 10^8, or 100 000 000 more complex operations in a few seconds.
So if you program competitively, it shouldn't go much over that.
Also, this is just a rule of thumb: slow computers may do significantly worse, while there are chips (designed for mining) that can do much more, much faster.
Still, it's a good rule of thumb.
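To put the numbers from the question into perspective, here is a rough sketch that converts an operation count into years. The rate of 10^8 operations per second is an assumption (an optimistic reading of the rule of thumb above), not a measured figure.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def years_needed(operations, ops_per_second):
    """Convert an operation count into years at a given throughput."""
    return operations / ops_per_second / SECONDS_PER_YEAR

# Assumption: about 10^8 operations per second on one machine.
ASSUMED_RATE = 1e8

for exponent in (10, 20, 70, 100):
    years = years_needed(10.0 ** exponent, ASSUMED_RATE)
    print(f"10^{exponent} operations: about {years:.3g} years")
```

Even if the assumed rate is off by several orders of magnitude, 10^70 operations stays far beyond anything that can finish.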

Related

Benchmark by number of iterations per constant interval

For benchmarking the efficiency of different algorithms for a simple task and comparing them, the way I find most often is to set a constant number of times to iterate over the task, and measure the time interval spent for each algorithm.
But if the number of times is set too small, the interval difference among the algorithms will be too small, and may be masked by external factors. If you set the number of times too large, then it will take too much time to execute. So you have to guess the right number of times by trial and error.
Rather than doing it this way, I think it makes more sense to set a constant time interval that you want to run each algorithm, and then measure how many iterations can be made in that interval for each algorithm.
By doing so, the reliability of the benchmark will be more stable. With the conventional approach, the benchmark is only reliable for tasks that take long enough.
I haven't seen this way of benchmarking. Do people actually do it this way, and is there a benchmarking framework for this way of measurement? I am asking this as a non-language-specific question, but if there is such a framework, especially for Ruby, please point me to it. Or am I wrong about this idea?
I found this gem: benchmark/ips.
Take a look at perfer:
https://github.com/jruby/perfer
This has a couple of mechanisms including iterations/s. Don't worry that this is a jruby repo, it works on all Ruby implementations and was written as part of GSoC 2012.
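The fixed-interval idea from the question can be sketched in a few lines of Python. This is not how benchmark/ips or perfer are implemented; it is only an illustration, and the two string-building functions are hypothetical candidates to compare.

```python
import time

def iterations_per_second(func, duration=0.5):
    """Call `func` repeatedly for about `duration` seconds; return calls per second."""
    count = 0
    start = time.perf_counter()
    deadline = start + duration
    while time.perf_counter() < deadline:
        func()
        count += 1
    elapsed = time.perf_counter() - start
    return count / elapsed

def build_with_join():
    # Hypothetical candidate 1: build a string with str.join.
    return "".join(str(i) for i in range(200))

def build_with_plus():
    # Hypothetical candidate 2: build a string by repeated concatenation.
    text = ""
    for i in range(200):
        text += str(i)
    return text

for candidate in (build_with_join, build_with_plus):
    rate = iterations_per_second(candidate)
    print(f"{candidate.__name__}: {rate:,.0f} iterations/s")
```

Note that the clock is read once per iteration, so for very cheap functions the timer overhead becomes part of the measurement; calling the function in small batches between clock reads would reduce that.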

Memory and time issues when dividing two matrices

I have two sparse matrices in MATLAB:
M1 of size 9,000 x 1.8 million and M2 of size 1.8 million x 1.8 million.
Now I need to calculate the expression
M1/M2
and it took about an hour. Is that normal? Is there a more efficient way in MATLAB to get around this? It's a long time, and if I need a number of iterations, each one will keep taking an hour. Any suggestions?
A quick back-of-the-envelope calculation based on assuming some iterative method like conjugate gradient or Kaczmarz method is used, and plugging in the sizes makes me believe that an hour isn't bad.
Because of the tridiagonality of the matrix that's being "inverted" (if not explicitly), both of those methods are going to take a number of instructions near "some near-unity scalar factor" times ~9000 times 1.8e6 times "the number of iterations required for convergence". The product of the two things in quotes is probably around 50 (minimum) to around 1000 (maximum). I didn't cherry-pick these to make your math work; these are about what I'd expect from having done this before. If you assume about 1e9 instructions per second (which doesn't account much for memory access etc.) you get around 13 minutes to around 4.5 hours.
Thus, it seems in the right range for an algorithm that's exploiting sparsity.
Might be able to exploit it better yourself if you know the structure, but probably not by much.
Note, this isn't to say that 13 minutes is achievable.
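The arithmetic behind that range, written out as a short Python sketch. The factors 50 and 1000 and the 1e9 instructions per second are the assumptions stated above, not measurements.

```python
ROWS = 9_000                    # rows of M1
COLS = 1_800_000                # columns of M1, and the order of M2
INSTRUCTIONS_PER_SECOND = 1e9   # assumed throughput, ignoring memory access

def estimated_seconds(combined_factor):
    """combined_factor = near-unity scalar factor * iterations to converge."""
    return ROWS * COLS * combined_factor / INSTRUCTIONS_PER_SECOND

low = estimated_seconds(50)      # optimistic end of the range
high = estimated_seconds(1000)   # pessimistic end of the range
print(f"{low / 60:.1f} minutes to {high / 3600:.1f} hours")
```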
Edit: One side note, I'm not sure what's being used, but I assumed iterative methods. It's also possible that direct methods are used (like explained here). These methods can be very efficient for sparse systems if you exploit the sparsity right. It's very possible that Matlab is using these by default, but it's worth investigating what Matlab is doing in your case.
In my limited experience, iterative methods were usually preferred over direct methods as the size of the system gets large (yours is large). Our linear systems worked out to be block tridiagonal as well, as they often do in image processing.

Single-layer Perceptron

I'm building a single-layer perceptron that has a reasonably long feature vector (30-200k), all normalised.
Let's say I have 30k features which are somewhat useful at predicting a class but then add 100 more features which are excellent predictors. The accuracy of the predictions only goes up a negligible amount. However, if I manually increase the weights on the 100 excellent features (say by 5x), the accuracy goes up several percent.
It was my impression that the nature of the training process should give better features a higher weight naturally. However, it seems like the best features are being 'drowned out' by the worse ones.
I tried running it with a larger number of iterations, but that didn't help.
How can I adapt the algorithm to better weight features in a reasonably simple way? Also, a reasonably fast way; if I had fewer features it'd be easy to just run the algorithm leaving one out at a time but it's not really feasible with 30k.
My experience with implementing a perceptron-based network is that it takes a lot of iterations to learn something. I believe I used each sample about 1k times to learn the XOR function (with only 4 inputs). So if you have 200k inputs, it will take a lot of samples and a lot of time to train your network.
I have a few suggestions for you:
Try to reduce the size of the input (try to aggregate several inputs into a single one, or try to remove redundant ones).
Try to use each sample more times. As I said, it may take a lot of iterations to learn even a simple function.
Hope this helps.
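To illustrate the "use each sample more times" suggestion, here is a minimal single-layer perceptron sketch in plain Python. It is not the asker's model: the AND function is used as toy data because it is linearly separable, and the function names and learning rate are assumptions chosen for the example.

```python
def predict(weights, bias, features):
    """Step activation: 1 if the weighted sum is positive, else 0."""
    total = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if total > 0 else 0

def train_perceptron(samples, epochs, learning_rate=1.0):
    """Classic perceptron rule; each epoch is one pass over all samples."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        errors = 0
        for features, label in samples:
            delta = learning_rate * (label - predict(weights, bias, features))
            if delta != 0:
                errors += 1
                weights = [w + delta * x for w, x in zip(weights, features)]
                bias += delta
        if errors == 0:  # every sample classified correctly; stop early
            break
    return weights, bias

def accuracy(weights, bias, samples):
    hits = sum(predict(weights, bias, x) == y for x, y in samples)
    return hits / len(samples)

# Toy data: logical AND of two inputs.
and_samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

for epochs in (1, 50):
    w, b = train_perceptron(and_samples, epochs)
    print(f"{epochs} epoch(s): accuracy {accuracy(w, b, and_samples):.2f}")
```

Even on four samples, a single pass is not enough for the weights to settle, which is the effect the answer describes; with tens of thousands of features, many more passes may be needed.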

When timing how long a quick process runs, how many runs should be used?

Let's say I am going to run process X and see how long it takes.
I am going to save into a database a date I ran this process, and the time it took. I want to know what to put into the DB.
Process X almost always runs under 1500ms, so this is a short process. It usually runs between 500 and 1500ms, quite a range (3x difference).
My question is, how many "runs" should be saved into the DB as a single run?
Every run saved into the DB as its own row?
5 runs, averaged, then save that time?
10 runs averaged?
20 runs, remove anything more than 2 std deviations away, and save everything inside that range?
Does anyone have any good info backing them up on this?
Save the data for every run into its own row. Then later you can use and analyze the data however you like, i.e. all the other options you listed can be performed after the fact. It's not really possible for someone else to draw meaningful conclusions about how to average/analyze the data without knowing more about what's going on.
The fastest run is the one that most accurately times only your code.
All slower runs are slower because of noise introduced by the operating system scheduler.
The variance you experience is going to differ from machine to machine, and even on identical machines, the set of runnable processes will introduce noise.
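A small Python sketch of the "fastest run" idea, using the standard `timeit.repeat` and taking the minimum. The function `process_x` is a hypothetical stand-in for the process being timed, and the repeat counts are arbitrary.

```python
import timeit

def process_x():
    # Hypothetical stand-in for the process being timed.
    sorted(range(1000, 0, -1))

CALLS_PER_RUN = 100

# Five independent runs, each timing CALLS_PER_RUN calls in total.
runs = timeit.repeat(process_x, repeat=5, number=CALLS_PER_RUN)
per_call = [total / CALLS_PER_RUN for total in runs]

# The minimum is the run least disturbed by other activity on the machine.
print(f"fastest: {min(per_call) * 1e6:.1f} microseconds per call")
print(f"slowest: {max(per_call) * 1e6:.1f} microseconds per call")
```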
None of the above. Bran is close though. You should save every measurement. But don't average them. The average (arithmetic mean) can be very misleading in this type of analysis. The reason is that some of your measurements will be much longer than the others. This will happen because things can interfere with your process - even on 'clean' test systems. It can also happen because your process may not be as deterministic as you might think.
Some people think that simply taking more samples (running more iterations) and averaging the measurements will give them better data. It doesn't. The more you run, the more likely it is that you will encounter a perturbing event, thus making the average overly high.
A better way to do this is to run as many measurements as you can (time permitting). 100 is not a bad number, but 30-ish can be enough.
Then, sort these by magnitude and graph them. Note that this is not a standard distribution. Compute some simple statistics: mean, median, min, max, lower quartile, upper quartile.
Contrary to some guidance, do not 'throw away' outside values or 'outliers'. These are often the most interesting measurements. For example, you may establish a nice baseline, then look for departures. Understanding these departures will help you fully understand how your process works, how the system affects your process, and what can interfere with your process. It will often readily expose bugs.
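The statistics listed above can be computed with Python's standard `statistics` module (the `quantiles` function needs Python 3.8 or later). This is only a sketch; the sample timings are made up, and the `"inclusive"` quartile method is one of several reasonable conventions.

```python
import statistics

def summarize(times_ms):
    """Return min, quartiles, median, mean and max of the raw timings."""
    ordered = sorted(times_ms)
    lower_q, median, upper_q = statistics.quantiles(
        ordered, n=4, method="inclusive"
    )
    return {
        "min": ordered[0],
        "lower_quartile": lower_q,
        "median": median,
        "mean": statistics.mean(ordered),
        "upper_quartile": upper_q,
        "max": ordered[-1],
    }

# Hypothetical measurements in milliseconds, including one slow run.
sample = [500, 600, 700, 800, 1500]
for name, value in summarize(sample).items():
    print(f"{name}: {value}")
```

In this made-up sample the single slow run pulls the mean (820) well above the median (700), which is the kind of distortion described above.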
Depends what kind of data you want. I'd say one line per run initially, then analyze the data, go from there. Maybe store a min/max/average of X runs if you want to consolidate it.
http://en.wikipedia.org/wiki/Sample_size
Bryan is right - you need to investigate more. If your code has that much variance even "most" of the time, then you might have a lot of fluctuation in your test environment because of other processes, OS paging or other factors. If not, it seems that you have code paths doing wildly varying amounts of work, and coming up with a single number per run to describe the performance of such a multi-modal system is not going to tell you much. So I'd say isolate your setup as much as possible, run at least 30 trials and get a feel for what your performance curve looks like. Once you have that, you can use that Wikipedia page to come up with a number that will tell you how many trials you need to run per code change to see if the performance has increased/decreased with some level of statistical significance.
While saying, "Save every run," is nice, it might not be practical in your case. However, I do think that storing only the average eliminates too much data. I like storing the average of ten runs, but instead of storing just the average, I'd also store the max and min values, so that I can get a feel for the spread of the data in addition to its center.
The max and min information in particular will tell you how often corner cases arise. Is the 1500ms case a one-in-1000 outlier? Or is it something that recurs on a regular basis?

Why is program running time not a measure?

I have learned that a program is measured by its complexity - I mean by Big O notation.
Why don't we measure it by its absolute running time?
Thanks :)
You use the complexity of an algorithm instead of absolute running times to reason about algorithms, because the absolute running time of a program does not only depend on the algorithm used and the size of the input. It also depends on the machine it's running on, various implementation details and what other programs are currently using system resources. Even if you run the same application twice with the same input on the same machine, you won't get exactly the same time.
Consequently when given a program you can't just make a statement like "this program will take 20*n seconds when run with an input of size n" because the program's running time depends on a lot more factors than the input size. You can however make a statement like "this program's running time is in O(n)", so that's a lot more useful.
Absolute running time is not an indicator of how the algorithm grows with different input sets. It's possible for a O(n*log(n)) algorithm to be far slower than an O(n^2) algorithm for all practical datasets.
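A toy illustration of that last point in Python: two hypothetical cost models where the O(n*log(n)) one carries a constant factor of 1000 and the O(n^2) one a factor of 1. The constants are invented for the example; real algorithms would have to be measured.

```python
import math

def cost_nlogn(n, constant=1000):
    """Hypothetical O(n log n) algorithm with a large constant factor."""
    return constant * n * math.log2(n)

def cost_quadratic(n, constant=1):
    """Hypothetical O(n^2) algorithm with a small constant factor."""
    return constant * n * n

for n in (100, 1_000, 10_000, 100_000, 1_000_000):
    cheaper = "n^2" if cost_quadratic(n) < cost_nlogn(n) else "n log n"
    print(f"n = {n:>9,}: cheaper model is {cheaper}")
```

With these made-up constants the quadratic model stays cheaper until n is somewhere above ten thousand; if the practical inputs never get that large, the asymptotically worse algorithm is the faster one.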
Running time does not measure complexity; it only measures performance, or the time required to perform the task. An MP3 player will run for the length of time required to play the song. The elapsed CPU time may be more useful in this case.
One measure of complexity is how it scales to larger inputs. This is useful for planning the required hardware. All things being equal, something that scales relatively linearly is preferable to something that scales poorly. Things are rarely equal.
The other measure of complexity is a measure of how simple the code is. The code complexity is usually higher for programs with relatively linear performance complexity. Complex code can be costly to maintain, and changes are more likely to introduce errors.
All three (or four) measures are useful, and none of them are highly useful by themselves. The three together can be quite useful.
The question could use a little more context.
When programming a real program, we are likely to measure the program's running time. There are multiple potential issues with this, though:
1. What hardware is the program running on? Comparing two programs running on different hardware really doesn't give a meaningful comparison.
2. What other software is running? If anything else is running, it's going to steal CPU cycles (or whatever other resource your program is running on).
3. What is the input? As already said, for a small set, a solution might look very fast, but scalability goes out the door. Also, some inputs are easier than others. If as a person, you hand me a dictionary and ask me to sort, I'll hand it right back and say done. Giving me a set of 50 cards (much smaller than a dictionary) in random order will take me a lot longer to do.
4. What are the starting conditions? If your program runs for the first time, chances are that loading it from the hard disk will take up the largest chunk of time on modern systems. When comparing two implementations with small inputs, their differences will likely be masked by this.
Big O notation covers a lot of these issues.
1. Hardware doesn't matter, as everything is normalized by the speed of 1 operation O(1).
2. Big O talks about the algorithm free of other algorithms around it.
3. Big O talks about how the input will change the running time, not how long one input takes. It tells you the worst the algorithm will perform, not how it performs on an average or easy input.
4. Again, Big O handles algorithms, not programs running in a physical system.
