Suppose I am developing an algorithm that needs a number that must be between two numbers and for those values it must always yield the same number. I moved that calculation to a new method because I use it repeated times.
My friend proposed I should call it "randomBetween(min, max)" but I argue that this would not be correct because it always returns the same value if I send the same arguments, so it isn't random.
Am I wrong?
Thanks
If you return the same number for the same arguments, your function is not a random number generator, it's just a mathematical function.
You could call it deterministicRandomBetween(min,max), which captures the repeatability aspect.
However, this doesn't really sound like a general-purpose function. I can't imagine ever needing this, other than for the specific algorithm you are creating. So, it might make more sense to give it a name specific to its role in the algorithm (whatever that may be).
you can take the mean value of given numbers.
function int mean(int x, int y)
{
return (x+y)/2
}
Related
I've been trying to learn what recursion in programming is, and I need someone to confirm whether I have thruly understood what it is.
The way I'm trying to think about it is through collision-detection between objects.
Let's say we have a function. The function is called when it's certain that a collision has occured, and it's called with a list of objects to determine which object collided, and with what object it collided with. It does this by first confirming whether the first object in the list collided with any of the other objects. If true, the function returns the objects in the list that collided. If false, the function calls itself with a shortened list that excludes the first object, and then repeats the proccess to determine whether it was the next object in the list that collided.
This is a finite recursive function because if the desired conditions aren't met, it calls itself with a shorter and shorter list to until it deductively meets the desired conditions. This is in contrast to a potentially infinite recursive function, where, for example, the list it calls itself with is not shortened, but the order of the list is randomized.
So... is this correct? Or is this just another example of iteration?
Thanks!
Edit: I was fortunate enough to get three VERY good answers by #rici, #Evan and #Jack. They all gave me valuable insight on this, in both technical and practical terms from different perspectives. Thank you!
Any iteration can be expressed recursively. (And, with auxiliary data structures, vice versa, but not so easily.)
I would say that you are thinking iteratively. That's not a bad thing; I don't say it to criticise. Simply, your explanation is of the form "Do this and then do that and continue until you reach the end".
Recursion is a slightly different way of thinking. I have some problem, and it's not obvious how to solve it. But I observe that if I knew the answer to a simpler problem, I could easily solve the problem at hand. And, moreover, there are some very simple problems which I can solve directly.
The recursive solution is based on using a simpler (smaller, fewer, whatever) problem to solve the problem at hand. How do I find out which pairs of objects in a set of objects collide?
If the set has fewer than 2 elements, there are no pairs. That's the simplest problem, and it has an obvious solution: the empty set.
Otherwise, I select some object. All colliding pairs either include this object, or they don't. So that gives me two subproblems.
The set of collisions which don't involve the selected object is obviously the same problem which I started with, but with a smaller set. So I've replaced this part of the problem with a smaller problem. That's one recursion.
But I also need the set of objects which the selected object collides with (which might be an empty set). That's a simpler problem, because now one element of each pair is known. I can solve that problem recursively as well:
I need the set of pairs which include the object X and a set S of objects.
If the set is empty, there are no pairs. Simple.
Otherwise, I choose some element from the set. Then I find all the collisions between X and the rest of the set (a simpler but otherwise identical problem).
If there is a collision between X and the selected element, I add that to the set I just found.
Then I return the set.
Technically speaking, you have the right mindset of how recursion works.
Practically speaking, you would not want to use recursion for an instance such as the one you described above. Reasons being is that every recursive call adds to the stack (which is finite in size), and recursive calls are expensive on the processor, with enough objects you are going to run into some serious bottle-necking on a large application. With enough recursive calls, you would result with a stack overflow, which is exactly what you would get in "infinite recursion". You never want something to infinitely recurse; it goes against the fundamental principal of recursion.
Recursion works on two defining characteristics:
A base case can be defined: It is possible to eventually reach 0 or 1 depending on your necessity
A general case can be defined: The general case is continually called, reducing the problem set until your base case is reached.
Once you have defined both cases, you can define a recursive solution.
The point of recursion is to take a very large and difficult-to-solve problem and continually break it down until it's easy to work with.
Once our base case is reached, the methods "recurse-out". This means they bounce backwards, back into the function that called it, bringing all the data from the functions below it!
It is at this point that our operations actually occur.
Once the original function is reached, we have our final result.
For example, let's say you want the summation of the first 3 integers. The first recursive call is passed the number 3.
public factorial(num) {
//Base case
if (num == 1) {
return 1;
}
//General case
return num + factorial(num-1);
}
Walking through the function calls:
factorial(3); //Initial function call
//Becomes..
factorial(1) + factorial(2) + factorial(3) = returned value
This gives us a result of 6!
Your scenario seems to me like iterative programming, but your function is simply calling itself as a way of continuing its comparisons. That is simply re-tasking your function to be able to call itself with a smaller list.
In my experience, a recursive function has more potential to branch out into multiple 'threads' (so to speak), and is used to process information the same way the hierarchy in a company works for delegation; The boss hands a contract down to the managers, who divide up the work and hand it to their respective staff, the staff get it done, and had it back to the managers, who report back to the boss.
The best example of a recursive function is one that iterates through all files on a file system. ( I will do this in pseudo code because it works in all languages).
function find_all_files (directory_name)
{
- Check the given directory name for sub-directories within it
- for each sub-directory
find_all_files(directory_name + subdirectory_name)
- Check the given directory for files
- Do your processing of the filename; it is located at directory_name + filename
}
You use the function by calling it with a directory path as the parameter. The first thing it does is, for each subdirectory, it generates a value of the actual path to the subdirectory and uses it as a value to call find_all_files() with. As long as there are sub-directories in the given directory, it will keep calling itself.
Now, when the function reaches a directory that contains only files, it is allowed to proceed to the part where it process the files. Once done that, it exits, and returns to the previous instance of itself that is iterating through directories.
It continues to process directories and files until it has completed all iterations and returns to the main program flow where you called the original instance of find_all_files in the first place.
One additional note: Sometimes global variables can be handy with recursive functions. If your function is merely searching for the first occurrence of something, you can set an "exit" variable as a flag to "stop what you are doing now!". You simply add checks for the flag's status during any iterations you have going on inside the function (as in the example, the iteration through all the sub-directories). Then, when the flag is set, your function just exits. Since the flag is global, all generations of the function will exit and return to the main flow.
This question doesn't address any programming language in particular but of course I'm happy to hear some examples.
Imagine a big number of files, let's say 5000, that have all kinds of letters and numbers in it. Then, there is a method that receives a user input that acts as an alias in order to display that file. Without having the files sorted in a folder, the method(s) need to return the file name that is associated to the alias the user provided.
So let's say user input "gd322" stands for the file named "k4e23", the method would look like
if(input.equals("gd322")){
return "k4e23";
}
Now, imagine having 4 values in that method:
switch(input){
case gd322: return fw332;
case g344d: return 5g4gh;
case s3red: return 536fg;
case h563d: return h425d;
} //switch on string, no break, no string indicators, ..., pls ignore the syntax, it's just pseudo
Keeping in mind we have 5000 entries, there are probably more than just 2 entries starting with g. Now, if the user input starts with 's', instead of wasting CPU cycles checking all the a's, b's, c's, ..., we could also make another switch for this, which then directs to the 'next' methods like this:
switch(input[0]){ //implying we could access strings like that
case a: switchA(input);
case b: switchB(input);
// [...]
case g: switchG(input);
case s: switchS(input);
}
So the CPU doesn't have to check on all of them, but rather calls a method like this:
switchG(String input){
switch(input){
case gd322: return fw332;
case g344d: return 5g4gh;
// [...]
}
Is there any field of computer science dealing with this? I don't know how to call it and therefore don't know how to search for it but I think my thoughts make sense on a large scale. Pls move the thread if it doesn't belong here but I really wanna see your thoughts on this.
EDIT: don't quote me on that "5000", I am not in the situation described above and I wanted to talk about this completely theoretical, it could also be 3 entries or 300'000, maybe even less or more
If you have 5000 options, you're probably better off hashing them than having hard-coded if / switch statements. In c++ you could also use std::map to pair a function pointer or other option handling information with each possible option.
Interesting, but I don't think you can give a generic answer. It all depends on how the code is executed. Many compilers will have all kinds of optimizations, in the if and switch, but also in the way strings are compared.
That said, if you have actual (disk) files with those lists, then reading the file will probably take much longer than processing it, since disk I/O is very slow compared to memory access and CPU processing.
And if you have a list like that, you may want to build a hash table, or simply a sorted list/array in which you can perform a binary search. Sorting it also takes time, but if you have to do many lookups in the same list, it may be well worth the time.
Is there any field of computer science dealing with this?
Yes, the science of efficient data structures. Well, isn't that what CS is all about? :-)
The algorithm you described resembles a trie. It wouldn't be statically encoded in the source code with switch statements, but would use dynamic lookups in a structure loaded from somewhere and stuff, but the idea is the same.
Yes the problem is known and solved since decades. Hash functions.
Basically you have a set of values (here strings like "gd322", "g344d") and you want to know if some other value v is among them.
The idea is to put the strings in a big array, at an index calculated from their values by some function. Given a value v, you'll compute an index the same way, and check whether the value v is here or not. Much faster than checking the whole array.
Of course there is a problem with different values falling at the same place : collisions. Some magic is needed then : perfect hash functions whose coefficients are tweaked so values from the initial set don't cause any collisions.
Typically, the re-estimation iterative procedure stops when lambda.bar - lambda is less than some epsilon value.
How exactly does one determine this epsilon value? I often only see is written as the general epsilon symbol in papers, and never the actual value used, which I assume would change depending on the data.
So, for instance, if the lambda value of my first iteration was 5*10^-22, second iteration was 1.3*10^-15, third was 8.45*10^-15, fourth was 1.65*10^-14, etc., how would I determine when the algorithm needed no more iteratons?
Moreover, what if I were to apply the same alogrithm to a different datset? would I need to change my epsilon definitions?
Sorry for the long question. Pretty puzzled by it... :)
"how would I determine when the algorithm needed no more iteratons?"
When you get a "good-enough" result within a reasonable amount of time. ;-)
"Moreover, what if I were to apply the same alogrithm to a different datset? would I need to
change my epsilon definitions?"
Yes, most probably.
If you can afford it, you can just let it iterate until the updated value <= the old value (it could be < due to floating point error). I would be inclined to go with this until I ran out of patience or cpu budget.
I have values returned by unknown function like for example
# this is an easy case - parabolic function
# but in my case function is realy unknown as it is connected to process execution time
[0, 1, 4, 9]
is there a way to predict next value?
Not necessarily. Your "parabolic function" might be implemented like this:
def mindscrew
#nums ||= [0, 1, 4, 9, "cat", "dog", "cheese"]
#nums.pop
end
You can take a guess, but to predict with certainty is impossible.
You can try using neural networks approach. There are pretty many articles you can find by Google query "neural network function approximation". Many books are also available, e.g. this one.
If you just want data points
Extrapolation of data outside of known points can be estimated, but you need to accept the potential differences are much larger than with interpolation of data between known points. Strictly, both can be arbitrarily inaccurate, as the function could do anything crazy between the known points, even if it is a well-behaved continuous function. And if it isn't well-behaved, all bets are already off ;-p
There are a number of mathematical approaches to this (that have direct application to computer science) - anything from simple linear algebra to things like cubic splines; and everything in between.
If you want the function
Getting esoteric; another interesting model here is genetic programming; by evolving an expression over the known data points it is possible to find a suitably-close approximation. Sometimes it works; sometimes it doesn't. Not the language you were looking for, but Jason Bock shows some C# code that does this in .NET 3.5, here: Evolving LINQ Expressions.
I happen to have his code "to hand" (I've used it in some presentations); with something like a => a * a it will find it almost instantly, but it should (in theory) be able to find virtually any method - but without any defined maximum run length ;-p It is also possible to get into a dead end (evolutionary speaking) where you simply never recover...
Use the Wolfram Alpha API :)
Yes. Maybe.
If you have some input and output values, i.e. in your case [0,1,2,3] and [0,1,4,9], you could use response surfaces (basicly function fitting i believe) to 'guess' the actual function (in your case f(x)=x^2). If you let your guessing function be f(x)=c1*x+c2*x^2+c3 there are algorithms that will determine that c1=0, c2=1 and c3=0 given your input and output and given the resulting function you can predict the next value.
Note that most other answers to this question are valid as well. I am just assuming that you want to fit some function to data. In other words, I find your question quite vague, please try to pose your questions as complete as possible!
In general, no... unless you know it's a function of a particular form (e.g. polynomial of some degree N) and there is enough information to constrain the function.
e.g. for a more "ordinary" counterexample (see Chuck's answer) for why you can't necessarily assume n^2 w/o knowing it's a quadratic equation, you could have f(n) = n4 - 6n3 + 12n2 - 6n, which has for n=0,1,2,3,4,5 f(n) = 0,1,4,9,40,145.
If you do know it's a particular form, there are some options... if the form is a linear addition of basis functions (e.g. f(x) = a + bcos(x) + csqrt(x)) then using least-squares can get you the unknown coefficients for the best fit using those basis functions.
See also this question.
You can apply statistical methods to try and guess the next answer, but that might not work very well if the function is like this one (c):
int evil(void){
static int e = 0;
if(50 == e++){
e = e * 100;
}
return e;
}
This function will return nice simple increasing numbers then ... BAM.
That's a hard problem.
You should check out the recurrence relation equation for special cases where it could be possible such a task.
Lambda calculus of course is quite elegant, but doesn't it bother you that there is this asymmetry between input and output of a function? I.e. you can make the function take two parameters (by returning a function) but you can't make it return two values.
I don't think we could find it in The Book.
You can make it return a function whose evaluations return two values.
Or you can return a compound structure. Just like a vector function which returns a vector, which has separate components in there.
BTW what does your header question have to do with the actual question you've asked?