Related
I was asked this during a recent phone interview -
Given a Dictionary with a word and the weight of a word(frequency, higher is better), like so -
var words = new Dictionary<string,int>();
words.Add("am",7);
words.Add("ant", 5);
words.Add("amazon", 10);
words.Add("amazing", 8);
words.Add("an", 4);
words.Add("as", 11);
words.Add("be", 8);
words.Add("bee", 2);
words.Add("bed", 4);
words.Add("best", 12);
words.Add("amuck", 1);
words.Add("amock", 2);
words.Add("bestest", 1);
Design an API method, that given a prefix and a number k, return the top k words that match the prefix.
The words should be sorted based on their weight, the higher the better.
So, prefix = "am", k = 5, returns amazon, amazing, am, amock, amuck - in that specific order.
Performance on the prefix lookup is paramount, you can pre-process and use as much space as you like, as long as the prefix lookup is fast.
This is a Trie implementation, but my question is how best to handle the word weight and optimise the lookup. In my mind the options are -
a. For each node in the Trie, also store a sorted list of words (SortedDictionary<int,List<string>>) that start with this prefix - more space, but faster lookup.
b. For each node, store the Child nodes in some kind of sorted list, so you would still need to do a DFS for each child node to get the K words needed - less space compared to a., but slower.
I decided to go with option a.
public class TrieWithSuggestions
{
TrieWithSuggestions _trieRoot;
public TrieWithSuggestions()
{
}
public char Character { get; set; }
public int WordCount { get; set; } = 1;
public TrieWithSuggestions[] ChildNodes { get; set; } = new TrieWithSuggestions[26];
//Stores all words with this prefix.
public SortedDictionary<int, HashSet<string>> PrefixWordsDictionary = new SortedDictionary<int, HashSet<string>>();
public TrieWithSuggestions ConstructTrie(Dictionary<string, int> words)
{
if (words.Count > 0)
{
_trieRoot = new TrieWithSuggestions() { Character = default(char) };
foreach (var word in words)
{
var node = _trieRoot;
for (int i = 0; i < word.Key.Length; i++)
{
var c = word.Key[i];
if (node.ChildNodes[c - 'a'] != null)
{
node = node.ChildNodes[c - 'a'];
UpdateParentNodeInformation(node, word.Key, words[word.Key]);
node.WordCount++;
}
else
{
InsertIntoTrie(node, word.Key, i, words);
break;
}
}
}
}
return _trieRoot;
}
public List<string> GetMathchingWords(string prefix, int k)
{
if (_trieRoot != null)
{
var node = _trieRoot;
foreach (var ch in prefix)
{
if (node.ChildNodes[ch - 'a'] != null)
{
node = node.ChildNodes[ch - 'a'];
}
else
return null;
}
if (node != null)
return GetWords(node, k);
else
return null;
}
return null;
}
List<string> GetWords(TrieWithSuggestions node, int k)
{
List<string> output = new List<string>();
foreach (var dictEntry in node.PrefixWordsDictionary)
{
var entries = node.PrefixWordsDictionary[dictEntry.Key];
var take = Math.Min(entries.Count, k);
output.AddRange(entries.Take(take).ToList());
k -= entries.Count;
if (k == 0)
break;
}
return output;
}
void InsertIntoTrie(TrieWithSuggestions parentNode, string word, int startIndex, Dictionary<string, int> words)
{
for (int i = startIndex; i < word.Length; i++)
{
var c = word[i];
var childNode = new TrieWithSuggestions() { Character = c };
parentNode.ChildNodes[c - 'a'] = childNode;
UpdateParentNodeInformation(parentNode, word, words[word]);
parentNode = childNode;
if (i == word.Length - 1)
UpdateParentNodeInformation(parentNode, word, words[word]);
}
}
void UpdateParentNodeInformation(TrieWithSuggestions parentNode, string word, int wordWeight)
{
wordWeight *= -1;
if (parentNode.PrefixWordsDictionary.ContainsKey(wordWeight))
{
if (!parentNode.PrefixWordsDictionary[wordWeight].Contains(word))
parentNode.PrefixWordsDictionary[wordWeight].Add(word);
}
else
parentNode.PrefixWordsDictionary.Add(wordWeight, new HashSet<string>() { word });
}
}
Construct Trie - RunTime O(N* M * logN), Space - O(N * M * N) , N - #of words, M - avg word length.
Justification -
If there were no Dictionary, this would be O(N * M), insertion into a SortedDictionary is O(logN), so worst case Runtime must be O(N* M * logN)
Space seems trickier, but like before if there were no SortedDictionary, space would be O(N * M), and in the worst case, the Dictionary could have all N words, so Space Complexity looks like O(N * M * N)
GetMatchingWords - RunTime O(len(prefix) + k)
Function call -
var trie = new TrieWithSuggestions();
trie.ConstructTrie(words);
var list = trie.GetMathchingWords("am", 10); //amazon, amazing, am, amock, amuck
QUESTION:
Given the conditions on space and pre-processing, is there a better way to do this?
EDIT 1 -
a. Given this setup, it is best to sort the words by weight and then insert into the Trie. In this case a simple List<string> would suffice, since higher frequency words would have been inserted first automatically.
b. Now lets say that in addition to being initialized with a Dictionary<string,int>, we are also going to get additional word, frequency pairs. We would still want a lookup that is as fast as possible, given this requirement what is now the best data-structure to store the sorted list of words within a TrieNode, is a SortedDictionary<int,HashSet<string>> the best option?
You could first sort the input with respect to the weights. Then, you could use Lists instead of Dictionaries on the nodes of trie. Since the words come in increasing (or decreasing) order of weight, checking the last element of the list is enough to decide where to put this new word. This gets rid of the O(logN) time taken by Dictionary.
The input can be sorted in O(N * logN) with a comparison sort, or in O(N + W) with a counting sort where W is the maximum weight.
Time complexity of setting up the trie becomes O(N * logN + N * M). This is better than O(N * M * logN). Query time does not change.
(Last paragraph assumes HashSet operations execute in O(1) as in the question. It is wrong to make this assumption for arbitrary inputs and hash functions.)
I'm learning data structures from a book. In the book, they have snippets of pseudocode at the end of the chapter and I'm trying to determine the time complexity. I'm having a little bit of difficulty understand some concepts in time complexity.
I have two pieces of code that do the same thing, seeing if an element in an array occurs at least 3 times; however, one uses recursion and the other uses loops. I have answers for both; can someone tell me whether or not they're correct?
First way (without recursion):
boolean findTripleA(int[] anArray) {
if (anArray.length <= 2) {
return false;
}
for (int i=0; i < anArray.length; i++) {
// check if anArray[i] occurs at least three times
// by counting how often it occurs in anArray
int count = 0;
for (int j = 0; j < anArray.length; j++) {
if (anArray[i] == anArray[j]) {
count++;
}
}
if (count >= 3) {
return true;
}
}
return false;
}
I thought the first way had a time complexity of O(n^2) in best and worst case scenario because there is no way to avoid the inner for loop.
Second way (with recursion):
public static Integer findTripleB(int[] an Array) {
if (anArray.length <= 2) {
return false;
}
// use insertion sort to sort anArray
for (int i = 1; i < anArray.length; i++) {
// insert anArray[i]
int j = i-1;
int element = anArray[i];
while (j >= 0 && anArray[j] > element) {
anArray[j+1] = anArray[j];
j--;
}
anArray[j+1] = element;
}
// check whether anArray contains three consecutive
// elements of the same value
for (int i = 0; i < anArray.length-2; i++) {
if (anArray[i] == anArray[i+2]) {
return new Integer(anArray[i]);
}
}
return null;
}
I thought the second way had a worst case time complexity of O(n^2) and a best case of O(n) if the array is sorted already and insertion sort can be skipped; however I don't know how recursion plays into effect.
The best case for the first one is O(n) - consider what happens when the element appearing first appears 3 times (it will return after just one iteration of the outer loop).
The second one doesn't use recursion, and your running times are correct (technically insertion sort won't 'be skipped', but the running time of insertion sort on an already sorted array is O(n) - just making sure we're on the same page).
Two other ways to solve the problem:
Sort using a better sorting algorithm such as mergesort.
Would take O(n log n) best and worst case.
Insert the elements into a hash map of element to count.
Would take expected O(n) worst case, O(1) best case.
Problem
Find a list of non repeating number in a array of repeating numbers.
My Solution
public static int[] FindNonRepeatedNumber(int[] input)
{
List<int> nonRepeated = new List<int>();
bool repeated = false;
for (int i = 0; i < input.Length; i++)
{
repeated = false;
for (int j = 0; j < input.Length; j++)
{
if ((input[i] == input[j]) && (i != j))
{
//this means the element is repeated.
repeated = true;
break;
}
}
if (!repeated)
{
nonRepeated.Add(input[i]);
}
}
return nonRepeated.ToArray();
}
Time and space complexity
Time complexity = O(n^2)
Space complexity = O(n)
I am not sure with the above calculated time complexity, also how can I make this program more efficient and fast.
The complexity of the Algorithm you provided is O(n^2).
Use Hashmaps to improve the algorithm. The Psuedo code is as follows:
public static int[] FindNonRepeatedNumbers(int[] A)
{
Hashtable<int, int> testMap= new Hashtable<int, int>();
for (Entry<Integer, String> entry : testMap.entrySet()) {
tmp=testMap.get(A[i]);
testMap.put(A[i],tmp+1);
}
/* Elements that are not repeated are:
Set set = teatMap.entrySet();
// Get an iterator
Iterator i = set.iterator();
// Display elements
while(i.hasNext()) {
Map.Entry me = (Map.Entry)i.next();
if(me.getValue() >1)
{
System.out.println(me.getValue());
}
}
Operation:
What I did here is I used Hashmaps with keys to the hashmaps being the elements of the input array. The values for the hashmaps are like counters for each element. So if an element occurs once then the value for that key is 1 and the key value is subsequently incremented based on recurrence of element in input array.
So finally you just check your hashmap and then display elements with hashvalue 1 which are non-repated elements. The time complexity for this algorithm is O(k) for creating hashmap and O(k) for searching, if the input array length is k. This is much faster than O(n^2). The worst case is when there are no repeated elements at all. The psuedo code might be messy but this approach is the best way I could think of.
Time complexity O(n) means you can't have an inner loop. A full inner loop is O(n^2).
two pointers. begining and end. increment begining when same letters reached and store the start and end pos ,length for reference... increment end otherwise.. keep doing this til end of list..compare all the outputs and you should have the longest continuous list of unique numbers. I hope this is what the question required. Linear algo.
void longestcontinuousunique(int arr[])
{
int start=0;
int end =0;
while (end! =arr.length())
{
if(arr[start] == arr[end])
{
start++;
savetolist(start,end,end-start);
}
else
end++
}
return maxelementof(savedlist);
}
Given n string of max length m. How can we find the longest common prefix shared by at least two strings among them?
Example: ['flower', 'flow', 'hello', 'fleet']
Answer: fl
I was thinking of building a Trie for all the string and then checking the deepest node (satisfies longest) that branches out to two/more substrings (satisfies commonality). This takes O(n*m) time and space. Is there a better way to do this
Why to use trie(which takes O(mn) time and O(mn) space, just use the basic brute force way. first loop, find the shortest string as minStr, which takes o(n) time, second loop, compare one by one with this minStr, and keep an variable which indicates the rightmost index of minStr, this loop takes O(mn) where m is the shortest length of all strings. The code is like below,
public String longestCommonPrefix(String[] strs) {
if(strs.length==0) return "";
String minStr=strs[0];
for(int i=1;i<strs.length;i++){
if(strs[i].length()<minStr.length())
minStr=strs[i];
}
int end=minStr.length();
for(int i=0;i<strs.length;i++){
int j;
for( j=0;j<end;j++){
if(minStr.charAt(j)!=strs[i].charAt(j))
break;
}
if(j<end)
end=j;
}
return minStr.substring(0,end);
}
there is an O(|S|*n) solution to this problem, using a trie. [n is the number of strings, S is the longest string]
(1) put all strings in a trie
(2) do a DFS in the trie, until you find the first vertex with more than 1 "edge".
(3) the path from the root to the node you found at (2) is the longest common prefix.
There is no possible faster solution then it [in terms of big O notation], at the worst case, all your strings are identical - and you need to read all of them to know it.
I would sort them, which you can do in n lg n time. Then any strings with common prefixes will be right next to eachother. In fact you should be able to keep a pointer of which index you're currently looking at and work your way down for a pretty speedy computation.
As a completely different answer from my other answer...
You can, with one pass, bucket every string based on its first letter.
With another pass you can sort each bucket based on its second later. (This is known as radix sort, which is O(n*m), and O(n) with each pass.) This gives you a baseline prefix of 2.
You can safely remove from your dataset any elements that do not have a prefix of 2.
You can continue the radix sort, removing elements without a shared prefix of p, as p approaches m.
This will give you the same O(n*m) time that the trie approach does, but will always be faster than the trie since the trie must look at every character in every string (as it enters the structure), while this approach is only guaranteed to look at 2 characters per string, at which point it culls much of the dataset.
The worst case is still that every string is identical, which is why it shares the same big O notation, but will be faster in all cases as is guaranteed to use less comparisons since on any "non-worst-case" there are characters that never need to be visited.
public String longestCommonPrefix(String[] strs) {
if (strs == null || strs.length == 0)
return "";
char[] c_list = strs[0].toCharArray();
int len = c_list.length;
int j = 0;
for (int i = 1; i < strs.length; i++) {
for (j = 0; j < len && j < strs[i].length(); j++)
if (c_list[j] != strs[i].charAt(j))
break;
len = j;
}
return new String(c_list).substring(0, len);
}
It happens that the bucket sort (radix sort) described by corsiKa can be extended such that all strings are eventually placed alone in a bucket, and at that point, the LCP for such a lonely string is known. Further, the shustring of each string is also known; it is one longer than is the LCP. The bucket sort is defacto the construction of a suffix array but, only partially so. Those comparisons that are not performed (as described by corsiKa) indeed represent those portions of the suffix strings that are not added to the suffix array. Finally, this method allows for determination of not just the LCP and shustrings, but also one may easily find those subsequences that are not present within the string.
Since the world is obviously begging for an answer in Swift, here's mine ;)
func longestCommonPrefix(strings:[String]) -> String {
var commonPrefix = ""
var indices = strings.map { $0.startIndex}
outerLoop:
while true {
var toMatch: Character = "_"
for (whichString, f) in strings.enumerate() {
let cursor = indices[whichString]
if cursor == f.endIndex { break outerLoop }
indices[whichString] = cursor.successor()
if whichString == 0 { toMatch = f[cursor] }
if toMatch != f[cursor] { break outerLoop }
}
commonPrefix.append(toMatch)
}
return commonPrefix
}
Swift 3 Update:
func longestCommonPrefix(strings:[String]) -> String {
var commonPrefix = ""
var indices = strings.map { $0.startIndex}
outerLoop:
while true {
var toMatch: Character = "_"
for (whichString, f) in strings.enumerated() {
let cursor = indices[whichString]
if cursor == f.endIndex { break outerLoop }
indices[whichString] = f.characters.index(after: cursor)
if whichString == 0 { toMatch = f[cursor] }
if toMatch != f[cursor] { break outerLoop }
}
commonPrefix.append(toMatch)
}
return commonPrefix
}
What's interesting to note:
this runs in O^2, or O(n x m) where n is the number of strings and m
is the length of the shortest one.
this uses the String.Index data type and thus deals with Grapheme Clusters which the Character type represents.
And given the function I needed to write in the first place:
/// Takes an array of Strings representing file system objects absolute
/// paths and turn it into a new array with the minimum number of common
/// ancestors, possibly pushing the root of the tree as many level downwards
/// as necessary
///
/// In other words, we compute the longest common prefix and remove it
func reify(fullPaths:[String]) -> [String] {
let lcp = longestCommonPrefix(fullPaths)
return fullPaths.map {
return $0[lcp.endIndex ..< $0.endIndex]
}
}
here is a minimal unit test:
func testReifySimple() {
let samplePaths:[String] = [
"/root/some/file"
, "/root/some/other/file"
, "/root/another/file"
, "/root/direct.file"
]
let expectedPaths:[String] = [
"some/file"
, "some/other/file"
, "another/file"
, "direct.file"
]
let reified = PathUtilities().reify(samplePaths)
for (index, expected) in expectedPaths.enumerate(){
XCTAssert(expected == reified[index], "failed match, \(expected) != \(reified[index])")
}
}
Perhaps a more intuitive solution. Channel the already found prefix out of earlier iteration as input string to the remaining or next string input. [[[w1, w2], w3], w4]... so on], where [] is supposedly the LCP of two strings.
public String findPrefixBetweenTwo(String A, String B){
String ans = "";
for (int i = 0, j = 0; i < A.length() && j < B.length(); i++, j++){
if (A.charAt(i) != B.charAt(j)){
return i > 0 ? A.substring(0, i) : "";
}
}
// Either of the string is prefix of another one OR they are same.
return (A.length() > B.length()) ? B.substring(0, B.length()) : A.substring(0, A.length());
}
public String longestCommonPrefix(ArrayList<String> A) {
if (A.size() == 1) return A.get(0);
String prefix = A.get(0);
for (int i = 1; i < A.size(); i++){
prefix = findPrefixBetweenTwo(prefix, A.get(i)); // chain the earlier prefix
}
return prefix;
}
For example,
Look at the code that calculates the n-th Fibonacci number:
fib(int n)
{
if(n==0 || n==1)
return 1;
return fib(n-1) + fib(n-2);
}
The problem with this code is that it will generate stack overflow error for any number greater than 15 (in most computers).
Assume that we are calculating fib(10). In this process, say fib(5) is calculated a lot of times. Is there some way to store this in memory for fast retrieval and thereby increase the speed of recursion?
I am looking for a generic technique that can be used in almost all problems.
Yes your insight is correct.
This is called dynamic programming. It is usually a common memory runtime trade-off.
In the case of fibo, you don't even need to cache everything :
[edit]
The author of the question seems to be looking for a general method to cache rather than a method to compute Fibonacci. Search wikipedia or look at the code of the other poster to get this answer. Those answers are linear in time and memory.
**Here is a linear-time algorithm O(n), constant in memory **
in OCaml:
let rec fibo n =
let rec aux = fun
| 0 -> (1,1)
| n -> let (cur, prec) = aux (n-1) in (cur+prec, cur)
let (cur,prec) = aux n in prec;;
in C++:
int fibo(int n) {
if (n == 0 ) return 1;
if (n == 1 ) return 1;
int p = fibo(0);
int c = fibo(1);
int buff = 0;
for (int i=1; i < n; ++i) {
buff = c;
c = p+c;
p = buff;
};
return c;
};
This perform in linear time. But log is actually possible !!!
Roo's program is linear too, but way slower, and use memory.
Here is the log algorithm O(log(n))
Now for the log-time algorithm (way way way faster), here is a method :
If you know u(n), u(n-1), computing u(n+1), u(n) can be done by applying a matrix:
| u(n+1) | = | 1 1 | | u(n) |
| u(n) | | 1 0 | | u(n-1) |
So that you have :
| u(n) | = | 1 1 |^(n-1) | u(1) | = | 1 1 |^(n-1) | 1 |
| u(n-1) | | 1 0 | | u(0) | | 1 0 | | 1 |
Computing the exponential of the matrix has a logarithmic complexity.
Just implement recursively the idea :
M^(0) = Id
M^(2p+1) = (M^2p) * M
M^(2p) = (M^p) * (M^p) // of course don't compute M^p twice here.
You can also just diagonalize it (not to difficult), you will find the gold number and its conjugate in its eigenvalue, and the result will give you an EXACT mathematical formula for u(n). It contains powers of those eigenvalues, so that the complexity will still be logarithmic.
Fibo is often taken as an example to illustrate Dynamic Programming, but as you see, it is not really pertinent.
#John:
I don't think it has anything to do with do with hash.
#John2:
A map is a bit general don't you think? For Fibonacci case, all the keys are contiguous so that a vector is appropriate, once again there are much faster ways to compute fibo sequence, see my code sample over there.
This is called memoization and there is a very good article about memoization Matthew Podwysocki posted these days. It uses Fibonacci to exemplify it. And shows the code in C# also. Read it here.
If you're using C#, and can use PostSharp, here's a simple memoization aspect for your code:
[Serializable]
public class MemoizeAttribute : PostSharp.Laos.OnMethodBoundaryAspect, IEqualityComparer<Object[]>
{
private Dictionary<Object[], Object> _Cache;
public MemoizeAttribute()
{
_Cache = new Dictionary<object[], object>(this);
}
public override void OnEntry(PostSharp.Laos.MethodExecutionEventArgs eventArgs)
{
Object[] arguments = eventArgs.GetReadOnlyArgumentArray();
if (_Cache.ContainsKey(arguments))
{
eventArgs.ReturnValue = _Cache[arguments];
eventArgs.FlowBehavior = FlowBehavior.Return;
}
}
public override void OnExit(MethodExecutionEventArgs eventArgs)
{
if (eventArgs.Exception != null)
return;
_Cache[eventArgs.GetReadOnlyArgumentArray()] = eventArgs.ReturnValue;
}
#region IEqualityComparer<object[]> Members
public bool Equals(object[] x, object[] y)
{
if (Object.ReferenceEquals(x, y))
return true;
if (x == null || y == null)
return false;
if (x.Length != y.Length)
return false;
for (Int32 index = 0, len = x.Length; index < len; index++)
if (Comparer.Default.Compare(x[index], y[index]) != 0)
return false;
return true;
}
public int GetHashCode(object[] obj)
{
Int32 hash = 23;
foreach (Object o in obj)
{
hash *= 37;
if (o != null)
hash += o.GetHashCode();
}
return hash;
}
#endregion
}
Here's a sample Fibonacci implementation using it:
[Memoize]
private Int32 Fibonacci(Int32 n)
{
if (n <= 1)
return 1;
else
return Fibonacci(n - 2) + Fibonacci(n - 1);
}
Quick and dirty memoization in C++:
Any recursive method type1 foo(type2 bar) { ... } is easily memoized with map<type2, type1> M.
// your original method
int fib(int n)
{
if(n==0 || n==1)
return 1;
return fib(n-1) + fib(n-2);
}
// with memoization
map<int, int> M = map<int, int>();
int fib(int n)
{
if(n==0 || n==1)
return 1;
// only compute the value for fib(n) if we haven't before
if(M.count(n) == 0)
M[n] = fib(n-1) + fib(n-2);
return M[n];
}
EDIT: #Konrad Rudolph
Konrad points out that std::map is not the fastest data structure we could use here. That's true, a vector<something> should be faster than a map<int, something> (though it might require more memory if the inputs to the recursive calls of the function were not consecutive integers like they are in this case), but maps are convenient to use generally.
According to wikipedia Fib(0) should be 0 but it does not matter.
Here is simple C# solution with for cycle:
ulong Fib(int n)
{
ulong fib = 1; // value of fib(i)
ulong fib1 = 1; // value of fib(i-1)
ulong fib2 = 0; // value of fib(i-2)
for (int i = 0; i < n; i++)
{
fib = fib1 + fib2;
fib2 = fib1;
fib1 = fib;
}
return fib;
}
It is pretty common trick to convert recursion to tail recursion and then to loop. For more detail see for example this lecture (ppt).
What language is this? It doesnt overflow anything in c...
Also, you can try creating a lookup table on the heap, or use a map
caching is generally a good idea for this kind of thing. Since fibonacci numbers are constant, you can cache the result once you have calculated it. A quick c/pseudocode example
class fibstorage {
bool has-result(int n) { return fibresults.contains(n); }
int get-result(int n) { return fibresult.find(n).value; }
void add-result(int n, int v) { fibresults.add(n,v); }
map<int, int> fibresults;
}
fib(int n ) {
if(n==0 || n==1)
return 1;
if (fibstorage.has-result(n)) {
return fibstorage.get-result(n-1);
}
return ( (fibstorage.has-result(n-1) ? fibstorage.get-result(n-1) : fib(n-1) ) +
(fibstorage.has-result(n-2) ? fibstorage.get-result(n-2) : fib(n-2) )
);
}
calcfib(n) {
v = fib(n);
fibstorage.add-result(n,v);
}
This would be quite slow, as every recursion results in 3 lookups, however this should illustrate the general idea
Is this a deliberately chosen example? (eg. an extreme case you're wanting to test)
As it's currently O(1.6^n) i just want to make sure you're just looking for answers on handling the general case of this problem (caching values, etc) and not just accidentally writing poor code :D
Looking at this specific case you could have something along the lines of:
var cache = [];
function fib(n) {
if (n < 2) return 1;
if (cache.length > n) return cache[n];
var result = fib(n - 2) + fib(n - 1);
cache[n] = result;
return result;
}
Which degenerates to O(n) in the worst case :D
[Edit: * does not equal + :D ]
[Yet another edit: the Haskell version (because i'm a masochist or something)
fibs = 1:1:(zipWith (+) fibs (tail fibs))
fib n = fibs !! n
]
Try using a map, n is the key and its corresponding Fibonacci number is the value.
#Paul
Thanks for the info. I didn't know that. From the Wikipedia link you mentioned:
This technique of saving values that
have already been calculated is called
memoization
Yeah I already looked at the code (+1). :)
#ESRogs:
std::map lookup is O(log n) which makes it slow here. Better use a vector.
vector<unsigned int> fib_cache;
fib_cache.push_back(1);
fib_cache.push_back(1);
unsigned int fib(unsigned int n) {
if (fib_cache.size() <= n)
fib_cache.push_back(fib(n - 1) + fib(n - 2));
return fib_cache[n];
}
Others have answered your question well and accurately - you're looking for memoization.
Programming languages with tail call optimization (mostly functional languages) can do certain cases of recursion without stack overflow. It doesn't directly apply to your definition of Fibonacci, though there are tricks..
The phrasing of your question made me think of an interesting idea.. Avoiding stack overflow of a pure recursive function by only storing a subset of the stack frames, and rebuilding when necessary.. Only really useful in a few cases. If your algorithm only conditionally relies on the context as opposed to the return, and/or you're optimizing for memory not speed.
Mathematica has a particularly slick way to do memoization, relying on the fact that hashes and function calls use the same syntax:
fib[0] = 1;
fib[1] = 1;
fib[n_] := fib[n] = fib[n-1] + fib[n-2]
That's it. It caches (memoizes) fib[0] and fib[1] off the bat and caches the rest as needed. The rules for pattern-matching function calls are such that it always uses a more specific definition before a more general definition.
One more excellent resource for C# programmers for recursion, partials, currying, memoization, and their ilk, is Wes Dyer's blog, though he hasn't posted in awhile. He explains memoization well, with solid code examples here:
http://blogs.msdn.com/wesdyer/archive/2007/01/26/function-memoization.aspx
The problem with this code is that it will generate stack overflow error for any number greater than 15 (in most computers).
Really? What computer are you using? It's taking a long time at 44, but the stack is not overflowing. In fact, your going to get a value bigger than an integer can hold (~4 billion unsigned, ~2 billion signed) before the stack is going to over flow (Fibbonaci(46)).
This would work for what you want to do though (runs wiked fast)
class Program
{
public static readonly Dictionary<int,int> Items = new Dictionary<int,int>();
static void Main(string[] args)
{
Console.WriteLine(Fibbonacci(46).ToString());
Console.ReadLine();
}
public static int Fibbonacci(int number)
{
if (number == 1 || number == 0)
{
return 1;
}
var minus2 = number - 2;
var minus1 = number - 1;
if (!Items.ContainsKey(minus2))
{
Items.Add(minus2, Fibbonacci(minus2));
}
if (!Items.ContainsKey(minus1))
{
Items.Add(minus1, Fibbonacci(minus1));
}
return (Items[minus2] + Items[minus1]);
}
}
If you're using a language with first-class functions like Scheme, you can add memoization without changing the initial algorithm:
(define (memoize fn)
(letrec ((get (lambda (query) '(#f)))
(set (lambda (query value)
(let ((old-get get))
(set! get (lambda (q)
(if (equal? q query)
(cons #t value)
(old-get q))))))))
(lambda args
(let ((val (get args)))
(if (car val)
(cdr val)
(let ((ret (apply fn args)))
(set args ret)
ret))))))
(define fib (memoize (lambda (x)
(if (< x 2) x
(+ (fib (- x 1)) (fib (- x 2)))))))
The first block provides a memoization facility and the second block is the fibonacci sequence using that facility. This now has an O(n) runtime (as opposed to O(2^n) for the algorithm without memoization).
Note: the memoization facility provided uses a chain of closures to look for previous invocations. At worst case this can be O(n). In this case, however, the desired values are always at the top of the chain, ensuring O(1) lookup.
As other posters have indicated, memoization is a standard way to trade memory for speed, here is some pseudo code to implement memoization for any function (provided the function has no side effects):
Initial function code:
function (parameters)
body (with recursive calls to calculate result)
return result
This should be transformed to
function (parameters)
key = serialized parameters to string
if (cache[key] does not exist) {
body (with recursive calls to calculate result)
cache[key] = result
}
return cache[key]
By the way Perl has a memoize module that does this for any function in your code that you specify.
# Compute Fibonacci numbers
sub fib {
my $n = shift;
return $n if $n < 2;
fib($n-1) + fib($n-2);
}
In order to memoize this function all you do is start your program with
use Memoize;
memoize('fib');
# Rest of the fib function just like the original version.
# Now fib is automagically much faster ;-)
#lassevk:
This is awesome, exactly what I had been thinking about in my head after reading about memoization in Higher Order Perl. Two things which I think would be useful additions:
An optional parameter to specify a static or member method that is used for generating the key to the cache.
An optional way to change the cache object so that you could use a disk or database backed cache.
Not sure how to do this sort of thing with Attributes (or if they are even possible with this sort of implementation) but I plan to try and figure out.
(Off topic: I was trying to post this as a comment, but I didn't realize that comments have such a short allowed length so this doesn't really fit as an 'answer')