Calculating radial velocity from an inverse linear scale - algorithm

I have a user input that can have an integer value of 1 through 50.
I have, let's imagine, a needle that turns, like the hand of a clock.
The speed of that turn is determined by the delta in radians it moves every frame.
So, if I have a speed of PI, the needle turns a half circle every frame.
I have come to the conclusion that the possible speed should be between PI/8 (the fastest) and PI/256 (the slowest).
I am trying to build an algorithm that translates a user input of 1 (the slowest) through 50 (the fastest) into PI/256 through PI/8 (the max value of 50 is arbitrary and can be something else); obviously the numbers in between should be in reverse correspondence.
What I need would be a formula like:
delta = userInput * (.............)
I have been trying for hours; if someone could help me out it would be very much appreciated.

Solve the line equation: y = m * x + b. I.e., plug in your two points to get two equations with m and b as unknowns, then solve for m and b.

(See my answer here for a more detailed explanation of why this works.)
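Plugging the question's two endpoints (1, pi/256) and (50, pi/8) into y = m * x + b gives the pair of equations
pi/256 = m * 1 + b
pi/8 = m * 50 + b
so m = (pi/8 - pi/256) / 49 and b = pi/256 - m, and then delta = m * userInput + b.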
This was just slightly too long for a comment. Whether the two scales agree in direction doesn't matter, although yours do: you said 1 (the slowest) corresponds to pi/256 (the slowest), and 50 (the fastest) corresponds to pi/8 (the fastest). 1 < 50, and pi/256 < pi/8.
So if that's the right ordering:
>>> from math import pi
>>> a0, a1 = 1., 50.
>>> b0, b1 = pi/256, pi/8
>>> def rescale(x):
...     return ((x-a0)/(a1-a0)) * (b1-b0) + b0
...
>>> rescale(1)
0.01227184630308513
>>> rescale(1) == pi/256
True
>>>
>>> rescale(50)
0.39269908169872414
>>> rescale(50) == pi/8
True
with 25 somewhere close to the middle:
>>> rescale(25)
0.198603553435643
If you want 1 to correspond to the fastest speed instead, then simply flip b0 and b1:
>>> a0, a1 = 1., 50.
>>> b0, b1 = pi/8, pi/256
>>> def rescale(x):
...     return ((x-a0)/(a1-a0)) * (b1-b0) + b0
...
>>> rescale(1)
0.39269908169872414
>>> rescale(50)
0.012271846303085143
The formula continues to apply.


How to determine probability of 0 to N events occurring given probability of each of those N events?

First time posting here, so if I make a mistake with something let me know and I'd be more than happy to fix it!
Given N events, each of which have an individual probability (from 0 to 100%) of occurring, I'd like to determine the probability of 0 to N of those events occurring together.
For example, if I have events 1, 2, 3, ..., N (E1, E2, E3, ..., EN), where the individual probability of a specific event occurring is as follows:
E1 = 30% probability of occurring
E2 = 40% probability of occurring
E3 = 50% probability of occurring
...
EN = x% probability of occurring
I'd like to know the probability of having:
none of these events occurring
any 1 of these events occurring
any 2 of these events occurring
any 3 of these events occurring
...
all N of these events occurring
I understand that having 0 events occurring is (1-E1)(1-E2)...(1-EN) and that having all N events occurring is E1*E2*...*EN. However, I do not know how to calculate the other possibilities (1 to N-1 events occurring).
I have been looking for some recursive algorithm (binomial compound distribution) that could solve this but I have not found any explicit formula that does this. Wondering if any of you guys could help!
Thanks in advance!
EDIT: The events are indeed independent.
Sounds like the Poisson binomial distribution (Wikipedia link).
There's an explicit recursive formula, but beware of numerical stability:
Pr(K = 0) = (1 - p_1)(1 - p_2)...(1 - p_n)
Pr(K = k) = (1/k) * sum_{i=1..k} (-1)^(i-1) * Pr(K = k-i) * T(i), where T(i) = sum_{j=1..n} (p_j / (1 - p_j))^i
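For illustration, a minimal Python transcription of that recursion (my own sketch; the function name is mine):

import math

def poisson_binomial(p):
    # Recursion above; numerically unstable for large n or p_j close to 1.
    n = len(p)
    T = [sum((pj / (1 - pj)) ** i for pj in p) for i in range(n + 1)]
    P = [math.prod(1 - pj for pj in p)]        # Pr(K = 0)
    for k in range(1, n + 1):
        P.append(sum((-1) ** (i - 1) * P[k - i] * T[i]
                     for i in range(1, k + 1)) / k)
    return P                                    # P[k] = Pr(exactly k events)

print(poisson_binomial([0.3, 0.4, 0.5]))  # ≈ [0.21, 0.44, 0.29, 0.06]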
Something like the following recursive program should work.
function p = probability_vector(probabilities)
    if isempty(probabilities)
        % No events can happen.
        p = [1];
    elseif length(probabilities) == 1
        % 0 or 1 events can happen.
        p = [1 - probabilities(1), probabilities(1)];
    else
        % Split the events in half, solve each half recursively,
        % and combine the two distributions with a convolution.
        half = ceil(length(probabilities) / 2);
        p_half1 = probability_vector(probabilities(1:half));
        p_half2 = probability_vector(probabilities(half+1:end));
        p = conv(p_half1, p_half2);
    end
end
And if p is a probability vector, then p(i+1) is the probability of i of the events happening.
See http://matlabtricks.com/post-3/the-basics-of-convolution for an explanation of the magic conv operator that does the meat of the work.
You need to compute your own version of Pascal's triangle, with probabilities (instead of counts) in each location. Row 0 will be the single figure 1.00; row 1 consists of two values, P(E1) and 1-P(E1). Below that, in row k, each position is P(Ek)[above-right entry] + (1-P(Ek))[above-left entry]. I recommend a lower-triangular matrix for this, something like:
1.00
0.30 0.70
0.12 0.46 0.42 # These are 0.3*0.4 | 0.3*0.6 + 0.7*0.4 | 0.7*0.6
0.06 0.29 0.44 0.21 # 0.12*0.5 | 0.12*0.5 + 0.46*0.5 | ...
See how it works? In array / matrix notation for a matrix M, given event probabilities in vector P, this looks something like
M[k, i] = P[k] * M[k-1, i] + (1-P[k]) * M[k-1, i-1]
The above is a nice recursive definition. Note that my earlier "above-right" reference, in the lower-triangular notation, is simply the row above in the same column; above-left is exactly that: row k-1, column i-1.
When you're done, the bottom row of the matrix will be the probabilities of getting N, N-1, N-2, ... 0 of the events. If you want these probabilities in the opposite order, then simply switch the coefficients P[k] and 1-P[k]
Does that get you moving toward a solution?
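As a concrete sketch of that row-by-row update in Python (my own code; note that index i counts the number of events occurring, so each row comes out in the reverse of the triangle's order):

import numpy as np

def exact_k_probabilities(p):
    row = np.array([1.0])             # row 0 of the triangle
    for pk in p:
        nxt = np.zeros(len(row) + 1)
        nxt[:-1] += (1 - pk) * row    # event k does not occur
        nxt[1:] += pk * row           # event k occurs
        row = nxt
    return row                        # row[i] = P(exactly i events occur)

print(exact_k_probabilities([0.3, 0.4, 0.5]))  # ≈ [0.21 0.44 0.29 0.06]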
After tons of research and some help from the answers here, I've come up with the following code:
function [ prob_numSites ] = probability_activationSite( prob_distribution_site )
    N = length(prob_distribution_site);   % number of events
    notProb = 1 - prob_distribution_site; % probability of each event not occurring
    syms x;                               % create symbolic variable
    prob_number = 1;                      % initialize the product to 1
    for i = 1:N
        prob_number = prob_number * (prob_distribution_site(i)*x + notProb(i));
    end
    prob_number_polynomial = expand(prob_number);      % expand the product into a polynomial
    revProb_numSites = coeffs(prob_number_polynomial); % coefficients are the probabilities of 0 to N events
    prob_numSites = fliplr(revProb_numSites);          % reverse the order of the coefficients
end
This takes in the probability of each individual event occurring and returns an array of the probabilities of 0 to N events occurring.
(This answer helped a lot).
None of these answers seemed to work or was understandable to me, so I computed it and made it myself in Python:
import itertools
import numpy as np

def combin(n, k):
    # binomial coefficient "n choose k"
    if k > n // 2:
        k = n - k
    x = 1
    y = 1
    i = n - k + 1
    while i <= n:
        x = (x * i) // y
        y += 1
        i += 1
    return x

# proba is the list of probabilities of the N events, each different from one another.
N = len(proba)
sums = {i: 0.0 for i in range(N + 1)}
# sums[i] starts as the sum, over every i-subset, of the product of its probabilities
for i in range(N, 0, -1):
    for j in itertools.combinations(proba, i):
        sums[i] += np.prod(j)
# inclusion-exclusion: turn sums[i] into the probability of exactly i events occurring
for i in range(N, 0, -1):
    for j in range(i + 1, N + 1):
        sums[i] -= combin(j, i) * sums[j]
The math is not super simple:
Let $C_{w_i}$ be the set of all unordered $i$-subsets $\{j_1, \dots, j_i\}$ of $w = \{1, \dots, n\}$, and let $S_i = \sum_{s \in C_{w_i}} \prod_{j \in s} p_j$. Then
$$Co(i, P) = S_i - \sum_{u=i+1}^{n} \binom{u}{i}\, Co(u, P)$$
where $Co(i, P)$ is the probability of exactly $i$ events occurring, given $P = \{p_1, \dots, p_n\}$, the Bernoulli probability of each event.

Pseudo-Random Variable

I have a variable, between 0 and 1, which should dictate the likelihood that a second variable, a random number between 0 and 1, is greater than 0.5. In other words, if I were to generate the second variable 1000 times, the average should be approximately equal to the first variable's value. How do I write this code?
Oh, and the second variable should always be capable of producing any value between 0 and 1 in any condition, just more or less likely depending on the value of the first variable. Here is a link to a graph which models approximately how I would like the program to behave. Each equation represents a separate value for the first variable.
You have a variable p and you are looking for a mapping function f(x) that maps random rolls between x in [0, 1] to the same interval [0, 1] such that the expected value, i.e. the average of all rolls, is p.
You have chosen the function prototype
f(x) = pow(x, c)
where c must be chosen appropriately. If x is uniformly distributed in [0, 1], the average value is:
int(f(x) dx, [0, 1]) == p
With the integral:
int(pow(x, c) dx) == pow(x, c + 1) / (c + 1) + K
one gets:
c = 1/p - 1
A different approach is to make p the median value of the distribution, such that half of the rolls fall below p, the other half above p. This yields a different distribution. (I am aware that you didn't ask for that.) Now, we have to satisfy the condition:
f(0.5) == pow(0.5, c) == p
which yields:
c = log(p) / log(0.5)
With the current function prototype, you cannot satisfy both requirements. Your function is also asymmetric (f(x, p) != f(1-x, 1-p)).
Python functions below:
import math
import random

def medianrand(p):
    """Random number between 0 and 1 whose median is p"""
    c = math.log(p) / math.log(0.5)
    return math.pow(random.random(), c)

def averagerand(p):
    """Random number between 0 and 1 whose expected value is p"""
    c = 1/p - 1
    return math.pow(random.random(), c)
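A quick check that averagerand has the intended mean (a sketch; the printed value will vary slightly from run to run):

samples = [averagerand(0.3) for _ in range(100000)]
print(sum(samples) / len(samples))  # should come out close to 0.3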
You can do this by using a dummy. First set the first variable to a value between 0 and 1. Then create a random number in the dummy between 0 and 1. If this dummy is bigger than the first variable, you generate a random number between 0 and 0.5, and otherwise you generate a number between 0.5 and 1.
In pseudocode:
real a = 0.7
real total = 0.0
for i between 0 and 1000 begin
    real dummy = rand(0,1)
    real b
    if dummy > a then
        b = rand(0,0.5)
    else
        b = rand(0.5,1)
    end if
    total = total + b
end for
real avg = total / 1000
Please note that this algorithm will generate average values between 0.25 and 0.75. For a = 1 it will only generate random values between 0.5 and 1, which should average to 0.75. For a=0 it will generate only random numbers between 0 and 0.5, which should average to 0.25.
I've made a sort of pseudo-solution to this problem, which I think is acceptable.
Here is the algorithm I made:
a = 0.2  # variable one
b = random.random()  # variable two
b = b**(1/(2**(4*a - 1)))
It doesn't actually produce the average results that I wanted, but it's close enough for my purposes.
Edit: Here's a graph I made from a large number of datapoints generated with a Python script using this algorithm:
import random

mod = 6
div = 100
for z in xrange(div):
    s = 0
    for i in xrange(100000):
        a = (z+1)/float(div)   # variable one
        b = random.random()    # variable two
        c = b**(1/(2**((mod*a*2)-mod)))
        s += c
    print str((z+1)/float(div)) + "\t" + str(round(s/100000.0, 3))
Each point in the table is the result of 100000 randomly generated points from the algorithm; their x positions being the a value given, and their y positions being their average. Ideally they would fit to a straight line of y = x, but as you can see they fit closer to an arctan equation. I'm trying to mess around with the algorithm so that the averages fit the line, but I haven't had much luck as of yet.

How Could One Implement the K-Means++ Algorithm?

I am having trouble fully understanding the K-Means++ algorithm. I am interested exactly how the first k centroids are picked, namely the initialization as the rest is like in the original K-Means algorithm.
Is the probability function used based on distance or on a Gaussian?
At the same time, the point most distant from the other centroids is picked as a new centroid.
I would appreciate a step-by-step explanation and an example. The one in Wikipedia is not clear enough. Also, very well commented source code would help. If you are using 6 arrays, then please tell us which one is for what.
Interesting question. Thank you for bringing this paper to my attention - K-Means++: The Advantages of Careful Seeding
In simple terms, cluster centers are initially chosen at random from the set of input observation vectors, where the probability of choosing vector x is high if x is not near any previously chosen centers.
Here is a one-dimensional example. Our observations are [0, 1, 2, 3, 4]. Let the first center, c1, be 0. The probability that the next cluster center, c2, is x is proportional to ||c1-x||^2. So, P(c2 = 1) = 1a, P(c2 = 2) = 4a, P(c2 = 3) = 9a, P(c2 = 4) = 16a, where a = 1/(1+4+9+16).
Suppose c2=4. Then, P(c3 = 1) = 1a, P(c3 = 2) = 4a, P(c3 = 3) = 1a, where a = 1/(1+4+1).
I've coded the initialization procedure in Python; I don't know if this helps you.
import numpy as np

def initialize(X, K):
    # X: array of observation vectors; K: number of centers to pick
    C = [X[0]]
    for k in range(1, K):
        # squared distance from each point to its nearest already-chosen center
        D2 = np.array([min([np.inner(c-x, c-x) for c in C]) for x in X])
        probs = D2 / D2.sum()
        cumprobs = probs.cumsum()
        r = np.random.rand()
        for j, p in enumerate(cumprobs):
            if r < p:
                i = j
                break
        C.append(X[i])
    return C
EDIT with clarification: The output of cumsum gives us boundaries to partition the interval [0, 1]. These partitions have length equal to the probability of the corresponding point being chosen as a center. So, since r is uniformly chosen in [0, 1], it falls into exactly one of these intervals (because of break). The for loop checks to see which partition r is in.
Example:
probs = [0.1, 0.2, 0.3, 0.4]
cumprobs = [0.1, 0.3, 0.6, 1.0]
if r < cumprobs[0]:
    # this event has probability 0.1
    i = 0
elif r < cumprobs[1]:
    # this event has probability 0.2
    i = 1
elif r < cumprobs[2]:
    # this event has probability 0.3
    i = 2
elif r < cumprobs[3]:
    # this event has probability 0.4
    i = 3
One liner.
Say we need to select 2 cluster centers. Instead of selecting them all randomly (as we do in plain k-means), we select the first one randomly, then find the points that are farthest from the first center (these points most probably do not belong to the first cluster center, as they are far from it) and assign the second cluster center near those far points.
I have prepared a full source implementation of k-means++ based on the book "Programming Collective Intelligence" by Toby Segaran and the k-means++ initialization provided here.
Indeed there are two distance functions here. For the initial centroids a standard one is used, based on numpy.inner, and then for the centroid fixation the Pearson one is used. Maybe the Pearson one could also be used for the initial centroids; they say it is better.
from __future__ import division
def readfile(filename):
    lines = [line for line in file(filename)]
    rownames = []
    data = []
    for line in lines:
        p = line.strip().split(' ')  # single space as separator
        # First column in each row is the rowname
        rownames.append(p[0])
        # The data for this row is the remainder of the row
        data.append([float(x) for x in p[1:]])
    return rownames, data
from math import sqrt
def pearson(v1, v2):
    # Simple sums
    sum1 = sum(v1)
    sum2 = sum(v2)
    # Sums of the squares
    sum1Sq = sum([pow(v, 2) for v in v1])
    sum2Sq = sum([pow(v, 2) for v in v2])
    # Sum of the products
    pSum = sum([v1[i]*v2[i] for i in range(len(v1))])
    # Calculate r (Pearson score)
    num = pSum - (sum1*sum2/len(v1))
    den = sqrt((sum1Sq - pow(sum1, 2)/len(v1)) * (sum2Sq - pow(sum2, 2)/len(v1)))
    if den == 0: return 0
    return 1.0 - num/den
import numpy
from numpy.random import *
def initialize(X, K):
    C = [X[0]]
    for _ in range(1, K):
        D2 = numpy.array([min([numpy.inner(numpy.array(c)-numpy.array(x),
                                           numpy.array(c)-numpy.array(x)) for c in C]) for x in X])
        probs = D2/D2.sum()
        cumprobs = probs.cumsum()
        r = rand()
        i = -1
        for j, p in enumerate(cumprobs):
            if r < p:
                i = j
                break
        C.append(X[i])
    return C

# k-means loop as in Segaran's "Programming Collective Intelligence",
# seeded with the k-means++ centers from initialize() above.
def kcluster(rows, distance=pearson, k=4):
    clusters = initialize(rows, k)
    lastmatches = None
    for t in range(100):
        bestmatches = [[] for i in range(k)]
        # Find which centroid is the closest for each row
        for j in range(len(rows)):
            row = rows[j]
            bestmatch = 0
            for i in range(k):
                d = distance(clusters[i], row)
                if d < distance(clusters[bestmatch], row): bestmatch = i
            bestmatches[bestmatch].append(j)
        # If the results are the same as last time, this is complete
        if bestmatches == lastmatches: break
        lastmatches = bestmatches
        # Move the centroids to the average of their members
        for i in range(k):
            avgs = [0.0]*len(rows[0])
            if len(bestmatches[i]) > 0:
                for rowid in bestmatches[i]:
                    for m in range(len(rows[rowid])):
                        avgs[m] += rows[rowid][m]
                for j in range(len(avgs)):
                    avgs[j] /= len(bestmatches[i])
                clusters[i] = avgs
    return bestmatches
rows, data = readfile('/home/toncho/Desktop/data.txt')
kclust = kcluster(data, k=4)
print "Result:"
for c in kclust:
    out = ""
    for r in c:
        out += rows[r] + ' '
    print "[" + out[:-1] + "]"
print 'done'
data.txt:
p1 1 5 6
p2 9 4 3
p3 2 3 1
p4 4 5 6
p5 7 8 9
p6 4 5 4
p7 2 5 6
p8 3 4 5
p9 6 7 8

An inverse Fibonacci algorithm?

There are dozens of ways of computing F(n) for an arbitrary n, many of which have great runtime and memory usage.
However, suppose I wanted to ask the opposite question:
Given F(n) for n > 2, what is n?
(The n > 2 restriction is in there since F(1) = F(2) = 1 and there's no unambiguous inverse).
What would be the most efficient way of solving this problem? It's easy to do this in linear time by enumerating the Fibonacci numbers and stopping when you hit the target number, but is there some way of doing this any faster than that?
EDIT: currently, the best solution posted here runs in O(log n) time using O(log n) memory, assuming that mathematical operations run in O(1) and that a machine word can hold any number in O(1) space. I'm curious if it's possible to drop the memory requirements, since you can compute Fibonacci numbers using O(1) space.
Since the OP asked about a matrix solution not involving any floating-point computations, here it is. We can achieve O(log n) complexity this way, assuming numeric operations have O(1) complexity.
Let's take a 2x2 matrix A with the following structure:
1 1
1 0
Now consider the vector (8, 5), storing two consecutive Fibonacci numbers. If you multiply it by this matrix, you'll get (8*1 + 5*1, 8*1 + 5*0) = (13, 8) - the next pair of consecutive Fibonacci numbers.
If we generalize, A^n * (1, 0) = (f(n), f(n - 1)).
The actual algorithm takes two steps:
1. Calculate A^2, A^4, A^8, etc. until we pass the desired number.
2. Do a binary search over n, using the calculated powers of A (see the sketch below).
On a side note, any sequence of the form f(n) = k1*f(n-1) + k2*f(n-2) + k3*f(n-3) + ... + kt*f(n-t) can be represented like this.
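A minimal integer-only Python sketch of those two steps (my own code; fib_index and mul are names I made up, and f is assumed to be F(n) for some n > 2):

def mul(A, B):
    # 2x2 integer matrix product
    (a, b), (c, d) = A
    (p, q), (r, s) = B
    return ((a*p + b*r, a*q + b*s), (c*p + d*r, c*q + d*s))

def fib_index(f):
    A = ((1, 1), (1, 0))              # A^m = [[F(m+1), F(m)], [F(m), F(m-1)]]
    pows = [(1, A)]
    while pows[-1][1][0][1] < f:      # step 1: square until F(2^k) >= f
        e, M = pows[-1]
        pows.append((2*e, mul(M, M)))
    n, M = 0, ((1, 0), (0, 1))        # step 2: binary search using the stored powers
    for e, P in reversed(pows):
        cand = mul(M, P)
        if cand[0][1] <= f:           # keep this power if F(n + e) <= f
            n, M = n + e, cand
    return n if M[0][1] == f else None

print(fib_index(832040))  # 30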
Wikipedia gives the result as
n(F) = Floor[ Log(F Sqrt(5) + 1/2)/Log(Phi)]
where Phi is the golden ratio.
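In Python, that is (a direct transcription; double precision limits how large F can get before the log loses the answer):

import math

def n_from_fib(F):
    phi = (1 + math.sqrt(5)) / 2
    return math.floor(math.log(F * math.sqrt(5) + 0.5) / math.log(phi))

print(n_from_fib(832040))  # 30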
If you can easily interpret F(n) in binary,
You may be suspicious of the constants 1.7 and 1.1. These work because d*1.44042009041... + C never gets very close to an integer.
I can post a derivation tomorrow if there is interest.
Here is a table with n = 2 through 91, which shows the formula result before flooring:
n formula w/o floor F(n) F(n) in binary
2 2.540 1 1
3 3.981 2 10
4 4.581 3 11
5 5.421 5 101
6 6.862 8 1000
7 7.462 13 1101
8 8.302 21 10101
9 9.743 34 100010
10 10.343 55 110111
11 11.183 89 1011001
12 12.623 144 10010000
13 13.223 233 11101001
14 14.064 377 101111001
15 15.504 610 1001100010
16 16.104 987 1111011011
17 17.545 1597 11000111101
18 18.385 2584 101000011000
19 19.825 4181 1000001010101
20 20.425 6765 1101001101101
21 21.266 10946 10101011000010
22 22.706 17711 100010100101111
23 23.306 28657 110111111110001
24 24.147 46368 1011010100100000
25 25.587 75025 10010010100010001
26 26.187 121393 11101101000110001
27 27.028 196418 101111111101000010
28 28.468 317811 1001101100101110011
29 29.068 514229 1111101100010110101
30 30.508 832040 11001011001000101000
31 31.349 1346269 101001000101011011101
32 32.789 2178309 1000010011110100000101
33 33.389 3524578 1101011100011111100010
34 34.230 5702887 10101110000010011100111
35 35.670 9227465 100011001100110011001001
36 36.270 14930352 111000111101000110110000
37 37.111 24157817 1011100001001111001111001
38 38.551 39088169 10010101000111000000101001
39 39.151 63245986 11110001010000111010100010
40 40.591 102334155 110000110010111111011001011
41 41.432 165580141 1001110111101000110101101101
42 42.032 267914296 1111111110000000110000111000
43 43.472 433494437 11001110101101001100110100101
44 44.313 701408733 101001110011101010010111011101
45 45.753 1134903170 1000011101001010011111110000010
46 46.353 1836311903 1101101011100111110010101011111
47 47.193 2971215073 10110001000110010010010011100001
48 48.634 4807526976 100011110100011010000101001000000
49 49.234 7778742049 111001111101001100010111100100001
50 50.074 12586269025 1011101110001100110011100101100001
51 51.515 20365011074 10010111101110110010110100010000010
52 52.115 32951280099 11110101100000011001010000111100011
53 53.555 53316291173 110001101001111001100000101001100101
54 54.396 86267571272 1010000010101111100101010110001001000
55 55.836 139583862445 10000001111111110110001011011010101101
56 56.436 225851433717 11010010010101110010110110001011110101
57 57.276 365435296162 101010100010101101001000001100110100010
58 58.717 591286729879 1000100110101011011011110111110010010111
59 59.317 956722026041 1101111011000001000100111001011000111001
60 60.157 1548008755920 10110100001101100100000110001001011010000
61 61.598 2504730781961 100100011100101101100101101010100100001001
62 62.198 4052739537881 111010111110011010000110011011101111011001
63 63.038 6557470319842 1011111011011000111101100000110010011100010
64 64.478 10610209857723 10011010011001100001110010100010000010111011
65 65.078 17167680177565 11111001110100101001011110101000010110011101
66 66.519 27777890035288 110010100001110001011010001001010011001011000
67 67.359 44945570212853 1010001110000010110100101111110010101111110101
68 68.800 72723460248141 10000100010010001000000000000111101001001001101
69 69.400 117669030460994 11010110000010011110100110000101111111001000010
70 70.240 190392490709135 101011010010100100110100110001101101000010001111
71 71.681 308061521170129 1000110000010111000101001100010011100111011010001
72 72.281 498454011879264 1110001010101011101011110010100001001111101100000
73 73.121 806515533049393 10110111011000010110000111110110100110111000110001
74 74.561 1304969544928657 100101000101101110011100110001010110000110110010001
75 75.161 2111485077978050 111100000000110001001101110000001010111101111000010
76 76.602 3416454622906707 1100001000110011111101010100001100001000100101010011
77 77.442 5527939700884757 10011101000111010000111000010001101100000010100010101
78 78.042 8944394323791464 11111110001101110000100010110011001101000111001101000
79 79.483 14472334024676221 110011011010101000001011011000100111001001001101111101
80 80.323 23416728348467685 1010011001100010110001111101111000000110010000111100101
81 81.764 37889062373143906 10000110100110111110011011000111100111111011010101100010
82 82.364 61305790721611591 11011001110011010100101010110110101000101101011101000111
83 83.204 99194853094755497 101100000011010010011000101111110010000101000110010101001
84 84.644 160500643816367088 1000111010001101100111110000110100111001010110001111110000
85 85.244 259695496911122585 1110011010100111111010110110110011001001111111000010011001
86 86.085 420196140727489673 10111010100110101100010100111101000000011010101010010001001
87 87.525 679891637638612258 100101101111011101011101011110011011001101010100010100100010
88 88.125 1100087778366101931 111101000100010011000000000110000011010000101001100110101011
89 89.566 1779979416004714189 1100010110011110000011101100100011110011101111101111011001101
90 90.406 2880067194370816120 10011111111000000011011101101010100001101110100111100001111000
91 91.846 4660046610375530309 100000010101011110011111011001111000000001100100101011101000101
Derivation
Required Precision for α = log 2/log Φ ≈ 1.44042...
Measuring memory usage by counting unbounded words is sort of silly, but as long as that's the model, there's an O(log n) time, O(1) word solution similar to Nikita Rybak's that in essence computes n via its Zeckendorf representation, which is based on the Fibonacci numbers (YO DAWG).
Define the matrix

    A = | 1 1 |
        | 1 0 |,

which satisfies

    A^m = | F(m+1)  F(m)   |
          | F(m)    F(m-1) |.
Instead of the sequence A^(2^k), we're going to use the sequence A^F(k). The latter sequence has the property that we can move forward with a matrix multiply
A^F(k + 1) = A^F(k - 1) * A^F(k)
and backward with a matrix inverse and multiplication
A^F(k - 1) = A^F(k + 1) (A^F(k))^-1,
so we can build a bidirectional iterator with only twelve words, assuming we store everything as rationals (to avoid assuming the existence of a unit-cost divide). The rest is just adapting this O(1)-space algorithm for finding a Zeckendorf representation.
def zeck(n):
    a, b = (0, 1)
    while b < n:
        a, b = (b, a + b)
    yield a
    n1 = a
    while n1 < n:
        a, b = (b - a, a)
        if n1 + a <= n:
            yield a
            n1 += a
            a, b = (b - a, a)
>>> list(zeck(0))
[0]
>>> list(zeck(2))
[1, 1]
>>> list(zeck(12))
[8, 3, 1]
>>> list(zeck(750))
[610, 89, 34, 13, 3, 1]
It's been proven that the formula for the nth Fibonacci number is fib(n) = ((phi)^n - (-phi)^(-n)) / sqrt(5), where phi = (1+sqrt(5))/2, the golden ratio (see this link).
You could try to find a mathematical inverse to the fib function above, or otherwise do a binary search in 32/64 operations (depending on how big your searchable maximum is) to find the n that matches the number (try each n by computing fib(n) and splitting your sample space in two according to how fib(n) compares to the given fibonacci number).
Edit: #rcollyer's solution is faster, as mine is in O(lg n) and the one he found is in O(1) = constant time.
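A sketch of that binary search (my own code; with doubles, Binet's formula stays exact only up to roughly n = 70, so the default bound is an assumption):

import math

def fib(n):
    sqrt5 = math.sqrt(5)
    phi = (1 + sqrt5) / 2
    return round(phi**n / sqrt5)     # Binet's formula, rounded

def inv_fib(f, hi=70):
    lo = 2
    while lo < hi:                   # binary search over the index n
        mid = (lo + hi) // 2
        if fib(mid) < f:
            lo = mid + 1
        else:
            hi = mid
    return lo if fib(lo) == f else None

print(inv_fib(6765))  # 20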
So I was thinking about this problem and I think that it's possible to do this in O(lg n) time with O(lg n) memory usage. This is based on the fact that
F(n) = (1/√5) (Φ^n - φ^n)
where Φ = (1 + √5)/2 and φ = 1 - Φ.
The first observation is that |φ^n| < 1 for any n ≥ 1, so |φ^n / √5| < 1/2. This means that for any n > 2, we have
F(n) = ⌊ Φ^n / √5 + 1/2 ⌋
(that is, Φ^n / √5 rounded to the nearest integer).
Now, take n and write it in binary as b_(k-1) b_(k-2) ... b_1 b_0. This means that
n = 2^(k-1) b_(k-1) + 2^(k-2) b_(k-2) + ... + 2^1 b_1 + 2^0 b_0.
This means that, up to the rounding term,
F(n) ≈ Φ^(2^(k-1) b_(k-1) + 2^(k-2) b_(k-2) + ... + 2^1 b_1 + 2^0 b_0) / √5
Or, more readably, that
F(n) ≈ Φ^(2^(k-1) b_(k-1)) · Φ^(2^(k-2) b_(k-2)) ⋯ Φ^(2^0 b_0) / √5
This suggests the following algorithm. First, start computing Φ^(2^k) for all k until you compute a number Φ^z such that ⌊ Φ^z / √5 + 1/2 ⌋ is greater than your number F(n). Now, from there, iterate backwards across all of the powers of Φ you generated this way. If the current number is bigger than the indicated power of Φ, then divide it by that power of Φ and record that the number was divided by this value. This process essentially recovers one bit of n at a time by subtracting out the largest power of 2 that you can. Consequently, once you're done, you'll have found n.
The runtime of this algorithm is O(lg n), since you can generate Φ^(2^i) by repeated squaring, and we only generate O(lg n) terms. The memory usage is O(lg n), since we store all of these values.
You can find n for any Fib(n) in O(1) time and O(1) space.
You can use a fixed-point CORDIC algorithm to compute ln() using only shift and add on integer data types.
If x = Fib(n), then n can be determined by
n = int(2.0801 * ln(x) + 2.1408)
CORDIC run-time is determined by the desired level of precision. The two floating-point values would be encoded as fixed-point values.
The only issue with this proposal is that it returns a value for numbers that are not in the Fibonacci sequence, but the original problem specifically stated that the input to the function would be Fib(n), which implies that only valid Fibonacci numbers would be used.
EDIT: Never mind. The asker has stated in comments that exponentiation is definitely not constant time.
Is exponentiation one of the mathematical operations that you'll allow in constant time? If so, we can compute F(n) in constant time via the closed-form formula. Then, given some F, we can do the following:
Compute F(1), F(2), F(4), F(8), F(16), ... until F(2^k) <= F < F(2^(k+1))
Do a binary search for i between 2^k and 2^{k+1} until F(i) <= F < F(i+1)
If F = F(n), then first part takes k = O(log(n)) steps. The second part is a binary search over a range of size O(2^k), so it also takes k = O(log(n)). So, in total, we have O(log(n)) time in O(1) space if (and it's a big if) we have exponentiation in O(1) time.
A closed form of the Fibonacci number formula is:
F_n = Round(φ^n / Sqrt(5))
Where φ is the golden ratio.
If we ignore the rounding factor this is invertible, and the inverse function is:
F^(-1)(n) = log(n * Sqrt(5)) / log(φ)
Because we ignored the rounding factor there is an error in the formula which could be calculated. However if we consider that a number n is a Fibonacci number iff the interval [n*φ - 1/n, n*φ + 1/n] contains a natural number then:
A number is a Fibonacci number iff the interval [n*φ - 1/n, n*φ + 1/n] contains a natural number and that number's index in the Fibonacci sequence is given by rounding log(n*Sqrt(5))/logφ
This should be doable in (pseudo)-constant time depending on the algorithms used for calculating the log and square roots etc.
Edit: φ = (1+Sqrt(5))/2
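Putting the membership test and the index formula together (my own Python sketch of the statement above):

import math

def fib_index_if_fib(n):
    sqrt5 = math.sqrt(5)
    phi = (1 + sqrt5) / 2
    # n is a Fibonacci number iff [n*phi - 1/n, n*phi + 1/n] contains a natural number
    if math.floor(n*phi + 1.0/n) < n*phi - 1.0/n:
        return None
    return round(math.log(n * sqrt5) / math.log(phi))

print(fib_index_if_fib(13))  # 7
print(fib_index_if_fib(14))  # None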
This might be similar to user635541's answer. I don't fully understand his approach.
Using the matrix representation for Fibonacci numbers, discussed in other answers, we get a way to go from F_n and F_m to F_{n+m} and F_{n-m} in constant time, using only plus, multiplication, minus and division (actually not! see the update). We also have a zero (the identity matrix), so it is a mathematical group!
Normally when doing binary search we also want a division operator for taking averages. Or at least division by 2. However if we want to go from F_{2n} to F_n it requires a square root. Luckily it turns out that plus and minus are all we need for a logarithmic time 'nearly' binary search!
Wikipedia writes about the approach, ironically called Fibonacci_search, but the article is not very clearly written, so I don't know if it is exactly the same approach as mine. It is very important to understand that the Fibonacci numbers used for the Fibonacci search have nothing to do with the numbers we are looking for. It's a bit confusing. To demonstrate the approach, here is first an implementation of standard 'binary search' only using plus and minus:
def search0(test):
    # Standard binary search invariants:
    #   i <= lo then test(i)
    #   i >= hi then not test(i)
    # Extra invariants:
    #   hi - lo = b
    #   a, b = F_{k-1}, F_k
    a, b = 0, 1
    lo, hi = 0, 1
    while test(hi):
        a, b = b, a + b
        hi = b
    while b != 1:
        mi = lo + a
        if test(mi):
            lo = mi
            a, b = 2*a - b, b - a
        else:
            hi = mi
            a, b = b - a, a
    return lo
>>> search0(lambda n: n**2 <= 25)
5
>>> search0(lambda n: 2**n <= 256)
8
Here test is some boolean function; a and b are consecutive Fibonacci numbers f_k and f_{k-1} such that the difference between our upper bound hi and lower bound lo is always f_k. We need both a and b so we can increase and decrease the implicit variable k efficiently.
Alright, so how do we use this to solve the problem? I found it useful to create a wrapper around our Fibonacci representation that hides the matrix details. In practice (is there such a thing for a Fibonacci searcher?) you would want to inline everything manually. That would spare you the redundancy in the matrices and let you make some optimizations around the matrix inversion.
import numpy as np

class Fib:
    def __init__(self, k, M):
        """ `k` is the 'name' of the fib, e.g. k=6 for F_6=8.
            We need this to report our result in the very end.
            `M` is the matrix representation, that is
            [[F_{k+1}, F_k], [F_k, F_{k-1}]] """
        self.k = k
        self.M = M
    def __add__(self, other):
        return Fib(self.k + other.k, self.M.dot(other.M))
    def __sub__(self, other):
        return self + (-other)
    def __neg__(self):
        return Fib(-self.k, np.round(np.linalg.inv(self.M)).astype(int))
    def __eq__(self, other):
        return self.k == other.k
    def value(self):
        return self.M[0,1]
However the code does work, so we can test it as follows. Notice how little different the search function is from when our objects were integers and not Fibonaccis.
def search(test):
    Z = Fib(0, np.array([[1,0],[0,1]]))  # Our 0 element
    A = Fib(1, np.array([[1,1],[1,0]]))  # Our 1 element
    a, b = Z, A
    lo, hi = Z, A
    while test(hi.value()):
        a, b = b, a + b
        hi = b
    while b != A:
        mi = lo + a
        if test(mi.value()):
            lo = mi
            a, b = a+a-b, b-a
        else:
            hi = mi
            a, b = b-a, a
    return lo.k
>>> search(lambda n: n <= 144)
12
>>> search(lambda n: n <= 0)
0
The remaining open question is whether there is an efficient search algorithm for monoids, that is, one that doesn't need a minus / additive inverse. My guess is no: without minus you need the extra memory of Nikita Rybak's solution.
Update
I just realized that we don't need division at all. The determinant of the F_n matrix is just (-1)^n, so we can actually do everything without division. In the below I removed all the matrix code, but I kept the Fib class, just because everything got so extremely messy otherwise.
class Fib2:
    def __init__(self, k, fp, f):
        """ `fp` and `f` are F_{k-1} and F_{k} """
        self.k, self.fp, self.f = k, fp, f
    def __add__(self, other):
        fnp, fn, fmp, fm = self.fp, self.f, other.fp, other.f
        return Fib2(self.k + other.k, fn*fm + fnp*fmp, (fn+fnp)*fm + fn*fmp)
    def __sub__(self, other):
        return self + (-other)
    def __neg__(self):
        fp, f = self.f + self.fp, -self.f
        return Fib2(-self.k, (-1)**self.k*fp, (-1)**self.k*f)
    def __eq__(self, other):
        return self.k == other.k
    def value(self):
        return self.f
def search2(test):
    Z = Fib2(0, 1, 0)
    A = Fib2(1, 0, 1)
    ...
>>> search2(lambda n: n <= 280571172992510140037611932413038677189525)
200
>>> search2(lambda n: n <= 4224696333392304878706725602341482782579852840250681098010280137314308584370130707224123599639141511088446087538909603607640194711643596029271983312598737326253555802606991585915229492453904998722256795316982874482472992263901833716778060607011615497886719879858311468870876264597369086722884023654422295243347964480139515349562972087652656069529806499841977448720155612802665404554171717881930324025204312082516817125)
2000
>>> search2(lambda n: n <= 253116232373236124224015500352060729176635648580248527895192984199131278176054131523015342346375883163744348821921103768903367353146274288532972407155518761802693163044919315892277133164230203033197109868923578084347825850277920029363565189748330968604286099636444351455877215604369140415581957298497175427851311248798589271822959332948357853141914880538028162426090036299355691663861393997707468501618825858431232913952639355809684081297042295241855899185577230688244257485558923716521991223820131118474907513732298765604986630536691373492442582268133896650746385518023628358240986119921232383594789114376541491334500845602200945570421089163779191126547516776970447733485910982259005377493297846565102385144792060131010628895789430159250206156052813120307277867749144342092182259070991044861732915613535546462089178845956608157282488951429635067095082420824517066760172641709112799999994114991301042453204688195828540946846321189758221507543651558401629787457218390794925728626160861240137963948471310113812040467173219045132788143320102518402754169612411446348866535938587091033147615666588945983209271030415963701970729798841784876701108542527187558800867142249143400511528833434383777879228238357673634141441024899408156483020236382050419007450456661251596513466568328935618872754946373283007581185157496155866927884736327987059532009984467687945719643253597335712830539029047134948025875181289031477972350810422952516174064398442397865963823307446310036650057197723450846471007810258130482323543651814507448282481299651161416193331338988963093532013950707599210056107753402820725757425770627820130830264263467811259109184308266572169711783872643176674115874355429886456099325554760849668685018580465979021712242653513325337142225068448611345734182791162551712881544732595854791211324236720199067223068130881919594101615600196195470024157655375073768155225684542115938685839943345004590397516708425287684884808591015694160329342406779309727112880681751490653165240776311830816237703346320351465753121041314919121359545528038763103066559458918360157534002717299722248908163114472887362180552864876851136894863952297553904699539570768893897884708462158647352954667895822625504238999871814130305503606077200388777303842236691382039774855079317816722019334601743002413449614114599189622774184251571899789862726991823692045349394665827387047326452311913376544765329502288642917494265301465652190946961318498367143146593496548942551598106754608734234835072420758354443610729408763797502514784625452693844243564492823102786870139481909113291239747571378759361275836481268755672514645664687891216927421920970816667866815218494157859020195314403051938192227325266665267171752631860667675455617037935095634209545561278020219992261539278557248174791343556086699543257868097124396686811001658139569631092251980368583746079535838461801721546812288044225234368454723366850231323932835267131813060424746045213412183330528439872643857378779849961276093946242792291765926304633308400720805663199685631553969823402295345221150567562915363786725269505692534522008402007161122057570084126830263899527284216099421963268457536418016099188488509185825999629962714861445669666141274504051998157554380484746399742232656389704380373297039748847164490618331014469124364914954239469152497202393519063367282730611652571288295910843421165246562114470201533665745953213402691521450996087743059584428758535029023454756457484875311028110154593154722581176344171021745297966817802528646015832465885290410579247246810899613547663721205750819217691090042282696
9523438985332067597093454021924077101784215936539638808624420121459718286059401823614213214326004270471752802725625810953787713898846144256909835116371235019527013180204030167601567064268573820697948868982630904164685161783088076506964317303709708574052747204405282785965604677674192569851918643651835755242670293612851920696732320545562286110332140065912751551110134916256237884844001366366654055079721985816714803952429301558096968202261698837096090377863017797020488044826628817462866854321356787305635653577619877987998113667928954840972022833505708587561902023411398915823487627297968947621416912816367516125096563705174220460639857683971213093125)
20000
This all works like a charm. My only worry is that the bit complexity so dominates the calculation that we might as well have done a sequential search. Or actually, just looking at the number of digits could probably tell you pretty much which number you were looking at. That's not as fun, though.

Efficient algorithm to determine range [a, b] of sin wave with interval

I have a sine wave whose parameters I can determine (they are user-input). It's of the form y=a*sin(m*x + t)
I'd like to know whether anyone knows an efficient algorithm to figure out the range of y over a given interval [0, x] (x is again another input).
For example:
for y = sin(x) (i.e. a=1, t=0, m=1), for the interval [0, 4] I'd like an output like [1, -0.756802]
Please keep in mind, m and t can be anything. Thus, the y-curve does not have to start (or end) at 0 (or 1). It could start anywhere.
Also, please note that x will be discrete.
Any ideas?
PS: I'll use python for implementing the algorithm.
Since the function y(x) = a*sin(m*x + t) is continuous, the maximum will be either at one of the interval's ends or at a maximum inside the interval, where dy/dx equals zero.
So:
1. Find the values of y(x) at the ends of the interval.
2. Find out whether dy/dx = a*m*cos(m*x + t) has zero(s) in the interval, and find the values of y(x) at those zero(s).
3. Choose the point where y(x) has its maximum value.
If you have more than one period then the result is just +/- a.
For less than one period you can evaluate y at the start/end points and then find any maxima between the start/end points by solving for y' = 0, i.e. cos(m*x + t) = 0.
All the answers are more or less the same. Thanks guys =)
I think I'd go with something like the following (note that I am renaming the variable I called "x" to "end"; that "x" at the beginning denoted the end of my interval on the X-axis):
1) Evaluate y at 0 and at "end"; use an if-block to assign the two values to the correct PRELIMINARY "min" and "max" of the range.
2) Evaluate the number of revolutions: evolNr = (m*end)/(2*Pi). If evolNr > 1, return [-a, a].
3) If evolNr < 1: first find the root of the derivative, which is at firstRoot = (1/(2m))*Pi - phase + q*(1/m)*Pi, where q = ceil((m/Pi) * ((1/(2m))*Pi - phase)); this gives me the first root at some position x > 0. From then on I know that all other extrema lie between firstRoot and "end", with a new root every (1/m)*Pi.
In code: for (x = firstRoot; x < end; x += Pi/m) { evaluate y at x; if y > 0 it's a maximum, update "max", otherwise update "min" }
return [min, max] - a Python sketch follows below.
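For illustration, a minimal Python sketch of this approach (my own code; it assumes m > 0 and simply collects the endpoint values plus every extremum inside the interval, which also covers the more-than-one-period case):

import math

def sine_range(a, m, t, end):
    # endpoint values of y = a*sin(m*x + t) on [0, end]
    ys = [a * math.sin(t), a * math.sin(m*end + t)]
    # roots of the derivative: cos(m*x + t) = 0  =>  x = (Pi/2 - t + q*Pi) / m
    q = math.ceil((t - math.pi/2) / math.pi)   # first root with x >= 0
    x = (math.pi/2 - t + q*math.pi) / m
    while x <= end:                            # one extremum every Pi/m
        ys.append(a * math.sin(m*x + t))
        x += math.pi / m
    return [min(ys), max(ys)]

print(sine_range(1, 1, 0, 4))  # [-0.7568024953079282, 1.0]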
