I want to select a random number from 0, 1, 2, 3, ..., n, but I want the chance of selecting k (0 < k < n) to be lower than the chance of selecting k - 1 by a factor of x, so x = (k - 1) / k. The bigger the number, the smaller the chance of picking it.
As an answer I would like to see an implementation of the following method:
int pickANumber(n,x)
This is for a game that I am developing. I found these related questions, but they are not exactly the same:
How to pick an item by its probability
C Function for picking from a list where each element has a distinct probabili
For n items, you need to solve the equation:
p1 + p2 + ... + pn = 1
p1 = p2 * x
p2 = p3 * x
...
p_n-1 = pn * x
Solving this gives you:
p1 + p2 + ... + pn = 1
(p2 * x) + (p3 * x) + ... + (pn * x) + pn = 1
(p3 * x^2) + (p4 * x^2) + ... + (pn * x^2) + (pn * x) + pn = 1
....
pn* (x^(n-1) + x^(n-2) + ... +x^1 + x^0) = 1
pn*(1-x^n)/(1-x) = 1
pn = (1-x)/(1-x^n)
This gives you the probability you need to set to pn, and from it you can calculate the probabilities for all other p1,p2,...p_n-1
Now, you can use a "black box" RNG that chooses a number with a distribution, like those in the threads you mentioned.
A simple approach is to build an auxiliary array of cumulative sums:
aux[i] = p1 + p2 + ... + pi
Now draw a uniformly distributed random number between 0 and aux[n] and, since the aux array is sorted, use binary search to find the first index whose value in aux is greater than the random number you drew. That index is the number you pick.
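The steps above can be sketched in Python as follows. This is only a minimal sketch: the names geometric_probabilities and pick_a_number are made up for illustration, the items are numbered 1..n as in the derivation, and x ** n is assumed to be small enough not to overflow a float.

```python
import bisect
import random


def geometric_probabilities(n, x):
    """Return [p1, ..., pn] with p_k = p_(k+1) * x and p1 + ... + pn = 1."""
    if x == 1:
        return [1.0 / n] * n  # the closed form divides by zero when x == 1
    pn = (1 - x) / (1 - x ** n)
    return [pn * x ** (n - k) for k in range(1, n + 1)]


def pick_a_number(n, x, rng=random):
    """Pick k in 1..n with probability p_k (hypothetical helper name)."""
    aux = []
    total = 0.0
    for p in geometric_probabilities(n, x):
        total += p
        aux.append(total)  # running sums p1, p1 + p2, ..., sorted ascending
    u = rng.random() * aux[-1]
    # Index of the first cumulative sum strictly greater than u.
    i = bisect.bisect_right(aux, u)
    return min(i, n - 1) + 1  # the clamp guards against rounding at the top end
```

For example, with n = 4 and x = 2 the probabilities come out as 8/15, 4/15, 2/15 and 1/15, so 1 is picked about eight times as often as 4.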
Original answer, for subtraction (before the question was edited):
For n items, you need to solve the equation:
p1 + p2 + ... + pn = 1
p1 = p2 + x
p2 = p3 + x
...
p_n-1 = pn + x
Solving this gives you:
p1 + p2 + ... + pn = 1
(p2 + x) + (p3 + x) + ... + (pn + x) + pn = 1
((p3 + x) + x) + ((p4 + x) + x) + ... + ((pn + x) + x) + (pn + x) + pn = 1
....
n * pn + ((n-1)x + (n-2)x + ... + x + 0) = 1
n * pn + x * n(n-1)/2 = 1
pn = (1 - x * n(n-1)/2) / n
This gives you the probability you need to set to pn, and from it you can calculate the probabilities for all other p1,p2,...p_n-1
Now, you can use a "black box" RNG that chooses a number with a distribution, like those in the threads you mentioned.
Be advised: there is no guarantee that a solution with 0<p_i<1 for all i exists. Your requirements alone cannot guarantee one; it depends on whether the values of n and x fit.
Edit: This answer was for the OP's original question, which differed in that each probability was supposed to be lower than the previous one by a fixed amount.
Well, let's see what the constraints say. You want to have P(k) = P(k - 1) - x. So we have:
P(0)
P(1) = P(0) - x
P(2) = P(0) - 2x
...
In addition, the sum of P(k) over k = 0, ..., n must be 1. Summing, we get:
1 = (n + 1) * P(0) - x * n * (n + 1) / 2
This gives you a simple constraint between x and P(0). Solve for one in terms of the other.
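As a rough illustration of solving that constraint for P(0), here is a small Python sketch. The function name linear_probabilities is made up, and exact fractions are used only to keep the arithmetic easy to check. It returns None when x is too large for the given n, because some P(k) would become negative.

```python
from fractions import Fraction


def linear_probabilities(n, x):
    """Return [P(0), ..., P(n)] with P(k) = P(0) - k*x, or None if impossible."""
    x = Fraction(x)
    # Solve 1 = (n + 1) * P(0) - x * n * (n + 1) / 2 for P(0).
    p0 = Fraction(1, n + 1) + x * n / 2
    probs = [p0 - k * x for k in range(n + 1)]
    if min(probs) < 0:
        return None  # no valid distribution for this combination of n and x
    return probs
```

For n = 3 and x = 1/10 this gives 2/5, 3/10, 1/5, 1/10, which sums to 1.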
For this I would use the Mersenne Twister algorithm, which Boost provides, to generate a uniform distribution, and then use a mapping function to map the results of that uniform distribution to the number actually selected.
Here's a quick example of a potential implementation, although I left out the quadratic equation implementation since it is well known:
#include <cmath>
#include <cstdlib>
#include <ctime>
#include <boost/random/mersenne_twister.hpp>
#include <boost/random/uniform_int.hpp>
#include <boost/random/variate_generator.hpp>
// Positive root of a * i^2 + b * i + c = 0 (implementation left out, it is well known).
double quadtratic_equation_for_positive_value(double a, double b, double c);
// Size of the range covered by the integers 1..i: f(i) = x * i^2 / 2 + b * i.
double f_of_xib(double x, double i, double b)
{
    return x * i * i / 2 + b * i;
}
// Since f(1) = x, b = x - x / 2.
double b_of_x(double x)
{
    return x - x / 2;
}
int pickANumber(boost::mt19937& gen, int n, int x)
{
    // First, determine the range r required where the probability equals i * x,
    // since the probability of each increasing integer is x higher of occurring.
    // Let f(i) = r and given f'(i) = x * i then r = ( x * i ^ 2 ) / 2 + b * i
    // where b = ( r - ( x * i ^ 2 ) / 2 ) / i. Since r = x when i = 1 from the problem
    // definition, this reduces to b = r - r / 2. Therefore, to find r_max simply
    // plug in x to find b, then plug in n for i, x, and b to get r_max, since r_max
    // occurs when n == i.
    double b = b_of_x(x);
    int r_max = static_cast<int>(f_of_xib(x, n, b));
    boost::uniform_int<> range(0, r_max - 1);
    boost::variate_generator<boost::mt19937&, boost::uniform_int<> > next(gen, range);
    // Now to map the random number to the desired number, just find the positive value
    // for i when r is the returned random number, which boils down to finding the
    // non-zero root of 0 = ( x * i ^ 2 ) / 2 + b * i - r and rounding up to the
    // next integer.
    int r = next();
    return static_cast<int>(std::floor(quadtratic_equation_for_positive_value(x / 2.0, b, -r))) + 1;
}
int main(int argc, char** argv)
{
    boost::mt19937 gen;
    gen.seed(static_cast<unsigned int>(time(0)));
    pickANumber(gen, 10, 1);
    system("pause");
}
I'm looking for an optimized integer-based point-on-line algorithm, where you can define the line using begin and end coordinates, and the point to find based on either an x or y input.
I know how to do this using dy/dx division but I'm looking for an algorithm that eliminates all divisions.
This is what I'm currently doing:
int lerpmult = ((px - v0.x) << 16) / (v1.x - v0.x);
vec2 result{px, v0.y + ((lerpmult * (v1.y - v0.y)) >> 16)};
The division in the first line is the problem I'm trying to eliminate.
One trick to solve this would be using the scalar product to determine the cosine of the angle between two vectors:
def line_test(a, b, p):
    v_ap = tuple(m - n for n, m in zip(a, p))
    v_ab = tuple(m - n for n, m in zip(a, b))
    scp = sum(m * n for m, n in zip(v_ap, v_ab))
    return scp > 0 and scp * scp == sum(n * n for n in v_ap) * sum(n * n for n in v_ab) and all(abs(m) <= abs(n) for m, n in zip(v_ap, v_ab))
The parameters of the above function are the end-points of the line (a and b) and the point p (c in the image), which we want to test.
Step by step, the following happens in each line:
v_ap = tuple(m - n for n, m in zip(a, p))
We calculate the vector from a to p (v_ap).
v_ab = tuple(m - n for n, m in zip(a, b))
The vector from a to b (v_ab).
scp = sum(m * n for m, n in zip(v_ap, v_ab))
In this line the scalar product of v_ap and v_ab is calculated. The result is scp = cos(v_ab, v_ap) * euclidean_length(v_ab) * euclidean_length(v_ap), where the euclidean length of a vector is defined as sqrt(sum(n * n for n in vector)) (the standard definition of the geometric length of a vector).
return scp > 0 and scp * scp == sum(n * n for n in v_ap) * sum(n * n for n in v_ab) and all(abs(m) <= abs(n) for m, n in zip(v_ap, v_ab))
This line is pretty complex, so I'll break it down into a few parts:
scp * scp == sum(n * n for n in v_ap) * sum(n * n for n in v_ab)
Since division isn't allowed, we shouldn't use the square root either, since its calculation usually involves divisions. So instead of calculating the square root, we square both the product of the euclidean lengths of the two vectors and the scalar product, thus eliminating the square-root calculation:
scp = cos(v_ab, v_ap) * euclidean_length(v_ab) * euclidean_length(v_ap) =
= cos(v_ab, v_ap) * sqrt(sum(n ^ 2 for n in v_ab)) * sqrt(sum(n ^ 2 for n in v_ap))
scp ^ 2 = cos(v_ab, v_ap) ^ 2 * sum(n ^ 2 for n in v_ab) * sum(n ^ 2 for n in v_ap)
The cosine of the angle between the two vectors is 1 if they point in the same direction. So the square of the scalar product, if the vectors share the same direction, would be
euclidean_length(v_ap) ^ 2 * euclidean_length(v_ab) ^ 2
which we then compare to the square of the actual scalar product, scp * scp.
This however leaves one problem: squaring eliminates the sign, so the equality only tells us that the two vectors are parallel, not that they point in the same direction. Since the euclidean length is never negative, only the sign of the cosine determines the sign of scp. A negative value of scp means that the angle between v_ap and v_ab is greater than pi / 2 (and at most pi). This problem is solved by checking scp > 0 in addition.
Last but not least we have to check whether the distance from a to p is at most the distance from a to b. This can be done by checking whether v_ap is no longer than v_ab. Since we already checked that the two vectors point in exactly the same direction, it is sufficient to check whether every element of v_ap is at most as large in absolute value as the corresponding element of v_ab (the absolute values are needed because the components of v_ab can be negative), which is done by
all(abs(m) <= abs(n) for m, n in zip(v_ap, v_ab))
The answer you are looking for is as follows:
Let's say our line equation is Ax + By + C = 0. Then we just need
these three coefficients (A, B and C).
Say this line goes through the points P(P_x, P_y) and Q(Q_x, Q_y). Then
it is easy to calculate the above three coefficients.
A = P_y - Q_y,
B = Q_x - P_x,
C = - A P_x - B P_y
Once we have our line equation, we can easily calculate the x or y
coordinate for a given y or x, respectively.
Here is my C++ template:
#include <iostream>
using namespace std;
// point struct
struct pt {
int x, y;
};
// line struct
struct line {
int a, b, c;
// create line object
line() {}
line (pt p, pt q) {
a = p.y - q.y;
b = q.x - p.x;
c = - a * p.x - b * p.y;
}
// a != 0 must hold, otherwise a runtime error (division by zero) will occur
int getX(int y) {
return (-b * y - c) / a;
}
// b != 0 must hold, otherwise a runtime error (division by zero) will occur
int getY(int x) {
return (-a * x - c) / b;
}
};
int main() {
pt p, q;
p.x = 1, p.y = 2;
q.x = 3, q.y = 6;
line m = line(p, q);
cout << "for y = 4, x = " << m.getX(4) << endl;
cout << "for x = 2, y = " << m.getY(2) << endl;
return 0;
}
Output:
for y = 4, x = 2
for x = 2, y = 4
Ref: http://e-maxx.ru/algo/segments_intersection
How to find the n-th term in a sequence with following recurrence relation for a given n?
F(n) = 2 * b * F(n – 1) – F(n – 2), F(0) = a, F(1) = b
where a and b are constants.
The value of n is quite large (1 ≤ n ≤ 10^12) and so matrix exponentiation is required.
Here is my code for it; ll is a typedef for long long int, and value is to be taken modulo r.
void multiply(ll F[2][2], ll M[2][2])
{
ll x = ((F[0][0] * M[0][0]) % r + (F[0][1] * M[1][0]) % r) % r;
ll y = ((F[0][0] * M[0][1]) % r + (F[0][1] * M[1][1]) % r) % r;
ll z = ((F[1][0] * M[0][0]) % r + (F[1][1] * M[1][0]) % r) % r;
ll w = ((F[1][0] * M[0][1]) % r + (F[1][1] * M[1][1]) % r) % r;
F[0][0] = x;
F[0][1] = y;
F[1][0] = z;
F[1][1] = w;
}
void power(ll F[2][2], ll n, ll b)
{
if (n == 0 || n == 1)
return;
ll M[2][2] = {{2 * b, -1}, {1, 0}};
power(F, n / 2,b);
multiply(F, F);
if (n % 2 != 0)
multiply(F, M);
}
ll rec(ll n, ll b, ll a)
{
ll F[2][2] = {{2 * b, -1}, {1, 0}};
if (n == 0)
return a;
if (n == 1)
return b;
power(F, n - 1,b);
return F[0][0] % r;
}
However I am facing problems getting required value in all cases, that is I am getting Wrong Answer (WA) verdict for some cases.
Could anyone help me with this question and point out the mistake in this code so I can tackle these kind of problems myself afterward?
P.S. First timer here. Apologies if I did something incorrectly and missed out on anything.
Technical:
Perhaps you are asked to find the value res modulo r so that 0 <= res < r.
However, by using -1 in the matrix, you can actually get negative intermediate and final values. The reason is that, in most programming languages, the modulo operation actually uses division rounded towards zero, and so produces a result in the range -r < res < r (example link).
Try either of the following:
Change that -1 to r - 1, so that all intermediate values remain non-negative.
Fix the final result by returning (F[0][0] + r) % r instead of just F[0][0] % r.
Formula:
Your formula looks wrong. Logically, your rec function says that nothing except F(0) depends on a, which is obviously wrong.
Recall why and how we use the matrix in the first place:
( F(n)   ) = ( 2b  -1 ) * ( F(n-1) )
( F(n-1) )   ( 1    0 )   ( F(n-2) )
Here, we get a 2x1 vector by multiplying a 2x2 matrix and a 2x1 vector. We then look at its top element and have, by multiplication rules,
F(n) = 2b * F(n-1) + (-1) * F(n-2)
The point is, we can take the power of the matrix to get the following:
( F(n)   ) = ( 2b  -1 ) ^{n-1} * ( F(1) )
( F(n-1) )   ( 1    0 )          ( F(0) )
By the same argument, we have
F(n) = X * F(1) + Y * F(0)
where X and Y are the top row of the matrix:
( 2b  -1 ) ^{n-1} = ( X  Y )
( 1    0 )          ( Z  T )
So F[0][0] % r is not the answer, really.
The real answer looks like
(F[0][0] * b + F[0][1] * a) % r
If we can have negative intermediate values (see point 1 above), the result is still from -r to r instead of from 0 to r. To fix it, we can add one more r and take the modulo once again:
((F[0][0] * b + F[0][1] * a) % r + r) % r
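To make the corrected formula concrete, here is a minimal Python sketch of the same idea. The helper names mat_mul, mat_pow and nth_term are made up for illustration. Note that Python's % already returns a non-negative result for a positive modulus, so the extra "+ r" step needed in C++ does not appear here.

```python
def mat_mul(A, B, r):
    """Multiply two 2x2 matrices modulo r."""
    return [[(A[i][0] * B[0][j] + A[i][1] * B[1][j]) % r for j in range(2)]
            for i in range(2)]


def mat_pow(M, e, r):
    """Raise a 2x2 matrix to the power e modulo r by repeated squaring."""
    result = [[1 % r, 0], [0, 1 % r]]  # identity matrix
    while e > 0:
        if e & 1:
            result = mat_mul(result, M, r)
        M = mat_mul(M, M, r)
        e >>= 1
    return result


def nth_term(n, a, b, r):
    """F(n) mod r for F(n) = 2*b*F(n-1) - F(n-2), F(0) = a, F(1) = b."""
    if n == 0:
        return a % r
    if n == 1:
        return b % r
    P = mat_pow([[2 * b % r, -1 % r], [1, 0]], n - 1, r)
    # Top row of the matrix power applied to the vector (F(1), F(0)).
    return (P[0][0] * b + P[0][1] * a) % r
```

With a = 3 and b = 5 the sequence starts 3, 5, 47, 465, 4603, which is easy to compare against the function for small n before trusting it for n around 10^12.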
A possible reason for the WA is that you return a or b without applying the modulus.
Try this:
if (n == 0)
return a%r;
if (n == 1)
return b%r;
If you are still getting WA, please give some test cases or problem link.
Here I'll use the notation
It is possible to find the continued fraction of a number by computing the number and then applying the definition, but that requires at least O(n) bits of memory to find a0, a1 ... an, and in practice it is much worse. Using double floating point precision it is only possible to find a0, a1 ... a19.
An alternative is to use the fact that if a, b, c are rational numbers then there exist unique rationals x, y, z such that 1/(a + b*2^(1/3) + c*2^(2/3)) = x + y*2^(1/3) + z*2^(2/3), namely
x = (a^2 - 2bc) / D, y = (2c^2 - ab) / D, z = (b^2 - ac) / D, where D = a^3 + 2b^3 + 4c^3 - 6abc.
So if I represent x, y, and z exactly using the Boost rational library, I can obtain floor(x + y*2^(1/3) + z*2^(2/3)) accurately using only double precision for 2^(1/3) and 2^(2/3), because I only need it to be within 1/2 of the true value. Unfortunately the numerators and denominators of x, y, and z grow considerably fast, and if you use regular floats instead the errors pile up quickly.
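As a sanity check on the reciprocal identity (the same expressions that appear as den, a, b and c in the C++ loop), here is a small Python sketch using exact fractions. The function names reciprocal and multiply are made up for illustration; numbers of the form a + b*t + c*t^2 with t = 2^(1/3) are represented as coefficient triples.

```python
from fractions import Fraction


def reciprocal(a, b, c):
    """Return (x, y, z) with 1 / (a + b*t + c*t^2) = x + y*t + z*t^2, t = 2^(1/3)."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    den = a**3 + 2 * b**3 + 4 * c**3 - 6 * a * b * c
    return ((a * a - 2 * b * c) / den,
            (2 * c * c - a * b) / den,
            (b * b - a * c) / den)


def multiply(u, v):
    """Multiply two coefficient triples, reducing with t^3 = 2."""
    a, b, c = u
    x, y, z = v
    return (a * x + 2 * (b * z + c * y),
            a * y + b * x + 2 * c * z,
            a * z + b * y + c * x)
```

For instance, reciprocal(-1, 1, 0) gives (1, 1, 1), which matches 1/(t - 1) = 1 + t + t^2, and multiplying any triple by its reciprocal gives (1, 0, 0).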
This way I was able to compute a0, a1 ... a10000 in under an hour, but somehow Mathematica can do that in 2 seconds. Here's my code for reference:
#include <iostream>
#include <boost/multiprecision/cpp_int.hpp>
namespace mp = boost::multiprecision;
int main()
{
const double t_1 = 1.259921049894873164767210607278228350570251;
const double t_2 = 1.587401051968199474751705639272308260391493;
mp::cpp_rational p = 0;
mp::cpp_rational q = 1;
mp::cpp_rational r = 0;
for(unsigned int i = 1; i != 10001; ++i) {
double p_f = static_cast<double>(p);
double q_f = static_cast<double>(q);
double r_f = static_cast<double>(r);
uint64_t floor = p_f + t_1 * q_f + t_2 * r_f;
std::cout << floor << ", ";
p -= floor;
//std::cout << floor << " " << p << " " << q << " " << r << std::endl;
mp::cpp_rational den = (p * p * p + 2 * q * q * q +
4 * r * r * r - 6 * p * q * r);
mp::cpp_rational a = (p * p - 2 * q * r) / den;
mp::cpp_rational b = (2 * r * r - p * q) / den;
mp::cpp_rational c = (q * q - p * r) / den;
p = a;
q = b;
r = c;
}
return 0;
}
The Lagrange algorithm
The algorithm is described for example in Knuth's book The Art of Computer Programming, vol 2 (Ex 13 in section 4.5.3 Analysis of Euclid's Algorithm, p. 375 in 3rd edition).
Let f be a polynomial of integer coefficients whose only real root is an irrational number x0 > 1. Then the Lagrange algorithm calculates the consecutive quotients of the continued fraction of x0.
I implemented it in python
def cf(a, N=10):
    """
    a : list - coefficients of the polynomial,
        i.e. f(x) = a[0] + a[1]*x + ... + a[n]*x^n
    N : number of quotients to output
    """
    # Degree of the polynomial
    n = len(a) - 1
    # List of consecutive quotients
    ans = []
    def shift_poly():
        """
        Replaces polynomial f(x) with f(x+1) (shifts its graph to the left).
        """
        for k in range(n):
            for j in range(n - 1, k - 1, -1):
                a[j] += a[j+1]
    for _ in range(N):
        quotient = 1
        shift_poly()
        # While the root is >1 shift it left
        while sum(a) < 0:
            quotient += 1
            shift_poly()
        # Otherwise, we have the next quotient
        ans.append(quotient)
        # Replace polynomial f(x) with -x^n * f(1/x)
        a.reverse()
        a = [-x for x in a]
    return ans
It takes about 1s on my computer to run cf([-2, 0, 0, 1], 10000). (The coefficients correspond to the polynomial x^3 - 2 whose only real root is 2^(1/3).) The output agrees with the one from Wolfram Alpha.
Caveat
The coefficients of the polynomials evaluated inside the function quickly become quite large integers, so this approach needs a bigint implementation in other languages. (Pure Python 3 handles this natively, but NumPy, for example, does not.)
You might have more luck computing 2^(1/3) to high accuracy and then trying to derive the continued fraction from that, using interval arithmetic to determine if the accuracy is sufficient.
Here's my stab at this in Python, using Halley iteration to compute 2^(1/3) in fixed point. The dead code is an attempt to compute fixed-point reciprocals more efficiently than Python via Newton iteration -- no dice.
Timing from my machine is about thirty seconds, spent mostly trying to extract the continued fraction from the fixed point representation.
prec = 40000
a = 1 << (3 * prec + 1)
two_a = a << 1
x = 5 << (prec - 2)
while True:
    x_cubed = x * x * x
    two_x_cubed = x_cubed << 1
    x_prime = x * (x_cubed + two_a) // (two_x_cubed + a)
    if -1 <= x_prime - x <= 1: break
    x = x_prime
cf = []
four_to_the_prec = 1 << (2 * prec)
for i in range(10000):
    q = x >> prec
    r = x - (q << prec)
    cf.append(q)
    if True:
        x = four_to_the_prec // r
    else:
        x = 1 << (2 * prec - r.bit_length())
        while True:
            delta_x = (x * ((four_to_the_prec - r * x) >> prec)) >> prec
            if not delta_x: break
            x += delta_x
print(cf)
I'm trying to analyze these functions but I am getting a bit lost. For function f, t(n) = c if n < 1e-5.
If n >= 1e-5 I get t(n) = c2 + t(n / 2) + t2(n / 2), where t2 is the time analysis of function g, but I'm confused about how to expand it. Should it be something like
t(n) = ( t(n / 2) + t2( n / 2) ) * c2 + c
or should I be expanding t2 inside of that?
Here is the code I am trying to analyze.
float f( float x) {
if ( abs( x ) < 1e-5 ) {
return x + ( ( x * x * x ) / 2 );
}
float y = f( x / 2 );
float z = g( x / 2 );
return 2 * y * z;
}
float g( float x ) {
if ( abs( x ) < 1e-5 ) {
return 1 + ( ( x * x ) / 2 );
}
float y = f( x / 2 );
float z = g( x / 2 );
return ( z * z ) + ( y * y );
}
T1(n) = T1(n / 2) + T2(n / 2) + c1
T2(n) = T1(n / 2)+T2(n / 2) + c2
so we have
T1(n) = O(T2(n))
T1(n) = 2T1(n / 2) + c1
Since c1 = O(n^(log_2 2 - ε)) for some ε > 0, the master theorem implies that
T1(n) = O(n)
Even though we are calling two different functions in this code, there is a thing about them that makes finding the complexity of this recursion easy.
What's happening is that at the top level, if you are entering f(), you are evaluating x and then calling two different functions - itself and g(). Even if you enter the function g() first, same thing happens, i.e. g() calls itself and f().
Since the value of x halves at every level down the tree, the number of levels of this tree is log2(n). Also, every node has 2 children, viz. f(x/2) and g(x/2).
This is a complete binary tree of height log2(n).
Work done on each node is constant: if the node represents a call to f(), you do 2 * y * z, which is constant. If the node represents a call to g(), you do y*y + z*z, which is also constant.
Hence, all we need to do is find the total number of nodes in a complete binary tree of height log2(n), and we have our complexity.
A perfect binary tree of height h has a total of 2^(h + 1) - 1 nodes.
In this case that is 2^(log2(n) + 1) - 1 nodes.
Also, a^(log_a(b)) = b (by a property of logarithms) [1].
Hence, the complexity is O(2^(log2(n))) = O(n).
[1] See first property in "Cancelling Exponentials" section.
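The node count can also be checked empirically. The following Python sketch simply counts calls and does not compute the functions themselves; count_calls and its threshold parameter are hypothetical names. It relies on the fact that f and g have the same call structure, so one counter covers both.

```python
def count_calls(x, threshold=1e-5):
    """Count every call (to f or g) triggered by one top-level call with argument x."""
    if abs(x) < threshold:
        return 1  # base case: constant work, no further calls
    # One call here plus the two recursive calls on x / 2 (one to f, one to g).
    return 1 + 2 * count_calls(x / 2, threshold)
```

With threshold = 1.0, an argument of 8 needs four halvings to reach the base case, giving 2^5 - 1 = 31 calls, and doubling the argument to 16 gives 63 calls, which is the linear growth the analysis predicts.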
A question last week defined the zig zag ordering on an n by m matrix and asked how to list the elements in that order.
My question is how to quickly find the ith item in the zigzag ordering? That is, without traversing the matrix (for large n and m that's much too slow).
For example with n=m=8 as in the picture and (x, y) describing (row, column)
f(0) = (0, 0)
f(1) = (0, 1)
f(2) = (1, 0)
f(3) = (2, 0)
f(4) = (1, 1)
...
f(63) = (7, 7)
Specific question: what is the ten billionth (1e10) item in the zigzag ordering of a million by million matrix?
Let's assume that the desired element is located in the upper half of the matrix. The lengths of the diagonals are 1, 2, 3, ..., n.
Let's find the desired diagonal. It satisfies the following property:
sum(1, 2, ..., k) >= pos but sum(1, 2, ..., k - 1) < pos. The sum of 1, 2, ..., k is k * (k + 1) / 2. So we just need to find the smallest integer k such that k * (k + 1) / 2 >= pos. We can either use a binary search or solve this quadratic inequality explicitly.
Once we know k, we just need to find element number pos - (k - 1) * k / 2 of this diagonal. We know where it starts and in which direction we should move (up or down, depending on the parity of k), so we can find the desired cell using a simple formula.
This solution has O(1) or O(log n) time complexity (depending on whether we solve the inequality explicitly or use a binary search to find k).
If the desired element is located in the lower half of the matrix, we can solve this problem for pos' = n * n - pos + 1 and then use symmetry to get the solution to the original problem.
I used 1-based indexing in this solution; using 0-based indexing might require adding +1 or -1 somewhere, but the idea of the solution is the same.
If the matrix is rectangular, not square, we need to take into account that the lengths of the diagonals look like this: 1, 2, 3, ..., m, m, m, ..., m, m - 1, ..., 1 (if m <= n) when we search for k, so the sum becomes something like k * (k + 1) / 2 if k <= m and m * (m + 1) / 2 + m * (k - m) otherwise.
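The "find the diagonal" step for the upper half of a square matrix can be sketched in Python like this. It is a sketch under assumptions: find_diagonal is a made-up name, indexing is 1-based as in the description, and the quadratic inequality is solved with math.isqrt (Python 3.8+) rather than a binary search, to avoid floating-point rounding for large pos.

```python
import math


def find_diagonal(pos):
    """Return (k, offset): pos is the offset-th cell (1-based) of diagonal k (1-based)."""
    # Smallest k with k * (k + 1) / 2 >= pos, i.e. largest k with (k - 1) * k / 2 < pos.
    k = (math.isqrt(8 * pos - 7) + 1) // 2
    # Subtract the cells of the k - 1 diagonals that come before diagonal k.
    offset = pos - (k - 1) * k // 2
    return k, offset
```

For the ten-billion-and-first cell (pos = 10^10 + 1 in 1-based terms) this lands on diagonal 141421 at offset 121091; turning that into row and column still requires the parity rule described above.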
import math, random
def naive(n, m, ord, swap = False):
    dx = 1
    dy = -1
    if swap:
        dx, dy = dy, dx
    cur = [0, 0]
    for i in range(ord):
        cur[0] += dy
        cur[1] += dx
        if cur[0] < 0 or cur[1] < 0 or cur[0] >= n or cur[1] >= m:
            dx, dy = dy, dx
            if cur[0] >= n:
                cur[0] = n - 1
                cur[1] += 2
            if cur[1] >= m:
                cur[1] = m - 1
                cur[0] += 2
            if cur[0] < 0: cur[0] = 0
            if cur[1] < 0: cur[1] = 0
    return cur
def fast(n, m, ord, swap = False):
    # Python 3: // keeps all of the arithmetic below in exact integers
    if n < m:
        x, y = fast(m, n, ord, not swap)
        return [y, x]
    alt = n * m - ord - 1
    if alt < ord:
        x, y = fast(n, m, alt, swap if (n + m) % 2 == 0 else not swap)
        return [n - x - 1, m - y - 1]
    if ord < (m * (m + 1) // 2):
        diag = int((-1 + math.sqrt(1 + 8 * ord)) / 2)
        parity = (diag + (0 if swap else 1)) % 2
        within = ord - (diag * (diag + 1) // 2)
        if parity: return [diag - within, within]
        else: return [within, diag - within]
    else:
        ord -= (m * (m + 1) // 2)
        diag = ord // m
        within = ord - diag * m
        diag += m
        parity = (diag + (0 if swap else 1)) % 2
        if not parity:
            within = m - within - 1
        return [diag - within, within]
if __name__ == "__main__":
    for i in range(1000):
        n = random.randint(3, 100)
        m = random.randint(3, 100)
        ord = random.randint(0, n * m - 1)
        swap = random.randint(0, 99) < 50
        na = naive(n, m, ord, swap)
        fa = fast(n, m, ord, swap)
        assert na == fa, "(%d, %d, %d, %s) ==> (%s), (%s)" % (n, m, ord, swap, na, fa)
    print(fast(1000000, 1000000, 9999999999, False))
    print(fast(1000000, 1000000, 10000000000, False))
So the 10-billionth element (the one with ordinal 9999999999), and the 10-billion-first element (the one with ordinal 10^10) are:
[20331, 121089]
[20330, 121090]
An analytical solution
In the general case, your matrix will be divided in 3 areas:
an initial triangle t1
a skewed part mid where diagonals have a constant length
a final triangle t2
Let's call p the index of your diagonal run.
We want to define two functions x(p) and y(p) that give you the column and row of the pth cell.
Initial triangle
Let's look at the initial triangular part t1, where each new diagonal is one unit longer than the preceding.
Now let's call d the index of the diagonal that holds the cell, and Sd the number of cells in the diagonals that precede it:
Sd = 1 + 2 + ... + d
We have p = Sd + k, with 0 <= k <= d and
Sd = d(d+1)/2
Setting k = 0 and solving for d gives
d²+d-2p = 0, a quadratic equation where we retain only the positive root:
d = (-1+sqrt(1+8*p))/2
Now we want the largest integer not exceeding this value, which is floor(d).
In the end, we have
p = d(d+1)/2 + k with d = floor((-1+sqrt(1+8*p))/2) and k = p - d(d+1)/2
Let's call
o(d) the function that equals 1 if d is odd and 0 otherwise, and
e(d) the function that equals 1 if d is even and 0 otherwise.
We can compute x(p) and y(p) like so:
d = floor((-1+sqrt(1+8*p))/2)
k = p - d(d+1)/2
o = d % 2
e = 1 - o
x = o*d + (e-o)*k
y = e*d + (o-e)*k
The even and odd functions are used to try to salvage some clarity, but you can replace
e(d) with 1 - o(d) and have slightly more efficient but less symmetric formulae for x and y.
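Here is a short Python sketch of the initial-triangle computation. Two things in it are assumptions rather than part of the derivation: the name triangle_cell is made up, and math.isqrt (Python 3.8+) replaces the floating-point sqrt so that the result stays exact for very large p. The orientation is the one used in the C++ implementation of this answer, with x as the column and y as the row, which reproduces f(0) to f(4) from the question.

```python
import math


def triangle_cell(p):
    """Return (x, y) for the 0-based index p, valid only inside the initial triangle."""
    d = (math.isqrt(1 + 8 * p) - 1) // 2   # index of the diagonal holding p
    k = p - d * (d + 1) // 2               # position of p along that diagonal
    o = d % 2                              # 1 if d is odd
    e = 1 - o                              # 1 if d is even
    x = o * d + (e - o) * k                # column
    y = e * d + (o - e) * k                # row
    return x, y
```

For example, triangle_cell(3) returns (0, 2), i.e. column 0 and row 2, which is f(3) = (2, 0) in the question's (row, column) notation.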
Middle part
Let's consider the smallest matrix dimension s, i.e. s = min (m,n).
The previous formulae hold until x or y (whichever comes first) reaches the value s.
The upper bound of p such that x(i) <= s and y(i) <= s for all i in [0..p]
(i.e. the cell indexed by p is inside the initial triangle t1) is given by
pt1 = s(s+1)/2.
For p >= pt1, the diagonal length remains equal to s until we reach the second triangle t2.
When inside mid, we have:
p = s(s+1)/2 + ds + k with k in [0..s[.
which yields:
d = floor ((p - s(s+1)/2)/s)
k = p - s(s+1)/2 - ds
We can then use the same even/odd trick to compute x(p) and y(p):
p -= s(s+1)/2
d = floor (p / s)
k = p - d*s
o = (d+s) % 2
e = 1 - o
x = o*s + (e-o)*k
y = e*s + (o-e)*k
if (n > m)
x += d+e
y -= e
else
y += d+o
x -= o
Final triangle
Using symmetry, we can calculate pt2 = m*n - s(s+1)/2
We now face nearly the same problem as for t1, except that the diagonal may run in the same direction as for t1 or in the reverse direction (if n+m is odd).
Using symmetry tricks, we can compute x(p) and y(p) like so:
p = n*m -1 - p
d = floor((-1+sqrt(1+8*p))/2)
k = p - d*(d+1)/2
o = (d+m+n) % 2
e = 1 - o
x = n-1 - (o*d + (e-o)*k)
y = m-1 - (e*d + (o-e)*k)
Putting all together
Here is a sample c++ implementation.
I used 64-bit integers out of sheer laziness. Most could be replaced by 32-bit values.
The computations could be made more effective by precomputing a few more coefficients.
A good part of the code could be factorized, but I doubt it is worth the effort.
Since this is just a quick and dirty proof of concept, I did not optimize it.
#include <cstdio> // printf
#include <cstdlib> // exit
#include <cmath> // sqrt, floor
#include <algorithm> // min
using namespace std;
typedef long long tCoord;
void panic(const char * msg)
{
printf("PANIC: %s\n", msg);
exit(-1);
}
struct tPoint {
tCoord x, y;
tPoint(tCoord x = 0, tCoord y = 0) : x(x), y(y) {}
tPoint operator+(const tPoint & p) const { return{ x + p.x, y + p.y }; }
bool operator!=(const tPoint & p) const { return x != p.x || y != p.y; }
};
class tMatrix {
tCoord n, m; // dimensions
tCoord s; // smallest dimension
tCoord pt1, pt2; // t1 / mid / t2 limits for p
public:
tMatrix(tCoord n, tCoord m) : n(n), m(m)
{
s = min(n, m);
pt1 = (s*(s + 1)) / 2;
pt2 = n*m - pt1;
}
tPoint diagonal_cell(tCoord p)
{
tCoord x, y;
if (p < pt1) // inside t1
{
tCoord d = (tCoord)floor((-1 + sqrt(1 + 8 * p)) / 2);
tCoord k = p - (d*(d + 1)) / 2;
tCoord o = d % 2;
tCoord e = 1 - o;
x = o*d + (e - o)*k;
y = e*d + (o - e)*k;
}
else if (p < pt2) // inside mid
{
p -= pt1;
tCoord d = (tCoord)floor(p / s);
tCoord k = p - d*s;
tCoord o = (d + s) % 2;
tCoord e = 1 - o;
x = o*s + (e - o)*k;
y = e*s + (o - e)*k;
if (m > n) // vertical matrix
{
x -= o;
y += d + o;
}
else // horizontal matrix
{
x += d + e;
y -= e;
}
}
else // inside t2
{
p = n * m - 1 - p;
tCoord d = (tCoord)floor((-1 + sqrt(1 + 8 * p)) / 2);
tCoord k = p - (d*(d + 1)) / 2;
tCoord o = (d + m + n) % 2;
tCoord e = 1 - o;
x = n - 1 - (o*d + (e - o)*k);
y = m - 1 - (e*d + (o - e)*k);
}
return{ x, y };
}
void check(void)
{
tPoint move[4] = { { 1, 0 }, { -1, 1 }, { 1, -1 }, { 0, 1 } };
tPoint pos;
tCoord dir = 0;
for (tCoord p = 0; p != n * m ; p++)
{
tPoint dc = diagonal_cell(p);
if (pos != dc) panic("zot!");
pos = pos + move[dir];
if (dir == 0)
{
if (pos.y == m - 1) dir = 2;
else dir = 1;
}
else if (dir == 3)
{
if (pos.x == n - 1) dir = 1;
else dir = 2;
}
else if (dir == 1)
{
if (pos.y == m - 1) dir = 0;
else if (pos.x == 0) dir = 3;
}
else
{
if (pos.x == n - 1) dir = 3;
else if (pos.y == 0) dir = 0;
}
}
}
};
int main(void)
{
const tPoint dim[] = { { 10, 10 }, { 11, 11 }, { 10, 30 }, { 30, 10 }, { 10, 31 }, { 31, 10 }, { 11, 31 }, { 31, 11 } };
for (tPoint d : dim)
{
printf("Checking a %lldx%lld matrix...", d.x, d.y);
tMatrix(d.x, d.y).check();
printf("done\n");
}
tCoord p = 10000000000;
tMatrix matrix(1000000, 1000000);
tPoint cell = matrix.diagonal_cell(p);
printf("Coordinates of %lldth cell: (%lld,%lld)\n", p, cell.x, cell.y);
}
Results are checked against "manual" sweep of the matrix.
This "manual" sweep is an ugly hack that won't work for a one-row or one-column matrix, though diagonal_cell() does work on any matrix (the "diagonal" sweep becomes linear in that case).
The coordinates found for the 10.000.000.000th cell of a 1.000.000x1.000.000 matrix seem consistent, since the diagonal d on which the cell stands is about sqrt(2*1e10), approx. 141421, and the sum of cell coordinates is about equal to d (121090+20330 = 141420). Besides, it is also what the two other posters report.
I would say there is a good chance this lump of obfuscated code actually produces an O(1) solution to your problem.