Strassen matrix multiplication to store linear equations - algorithm

One of the questions I've come across in my textbook is:
In Computer Graphics transformations are applied on many vertices on the screen. Translation, Rotations
and Scaling.
Assume you’re operating on a vertex with 3 values (X, Y, 1). X, Y being the X Y coordinates and 1 is always
constant
A Translation is done on X as X = X + X’ and on Y as Y = Y + Y’
X’ and Y’ being the values to translate by
A scaling is done on X as X = aX and on Y as Y = bY
a and b being the scaling factors
Propose the best way to store these linear equations and an optimal way to calculate them on each vertex
It is hinted that it involves matrix multiplication and Strassen. However, I'm not sure where to start. The exercise doesn't involve complex code and states to come up with something simple to showcase the idea, but all Strassen implementations I've come across are large enough that I wouldn't call them simple. What should my thought process be here?
Would my matrix look like this? 3x3 for each equation or do I combine them all in one?
[ X X X']
[ Y Y Y']
[ 1 1 1 ]

What you're trying to find is a transformation matrix, which you can then use to transform some current (x, y) point into the next (nx, ny) point. In other words, we want
start = Vec([x, y, 1])
matrix = Matrix(...)
next = matrix * start // * is matrix multiplication, start as a column vector
Now, if your next is supposed to look something like Vec([a * x + x', b * y + y', 1]), we can work our way backwards to figure out the matrix. First, look at just the x component. We're going to effectively take the dot product of our start vector and the topmost row of our matrix, yielding a * x + x'.
If we write it out more explicitly, we want a * x + 0 * y + x' * 1. Hopefully that makes it a bit easier to see that the vector we want to dot start with is Vec([a, 0, x']). We can repeat this for the remaining two rows of the matrix, and obtain the following matrix:
matrix = Matrix(
    [[a, 0, x'],
     [0, b, y'],
     [0, 0, 1 ]])
Double check that this makes sense and seems reasonable to you. If we take our start vector and multiply it with this matrix, we'll get the translated vector next as Vec([a * x + x', b * y + y', 1]).
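To make that concrete, here's a minimal Python/numpy sketch; the values a=2, b=3, x'=5, y'=7 are made up purely for illustration:

import numpy as np

a, b, xp, yp = 2, 3, 5, 7            # illustrative values only
M = np.array([[a, 0, xp],
              [0, b, yp],
              [0, 0, 1]])
start = np.array([10, 20, 1])        # the vertex (x, y, 1) as a column vector
print(M @ start)                     # [25 67  1], i.e. [a*x + x', b*y + y', 1]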
Now for the real beauty of this: the matrix itself doesn't care at all about what our inputs are; it's completely independent. So, we can repeatedly apply this matrix over and over again to step forward through more scalings and translations.
next_next_next = matrix * matrix * matrix * start
Knowing this, we can actually compute many steps ahead really quickly, using some mathematical tricks. Multiplying by the matrix n times is the same as multiplying by the matrix raised to the nth power. And fortunately, we have an efficient method for computing a matrix to a power: it's called exponentiation by squaring (it actually applies to regular numbers as well, but here we're concerned with multiplying matrices, and the logic still applies). In a nutshell, rather than multiplying the number or matrix over and over again n times, we square it and multiply intermediate values by the original number / matrix at the right times, to very rapidly approach the desired power (in log(n) multiplications).
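Here's a generic sketch of exponentiation by squaring for matrices (not tied to any particular library routine):

import numpy as np

def mat_pow(M, n):
    # Raise square matrix M to the n-th power in O(log n) multiplications.
    result = np.eye(M.shape[0], dtype=M.dtype)
    while n > 0:
        if n & 1:                # odd power: fold one copy of M into the result
            result = result @ M
        M = M @ M                # square
        n >>= 1
    return result

# 1000 repeated translations by (5, 7) in about 10 multiplications, not 999:
T = np.array([[1, 0, 5], [0, 1, 7], [0, 0, 1]])
print(mat_pow(T, 1000) @ np.array([10, 20, 1]))   # [5010 7020 1]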
This is almost certainly what your professor is wanting you to realize. You can simulate n translations / scalings / rotations (yes, there are rotation matrices as well) in log(n) time.
Extra Mile
What's even cooler is that using some more advanced linear algebra, you can actually do it even faster. You can diagonalize your matrix (meaning you rewrite your matrix as P * D * P^-1, that is, the product of some matrix P with a matrix D where the only non-zero elements are along the main diagonal, multiplied by the inverse of P). You can then raise this diagonalized matrix to a power really quickly, because (P * D * P^-1) * (P * D * P^-1) simplifies to P * D * D * P^-1, and this generalizes to:
M^N = (P * D * P^-1)^N = (P * D^N * P^-1)
Since D only has non-zero elements along its diagonal, you can raise it to any power by just raising each individual element to that power, which is just ordinary scalar exponentiation across as many elements as your matrix is wide/tall. This is stupidly fast, and then you just do a single matrix multiplication on either side to arrive at M^N, and then multiply your start vector with this, for your end result.
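As a sketch of this trick, using the same illustrative a=2, b=3, x'=5, y'=7 matrix as above (its eigenvalues 2, 3, 1 are distinct, so it's diagonalizable; note a pure translation, with a = b = 1, would not be):

import numpy as np

M = np.array([[2.0, 0, 5], [0, 3.0, 7], [0, 0, 1.0]])
eigvals, P = np.linalg.eig(M)
Dn = np.diag(eigvals ** 10)              # element-wise powers along the diagonal
Mn = P @ Dn @ np.linalg.inv(P)           # M^10 = P * D^10 * P^-1
print(np.allclose(Mn, np.linalg.matrix_power(M, 10)))   # True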

Related

Computer representation of quaternions that's exact for 90-degree rotations?

Unit quaternions have several advantages over 3x3 orthogonal matrices
for representing 3d rotations on a computer.
However, one thing that has been disappointing me about the unit quaternion
representation is that axis-aligned 90 degree rotations
aren't exactly representable. For example, a 90-degree rotation around the z axis, taking the +x axis to the +y axis, is represented as [w=sqrt(1/2), x=0, y=0, z=sqrt(1/2)].
Surprising/unpleasant consequences include:
applying a floating-point-quaternion-represented axis-aligned 90 degree rotation to a vector v
often doesn't rotate v by exactly 90 degrees
applying a floating-point-quaternion-represented axis-aligned 90 degree rotation to a vector v four times
often doesn't yield exactly v
squaring a floating-point-quaternion representing a 90 degree rotation around a coordinate axis
doesn't exactly yield the (exactly representable) 180 degree rotation
around that coordinate axis,
and raising it to the eighth power doesn't yield the identity quaternion.
Because of this unfortunate lossiness of the quaternion representation on "nice" rotations,
I still sometimes choose 3x3 matrices for applications in which I'd like axis-aligned
90 degree rotations, and compositions of them,
to be exact and floating-point-roundoff-error-free.
But the matrix representation isn't ideal either,
since it loses the sometimes-needed double covering property
(i.e. quaternions distinguish between the identity and a 360-degree rotation,
but 3x3 rotation matrices don't) as well as other familiar desirable numerical properties
of the quaternion representation, such as lack of need for re-orthogonalization.
My question: is there a computer representation of unit quaternions that does not suffer this
imprecision, and also doesn't lose the double covering property?
One solution I can think of is to represent each of the 4 elements of the quaternion
as a pair of machine-representable floating-point numbers [a,b], meaning a + b √2.
So the representation of a quaternion would consist of eight floating-point numbers.
I think that works, but it seems rather heavyweight;
e.g. when computing the product of a long sequence of quaternions,
each multiplication in the simple quaternion calculation would turn into
4 floating-point multiplications and 2 floating-point additions,
and each addition would turn into 2 floating-point additions. From the point of view of trying to write a general-purpose library implementation, all that extra computation and storage seems pointless as soon as there's a factor that's not one of these "nice" rotations.
Another possible solution would be to represent each quaternion q=w+xi+yj+zk
as a 4-tuple [sign(w)*w^2, sign(x)*x^2, sign(y)*y^2, sign(z)*z^2].
This representation is concise and has the desired non-lossiness for the subgroup of
interest, but I don't know how to multiply together two quaternions in this representation.
Yet another possible approach would be to store the quaternion q^2
instead of the usual q. This seems promising at first,
but, again, I don't know how to non-lossily multiply
two of these representations together on the computer, and furthermore
the double-cover property is evidently lost.
You probably want to check the paper "Algorithms for manipulating quaternions in floating-point arithmetic" published in 2020 available online:
https://hal.archives-ouvertes.fr/hal-02470766/file/quaternions.pdf
It shows how to perform exact computations and avoid unbounded numerical errors.
EDIT:
You can get rid of the square roots by using an unnormalized (i.e., non-unit) quaternion. Let me explain the idea.
Having two unit 3D-vectors X and Y, represented as pure quaternions, the quaternion Q rotating X to Y is
Q = X * (X + Y) / |X * (X + Y)|
The denominator, which is taking the norm, is the problem since it involves a square root.
You can see it by expanding the expression as:
Q = (1 + X * Y) / sqrt(2 + 2*(X • Y))
If we replace X = i and Y = j to see the 90 degree rotation you get:
Q = (1 + k) / sqrt(2)
So, |1 + k| = sqrt(2). But we can actually use the unnormalized quaternion Q = 1 + k to perform rotations; all we need to do is normalize the rotated result by the SQUARED norm of the quaternion.
For example, the squared norm of Q = 1 + k is |1 + k|^2 = 2 (and that is exact, as you never took the square root). Let's apply the unnormalized quaternion to the vector X = i:
= (1 + k) i (1 - k)
= (i + k * i - i * k - k * i * k)
= (i + 2 j - i)
= 2 j
To get the correct result, we divide by the squared norm.
I haven't tested, but I believe you would get exact results by applying unnormalized quaternions to your vectors and then dividing the result by the squared norm.
The algorithm would be
Create the unnormalized quaternion Q = X * (X + Y)
Apply Q to your vectors as: v' = Q * v * ~Q
Normalize by squared norm: v'' = v' / |Q|^2
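Here is a small Python sketch of this algorithm (the helper names are hypothetical, written just to check the claim; with integer quaternion components every intermediate value is exact):

def qmul(p, q):
    # Hamilton product of quaternions given as (w, x, y, z) tuples
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (pw*qw - px*qx - py*qy - pz*qz,
            pw*qx + px*qw + py*qz - pz*qy,
            pw*qy - px*qz + py*qw + pz*qx,
            pw*qz + px*qy - py*qx + pz*qw)

def rotate_unnormalized(q, v):
    # v' = Q * v * ~Q, then divide by the squared norm |Q|^2
    w, x, y, z = q
    n2 = w*w + x*x + y*y + z*z
    rw, rx, ry, rz = qmul(qmul(q, (0,) + v), (w, -x, -y, -z))
    return (rx / n2, ry / n2, rz / n2)

Q = (1, 0, 0, 1)                          # the unnormalized Q = 1 + k from above
print(rotate_unnormalized(Q, (1, 0, 0)))  # (0.0, 1.0, 0.0): i goes exactly to j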

Is there an easy function from a pair of 32-bit ints to a single 64-bit int that preserves rotational order?

This is a question that came up in the context of sorting points with integer coordinates into clockwise order, but this question is not about how to do that sorting.
This question is about the observation that 2-d vectors have a natural cyclic ordering. Unsigned integers with usual overflow behavior (or signed integers using twos-complement) also have a natural cyclic ordering. Can you easily map from the first ordering to the second?
So, the exact question is whether there is a map from pairs of twos-complement signed 32-bit integers to unsigned (or twos-complement signed) 64-bit integers such that any list of vectors that is in clockwise order maps to integers that are in decreasing (modulo overflow) order?
Some technical cases that people will likely ask about:
Yes, vectors that are multiples of each other should map to the same thing
No, I don't care which vector (if any) maps to 0
No, the images of antipodal vectors don't have to differ by 2^63 (although that is a nice-to-have)
The obvious answer is that since there are only around 0.6*2^64 distinct slopes, the answer is yes, such a map exists, but I'm looking for one that is easily computable. I understand that "easily" is subjective, but I'm really looking for something reasonably efficient and not terrible to implement. So, in particular, no counting every lattice point between the ray and the positive x-axis (unless you know a clever way to do that without enumerating them all).
An important thing to note is that it can be done by mapping to 65-bit integers. Simply project the vector out to where it hits the box bounded by x,y=+/-2^62 and round toward negative infinity. You need 63 bits to represent that integer and two more to encode which side of the box you hit. The implementation needs a little care to make sure you don't overflow, but only has one branch and two divides and is otherwise quite cheap. It doesn't work if you project out to 2^61 because you don't get enough resolution to separate some slopes.
Also, before you suggest "just use atan2", compute atan2(1073741821,2147483643) and atan2(1073741820,2147483641)
EDIT: Expansion on the "atan2" comment:
Given two values x_1 and x_2 that are coprime and just less than 2^31 (I used 2^31-5 and 2^31-7 in my example), we can use the extended Euclidean algorithm to find y_1 and y_2 such that y_1/x_1-y_2/x_2 = 1/(x_1*x_2) ~= 2^-62. Since the derivative of arctan is bounded by 1, the difference of the outputs of atan2 on these values is not going to be bigger than that. So, there are lots of pairs of vectors that won't be distinguishable by atan2 as vanilla IEEE 754 doubles.
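A quick check of this claim in Python (exact equality of the two doubles may depend on the platform's libm, but on a typical IEEE 754 setup they collide, while the exact integer cross product still separates the slopes):

import math

x1, y1 = 2147483643, 1073741821
x2, y2 = 2147483641, 1073741820
print(math.atan2(y1, x1) == math.atan2(y2, x2))  # True: indistinguishable as doubles
print(y1 * x2 - y2 * x1)                         # 1: so y1/x1 is strictly steeper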
If you have 80-bit extended registers and you are sure you can retain residency in those registers throughout the computation (and don't get kicked out by a context switch or just plain running out of extended registers), then you're fine. But, I really don't like the correctness of my code relying on staying resident in extended registers.
Here's one possible approach, inspired by a comment in your question. (For the tl;dr version, skip down to the definition of point_to_line at the bottom of this answer: that gives a mapping for the first quadrant only. Extension to the whole plane is left as a not-too-difficult exercise.)
Your question says:
in particular, no counting every lattice point between the ray and the positive x-axis (unless you know a clever way to do that without enumerating them all).
There is an algorithm to do that counting without enumerating the points; its efficiency is akin to that of the Euclidean algorithm for finding greatest common divisors. I'm not sure to what extent it counts as either "easily computable" or "clever".
Suppose that we're given a point (p, q) with integer coordinates and both p and q positive (so that the point lies in the first quadrant). We might as well also assume that q < p, so that the point (p, q) lies between the x-axis y = 0 and the diagonal line y = x: if we can solve the problem for the half of the first quadrant that lies below the diagonal, we can make use of symmetry to solve it generally.
Write M for the bound on the size of p and q, so that in your example we want M = 2^31.
Then the number of lattice points strictly inside the triangle bounded by:
the x-axis y = 0
the ray y = (q/p)x that starts at the origin and passes through (p, q), and
the vertical line x = M
is the sum as x ranges over integers in (0, M) of ⌈qx/p⌉ - 1.
For convenience, I'll drop the -1 and include 0 in the range of the sum; both those changes are trivial to compensate for. And now the core functionality we need is the ability to evaluate the sum of ⌈qx/p⌉ as x ranges over the integers in an interval [0, M). While we're at it, we might also want to be able to compute a closely-related sum: the sum of ⌊qx/p⌋ over that same range of x (and it'll turn out that it makes sense to evaluate both of these together).
For testing purposes, here are slow, naive-but-obviously-correct versions of the functions we're interested in, here written in Python:
def floor_sum_slow(p, q, M):
    """
    Sum of floor(q * x / p) for 0 <= x < M.
    Assumes p positive, q and M nonnegative.
    """
    return sum(q * x // p for x in range(M))

def ceil_sum_slow(p, q, M):
    """
    Sum of ceil(q * x / p) for 0 <= x < M.
    Assumes p positive, q and M nonnegative.
    """
    return sum((q * x + p - 1) // p for x in range(M))
And an example use:
>>> floor_sum_slow(51, 43, 2**28) # takes several seconds to complete
30377220771239253
>>> ceil_sum_slow(140552068, 161600507, 2**28)
41424305916577422
These sums can be evaluated much faster. The first key observation is that if q >= p, then we can apply the Euclidean "division algorithm" and write q = ap + r for some integers a and r. The sum then simplifies: the ap part contributes a term of a * M * (M - 1) // 2, and we're reduced from computing floor_sum(p, q, M) to computing floor_sum(p, r, M). Similarly, the computation of ceil_sum(p, q, M) reduces to the computation of ceil_sum(p, q % p, M).
The second key observation is that we can express floor_sum(p, q, M) in terms of ceil_sum(q, p, N), where N is the ceiling of (q/p)M. To do this, we consider the rectangle [0, M) x (0, (q/p)M), and divide that rectangle into two triangles using the line y = (q/p)x. The number of lattice points within the rectangle that lie on or below the line is floor_sum(p, q, M), while the number of lattice points within the rectangle that lie above the line is ceil_sum(q, p, N). Since the total number of lattice points in the rectangle is (N - 1)M, we can deduce the value of floor_sum(p, q, M) from that of ceil_sum(q, p, N), and vice versa.
Combining those two ideas, and working through the details, we end up with a pair of mutually recursive functions that look like this:
def floor_sum(p, q, M):
    """
    Sum of floor(q * x / p) for 0 <= x < M.
    Assumes p positive, q and M nonnegative.
    """
    a = q // p
    r = q % p
    if r == 0:
        return a * M * (M - 1) // 2
    else:
        N = (M * r + p - 1) // p
        return a * M * (M - 1) // 2 + (N - 1) * M - ceil_sum(r, p, N)

def ceil_sum(p, q, M):
    """
    Sum of ceil(q * x / p) for 0 <= x < M.
    Assumes p positive, q and M nonnegative.
    """
    a = q // p
    r = q % p
    if r == 0:
        return a * M * (M - 1) // 2
    else:
        N = (M * r + p - 1) // p
        return a * M * (M - 1) // 2 + N * (M - 1) - floor_sum(r, p, N)
Performing the same calculation as before, we get exactly the same results, but this time the result is instant:
>>> floor_sum(51, 43, 2**28)
30377220771239253
>>> ceil_sum(140552068, 161600507, 2**28)
41424305916577422
A bit of experimentation should convince you that the floor_sum and floor_sum_slow functions give the same result in all cases, and similarly for ceil_sum and ceil_sum_slow.
Here's a function that uses floor_sum and ceil_sum to give an appropriate mapping for the first quadrant. I failed to resist the temptation to make it a full bijection, enumerating points in the order that they appear on each ray, but you can fix that by simply replacing the + gcd(p, q) term with + 1 in both branches.
from math import gcd

def point_to_line(p, q, M):
    """
    Bijection from [0, M) x [0, M) to [0, M^2), preserving
    the 'angle' ordering.
    """
    if p == q == 0:
        return 0
    elif q <= p:
        return ceil_sum(p, q, M) + gcd(p, q)
    else:
        return M * (M - 1) - floor_sum(q, p, M) + gcd(p, q)
Extending to the whole plane should be straightforward, though just a little bit messy due to the asymmetry between the negative range and the positive range in the two's complement representation.
Here's a visual demonstration for the case M = 7, printed using this code:
M = 7
for q in reversed(range(M)):
    for p in range(M):
        print(" {:02d}".format(point_to_line(p, q, M)), end="")
    print()
Results:
48 42 39 36 32 28 27
47 41 37 33 29 26 21
46 40 35 30 25 20 18
45 38 31 24 19 16 15
44 34 23 17 14 12 11
43 22 13 10 09 08 07
00 01 02 03 04 05 06
This doesn't meet your requirement for an "easy" function, nor for a "reasonably efficient" one. But in principle it would work, and it might give some idea of how difficult the problem is. To keep things simple, let's consider just the case where 0 < y ≤ x, because the full problem can be solved by splitting the full 2D plane into eight octants and mapping each to its own range of integers in essentially the same way.
A point (x1, y1) is "anticlockwise" of (x2, y2) if and only if the slope y1/x1 is greater than the slope y2/x2. To map the slopes to integers in an order-preserving way, we can consider the sequence of all distinct fractions whose numerators and denominators are within range (i.e. up to 2^31), in ascending numerical order. Note that each fraction's numerical value is between 0 and 1 since we are just considering one octant of the plane.
This sequence of fractions is finite, so each fraction has an index at which it occurs in the sequence; so to map a point (x, y) to an integer, first reduce the fraction y/x to its simplest form (e.g. using Euclid's algorithm to find the GCD to divide by), then compute that fraction's index in the sequence.
It turns out this sequence is called a Farey sequence; specifically, it's the Farey sequence of order 2^31. Unfortunately, computing the index of a given fraction in this sequence turns out to be neither easy nor reasonably efficient. According to the paper
Computing Order Statistics in the Farey Sequence by Corina E. Pǎtraşcu and Mihai Pǎtraşcu, there is a somewhat complicated algorithm to compute the rank (i.e. index) of a fraction in O(n) time, where n in your case is 2^31, and there is unlikely to be an algorithm in time polynomial in log n, because the algorithm can be used to factorise integers.
All of that said, there might be a much easier solution to your problem, because I've started from the assumption of wanting to map these fractions to integers as densely as possible (i.e. no "unused" integers in the target range), whereas in your question you wrote that the number of distinct fractions is about 60% of the available range of size 2^64. Intuitively, that amount of leeway doesn't seem like a lot to me, so I think the problem is probably quite difficult and you may need to settle for a solution that uses a larger output range, or a smaller input range. At the very least, by writing this answer I might save somebody else the effort of investigating whether this approach is feasible.
Just some random ideas / observations:
(edit: added two more and marked the first one as wrong as pointed out in the comments)
Divide into 16 22.5° segments instead of 8 45° segments
If I understand the problem correctly, the lines spread out "more" towards 45°, "wasting" resolution that you need for smaller angles. (Incorrect, see below)
In the mapping to 62 bit integers, there must be gaps. Identify enough low density areas to map down to 61 bits? Perhaps plot for a smaller problem to potentially see a pattern?
As the range for x and y is limited, for a given x0, all (legal) x < x0 with y > q must have a smaller angle. Could this help to break down the problem in some way? Perhaps cutting a triangle where points can easily be enumerated out of the problem for each quadrant?

Making a customizable LCG that travels backward and forward

How would I go about making an LCG (a type of pseudo-random number generator) travel in both directions?
I know that travelling forward is (a*x+c)%m, but how would I be able to reverse it?
I am using this so I can store the seed at the position of the player in a map and be able to generate things around it by propagating backward and forward in the LCG (like some sort of randomized number line).
All LCGs cycle. In an LCG which achieves maximal cycle length there is a unique predecessor and a unique successor for each value x (which won't necessarily be true for LCGs that don't achieve maximal cycle length, or for other algorithms with subcycle behaviors such as von Neumann's middle-square method).
Suppose our LCG has cycle length L. Since the behavior is cyclic, that means that after L iterations we are back to the starting value. Finding the predecessor value by taking one step backwards is mathematically equivalent to taking (L-1) steps forward.
The big question is whether that can be converted into a single step. If you're using a Prime Modulus Multiplicative LCG (where the additive constant is zero), it turns out to be pretty easy to do. If x_{i+1} = a * x_i % m, then x_{i+n} = a^n * x_i % m. As a concrete example, consider the PMMLCG with a = 16807 and m = 2^31-1. This has a maximal cycle length of m-1 (it can never yield 0 for obvious reasons), so our goal is to iterate m-2 times. We can precalculate a^(m-2) % m = 1407677000 using readily available exponentiation/mod libraries. Consequently, a forward step is found as x_{i+1} = 16807 * x_i % (2^31-1), while a backwards step is found as x_{i-1} = 1407677000 * x_i % (2^31-1).
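A quick check of this in Python (the modular inverse comes from Fermat's little theorem, since m is prime):

a, m = 16807, 2**31 - 1
a_inv = pow(a, m - 2, m)          # the a^(m-2) % m precalculation
print(a_inv)                      # 1407677000, matching the constant above

x = 123456789
forward = a * x % m
print(a_inv * forward % m == x)   # True: one backward step undoes one forward step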
ADDITIONAL
The same concept can be extended to generic full-cycle LCGs by casting the transition in matrix form and doing fast matrix exponentiation to come up with the equivalent one-stage transform. The matrix formulation for x_{i+1} = (a * x_i + c) % m is X_{i+1} = T · X_i % m, where T is the matrix [[a c],[0 1]] and X is the column vector (x, 1) transposed. Multiple iterations of the LCG can be quickly calculated by raising T to any desired power through fast exponentiation techniques using squaring and halving the power. After noticing that powers of matrix T never alter the second row, I was able to focus on just the first row calculations and produced the following implementation in Ruby:
def power_mod(ary, mod, power)
  return ary.map { |x| x % mod } if power < 2
  square = [ary[0] * ary[0] % mod, (ary[0] + 1) * ary[1] % mod]
  square = power_mod(square, mod, power / 2)
  return square if power.even?
  return [square[0] * ary[0] % mod, (square[0] * ary[1] + square[1]) % mod]
end
where ary is a vector containing a and c, the multiplicative and additive coefficients.
Using this with power set to the cycle length - 1, I was able to determine coefficients which yield the predecessor for various LCGs listed in Wikipedia. For example, to "reverse" the LCG with a = 1664525, c = 1013904223, and m = 2^32, use a = 4276115653 and c = 634785765. You can easily confirm that the latter set of coefficients reverses the sequence produced by using the original coefficients.
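For example, the round trip can be confirmed in a couple of lines of Python:

m = 2**32
a, c = 1664525, 1013904223          # forward coefficients
ar, cr = 4276115653, 634785765      # reversed coefficients from above

x = 42
y = (a * x + c) % m                 # one step forward
print((ar * y + cr) % m == x)       # True: one step back recovers x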

What is the most efficient algorithm to find a straight line that goes through most points?

The problem:
N points are given on a 2-dimensional plane. What is the maximum number of points on the same straight line?
The problem has an O(N^2) solution: go through each point and find the number of points which have the same dx / dy with relation to the current point. Store the dx / dy ratios in a hash map for efficiency.
Is there a better solution to this problem than O(N^2)?
There is likely no solution to this problem that is significantly better than O(n^2) in a standard model of computation.
The problem of finding three collinear points reduces to the problem of finding the line that goes through the most points, and finding three collinear points is 3SUM-hard, meaning that solving it in less than O(n^2) time would be a major theoretical result.
See the previous question on finding three collinear points.
For your reference (using the known proof), suppose we want to answer a 3SUM problem such as finding x, y, z in list X such that x + y + z = 0. If we had a fast algorithm for the collinear point problem, we could use that algorithm to solve the 3SUM problem as follows.
For each x in X, create the point (x, x^3) (for now we assume the elements of X are distinct). Next, check whether there exist three collinear points from among the created points.
To see that this works, note that if x + y + z = 0 then the slope of the line from x to y is
(y^3 - x^3) / (y - x) = y^2 + yx + x^2
and the slope of the line from x to z is
(z^3 - x^3) / (z - x) = z^2 + zx + x^2 = (-(x + y))^2 - (x + y)x + x^2
= x^2 + 2xy + y^2 - x^2 - xy + x^2 = y^2 + yx + x^2
Conversely, if the slope from x to y equals the slope from x to z then
y^2 + yx + x^2 = z^2 + zx + x^2,
which implies that
(y - z) (x + y + z) = 0,
so either y = z or z = -x - y, which suffices to prove that the reduction is valid.
If there are duplicates in X, you first check whether x + 2y = 0 for any x and duplicate element y (in linear time using hashing or O(n lg n) time using sorting), and then remove the duplicates before reducing to the collinear point-finding problem.
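A small numeric sanity check of the reduction (illustrative values only):

# x + y + z = 0 maps to three collinear points on the curve t -> (t, t^3)
x, y, z = 1.0, 2.0, -3.0
slope = lambda p, q: (q[1] - p[1]) / (q[0] - p[0])
px, py, pz = (x, x**3), (y, y**3), (z, z**3)
print(slope(px, py), slope(px, pz))   # 7.0 7.0: the three points are collinear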
If you limit the problem to lines passing through the origin, you can convert the points to polar coordinates (angle, distance from origin) and sort them by angle. All points with the same angle lie on the same line. This takes O(n log n).
I don't think there is a faster solution in the general case.
The Hough Transform can give you an approximate solution. It is approximate because the binning technique has a limited resolution in parameter space, so the maximum bin will give you some limited range of possible lines.
Again an O(n^2) solution, with pseudocode below. The idea is to create a hash table with the line itself as the key. A line is defined by the slope between two points, the point where the line cuts the x-axis, and the point where it cuts the y-axis.
Solution assumes languages like Java, C# where equals method and hashcode methods of the object are used for hashing function.
Create an Object (call SlopeObject) with 3 fields
Slope // Can be Infinity
Point of intercept with x-axis -- poix // Will be (Infinity, some y value) or (x value, 0)
Count
poix will be an (x, y) pair. If the line crosses the x-axis, poix will be (some number, 0). If the line is parallel to the x-axis, then poix = (Infinity, some number), where the y value is where the line crosses the y-axis.
Override equals method where 2 objects are equal if Slope and poix are equal.
Hashcode is overridden with a function which provides hashcode based on combination of values of Slope and poix. Some pseudo code below
Hashmap map;
foreach (point a in the array) {
    foreach (every other point b) {
        slope = calculateSlope(a, b);
        poix = calculateXInterception(a, b);
        SlopeObject so = new SlopeObject(slope, poix, 1); // Slope, poix and initial count 1.
        SlopeObject inMapSlopeObj = map.get(so);
        if (inMapSlopeObj == null) {
            map.put(so);
        } else {
            inMapSlopeObj.setCount(inMapSlopeObj.getCount() + 1);
        }
    }
}
SlopeObject maxCounted = getObjectWithMaxCount(map);
print("line is through " + maxCounted.poix + " with slope " + maxCounted.slope);
Move to the dual plane using the point-line duality transform: the point p = (a, b) maps to the line p*: y = a*x + b.
Now, using a line sweep algorithm, find all intersection points in N log N time.
(If you have points which are one above the other, just rotate the points by some small angle.)
The intersection points in the dual plane correspond to lines in the primal plane.
Regarding the claim that 3SUM reduces to this problem and that therefore the complexity is O(n^2): note that the best known complexity of 3SUM is actually less than that.
Please check https://en.wikipedia.org/wiki/3SUM and also read
https://tmc.web.engr.illinois.edu/reduce3sum_sosa.pdf
As already mentioned, there probably isn't a way to solve the general case of this problem better than O(n^2). However, if you assume a large number of points lie on the same line (say the probability that a random point in the set lies on the line with the maximum number of points is p) and you don't need an exact algorithm, a randomized algorithm is more efficient.
maxPoints = 0
Repeat for k iterations:
    1. Pick 2 distinct points uniformly at random
    2. maxPoints = max(maxPoints, number of points that lie on the
       line defined by the 2 points chosen in step 1)
Note that in the first step, if you pick 2 points which lie on the line with the maximum number of points, you'll get the optimal solution. Assuming n is very large (i.e. we can treat the probability of finding 2 desirable points as sampling with replacement), the probability of this happening is p^2. Therefore the probability of finding a suboptimal solution after k iterations is (1 - p^2)^k.
Suppose you can tolerate a false negative rate err. Then this algorithm runs in O(nk) = O(n * log(err) / log(1 - p^2)). If both n and p are large enough, this is significantly more efficient than O(n^2). (For example, suppose n = 1,000,000 and you know there are at least 10,000 points that lie on the same line. Then n^2 would require on the order of 10^12 operations, while the randomized algorithm would require on the order of 10^9 operations to get an error rate of less than 5*10^-5.)
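Here is a runnable Python sketch of this procedure; the exact integer cross-product collinearity test is one choice among several:

import random

def max_collinear_randomized(points, k):
    # Randomized estimate of the most points on one line, k iterations.
    best = 0
    for _ in range(k):
        (x1, y1), (x2, y2) = random.sample(points, 2)
        # count points p with (p2 - p1) x (p - p1) == 0, i.e. exactly collinear
        count = sum((x2 - x1) * (y - y1) == (y2 - y1) * (x - x1)
                    for x, y in points)
        best = max(best, count)
    return best

pts = [(i, 2 * i) for i in range(10)] + [(1, 5), (3, 7), (4, 1)]
print(max_collinear_randomized(pts, 50))   # almost always 10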
It is unlikely for an o(n^2) algorithm to exist, since the problem (of even checking whether 3 points in R^2 are collinear) is 3SUM-hard (http://en.wikipedia.org/wiki/3SUM).
This is not a solution better than O(n^2), but you can do the following:
1. For each point, translate it to the (0, 0) coordinate, and apply the equivalent translation to all the other points by moving them the same x, y distance.
2. Convert this new set of translated points to angles with respect to the new (0, 0).
3. Keep track of the maximum number (MSN) of points that share each angle.
4. Choose the maximum stored number (MSN), and that will be the solution.

How can I transform a polynomial to another coordinate system?

Using assorted matrix math, I've solved a system of equations resulting in coefficients for a polynomial of degree 'n'
Ax^(n-1) + Bx^(n-2) + ... + Z
I then evaluate the polynomial over a given x range; essentially I'm rendering the polynomial curve. Now here's the catch. I've done this work in one coordinate system we'll call "data space". Now I need to present the same curve in another coordinate space. It is easy to transform input/output to and from the coordinate spaces, but the end user is only interested in the coefficients [A,B,....,Z] since they can reconstruct the polynomial on their own. How can I present a second set of coefficients [A',B',....,Z'] which represent the same shaped curve in a different coordinate system?
If it helps, I'm working in 2D space. Plain old x's and y's. I also feel like this may involve multiplying the coefficients by a transformation matrix? Would it somehow incorporate the scale/translation factor between the coordinate systems? Would it be the inverse of this matrix? I feel like I'm headed in the right direction...
Update: Coordinate systems are linearly related. Would have been useful info eh?
The problem statement is slightly unclear, so first I will clarify my own interpretation of it:
You have a polynomial function
f(x) = C_n x^n + C_{n-1} x^(n-1) + ... + C_0
[I changed A, B, ... Z into C_n, C_{n-1}, ..., C_0 to more easily work with the linear algebra below.]
Then you also have a transformation such as: z = ax + b that you want to use to find coefficients for the same polynomial, but in terms of z:
f(z) = D_n z^n + D_{n-1} z^(n-1) + ... + D_0
This can be done pretty easily with some linear algebra. In particular, you can define an (n+1)×(n+1) matrix T which allows us to do the matrix multiplication
d = T * c ,
where d is a column vector with top entry D_0, down to last entry D_n, column vector c is similar for the C_i coefficients, and matrix T has (i,j)-th [i-th row, j-th column] entry t_ij given by
t_ij = (j choose i) a^i b^(j-i),
where (j choose i) is the binomial coefficient, and t_ij = 0 when i > j. Also, unlike standard matrices, i and j each range from 0 to n (usually you start at 1).
This is basically a nice way to write out the expansion and re-compression of the polynomial when you plug in z=ax+b by hand and use the binomial theorem.
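A short Python sketch of building T from this formula, checking that d = T * c reproduces the binomial expansion of plugging in ax+b by hand (for f(x) = x^2 with a = b = 1, that expansion is (x+1)^2 = 1 + 2x + x^2):

import numpy as np
from math import comb

def transform_matrix(a, b, n):
    # (n+1)x(n+1) matrix with t_ij = (j choose i) * a^i * b^(j-i), and 0 for i > j
    T = np.zeros((n + 1, n + 1))
    for j in range(n + 1):
        for i in range(j + 1):
            T[i, j] = comb(j, i) * a**i * b**(j - i)
    return T

c = np.array([0, 0, 1])               # f(x) = x^2, coefficients (C0, C1, C2)
print(transform_matrix(1, 1, 2) @ c)  # [1. 2. 1.] -> 1 + 2z + z^2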
If I understand your question correctly, there is no guarantee that the function will remain polynomial after you change coordinates. For example, let y=x^2, and the new coordinate system x'=y, y'=x. Now the equation becomes y' = sqrt(x'), which isn't polynomial.
Tyler's answer is the right answer if you have to compute this change of variable z = ax+b many times (I mean for many different polynomials). On the other hand, if you have to do it just once, it is much faster to combine the computation of the coefficients of the matrix with the final evaluation. The best way to do it is to symbolically evaluate your polynomial at point (ax+b) by Hörner's method:
you store the polynomial coefficients in a vector V (at the beginning, all coefficients are zero), and for i = n down to 0, you multiply it by (ax+b) and add C_i.
adding C_i means adding it to the constant term
multiplying by (ax+b) means multiplying all coefficients by b into a vector K1, multiplying all coefficients by a and shifting them away from the constant term into a vector K2, and putting K1+K2 back into V.
This will be easier to program, and faster to compute.
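A sketch of this Hörner-style computation in Python, with coefficients stored constant term first (it works in place rather than via the K1/K2 vectors described above, but is equivalent):

def compose_affine(coeffs, a, b):
    # Coefficients of f(a*x + b), given coeffs [C0, C1, ..., Cn] of f.
    n = len(coeffs) - 1
    v = [0] * (n + 1)                 # v[k] is the coefficient of x^k
    for c in reversed(coeffs):        # i = n down to 0
        # V <- (a*x + b) * V, highest coefficient first ...
        for k in range(n, 0, -1):
            v[k] = a * v[k - 1] + b * v[k]
        v[0] = b * v[0] + c           # ... then add C_i to the constant term
    return v

print(compose_affine([0, 0, 1], 1, 1))   # [1, 2, 1]: x^2 becomes (x+1)^2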
Note that changing y into w = cy+d is really easy. Finally, as mattiast points out, a general change of coordinates will not give you a polynomial.
Technical note: if you still want to compute matrix T (as defined by Tyler), you should compute it by using a weighted version of Pascal's rule (this is what the Hörner computation does implicitly):
t_{i,j} = b * t_{i,j-1} + a * t_{i-1,j-1}
This way, you compute it simply, column after column, from left to right.
You have the equation:
y = Ax^(n-1) + Bx^(n-2) + ... + Z
In xy space, and you want it in some x'y' space. What you need is transformation functions f(x) = x' and g(y) = y' (or h(x') = x and j(y') = y). In the first case you need to solve for x and solve for y. Once you have x and y, you can substitute those results into your original equation and solve for y'.
Whether or not this is trivial depends on the complexity of the functions used to transform from one space to another. For example, equations such as:
5x = x' and 10y = y'
are extremely easy to solve, giving the result
y' = (10A/5^(n-1)) x'^(n-1) + (10B/5^(n-2)) x'^(n-2) + ... + 10Z
If the input spaces are linearly related, then yes, a matrix should be able to transform one set of coefficients to another. For example, if you had your polynomial in your "original" x-space:
ax^3 + bx^2 + cx + d
and you wanted to transform into a different w-space where w = px+q
then you want to find a', b', c', and d' such that
ax^3 + bx^2 + cx + d = a'w^3 + b'w^2 + c'w + d'
and with some algebra,
a'w^3 + b'w^2 + c'w + d' = a'p^3x^3 + 3a'p^2qx^2 + 3a'pq^2x + a'q^3 + b'p^2x^2 + 2b'pqx + b'q^2 + c'px + c'q + d'
therefore
a = a'p^3
b = 3a'p^2q + b'p^2
c = 3a'pq^2 + 2b'pq + c'p
d = a'q^3 + b'q^2 + c'q + d'
which can be rewritten as a matrix problem and solved.
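Since the system is triangular, back-substitution solves it directly; here is a sketch for this cubic case:

def change_coords_cubic(a, b, c, d, p, q):
    # Solve the triangular system above for a', b', c', d', given w = p*x + q
    ap = a / p**3
    bp = (b - 3*ap*p**2*q) / p**2
    cp = (c - 3*ap*p*q**2 - 2*bp*p*q) / p
    dp = d - ap*q**3 - bp*q**2 - cp*q
    return ap, bp, cp, dp

# x^3 with w = x + 1 should give (w - 1)^3 = w^3 - 3w^2 + 3w - 1:
print(change_coords_cubic(1, 0, 0, 0, 1, 1))   # (1.0, -3.0, 3.0, -1.0)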
