Simplifying recursive mean calculation - algorithm

If we have
Ei = mean [abs (Hi - p) for p in Pi]
H = mean [H0, H1, ... Hi, ... Hn]
P = concat [P0, P1, ... Pi, ... Pn]
then does there exist a more efficient way to compute
E = mean [abs (H - p) for p in P]
in terms of H, P, and the Eis and His, given that H, E, and P go on to be used as Hi, Ei, and Pi for some i, at a higher recursive level?
If we store the length of Pi as Li at each stage, then we can let
L = sum [L0, L1, ... Li, ... Ln]
allowing us to perform the somewhat easier calculation
E = (sum [abs (H - p) for p in P]) / L
but the use of the abs function seems to severely restrict the kinds of algebraic manipulations we can use to simplify the numerator.

No. Imagine you have just two groups, and one group has H1 = 1 and the other group has H2 = 2. Imagine that every p in P1 is either 0 or 2, and every p in P2 is either 1 or 3. Then you will always have E1 = 1 and E2 = 1, regardless of the actual values in P1 and P2. However, you can see that if all p in P1 are 2, and all p in P2 are 1, then E is minimized (specifically 0.5), because H = 1.5. Or all p in P1 could be 0 and all p in P2 could be 3, in which case E is maximized (specifically 1.5). And you could get any value of E in between 0.5 and 1.5 depending on the distribution of the p. If you don't actually go and look at all the individual p, there's no way to tell which exact value of E between 0.5 and 1.5 you will get. So you can't do any better than O(n) time to compute E, where n is the total size of P, which is the same running time as computing your desired quantity E directly from its definition formula.
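To make this concrete, here is a small Python sketch (my own illustration, not part of the answer): the per-group statistics H1, H2, E1, E2 and the group sizes are identical in both scenarios, yet the combined E differs, so E cannot be recovered from them alone.

def mean_abs_dev(center, ps):
    return sum(abs(center - p) for p in ps) / len(ps)

H = 1.5  # mean of H1 = 1 and H2 = 2

# Scenario A: every p in P1 is 2, every p in P2 is 1  ->  E = 0.5
# Scenario B: every p in P1 is 0, every p in P2 is 3  ->  E = 1.5
for P1, P2 in [([2, 2, 2, 2], [1, 1, 1, 1]), ([0, 0, 0, 0], [3, 3, 3, 3])]:
    E1 = mean_abs_dev(1, P1)          # always 1.0
    E2 = mean_abs_dev(2, P2)          # always 1.0
    E = mean_abs_dev(H, P1 + P2)      # 0.5 in scenario A, 1.5 in scenario B
    print(E1, E2, E)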

Related

Algorithm to precisely compare two exponentiations for very large integers (order of 1 billion)

We want to compare a^b to c^d, and tell if the first is smaller, greater, or equal (where ^ denotes exponentiation).
Obviously, for very large numbers, we cannot explicitly compute these values.
The most common approach in this situation is to apply log on both sides and compare b * log(a) to d * log(c). The issue here is that logs are floating-point operations, and as such we cannot trust our answer with 100% confidence (there might be some values which are incredibly close, and because of floating-point error we get a wrong answer).
Is there an algorithm for solving this problem? I've been scouring the internet for this, but I can only find solutions which work for particular cases only (e.g. in which one exponent is a multiple of another), or which use floating point in some way (logarithms, division), etc.
This is sort of two questions in one:
Are they equal?
If not, which one is greater?
As Peter O. observes, it's easiest to build in a language that provides an arbitrary-precision fraction type. I'll use Python 3.
Let's assume without loss of generality that a ≤ c (swap if necessary) and b is relatively prime to d (divide both by the greatest common divisor).
To get at the core of the question, I'm going to assume that a, c > 0 and b, d ≥ 0. Removing this assumption is tedious but not difficult.
Equality test
There are some easy cases where a = 1 or b = 0 or c = 1 or d = 0.
Separately, necessary conditions for a^b = c^d are
i. b ≥ d, since otherwise b < d, which together with a ≤ c implies a^b < c^d;
ii. a is a divisor of c, since we know from (i) that a^b = c^d is a divisor of c^b = c^(b−d) c^d.
When these conditions hold, we can divide through by a^d to reduce the problem to testing whether a^(b−d) = (c/a)^d. For example, testing whether 4^6 = 8^4 reduces to testing whether 4^2 = (8/4)^4, and indeed both sides are 16.
In Python 3:
def equal_powers(a, b, c, d):
    while True:
        lhs_is_one = a == 1 or b == 0
        rhs_is_one = c == 1 or d == 0
        if lhs_is_one or rhs_is_one:
            return lhs_is_one and rhs_is_one
        if a > c:
            a, b, c, d = c, d, a, b
        if b < d:
            return False
        q, r = divmod(c, a)
        if r != 0:
            return False
        b -= d
        c = q

def test_equal_powers():
    for a in range(1, 25):
        for b in range(25):
            for c in range(1, 25):
                for d in range(25):
                    assert equal_powers(a, b, c, d) == (a ** b == c ** d)

test_equal_powers()
Inequality test
Once we've established that the two quantities are not equal, it's time to figure out which one is greater. (Without the equality test, the code here could run forever.)
If you're doing this for real, you should consult an actual reference on computing elementary functions. I'm just going to try to do the simplest thing that works.
Time for a calculus refresher. We have the Taylor series
−log x = (1−x) + (1−x)^2/2 + (1−x)^3/3 + (1−x)^4/4 + ...
To get a lower bound, truncate the series. To get an upper bound, we can truncate but replace the final term (1−x)^n/n with (1−x)^n/n · (1/x), since
(1−x)^n/n · (1/x)
  = (1−x)^n/n · (1 + (1−x) + (1−x)^2 + ...)
  = (1−x)^n/n + (1−x)^(n+1)/n + (1−x)^(n+2)/n + ...
  > (1−x)^n/n + (1−x)^(n+1)/(n+1) + (1−x)^(n+2)/(n+2) + ...
To get a good convergence rate, we're going to want 0.5 ≤ x < 1, which we can achieve by dividing x by a power of two.
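As a quick sanity check of these bounds (my own snippet, not part of the answer's code below): truncating after n terms under-estimates −log x, and replacing the final term as derived above over-estimates it.

import math
from fractions import Fraction

def neg_log_bounds(x, n):
    # Lower bound on -log x: the series truncated after n terms.
    # Upper bound: the same, but with the final term replaced by (1-x)^n/n * (1/x).
    y = Fraction(1) - x
    lower = sum(y**k / k for k in range(1, n + 1))
    upper = lower - y**n / n + y**n / (n * x)
    return lower, upper

lo, hi = neg_log_bounds(Fraction(3, 4), 8)
assert float(lo) <= -math.log(0.75) <= float(hi)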
In Python, we'll represent a real number as an infinite generator of shrinking intervals that contain the true value. Once the intervals for b log a and d log c are disjoint, we can determine how they compare.
import fractions

def minus(x, y):
    while True:
        x_lo, x_hi = next(x)
        y_lo, y_hi = next(y)
        yield x_lo - y_hi, x_hi - y_lo

def times(b, x):
    for lo, hi in x:
        yield b * lo, b * hi

def restricted_log(a):
    series = 0
    n = 0
    numerator = 1
    while True:
        n += 1
        numerator *= 1 - a
        series += fractions.Fraction(numerator, n)
        yield -(series + fractions.Fraction(numerator * (1 - a), (n + 1) * a)), -series

def log(a):
    n = 0
    while a >= 1:
        a = fractions.Fraction(a, 2)
        n += 1
    return minus(restricted_log(a), times(n, restricted_log(fractions.Fraction(1, 2))))

def less_powers(a, b, c, d):
    lhs = times(b, log(a))
    rhs = times(d, log(c))
    while True:
        lhs_lo, lhs_hi = next(lhs)
        rhs_lo, rhs_hi = next(rhs)
        if lhs_hi < rhs_lo:
            return True
        if rhs_hi < lhs_lo:
            return False

def test_less_powers():
    for a in range(1, 10):
        for b in range(10):
            for c in range(1, 10):
                for d in range(10):
                    if a ** b != c ** d:
                        assert less_powers(a, b, c, d) == (a ** b < c ** d)

test_less_powers()
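Putting the two parts together, here is a small wrapper (my addition, not part of the original answer) that returns -1, 0, or 1:

def compare_powers(a, b, c, d):
    # Exact three-way comparison of a**b and c**d, with no floating point.
    if equal_powers(a, b, c, d):
        return 0
    return -1 if less_powers(a, b, c, d) else 1

assert compare_powers(2, 10, 10, 3) == 1   # 1024 > 1000
assert compare_powers(4, 3, 8, 2) == 0     # 64 == 64
assert compare_powers(3, 5, 2, 8) == -1    # 243 < 256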

Probability of a disjunction on N dependent events in Prolog

Does anybody know where to find a Prolog algorithm for computing the probability of a disjunction for N dependent events? For N = 2 I know that P(E1 OR E2) = P(E1) + P(E2) - P(E1) * P(E2), so one could do:
prob_disjunct(E1, E2, P) :- P is E1 + E2 - E1 * E2.
But how can this predicate be generalised to N events when the input is a list? Maybe there is a package which does this?
Kind regards/JCR
The recursive formula from Robert Dodier's answer directly translates to
p_or([], 0).
p_or([P|Ps], Or) :-
    p_or(Ps, Or1),
    Or is P + Or1*(1-P).
Although this works fine, e.g.
?- p_or([0.5,0.3,0.7,0.1],P).
P = 0.9055
hardcore Prolog programmers can't help noticing that the definition isn't tail-recursive. This would really only be a problem when you are processing very long lists, but since the order of list elements doesn't matter, it is easy to turn things around. This is a standard technique, using an auxiliary predicate and an "accumulator pair" of arguments:
p_or(Ps, Or) :-
    p_or(Ps, 0, Or).

p_or([], Or, Or).
p_or([P|Ps], Or0, Or) :-
    Or1 is P + Or0*(1-P),
    p_or(Ps, Or1, Or).    % tail-recursive call
I don't know anything about Prolog, but anyway it's convenient to write the probability of a disjunction of a number of independent items p_m = Pr(S_1 or S_2 or S_3 or ... or S_m) recursively as
p_m = Pr(S_m) + p_{m - 1} (1 - P(S_m))
You can prove this by just peeling off the last item -- look at Pr((S_1 or ... or S_{m - 1}) or S_m) and just write that in terms of the usual formula, writing Pr(A or B) = Pr(A) + Pr(B) - Pr(A) Pr(B) = Pr(B) + Pr(A) (1 - Pr(B)), for A and B independent.
The formula above is item C.3.10 in my dissertation: http://riso.sourceforge.net/docs/dodier-dissertation.pdf It is a simple result, and I suppose it must be an exercise in some textbooks, although I don't remember seeing it.
For any event E I'll write E' for the complementary event (ie E' occurs iff E doesn't).
Then we have:
P(E') = 1 - P(E)
(A union B)' = A' inter B'
A and B are independent iff A' and B' are independent
so for independent E1..En
P( E1 union .. union En ) = 1 - P( E1' inter .. inter En')
= 1 - product{ 1 <= i <= n | 1 - P(E[i]) }
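As a quick cross-check of the two formulations (in Python rather than Prolog, and assuming independent events as in the answers above), the recursive accumulation and the complement-product formula agree:

from functools import reduce

def p_or_recursive(ps):
    # Or_i = P_i + Or_{i-1} * (1 - P_i), starting from Or = 0
    return reduce(lambda acc, p: p + acc * (1 - p), ps, 0.0)

def p_or_complement(ps):
    # 1 - product of (1 - P_i)
    prod = 1.0
    for p in ps:
        prod *= 1 - p
    return 1 - prod

ps = [0.5, 0.3, 0.7, 0.1]
print(p_or_recursive(ps), p_or_complement(ps))   # both ~0.9055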

What is wrong with this implementation of IDA* algorithm in Haskell? Bad heuristic or simply bad code?

I am trying to write a Haskell program which can solve Rubik's Cube. Firstly I tried this, but did not figure out a way to avoid writing a whole lot of code, so I tried using an IDA* search for this task.
But I do not know which heuristic is appropriate here: I tried dividing the problem into subproblems, and measuring the distance from being in a reduced state, but the result is disappointing: the program cannot reduce a cube that is three moves from a standard cube in a reasonable amount of time. I tried measuring parts of edges, and then either summing them, or using the maximums... but none of these works, and the result is almost identical.
So I want to know what the problem with the code is: is the heuristic I used non-admissible? Or is my code causing some infinite loops that I did not detect? Or both? And how to fix that? The code (the relevant parts) goes as follows:
--Some type declarations
data Colors = R | B | W | Y | G | O
type R3 = (Int, Int, Int)
type Cube = R3 -> Colors

points :: [R3] --list of coordinates of facelets of a cube; there are 48 of them.

mU :: Cube -> Cube --and other 5 similar moves.

type Actions = [Cube -> Cube]

turn :: Cube -> Actions -> Cube --chains the actions and turns the cube.

edges :: [R3] --The edges of cubes

totheu1 :: Cube -> Int -- Measures how far away the cube is from having the cross of the first layer solved.
totheu1 c = sum $ map (\d -> if d then 0 else 1)
                      [c (-2, 3, 0) == c (0, 3, 0),
                       c (2, 3, 0) == c (0, 3, 0),
                       c (0, 3, -2) == c (0, 3, 0),
                       c (0, 3, 2) == c (0, 3, 0),
                       c (0, 2, -3) == c (0, 0, -3),
                       c (-3, 2, 0) == c (-3, 0, 0),
                       c (0, 2, 3) == c (0, 0, 3),
                       c (3, 2, 0) == c (3, 0, 0)]

expandnr :: (Cube -> Cube) -> Cube -> [(Cube, String)]
-- Generates a list of tuples of cubes and strings: the result after applying a move,
-- and the string represents that move, while avoiding moving on the same face as the last one,
-- and avoiding repetitions caused by commuting moves, like U * D = D * U.

type StateSpace = (Int, [String], Cube) -- Int -> f value, [String] = actions applied so far, Cube = result cube.

fstst :: StateSpace -> Int
fstst s@(x, y, z) = x

stst :: StateSpace -> [String]
stst s@(x, y, z) = y

cbst :: StateSpace -> Cube
cbst s@(x, y, z) = z

stage1 :: Cube -> StateSpace
stage1 c = (\(x, y, z) -> (x, [sconcat y], z)) t
  where
    bound = totheu1 c
    t = looping c bound
    looping c bound =
      do let re = search (c, [""]) (\j -> j) 0 bound
         let found = totheu1 $ cbst re
         if found == 0 then re else looping c found
    sconcat [] = ""
    sconcat (x:xs) = x ++ (sconcat xs)

straction :: String -> Actions -- Converts strings to actions

search :: (Cube, [String]) -> (Cube -> Cube) -> Int -> Int -> StateSpace
search cs@(c, s) k g bound
  | f > bound = (f, s, c)
  | totheu1 c == 0 = (0, s, c)
  | otherwise = ms
  where
    f = g + totheu1 c
    olis = do
      (succs, st) <- expandnr k c
      let [newact] = straction st
      let t = search (succs, s ++ [st]) newact (g + 1) bound
      return t
    lis = map fstst olis
    mlis = minimum lis
    ms = olis !! (ind)
    Just ind = elemIndex mlis lis
I know that this heuristic is inconsistent, but I am not sure whether it is really admissible; maybe the problem is its non-admissibility?
Any ideas, hints, and suggestions are well appreciated, thanks in advance.
Your heuristic is inadmissible. An admissible heuristic must be a lower bound on the real cost of a solution.
You are trying to use as a heuristic the number of side pieces of the first layer that aren't correct, or perhaps the number of faces of the side pieces of the first layer that aren't correct, which is what you have actually written. Either way the heuristic is inadmissible.
A cube that is only 1 move away from being solved (for example, a solved cube whose first layer has been given a quarter turn) has 4 of the pieces in the first layer in incorrect positions, and 4 of the faces have the wrong color. Either heuristic would say that this puzzle will take at least 4 moves to solve when it can be solved in only 1 move. The heuristics are inadmissible, because they are not lower bounds on the real cost of the solution.

Algorithm: Triangle with two constraints, each corner on a given line

Some time ago I asked a question on math.stackexchange and got an answer. I have difficulties deriving an algorithm from that answer because my background is in design, and I hope some of you can help me.
The original question with visual sketch and possible answer are here:
https://math.stackexchange.com/questions/667432/triangle-with-two-constraints-each-corner-on-a-given-line
The question was: Given 3 3-dimensional lines (a, b and c) that coincide in a common point S and a given Point B on b, I'm looking for a point A on a and a point C on c where AB and BC have the same length and the angle ABC is 90 degrees.
I will have to implement this algorithm in an imperative language, any code in C++, Java, imperative pseudo-code or similar is fine.
Also, different approaches to this problem are equally welcome. Plus: Thanks for any hints, if the complete solution is indeed too time-consuming!
The two key formulas, writing A = s·a and C = t·c (with a, b, c unit direction vectors and S at the origin, so B = L·b), are
s·t·(a·c) − s·(a·B) − t·(c·B) + B·B = 0      (AB perpendicular to BC)
s^2 − 2·s·(a·B) = t^2 − 2·t·(c·B)            (|A − B| = |C − B|)
(I've posted the derivation for the formulas on the mathematics stack exchange site.)
Solving the first for s and substituting into the second gives in the end a 4th-degree equation that is quite annoying to solve in closed form. I've therefore used instead a trivial numerical solver in Python:
# function to solve (we look for t such that f(t) = 0)
# aB, cB, B2, ac are module-level dot products set up by the test code below
def f(t):
    s = (t*cB - B2) / (t*ac - aB)
    return s*s - 2*s*aB - t*t + 2*t*cB

# given f and an interval to search, generates all solutions in the range
def solutions(f, x0, x1, n=100, eps=1E-10):
    X = [x0 + i*(x1 - x0)/(n - 1) for i in range(n)]
    Y = [f(x) for x in X]
    for i in range(n-1):
        if (Y[i] < 0 and Y[i+1] >= 0 or Y[i+1] < 0 and Y[i] >= 0):
            xa, xb = X[i], X[i+1]
            ya, yb = Y[i], Y[i+1]
            if (xb - xa) < eps:
                # Linear interpolation
                # 0 = ya + (x - xa)*(yb - ya)/(xb - xa)
                yield xa - ya * (xb - xa) / (yb - ya)
            else:
                for x in solutions(f, xa, xb, n, eps):
                    yield x
The search algorithm samples the function in the interval and when it finds two adjacent samples that are crossing the f=0 line repeats the search recursively between those two samples (unless the interval size is below a specified limit, approximating the function with a line and computing the crossing point in that case).
I've tested the algorithm generating random problems and solving them with
from random import random as rnd

for test in range(1000):
    a = normalize((rnd()-0.5, rnd()-0.5, rnd()-0.5))
    b = normalize((rnd()-0.5, rnd()-0.5, rnd()-0.5))
    c = normalize((rnd()-0.5, rnd()-0.5, rnd()-0.5))
    L = rnd() * 100
    B = tuple(x*L for x in b)
    aB = dot(a, B)
    cB = dot(c, B)
    B2 = dot(B, B)
    ac = dot(a, c)
    sols = list(solutions(f, -1000., 1000.))
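The test driver assumes two small 3-vector helpers, dot and normalize, that the answer does not show; a minimal sketch:

import math

def dot(u, v):
    # Euclidean dot product of two 3-vectors given as tuples
    return u[0]*v[0] + u[1]*v[1] + u[2]*v[2]

def normalize(v):
    # scale a 3-vector to unit length
    n = math.sqrt(dot(v, v))
    return (v[0]/n, v[1]/n, v[2]/n)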
And there are cases in which the solutions are 0, 1, 2, 3 or 4. For example the problem
a = (-0.5900900304960981, 0.4717596600172049, 0.6551614908475357)
b = (-0.9831451620384042, -0.10306322574446096, 0.15100848274062748)
c = (-0.6250439408232388, 0.49902426033920616, -0.6002456660677057)
B = (-33.62793897729328, -3.5252208930692497, 5.165162011403056)
has four distinct solutions:
s = 57.3895941365 , t = -16.6969433689
A = (-33.865027354189415, 27.07409541837935, 37.59945205363035)
C = (10.436323283003153, -8.332179814593692, 10.022267893763457)
|A - B| = 44.5910029061
|C - B| = 44.5910029061
(A - B)·(C - B) = 1.70530256582e-13
s = 43.619078237 , t = 32.9673082734
A = (-25.739183207076163, 20.5777215193455, 28.577540327140607)
C = (-20.606016281518986, 16.45148662649085, -19.78848391300571)
|A - B| = 34.5155582156
|C - B| = 34.5155582156
(A - B)·(C - B) = 1.13686837722e-13
s = -47.5886624358 , t = 83.8222109697
A = (28.08159526800866, -22.450411211385674, -31.17825902887765)
C = (-52.39256507303229, 41.82931682916268, -50.313918854788845)
|A - B| = 74.0747844969
|C - B| = 74.0747844969
(A - B)·(C - B) = 4.54747350886e-13
s = 142.883074325 , t = 136.634726869
A = (-84.31387768560096, 67.4064705656035, 93.61148799140805)
C = (-85.40270813540043, 68.1840435123674, -82.01440263735996)
|A - B| = 124.189861967
|C - B| = 124.189861967
(A - B)·(C - B) = -9.09494701773e-13
Write two quadratic equations for the lambda, mu unknowns (the ones given just above the matrix forms in the linked answer).
Solve this system with paper, pen and head, or with any mathematical software like Maple, Mathematica, Matlab, Derive etc. You will get a 4th-order equation. It has a closed-form solution: apply Ferrari's or Cardano's method, get the real roots, find mu and lambda, then the point coordinates.
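If a hand-derived closed form feels error-prone, one alternative (my own sketch, reusing the dot products aB, cB, B2, ac defined in the other answer's Python test code) is to clear the denominator of f(t), expand the resulting quartic with polynomial arithmetic, and take its roots numerically:

import numpy as np
from numpy.polynomial import Polynomial

def quartic_roots(aB, cB, B2, ac):
    num = Polynomial([-B2, cB])       # t*cB - B2
    den = Polynomial([-aB, ac])       # t*ac - aB
    t = Polynomial([0, 1])
    # f(t) * den(t)^2 expands to a quartic in t
    quartic = num**2 - 2*aB*num*den - (t**2 - 2*cB*t)*den**2
    roots = quartic.roots()
    # keep real roots that do not make the denominator vanish (those are spurious)
    return [r.real for r in roots
            if abs(r.imag) < 1e-9 and abs(den(r.real)) > 1e-12]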

Weight-Biased Leftist Heaps: advantages of top-down version of merge?

I am self-studying Okasaki's Purely Functional Data Structures, now on exercise 3.4, which asks to reason about and implement a weight-biased leftist heap. This is my basic implementation:
(* 3.4 (b) *)
functor WeightBiasedLeftistHeap (Element : Ordered) : Heap =
struct
  structure Elem = Element

  datatype Heap = E | T of int * Elem.T * Heap * Heap

  fun size E = 0
    | size (T (s, _, _, _)) = s

  fun makeT (x, a, b) =
    let
      val sizet = size a + size b + 1
    in
      if size a >= size b then T (sizet, x, a, b)
      else T (sizet, x, b, a)
    end

  val empty = E
  fun isEmpty E = true | isEmpty _ = false

  fun merge (h, E) = h
    | merge (E, h) = h
    | merge (h1 as T (_, x, a1, b1), h2 as T (_, y, a2, b2)) =
        if Elem.leq (x, y) then makeT (x, a1, merge (b1, h2))
        else makeT (y, a2, merge (h1, b2))

  fun insert (x, h) = merge (T (1, x, E, E), h)

  fun findMin E = raise Empty
    | findMin (T (_, x, a, b)) = x

  fun deleteMin E = raise Empty
    | deleteMin (T (_, x, a, b)) = merge (a, b)
end
Now, in 3.4 (c) & (d), it asks:
Currently, merge operates in two passes: a top-down pass consisting of calls to merge, and a bottom-up pass consisting of calls to the helper function, makeT. Modify merge to operate in a single, top-down pass. What advantages would the top-down version of merge have in a lazy environment? In a concurrent environment?
I changed the merge function by simply inlining makeT, but I fail to see any advantages, so I think I haven't grasped the spirit of these parts of the exercise. What am I missing?
fun merge (h, E) = h
  | merge (E, h) = h
  | merge (h1 as T (s1, x, a1, b1), h2 as T (s2, y, a2, b2)) =
      let
        val st = s1 + s2
        val (v, a, b) =
          if Elem.leq (x, y) then (x, a1, merge (b1, h2))
          else (y, a2, merge (h1, b2))
      in
        if size a >= size b then T (st, v, a, b)
        else T (st, v, b, a)
      end
I think I've figured out one point with regards to lazy evaluation. If I don't use the recursive merge to calculate the size, then the recursive call won't need to be evaluated until the child is needed:
fun merge (h, E) = h
  | merge (E, h) = h
  | merge (h1 as T (s1, x, a1, b1), h2 as T (s2, y, a2, b2)) =
      let
        val st = s1 + s2
        val (v, ma, mb1, mb2) =
          if Elem.leq (x, y) then (x, a1, b1, h2)
          else (y, a2, h1, b2)
      in
        if size ma >= size mb1 + size mb2
        then T (st, v, ma, merge (mb1, mb2))
        else T (st, v, merge (mb1, mb2), ma)
      end
Is that all? I am not sure about concurrency though.
I think you've essentially got it as far as the lazy evaluation goes -- it's not very helpful to use lazy evaluation if you are going to have to end up traversing the whole data structure to find out anything every time you do a merge...
As to the concurrency, I expect the issue is that if, while one thread is evaluating the merge, another comes along and wants to look something up, it will not be able to get anything useful done at least until the first thread completes the merge. (And it might even take longer than that.)
There is no benefit to the WMERGE-3-4C function in a lazy environment. It still does all the work that the original down-up merge did, and I'm pretty sure it would not be any easier for the language system to memoize.
There is also no benefit to WMERGE-3-4C in a concurrent environment. Each call to WMERGE-3-4C does all its work before passing the buck to another instance of WMERGE-3-4C. In fact, if we eliminated the recursion by hand, WMERGE-3-4C could be implemented as a single loop that does all the work while accumulating a stack, then a second loop that does the REDUCE work on the stack. The first loop would not be naturally parallelizable, though perhaps the REDUCE could operate by calling the function on pairs, in parallel, until only one element remained in the list.

Resources