Ordered set and natural bijection (combinatorial species) - algorithm

Let A be some set (e.g. 1000, 1001, 1002, ..., 1999).
Let lessThan be some order relation (e.g. (a lessThan b) <-> (a > b)).
Let index be a function (with inverse index') mapping each element of A to a natural number.
Example:
index a = 2000 - a
index' n = 2000 - n
Is there some way to construct the index (and index') functions in P (polynomial time) for all (or some kinds of) (A, lessThan) pairs?
Best regards and thanks in advance!
EDITED: A could be a set given only by definition (e.g. all combinations with repetition of another big subset), so we can't assume A is completely traversable (in P).
EDITED: another non-trivial example: let An be a set (with elements like (x, y, p)) whose elements are ordered clockwise in an n × n square, like this:
1 2 3 4
12 13 14 5
11 16 15 6
10 9 8 7
Then we can map each triplet in An to Bn = [1..n^2] in O(1) (which is polynomial).
Given an element of An, index maps it to Bn in O(1).
Given an element of Bn, index' maps it back to An in O(1).
// Perimeter of concentric square x; x = 1, 2, 3, ... counted from the outside in
Func<int, int, int> perimeter = ( x, n ) => 4 * ( n - 2 * x + 1 );
// Given the main-diagonal coordinates (x, x) = (1, 1), (2, 2), ..., return the cell number at that corner
Func<int, int, int> diagonalPos = ( x, n ) => -4 * x * x + ( 4 * n + 8 ) * x - 4 * n - 3;
// Given a cell number z, return the index of the concentric square that contains it
Func<int, int, int> inSquare = ( z, n ) => (int) Math.Floor(n * 0.5 - 0.5 * Math.Sqrt(n * n - z + 1.0) + 1.0);
// Given a cell number z, return its (x, y) coordinates
Func<int, int, Point> coords = ( z, n ) => {
    var s = inSquare(z, n);
    var l = perimeter(s, n) / 4; // edge length of the sub-square, minus 1
    var l2 = l + l;
    var l3 = l2 + l;
    var d = diagonalPos(s, n);
    if( z <= d + l )  // top edge, left to right
        return new Point(s + z - d, s);
    if( z <= d + l2 ) // right edge, top to bottom
        return new Point(s + l, s + z - d - l);
    if( z <= d + l3 ) // bottom edge, right to left
        return new Point(s + d + l3 - z, s + l);
    return new Point(s, s + d + l2 + l2 - z); // left edge, bottom to top
};
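The snippet above only covers index' (cell number to coordinates). As a rough sketch of both directions, here is a Python version under a few assumptions of mine: coordinates are 1-based (x, y) = (column, row), the third component p of the triplet is ignored, and the names index, index_inv and diagonal_pos are just illustrative. The ring lookup uses integer square roots instead of floating point, which is a substitution on my part, not something the original code does.

```python
from math import isqrt

def diagonal_pos(s, n):
    """Cell number at the top-left corner (s, s) of ring s (same formula as diagonalPos)."""
    return -4 * s * s + (4 * n + 8) * s - 4 * n - 3

def index(x, y, n):
    """Map 1-based (x, y) = (column, row) to its spiral number in 1..n*n."""
    s = min(x, y, n + 1 - x, n + 1 - y)   # ring that contains the cell
    l = n - 2 * s + 1                     # ring edge length minus one
    d = diagonal_pos(s, n)
    if y == s:                            # top edge, left to right
        return d + (x - s)
    if x == s + l:                        # right edge, top to bottom
        return d + l + (y - s)
    if y == s + l:                        # bottom edge, right to left
        return d + 2 * l + (s + l - x)
    return d + 3 * l + (s + l - y)        # left edge, bottom to top

def index_inv(z, n):
    """Map a spiral number z in 1..n*n back to 1-based (x, y)."""
    m = n * n - z + 1          # cells from z up to n*n, inclusive
    k = isqrt(m - 1) + 1       # ceil(sqrt(m)): candidate side length of the ring
    if (n - k) % 2:            # ring sides have the same parity as n
        k += 1
    s = (n - k) // 2 + 1       # ring number, 1 = outermost
    l = n - 2 * s + 1
    d = diagonal_pos(s, n)
    if z <= d + l:
        return (s + z - d, s)
    if z <= d + 2 * l:
        return (s + l, s + z - d - l)
    if z <= d + 3 * l:
        return (s + d + 3 * l - z, s + l)
    return (s, s + d + 4 * l - z)
```

For the 4 × 4 square shown in the question, index(4, 2, 4) gives 5 and index_inv(12, 4) gives (1, 2). Each call does a constant number of arithmetic operations, so both directions are O(1) if arithmetic on the cell numbers is counted as constant time.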
(I have read about "Combinatorial species", "Ordered construction of combinatorial objects", the "species" Haskell package and others.)

I may be misunderstanding what you want, but in case I'm not:
If lessThan defines a total order on the set, you can create the index and index' functions by
1) converting the set to a list (or an array/vector),
2) sorting that according to lessThan,
3) constructing index' as Data.Map.fromDistinctAscList $ zip [1 .. ] sortedList,
4) constructing index as Data.Map.fromDistinctAscList $ zip (map NTC sortedList) [1 .. ],
where NTC is a newtype constructor wrapping the type of elements of the set in a newtype whose Ord instance is given by lessThan.
newtype Wrapped = NTC typeOfElements
instance Eq Wrapped where
    (NTC x) /= (NTC y) = x `lessThan` y || y `lessThan` x
    -- that can usually be done more efficiently
instance Ord Wrapped where
    (NTC x) <= (NTC y) = not $ y `lessThan` x
Regarding your edit, "A could be a set by definition (eg. all combinations with repetition of another big subset), then, we can't suppose A is completely traversable (in P)":
In that case, unless I'm missing something fundamental, it's impossible in principle, because the index' function would itself provide a complete traversal of the set (just evaluate index' 1, index' 2, ...).
So you can create the index and index' functions in polynomial time if and only if the set is traversable in polynomial time.
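For the traversable case, the sort-then-number recipe can be sketched in a few lines. This is a Python illustration rather than the Haskell above; build_index is a name I made up, and the sketch assumes the elements are hashable and that lessThan is a strict total order.

```python
from functools import cmp_to_key

def build_index(elements, less_than):
    """Return (index, index_inv) as dicts, numbering elements 1, 2, 3, ... in less_than order."""
    def compare(a, b):
        # Turn the strict order into the -1/0/1 comparator that sorted() expects.
        if less_than(a, b):
            return -1
        if less_than(b, a):
            return 1
        return 0

    ordered = sorted(elements, key=cmp_to_key(compare))
    index_inv = dict(enumerate(ordered, start=1))       # natural -> element
    index = {a: k for k, a in index_inv.items()}        # element -> natural
    return index, index_inv

# The example from the question: A = 1000..1999, (a lessThan b) <-> (a > b).
index, index_inv = build_index(range(1000, 2000), lambda a, b: a > b)
```

With this order, index[a] agrees with the question's closed form 2000 - a. The cost is dominated by the sort, so it is polynomial in the size of A, which is exactly why it does not help when A itself is too large to enumerate.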

Related

How can I speed up this haskell lastDigits x y function?

I have a Haskell assignment in which I have to create a function lastDigits x y of two arguments that calculates the last y digits of the sum of all i^i for i in (0..x). Mine is too slow and I need to speed it up. Does anyone have any ideas?
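-- Split a number into its decimal digits.
list :: Integral x => x -> [x]
list 0 = []
list x = list (div x 10) ++ [mod x 10]
-- i^i reduced modulo 10^y, for every i up to x (the power itself is still computed in full).
sqrall :: Integer -> Int -> [Integer]
sqrall x y = [mod (mod i (10^y) ^ i) (10^y) | i <- [1..x]]
lastDigits :: Integer -> Int -> [Integer]
lastDigits x y = drop (length digits - y) digits
  where digits = list (sum (sqrall x y))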
The main reason this takes too long is that you calculate the entire number x^x, which grows super-exponentially. This means that even for fairly small x, it will still take a considerable amount of time.
The point, however, is that you do not need to calculate the entire number. Indeed, you can make use of the fact that x×y mod n = (x mod n) × (y mod n) mod n. For example, Haskell's arithmoi package makes use of this:
powMod :: (Integral a, Integral b) => a -> b -> a -> a
powMod x y m
  | m <= 0    = error "powModInt: non-positive modulo"
  | y <  0    = error "powModInt: negative exponent"
  | otherwise = f (x `rem` m) y 1 `mod` m
  where
    f _ 0 acc = acc
    f b e acc = f (b * b `rem` m) (e `quot` 2)
      (if odd e then (b * acc `rem` m) else acc)
We can make a specific version for modulo 10 with:
pow10 :: Integral i => i -> i
pow10 x = go x x
  where go 0 _ = 1
        go i j | odd i = rec * j `mod` 10
               | otherwise = rec
          where rec = go (div i 2) ((j*j) `mod` 10)
This then matches x^x `mod` 10, except that we do not need to calculate the entire number:
Prelude> map pow10 [1 .. 20]
[1,4,7,6,5,6,3,6,9,0,1,6,3,6,5,6,7,4,9,0]
Prelude> [x^x `mod` 10 | x <- [1..20]]
[1,4,7,6,5,6,3,6,9,0,1,6,3,6,5,6,7,4,9,0]
Now that we have that, we can also add two last digits and keep only the last digit of the result. Each operand is at most 9, so the intermediate sum is at most 18:
sum10 :: Int -> Int -> Int
sum10 x y = (x + y) `mod` 10
we thus can calculate the last digit with:
import Data.List(foldl')
lastdigit :: Int -> Int
lastdigit x = foldl' sum10 0 (map pow10 [0 .. x])
For example for x = 26, we get:
Prelude Data.List> lastdigit 26
4
Prelude Data.List> sum [ x^x | x <- [0 .. 26] ]
6246292385799360560872647730684286774
I leave it as an exercise to generalize the above to calculate the last y digits. As long as y is relatively small, this will be efficient, since the numbers never take huge amounts of memory. Furthermore, if the numbers have an upper bound, addition, multiplication, etc. are done in constant time. If you use an Integer, however, the numbers can be arbitrarily large, and thus operations like addition are not constant time.
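To give an idea of where that generalization ends up, here is a small sketch. It is written in Python rather than Haskell because Python's built-in three-argument pow already does modular exponentiation; last_digits is a hypothetical name, and the sketch assumes y >= 1 and returns the digits as a single integer (so leading zeros are dropped).

```python
def last_digits(x, y):
    """Last y digits of sum(i**i for i in 0..x), computed modulo 10**y throughout."""
    m = 10 ** y
    total = 0
    for i in range(x + 1):
        # pow(i, i, m) never builds the full power i**i; note pow(0, 0, m) == 1 for m > 1.
        total = (total + pow(i, i, m)) % m
    return total
```

For y = 1 this reproduces the lastdigit function above, e.g. last_digits(26, 1) gives 4.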

Invariant induction over horn-clauses with Z3py

I am currently using Z3py to deduce some invariants which are encoded as a conjunction of Horn clauses, whilst also providing a template for the invariant. I'm starting with the simple example in the code snippet below.
x = 0;
while(x < 5){
x += 1
}
assert(x == 5)
This translates into the horn clauses
x = 0 => Inv(x)
x < 5 /\ Inv(x) => Inv(x +1)
Not( x < 5) /\ Inv(x) => x = 5
The invariant here is x <= 5.
I have provided a template for the invariant of the form a*x + b <= c
so that all the solver has to do is guess a set of values for a,b and c that can reduce to x <= 5.
However, when I encode it I keep getting unsat. If I try to assert Not(x == 5) instead, I get a = 2, b = 1/8 and c = 2, which makes little sense to me as a counterexample.
I provide my code below and would be grateful for any help on correcting my encoding.
from z3 import *
x = Real('x')
x_2 = Real('x_2')
a = Real('a')
b = Real('b')
c = Real('c')
s = Solver()
s.add(ForAll([x],And(
    Implies(x == 0 , a*x + b <= c),
    Implies(And(x_2 == x + 1, x < 5, a*x + b <= c), a*x_2 + b <= c),
    Implies(And(a*x + b <= c, Not(x < 5)), x==5)
)))
if (s.check() == sat):
    print(s.model())
Edit: it gets stranger for me. If I remove the x_2 definition and just replace x_2 with (x + 1) in the second Horn clause, as well as delete the x_2 == x + 1 constraint, I get unsat whether I write Not(x == 5) or x == 5 in the final Horn clause.
There were two things preventing your original encoding from working:
1) It's not possible to satisfy x_2 == x + 1 for all x for a single value of x_2. Thus, if you're going to write x_2 == x + 1, both x and x_2 need to be universally quantified.
2) Somewhat surprisingly, this problem is satisfiable in the integers but not in the reals. You can see the problem with the clause x < 5 /\ Inv(x) => Inv(x + 1). If x is an integer, then this is satisfied by x <= 5. However, if x is allowed to be any real value, then you could have x == 4.5, which satisfies both x < 5 and x <= 5, but not x + 1 <= 5, so Inv(x) = (x <= 5) does not satisfy this problem in the reals.
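Point 2 can be checked by hand without a solver. The sketch below (plain Python with exact fractions, no Z3 involved; inv and inductive_step are names chosen for illustration) evaluates the inductive clause for the candidate invariant x <= 5 at integer points and at x = 4.5.

```python
from fractions import Fraction

def inv(x):
    """Candidate invariant from the question: x <= 5."""
    return x <= 5

def inductive_step(x):
    """The clause  x < 5 /\\ Inv(x) => Inv(x + 1)  evaluated at one point x."""
    return not (x < 5 and inv(x)) or inv(x + 1)

# Holds at every integer we try ...
holds_on_integers = all(inductive_step(x) for x in range(-50, 51))
# ... but fails at the real value 4.5, since 5.5 <= 5 is false.
holds_at_4_5 = inductive_step(Fraction(9, 2))
```

This only tests one candidate invariant at sample points, so it illustrates the failure over the reals rather than proving anything about the full template.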
Also, you might find it helpful to define Inv(x); it cleans up the code quite a bit. Here is the encoding of your problem with those changes:
from z3 import *
# Changing these from 'Int' to 'Real' changes the problem from sat to unsat.
x = Int('x')
x_2 = Int('x_2')
a = Int('a')
b = Int('b')
c = Int('c')
def Inv(x):
    return a*x + b <= c
s = Solver()
# I think this is the simplest encoding for your problem.
clause1 = Implies(x == 0 , Inv(x))
clause2 = Implies(And(x < 5, Inv(x)), Inv(x + 1))
clause3 = Implies(And(Inv(x), Not(x < 5)), x == 5)
s.add(ForAll([x], And(clause1, clause2, clause3)))
# Alternatively, if clause2 is specified with x_2, then x_2 needs to be
# universally quantified. Note the ForAll([x, x_2]...
#clause2 = Implies(And(x_2 == x + 1, x < 5, Inv(x)), Inv(x_2))
#s.add(ForAll([x, x_2], And(clause1, clause2, clause3)))
# Print result all the time, to avoid confusing unknown with unsat.
result = s.check()
print(result)
if (result == sat):
    print(s.model())
One more thing: it's a bit strange to me to write a*x + b <= c as a template, because this is the same as a*x <= d for some integer d.

More efficient algorithm performs worse in Haskell

A friend of mine showed me a homework exercise from a C++ course which he attends. Since I already know C++ but have just started learning Haskell, I tried to solve the exercise in the "Haskell way".
These are the exercise instructions (I translated from our native language so please comment if the instructions aren't clear):
Write a program which reads non-zero coefficients (A,B,C,D) from the user and places them in the following equation:
A*x + B*y + C*z = D
The program should also read from the user N, which represents a range. The program should find all possible integral solutions for the equation in the range -N/2 to N/2.
For example:
Input: A = 2, B = -3, C = -1, D = 5, N = 4
Output: (-1,-2,-1), (0,-2,1), (0,-1,-2), (1,-1,0), (2,-1,2), (2,0,-1)
The most straight-forward algorithm is to try all possibilities by brute force. I implemented it in Haskell in the following way:
triSolve :: Integer -> Integer -> Integer -> Integer -> Integer -> [(Integer,Integer,Integer)]
triSolve a b c d n =
    let equation x y z = (a * x + b * y + c * z) == d
        minN = div (-n) 2
        maxN = div n 2
    in [(x,y,z) | x <- [minN..maxN], y <- [minN..maxN], z <- [minN..maxN], equation x y z]
So far so good, but the exercise instructions note that a more efficient algorithm can be implemented, so I thought about how to improve it. Since the equation is linear, and assuming Z is always the first variable to be incremented, once a solution has been found there's no point in incrementing Z any further. Instead, I should increment Y, reset Z to the minimum value of the range and keep going. This way I can save redundant evaluations.
Since there are no loops in Haskell (to my understanding, at least), I realized that such an algorithm should be implemented using recursion. I implemented it in the following way:
solutions :: (Integer -> Integer -> Integer -> Bool) -> Integer -> Integer -> Integer -> Integer -> Integer -> [(Integer,Integer,Integer)]
solutions f maxN minN x y z
    | solved = (x,y,z):nextCall x (y + 1) minN
    | x >= maxN && y >= maxN && z >= maxN = []
    | z >= maxN && y >= maxN = nextCall (x + 1) minN minN
    | z >= maxN = nextCall x (y + 1) minN
    | otherwise = nextCall x y (z + 1)
    where solved = f x y z
          nextCall = solutions f maxN minN
triSolve' :: Integer -> Integer -> Integer -> Integer -> Integer -> [(Integer,Integer,Integer)]
triSolve' a b c d n =
    let equation x y z = (a * x + b * y + c * z) == d
        minN = div (-n) 2
        maxN = div n 2
    in solutions equation maxN minN minN minN minN
Both yield the same results. However, trying to measure the execution time yielded the following results:
*Main> length $ triSolve' 2 (-3) (-1) 5 100
3398
(2.81 secs, 971648320 bytes)
*Main> length $ triSolve 2 (-3) (-1) 5 100
3398
(1.73 secs, 621862528 bytes)
This means that the dumb algorithm actually performs better than the more sophisticated one. Assuming that my algorithm is correct (which I hope won't turn out to be wrong :) ), I suspect that the second algorithm suffers from overhead created by the recursion, which the first one avoids since it's implemented using a list comprehension.
Is there a way to implement a better algorithm than the dumb one in Haskell?
(Also, I'd be glad to receive general feedback about my coding style.)
Of course there is. We have:
a*x + b*y + c*z = d
and as soon as we fix values for x and y, we have that
a*x + b*y = p
where p is a number we know.
Hence
c*z = d - p
z = (d - p) / c
And we keep only the integral zs that lie within the range.
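As a quick cross-check of this idea against the example in the question, here is a small sketch in Python (chosen only because it is easy to run; solve_for_z is an illustrative name, and the range is taken as -(N div 2) .. N div 2). It loops over x and y only and solves for z directly.

```python
def solve_for_z(a, b, c, d, n):
    """All integer (x, y, z) in the range with a*x + b*y + c*z == d, found by solving for z."""
    lo, hi = -(n // 2), n // 2
    out = []
    for x in range(lo, hi + 1):
        for y in range(lo, hi + 1):
            rest = d - (a * x + b * y)     # this must equal c * z
            if rest % c == 0:              # keep only integral z (c is non-zero by the exercise)
                z = rest // c
                if lo <= z <= hi:
                    out.append((x, y, z))
    return out
```

With the exercise's input, solve_for_z(2, -3, -1, 5, 4) returns the six triples listed in the expected output, in the same order. The work drops from about N^3 candidate triples to about N^2 candidate pairs.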
It's worth noticing that list comprehensions are given special treatment by GHC, and are generally very fast. This could explain why your triSolve (which uses a list comprehension) is faster than triSolve' (which doesn't).
For example, the solution
solve :: Integer -> Integer -> Integer -> Integer -> Integer -> [(Integer,Integer,Integer)]
-- "Buffalo buffalo buffalo buffalo Buffalo buffalo buffalo..."
solve a b c d n =
    [(x,y,z) | x <- vals, y <- vals
             , let p = a*x + b*y
             , let z = (d - p) `div` c
             , z >= minN, z <= maxN, c * z == d - p ]
  where
    minN = negate (n `div` 2)
    maxN = (n `div` 2)
    vals = [minN..maxN]
runs fast on my machine:
> length $ solve 2 (-3) (-1) 5 100
3398
(0.03 secs, 4111220 bytes)
whereas the equivalent code written using do notation:
import Control.Monad (guard)
solveM :: Integer -> Integer -> Integer -> Integer -> Integer -> [(Integer,Integer,Integer)]
solveM a b c d n = do
    x <- vals
    y <- vals
    let p = a * x + b * y
        z = (d - p) `div` c
    guard $ z >= minN
    guard $ z <= maxN
    guard $ z * c == d - p
    return (x,y,z)
  where
    minN = negate (n `div` 2)
    maxN = (n `div` 2)
    vals = [minN..maxN]
takes twice as long to run and uses twice as much memory:
> length $ solveM 2 (-3) (-1) 5 100
3398
(0.06 secs, 6639244 bytes)
The usual caveats about testing within GHCi apply: if you really want to see the difference, you need to compile the code with -O2 and use a decent benchmarking library (like Criterion).

Haskell - list comprehension can't enumerate N × N

I have to write a function which returns a list of all pairs (x,y) where x, y ∈ N, and:
x is the product of two natural numbers (x = a • b, where a, b ∈ N), and
x is strictly greater than 5 but strictly less than 500, and
y is a square number (y = c² where c ∈ N) not greater than 1000, and
x is a divisor of y.
My attempt:
listPairs :: [(Int, Int)]
listPairs = [(a*b, y) | y <- [0..], a <- [0..], b <- [0..],
                        (a*b) > 5, (a*b) < 500, (y*y) < 1001,
                        mod y (a*b) == 0]
But it doesn't return anything and the computer works a lot on it.
However if I choose a smaller range for a, b and y e. g. [0..400], it takes up to a minute but it returns the right result.
So how could I solve the performance issue?
Nested list comprehensions over infinite lists do not terminate, of course: the innermost generator never finishes, so the outer ones never advance.
Fortunately, your lists need not be infinite. There's a limit. If x = a*b < 500 (with a, b >= 1), then we know that a < 500 and b < 500. Also, y = c*c < 1001 just means c < 32. So,
listPairs :: [(Int, Int)]
listPairs =
    [(x, c*c) | c <- [1..31], a <- [1..499],   -- a*b < 500 ==> b <= 499/a
                b <- [a..div 499 a],           -- a*b==b*a ==> b >= a
                let x = a*b, x > 5,
                -- (a*b) < 500, (c*c) < 1001,  -- no need to test this
                rem (c*c) x == 0]
mod 0 n == 0 trivially holds, so I'm excluding 0 from "natural numbers" here.
There are still some duplicates produced here, even though we've limited the b value to b >= a in x=a*b, because x can have several representations (e.g. 1*6 == 2*3).
You can use Data.List.nub to get rid of them.
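As a sanity check on the bounds and on the duplicates, the same enumeration can be sketched in Python (used here just as executable pseudocode; list_pairs is an illustrative name, and 0 is excluded from the naturals as above). A set plays the role of nub.

```python
def list_pairs():
    """All (x, y) with 5 < x < 500, x = a*b, y = c*c <= 1000 and x dividing y, without repeats."""
    pairs = set()                                # a set silently drops repeated (x, y) pairs
    for c in range(1, 32):                       # 31*31 = 961 <= 1000 < 32*32
        y = c * c
        for a in range(1, 500):
            for b in range(a, 499 // a + 1):     # b >= a and a*b <= 499
                x = a * b
                if x > 5 and y % x == 0:
                    pairs.add((x, y))
    return sorted(pairs)
```

Since every x can be written as 1 * x, the result is the same set you would get by looping over x from 6 to 499 directly, which is a simpler way to avoid the duplicates if the factorisation itself is not needed.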

Finding major axis/image orientation of binary image in R

I have a high res binary image which looks something like:
I'm trying to compute the major axis, which should be slightly rotated to the right, and eventually get the axis of orientation of the object.
A post here (in MATLAB) suggests that a way of doing this is to compute the covariance matrix of the data points and find its eigenvalues/eigenvectors.
I am trying to implement something similar in R.
%% MATLAB CODE Calculate axis and draw
[M N] = size(Ibw);
[X Y] = meshgrid(1:N,1:M);
%Mass and mass center
m = sum(sum(Ibw));
x0 = sum(sum(Ibw.*X))/m;
y0 = sum(sum(Ibw.*Y))/m;
#R code
d = dim(im)
M = d[1]
N = d[2]
t = meshgrid(M,N)
X = t[[2]]
Y = t[[1]]
m = sum(im);
x0 = sum(im %*% X)/m;
y0 = sum(im %*% Y)/m;
meshgrid <- function(r,c){
  return(list(R=matrix(rep(1:r, r), r, byrow=T),
              C=matrix(rep(1:c, c), c)))
}
However, computing m , x0 and y0 takes too long in R.
Does anyone know of an implementation in R?
Computing the variance matrix directly, with var, takes 1/3 of a second.
# Sample data
M <- 2736
N <- 3648
im <- matrix( FALSE, M, N );
y <- as.vector(row(im))
x <- as.vector(col(im))
im[ abs( y - M/2 ) < M/3 & abs( x - N/2 ) < N/3 ] <- TRUE
#image(im)
theta <- runif(1, -pi/12, pi/12)
xy <- cbind(x+1-N/2,y+1-M/2) %*% matrix(c( cos(theta), sin(theta), -sin(theta), cos(theta) ), 2, 2)
#plot(xy[,1]+N/2-1, xy[,2]+M/2-1); abline(h=c(1,M),v=c(1,N))
f <- function(u, lower, upper) pmax(lower,pmin(round(u),upper))
im[] <- im[cbind( f(xy[,2] + M/2 - 1,1,M), f(xy[,1] + N/2 - 1,1,N) )]
image(1:N, 1:M, t(im), asp=1)
# Variance matrix of the points in the rectangle
i <- which(im)
V <- var(cbind( col(im)[i], row(im)[i] ))
# Their eigenvectors
u <- eigen(V)$vectors
abline( M/2-N/2*u[2,1]/u[1,1], u[2,1]/u[1,1], lwd=5 )
abline( M/2-N/2*u[2,2]/u[1,2], u[2,2]/u[1,2] )
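For a 2 × 2 covariance matrix, the direction of the leading eigenvector also has a well-known closed form, theta = 0.5 * atan2(2 * cov_xy, var_x - var_y). The sketch below shows it in plain Python (no R or BLAS involved; orientation is an illustrative name, and the input is assumed to be a non-empty list of (x, y) coordinates of the foreground pixels).

```python
from math import atan2

def orientation(points):
    """Angle (radians, from the x axis) of the major axis of a 2-D point cloud."""
    n = len(points)
    mx = sum(x for x, _ in points) / n                       # centroid
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / n          # variance in x
    syy = sum((y - my) ** 2 for _, y in points) / n          # variance in y
    sxy = sum((x - mx) * (y - my) for x, y in points) / n    # covariance
    # Direction of the leading eigenvector of [[sxx, sxy], [sxy, syy]].
    return 0.5 * atan2(2 * sxy, sxx - syy)
```

The angle is expressed in whatever coordinate convention the points use, so with image rows increasing downwards the sign is flipped relative to the usual mathematical orientation. Dividing by n or n - 1 makes no difference to the angle.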
Try replacing the default Rblas.dll with a suitable one from this link.

Resources