Efficient way to write ordering instances? - sorting

I'm working on a basic Haskell exercise that is set up as follows: a data definition is made, where Zero is declared to be a NaturalNumber, and a series of numbers (printed out by name, so, for instance, four) up to ten is constructed with this.
I didn't have too much trouble with understanding how the declaration of Eq instances works (apart from not having been given an exact explanation for the syntax), but I'm having trouble with declaring all instances I need for Ord -- I need to be able to construct an ordering over the entire set of numbers, such that I'll get True if I input "ten > nine" or something.
Right now, I have this snippet of code. The first two lines should be correct, as I copied them (as I was supposed to) from the exercise itself.
instance Ord NaturalNumber where
compare Zero Zero = EQ
compare Zero (S Zero) = LT
compare (S Zero) Zero = GT
compare x (S x) = LT
The first four lines work fine, but they can't deal with cases like "compare four five", and anything similar to what I typed in the last line doesn't work, even if I type in something like compare four four = EQ: I get a "conflicting definitions" error, presumably because the x appears twice. If I write something like compare two one = GT instead, I get a "pattern match(es) are overlapped" warning, but it works. However, I also get the result GT when I input compare one two into the actual Haskell platform, so clearly something isn't working. This happens even if I add compare one two = LT below that line.
So clearly I can't finish off this description of Ord instances by writing every instance I could possibly need, and even if I could, it would be incredibly inefficient to write out all 100 instances by hand.
Might anyone be able to provide me with a hint as to how I can resolve this problem and finish off the construction of an ordering mechanism?

What this task focuses on is finding base cases and recursion rules. The first two lines you were given were
instance Ord NaturalNumber where
compare Zero Zero = EQ
This is the first base case, in words:
zero is equal to zero
The other two base cases are:
zero is less than the successor of any NaturalNumber
the successor of any NaturalNumber is greater than zero
Note that your lines three and four only say that 0 < 1 and 1 > 0, but nothing about any other nonzero numbers.
The recursion rule, then, is that it makes no difference if you compare two nonzero numbers, or the numbers they are successors of:
comparing 1 + x and 1 + y is the same as comparing x and y.
Codifying that into Haskell should give you the solution to this exercise.

You'll need to organize your instances in a way that will cover all possible patterns. To make it simpler, remember how your numbers are defined:
one = S Zero
two = S one -- or S (S Zero)
and think in terms of S and Zero, not one, two etc. (they are merely aliases). Once you do this, it should become clear that you're missing a case like:
compare (S x) (S y) = compare x y
Edit:
As Jakob Runge noticed, the following base clauses should also be improved:
compare Zero (S Zero) = LT
compare (S Zero) Zero = GT
As they're written, they allow comparison only between zero and one. You should change them to cover comparison between zero and any positive number:
compare Zero (S _) = LT
compare (S _) Zero = GT
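For readers who think more easily in an imperative language, the same three base cases and the recursion rule can be sketched in Python. The class names Zero and S mirror the constructors in the exercise, but the pred field, the nat helper and the string results are assumptions made for this illustration, and the recursion is unrolled into a loop.

```python
from dataclasses import dataclass


class Zero:
    """Stands in for the Zero constructor."""


@dataclass(frozen=True)
class S:
    """Stands in for the successor constructor; pred is a hypothetical field name."""
    pred: object


def compare(a, b):
    # Recursion rule: comparing S x with S y is the same as comparing x with y.
    while isinstance(a, S) and isinstance(b, S):
        a, b = a.pred, b.pred
    # Base cases: at least one side is now Zero.
    if isinstance(a, S):
        return "GT"  # S _ compared with Zero
    if isinstance(b, S):
        return "LT"  # Zero compared with S _
    return "EQ"      # Zero compared with Zero


def nat(n):
    """Build the Peano-style value for a non-negative Python int (helper for the demo)."""
    value = Zero()
    for _ in range(n):
        value = S(value)
    return value


print(compare(nat(4), nat(5)))
```

Only the shape of the outermost constructor is inspected at each step, which is why no case ever needs to mention one, two, and so on by name.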

Your compare function needs to be recursive. You will want your last case to capture the situation where both arguments are the successor of something, and then recurse on what they are the successor of. Additionally, your middle two cases are probably not what you want, as they will only capture the following cases:
1 > 0
0 < 1
You would like this to be more general, so that you can handle cases like:
S x > 0, for all x
0 < S x, for all x


Where are these negatives coming from in Maple execution?

I am interested in simulating the phenomenon of "regression to the mean". Say a 0-1 vector V of length N is "gifted" if the number of 1s in V is greater than N/2 + 5*sqrt(N).
I want Maple to evaluate a string of M 0-1 lists, each of length N, to determine whether they are gifted.
Then, given that list V[i] is gifted, I want to evaluate the probability that list V[i+1] is gifted.
So far my code is failing in a strange way. At this point, all the code is supposed to do is create the list of sums (called 'total') and the list 'g', which carries a 0 if total[i] <= N/2 + 5*sqrt(N), and a 1 otherwise.
Here is the code:
RS:=proc(N) local ra,i:
ra:=rand(0..1):
[seq(ra(),i=1..N)]:
end:
Gift:=proc(N,M) local total, i, g :
total:=[seq(add(RS(N)),i=1..M)]:
g:=[seq(0,i=1..M)]:
for i from 1 to M do
if total[i] > (N/2 + 5*(N^(1/2))) then
g[i]:=1
fi:
od:
print(total, g)
end:
The trouble is, Maple responds, when I try Gift(100,20),
"Error, (in Gift) cannot determine if this expression is true or false: 5*100^(1/2) < -2"
or, when I try Gift(10000,20), "Error, (in Gift) cannot determine if this expression is true or false: 5*10000^(1/2) < -103."
Where are these negative numbers coming from? And why can't Maple tell whether 5(10000)^{1/2} < -103 or not?
The negative quantities are simply the part of the inequality that results when the portion with the radical is moved to one side and the purely rational portion is moved to the other.
Use an appropriate mechanism for the resolution of the conditional test. For example,
if is( total[i] > (N/2 + 5*N^(1/2)) ) then
...etc
or, say,
temp := evalf(N/2 + 5*N^(1/2));
for i from 1 to M do
if total[i] > temp then
...etc
From the Maple online help:
Important: The evalb command does not simplify expressions. It may return false for a relation that is true. In such a case, apply a simplification to the relation before using evalb.
...
You must convert symbolic arguments to floating-point values when using the evalb command for inequalities that use <, <=, >, or >=.
In this particular example, Maple chokes when trying to determine if the symbolic square root is less than -2, though it tried its best to simplify before quitting.
One fix is to apply evalf to inequalities. Rather than, say, evalb(x < y), you would write evalb(evalf(x < y)).
As to why Maple can't handle these inequalities, I don't know.
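As a cross-check outside Maple, the same computation can be sketched in Python, where the threshold is a floating-point number from the start and the comparison always resolves to true or false. The function name gift and the seed parameter are assumptions made for this illustration; they do not correspond to anything in Maple.

```python
import math
import random


def gift(n, m, seed=None):
    """Return (totals, flags) for m random 0-1 lists of length n."""
    rng = random.Random(seed)
    # Numeric threshold, the analogue of evalf(N/2 + 5*N^(1/2)) in Maple.
    threshold = n / 2 + 5 * math.sqrt(n)
    totals = [sum(rng.randint(0, 1) for _ in range(n)) for _ in range(m)]
    flags = [1 if t > threshold else 0 for t in totals]
    return totals, flags


totals, flags = gift(100, 20, seed=0)
print(totals)
print(flags)
```

For n = 100 the threshold is exactly 100, which no sum of 100 bits can exceed, so every flag in this particular run is 0.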

Maximum sum of sequence

Suppose we have a sequence of x numbers and x-1 operators (+ or -), where the order of the numbers and the operators is fixed. For example 5-2-1+3. Different parenthesizations give different values. For example (5 - 2)-1+3 = 5, 5-(2-1)+3=7 and so on. I am interested in the maximum value, ideally computed in linear time and memory.
I think that this problem can be solved with dynamic programming, but I simply can't find a workable formulation.
What you need here is certainly a dynamic programming algorithm.
This would work in a recursive way, finding the maximum value that can be gotten for every range.
Algorithm:
You could separate the numbers and the operators into different lists (if the first number is positive add + to the list first).
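def max_sum(expression, operators):
    # operators[i] is the sign in front of expression[i]; operators[0] is '+'
    if len(expression) == 1:
        return expression[0]
    max_value = -float('inf')  # minus infinity
    length = len(expression)
    for i in range(1, length):  # split between positions i-1 and i
        left_exp = max_sum(expression[0:i], operators[0:i])
        # the right part is evaluated on its own, so it starts with '+'
        right_exp = max_sum(expression[i:length], ['+'] + operators[i+1:length])
        if operators[i] == '+':
            value = left_exp + right_exp
        else:
            value = left_exp - right_exp
        if value >= max_value:
            max_value = value
    return max_value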
The main idea of the algorithm is that it checks the maximum sums in every possible range division, goes all the way down recursively and then returns the maximum sum it got.
The pseudo-code doesn't take into account the case where you could get a maximum value by subtracting the minimum value of the right expression, but with a few tweaks I think you could fix it pretty fast.
I tried to make the pseudo-code as easy to convert to code as possible off the top of my head; I hope this helps you.
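One way to make the tweak mentioned above concrete is to track both the minimum and the maximum of every range, so that a subtraction can use the minimum of its right-hand side. The following is a sketch under stated assumptions rather than a definitive solution: the name max_parenthesized is hypothetical, operators[i] is taken to be the operator between numbers[i] and numbers[i+1] (a different convention from the pseudo-code), and the running time is cubic in the number of operands, not linear.

```python
from functools import lru_cache


def max_parenthesized(numbers, operators):
    """Maximum value over all parenthesizations of numbers joined by operators."""

    @lru_cache(maxsize=None)
    def extrema(lo, hi):
        # (min, max) over all parenthesizations of numbers[lo..hi], inclusive.
        if lo == hi:
            return numbers[lo], numbers[lo]
        best_min, best_max = float('inf'), float('-inf')
        for k in range(lo, hi):  # operators[k] is applied last
            lmin, lmax = extrema(lo, k)
            rmin, rmax = extrema(k + 1, hi)
            if operators[k] == '+':
                cand_min, cand_max = lmin + rmin, lmax + rmax
            else:
                # Subtracting the largest right value gives the minimum,
                # subtracting the smallest right value gives the maximum.
                cand_min, cand_max = lmin - rmax, lmax - rmin
            best_min = min(best_min, cand_min)
            best_max = max(best_max, cand_max)
        return best_min, best_max

    return extrema(0, len(numbers) - 1)[1]


print(max_parenthesized([5, 2, 1, 3], ['-', '-', '+']))  # prints 7
```

The memoization on (lo, hi) is what turns the exponential recursion into a polynomial one; each range is solved once.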
Let an expression be a sequence of operator-number pairs: it starts with an operator followed by a number, and ends with an operator followed by a number. Your example 5-2-1+3 can be made into an expression by placing a + at the beginning: +5-2-1+3.
Let the head of an expression be its first operator-number pair, and its tail, the rest. The head of +5-2-1+3 is +5 and the tail, -2-1+3.
In this context, let parenthesizing an expression mean placing an opening parenthesis just after the first operator and a closing parenthesis at the end of the expression, like so: +(5-2-1+3). Parenthesizing an expression with a positive head doesn't do anything. Parenthesizing an expression with a negative head is equivalent to changing every sign of its tail: -(5 -2-1+3) = -5 +2+1-3.
If you want to get an extremum by parenthesizing some of its subexpressions, then you can first make some simplifications. It's easy to see that any subexpression of the form +x1+x2+...+xn won't be split: all of its elements will be used together towards the extremum. Similarly, any subexpression of the form -x1-x2-...-xn won't be split, but may be parenthesized (-(x1-x2-...-xn)). Therefore, you can first simplify any subexpression of the first form into +X, where X is the sum of its elements, and any subexpression of the second form into -x1-X, where X is the sum of its tail elements.
The resulting expression cannot have 3 consecutive - operators or 2 consecutive + operators. Now, start from the end, find the first subexpression of the form -a-b, -a+b-c, or -a+b, and compute its potential minimum and its potential maximum:
min(-a-b) = -a-b
max(-a-b) = -(a-b)
min(-a+b-c) = -(a+b)-c
max(-a+b-c) = -a+b-c if b>=c, max(-a+b-c) = -(a+b-c) if b<=c
min(-a+b) = -(a+b)
max(-a+b) = -a+b
Repeat by treating that subexpression as a single operator-number pair in the next one, albeit with two possible values (its two extrema). This way, the extrema of each subsequent subexpression is computed until you get to the main expression, of which you can simply compute the maximum. Note that the main expression may have a positive first pair, which makes it a special case, but that's easy to take into account: just add it to the maximum.

Scope of variables and the digits function

My question is twofold:
1) As far as I understand, constructs like for loops introduce scope blocks; however, I'm having some trouble with a variable that is defined outside of said construct. The following code depicts an attempt to extract digits from a number and place them in an array.
n = 654068
l = length(n)
a = Int64[]
for i in 1:(l-1)
temp = n/10^(l-i)
if temp < 1 # ith digit is 0
a = push!(a,0)
else # ith digit is != 0
push!(a,floor(temp))
# update n
n = n - a[i]*10^(l-i)
end
end
# last digit
push!(a,n)
The code executes fine, but when I look at the a array I get this result
julia> a
0-element Array{Int64,1}
I thought that anything that goes on inside the for loop is invisible to the outside, unless I'm operating on variables defined outside the for loop. Moreover, I thought that by using the ! syntax I would operate directly on a, this does not seem to be the case. Would be grateful if anyone can explain to me how this works :)
2) The second question is about the syntax used when explaining functions. There is apparently a function called digits that extracts digits from a number and puts them in an array. Using the help function I get
julia> help(digits)
Base.digits(n[, base][, pad])
Returns an array of the digits of "n" in the given base,
optionally padded with zeros to a specified size. More significant
digits are at higher indexes, such that "n ==
sum([digits[k]*base^(k-1) for k=1:length(digits)])".
Can anyone explain to me how to interpret the information given about functions in Julia? How am I to interpret digits(n[, base][, pad])? How does one correctly call the digits function? It can't be like this: digits(40125[, 10])?
I'm unable to reproduce your result; running your code gives me
julia> a
1-element Array{Int64,1}:
654068
There are a few mistakes and inefficiencies in the code:
length(n) doesn't give the number of digits in n, but always returns 1 (currently, numbers are iterable, and return a sequence that contains only one number: itself). So the for loop is never run.
/ between integers does floating point division. For extracting digits, you're better off with div(x,y), which does integer division.
There's no reason to write a = push!(a,x), since push! modifies a in place. So it will be equivalent to writing push!(a,x); a = a.
There's no reason to treat digits that are zero specially; they are handled just fine by the general case.
Your description of scoping in Julia seems to be correct, I think that it is the above which is giving you trouble.
You could use something like
n = 654068
a = Int64[]
while n != 0
    push!(a, n % 10)
    n = div(n, 10)
end
reverse!(a)
This loop extracts the digits in opposite order to avoid having to figure out the number of digits in advance, and uses the modulus operator % to extract the least significant digit. It then uses reverse! to get them in the order you wanted, which should be pretty efficient.
About the documentation for digits, [, base] just means that base is an optional parameter. The description should probably be digits(n[, base[, pad]]), since it's not possible to specify pad unless you specify base. Also note that digits will return the least significant digit first, which is what we get if we remove the reverse! from the code above.
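To illustrate both points in one place (the modulus-and-divide loop and what an optional parameter looks like at the call site), here is a rough Python sketch. The name digits_of is hypothetical and is not Julia's digits; the sketch assumes a non-negative integer, returns the most significant digit first, and, unlike the Julia loop above, returns [0] for an input of 0.

```python
def digits_of(n, base=10):
    """Digits of a non-negative integer n, most significant first."""
    if n == 0:
        return [0]
    out = []
    while n != 0:
        out.append(n % base)  # least significant digit of what is left
        n //= base            # integer division drops that digit
    out.reverse()             # digits were collected in reverse order
    return out


print(digits_of(654068))   # base omitted, so the default of 10 is used
print(digits_of(5, 2))     # base given explicitly
```

The optional argument is simply left out or supplied; the square brackets in the help text are notation for "optional" and are never typed.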
Is this cheating?:
n = 654068
nstr = string(n)
a = map((x) -> x |> string |> int , collect(nstr))
outputs:
6-element Array{Int64,1}:
6
5
4
0
6
8

Check whether a point is inside a rectangle by bit operator

Days ago, my teacher told me it was possible to check if a given point is inside a given rectangle using only bit operators. Is it true? If so, how can I do that?
This might not answer your question but what you are looking for could be this.
These are the tricks compiled by Sean Eron Anderson, who even offers a bounty of $10 to anyone who can find a single bug. The closest thing I found there is a macro that tests whether a word x has a byte between m and n:
Determine if a word has a byte between m and n
When m < n, this technique tests if a word x contains an unsigned byte value, such that m < value < n. It uses 7 arithmetic/logical operations when n and m are constant.
Note: Bytes that equal n can be reported by likelyhasbetween as false positives, so this should be checked by character if a certain result is needed.
Requirements: x>=0; 0<=m<=127; 0<=n<=128
#define likelyhasbetween(x,m,n) \
((((x)-~0UL/255*(n))&~(x)&((x)&~0UL/255*127)+~0UL/255*(127-(m)))&~0UL/255*128)
This technique would be suitable for a fast pretest. A variation that takes one more operation (8 total for constant m and n) but provides the exact answer is:
#define hasbetween(x,m,n) \
((~0UL/255*(127+(n))-((x)&~0UL/255*127)&~(x)&((x)&~0UL/255*127)+~0UL/255*(127-(m)))&~0UL/255*128)
It is possible if the number is a finite positive integer.
Suppose we have a rectangle with opposite corners (a1,b1) and (a2,b2). Given a point (x,y), we only need to evaluate the expression (a1<x) & (x<a2) & (b1<y) & (y<b2). So the problem now is to find a bit operation corresponding to the comparison c<d.
Let ci be the i-th bit of the number c (which can be obtained by masking and bit shifting). We prove that for numbers with at most n bits, c<d is equivalent to r_(n-1), where r_(-1) = 0 and
r_i = ((ci^di) & ((!ci)&di)) | (!(ci^di) & r_(i-1))
Proof: When ci and di are different, the left expression decides the result (it depends on ((!ci)&di)); otherwise the right expression decides it (it depends on r_(i-1), which is the comparison of the next lower bit).
The expression ((!ci)&di) is actually equivalent to the bit comparison ci < di. Hence, this recursive relation compares the numbers bit by bit, from the most significant bit down, until it reaches a position where the bits differ, and it is true exactly when c is smaller than d.
Hence there is a purely bitwise expression corresponding to the comparison operator, and so it is possible to check whether a point is inside a rectangle with purely bitwise operations.
Edit: There is actually no need for a conditional statement; just expand r_(n-1) fully and you are done.
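The recurrence can be checked with a short Python sketch. The names less_than and inside, and the default width of 32 bits, are assumptions made for this illustration; the loop only stands in for writing the expansion out by hand, and the inputs are assumed to be non-negative integers that fit in n_bits bits.

```python
def less_than(c, d, n_bits=32):
    """Return 1 if c < d, else 0, using only shifts, AND, OR and XOR."""
    r = 0  # r_(-1): with no bits compared yet, c is not less than d
    for i in range(n_bits):  # i = 0 is the least significant bit
        ci = (c >> i) & 1
        di = (d >> i) & 1
        diff = ci ^ di
        # r_i = (bits differ AND ci < di) OR (bits equal AND r_(i-1))
        r = (diff & (ci ^ 1) & di) | ((diff ^ 1) & r)
    return r


def inside(x, y, a1, b1, a2, b2):
    """1 if (x, y) lies strictly inside the rectangle with corners (a1,b1), (a2,b2)."""
    return less_than(a1, x) & less_than(x, a2) & less_than(b1, y) & less_than(y, b2)


print(inside(2, 3, 1, 1, 5, 5))
print(inside(5, 3, 1, 1, 5, 5))
```

Because later iterations handle more significant bits, the value left in r after the loop is decided by the highest position at which c and d differ, as the proof describes.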
x,y is in the rectangle {x0<x<x1 and y0<y<y1} if {x0<x and x<x1 and y0<y and y<y1}
If we can simulate < with bit operators, then we're good to go.
What does it mean to say something is < in binary? Consider
a: 0 0 0 0 1 1 0 1
b: 0 0 0 0 1 0 1 1
In the above, a>b, because it contains the first 1 whose counterpart in b is 0. We are thus seeking the leftmost bit such that myBit!=otherBit. (== or equiv is a bitwise operator which can be represented with and/or/not)
However we need some way to propagate information in one bit to many bits. So we ask ourselves this: can we "code" a function using only "bit" operators, which is equivalent to if(q,k,a,b) = if q[k] then a else b. The answer is yes:
We create a bit-word consisting of replicating q[k] onto every bit. There are two ways I can think of to do this:
1) Left-shift by k, then right-shift by wordsize (efficient, but only works if you have shift operators which duplicate the last bit)
2) Inefficient but theoretically correct way:
We left-shift q by k bits
We take this result and and it with 10000...0
We right-shift this by 1 bit, and or it with the non-right-shifted version. This copies the bit in the first place to the second place. We repeat this process until the entire word is the same as the first bit (e.g. 64 times)
Calling this result mask, our function is (mask and a) or (!mask and b): the result will be a if the kth bit of q is true, otherwise the result will be b
Taking the bit-vector c=a!=b and a==1111..1 and b==0000..0, we use our if function to successively test whether the first bit is 1, then the second bit is 1, etc:
a<b :=
if(c,0,
if(a,0, B_LESSTHAN_A, A_LESSTHAN_B),
if(c,1,
if(a,1, B_LESSTHAN_A, A_LESSTHAN_B),
if(c,2,
if(a,2, B_LESSTHAN_A, A_LESSTHAN_B),
if(c,3,
if(a,3, B_LESSTHAN_A, A_LESSTHAN_B),
if(...
if(c,64,
if(a,64, B_LESSTHAN_A, A_LESSTHAN_B),
A_EQUAL_B)
)
...)
)
)
)
)
This takes wordsize steps. It can however be written in 3 lines by using a recursively-defined function, or a fixed-point combinator if recursion is not allowed.
Then we just turn that into an even larger function: xMin<x and x<xMax and yMin<y and y<yMax
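The mask-based if function can be sketched in Python as follows. This is an illustration under assumptions, not the answer's exact construction: the names spread_bit and select and the 64-bit word size are hypothetical, bit k is counted from the least significant end (the answer counts from the left), and the bit is replicated by doubling shifts instead of one position at a time.

```python
WORD = 64                # assumed word size
FULL = (1 << WORD) - 1   # all-ones word, used to stay within WORD bits


def spread_bit(q, k):
    """Return a word whose every bit equals bit k of q."""
    m = (q >> k) & 1
    shift = 1
    while shift < WORD:
        m |= m << shift  # doubles the number of copies of the bit
        shift <<= 1
    return m & FULL


def select(q, k, a, b):
    """Bitwise version of: a if bit k of q is set, else b."""
    mask = spread_bit(q, k)
    return (mask & a) | (~mask & FULL & b)


print(select(0b100, 2, 7, 9))
print(select(0b100, 1, 7, 9))
```

Python integers are unbounded, which is why the sketch masks with FULL; in a language with fixed-width words that step is implicit.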

Calculate image of a set for a function represented as an array of ROBDD's

I have a set of integers, represented as a Reduced Ordered Binary Decision Diagram (ROBDD) (interpreted as a function which evaluates to true iff the input is in the set) which I shall call Domain, and an integer function (which I shall call F) represented as an array of ROBDD's (one entry per bit of the result).
Now I want to calculate the image of the domain for F. It's definitely possible, because it could trivially be done by enumerating all items from the domain, applying F, and inserting the result into the image. But that's a horrible algorithm with exponential complexity (linear in the size of the domain), and my gut tells me it can be faster. I've been looking into the direction of:
apply Restrict(Domain) to all bits of F
do magic
But the second step proved difficult. The result of the first step contains the information I need (at least, I'm 90% sure of it), but not in the right form. Is there an efficient algorithm to turn it into a "set encoded as ROBDD"? Do I need another approach?
Define two set-valued functions:
N(d1...dn): The subset of the image where members start with a particular sequence of digits d1...dn.
D(d1...dn): The subset of the inputs that produce N(d1...dn).
Then when the sequences are empty, we have our full problem:
D(): The entire domain.
N(): The entire image.
From the full domain we can define two subsets:
D(0) = The subset of D() such that F(x)[1]==0 for any x in D().
D(1) = The subset of D() such that F(x)[1]==1 for any x in D().
This process can be applied recursively to generate D for every sequence.
D(d1...d[m+1]) = D(d1...dm) & {x | F(x)[m+1]==d[m+1]}
We can then determine N(x) for the full sequences:
N(d1...dn) = 0 if D(d1...dn) = {}
N(d1...dn) = 1 if D(d1...dn) != {}
The parent nodes can be produced from the two children, until we've produced N().
If at any point we determine that D(d1...dm) is empty, then we know
that N(d1...dm) is also empty, and we can avoid processing that branch.
This is the main optimization.
The following code (in Python) outlines the process:
def createImage(input_set_diagram,function_diagrams,index=0):
    if input_set_diagram=='0':
        # If the input set is empty, the output set is also empty
        return '0'
    if index==len(function_diagrams):
        # The set of inputs that produce this result is non-empty
        return '1'
    function_diagram=function_diagrams[index]
    # Determine the branch for zero
    set0=intersect(input_set_diagram,complement(function_diagram))
    result0=createImage(set0,function_diagrams,index+1)
    # Determine the branch for one
    set1=intersect(input_set_diagram,function_diagram)
    result1=createImage(set1,function_diagrams,index+1)
    # Merge if the same
    if result0==result1:
        return result0
    # Otherwise create a new node
    return {'index':index,'0':result0,'1':result1}
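The outline leaves intersect and complement undefined. To see the recursion run end to end, here is a sketch that substitutes plain Python sets for the input-set diagram and ordinary functions for the per-bit diagrams. These stand-ins, and the names create_image and accepts, are assumptions made for this illustration; enumerating the set explicitly gives up the compactness that a real ROBDD library would provide.

```python
def create_image(inputs, bit_functions, index=0):
    """Build a decision diagram for the image, one output bit per level."""
    if not inputs:
        return '0'  # no input reaches this prefix, so it is not in the image
    if index == len(bit_functions):
        return '1'  # some input produces exactly this bit sequence
    f = bit_functions[index]
    ones = frozenset(x for x in inputs if f(x))  # stand-in for intersect(D, F_i)
    zeros = inputs - ones                        # stand-in for intersect(D, complement(F_i))
    result0 = create_image(zeros, bit_functions, index + 1)
    result1 = create_image(ones, bit_functions, index + 1)
    if result0 == result1:
        return result0  # this bit does not matter, so skip the node
    return {'index': index, '0': result0, '1': result1}


def accepts(diagram, bits):
    """Follow the diagram for a tuple of output bits and report membership."""
    while isinstance(diagram, dict):
        diagram = diagram[str(bits[diagram['index']])]
    return diagram == '1'


# Example: F(x) = (2 * x) % 4 on the domain {0, 1, 2, 3}, most significant bit first.
domain = frozenset(range(4))
bit_functions = [lambda x: ((2 * x) % 4 >> 1) & 1, lambda x: ((2 * x) % 4) & 1]
image = create_image(domain, bit_functions)
print(accepts(image, (1, 0)), accepts(image, (0, 1)))
```

In this example the first output bit takes both values with identical sub-results, so the merge step removes its node and the diagram tests only the second bit.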
Let S(x1, x2, x3...xn) be the indicator function for the set S, so that S(x1, x2...xn) = true if (x1, x2,...xn) is an element of S. Let F1(x1, x2, x3... xn), F2(),... Fn() be the individual functions that define F. Then I could ask if a particular bit pattern, with wild cards, is in the image of F by forming the equation e.g. S() & F1() & ~F2() for bit-pattern 10 and then solving this equation, which I presume that I can do since it is an ROBDD.
Of course you want a general indicator function, which tells me if abc is in the image. Extending the above, I think you get S() & (a&F1() | ~a&~F1()) & (b&F2() | ~b&~F2()) &... If you then re-order the variables so that the original x1, x2, ... xn occur last in the ROBDD order, then you should be able to prune the tree to return true for the case where any setting of the x1, x2, ... xn leads to the value true, and to return false otherwise.
(of course you could run out of space, or patience, waiting for the re-ordering to work).
