Do programming languages have consistent interpretations of {1,...,n} when n = 0? - algorithm

On math.SE, a question about math notation generated a discussion of how programming languages interpret the set {1,...,n} when n=0.
The question asked for a mathematical notation to represent the R code 1:n.
According to the comments, the mathematical interpretation of {1,...,n} when n=0 is that this is an empty set. A subsequent comment suggested that C is consistent with this interpretation, because for (int i = 1; i < n; i++) yields an empty set: it iterates 0 times.
It is not clear to me what the equivalent statement in R is, but 1:0 returns the vector [1,0].
Thus, for (i in 1:0) print(i) iterates over 1 and 0 (which I interpret as analogous to the C code above).
Is this because {1,...,n} is not the correct notation for 1:n?
Does this mean R violates a universal rule?
Is there a consistent interpretation for this set among programming languages?

Each mathematical formalism has its own notation. To suggest that there is a "universal notation" is very "un-mathematical". Look at the notation associated with tensors or groups if you want examples of mathematical domains where multiple notational systems exist.
In R the code x <- 1:0 returns the ordered vector c(1,0), just as the code x <- 2:-2 returns c(2,1,0,-1,-2). The code x <- seq(1, length=0) returns a sequence of length 0, which is printed in console sessions as integer(0). R is not really designed to mimic set notation, but it does have some set functions, and it also has packages that more fully implement set notation.

C has no concept of a set that a for loop runs over. A for loop for(a;b;c) d; is simply syntactic sugar for:
a;
loop: if (!b) goto done;
d;
c;
goto loop;
done: ;
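The same desugaring can be mirrored in Python as a rough sketch. The function name desugared_for and the list of visited values are illustrative assumptions, not part of the C answer; the point is only that the loop is a test and a jump, so n = 0 simply fails the test on the first pass.

```python
def desugared_for(n):
    """Mirror `for (i = 1; i < n; i++) visit(i);` as an explicit test-and-jump loop."""
    visited = []
    i = 1                      # a: initialisation
    while True:
        if not (i < n):        # loop: if (!b) goto done;
            break
        visited.append(i)      # d: loop body
        i += 1                 # c: increment
    return visited             # done:


print(desugared_for(0))        # the test fails immediately, so the body never runs
print(desugared_for(4))
```

No set is ever constructed here; the "empty set" for n = 0 is just the absence of any iteration.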

See also my response at: Sequence construction that creates an empty sequence if lower is greater than upper bound. In R, seq_len(n) should be used in preference to 1:n for exactly this reason (the latter misbehaves when n=0).
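To make the difference concrete, here is a small Python emulation of the two R constructs for integer arguments. The names r_colon and seq_len are stand-ins written for this sketch (only seq_len borrows its name from R), and the emulation covers integers only.

```python
def r_colon(a, b):
    """Emulate R's a:b for integers: counts up or down, so it is never empty."""
    step = 1 if b >= a else -1
    return list(range(a, b + step, step))


def seq_len(n):
    """Emulate R's seq_len(n): 1..n, and empty when n is 0."""
    return list(range(1, n + 1))


print(r_colon(1, 0))   # counts down from 1 to 0
print(seq_len(0))      # empty
```

A loop over r_colon(1, 0) runs twice, while a loop over seq_len(0) does not run at all, which is why the latter is the safer idiom when n may be 0.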

Some languages support the concept of ranges. In C, what a for loop does is arbitrary: you could make it iterate zero times, or you could make it count backwards. In other languages, a range whose second number is less than the first often produces a decreasing sequence. But it's arbitrary, and there is no universal rule.

Related

Where are these negatives coming from in Maple execution?

I am interested in simulating the phenomenon of "regression to the mean". Say a 0-1 vector V of length N is "gifted" if the number of 1s in V is greater than N/2 + 5*sqrt(N).
I want Maple to evaluate a string of M 0-1 lists, each of length N, to determine whether they are gifted.
Then, given that list V[i] is gifted, I want to evaluate the probability that list V[i+1] is gifted.
So far my code is failing in a strange way. All it is supposed to do at this point is create the list of sums (called 'total') and the list 'g', which carries a 0 if total[i] <= N/2 + 5*sqrt(N) and a 1 otherwise.
Here is the code:
RS:=proc(N) local ra,i:
ra:=rand(0..1):
[seq(ra(),i=1..N)]:
end:
Gift:=proc(N,M) local total, i, g :
total:=[seq(add(RS(N)),i=1..M)]:
g:=[seq(0,i=1..M)]:
for i from 1 to M do
if total[i] > (N/2 + 5*(N^(1/2))) then
g[i]:=1
fi:
od:
print(total, g)
end:
The trouble is, Maple responds, when I try Gift(100,20),
"Error, (in Gift) cannot determine if this expression is true or false: 5*100^(1/2) < -2"
or, when I try Gift(10000,20), "Error, (in Gift) cannot determine if this expression is true or false: 5*10000^(1/2) < -103."
Where are these negative numbers coming from? And why can't Maple tell whether 5(10000)^{1/2} < -103 or not?
The negative quantities are simply what results when the term with the radical is moved to one side of the inequality and the purely rational terms are moved to the other. For example, with N = 100 and total[i] = 48, the test 48 > 50 + 5*100^(1/2) rearranges to 5*100^(1/2) < -2.
Use an appropriate mechanism for the resolution of the conditional test. For example,
if is( total[i] > (N/2 + 5*N^(1/2)) ) then
...etc
or, say,
temp := evalf(N/2 + 5*N^(1/2));
for i from 1 to M do
if total[i] > temp then
...etc
From the Maple online help:
Important: The evalb command does not simplify expressions. It may return false for a relation that is true. In such a case, apply a simplification to the relation before using evalb.
...
You must convert symbolic arguments to floating-point values when using the evalb command for inequalities that use <, <=, >, or >=.
In this particular example, Maple chokes when trying to determine if the symbolic square root is less than -2, though it tried its best to simplify before quitting.
One fix is to apply evalf to inequalities. Rather than, say, evalb(x < y), you would write evalb(evalf(x < y)).
As to why Maple can't handle these inequalities, I don't know.
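For comparison, here is a rough Python sketch of the same experiment with the threshold computed once as a floating-point number, in the spirit of the evalf suggestion above. The function name gift and the seed parameter are assumptions made for this sketch; they do not come from the Maple code.

```python
import math
import random


def gift(n, m, seed=None):
    """Return (totals, flags) for m random 0-1 lists of length n."""
    rng = random.Random(seed)
    # Numeric threshold, evaluated once up front (the analogue of evalf(...)).
    threshold = n / 2 + 5 * math.sqrt(n)
    totals = [sum(rng.randint(0, 1) for _ in range(n)) for _ in range(m)]
    flags = [1 if t > threshold else 0 for t in totals]
    return totals, flags


totals, flags = gift(100, 20, seed=1)
print(totals)
print(flags)
```

Because the comparison is between two plain numbers, nothing symbolic is left to decide. Note that for N = 100 the threshold is 50 + 5*10 = 100, which a list of 100 bits can never exceed, so every flag is 0 in that case.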

finding all first consecutive prime factors and find max of that by Mathematica

Let
2|n, 3|n,..., p_i|n, p_ j|n,..., p_k|n
p_i < p_ j< ... < p_k
where all primes up to p_i divide n and
j > i+1
I want to write a code in Mathematica to find p_i and determine {2,3,5,...,p_i}.
thanks.
B = {};
n = 2^6 * 3^8 * 5^3 * 7^2 * 11 * 23 * 29;
For[i = 1, i <= k, i++,
If[Mod[n, Prime[i]] == 0, AppendTo[B, Prime[i]];
If[Mod[n, Prime[i + 1]] > 0, Break[]]]];
mep1= Max[B];
B
mep1
result is
{2,3,5,7,11}
11
I would like to turn this code into a function B[n] instead of a fixed list B, since I need to plot mep1[n] for given n.
If I understand your question and code correctly you want a list of prime factors of the integer n but only the initial part of that list which matches the initial part of the list of all prime numbers.
I'll first observe that what you've posted looks much more like C or one of its relatives than like Mathematica. In fact you don't seem to have used any of the power of Mathematica's in-built functions at all. If you want to really use Mathematica you need to start familiarising yourself with these functions; if that doesn't appeal stick to C and its ilk, it's a fairly useful programming language.
The first step I'd take is to get the prime factors of n like this:
listOfFactors = Transpose[FactorInteger[n]][[1]]
Look at the documentation for the details of what FactorInteger returns; here I'm using transposition and Part to get only the list of prime factors and to drop their exponents. You may not recognise the use of the Part function; the doubled square brackets are its usual notation. Note also that I don't have Mathematica on this machine, so my syntax may be a bit awry.
Next, you want only those elements of listOfFactors which match the corresponding elements in the list of all prime numbers. Do this in two steps. First, get the integers from 1 to k at which the two lists match:
matches = TakeWhile[Range[Length[listOfFactors]],(listOfFactors[[#]]==Prime[#])&]
and then
listOfFactors[[matches]]
I'll leave it to you to:
assemble these fragments into the function you want;
correct the syntactical errors I have made; and
figure out exactly what is going on in each (sub-)expression.
I make no warranty that this approach is the best approach in any general sense, but it makes much better use of Mathematica's intrinsic functionality than your own first try and will, I hope, point you towards better use of the system in future.
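As a language-neutral cross-check of the intended result, the same idea can be sketched in Python: walk the primes in order and stop at the first one that does not divide n. The helper names primes and leading_prime_factors are made up for this sketch, and the trial-division generator is only meant for small primes.

```python
def primes():
    """Yield 2, 3, 5, ... by trial division against the primes found so far."""
    found = []
    candidate = 2
    while True:
        if all(candidate % p for p in found):
            found.append(candidate)
            yield candidate
        candidate += 1


def leading_prime_factors(n):
    """Return the longest initial run 2, 3, 5, ... of primes that all divide n."""
    if n == 0:
        raise ValueError("every prime divides 0, so the run would never end")
    run = []
    for p in primes():
        if n % p != 0:
            break
        run.append(p)
    return run


n = 2**6 * 3**8 * 5**3 * 7**2 * 11 * 23 * 29
factors = leading_prime_factors(n)
print(factors, max(factors))
```

For the n from the question this gives the run 2, 3, 5, 7, 11 with maximum 11, matching the posted output. For odd n the run is empty, so taking the maximum needs a guard in that case.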

Check whether a point is inside a rectangle by bit operator

Days ago, my teacher told me it was possible to check if a given point is inside a given rectangle using only bit operators. Is it true? If so, how can I do that?
This might not answer your question, but what you are looking for could be among the bit tricks compiled by Sean Eron Anderson (Bit Twiddling Hacks).
He even put up a bounty of $10 for anyone who can find a single bug. The closest thing I found there is a macro that determines whether a word x has a byte between m and n:
Determine if a word has a byte between m and n
When m < n, this technique tests if a word x contains an unsigned byte value, such that m < value < n. It uses 7 arithmetic/logical operations when n and m are constant.
Note: Bytes that equal n can be reported by likelyhasbetween as false positives, so this should be checked by character if a certain result is needed.
Requirements: x>=0; 0<=m<=127; 0<=n<=128
#define likelyhasbetween(x,m,n) \
((((x)-~0UL/255*(n))&~(x)&((x)&~0UL/255*127)+~0UL/255*(127-(m)))&~0UL/255*128)
This technique would be suitable for a fast pretest. A variation that takes one more operation (8 total for constant m and n) but provides the exact answer is:
#define hasbetween(x,m,n) \
((~0UL/255*(127+(n))-((x)&~0UL/255*127)&~(x)&((x)&~0UL/255*127)+~0UL/255*(127-(m)))&~0UL/255*128)
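To see what the exact variant computes, here is a Python transcription of hasbetween checked against a plain byte-by-byte test. It assumes a 64-bit unsigned long (Python integers are unbounded, so the word is masked explicitly); the names WORD_MASK, ONES and hasbetween_bytewise are introduced for this sketch only.

```python
WORD_MASK = (1 << 64) - 1          # assumed width of unsigned long
ONES = WORD_MASK // 255            # ~0UL/255 == 0x0101010101010101


def hasbetween(x, m, n):
    """Nonzero iff some byte v of x satisfies m < v < n (0<=m<=127, 0<=n<=128)."""
    low7 = x & (ONES * 127)                            # low 7 bits of each byte
    below_n = (ONES * (127 + n) - low7) & WORD_MASK    # high bit set where v < n
    above_m = (low7 + ONES * (127 - m)) & WORD_MASK    # high bit set where v > m
    not_high = ~x & WORD_MASK                          # high bit set where v < 128
    return below_n & not_high & above_m & (ONES * 128)


def hasbetween_bytewise(x, m, n):
    """Reference check: inspect the eight bytes one at a time."""
    return any(m < v < n for v in x.to_bytes(8, "little"))


print(bool(hasbetween(0x41, 64, 66)), hasbetween_bytewise(0x41, 64, 66))
```

Each byte lane is handled independently because, within the stated requirements, neither the subtraction nor the addition can borrow from or carry into a neighbouring byte.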
It is possible if the number is a finite positive integer.
Suppose we have a rectangle represented by the corners (a1,b1) and (a2,b2). Given a point (x,y), we only need to evaluate the expression (a1<x) & (x<a2) & (b1<y) & (y<b2). So the problem now is to find the corresponding bit operation for the expression c<d.
Let ci be the i-th bit of the number c (which can be obtained by masking and bit shifting). We prove that for numbers with at most n bits, c<d is equivalent to r_(n-1), where
r_i = ((ci^di) & ((!ci)&di)) | (!(ci^di) & r_(i-1))
Proof: When ci and di are different, the left expression might be true (depending on ((!ci)&di)); otherwise the right expression might be true (depending on r_(i-1), which is the comparison of the next bit).
The expression ((!ci)&di) is actually equivalent to the bit comparison ci < di. Hence this recursive relation compares the numbers bit by bit from left to right until we can decide whether c is smaller than d.
Hence there is a purely bitwise expression corresponding to the comparison operator, and so it is possible to test whether a point is inside a rectangle with pure bitwise operations.
Edit: There is actually no need for a conditional statement; just expand r_(n-1) and you are done.
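A minimal Python sketch of this recurrence follows. It assumes the base case r_(-1) = 0 (equal numbers are not "less than"), which the answer does not state explicitly, and it uses a loop purely as a compact way of writing out the expanded expression. The names less_than, inside and n_bits are illustrative.

```python
def less_than(c, d, n_bits=8):
    """Return 1 if c < d, else 0, for unsigned values that fit in n_bits."""
    r = 0                                   # assumed base case r_(-1) = 0
    for i in range(n_bits):                 # i = 0 is the lowest bit
        ci = (c >> i) & 1
        di = (d >> i) & 1
        differ = ci ^ di
        # r_i = ((ci^di) & (!ci & di)) | (!(ci^di) & r_(i-1))
        r = (differ & (ci ^ 1) & di) | ((differ ^ 1) & r)
    return r                                # this is r_(n-1)


def inside(x, y, a1, b1, a2, b2, n_bits=8):
    """1 if a1 < x < a2 and b1 < y < b2, using only the bitwise comparison."""
    return (less_than(a1, x, n_bits) & less_than(x, a2, n_bits)
            & less_than(b1, y, n_bits) & less_than(y, b2, n_bits))


print(inside(3, 4, 1, 1, 6, 6), inside(1, 4, 1, 1, 6, 6))
```

Since the highest bit is processed last, a difference there overrides anything decided by lower bits, which is the "left to right" priority described in the proof.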
x,y is in the rectangle {x0<x<x1 and y0<y<y1} if {x0<x and x<x1 and y0<y and y<y1}
If we can simulate < with bit operators, then we're good to go.
What does it mean to say something is < in binary? Consider
a: 0 0 0 0 1 1 0 1
b: 0 0 0 0 1 0 1 1
In the above, a>b, because a contains the first 1 whose counterpart in b is 0. We are thus seeking the leftmost bit such that myBit!=otherBit. (== or equiv is a bitwise operator which can be represented with and/or/not.)
However, we need some way to propagate information in one bit to many bits. So we ask ourselves this: can we "code" a function using only "bit" operators which is equivalent to if(q,k,a,b) = if q[k] then a else b? The answer is yes:
We create a bit-word consisting of replicating q[k] onto every bit. There are two ways I can think of to do this:
1) Left-shift by k, then right-shift by wordsize (efficient, but only works if you have shift operators which duplicate the last bit)
2) Inefficient but theoretically correct way:
We left-shift q by k bits
We take this result and AND it with 10000...0
We right-shift this by 1 bit, and OR it with the non-right-shifted version. This copies the bit in the first place to the second place. We repeat this process until the entire word is the same as the first bit (e.g. 64 times)
Calling this result mask, our function is (mask and a) or (!mask and b): the result will be a if the kth bit of q is true, otherwise the result will be b
Taking the bit-vector c=a!=b and a==1111..1 and b==0000..0, we use our if function to successively test whether the first bit is 1, then the second bit is 1, etc:
a<b :=
if(c,0,
if(a,0, B_LESSTHAN_A, A_LESSTHAN_B),
if(c,1,
if(a,1, B_LESSTHAN_A, A_LESSTHAN_B),
if(c,2,
if(a,2, B_LESSTHAN_A, A_LESSTHAN_B),
if(c,3,
if(a,3, B_LESSTHAN_A, A_LESSTHAN_B),
if(...
if(c,64,
if(a,64, B_LESSTHAN_A, A_LESSTHAN_B),
A_EQUAL_B)
)
...)
)
)
)
)
This takes wordsize steps. It can however be written in 3 lines by using a recursively-defined function, or a fixed-point combinator if recursion is not allowed.
Then we just turn that into an even larger function: xMin<x and x<xMax and yMin<y and y<yMax
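Here is one way the mask-based if and the nested comparison might look in Python, as a sketch under several assumptions: an 8-bit word (W), bits indexed from the least significant end rather than from the left, all-ones as the A_LESSTHAN_B result, and zero for both other outcomes. The loop stands in for the hand-written nesting.

```python
W = 8                          # assumed word size for this demo
FULL = (1 << W) - 1            # 1111...1
A_LESSTHAN_B = FULL
NOT_LESS = 0                   # used for both B_LESSTHAN_A and A_EQUAL_B


def bit_select(q, k, a, b):
    """Return a if bit k of q is set, else b, using only shifts, AND, OR, XOR."""
    mask = (q >> k) & 1
    mask |= mask << 1          # 2 copies of the bit
    mask |= mask << 2          # 4 copies
    mask |= mask << 4          # 8 copies, i.e. the whole word for W = 8
    return (mask & a) | ((mask ^ FULL) & b)


def less_than(a, b):
    """All-ones if a < b, zero otherwise (unsigned W-bit values)."""
    c = a ^ b                                  # bits where a and b differ
    result = NOT_LESS                          # innermost case: a equals b
    for k in range(W):                         # low to high: highest differing bit wins
        verdict = bit_select(a, k, NOT_LESS, A_LESSTHAN_B)
        result = bit_select(c, k, verdict, result)
    return result


def in_rectangle(x, y, x_min, x_max, y_min, y_max):
    return (less_than(x_min, x) & less_than(x, x_max)
            & less_than(y_min, y) & less_than(y, y_max))


print(less_than(11, 13) == FULL, less_than(13, 11) == FULL)
```

Doubling the mask with shifts of 1, 2 and 4 is a shortcut for the "repeat until the word is filled" step; it needs log2(W) rounds instead of W.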

Is there any clever way to determine whether a point is in a rectangle?

I want to calculate whether a point, (x,y), is inside a rectangle which is determined by two points, (a,b) and (c,d).
If a<=c and b<=d, then it is simple:
a<=x&&x<=c&&b<=y&&y<=d
However, since it is unknown whether a<=c or b<=d, the code should be
(a<=x&&x<=c||c<=x&&x<=a)&&(b<=y&&y<=d||d<=y&&y<=b)
This code may work, but it is too long. I can write a function and use it, but I wonder if there's a shorter way to write it (one that also executes very fast, since the code is called a lot).
One I can imagine is:
((c-x)*(x-a)>=0)&&((d-y)*(y-b)>=0)
Is there a more clever way to do this?
(And, is there any good way to iterate from a to c?)
Swap the variables as needed so that a = xmin and b = ymin:
if a > c: swap(a,c)
if b > d: swap(b,d)
a <= x <= c and b <= y <= d
Shorter but slightly less efficient:
min(a,c) <= x <= max(a,c) and min(b,d) <= y <= max(b,d)
As always when optimizing you should profile the different options and compare hard numbers. Pipelining, instruction reordering, branch prediction, and other modern day compiler/processor optimization techniques make it non-obvious whether programmer-level micro-optimizations are worthwhile. For instance it used to be significantly more expensive to do a multiply than a branch, but this is no longer always the case.
I like this:
((c-x)*(x-a)>=0)&&((d-y)*(y-b)>=0)
but with more whitespace and more symmetry:
(c-x)*(a-x) <= 0 && (d-y)*(b-y) <= 0
It's mathematically elegant, and probably the fastest too. You will need to measure to really determine which is the fastest. With modern pipelined processors, I would expect that straight-line code with the minimum number of operators will run fastest.
While sorting the (a, c) and (b, d) pairs as suggested in the accepted answer is probably the best solution in this case, an even better application of this method would probably be to elevate the a <= c and b <= d requirement to the level of a program-wide invariant. That is, require that all rectangles in your program are created and maintained in this "normalized" form from the very beginning. Then, inside your point-in-rectangle test function, you can simply assert that a <= c and b <= d instead of wasting CPU resources on actually sorting them in every call.
Define intermediary variables i = min(a,c), j = min(b,d), k = max(a,c), l = max(b,d)
Then you only need i<=x && x<=k && j<=y && y<=l.
EDIT: Mind you, efficiency-wise it's probably better to use your "too long" code in a function.
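The variants discussed in this thread can be compared side by side. The Python sketch below writes out the long form, the product form and the min/max form, with (a,b) and (c,d) as the two corner points as in the question; the function names are invented for this comparison.

```python
def inside_long(x, y, a, b, c, d):
    """The 'too long' version from the question."""
    return ((a <= x <= c) or (c <= x <= a)) and ((b <= y <= d) or (d <= y <= b))


def inside_product(x, y, a, b, c, d):
    """Product trick: (c-x)*(x-a) >= 0 exactly when x lies between a and c."""
    return (c - x) * (x - a) >= 0 and (d - y) * (y - b) >= 0


def inside_minmax(x, y, a, b, c, d):
    """Normalise the corners with min/max, then do the simple range test."""
    return min(a, c) <= x <= max(a, c) and min(b, d) <= y <= max(b, d)


# Corners given "the wrong way round" still work in all three variants.
print(inside_long(2, 2, 3, 3, 1, 1),
      inside_product(2, 2, 3, 3, 1, 1),
      inside_minmax(2, 2, 3, 3, 1, 1))
```

One caveat that applies outside Python: with fixed-width integers the product form can overflow for large coordinates, so it is worth checking the value ranges before relying on it.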

Hashing sets of integers

I'm looking for a hash function over sets H(.) and a relation R(.,.) such that if A is included in B then R(H(A), H(B)). Of course, R(.,.) must be easy to verify (constant time), and H(A) should be computed in linear time.
One example of H and R is:
H(A) = OR over 1 << (h(x) % k), for x in A, k a fixed integer and h(x) a hash function over integers.
R(H(A), H(B)) = ((H(A) & H(B)) == H(A))
Are there any other good examples? ("Good" is hard to define, but intuitively: if R(H(A), H(B)) holds, then with high probability A is included in B.)
After thinking about this, I ended up with the example you gave. I.e. each element in B sets a bit in the hash, and A is only contained in B if each bit which is set in H(A) is also set in H(B).
Maybe a Bloom filter is applicable in your case. It seems to use the same bit trick, but with multiple hash functions.
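For reference, the H and R from the question can be sketched in Python as follows. The choice K = 64 and the multiplicative integer hash h are assumptions made for this sketch; the question leaves both k and h(x) open.

```python
K = 64                                   # assumed signature width in bits


def h(x):
    """Stand-in integer hash (a multiplicative hash, chosen for illustration)."""
    return (x * 2654435761) & 0xFFFFFFFF


def H(items):
    """OR together one bit per element: linear in the size of the set."""
    signature = 0
    for x in items:
        signature |= 1 << (h(x) % K)
    return signature


def R(ha, hb):
    """Constant-time check: every bit set in ha is also set in hb."""
    return (ha & hb) == ha


A = {1, 5, 9}
B = {1, 5, 9, 12, 40}
print(R(H(A), H(B)))          # A is a subset of B, so this always holds
print(R(H({65}), H({1})))     # may hold even though {65} is not a subset of {1}
```

Inclusion always implies R, but not the other way round: with this particular h, 1 and 65 land on the same bit, so the second check is a false positive. Using several hash functions, as a Bloom filter does, lowers the false-positive rate at the cost of a fuller signature.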
