Permutations subject to partial orders - algorithm

I have two partial orders s_1 and s_2 over natural numbers. How can I compute the possible permutations of the numbers appearing in the two sets that respect both partial orders? We suppose that the two orders are compatible.
For example:
s_1 = (1, 2, 4)
s_2 = (2, 3)
In this example, we are looking for the permutations of the numbers 1, 2, 3 and 4 that follow the orders in s_1 and s_2.
I would appreciate any suggestions for the general case.

Supposing the partial orderings are compatible, you can split them into binary relations. Your example would become:
s_1 = (1,2)
s_2 = (2,3)
s_3 = (2,4)
You can write an algorithm to traverse all legal orderings from this information. A simple approach is to recursively search through the available choices of the partially ordered set. Here is some example pseudocode:
Recursive Search For All Legal Permutations Subject to a Partial Ordering

procedure FindPOPerms(Poset)
    MinNode = minimum value in Poset
    MaxNode = maximum value in Poset
    Available = {MinNode, ..., MaxNode}
    NNodes = number of elements in Available
    NPoset = number of rows in Poset
    Sequence = column array of zeros with length NNodes
    Candidates = set difference between Available and column 2 of Poset   {the minimal elements}
    Selection = Candidates(1)
    Available = set difference between Available and Selection
    Poset = all Poset rows where column 1 is not equal to Selection
    Iter = 0
    POPerms = POPermSearch(Available, Candidates, Poset, Iter, Sequence)
    return POPerms
end procedure

procedure POPermSearch(Available, Candidates, Poset, Iter, Sequence)
    Iter = Iter + 1
    POPerms = empty array
    if Available is not empty then
        for i = 1 to number of elements in Candidates
            S = Candidates(i)
            A = set difference between Available and S
            P = all Poset rows where column 1 is not equal to S
            C = set difference between A and column 2 of P   {the new minimal elements}
            Seq = Sequence
            Seq(Iter) = S
            POP = POPermSearch(A, C, P, Iter, Seq)
            POPerms = append the rows of POP to POPerms
        end for
    else
        POPerms = Sequence
    end if
    return POPerms
end procedure
The input Poset for your case would be a 3x2 array:
1  2
2  3
2  4
with POPerms as the 2x4 output array:
1  2  3  4
1  2  4  3
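Not the pseudocode author's code, but here is a minimal Python sketch of the same recursive idea, assuming the poset is given as a list of (before, after) pairs and the values to permute are supplied explicitly:

from typing import List, Tuple

def po_perms(pairs: List[Tuple[int, int]], values: List[int]) -> List[List[int]]:
    """Enumerate all permutations of `values` consistent with the binary
    relations in `pairs`, where a pair (a, b) means a must come before b."""
    results = []

    def search(available, poset, sequence):
        if not available:
            results.append(sequence)
            return
        # Candidates are the values with no remaining predecessor constraint.
        blocked = {b for (a, b) in poset}
        for s in sorted(available - blocked):
            remaining = [(a, b) for (a, b) in poset if a != s]
            search(available - {s}, remaining, sequence + [s])

    search(set(values), list(pairs), [])
    return results

# The example above: 1 before 2, 2 before 3, 2 before 4.
print(po_perms([(1, 2), (2, 3), (2, 4)], [1, 2, 3, 4]))
# [[1, 2, 3, 4], [1, 2, 4, 3]]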

Related

Ada: Integer Overflow

so I am hashing and have defined these types/functions:
subtype string2 is String(1..2);
function cString2 is new Ada.Unchecked_Conversion(string2, long_integer);
function cChar is new Ada.Unchecked_Conversion(character, long_integer);
and MUST use this hash function:
HA = (((cString2(s1) + cString2(s2)) * 256) + cChar(char)) mod 128
(the function is bad on purpose, but I must implement it) The problem occurs when adding and/or trying to multiply 256 by the sum of the two long integers, for it overflows. I need to somehow treat the strings as POSITIVE integer values and also not have my function overflow. THANKS!!!
The type Long_Integer is a signed integer type, and guaranteed to cover the range –2**31+1 .. +2**31–1 (if it exists):
LRM 3.5.4(22):
If Long_Integer is predefined for an implementation, then its range shall include the range –2**31+1 .. +2**31–1.
With your declarations you are likely to include at least 2 bytes of random junk in your converted values, but as the sizes don't match, the result is implementation defined and possibly invalid or abnormal.
I suggest that you read up on the 'Pos attribute and Ada.Unchecked_Conversion in the LRM.
You can compare the quality of various Hash functions using the approach shown here, which tallies collisions in a hash table of dictionary words. The resulting Counts are stored in an instance of Ada.Containers.Ordered_Maps.
As a concrete example, the library Hash function
function Hash is new Ada.Strings.Bounded.Hash(ASB);
produces a result with unique hashes for over half of the words and, in the worst case, only seven words colliding on a single hash value:
Word count: 235886
Table size: 393241
Load factor: 59.99%
Entries per bucket: number of buckets (share of all words)
0: 215725 (0.00%)
1: 129710 (54.99%)
2: 38727 (32.84%)
3: 7768 (9.88%)
4: 1153 (1.96%)
5: 143 (0.30%)
6: 14 (0.04%)
7: 1 (0.00%)
In contrast, this Hash function
function Hash(Key : ASB.Bounded_String) return Ada.Containers.Hash_Type is
S : String := ASB.To_String(Key);
H : Ada.Containers.Hash_Type := 0;
begin
for C of S loop
H := H * 3 + Character'Pos(C);
end loop;
return H;
end;
produces a result with unique hashes for fewer than half of the words and, in the worst case, twenty words colliding on each of two different hash values:
Word count: 235886
Table size: 393241
Load factor: 59.99%
Entries per bucket: number of buckets (share of all words)
0: 236804 (0.00%)
1: 107721 (45.67%)
2: 32247 (27.34%)
3: 9763 (12.42%)
4: 3427 (5.81%)
5: 1431 (3.03%)
6: 813 (2.07%)
7: 441 (1.31%)
8: 250 (0.85%)
9: 150 (0.57%)
10: 88 (0.37%)
11: 41 (0.19%)
12: 27 (0.14%)
13: 14 (0.08%)
14: 11 (0.07%)
15: 7 (0.04%)
16: 2 (0.01%)
17: 1 (0.01%)
19: 1 (0.01%)
20: 2 (0.02%)
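The tallying idea is language-agnostic, so here is a rough Python sketch of the same approach (not the Ada program behind the tables above); the word-list path and the hash function under test are placeholders:

from collections import Counter

def tally_collisions(words, hash_fn, table_size):
    """For each bucket occupancy k, count how many buckets hold exactly k words."""
    buckets = Counter(hash_fn(w) % table_size for w in words)
    occupancy = Counter(buckets.values())
    occupancy[0] = table_size - len(buckets)          # empty buckets
    total = len(words)
    for k in sorted(occupancy):
        share = 100.0 * k * occupancy[k] / total      # share of all words in such buckets
        print(f"{k}: {occupancy[k]} ({share:.2f}%)")

# Example: tally Python's built-in hash over a dictionary file (path is an assumption).
with open("/usr/share/dict/words") as f:
    words = [line.strip() for line in f if line.strip()]
print(f"Word count: {len(words)}")
tally_collisions(words, hash, table_size=393241)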

Algorithm that displays all the even numbers between 0 and 1000

This is how I did it and I think it's wrong
Step 1: Start
Step 2: I = 0
Step 3: Input number
Step 4:
while(i <= 1000)
if(number % 2 == 0)
display the number
Step 5: End
Increase i with each iteration of the loop; as written, i never changes, so the while condition never becomes false and the loop never ends.
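Putting that together, a minimal sketch of the corrected algorithm in Python (my own, since the answer above only gives the hint) would loop i from 0 to 1000 and test i itself rather than a separate input number:

# Display all even numbers between 0 and 1000 (inclusive).
i = 0
while i <= 1000:
    if i % 2 == 0:
        print(i)
    i += 1   # increase i with each iteration, or the loop never ends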

Bradley Adaptive Thresholding -- Confused (questions)

I have some questions, probably stupid ones, about the implementation of adaptive thresholding by Bradley. I have read the paper about it ( http://people.scs.carleton.ca:8008/~roth/iit-publications-iti/docs/gerh-50002.pdf ) and I am a bit confused, mainly about this statement:
if ((in[i,j]*count) ≤ (sum*(100−t)/100)) then
Let's assume that we have this input:
                width, i
              [0] [1] [2]
              +---+---+---+
height    [0] | 1 | 2 | 2 |
   j          +---+---+---+
          [1] | 3 | 4 | 3 |
              +---+---+---+
          [2] | 5 | 3 | 2 |
              +---+---+---+
and let's say that:
s = 2
s/2 = 1
t = 15
i = 1
j = 1 (we are at the center pixel)
So that means we have a window 3x3, right? Then:
x1 = 0, x2 = 2, y1 = 0, y2 = 2
What is count then? If it is the number of pixels in the window, why is it 2*2 = 4 instead of 3*3 = 9, according to the algorithm? Further, why is the original value of the pixel multiplied by the count?
The paper says that the value is compared to the average value of the surrounding pixels, so why isn't it
in[i,j] <= (sum/count) * ((100 - t) / 100)
then?
Can somebody please explain this to me? It is probably a very stupid question, but I can't figure it out.
Before we start, let's present the pseudocode of the algorithm written in their paper:
procedure AdaptiveThreshold(in, out, w, h)
 1: for i = 0 to w do
 2:   sum ← 0
 3:   for j = 0 to h do
 4:     sum ← sum + in[i, j]
 5:     if i = 0 then
 6:       intImg[i, j] ← sum
 7:     else
 8:       intImg[i, j] ← intImg[i−1, j] + sum
 9:     end if
10:   end for
11: end for
12: for i = 0 to w do
13:   for j = 0 to h do
14:     x1 ← i − s/2   {border checking is not shown}
15:     x2 ← i + s/2
16:     y1 ← j − s/2
17:     y2 ← j + s/2
18:     count ← (x2 − x1) × (y2 − y1)
19:     sum ← intImg[x2, y2] − intImg[x2, y1−1] − intImg[x1−1, y2] + intImg[x1−1, y1−1]
20:     if (in[i, j] × count) ≤ (sum × (100 − t)/100) then
21:       out[i, j] ← 0
22:     else
23:       out[i, j] ← 255
24:     end if
25:   end for
26: end for
intImg is the integral image of the input image to threshold, assuming grayscale.
I've implemented this algorithm with success, so let's talk about your doubts.
What is count then? If it is number of pixels in the window, why it is 2*2=4, instead of 3*3=9 according to the algorithm?
There is an underlying assumption in the paper that they don't talk about. The value of s needs to be odd, and the windowing should be:
x1 = i - floor(s/2)
x2 = i + floor(s/2)
y1 = j - floor(s/2)
y2 = j + floor(s/2)
count is indeed the total number of pixels in the window, but you also need to make sure that you don't go out of bounds. What you have there should certainly be a 3 x 3 window, so s = 3, not 2. Now, if s = 3 and we were to choose i = 0, j = 0, we would get negative x and y values. We can't have that, so the total number of valid pixels within the 3 x 3 window centred at i = 0, j = 0 is 4, and hence count = 4. For windows that lie entirely within the bounds of the image, count would be 9.
Further, why is the original value of the pixel multiplied by the count? The paper says that the value is compared to the average value of surrounding pixels, why it isn't:
in[i,j] <= (sum/count) * ((100 - t) / 100)
then?
The condition you're looking at is at line 20 of the algorithm:
20: (in[i, j]×count) ≤ (sum×(100−t)/100)
The reason why we look at in[i,j]*count is that we treat in[i,j] as if it were the average intensity within the s x s window. If we examined an s x s window with that average and added up all of the intensities, the result would be in[i,j] x count. The algorithm is quite ingenious. Basically, we compare this assumed total intensity (in[i,j] x count) with (100 − t)% of the actual total within the s x s window (sum x (100 − t)/100); if the assumed total is less than or equal to that, the output is set to black. If it is larger, the output is set to white. However, you have eloquently stated that it should be this instead:
in[i,j] <= (sum/count) * ((100 - t) / 100)
This is essentially the same as line 20; you have just divided both sides of the inequality by count, so it is still the same test. I would say that your form states explicitly what I talked about above. The multiplication by count is certainly confusing, and what you have written makes more sense.
Therefore, you're just seeing it a different way, and that's totally fine! So to answer your question, what you have stated is certainly correct and is equivalent to the expression seen in the actual algorithm.
Hope this helps!
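If it helps to see the whole procedure concretely, here is a rough NumPy sketch of the algorithm as I understand it (my own rendering, not the paper's code), including the border clamping that the pseudocode omits; s is the window size and t the percentage threshold:

import numpy as np

def bradley_threshold(img, s=15, t=15):
    """Adaptive thresholding after Bradley & Roth: img is a 2-D grayscale
    array, s the (odd) window size, t the percentage threshold.
    Returns a binary image with values 0 and 255."""
    img = np.asarray(img, dtype=np.int64)
    h, w = img.shape
    # Integral image padded with a zero row/column so window sums never index out of bounds.
    integral = np.zeros((h + 1, w + 1), dtype=np.int64)
    integral[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)

    out = np.zeros((h, w), dtype=np.uint8)
    half = s // 2
    for y in range(h):
        y1, y2 = max(y - half, 0), min(y + half, h - 1)
        for x in range(w):
            x1, x2 = max(x - half, 0), min(x + half, w - 1)
            count = (y2 - y1 + 1) * (x2 - x1 + 1)      # valid pixels in the clamped window
            window_sum = (integral[y2 + 1, x2 + 1] - integral[y1, x2 + 1]
                          - integral[y2 + 1, x1] + integral[y1, x1])
            # Line 20 of the pseudocode: compare totals, not averages.
            out[y, x] = 0 if img[y, x] * count <= window_sum * (100 - t) // 100 else 255
    return out

# The 3x3 grid from the question, with s = 3.
grid = [[1, 2, 2],
        [3, 4, 3],
        [5, 3, 2]]
print(bradley_threshold(grid, s=3, t=15))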

Distinct values obtained by 1's and 2's with + and *

Is there a way to calculate the number of distinct values that can be obtained by using at most A 1's and B 2's with the operations + and *? Multiplication takes precedence and bracketing is not allowed.
For example, say that A=2 and B=2. Then
1: 1 and 1*1
2: 1*2 and 1*1*2 and 2 and 1+1
3: 1+2 and 1*2 + 1
4: 1+1+2 and 2+2 and 2*2 and 1*2*2 and 1*1*2*2 and so on
5: 2*2 + 1 and 2+2+1 and 2+2+1*1
6: 1+1+2+2 and 2*2 +1+1
So the values (1, 2, 3, 4, 5, 6) can be obtained using 2 1's and 2 2's. But 8 cannot be formed because bracketing values is not allowed. Hence the answer is 6 (number of distinct values obtainable).
I have observed that:
If B = 0, then the answer is A.
If A = 0, then the answer is the number of distinct values obtainable from the 2's alone.
Multiplying by 1's has no effect, so only + needs to be considered for the 1's.
From there I'm kind of stuck. Any help is appreciated.
Thanks.
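Not a closed-form answer, but a small brute-force sketch (my own, in Python) can enumerate the reachable values for small A and B, which is useful for checking observations like the ones above. It builds expressions left to right, tracking the sum of finished terms and the current product term:

def distinct_values(A, B):
    """Number of distinct values from expressions using at most A ones and
    B twos with + and * (usual precedence, no brackets)."""
    values = set()
    seen = set()

    # State: ones left, twos left, sum of finished terms, current product term.
    def extend(ones, twos, total, term):
        values.add(total + term)
        state = (ones, twos, total, term)
        if state in seen:
            return
        seen.add(state)
        if ones:
            extend(ones - 1, twos, total, term * 1)   # append "* 1"
            extend(ones - 1, twos, total + term, 1)   # append "+ 1"
        if twos:
            extend(ones, twos - 1, total, term * 2)   # append "* 2"
            extend(ones, twos - 1, total + term, 2)   # append "+ 2"

    # A non-empty expression starts with either a 1 or a 2.
    if A:
        extend(A - 1, B, 0, 1)
    if B:
        extend(A, B - 1, 0, 2)
    return len(values)

print(distinct_values(2, 2))  # 6, matching the example above
print(distinct_values(3, 0))  # 3 (= A), matching the first observation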

Compute rand7() using rand5()

I have a solution for the problem of generating rand7() using only rand5().
One of the solutions states:
5 * rand5() + rand5() generates the numbers 0 - 24 with equal probability, so we just need to loop until we get a number < 21 (3 * 7) and then take % 7 to get the right answer between 0 - 6.
My question is: why couldn't we just do 3 * rand5() + rand5() to generate a number < 14 (2 * 7) instead?
If X and Y are independent and uniformly distributed on the set S_5 = {0,1,2,3,4}, then
(1) 5*X + Y is uniformly distributed on the set {0,...,24}, but
(2) 3*X + Y is not uniformly distributed on {0,...,16}, and neither is its restriction to {0,...,13}.
It's easy to see that (1) is indeed the case, because f(x,y) = 5*x + y is a bijection between S_5 x S_5 and S_25.
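(To double-check claim (1) in the same style as the session below, one could run something like this; the snippet is mine, not part of the original answer.)
>>> from collections import Counter
>>> counts = Counter(5*x + y for x in range(5) for y in range(5))
>>> len(counts), min(counts.values()), max(counts.values())
(25, 1, 1)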
If we look at the distribution of 3*X + Y we get:
>>> Counter(3*x + y for x in range(5) for y in range(5))
Counter({3: 2, 4: 2, 6: 2, 7: 2, 9: 2, 10: 2, 12: 2, 13: 2, 0: 1, 1: 1, 2: 1, 5: 1, 8: 1, 11: 1, 14: 1, 15: 1, 16: 1})
The results 3, 4, 6, 7, 9, 10, 12 and 13 are twice as likely as 0, 1, 2, 5, 8 and 11. More proof:
>>> def rand7():
... x = 3*rand5() + rand5()
... if x < 14: return x % 7
... return rand7()
...
>>> Counter(rand7() for _ in xrange(100000))
Counter({6: 18219, 3: 18105, 4: 13734, 5: 13715, 2: 13634, 0: 13560, 1: 9033})
6 and 3 have a 4/22 ~ 18.2% chance of occurring, 4, 5, 2 and 0 have a 3/22 ~ 13.6% chance, and 1 only has a 2/22 ~ 9.1% chance. That's one rigged die.
3 * rand5() + rand5()
is not uniformly distributed. For example, it generates 0 in only one way, but 3 in two ways, so 3 is more likely to occur than 0.
The same goes for 2 * rand5() + rand5(), 4 * rand5() + rand5(), and so on.
But 5 * rand5() + rand5() is uniformly distributed.
It is like generating two random digits of a base-5 number.
00 => 0
01 => 1
02 => 2
03 => 3
04 => 4
10 => 5
11 => 6
12 => 7
...
There is one and only one way to generate each number from 0 to 24.
In order to get a uniform distribution, the contributions of the two random numbers have to fit together cleanly: the range covered by the scaled one must not overlap the range covered by the other, nor may there be any gaps between them.
In your proposed method there are two ways to get a 3, for example: the first random number returns 1 and the second returns 0, or the first returns 0 and the second returns 3. This makes a result of 3 twice as likely as a result of 0, which can only occur if both random numbers are 0.
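For completeness, here is a minimal Python sketch of the accepted approach, assuming rand5() returns a uniform integer in 0..4 as in the answers above:

import random
from collections import Counter

def rand5():
    # Stand-in for the given generator: a uniform integer in 0..4.
    return random.randint(0, 4)

def rand7():
    """Uniform over 0..6, built from rand5() by rejection sampling."""
    while True:
        x = 5 * rand5() + rand5()   # uniform over 0..24
        if x < 21:                  # keep only the 21 = 3 * 7 equally likely outcomes
            return x % 7

# Quick sanity check of the distribution.
print(Counter(rand7() for _ in range(100000)))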
