Improve a working "Bulgarian Solitaire" J verb - solver

"Bulgarian Solitaire" is a mathematical curiosity. It is played with a deck of 45 (any triangular number will work) unmarked cards. Put them into randomly sized piles. Then, to play a round, remove a card from each pile and create a new pile with the removed cards. Repeating this step eventually yields the configuration 1 2 3 4 5 6 7 8 9 (for 45 cards), which is clearly a fixed point of the game and thus the end of the solitaire. I wanted to simulate this game in J.
After a couple of days thinking about it and some long-awaited insight into J gerunds, I came up with a solution, on which I would like some opinions. It starts with this verb:
bsol =: ((#~ ~:&0) , #)@:(-&1)^:(<_)
Given a vector of positive integers whose sum is triangular, this verb returns a rank 2 array showing the rounds of the solitaire that results. I also came up with this verb to generate an initial configuration, but I'm less happy with it:
t =: 45 & - @ (+/) NB. Would work with any triangular number
cards =: (]`(]@,>:@?&t@]))@.(0&<@t)^:_
Given a vector y of positive integers, t returns the defect from 45, i.e., the number 45 - +/ y of cards not accounted for in the piles represented by the argument. Using t, the verb cards appends to such a vector y an integer from >: i. t y repeatedly until the defect is 0.
Expanding t explicitly, I get
cards =: (]`(]@,>:@?&(45 & - @ (+/))@]))@.(0&<@(45 & - @ (+/)))^:_
I feel like this is not very brief, and maybe overly parenthesized. But it does work, and the complete solution now looks like this:
bsol @ cards @ >: @ ? 44 NB. Choose the first pile randomly from >: i. 44
Without the named verbs:
(((#~ ~:&0) , #)@:(-&1)^:(<_)) @: ((]`(]@,>:@?&(45 & - @ (+/))@]))@.(0&<@(45 & - @ (+/)))^:_)@>:@? 44
Not knowing much about J idiom, I have the same feeling about this: it's not very brief, certainly redundant (would it be better to use a local verb like t here, since it's repeated, e.g.?), and probably overly parenthesized. What opportunities do I have to improve this program?

You can improve t with
t =: 45 - +/
Using 46 - +/ will spare you some >:.
You can replace cards with a recursive definition:
cards =: }.`(($:@] , -) ?)@.(0&<)
where, now, cards n produces an initial configuration with sum n.
In bsol you don't need -&1 if you remove (-.) the zeroes and rearrange it like:
bsol =: (0 -.~ [: (, #) <:)^:(<_)
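For readers who do not know J, the game itself can be sketched in a few lines of Python. This is only an illustration, not a translation of the verbs above: the names play_round, bulgarian_solitaire and random_piles are hypothetical, the stopping test compares configurations as sorted lists, and the loop assumes the total is a triangular number (otherwise the game cycles and the loop would not stop).

```python
import random

def play_round(piles):
    """Take one card from every pile, drop emptied piles, add the new pile."""
    return [p - 1 for p in piles if p > 1] + [len(piles)]

def bulgarian_solitaire(piles):
    """Return all configurations up to the first one that a round leaves
    unchanged (up to the order of the piles).
    Assumes sum(piles) is triangular; otherwise this never terminates."""
    history = [list(piles)]
    while True:
        nxt = play_round(history[-1])
        if sorted(nxt) == sorted(history[-1]):
            return history
        history.append(nxt)

def random_piles(total=45):
    """Hypothetical helper: split `total` cards into randomly sized piles."""
    piles = []
    while total > 0:
        pile = random.randint(1, total)  # size of the next pile
        piles.append(pile)
        total -= pile
    return piles
```

Calling bulgarian_solitaire(random_piles(45)) gives the list of rounds, one configuration per entry, ending in some ordering of 1 2 3 4 5 6 7 8 9.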

Related

Quick way to compute n-th sequence of bits of size b with k bits set?

I want to develop a way to represent all combinations of b bits with k bits set (equal to 1). Given an index, it should quickly produce the corresponding binary sequence, and the other way around too. For instance, the traditional approach I thought of would be to generate the numbers in order, like:
For b=4 and k=2:
0- 0011
1- 0101
2- 0110
3- 1001
4- 1010
5- 1100
If I am given the sequence '1010', I want to be able to quickly generate the number 4 as a response, and if I give the number 4, I want to be able to quickly generate the sequence '1010'. However I can't figure out a way to do these things without having to generate all the sequences that come before (or after).
It is not necessary to generate the sequences in that order, you could do 0-1001, 1-0110, 2-0011 and so on, but there has to be no repetition between 0 and the (combination of b choose k) - 1 and all sequences have to be represented.
How would you approach this? Is there a better algorithm than the one I'm using?
pkpnd's suggestion is on the right track: essentially, process one digit at a time, and if it's a 1, count the number of options that exist below it via standard combinatorics.
nCr() can be replaced by a table precomputation requiring O(n^2) storage/time. There may be another property you can exploit to reduce the number of nCr's you need to store by leveraging the absorption property along with the standard recursive formula.
Even with 1000's of bits, that table shouldn't be intractably large. Storing the answer also shouldn't be too bad, as 2^1000 is ~300 digits. If you meant hundreds of thousands, then that would be a different question. :)
import math

def nCr(n,r):
    return math.factorial(n) // math.factorial(r) // math.factorial(n-r)

def get_index(value):
    b = len(value)
    k = sum(c == '1' for c in value)
    count = 0
    for digit in value:
        b -= 1
        if digit == '1':
            if b >= k:
                count += nCr(b, k)
            k -= 1
    return count

print(get_index('0011')) # 0
print(get_index('0101')) # 1
print(get_index('0110')) # 2
print(get_index('1001')) # 3
print(get_index('1010')) # 4
print(get_index('1100')) # 5
Nice question, btw.
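The question also asks for the opposite direction, from an index back to the bit sequence. A minimal sketch of that inverse, assuming the same ordering as get_index, could look like the following. The name get_sequence is hypothetical, and math.comb (Python 3.8+) stands in for nCr; it returns 0 when k exceeds the number of remaining positions, which is what forces the trailing 1s.

```python
from math import comb  # available from Python 3.8

def get_sequence(index, b, k):
    """Return the b-bit string with k ones that has the given index."""
    digits = []
    for remaining in range(b - 1, -1, -1):  # positions left after this one
        below = comb(remaining, k)  # sequences that have a 0 at this position
        if k > 0 and index >= below:
            digits.append('1')
            index -= below
            k -= 1
        else:
            digits.append('0')
    return ''.join(digits)

print(get_sequence(4, 4, 2))  # 1010
```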

How many times does a zero occur on an odometer

I am trying to work out how many times a zero occurs on an odometer. I count +1 every time I see a zero.
10 -> +1
100 -> +2 because in 100 I see 2 zeros
10004 -> +3 because I see 3 zeros
So I get,
1 - 100 -> +11
1 - 500 -> +91
1 - 501 -> +92
0 - 4294967295-> +3825876150
I used rubydoctest for it. I am not doing anything with begin_number yet. Can anyone explain how to calculate it without a brute force method?
I made many attempts. They work for numbers like 10, 1000, 10.000, 100.000.000, but not for numbers like 522 or 2280. If I run the rubydoctest, it fails on # >> algorithm_count_zero(1, 500)
# doctest: algorithm_count_zero(begin_number, end_number)
# >> algorithm_count_zero(1, 10)
# => 1
# >> algorithm_count_zero(1, 1000)
# => 192
# >> algorithm_count_zero(1, 10000000)
# => 5888896
# >> algorithm_count_zero(1, 500)
# => 91
# >> algorithm_count_zero(0, 4294967295)
# => 3825876150
def algorithm_count_zero(begin_number, end_number)
  power = Math::log10(end_number) - 1
  if end_number < 100
    return end_number/10
  else
    end_number > 100
    count = (9*(power)-1)*10**power+1
  end
  answer = ((((count / 9)+power)).floor) + 1
end
end_number = 20000
begin_number = 10000
puts "Algorithm #{algorithm_count_zero(begin_number, end_number)}"
As noted in a comment, this is a duplicate of another question, where the solution gives you correct guidelines.
However, if you want to test your own solution for correctness, I'll put here a one-liner in the parallel array processing language Dyalog APL (which, by the way, I think everyone modelling mathematics and numbers should use).
Using tryapl.org you'll be able to get a correct answer for any integer value as argument. TryAPL is a web page with a backend that executes simple APL code statements ("one-liners", which are very typical of the APL language and its extremely compact code).
The APL one-liner is here:
{+/(c×1+d|⍵)+d×(-c←0=⌊(a|⍵)÷d←a×+0.1)+⌊⍵÷a←10*⌽⍳⌈10⍟⍵} 142857
Copy that and paste it into the edit row at tryapl.org, and press enter - you will quickly see an integer, which is the answer to your problem. In the code row above, you can see the argument rightmost; it is 142857 this time but you can change it to any integer.
Once you have pasted the one-liner and executed it with Enter, the easiest way to get it back for editing is to press [Up arrow]. This returns the most recently entered statement; then you can edit the number sitting rightmost (after the curly brace) and press Enter again to get the answer for a different argument.
Pasting the code row above will return 66765: that many zeroes exist for 142857.
If you paste this 2 characters shorter row below, you will see the individual components of the result - the sum of these components make up the final result. You will be able to see a pattern, which possibly makes it easier to understand what happens.
Try for example
{(c×1+d|⍵)+d×(-c←0=⌊(a|⍵)÷d←a×+0.1)+⌊⍵÷a←10*⌽⍳⌈10⍟⍵} 1428579376
0 100000000 140000000 142000000 142800000 142850000 142857000 142857900 142857930 142857937
... and see how the intermediate results contain segments of the argument 1428579376, starting from the left! There are as many intermediate results as there are digits in the argument (10 this time).
The result for 1428579376 will be 1239080767, i.e. the sum of the 10 numbers above. This many zeroes appear in all numbers between 1 and 1428579376 :-).
Consider each odometer position separately. The position x places from the far right changes once every 10^x times. By looking at the numbers to its right, you know how long it will be until it next changes. It will then hold each value for 10^x times before changing, until it reaches the end of the range you are considering, when it will hold its value at that time for some number of times that you can work out given the value at the very end of the range.
Now you have a sequence of the form x...0123456789012...y where you know the length and you know the values of x and y. One way to count the number of 0s (or any other digit) within this sequence is to clip off the prefix from x.. to just before the first 0, and clip off the suffix from just after the last 9 to y. Look for 0s in this prefix and suffix, and measure the length of the long sequence between prefix and suffix. This will be of a length divisible by 10, and will contain each digit the same number of times.
Based on this you should be able to work out, for each position, how often within the range it will assume each of its 10 possible values. By summing up the values for 0 from each of the odometer positions you get the answer you want.
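The per-position idea can be sketched in Python as below. This is one common way to do the bookkeeping rather than the exact procedure described above: for each position it splits the number into the digits to the left (higher), the digit itself (cur) and the digits to the right (lower), and it does not count leading zeros. The function names are hypothetical.

```python
def count_zeros_upto(n):
    """Count the zero digits written when listing 1, 2, ..., n."""
    total = 0
    p = 1  # place value of the position being examined: 1, 10, 100, ...
    while p <= n:
        higher = n // (p * 10)  # digits to the left of this position
        cur = (n // p) % 10     # digit at this position
        lower = n % p           # digits to the right of this position
        if cur == 0:
            # the last block of zeros at this position is only partly used
            total += (higher - 1) * p + lower + 1
        else:
            total += higher * p
        p *= 10
    return total

def count_zeros(begin_number, end_number):
    """Zeros seen from begin_number to end_number inclusive (begin >= 1)."""
    return count_zeros_upto(end_number) - count_zeros_upto(begin_number - 1)

print(count_zeros(1, 500))  # 91
```

The values agree with the doctest cases for 1..10, 1..500, 1..1000 and 1..10000000, and with the 66765 reported for 142857.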

String to Number and back algorithm

This is a hard one (for me), and I hope people can help me. I have some text and I need to convert it to a number, but the number has to be unique, just as the text is unique.
For example:
The word 'kitty' could produce 12432, but only the word kitty produces that number. The text could be anything and a proper number should be given.
One problem: the resulting integer must be a 32-bit integer, which means the largest possible number is 2147483647 (the signed maximum; an unsigned 32-bit integer goes up to 4294967295). I don't mind if there is a text length restriction, but I hope it can be as large as possible.
My attempts: you have the letters A-Z and 0-9, so one character can have a number between 1 and 36. But if A = 1 and B = 2 and the text is A(1)B(2) and you add them, you get 3. The problem is that the text BA produces the same result, so this algorithm won't work.
Any ideas to point me in the right direction or is it impossible to do?
Your idea is generally sane, only needs to be developed a little.
Let f(c) be a function converting character c to a unique number in the range [0..M-1]. Then you can calculate the resulting number for the whole string like this:
f(s[0]) + f(s[1])*M + f(s[2])*M^2 + ... + f(s[n])*M^n
You can easily prove that the number will be unique for a particular string (and you can get the string back from the number).
Obviously, you can't use very long strings here (up to 6 characters for your case), as 36^n grows fast.
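A minimal Python sketch of this positional encoding follows. One detail is my own assumption rather than part of the answer: the characters are mapped to 1..36 and the base is 37, so that digit 0 is never used and strings of different lengths such as "A" and "AA" cannot collide. With that choice any string of up to 6 characters still fits in an unsigned 32-bit integer. The names ALPHABET, encode and decode are hypothetical.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
BASE = len(ALPHABET) + 1  # 37: digit 0 is reserved (assumption, see above)

def encode(text):
    """f(s[0]) + f(s[1])*BASE + f(s[2])*BASE^2 + ... with f in 1..36."""
    n = 0
    for ch in reversed(text):
        n = n * BASE + ALPHABET.index(ch) + 1
    return n

def decode(n):
    """Inverse of encode: peel off the lowest digit first."""
    chars = []
    while n > 0:
        n, digit = divmod(n, BASE)
        chars.append(ALPHABET[digit - 1])
    return "".join(chars)

print(encode("AB"), encode("BA"))  # 75 39
print(decode(encode("KITTY")))     # KITTY
```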
Imagine you were trying to store Strings from the character set "0-9" only in a number (the equivalent of obtaining a number of a string of digits). What would you do?
Char 9 8 7 6 5 4 3 2 1 0
Str 0 5 2 1 2 5 4 1 2 6
Num = 6 * 10^0 + 2 * 10^1 + 1 * 10^2...
Apply the same thing to your characters.
Char 5 4 3 2 1 0
Str A B C D E F
L = 36
C(I): transforms character to number: C(0)=0, C(A)=10, C(B)=11, ...
Num = C(F) * L ^ 0 + C(E) * L ^ 1 + ...
Build a dictionary out of words mapped to unique numbers and use that, that's the best you can do.
I doubt there are more than 2^32 words in use, but that is not the problem you're facing; the problem is that you need to map numbers back to words.
If you were only mapping words over to numbers, some hash algorithm might work, although you'd have to work a bit to guarantee that you have one that won't produce collisions.
However, for numbers back to words, that's quite a different problem, and the easiest solution to this is to just build a dictionary and map both ways.
In other words:
AARDUANI = 0
AARDVARK = 1
...
If you want to map numbers to base 26 characters, you can only store 6 characters (or 5 or 7 if I miscalculated), but not 12 and certainly not 20.
Unless you only count actual words, and they don't follow any good countable rules. The only way to do that is to just put all the words in a long list, and start assigning numbers from the start.
If it's correctly spelled text in some language, you can have a number for each word. However you'd need to consider all possible plurals, place and people names etc. which is generally impossible. What sort of text are we talking about? There's usually going to be some existing words that can't be coded in 32 bits in any way without prior knowledge of them.
Can you build a list of words as you go along? Just give the first word you see the number 1, second number 2 and check if a word has a number already or it needs a new one. Then save your newly created dictionary somewhere. This would likely be the only workable solution if you require 100% reliable, reversible mapping from the numbers back to original words given new unknown text that doesn't follow any known pattern.
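The build-as-you-go dictionary could be sketched like this in Python. The class name WordIds and its methods are hypothetical, and saving the dictionary to disk is left out.

```python
class WordIds:
    """Assign 1, 2, 3, ... to words in the order they are first seen."""

    def __init__(self):
        self._ids = {}    # word -> number
        self._words = []  # number - 1 -> word

    def id_for(self, word):
        """Return the number for `word`, assigning a new one if needed."""
        if word not in self._ids:
            self._words.append(word)
            self._ids[word] = len(self._words)
        return self._ids[word]

    def word_for(self, number):
        """Map a previously assigned number back to its word."""
        return self._words[number - 1]

ids = WordIds()
print(ids.id_for("kitty"), ids.id_for("puppy"), ids.id_for("kitty"))  # 1 2 1
print(ids.word_for(2))  # puppy
```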
With 64 bits and a sufficiently good hash like MD5 it's extremely unlikely to have collisions, but for 32 bits it doesn't seem likely that a safe hash would exist.
Just treat each character as a digit in base 36, and calculate the decimal equivalent?
So:
'A' = 0
'B' = 1
[...]
'Z' = 25
'0' = 26
[...]
'9' = 35
'AA' = 36
'AB' = 37
[...]
'CAB' = 3925

Algorithm to find streets and same kind in a hand

This is actually a Mahjong-based question, but a Romme- or even Poker-based background will also easily suffice to understand.
In Mahjong, 14 tiles (tiles are like cards in Poker) are arranged into 4 sets and a pair. A street ("123") always uses exactly 3 tiles, not more and not less. A set of the same kind ("111") consists of exactly 3 tiles, too. This leads to a sum of 3 * 4 + 2 = 14 tiles.
There are various exceptions like Kan or Thirteen Orphans that are not relevant here. Colors and value ranges (1-9) are also not important for the algorithm.
I'm trying to determine if a hand can be arranged in the way described above. For certain reasons it should not only be able to deal with 14 but any number of tiles. (The next step would be to find how many tiles need to be exchanged to be able to complete a hand.)
Examples:
11122233344455 - easy enough, 4 sets and a pair.
12345555678999 - 123, 456, 789, 555, 99
11223378888999 - 123, 123, 789, 888, 99
11223344556789 - not a valid hand
My current and not yet implemented idea is this: For each tile, try to make a) a street b) a set c) a pair. If none works (or there would be > 1 pair), go back to the previous iteration and try the next option, or, if this is the highest level, fail. Else, remove the used tiles from the list of remaining tiles and continue with the next iteration.
I believe this approach works and would also be reasonably fast (performance is a "nice bonus"), but I'm interested in your opinion on this. Can you think of alternate solutions? Does this or something similar already exist?
(Not homework, I'm learning to play Mahjong.)
The sum of the values in a street and in a set can be divided by 3:
n + n + n = 3n
(n-1) + n + (n + 1) = 3n
So, if you add together all the numbers in a solved hand, you would get a number of the form 3N + 2M where M is the value of the tile in the pair. The remainder of the division by three (total % 3) is, for each value of M :
total % 3 = 0 -> M = {3,6,9}
total % 3 = 1 -> M = {2,5,8}
total % 3 = 2 -> M = {1,4,7}
So, instead of having to test nine possible pairs, you only have to try three based on a simple addition. For each possible pair, remove two tiles with that value and move on to the next step of the algorithm to determine if it's possible.
Once you have this, start with the lowest value. If there are less than three tiles with that value, it means they're necessarily the first element of a street, so remove that street (if you can't because tiles n+1 or n+2 are missing, it means the hand is not valid) and move on to the next lowest value.
If there are at least three tiles with the lowest value, remove them as a set (if you ask "what if they were part of a street?" consider that if they were, then there are also three of tile n+1 and three of tile n+2, which can also be turned into sets) and continue.
If you reach an empty hand, the hand is valid.
For example, for your invalid hand the total is 60, which means M = {3,6,9}:
Remove the 3: 112244556789
- Start with 1: there are less than three, so remove a street
-> impossible: 123 needs a 3
Remove the 6: impossible, there is only one
Remove the 9: impossible, there is only one
With your second example 12345555678999, the total is 78, which means M = {3,6,9}:
Remove the 3: impossible, there is only one
Remove the 6: impossible, there is only one
Remove the 9: 123455556789
- Start with 1: there is only one, so remove a street
-> 455556789
- Start with 4: there is only one, so remove a street
-> 555789
- Start with 5: there are three, so remove a set
-> 789
- Start with 7: there is only one, so remove a street
-> empty : hand is valid, removals were [99] [123] [456] [555] [789]
Your third example 11223378888999 also has a total of 78, which causes backtracking:
Remove the 3: 112278888999
- Start with 1: there are less than three, so remove a street
-> impossible: 123 needs a 3
Remove the 6: impossible, there are none
Remove the 9: 112233788889
- Start with 1: there are less than three, so remove streets
-> 788889
- Start with 7: there is only one, so remove a street
-> 888
- Start with 8: there are three, so remove a set
-> empty, hand is valid, removals were : [99] [123] [123] [789] [888]
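The procedure traced above can be sketched in Python as follows. It assumes a single suit with values 1 to 9 and a hand written as a string of digits; the function names are hypothetical. Removing sets with a modulo is a shortcut for "remove a set while at least three tiles of that value remain".

```python
def melds_only(counts):
    """True if the tiles split into sets (nnn) and streets (n, n+1, n+2).
    counts[v] is the number of tiles with value v; index 0 is unused."""
    counts = list(counts)
    for v in range(1, 10):
        counts[v] %= 3  # remove sets of the lowest remaining value
        while counts[v] > 0:  # leftovers must each start a street
            if v > 7 or counts[v + 1] == 0 or counts[v + 2] == 0:
                return False
            for w in (v, v + 1, v + 2):
                counts[w] -= 1
    return True

def is_valid_hand(hand):
    """Check a single-suit hand such as '12345555678999'."""
    tiles = [int(ch) for ch in hand]
    counts = [0] * 10
    for t in tiles:
        counts[t] += 1
    total = sum(tiles)
    for pair in range(1, 10):
        # total = 3N + 2M, so only pair values with matching remainder qualify
        if (total - 2 * pair) % 3 == 0 and counts[pair] >= 2:
            rest = counts[:]
            rest[pair] -= 2
            if melds_only(rest):
                return True
    return False

for hand in ("11122233344455", "12345555678999",
             "11223378888999", "11223344556789"):
    print(hand, is_valid_hand(hand))
```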
There is a special case that you need to do some re-work to get it right. This happens when there is a run-of-three and a pair with the same value (but in different suit).
Let b denote bamboo, c denote character, and d denote dot, and try this hand:
b2,b3,b4,b5,b6,b7,c4,c4,c4,d4,d4,d6,d7,d8
d4,d4 should serve as the pair, and c4,c4,c4 should serve as the run-of-3 set.
But because the 3 "c4" tiles appear before the 2 d4 tiles, the first 2 c4 tiles will be picked up as the pair, leaving an orphan c4 and 2 d4s, and these 3 tiles won't form a valid set.
In this case, you'll need to "return" the 2 c4 tiles back to the hand (and keep the hand sorted), and search for the next tile that meets the criteria (value == 4). To do that you'll need to make the code "remember" that it had tried c4, so in the next iteration it should skip c4 and look for other tiles with value == 4. The code will be a bit messy, but doable.
I would break it down into 2 steps.
1. Figure out possible combinations. I think exhaustive checking is feasible with these numbers. The result of this step is a list of combinations, where each combination has a type (set, street, or pair) and a pattern with the cards used (could be a bitmap).
2. With the previous information, determine possible collections of combinations. This is where a bitmap would come in handy. Using bitwise operators, you could see overlaps in usage of the same tile for different combinations.
You could also do a step 1.5 where you just check to see if enough of each type is available. This step and step 2 would be where you would be able to create a general algorithm. The first step would be the same for all numbers of tiles and possible combinations quickly.

How can I generate the Rowland prime sequence idiomatically in J?

If you're not familiar with the Rowland prime sequence, you can find out about it here. I've created an ugly, procedural monadic verb in J to generate the first n terms in this sequence, as follows:
rowland =: monad define
  result =. 0 $ 0
  t =. 1 $ 7
  while. (# result) < y do.
    a =. {: t
    n =. 1 + # t
    t =. t , a + n +. a
    d =. | -/ _2 {. t
    if. d > 1 do.
      result =. ~. result , d
    end.
  end.
  result
)
This works perfectly, and it indeed generates the first n terms in the sequence. (By n terms, I mean the first n distinct primes.)
Here is the output of rowland 20:
5 3 11 23 47 101 7 13 233 467 941 1889 3779 7559 15131 53 30323 60647 121403 242807
My question is, how can I write this in more idiomatic J? I don't have a clue, although I did write the following function to find the differences between each successive number in a list of numbers, which is one of the required steps. Here it is, although it too could probably be refactored by a more experienced J programmer than I:
diffs =: monad : '}: (|@-/@(2&{.) , $:@}.) ^: (1 < #) y'
While I already marked estanford's answer as the correct one, I've come a long, long way with J since I asked this question. Here's a much more idiomatic way to generate the rowland prime sequence in J:
~. (#~ 1&<) | 2 -/\ (, ({: + ({: +. >:@#)))^:(1000 - #) 7
The expression (, ({: + ({: +. >:@#)))^:(1000 - #) 7 generates the so-called original sequence up to 1000 members. The first differences of this sequence can be generated by | 2 -/\, i.e., the absolute values of the differences of every two elements. (Compare this to my original, long-winded diffs verb from the original question.)
Lastly, we remove the ones and the duplicate primes ~. (#~ 1&<) to get the sequence of primes.
This is vastly superior to the way I was doing it before. It can easily be turned into a verb to generate n number of primes with a little recursion.
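For comparison, here is a sketch of the same computation in Python for readers less familiar with J. It is not a translation of the J expression: instead of building 1000 members and filtering afterwards, it keeps only the current member a and its index n and stops once the requested number of distinct primes has been found. The function name rowland_primes is hypothetical.

```python
from math import gcd

def rowland_primes(count):
    """First `count` distinct primes from Rowland's recurrence
    a(1) = 7, a(n) = a(n-1) + gcd(n, a(n-1))."""
    primes = []
    a, n = 7, 1
    while len(primes) < count:
        n += 1
        step = gcd(n, a)  # equals the first difference a(n) - a(n-1)
        a += step
        if step > 1 and step not in primes:
            primes.append(step)
    return primes

print(rowland_primes(10))  # [5, 3, 11, 23, 47, 101, 7, 13, 233, 467]
```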
I don't have a full answer yet, but this essay by Roger Hui has a tacit construct you can use to replace explicit while loops. Another (related) avenue would be to make the inner logic of the block into a tacit expression like so:
FUNWITHTACIT =: ] , {: + {: +. 1 + #
rowland =: monad define
  result =. >a:
  t =. 7x
  while. (# result) < y do.
    t =. FUNWITHTACIT t
    d =. | -/ _2 {. t
    result =. ~.result,((d>1)#d)
  end.
  result
)
(You might want to keep the if block for efficiency, though, since I wrote the code in such a way that result is modified regardless of whether or not the condition was met -- if it wasn't, the modification has no effect. The if logic could even be written back into the tacit expression by using the Agenda operator.)
A complete solution would consist of finding out how to represent all the logic inside the while loop as a single function, and then using Roger's trick to implement the while logic as a tacit expression. I'll see what I can turn up.
As an aside, I got J to build FUNWITHTACIT for me by taking the first few lines of your code, manually substituting the functions you declared for the variable values (which I could do because they were all operating on a single argument in different ways), replacing every instance of t with y, and telling J to build the tacit equivalent of the resulting expression:
]FUNWITHTACIT =: 13 : 'y,({:y)+(1+#y)+.({:y)'
] , {: + {: +. 1 + #
Using 13 to declare the monad is how J knows to take a monad (otherwise explicitly declared with 3 : 0, or monad define as you wrote in your program) and convert the explicit expression into a tacit expression.
EDIT:
Here are the functions I wrote for avenue (2) that I mentioned in the comment:
candfun =: 3 : '(] , {: + {: +. 1 + #)^:(y) 7'
firstdiff =: [: | 2 -/\ ]
triage =: [: /:~ [: ~. 1&~: # ]
rowland2 =: triage @firstdiff @candfun
This function generates the first n-many candidate numbers using the Rowland recurrence relation, evaluates their first differences, discards all first-differences equal to 1, discards all duplicates, and sorts them in ascending order. I don't think it's completely satisfactory yet, since the argument sets the number of candidates to try instead of the number of results. But, it's still progress.
Example:
rowland2 1000
3 5 7 11 13 23 47 101 233 467 941
Here's a version of the first function I posted, keeping the size of each argument to a minimum:
NB. rowrec takes y=(n,an) where n is the index and a(n) is the
NB. nth member of the rowland candidate sequence. The base case
NB. is rowrec 1 7. The output is (n+1,a(n+1)).
rowrec =: (1 + {.) , }. + [: +./ 1 0 + ]
rowland3 =: 3 : 0
  result =. >a:
  t =. 1 7
  while. y > #result do.
    ts =. (<"1)2 2 $ t,rowrec t
    fdiff =. |2-/\,>(}.&.>) ts
    if. 1~:fdiff do.
      result =. ~. result,fdiff
    end.
    t =. ,>}.ts
  end.
  /:~ result
)
which finds the first y-many distinct Rowland primes and presents them in ascending order:
rowland3 20
3 5 7 11 13 23 47 53 101 233 467 941 1889 3779 7559 15131 30323 60647 121403 242807
Most of the complexity of this function comes from my handling of boxed arrays. It's not pretty code, but it only keeps 4+#result many data elements (which grows on a log scale) in memory during computation. The original function rowland keeps (#t)+(#result) elements in memory (which grows on a linear scale). rowland2 y builds an array of y-many elements, which makes its memory profile almost the same as rowland even though it never grows beyond a specified bound. I like rowland2 for its compactness, but without a formula to predict the exact size of y needed to generate n-many distinct primes, that task will need to be done on a trial-and-error basis and thus potentially use many more cycles than rowland or rowland3 on redundant calculations. rowland3 is probably more efficient than my version of rowland, since FUNWITHTACIT recomputes #t on every loop iteration -- rowland3 just increments a counter, which is less computationally intensive.
Still, I'm not happy with rowland3's explicit control structures. It seems like there should be a way to accomplish this behavior using recursion or something.

Resources