Pseudocode: How to decode a PNG file from bits and bytes? - image

I've been trying to figure this out by reverse engineering a .png file I created in GIMP. It's 4x4 pixels. My goal is to decode the raw pixels out of the file with the intent of reversing this to encode.
Here is a full hex dump of the file:
89504E47 0D0A1A0A 0000000D 49484452 00000004 00000004
08020000 00269309 29000000 3F494441 54081D01 3400CBFF
01CC96B1 134FE120 C0CECDF1 5101FFA5 60000000 000000E0
403201DF E59286DF 6D000000 00000004 EDB11F00 2E007A21
93EDB11F 3063136F 4733525A 00000000 49454E44 AE426082
According to the spec, we start with the PNG signature which is the first 8 bytes.
89504E47 0D0A1A0A
We then have repeating "chunk" structures, this file has 3 "chunks", the header (IHDR), the image data (IDAT) and then the end "chunk" (IEND).
Each chunk is arranged into: the first 4 bytes for the length of the chunk data, the next 4 bytes for the type of data, then n-bytes for the actual data and then 4 bytes for the cyclic redundancy check (CRC) of the type of data and actual data sections.
Following this through...
0000000D
Is the chunk's data length (13 bytes).
49484452
Is the chunk's type (IHDR).
00000004 00000004 08020000 00
Is the chunk's data (4 bytes width, height; 1 byte bit depth, colour type, compression method, filter method, interlace method).
269309 29
Is the CRC of the data and type (managed to get the code to work this out from here.
000000 3F
Is the next chunk's data length (63 bytes).
494441 54
Is the chunk's type (IDAT).
081D01 3400CBFF 01CC96B1 134FE120 C0CECDF1 5101FFA5 60000000 000000E0 403201DF E59286DF 6D000000 00000004 EDB11F00 2E007A21 93EDB11F 3063136F
Is the chunk's actual data (the image data compressed and filtered).
So my actual question is how do I decode this last section into raw pixels?
According to the spec, I must first decompress the data (INFLATE?) and then unfilter it (??) to be left with scanlines of pixels (my goal).
If this could be explained in pseudocode that would be amazing! Otherwise I'm familiar with Swift and less so with C...

This book explains the process of decoding for programmers:
https://www.amazon.com/Compressed-Image-File-Formats-JPEG/dp/0201604434
The entire process is way to complicated to fit within the space of a SO answer. PNG uses two different methods of compression: LZ and Huffman encoding.

I think pngcheck will help you considerably:
pngcheck -vv result.png
Sample Output
File: result.png (334985 bytes)
chunk IHDR at offset 0x0000c, length 13
600 x 450 image, 24-bit RGB, non-interlaced
chunk IDAT at offset 0x00025, length 65536
zlib: deflated, 32K window, default compression
row filters (0 none, 1 sub, 2 up, 3 avg, 4 paeth):
1 4 4 4 4 4 4 4 1 4 4 4 4 4 4 4 1 4 4 4 4 4 4 4 4
1 4 4 1 4 1 4 1 4 4 4 4 4 4 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 4 4 1 1 1 1 1 1 1 4 4 1 4 1 1 4
4 4 4 4 4 1 4 4 4 4 4 4 4 1 4 4 4 4 4 4 4 (96 out of 450)
chunk IDAT at offset 0x10031, length 65536
row filters (0 none, 1 sub, 2 up, 3 avg, 4 paeth):
1 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 1
4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 1 4 4
4 4 4 4 4 4 4 4 4 4 4 4 4 1 4 4 4 (188 out of 450)
chunk IDAT at offset 0x2003d, length 65536
row filters (0 none, 1 sub, 2 up, 3 avg, 4 paeth):
4 4 4 4 4 4 4 4 4 4 4 4 1 4 4 4 4 4 4 4 4 4 4 4 4
4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
4 1 4 4 4 4 4 4 4 4 (273 out of 450)
chunk IDAT at offset 0x30049, length 65536
row filters (0 none, 1 sub, 2 up, 3 avg, 4 paeth):
4 4 4 4 4 4 4 4 4 2 4 4 4 4 4 1 4 4 4 4 4 4 4 4 4
4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
4 4 4 4 4 4 4 4 4 4 4 4 4 1 4 4 4 4 4 4 4 1 4 4 4
4 4 4 4 1 4 4 4 (356 out of 450)
chunk IDAT at offset 0x40055, length 65536
row filters (0 none, 1 sub, 2 up, 3 avg, 4 paeth):
4 4 4 4 1 4 4 4 4 4 4 4 1 4 4 4 4 4 4 4 4 4 4 4 4
4 4 4 1 4 4 4 4 4 4 4 1 1 1 4 4 1 4 4 1 1 1 4 4 4
4 4 1 1 1 4 1 4 4 4 1 4 1 4 4 4 1 4 1 4 4 4 4 4 4
4 1 1 4 1 4 4 1 4 1 (441 out of 450)
chunk IDAT at offset 0x50061, length 7188
row filters (0 none, 1 sub, 2 up, 3 avg, 4 paeth):
4 1 1 4 4 4 4 1 4 (450 out of 450)
chunk IEND at offset 0x51c81, length 0
No errors detected in result.png (8 chunks, 58.6% compression).

Your question is way too broad for pseudocode -- or at least, actionable pseudocode - if we go high-level enough, you already know the basics: 1. uncompress, 2. unfilter, 3. profit.
Since you state in comments that you want to do everything yourself, I propose that you start by implementing the default decompression algorithm, ZLIB CM=8 "deflate", which is well-described in https://www.ietf.org/rfc/rfc1951.txt among others.

Related

Find unrelated partitions of a complete binary tree

I have a complete binary tree of height 'h'.
How do I find 'h' number of unrelated partitions for this ?
NOTE:
Unrelated partition means no child can be present with its immediate parent.
There is a constraint on the number of nodes in each partition.
The difference of the maximum number nodes in a partition and the minimum number of nodes in the partition can either be 0 or 1.
Also, root is excluded from including in the partitions.
Who devised the problem probably had a more elegant solution in mind, but the following works.
Let's say we have h partitions numbered 1 to h, and that the nodes of partition n have value n. The root node has value 0, and does not participate in the partitions. Let's call a partition even if nis even, and odd if n is odd. Let's also number the levels of the complete binary tree, ignoring the root and starting from level 1 with 2 nodes. Level n has 2n nodes, and the complete tree has 2h+1-1 nodes, but only P=2h+1-2 nodes belong to the partitions (because the root is excluded). Each partition consists of p=⌊P/h⌋ or p=⌈P/h⌉ nodes, such that ∑ᵢpᵢ=P.
If the height h of the tree is even, put all even partitions into the even levels of the left subtree and the odd levels of the right subtee, and put all odd partitions into the odd levels of the left subtree and the even levels of the right subtree.
If h is odd, distribute all partitions up to partition h-1 like in the even case, but distribute partition h evenly into the last level of the left and right subtrees.
This is the result for h up to 7 (I wrote a tiny Python library to print binary trees to the terminal in a compact way for this purpose):
0
1 1
0
1 2
2 2 1 1
0
1 2
2 2 1 1
1 1 3 3 2 2 3 3
0
1 2
2 2 1 1
1 1 1 1 2 2 2 2
2 4 4 4 4 4 4 4 1 3 3 3 3 3 3 3
0
1 2
2 2 1 1
1 1 1 1 2 2 2 2
2 2 2 2 2 2 4 4 1 1 1 1 1 1 3 3
3 3 3 3 3 3 3 3 3 3 5 5 5 5 5 5 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 5
0
1 2
2 2 1 1
1 1 1 1 2 2 2 2
2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1
1 1 1 1 1 1 3 3 3 3 3 3 3 3 3 3 2 2 2 2 2 2 4 4 4 4 4 4 4 4 4 4
4 4 4 4 4 4 4 4 4 4 4 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 3 3 3 3 3 3 3 3 3 3 3 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
0
1 2
2 2 1 1
1 1 1 1 2 2 2 2
2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 2 2 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 1 1 1 1 1 1 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
3 3 3 3 3 3 3 3 3 3 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 4 4 4 4 4 4 4 4 4 4 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7
And this is the code that generates it:
from basicbintree import Node
for h in range(1, 7 + 1):
root = Node(0)
P = 2 ** (h + 1) - 2 # nodes in partitions
p = P // h # partition size (may be p or p + 1)
if h & 1: # odd height
t = (p + 1) // 2 # subtree tail nodes from split partition
n = (h - 1) // 2 # odd or even partitions in subtrees except tail
else: # even height
t = 0 # no subtree tail nodes from split partition
n = h // 2 # odd or even partitions in subtrees
s = P // 2 - t # subtree nodes excluding tail
r = s - n * p # partitions of size p + 1 in subtrees
x = [p + 1] * r + [p] * (n - r) # nodes indexed by subtree partition - 1
odd = [1 + 2 * i for i, c in enumerate(x) for _ in range(c)] + [h] * t
even = [2 + 2 * i for i, c in enumerate(x) for _ in range(c)] + [h] * t
for g in range(1, h + 1):
start = 2 ** (g - 1) - 1
stop = 2 ** g - 1
if g & 1: # odd level
root.set_level(odd[start:stop] + even[start:stop])
else: # even level
root.set_level(even[start:stop] + odd[start:stop])
print('```none')
root.print_tree()
print('```')
All trees produced up to height 27 have been programmatically confirmed to meet the specifications.
Some parts of the algorithm would need a proof, like, e.g., that it's always possible to choose an even size for the split partition in the odd height case, but this and other proofs are left as an exercise to the reader ;-)

Determine all consecutive subsets of the set {1,2,3,…,n}. The subsets should have at least 2 elements

I need to partition a set S={1, 2, 3, … , n} consisting of consecutive numbers such that each subset has has at least 2 elements (rule 1) and it consists of consecutive numbers (rule 2).
The rules are:
Each subset has at least two elements.
All elements of all subsets are consecutive.
All elements of S are included in the partition.
Examples:
There is 1 subset for n = 2:
1 2
There is 1 subset for n = 3:
1 2 3
There are 2 subset combinations for n = 4:
1 2 3 4
1 2 - 3 4
There are 3 subset combinations for n = 5:
1 2 3 4 5
1 2 - 3 4 5
1 2 3 - 4 5
There are 5 subset combinations for n = 6:
1 2 3 4 5 6
1 2 - 3 4 5 6
1 2 3 - 4 5 6
1 2 3 4 - 5 6
1 2 - 3 4 - 5 6
There are 8 subset combinations for n = 7:
1 2 3 4 5 6 7
1 2 - 3 4 5 6 7
1 2 3 - 4 5 6 7
1 2 3 4 - 5 6 7
1 2 3 4 5 - 6 7
1 2 - 3 4 - 5 6 7
1 2 - 3 4 5 - 6 7
1 2 3 - 4 5 - 6 7
There are 13 subset combinations for n = 8:
1 2 3 4 5 6 7 8
1 2 - 3 4 5 6 7 8
1 2 3 - 4 5 6 7 8
1 2 3 4 - 5 6 7 8
1 2 3 4 5 - 6 7 8
1 2 3 4 5 6 - 7 8
1 2 - 3 4 - 5 6 7 8
1 2 - 3 4 5 - 6 7 8
1 2 - 3 4 5 6 - 7 8
1 2 3 - 4 5 - 6 7 8
1 2 3 - 4 5 6 - 7 8
1 2 3 4 - 5 6 - 7 8
1 2 - 3 4 - 5 6 - 7 8
There are 21 subset combinations for n = 9:
1 2 3 4 5 6 7 8 9
1 2 - 3 4 5 6 7 8 9
1 2 3 - 4 5 6 7 8 9
1 2 3 4 - 5 6 7 8 9
1 2 3 4 5 - 6 7 8 9
1 2 3 4 5 6 - 7 8 9
1 2 3 4 5 6 7 - 8 9
1 2 - 3 4 - 5 6 7 8 9
1 2 - 3 4 5 - 6 7 8 9
1 2 - 3 4 5 6 - 6 7 9
1 2 - 3 4 5 6 7 - 8 9
1 2 3 - 4 5 - 6 7 8 9
1 2 3 - 4 5 6 - 7 8 9
1 2 3 - 4 5 6 7 - 8 9
1 2 3 4 - 5 6 - 7 8 9
1 2 3 4 - 5 6 7 - 8 9
1 2 3 4 5 - 6 7 - 8 9
1 2 - 3 4 - 5 6 - 7 8 9
1 2 - 3 4 - 5 6 7 - 8 9
1 2 - 3 4 5 - 6 7 - 8 9
1 2 3 - 4 5 - 6 7 - 8 9
There are 34 subset combinations for n = 10:
1 2 3 4 5 6 7 8 9 10
1 2 - 3 4 5 6 7 8 9 10
1 2 3 - 4 5 6 7 8 9 10
1 2 3 4 - 5 6 7 8 9 10
1 2 3 4 5 - 6 7 8 9 10
1 2 3 4 5 6 - 7 8 9 10
1 2 3 4 5 6 7 - 8 9 10
1 2 3 4 5 6 7 8 - 9 10
1 2 - 3 4 - 5 6 7 8 9 10
1 2 - 3 4 5 - 6 7 8 9 10
1 2 - 3 4 5 6 - 6 7 9 10
1 2 - 3 4 5 6 7 - 8 9 10
1 2 - 3 4 5 6 7 8 - 9 10
1 2 3 - 4 5 - 6 7 8 9 10
1 2 3 - 4 5 6 - 7 8 9 10
1 2 3 - 4 5 6 7 - 8 9 10
1 2 3 - 4 5 6 7 8 - 9 10
1 2 3 4 - 5 6 - 7 8 9 10
1 2 3 4 - 5 6 7 - 8 9 10
1 2 3 4 - 5 6 7 8 - 9 10
1 2 3 4 5 - 6 7 - 8 9 10
1 2 3 4 5 - 6 7 8 - 9 10
1 2 3 4 5 6 - 7 8 - 9 10
1 2 - 3 4 - 5 6 - 7 8 9 10
1 2 - 3 4 - 5 6 7 - 8 9 10
1 2 - 3 4 - 5 6 7 8 - 9 10
1 2 - 3 4 5 - 6 7 - 8 9 10
1 2 - 3 4 5 - 6 7 8 - 9 10
1 2 - 3 4 5 6 - 7 8 - 9 10
1 2 3 - 4 5 - 6 7 - 8 9 10
1 2 3 - 4 5 - 6 7 8 - 9 10
1 2 3 - 4 5 6 - 7 8 - 9 10
1 2 3 4 - 5 6 - 7 8 - 9 10
1 2 - 3 4 - 5 6 - 7 8 - 9 10
I didn't write them down here but there are 55 subset combinations for n = 11 and 89 subset combinations for n = 12.
I need to write a Visual Basic code listing all possible subset groups for n. I have been thinking on the solution for days but it seems that the solution of the problem is beyond my capacity. The number of required nested loops increases with n and I could not figure out how to program the nested loops with increasing number. Any help will be greatly appreciated.
After some research, I found out this is the problem of "compositions of n with all parts >1" and the total number of possible compositions are Fibonacci numbers (Fn-1 for n).
We already know the answer for these cases (as you wrote in your examples):
n=2
n=3
n=4
For n=5:
You can partition from 2: 1 2 - 3 4 5. This is like dividing the 5 member set into two sets, first one n=2, and second one n=3. We can now continue dividing each half, but we already know the solutions when n=2 and n=3!
You can partition from 3: 1 2 3 - 4 5. This is like dividing the 5 member set into two sets, first one n=3, and second one n=2. We can now continue dividing each half, but we already know the solution when n=2 and n=3!
For n=6:
You can partition into two sets from 2: 1 2 - 3 4 5 6. This is like dividing the 6 member set into two sets, first one n=2, and second one n=4. We can now continue dividing each half, but we already know the solution when n=2. Solve the second half by assuming n=4!
You can partition into two sets from 3: 1 2 3 - 4 5 6. This is like dividing the 6 member set into two sets, first one n=3, and second one n=3, We can now continue dividing each half, but we already know the solution when n=3 and n=3!
You can partition into two sets from 3: 1 2 3 4 - 5 6. This is like dividing the 6 member set into two sets, first one n=4, and second one n=2, We can now continue dividing each half. Solve the first half by setting n=4. For the second half, we already know the solution when n=2!
This is a simple recursion relationship. The general case:
Partition (S): (where |S|>4)
- For i from 2 to |S|-2, partition the given set into two halves: s1 and s2 from i (s1={1,...,i}, s2={i+1,...,n}), and print the two subsets as a solution.
- Recursively continue for each half by calling Partition(s1) and Partition(s2)
---
Another perhaps harder solution is to assume we are dividing the numbers 1 to n into n sections, where the length of each section can be either 0, 2, or a number greater than 2. In other words let xi be the length of each section:
x1 + x2 + ... xn = n, where the range of xi is: {0} + [2,n]
This is a system of linear non-equalities can can be solved by methods described here.
My answer to you is to try and come up with a recurrence relation of the given pattern. Think recursively. How can I break this problem down into smaller subproblems until reaching the smallest problem. Solve that smallest problem. After solving that smallest problem, think induction. Hypothesize on the what the nth step will be and how you will reach the (n+1)th step. Try to solve that (n+1)th step. Once you have come up with a recurrence relation on the pattern being given, it should not be too difficult to think about how to solve this pattern recursively. Instead of trying to use nested loops, this approach may be more intuitive.

Tree searching algorithm: how to determine quickly if A has a sure-to-win strategy

The original question goes like this: There are 99 stones, A and B are playing a game that, each one take some stones in turn, and each turn one can only take 1, 2, 4, or 6 stones, the one take the last stone wins. If A is the first one to take stones, how many stones shall A take in the first turn?
This seems a quite complex tree searching quiz, listing out all the branches, then work it bottom up: the leaf with A taking the last stone is marked as "win"; for the intermediate node that whatever strategies B might take, if A always has a way to reach a node marked as "win", this node is also marked as "win".
But this approach is quite time consuming. Is there any smart algorithm to check out if A has a "guaranteed to win" strategy?
O(n) solution
If we start with 1, 2, 4 or 6 stones, A will always win, because he'll just take them all in the first move.
If we start with 3, A will lose no matter what he does, because regardless of whether he takes 1 or 2, B will take 2 or 1 next and win.
If we start with 5, A will win by taking 2 first, thus sending B to the case above, where he starts with 3 stones.
If we start with 7, A will win by taking 4, sending B to the same case with 3.
If we start with 8, A will lose no matter what he does: whatever he takes, he will send B to a winning position.
If we start with 9, A can take 1 and send B to the situation with 8, causing him to lose.
If we start with 10, A can take 2 and send B to the situation with 8 again, causing him to lose.
By now, it should become quite obvious how you can incrementally build an O(n) solution: let win[i] = true if i stones are winnable for the first person to move
We have:
win[1] = win[2] = win[4] = win[5] = win[6] = true, win[3] = false
win[x > 6] = not (win[x - 6] and win[x - 4] and win[x - 2] and win[x - 1])
For example:
win[7] = not (win[1] and win[3] and win[5] and win[6])
= not (true and false and true and true)
= not false
= true
Compute this up until the number you're interested in and that's it. No trees involved.
O(1) solution
By looking carefully at the above solution, we can derive a simple constant time solution: note that A can only lose if he sends B to a winning position no matter what he does, so if k - 6, k - 4, k - 2, k - 1 are all winning positions.
If you compute win for a few values, the pattern becomes obvious:
win[k] = false if k = 3, 8, 11, 16, 19, 24, 27, 32...
=> win[k] = false iff k mod 8 == 3 or k mod 8 == 0
For 99, 99 mod 8 = 3, so A does not have a sure winning strategy.
OK, so we can see that:
Every turn, number of stones can be taken is less than 7, so the result should be related to modulus 7.
So, for n < 1000, I have printed out the sequence of number of stones that makes the first person win, modulus 7, and it is a truly repeated cycle.
1 2 4 5 6 0 2 3 5 6 0 1 3 4 6 0 1 2 4 5 0 1 2 3 5 6 1 2 3 4 6 0 2 3 4 5 0 1 3 4 5 6 1 2 4 5 6 0 2 3 5 6 0 1 3 4 6 0 1 2 4 5 0 1 2 3 5 6 1 2 3 4 6 0 2 3 4 5 0 1 3 4 5 6 1 2 4 5 6 0 2 3 5 6 0 1 3 4 6 0 1 2 4 5 0 1 2 3 5 6 1 2 3 4 6 0 2 3 4 5 0 1 3 4 5 6 1 2 4 5 6 0 2 3 5 6 0 1 3 4 6 0 1 2 4 5 0 1 2 3 5 6 1 2 3 4 6 0 2 3 4 5 0 1 3 4 5 6 1 2 4 5 6 0 2 3 5 6 0 1 3 4 6 0 1 2 4 5 0 1 2 3 5 6 1 2 3 4 6 0 2 3 4 5 0 1 3 4 5 6 1 2 4 5 6 0 2 3 5 6 0 1 3 4 6 0 1 2 4 5 0 1 2 3 5 6 1 2 3 4 6 0 2 3 4 5 0 1 3 4 5 6 1 2 4 5 6 0 2 3 5 6 0 1 3 4 6 0 1 2 4 5 0 1 2 3 5 6 1 2 3 4 6 0 2 3 4 5 0 1 3 4 5 6 1 2 4 5 6 0 2 3 5 6 0 1 3 4 6 0 1 2 4 5 0 1 2 3 5 6 1 2 3 4 6 0 2 3 4 5 0 1 3 4 5 6 1 2 4 5 6 0 2 3 5 6 0 1 3 4 6 0 1 2 4 5 0 1 2 3 5 6 1 2 3 4 6 0 2 3 4 5 0 1 3 4 5 6 1 2 4 5 6 0 2 3 5 6 0 1 3 4 6 0 1 2 4 5 0 1 2 3 5 6 1 2 3 4 6 0 2 3 4 5 0 1 3 4 5 6 1 2 4 5 6 0 2 3 5 6 0 1 3 4 6 0 1 2 4 5 0 1 2 3 5 6 1 2 3 4 6 0 2 3 4 5 0 1 3 4 5 6 1 2 4 5 6 0 2 3 5 6 0 1 3 4 6 0 1 2 4 5 0 1 2 3 5 6 1 2 3 4 6 0 2 3 4 5 0 1 3 4 5 6 1 2 4 5 6 0 2 3 5 6 0 1 3 4 6 0 1 2 4 5 0 1 2 3 5 6 1 2 3 4 6 0 2 3 4 5 0 1 3 4 5 6 1 2 4 5 6 0 2 3 5 6 0 1 3 4 6 0 1 2 4 5 0 1 2 3 5 6 1 2 3 4 6 0 2 3 4 5 0 1 3 4 5 6 1 2 4 5 6 0 2 3 5 6 0 1 3 4 6 0 1 2 4 5 0 1 2 3 5 6 1 2 3 4 6 0 2 3 4 5 0 1 3 4 5 6 1 2 4 5 6 0 2 3 5 6 0 1 3 4 6 0 1 2 4 5 0 1 2 3 5 6 1 2 3 4 6 0 2 3 4 5 0 1 3 4 5 6 1 2 4 5 6 0 2 3 5 6 0 1 3 4 6 0 1 2 4 5 0 1 2 3 5 6 1 2 3 4 6 0 2 3 4 5 0 1 3 4 5 6 1 2 4 5 6 0 2 3 5 6 0 1 3 4 6 0 1 2 4 5 0 1 2 3 5 6 1 2 3 4 6 0 2 3 4 5
This cycle has the length is 56, so the problem can be solved in O(1) by finding the result of first 56 numbers.

Data structures - Queue

I have queue q of integer numbers stored in an array in the circular fashion
from front to rear that is..
f: 1
r: 8
array Q: 0 1 2 3 4 5 6 7 8 9
2 8 4 4 3 5 4
What is the array representation of queue q after i perform the following?
while q.front() is an even number do q.enqueue(q.dequeue()).
It will loop until it reaches number 3. So the array will be
0 1 2 3 4 5 6 7 8 9
4 4 3 5 4 2 8

Solving a recreational square packing problem

I was asked to find a 11x11-grid containing the digits such that one can read the squares of 1,...,100. Here read means that you fix the starting position and direction (8 possibilities) and if you can find for example the digits 1,0,0,0,0,4 consecutively, you have found the squares of 1, 2, 10, 100 and 20. I made a program (the algorithm is not my own. I modified slightly a program which uses best-first search to find a solution but it is too slow. Does anyone know a better algorithm to solve the problem?
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <vector>
#include <algorithm>
using namespace std;
int val[21][21];//number which is present on position
int vnum[21][21];//number of times the position is used - useful if you want to backtrack
//5 unit borders
int mx[4]={-1,0,1,0};//movement arrays
int my[4]={0,-1,0,1};
int check(int x,int y,int v,int m)//check if you can place number - if you can, return number of overlaps
{
int c=1;
while(v)//extract digits one by one
{
if(vnum[x][y] && (v%10)!=val[x][y])
return 0;
if(vnum[x][y])
c++;
v/=10;
x+=mx[m];
y+=my[m];
}
return c;
}
void apply(int x,int y,int v,int m)//place number - no sanity checks
{
while(v)//extract digits one by one
{
val[x][y]=v%10;
vnum[x][y]++;
v/=10;
x+=mx[m];
y+=my[m];
}
}
void deapply(int x,int y,int v,int m)//remove number - no sanity checks
{
while(v)
{
vnum[x][y]--;
v/=10;
x+=mx[m];
y+=my[m];
}
}
int best=100;
void recur(int num)//go down a semi-random path
{
if(num<best)
{
best=num;
if(best)
printf("FAILED AT %d\n",best);
else
printf("SUCCESS\n");
for(int x=5;x<16;x++) // 16 and 16
{
for(int y=5;y<16;y++)
{
if(vnum[x][y]==0)
putchar('.');
else
putchar(val[x][y]+'0');
}
putchar('\n');
}
fflush(stdout);
}
if(num==0)
return;
int s=num*num,t;
vector<int> poss;
for(int x=5;x<16;x++)
for(int y=5;y<16;y++)
for(int m=0;m<4;m++)
if(t=check(x,y,s,m))
poss.push_back((x)|(y<<8)|(m<<16)|(t<<24));//compress four numbers into an int
if(poss.size()==0)
return;
sort(poss.begin(),poss.end());//essentially sorting by t
t=poss.size()-1;
while(t>=0 && (poss[t]>>24)==(poss.back()>>24))
t--;
t++;
//t is now equal to the smallest index which has the maximal overlap
t=poss[rand()%(poss.size()-t)+t];//select random index>=t
apply(t%256,(t>>8)%256,s,(t>>16)%256);//extract random number
recur(num-1);//continue down path
}
int main()
{
srand((unsigned)time(0));//seed
while(true)
{
for(int i=0;i<21;i++)//reset board
{
memset(val[i],-1,21*sizeof(int));
memset(vnum[i],-1,21*sizeof(int));
}
for(int i=5;i<16;i++)
{
memset(val[i]+5,0,11*sizeof(int));
memset(vnum[i]+5,0,11*sizeof(int));
}
recur(100);
}
}
Using a random search so far I only got to 92 squares with one unused spot (8 missing numbers: 5041 9025 289 10000 4356 8464 3364 3249)
1 5 2 1 2 9 7 5 6 9 5
6 1 0 8 9 3 8 4 4 1 2
9 7 2 2 5 0 0 4 8 8 2
1 6 5 9 6 0 4 4 7 7 4
4 4 2 7 6 1 2 9 0 2 2
2 9 6 1 7 8 4 4 0 9 3
6 5 5 3 2 6 0 1 4 0 6
4 7 6 1 8 1 1 8 2 8 1
8 0 1 3 4 8 1 5 3 2 9
0 5 9 6 9 8 8 6 7 4 5
6 6 2 9 1 7 3 9 6 9
The algorithm basically uses as solution encoding a permutation on the input (search space is 100!) and then places each number in the "topmost" legal position. The solution value is measured as the sum of the squares of the lengths of the numbers placed (to give more importance to long numbers) and the number of "holes" remaining (IMO increasing the number of holes should raise the likehood that another number will fit in).
The code has not been optimized at all and is only able to decode a few hundred solutions per second. Current solution has been found after 196k attempts.
UPDATE
Current best solution with this approach is 93 without free holes (7 missing numbers: 676 7225 3481 10000 3364 7744 5776):
9 6 0 4 8 1 0 0 9 3 6
6 4 0 0 2 2 5 6 8 8 9
1 7 2 9 4 1 5 4 7 6 3
5 8 2 3 8 6 4 9 6 5 7
2 4 4 4 1 8 2 8 2 7 2
1 0 8 9 9 1 3 4 4 9 1
2 1 2 9 6 1 0 6 2 4 1
2 3 5 5 3 9 9 4 0 9 6
5 0 0 6 1 0 3 5 2 0 3
2 7 0 4 2 2 5 2 8 0 9
9 8 2 2 6 5 3 4 7 6 1
This is a solution (all 100 numbers placed) however using a 12x12 grid (MUCH easier)
9 4 6 8 7 7 4 4 5 5 1 7
8 3 0 5 5 9 2 9 6 7 6 4
4 4 8 3 6 2 6 0 1 7 8 4
4 8 4 2 9 1 4 0 5 6 1 4
9 1 6 9 4 8 1 5 4 2 0 1
9 4 4 7 2 2 5 2 2 5 0 0
4 6 2 2 5 8 4 2 7 4 0 2
0 3 3 3 6 4 0 0 6 3 0 9
9 8 0 1 2 1 7 9 5 5 9 1
6 8 4 2 3 5 2 6 3 2 0 6
9 9 8 2 5 2 9 9 4 2 2 7
1 1 5 6 6 1 9 3 6 1 5 4
It has been found using a truly "brute force" approach, starting from a random matrix and keeping randomly changing digits when that improved the coverage.
This solution has been found by an highly unrolled C++ program automatically generated by a Python script.
Update 2
Using an incremental approach (i.e. keeping a more complex data structure so that when changing a matrix element the number of targets covered can be updated instead than recomputed) I got a much faster search (about 15k matrices/second investigated with a Python implementation running with PyPy).
In a few minutes this version was able to find a 99 quasi-solution (a number is still missing):
7 0 5 6 5 1 1 5 7 1 6
4 6 3 3 9 8 8 6 7 6 1
3 9 0 8 2 6 1 1 4 7 8
1 1 0 8 9 9 0 0 4 4 6
3 4 9 0 4 9 0 4 6 7 1
6 4 4 6 8 6 3 2 5 2 9
9 7 8 4 1 1 4 0 5 4 2
6 2 4 1 5 2 2 1 2 9 7
9 8 2 5 2 2 7 3 6 5 0
3 1 2 5 0 0 6 3 0 5 4
7 5 6 9 2 1 6 5 3 4 6
UPDATE 3
Ok. After a some time (no idea how much) the same Python program actually found a complete solution (several ones indeed)... here is one
6 4 6 9 4 1 2 9 7 3 6
9 2 7 7 4 4 8 1 2 1 7
1 0 6 2 7 0 4 4 8 3 4
2 1 2 2 5 5 9 2 9 6 5
9 2 5 5 2 0 2 6 3 9 1
1 6 3 6 0 0 9 3 7 0 6
6 0 0 4 9 0 1 6 0 0 4
9 8 4 4 8 0 1 4 5 2 3
2 4 8 2 8 1 6 8 6 7 5
1 7 6 9 2 4 5 4 2 7 6
6 6 3 8 8 5 6 1 5 2 1
The searching program can be found here...
You've got 100 numbers and 121 cells to work with, so you'll need to be very efficient. We should try to build up the grid, so that each time we fill a cell, we attain a new number in our list.
For now, let's only worry about 68 4-digit numbers. I think a good chunk of the shorter numbers will be in our grid without any effort.
Start with a 3x3 or 4x4 set of numbers in the top-left of your grid. It can be arbitrary, or fine-tune for slightly better results. Now let's fill in the rest of the grid one square at a time.
Repeat these steps:
Fill an empty cell with a digit
Check which numbers that knocked off the list
If it didn't knock off any 4-digit numbers, try a different digit or cell
Eventually you may need to fill 2 cells or even 3 cells to achieve a new 4-digit number, but this should be uncommon, except at the end (at which point, hopefully there's a lot of empty space). Continue the process for the (few?) remaining 3-digit numbers.
There's a lot room for optimizations and tweaks, but I think this technique is fast and promising and a good starting point. If you get an answer, share it with us! :)
Update
I tried my approach and only got 87 out of the 100:
10894688943
60213136008
56252211674
61444925224
59409675697
02180334817
73260193640
.5476685202
0052034645.
...4.948156
......4671.
My guess is that both algorithms are too slow. Some optimization algorithm might work like best-first search or simulated annealing but my I don't have much experience on programming those.
Have you tried any primary research on Two-Dimensional Bin Packing (2DBP) algorithms? Google Scholars is a good start. I did this a while ago when building an application to generate mosaics.
All rectangular bin packing algorithms can be divided into 4 groups based on their support for the following constraints:
Must the resulting bin be guillotine cuttable? I.e. do you have to later slice the bin(s) in half until all the pieces are unpacked?
Can the pieces be rotated to fit into the bin? Not an issue with square pieces, so this makes more algorithms available to you.
Out of all the algorithms I looked into, the most efficient solution is an Alternate Directions (AD) algorithm with a Tabu Search optimization layer. There are dissertations which prove this. I may be able to dig-up some links if this helps.
Some ideas off the top of my head, without investing much time into thinking about details.
I would start by counting the number of occurrences of each digit in all squares 1..100. The total number of digits will be obviously larger than 121, but by analyzing individual frequencies you can deduce which digits must be grouped on a single line to form as many different squares as possible. For example, if 0 has the highest frequency, you have to try to put as many squares containing a 0 on the same line.
You could maintain a count of digits for each line, and each time you place a digit, you update the count. This lets you easily compute which square numbers have been covered by that particular line.
So, the program will still be brute-force, but it will exploit the problem structure much better.
PS: Counting digit frequencies is the easiest way to decide whether a certain permutation of digits constitutes a square.

Resources