I have a complete binary tree of height 'h'.
How do I find 'h' unrelated partitions of it?
NOTE:
An unrelated partition means that no node may be in the same partition as its immediate parent.
There is a constraint on the number of nodes in each partition.
The difference between the maximum and the minimum number of nodes in a partition must be either 0 or 1.
Also, the root is excluded from the partitions.
Whoever devised the problem probably had a more elegant solution in mind, but the following works.
Let's say we have h partitions numbered 1 to h, and that the nodes of partition n have value n. The root node has value 0 and does not participate in the partitions. Let's call a partition even if n is even, and odd if n is odd. Let's also number the levels of the complete binary tree, ignoring the root and starting from level 1 with 2 nodes. Level n has 2^n nodes, and the complete tree has 2^(h+1)-1 nodes, but only P = 2^(h+1)-2 nodes belong to the partitions (because the root is excluded). Each partition consists of p = ⌊P/h⌋ or p = ⌈P/h⌉ nodes, such that ∑ᵢpᵢ = P.
If the height h of the tree is even, put all even partitions into the even levels of the left subtree and the odd levels of the right subtree, and put all odd partitions into the odd levels of the left subtree and the even levels of the right subtree.
If h is odd, distribute all partitions up to partition h-1 like in the even case, but distribute partition h evenly into the last level of the left and right subtrees.
This is the result for h up to 7 (I wrote a tiny Python library to print binary trees to the terminal in a compact way for this purpose):
0
1 1
0
1 2
2 2 1 1
0
1 2
2 2 1 1
1 1 3 3 2 2 3 3
0
1 2
2 2 1 1
1 1 1 1 2 2 2 2
2 4 4 4 4 4 4 4 1 3 3 3 3 3 3 3
0
1 2
2 2 1 1
1 1 1 1 2 2 2 2
2 2 2 2 2 2 4 4 1 1 1 1 1 1 3 3
3 3 3 3 3 3 3 3 3 3 5 5 5 5 5 5 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 5
0
1 2
2 2 1 1
1 1 1 1 2 2 2 2
2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1
1 1 1 1 1 1 3 3 3 3 3 3 3 3 3 3 2 2 2 2 2 2 4 4 4 4 4 4 4 4 4 4
4 4 4 4 4 4 4 4 4 4 4 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 3 3 3 3 3 3 3 3 3 3 3 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
0
1 2
2 2 1 1
1 1 1 1 2 2 2 2
2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 2 2 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 1 1 1 1 1 1 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
3 3 3 3 3 3 3 3 3 3 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 4 4 4 4 4 4 4 4 4 4 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7
And this is the code that generates it:
from basicbintree import Node

for h in range(1, 7 + 1):
    root = Node(0)
    P = 2 ** (h + 1) - 2  # nodes in partitions
    p = P // h  # partition size (may be p or p + 1)
    if h & 1:  # odd height
        t = (p + 1) // 2  # subtree tail nodes from split partition
        n = (h - 1) // 2  # odd or even partitions in subtrees except tail
    else:  # even height
        t = 0  # no subtree tail nodes from split partition
        n = h // 2  # odd or even partitions in subtrees
    s = P // 2 - t  # subtree nodes excluding tail
    r = s - n * p  # partitions of size p + 1 in subtrees
    x = [p + 1] * r + [p] * (n - r)  # nodes indexed by subtree partition - 1
    odd = [1 + 2 * i for i, c in enumerate(x) for _ in range(c)] + [h] * t
    even = [2 + 2 * i for i, c in enumerate(x) for _ in range(c)] + [h] * t
    for g in range(1, h + 1):
        start = 2 ** (g - 1) - 1
        stop = 2 ** g - 1
        if g & 1:  # odd level
            root.set_level(odd[start:stop] + even[start:stop])
        else:  # even level
            root.set_level(even[start:stop] + odd[start:stop])
    print('```none')
    root.print_tree()
    print('```')
All trees produced up to height 27 have been programmatically confirmed to meet the specifications.
Some parts of the algorithm would need a proof, e.g. that it's always possible to choose an even size for the split partition in the odd height case, but this and other proofs are left as an exercise to the reader ;-)
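The check mentioned above can be sketched along the following lines. This is not the checker that was actually used; it is a minimal sketch that rebuilds the levels as plain lists (instead of going through the Node class) and assumes that each level is stored left to right, so that node i on one level is the parent of nodes 2i and 2i+1 on the next. The names build_levels and is_valid are made up for this illustration.

```python
from collections import Counter


def build_levels(h):
    """Return levels 1..h as flat lists, using the same arithmetic as the code above."""
    P = 2 ** (h + 1) - 2
    p = P // h
    if h & 1:
        t = (p + 1) // 2
        n = (h - 1) // 2
    else:
        t = 0
        n = h // 2
    s = P // 2 - t
    r = s - n * p
    x = [p + 1] * r + [p] * (n - r)
    odd = [1 + 2 * i for i, c in enumerate(x) for _ in range(c)] + [h] * t
    even = [2 + 2 * i for i, c in enumerate(x) for _ in range(c)] + [h] * t
    levels = []
    for g in range(1, h + 1):
        start, stop = 2 ** (g - 1) - 1, 2 ** g - 1
        left, right = (odd, even) if g & 1 else (even, odd)
        levels.append(left[start:stop] + right[start:stop])
    return levels


def is_valid(levels):
    """Check both constraints; the root (value 0) can never clash with level 1."""
    h = len(levels)
    # No child may share a partition with its immediate parent.
    for parents, children in zip(levels, levels[1:]):
        for i, child in enumerate(children):
            if child == parents[i // 2]:
                return False
    # Partition sizes may differ by at most 1, and every partition must be used.
    sizes = Counter(value for level in levels for value in level)
    counts = [sizes.get(k, 0) for k in range(1, h + 1)]
    return min(counts) > 0 and max(counts) - min(counts) <= 1


for h in range(1, 11):
    print(h, is_valid(build_levels(h)))
```

Running the loop for larger h only needs a bigger range; the last level holds 2^h values, so memory grows quickly.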
I need an efficient algorithm for this problem (time complexity less than O(n^2)), please help me:
We say a[i..j] < b[i..j] if, after sorting these 2 arrays, a[i]<b[i], a[i+1]<b[i+1], ..., a[j]<b[j].
Given array A[1..n] (n <= 10^5, a[i] <= 1000), find the maximum k such that A[1..k] < A[k+1..2k].
For example, n=10: 2 2 1 4 3 2 5 4 2 3
the answer is 4
It is easy to see that k <= n/2. So we can use brute force (k from n/2 down to 1), but not binary search.
And I don't know what to do with a[i] <= 1000. Maybe use a map?
Use a Fenwick tree with range updates. Each index in the tree holds the count of how many numbers in window A are smaller than it. For the windows to be valid, each element in B (the window on the right) must have a partner in A (the window on the left). When we shift a number x into A, we add 1 to the range [x+1, 1000] in the tree. For the element shifted from B to A, add 1 at its tree index. For each new element in B, add -1 at its index in the tree. If any index drops below zero, the window is invalid.
For the example, we have:
2 2 1 4 3 2 5 4 2 3
2 2
|
Tree:
add 1 to [3, 1000]
add -1 to 2
idx 1 2 3 4 5
val 0 -1 1 1 1 (invalid)
2 2 1 4 3 2 5 4 2 3
2 2 1 4
|
Tree:
add 1 to [3, 1000]
add 1 to 2 (remove 2 from B)
add -1 to 1
add -1 to 4
idx 1 2 3 4 5
val -1 0 2 1 2 (invalid)
2 2 1 4 3 2 5 4 2 3
2 2 1 4 3 2
|
Tree:
add 1 to [2, 1000]
add 1 to 1 (remove 1 from B)
add -1 to 3
add -1 to 2
idx 1 2 3 4 5
val 0 0 2 2 3 (valid)
2 2 1 4 3 2 5 4 2 3
2 2 1 4 3 2 5 4
|
Tree:
add 1 to [5, 1000]
add 1 to 4 (remove 4 from B)
add -1 to 5
add -1 to 4
idx 1 2 3 4 5
val 0 0 2 2 3 (valid)
2 2 1 4 3 2 5 4 2 3
2 2 1 4 3 2 5 4 2 3
|
Tree:
add 1 to [4, 1000]
add 1 to 3 (remove 3 from B)
add -1 to 2
add -1 to 3
idx 1 2 3 4 5
val 0 -1 2 3 4 (invalid)
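A minimal Python sketch of the sliding-window idea follows. It is not the exact scheme above: instead of a point update for each element of B, it subtracts over the whole range [b, 1000], so that index v holds (number of elements of A smaller than v) minus (number of elements of B not larger than v). With a point update alone, a case such as A = [1, 5], B = [2, 3] appears to pass although 5 < 3 fails. A small segment tree with range add and global minimum stands in for the Fenwick tree. The names MinTree and max_k are made up for this sketch, and values are assumed to lie in 1..1000.

```python
class MinTree:
    """Segment tree over indices 1..size with range add and global minimum."""

    def __init__(self, size):
        self.size = size
        self.mn = [0] * (4 * size)   # minimum of the node's interval
        self.add = [0] * (4 * size)  # pending add that covers the whole interval

    def update(self, lo, hi, delta, node=1, left=1, right=None):
        if right is None:
            right = self.size
        if lo > hi or lo > right or hi < left:
            return  # empty or disjoint range
        if lo <= left and right <= hi:
            self.mn[node] += delta
            self.add[node] += delta
            return
        mid = (left + right) // 2
        self.update(lo, hi, delta, 2 * node, left, mid)
        self.update(lo, hi, delta, 2 * node + 1, mid + 1, right)
        self.mn[node] = self.add[node] + min(self.mn[2 * node], self.mn[2 * node + 1])

    def minimum(self):
        return self.mn[1]


def max_k(a, max_value=1000):
    """Largest k with sorted(a[:k]) element-wise smaller than sorted(a[k:2k]), or 0."""
    tree = MinTree(max_value)
    best = 0
    for k in range(1, len(a) // 2 + 1):
        # The right window B grows by two elements ...
        for y in (a[2 * k - 2], a[2 * k - 1]):
            tree.update(y, max_value, -1)
        # ... and its first element x moves over to the left window A.
        x = a[k - 1]
        tree.update(x, max_value, +1)      # x leaves B
        tree.update(x + 1, max_value, +1)  # x joins A: it is smaller than every v > x
        if tree.minimum() >= 0:
            best = k
    return best


print(max_k([2, 2, 1, 4, 3, 2, 5, 4, 2, 3]))  # the example from the question
```

Each step costs a constant number of O(log 1000) updates, so the whole scan stays well below O(n^2).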
Say I have matrix A:
A = [1 1 1 2 2 3 3 3;
1 1 1 2 2 3 3 3;
1 1 1 2 2 4 4 5;
2 2 2 2 2 5 5 5]
and matrix B with the same labels, just in different positions and not always with the same elements in each cluster:
B = [3 3 3 3 5 1 1 1;
3 3 3 3 5 1 1 1;
3 3 3 3 5 2 2 4;
5 5 5 5 5 4 4 4]
and I want matrix C to look like this
C = [1 1 1 1 2 3 3 3;
1 1 1 1 2 3 3 3;
1 1 1 1 2 4 4 5;
2 2 2 2 2 5 5 5]
Basically, I want the clusters in B that have a similar position to A to also have the same label as A, even if the clusters in B don't have the same exact amount of elements as the clusters in A. This is just a basic example because what I'm really working on are two images that have different labellings.
example of the image I'm working on
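One way to read "similar position" is by overlap: give every cluster in B the label of the cluster in A that it shares the most cells with. The following is only a sketch of that reading, written in plain Python with nested lists rather than for images; the function name relabel_by_overlap is made up, ties are broken arbitrarily, and two clusters of B may end up with the same label.

```python
from collections import Counter


def relabel_by_overlap(a, b):
    """Relabel b so each of its clusters takes the label of a it overlaps most."""
    votes = {}
    for row_a, row_b in zip(a, b):
        for label_a, label_b in zip(row_a, row_b):
            votes.setdefault(label_b, Counter())[label_a] += 1
    # For every label of b, pick the label of a seen most often under it.
    mapping = {label_b: counts.most_common(1)[0][0] for label_b, counts in votes.items()}
    return [[mapping[label_b] for label_b in row] for row in b]


A = [[1, 1, 1, 2, 2, 3, 3, 3],
     [1, 1, 1, 2, 2, 3, 3, 3],
     [1, 1, 1, 2, 2, 4, 4, 5],
     [2, 2, 2, 2, 2, 5, 5, 5]]
B = [[3, 3, 3, 3, 5, 1, 1, 1],
     [3, 3, 3, 3, 5, 1, 1, 1],
     [3, 3, 3, 3, 5, 2, 2, 4],
     [5, 5, 5, 5, 5, 4, 4, 4]]

for row in relabel_by_overlap(A, B):
    print(row)
```

On the small example this reproduces the matrix C given above; whether it is good enough for real images depends on how far the clusters have moved.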
Given a dictionary list and an input word, return true if the input word differs by a single typo from a dictionary word of the same length.
dictionary = ["apple", "testing", "computer"];
singleType(dictionary, "adple") // true
singleType(dictionary, "addle") // false
singleType(dictionary, "apple") // false
singleType(dictionary, "apples") // false
I proposed a solution that runs in linear time, if we ignore the preprocessing time needed for the hash map.
O(k*26) => O(k), where k = length of the input word
My linear solution goes like this: convert the dictionary list into a hash map where the key is the word and the value is a boolean, then loop through every character of the input word, replace it with each of the 26 letters of the alphabet in turn, and check whether the result is in the hash map.
But they say I could do better than O(k*26). How?
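For reference, the approach described above might look roughly like this in Python. It is a sketch under the assumption that words consist of lowercase ASCII letters; the name single_typo is made up, and a set is used in place of a hash map with boolean values. Note that building and hashing each candidate string itself takes O(k), so the O(k*26) figure counts lookups, not character operations.

```python
import string


def single_typo(dictionary, word):
    """True if changing exactly one letter of word yields a dictionary word."""
    words = set(dictionary)
    for i, original in enumerate(word):
        for letter in string.ascii_lowercase:
            if letter == original:
                continue  # keeping the letter is not a typo
            if word[:i] + letter + word[i + 1:] in words:
                return True
    return False


dictionary = ["apple", "testing", "computer"]
for candidate in ["adple", "addle", "apple", "apples"]:
    print(candidate, single_typo(dictionary, candidate))
```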
You could extend the dictionary with all the variants of the word containing a single typo, but instead of the actual typo, you just put some "wildcard" character like ? or * in that place. Then, you can check whether (a) the word is not in the set of correctly spelled words, and (b) replacing any of the letters in the word with the same wildcard symbol, the word can be found in the set of words having one typo.
Example in Python:
>>> dictionary = ["apple", "testing", "computer"]
>>> wildcard = lambda w: [w[:i]+"?"+w[i+1:] for i in range(len(w))]
>>> onetypo = {x for w in dictionary for x in wildcard(w)}
>>> correct = {w for w in dictionary}
>>> word = "apxle"
>>> word not in correct and any(w in onetypo for w in wildcard(word))
True
This reduces the complexity of lookup to just O(k), i.e. still linear in the number of letters, but without the high constant factor. It does, however, greatly blow up the dictionary by a factor equal to the average number of letters in the words.
For a single lookup, I would filter the dictionary by word length, and then iterate the words, counting the errors, and bail out of each word, as soon as the error count is > 1.
val dictionary = List ("affen", "ample", "apple", "appse", "ipple", "appl", "pple", "mapple", "apples")
@annotation.tailrec
def oneError (w1: String, w2: String, err: Int) : Boolean = w1.length match {
  case 0 => err == 1
  case _ => if (err > 1) false else {
    if (w1(0) == w2(0)) oneError (w1.substring (1), w2.substring (1), err) else
      oneError (w1.substring (1), w2.substring (1), err + 1)
  }
}
scala> dictionary.filter (_.length == 5).filter (s => oneError ("appxe", s, 0))
res5: List[String] = List(apple, appse)
For processing a longer text, I would preprocess the dictionary and split it into Maps (word.length -> List (words)).
For natural language, which is highly redundant, I would build a Set of unique words from the text, to lookup every word just once.
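A rough Python sketch of this preprocessing, with made-up names (group_by_length, one_error, check_text), could look as follows. It groups the dictionary by word length once, looks up each distinct word of the text only once, and stops comparing a pair as soon as a second mismatch shows up.

```python
from collections import defaultdict


def group_by_length(dictionary):
    """Map word length -> list of dictionary words of that length."""
    groups = defaultdict(list)
    for entry in dictionary:
        groups[len(entry)].append(entry)
    return groups


def one_error(w1, w2):
    """True if two words of equal length differ in exactly one position."""
    errors = 0
    for c1, c2 in zip(w1, w2):
        if c1 != c2:
            errors += 1
            if errors > 1:
                return False  # bail out early
    return errors == 1


def check_text(text_words, dictionary):
    """For each distinct word of the text, list dictionary words one typo away."""
    groups = group_by_length(dictionary)
    return {word: [entry for entry in groups.get(len(word), []) if one_error(word, entry)]
            for word in set(text_words)}


dictionary = ["affen", "ample", "apple", "appse", "ipple", "appl", "pple", "mapple", "apples"]
print(check_text(["appxe", "appxe", "appl"], dictionary))
```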
For a single word lookup, the worst case is n calls to the initial function, with n=max (dictionary.groupBy (w.length)).
Each word lookup (of words longer than 1) will take at least 2 steps until failure, but most words, assuming no pathological input and dictionary, are only visited for 2 steps. Of the remaining ones, most are excluded after 3 steps, and so on.
Here is a version, which shows how deep it looks:
def oneError (word: String) : Array[String] = {
  @annotation.tailrec
  def oneError (w1: String, w2: String, steps: Int, err: Int) : Boolean = w1.length match {
    case 0 => {print (s"($steps) "); err == 1}
    case _ => if (err > 1) {print (s"$steps "); false } else {
      if (w1(0) == w2(0)) oneError (w1.substring (1), w2.substring (1), steps + 1, err) else
        oneError (w1.substring (1), w2.substring (1), steps + 1, err + 1)
    }
  }
  // dict is assumed to be the preprocessed Map (word.length -> words of that length)
  val d = dict (word.length)
  println (s"Info: ${d.length} words of same length")
  d.filter (entry => oneError (word, entry, 0, 0))
}
Sample output, redacted:
scala> oneError ("fuck")
Info: 3352 words of same length
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 2 2 2 2 (4) 3 3 3 3 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 3 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 (4) (4) 3 3 3 3
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 2 2 2 2 2 2 2 2 3 3 (4) (4) 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
3 3 3 3 3 3 (4) 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
3 (4) (4) (4) (4) (4) (4) (4) (4) (4) (4) (4) (4) (4) (4) 3 2 2 2 2 2
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 2 2 2 2 2 2 2 2 (4) (4) 3 3 3 3 3 3 3 3 3 3 3 3 3 2 2 2 2 2 2
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
3 3 3 (4) 3 3 2 2 2 3 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 2 2 (4) 3 3 3 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
res53: Array[String] = Array(Buck, Huck, Puck, buck, duck, funk, luck, muck, puck, suck, tuck, yuck)
It sounds like you are looking for the edit distance of 1 of your pattern with respect to the dictionary entry. For example, an edit distance of 1 would result if the pattern is "adple" and your dictionary entry is "apple". You have an additional constraint that the pattern is the same length as the dictionary entry, but this is easy to implement.
I need to sort a four column matrix in Julia by the third column in ascending order then by the fourth column in descending order.
The easiest way to do chained lexicographic sorting on columns in an arbitrary order is to pass a transformation function via by: sortrows(A, by=x->(x[3],x[4])). But that's just lexicographic with both columns ascending. For fancier behavior, you can pass a custom comparison function to sortrows via lt:
julia> A = rand(1:3,6,4)
6x4 Array{Int64,2}:
3 1 1 2
1 1 3 1
1 1 2 1
2 1 3 3
1 3 3 1
2 3 2 3
julia> sortrows(A, lt=(x,y)->isless(x[3],y[3]) || (isequal(x[3],y[3]) && isless(y[4],x[4])))
6x4 Array{Int64,2}:
3 1 1 2
2 3 2 3
1 1 2 1
2 1 3 3
1 1 3 1
1 3 3 1