Below is the problem assignment, which is meant to be solved with tree recursion:
Maximum Subsequence
A subsequence of a number is a series of (not necessarily contiguous) digits of the number. For example, 12345 has subsequences that include 123, 234, 124, 245, etc. Your task is to get the maximum subsequence below a certain length.
def max_subseq(n, l):
    """
    Return the maximum subsequence of length at most l that can be found in the given number n.
    For example, for n = 20125 and l = 3, we have that the subsequences are
        2
        0
        1
        2
        5
        20
        21
        22
        25
        01
        02
        05
        12
        15
        25
        201
        202
        205
        212
        215
        225
        012
        015
        025
        125
    and of these, the maximum number is 225, so our answer is 225.
    >>> max_subseq(20125, 3)
    225
    >>> max_subseq(20125, 5)
    20125
    >>> max_subseq(20125, 6) # note that 20125 == 020125
    20125
    >>> max_subseq(12345, 3)
    345
    >>> max_subseq(12345, 0) # 0 is of length 0
    0
    >>> max_subseq(12345, 1)
    5
    """
    "*** YOUR CODE HERE ***"
There are two key insights for this problem:
You need to split into the cases where the ones digit is used and the one where it is not. In the case where it is, we want to reduce l since we used one of the digits, and in the case where it isn't we do not.
In the case where we are using the ones digit, you need to put the digit back onto the end, and the way to attach a digit d to the end of a number n is 10 * n + d.
I could not understand the two insights given for this problem:
split into the cases where the ones digit is used and the one where it is not
In the case where we are using the ones digit, you need to put the digit back onto the end
My understanding of this problem:
The solution seems to be to generate all subsequences of length up to l. In pseudocode:
digitSequence := strconv.Itoa(n) // "20125"
printSubSequence = func(digitSequence string, currenSubSequenceSize int) { // digitSequence is "20125" and currenSubSequenceSize is say 3
printNthSubSequence(digitSequence, currenSubSequenceSize) + printSubSequence(digitSequence, currenSubSequenceSize-1)
}
where printNthSubSequence prints subsequences for (20125, 3) or (20125, 2) etc...
Finding max_subseq among all these sequences then becomes easy
Can you help me understand the insights given in this problem, with an example (say 20125, 1)? Here is the complete question.
Something like this? As the instructions suggest, try it with and without the current digit:
function f(s, i, l){
  if (i + 1 <= l)
    return Number(s.substr(0, l));
  if (!l)
    return 0;
  return Math.max(
    // With
    Number(s[i]) + 10 * f(s, i - 1, l - 1),
    // Without
    f(s, i - 1, l)
  );
}
var input = [
  ['20125', 3],
  ['20125', 5],
  ['20125', 6],
  ['12345', 3],
  ['12345', 0],
  ['12345', 1]
];
for (let [s, l] of input){
  console.log(s + ', l: ' + l);
  console.log(f(s, s.length-1, l));
  console.log('');
}
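For comparison, here is a minimal Python sketch of the same with/without split applied directly to the integer, as the two hints describe. It assumes n is a non-negative integer and treats "no digits left" or "no length left" as contributing 0; it is one possible reading of the hints, not the official solution.

```python
def max_subseq(n, l):
    # Base case: no digits left to pick from, or no length left to use.
    if n == 0 or l == 0:
        return 0
    # Case 1: use the ones digit. Solve the rest with one less slot,
    # then attach the digit back onto the end via 10 * rest + digit.
    with_last = max_subseq(n // 10, l - 1) * 10 + n % 10
    # Case 2: skip the ones digit. The allowed length is unchanged.
    without_last = max_subseq(n // 10, l)
    return max(with_last, without_last)
```

For (20125, 1): using the 5 leaves length 0 for 2012, giving 0 * 10 + 5 = 5; skipping it recurses on (2012, 1), which yields 2. The maximum of the two is 5.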
The question "Possible Interview Question: How to Find All Overlapping Intervals" provides a solution for finding all the overlapping intervals. On top of that problem, imagine each interval has a weight. I want to find the summed weight of the overlapping intervals when a new interval is inserted.
Condition: The end value of a newly inserted interval is always larger than the end point of the previously inserted interval, so the end points are already sorted.
When a new interval and its weight are inserted, the summed weight of all overlapping intervals should be checked against the limit. For example, when we insert [15, 70] with weight 2, the summed weight over [15, 20] becomes 130, which should give an error since it exceeds limit = 128; otherwise, the newly inserted interval is appended to the list.
int limit = 128;
Inserted intervals in order:
order_come | start | end | weight
0 [10, 20] 32
1 [15, 25] 32
2 [5, 30] 32
3 [30, 40] 64
4 [1, 50] 16
5 [1, 60] 16
6 [15, 70] 2 <=should not append to the list.
Final overall summed weight view of the List after `[15, 70] 2` is inserted:
[60, 70, 2]
[50, 60, 18]
[40, 50, 34]
[30, 40, 98]
[25, 30, 66]
[20, 25, 98]
[15, 20, 130] <= exceeds the limit=128, throw an error.
[10, 15, 96]
[5, 10, 64]
[1, 5, 32]
[0, 0, 0]
Thank you for your valuable time and help.
O(log n)-time inserts are doable with an augmented binary search tree. To store
order_come | start | end | weight
0 [10, 20] 32
1 [15, 25] 32
2 [5, 30] 32
3 [30, 40] 64
4 [1, 50] 16
5 [1, 60] 16
we have a tree shaped like
25
/ \
/ \
10 50
/ \ / \
5 20 40 60
/ / /
1 15 30 ,
where each number represents the interval from it to its successor. Associated with each tree node are two numbers. The first we call ∆weight, defined to be the weight of the node's interval minus the weight of the interval of the node's parent, if a parent exists (otherwise minus zero). The second we call ∆max, defined to be the maximum weight of an interval corresponding to a descendant of the node, minus the node's weight.
For the above example,
interval | tree node | total weight | ∆weight | ∆max
[1, 5) 1 32 -32 0
[5, 10) 5 64 -32 0
[10, 15) 10 96 32 32
[15, 20) 15 128 32 0
[20, 25) 20 96 0 32
[25, 30) 25 64 64 64
[30, 40) 30 96 64 0
[40, 50) 40 32 16 64
[50, 60) 50 16 -48 80
[60, ∞) 60 0 -16 0
Binary search tree operations almost invariably require rotations. When we rotate a tree like
p c
/ \ / \
c r => l p
/ \ / \
l g g r
we modify
c.∆weight += p.∆weight
g.∆weight += c.∆weight
g.∆weight -= p.∆weight
p.∆weight -= c.∆weight
p.∆max = max(0, g.∆max + g.∆weight, r.∆max + r.∆weight)
c.∆max = max(0, l.∆max + l.∆weight, p.∆max + p.∆weight).
The point of the augmentations is as follows. To find the max weight in the tree, compute r.∆max + r.∆weight where r is the root. To increase every weight in a subtree by a given quantity ∂, increase the subtree root's ∆weight by ∂. By changing O(log n) nodes with inclusion-exclusion, we can increase a whole interval. Together, these operations allow us to evaluate an insertion in time O(log n).
To find the total weight of an interval, search for that interval as normal while also adding up the ∆weight values of that interval's ancestors. For example, to find the weight of [15, 20), we look for 15, traversing 25 (∆weight = 64), 10 (∆weight = 32), 20 (∆weight = 0), and 15 (∆weight = 32), for a total weight of 64 + 32 + 0 + 32 = 128.
To find the maximum total weight along a hypothetical interval, we do a modified search something like this. Using another modified search, compute the greatest tree value less than or equal to start (predstart; let predstart = -∞ if all tree values are greater than start) and pass it to this maxtotalweight.
maxtotalweight(root, predstart, end):
    if root is nil:
        return -∞
    if end <= root.value:
        return maxtotalweight(root.leftchild, predstart, end) + root.∆weight
    if predstart > root.value:
        return maxtotalweight(root.rightchild, predstart, end) + root.∆weight
    lmtw = maxtotalweight1a(root.leftchild, predstart)
    rmtw = maxtotalweight1b(root.rightchild, end)
    return max(lmtw, 0, rmtw) + root.∆weight
maxtotalweight1a(root, predstart):
    if root is nil:
        return -∞
    if predstart > root.value:
        return maxtotalweight1a(root.rightchild, predstart) + root.∆weight
    lmtw = maxtotalweight1a(root.leftchild, predstart)
    return max(lmtw, 0, root.rightchild.∆max + root.rightchild.∆weight) + root.∆weight
maxtotalweight1b(root, end):
    if root is nil:
        return -∞
    if end <= root.value:
        return maxtotalweight1b(root.leftchild, end) + root.∆weight
    rmtw = maxtotalweight1b(root.rightchild, end)
    return max(root.leftchild.∆max + root.leftchild.∆weight, 0, rmtw) + root.∆weight
We assume that nil has ∆weight = 0 and ∆max = -∞. Sorry for all of the missing details.
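When implementing the tree, a brute-force reference can help check results on small inputs. The sketch below is not the O(log n) structure described above; it simply recomputes the summed weight of every elementary segment on each insert. The names segment_weights and try_insert are made up for illustration, and intervals are assumed to be (start, end, weight) tuples whose segments are treated as half-open, matching the table above.

```python
def segment_weights(intervals):
    """Return (lo, hi, total_weight) for each segment between adjacent endpoints."""
    points = sorted({p for start, end, _ in intervals for p in (start, end)})
    segments = []
    for lo, hi in zip(points, points[1:]):
        # An interval contributes to a segment if it covers it entirely.
        total = sum(w for start, end, w in intervals if start <= lo and hi <= end)
        segments.append((lo, hi, total))
    return segments


def try_insert(intervals, new, limit=128):
    """Append `new` unless some segment's summed weight would exceed `limit`."""
    candidate = intervals + [new]
    if any(total > limit for _, _, total in segment_weights(candidate)):
        return False
    intervals.append(new)
    return True
```

With the six intervals from the question inserted in order, every insert succeeds (the peak is 128 on [15, 20)), and inserting (15, 70, 2) is rejected because that segment would reach 130.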
Using the terminology of the original answer, when you have
'1E 2E 3E ... (n-1)E nE'
end-points already sorted and your (n+1)st end-point is greater than all previous end-points, you only need to find intervals with an end-point value greater than the (n+1)st start-point (greater or equal in the case of closed intervals).
In other words, iterate over the intervals starting from the rightmost end-point and moving left until you reach an interval whose end-point is less than or equal to the (n+1)st start-point, keeping track of the sum of weights. Then check whether the sum fits within the limit. The worst-case time complexity is O(n), when all previous intervals have an end-point greater than the (n+1)st start-point.
I understand how heaps work but there is a problem I have no idea on how to solve.
Let's say you're given a max heap (not a BST),
[149 , 130 , 129 , 107 , 122 , 124 , 103 , 66 , 77 , 91 , 98 , 10 , 55 , 35 , 72]
Find a list of inputs that would give you the same heap structure such that each successive value is the largest it can possibly be, which would be:
[66 , 91 , 10 , 107 , 122 , 35 , 55 , 77 , 130 , 98 , 149 , 124 , 129 , 72 , 103]
So in other words, if you were going to insert 66 first then 91 then 10 then 107 and so on into an empty max heap, you would end up with the given heap structure after all of the bubbling up and so forth. How would you even find this input in the first place?
Can anyone suggest any ideas?
Thanks
Consider this max-heap (which I'll draw as a tree, but which represents [7, 6, 5, 4, 3, 1, 2]).
7
6 5
4 3 1 2
What's the last element that can be inserted? The last slot filled in the heap must be the bottom-right of the tree, and the bubbling-up procedure can only have touched elements along the route from that node to the top. So the last element inserted must be 7, 5 or 2. Not all of these are possible. If it was 7, then the tree must have looked like this before insertion (with _ representing the slot where we're going to insert before bubbling up):
5
6 2
4 3 1 _
which violates the heap constraint. If 5 were the last element to be inserted, then the heap would have looked like this:
7
6 2
4 3 1 _
This works, so 5 could have been the last thing inserted. Similarly, 2 could also have been the last thing inserted.
In general, an element along the path to the bottom-right-most node could have been the last thing inserted if all the nodes below it along the path are at least as large as the other child of its parent. In our example: 7 can't be the last thing inserted because 5 < 6. 5 can be the last thing inserted because 2 > 1. 2 can be the last thing inserted because it doesn't have any children.
With this observation, one can generate all input sequences (in reverse order) that could have resulted in the heap by recursion.
Here's some code that runs on the example you gave, and verifies that each input sequence it generates actually does generate the given heap. There are 226696 different inputs, but the program only takes a few seconds to run.
# children returns the two children of i. The first
# is along the path to n.
# For example: children(1, 4) == 4, 3
def children(i, n):
    i += 1
    n += 1
    b = 0
    while n > i:
        b = n & 1
        n //= 2
    return 2 * i + b - 1, 2 * i - b

# try_remove tries to remove the element i from the heap, on
# the assumption that it was the last thing inserted.
# It returns a new heap without that element if possible,
# and otherwise None.
def try_remove(h, i):
    h2 = h[:-1]
    n = len(h) - 1
    while i < n:
        c1, c2 = children(i, n)
        h2[i] = h[c1]
        if c2 < len(h) and h[c1] < h[c2]:
            return None
        i = c1
    return h2

# inputs generates all possible input sequences that could have
# generated the given heap.
def inputs(h):
    if len(h) <= 1:
        yield h
        return
    n = len(h) - 1
    while True:
        h2 = try_remove(h, n)
        if h2 is not None:
            for ins in inputs(h2):
                yield ins + [h[n]]
        if n == 0: break
        n = (n - 1) // 2

import heapq

# assert_inputs_give_heap builds a max-heap from the
# given inputs, and asserts it's equal to cs.
def assert_inputs_give_heap(ins, cs):
    # Build a heap from the inputs.
    # Python heaps are min-heaps, so we negate the items to emulate a max heap.
    h = []
    for i in ins:
        heapq.heappush(h, -i)
    h = [-x for x in h]
    if h != cs:
        raise AssertionError('%s != %s' % (h, cs))

cs = [149, 130, 129, 107, 122, 124, 103, 66, 77, 91, 98, 10, 55, 35, 72]
for ins in inputs(cs):
    assert_inputs_give_heap(ins, cs)
    print(ins)
Let's say I have a number x which is a power of two that means x = 2^i for some i.
So the binary representation of x has only one '1'. I need to find the index of that one.
For example, x = 16 (decimal)
x = 10000 (in binary)
here index should be 4. Is it possible to find the index in O(1) time by just using bit operation?
The following is an explanation of the logic behind the use of de Bruijn sequences in the O(1) code of the answers provided by @Juan Lopez and @njuffa (great answers by the way, you should consider upvoting them).
The de Bruijn sequence
Given an alphabet K and a length n, a de Bruijn sequence is a sequence of characters from K that contains (in no particular order) every permutation of length n [1]. For example, given the alphabet {a, b} and n = 3, the following is a list of all permutations (with repetition) of {a, b} with length 3:
[aaa, aab, aba, abb, baa, bab, bba, bbb]
To create the associated de Bruijn sequence we construct a minimum string that contains all these permutations without repetition when read cyclically (wrapping from the end back to the start); one such string is: babbbaaa
"babbbaaa" is a de Bruijn sequence for our alphabet K = {a, b} and n = 3. The notation for this is B(2, 3), where 2 is the size of K, also denoted k. The length of the sequence is k^n; in the previous example, k^n = 2^3 = 8.
How can we construct such a string? One method consists of building a directed graph where every node represents a permutation and has an outgoing edge for every letter in the alphabet; the transition from one node to another appends the edge letter to the right of the node and removes its leftmost letter. Once the graph is built, collect the edge letters along a Hamiltonian path over it to construct the sequence.
The graph for the previous example would be:
Then, take a Hamiltonian path (a path that visits each vertex exactly once):
Starting from node aaa and following each edge, we end up having:
(aaa) -> b -> a -> b -> b -> b -> a -> a -> a (aaa) = babbbaaa
We could have started from the node bbb in which case the obtained sequence would have been "aaababbb".
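The graph walk described above can be sketched in Python as a backtracking search. This is only an illustration under stated assumptions: the function name de_bruijn is made up, the walk always starts from the node made of the first letter repeated n times, and the search is exponential in the worst case, so it is meant for small k and n rather than as an efficient generator.

```python
def de_bruijn(alphabet, n):
    """Build a de Bruijn sequence by visiting every length-n word exactly once."""
    start = alphabet[0] * n
    total = len(alphabet) ** n

    def search(path, seen):
        last = path[-1]
        if len(path) == total:
            # All nodes visited: one more edge must lead back to the start.
            return last[1:] + start[-1] == start
        for letter in alphabet:
            nxt = last[1:] + letter  # drop leftmost letter, append edge letter
            if nxt not in seen:
                seen.add(nxt)
                path.append(nxt)
                if search(path, seen):
                    return True
                path.pop()
                seen.remove(nxt)
        return False

    path = [start]
    if not search(path, {start}):
        raise ValueError("no cycle found")
    # The sequence is the list of edge letters, including the closing edge.
    return "".join(node[-1] for node in path[1:]) + start[-1]


print(de_bruijn("ab", 3))  # babbbaaa
```

Starting from aaa with the alphabet ordered as "ab", the search reproduces the walk shown above.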
Now that the de Bruijn sequence is covered, let's use it to find the number of trailing zeroes in an integer.
The de Bruijn algorithm [2]
To find the number of trailing zeroes in an integer value, the first step in this algorithm is to isolate the first set bit from right to left. For example, given 848 (1101010000₂):
isolate rightmost bit
1101010000 ---------------------------> 0000010000
One way to do this is using x & (~x + 1); you can find more information on how this expression works in the Hacker's Delight book (chapter 2, section 2-1).
The question states that the input is a power of 2, so the rightmost bit is isolated from the beginning and no effort is required for that.
Once the bit is isolated (thus turning the value into a power of two), the second step consists of using a hash table, along with its hash function, to map the filtered input to its corresponding number of trailing zeroes. For example, applying the hash function h(x) to 0000010000₂ should return the index in the table that contains the value 4.
The algorithm proposes the use of a perfect hash function highlighting these properties:
the hash table should be small
the hash function should be easy to compute
the hash function should not produce collisions, i.e., h(x) ≠ h(y) if x ≠ y
To achieve this, we can use a de Bruijn sequence over the binary alphabet K = {0, 1}, with n = 6 if we want to solve the problem for 64-bit integers (there are 64 possible power-of-two values and 6 bits are required to count them all). B(2, 6) has length 2^6 = 64, so we need to find a de Bruijn sequence of length 64 that includes all permutations (with repetition) of binary digits with length 6 (000000₂, 000001₂, ..., 111111₂).
Using a program that implements a method like the one described above, you can generate a de Bruijn sequence that meets the requirement for 64-bit integers:
0000011111101101110101011110010110011010010011100010100011000010₂ = 7EDD5E59A4E28C2₁₆
The proposed hashing function for the algorithm is:
h(x) = (x * deBruijn) >> (k^n - n)
Where x is a power of two. For every possible power of two within 64 bits, h(x) returns a corresponding binary permutation, and we need to associate every one of these permutations with the number of trailing zeroes to fill the table. For example, if x is 16 = 10000₂, which has 4 trailing zeroes, we have:
h(16) = (16 * 0x7EDD5E59A4E28C2) >> 58
= 9141566557743385632 >> 58
= 31 (011111b)
So, at index 31 of our table, we store 4. As another example, take 256 = 100000000₂, which has 8 trailing zeroes:
h(256) = (256 * 0x7EDD5E59A4E28C2) >> 58
= 17137856407927308800 (due to overflow) >> 58
= 59 (111011b)
At index 59, we store 8. We repeat this process for every power of two until the table is full. Generating the table manually is unwieldy; you should use a program like the one found here for this endeavour.
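As a rough illustration of that table-filling step, the loop below applies h(x) to every power of two and records the bit index at the resulting slot. It is a sketch, not the program the answer links to; the names DEBRUIJN, MASK and build_table are made up here, and the mask emulates 64-bit unsigned overflow because Python integers are unbounded.

```python
DEBRUIJN = 0x07EDD5E59A4E28C2  # the 64-bit de Bruijn constant used in this answer
MASK = (1 << 64) - 1           # keeps products within 64 bits, like uint64_t


def build_table():
    table = [0] * 64
    for i in range(64):
        # h(2^i): multiply (i.e. shift left by i), wrap to 64 bits, keep top 6 bits.
        h = (((1 << i) * DEBRUIJN) & MASK) >> 58
        table[h] = i
    return table


table = build_table()
print(table[31], table[59])  # 4 8
```

The two printed entries match the worked examples for 16 and 256 above.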
At the end we'd end up with the following table:
int table[] = {
63, 0, 58, 1, 59, 47, 53, 2,
60, 39, 48, 27, 54, 33, 42, 3,
61, 51, 37, 40, 49, 18, 28, 20,
55, 30, 34, 11, 43, 14, 22, 4,
62, 57, 46, 52, 38, 26, 32, 41,
50, 36, 17, 19, 29, 10, 13, 21,
56, 45, 25, 31, 35, 16, 9, 12,
44, 24, 15, 8, 23, 7, 6, 5
};
And the code to calculate the required value:
// Assumes that x is a power of two; returns the index of its single
// set bit, which equals the number of trailing zeroes.
int numTrailingZeroes(uint64_t x) {
    return table[(x * 0x7EDD5E59A4E28C2ull) >> 58];
}
What guarantees that we are not missing an index for a power of two due to a collision?
The hash function extracts one 6-bit window of the de Bruijn sequence for each power of two. Multiplying by x is just a left shift (multiplying a number by a power of two is the same as left-shifting it), and the right shift by 58 then isolates the top 6 bits. Because every 6-bit pattern occurs exactly once in the de Bruijn sequence, two different values of x never collide, which is the third property required of the hash function.
References:
[1] De Bruijn Sequences - http://datagenetics.com/blog/october22013/index.html
[2] Using de Bruijn Sequences to Index a 1 in a Computer Word - http://supertech.csail.mit.edu/papers/debruijn.pdf
[3] The Magic Bitscan - http://elemarjr.net/2011/01/09/the-magic-bitscan/
The specifications of the problem are not entirely clear to me. For example, which operations count as "bit operations" and how many bits make up the input in question? Many processors have a "count leading zeros" or "find first bit" instruction exposed via intrinsic that basically provides the desired result directly.
Below I show how to find the bit position in a 32-bit integer based on a de Bruijn sequence.
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

/* find position of 1-bit in a = 2^n, 0 <= n <= 31 */
int bit_pos (uint32_t a)
{
    static int tab[32] = { 0,  1,  2,  6,  3, 11,  7, 16,
                           4, 14, 12, 21,  8, 23, 17, 26,
                          31,  5, 10, 15, 13, 20, 22, 25,
                          30,  9, 19, 24, 29, 18, 28, 27};
    // return tab [0x04653adf * a >> 27];
    // Same multiplication, written out as a sum of shifts:
    return tab [(a + (a << 1) + (a << 2) + (a << 3) + (a << 4) + (a << 6) +
                 (a << 7) + (a << 9) + (a << 11) + (a << 12) + (a << 13) +
                 (a << 16) + (a << 18) + (a << 21) + (a << 22) + (a << 26))
                >> 27];
}

int main (void)
{
    uint32_t nbr;
    int pos = 0;
    while (pos < 32) {
        nbr = 1U << pos;
        if (bit_pos (nbr) != pos) {
            printf ("!!!! error: nbr=%08x bit_pos=%d pos=%d\n",
                    nbr, bit_pos(nbr), pos);
            return EXIT_FAILURE;
        }
        pos++;
    }
    return EXIT_SUCCESS;
}
You can do it in O(1) if you allow a single memory access:
#include <iostream>
using namespace std;
int indexes[] = {
63, 0, 58, 1, 59, 47, 53, 2,
60, 39, 48, 27, 54, 33, 42, 3,
61, 51, 37, 40, 49, 18, 28, 20,
55, 30, 34, 11, 43, 14, 22, 4,
62, 57, 46, 52, 38, 26, 32, 41,
50, 36, 17, 19, 29, 10, 13, 21,
56, 45, 25, 31, 35, 16, 9, 12,
44, 24, 15, 8, 23, 7, 6, 5
};
int main() {
    unsigned long long n;
    while (cin >> n) {
        cout << indexes[((n & (~n + 1)) * 0x07EDD5E59A4E28C2ull) >> 58] << endl;
    }
}
It depends on your definitions. First let's assume there are n bits, because if we assume there is a constant number of bits then everything we could possibly do with them is going to take constant time so we could not compare anything.
First, let's take the widest possible view of "bitwise operations" - they operate on bits but not necessarily pointwise, and furthermore we'll count operations but not include the complexity of the operations themselves.
M. L. Fredman and D. E. Willard showed that there is an algorithm of O(1) operations to compute lambda(x) (the floor of the base-2 logarithm of x, so the index of the highest set bit). It contains quite some multiplications though, so calling it bitwise is a bit funny.
On the other hand, there is an obvious algorithm using O(log n) operations and no multiplications: just binary search for the bit. But we can do better: Gerth Brodal showed that it can be done in O(log log n) operations (and none of them are multiplications).
All the algorithms I referenced are in The Art of Computer Programming 4A, bitwise tricks and techniques.
None of these really qualify as finding that 1 in constant time, and it should be obvious that you can't do that. The other answers don't qualify either, despite their claims. They're cool, but they're designed for a specific constant number of bits, so any naive algorithm would also be O(1) (trivially, because there is no n to depend on). In a comment the OP said something that implied he actually wanted that, but it doesn't technically answer the question.
And the answer is ... ... ... ... ... yes!
Just for fun, since you commented below one of the answers that i up to 20 would suffice.
(multiplications here are by either zero or one)
#include <iostream>
using namespace std;
int f(int n){
    return
        0 | !(n ^ 1048576) * 20
          | !(n ^ 524288) * 19
          | !(n ^ 262144) * 18
          | !(n ^ 131072) * 17
          | !(n ^ 65536) * 16
          | !(n ^ 32768) * 15
          | !(n ^ 16384) * 14
          | !(n ^ 8192) * 13
          | !(n ^ 4096) * 12
          | !(n ^ 2048) * 11
          | !(n ^ 1024) * 10
          | !(n ^ 512) * 9
          | !(n ^ 256) * 8
          | !(n ^ 128) * 7
          | !(n ^ 64) * 6
          | !(n ^ 32) * 5
          | !(n ^ 16) * 4
          | !(n ^ 8) * 3
          | !(n ^ 4) * 2
          | !(n ^ 2);
}
int main() {
    for (int i=1; i<1048577; i <<= 1){
        cout << f(i) << " "; // 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
    }
}
Is the code C = {00, 11, 0101, 111, 1010, 100100, 0110} uniquely decodeable?
My answer is no, because according to Sardinas–Patterson algorithm:
C1 = {1}
C2 = {1, 11, 010, 00100}
So C2 ∩ C = {11}, which means C is not a uniquely decodable code.
Am I right about this?
You are correct that this code is not uniquely decodable.
Consider the string 111111: it can be parsed as 11 11 11 or as 111 111.
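The Sardinas–Patterson test used in the question can also be run mechanically. The sketch below is one straightforward way to write it, assuming codewords are given as Python strings; the helper names dangling and is_uniquely_decodable are made up for illustration.

```python
def dangling(prefixes, words):
    """Non-empty suffixes w such that p + w == word for some p, word."""
    return {w[len(p):] for p in prefixes for w in words
            if w.startswith(p) and len(w) > len(p)}


def is_uniquely_decodable(code):
    code = set(code)
    current = dangling(code, code)   # C1: leftovers between pairs of codewords
    seen = set()
    while current:
        if current & code:           # a dangling suffix is itself a codeword
            return False
        key = frozenset(current)
        if key in seen:              # the sets started repeating: no conflict
            return True
        seen.add(key)
        # C(i+1): leftovers between codewords and the current dangling suffixes
        current = dangling(code, current) | dangling(current, code)
    return True


C = {"00", "11", "0101", "111", "1010", "100100", "0110"}
print(is_uniquely_decodable(C))  # False
```

For this C the first set is {1} and the second is {1, 11, 010, 00100}, which contains the codeword 11, in agreement with the computation in the question.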