How to convert an array of integers into a single integer - algorithm

Using pseudo-code, if I have an array of integers, how can I make a single big integer that represents the same array in bits?
Example of input (using bits):
[10101, 10001, 00010, 01100]
The integer should be:
10101100010001001100
or
01100000101000110101

number = 0
for each element e
    number *= 1 + maximumRepresentableNumber
    number += e
For your example, maximumRepresentableNumber will be 11111, as that is the maximum number we can represent using the allowed number of bits (5). Adding 1 to that gives us 100000, and, if we multiply by that, it will be equivalent to a bit-shift by 5 to the left.
This would work for decimal representation as well, i.e. [123, 55, 29] will return 123055029. In this case maximumRepresentableNumber will be 999, so we'd just be multiplying by 1000.
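The pseudo-code above can be sketched in Python as follows. The function name `pack` and its range check are assumptions made for this illustration, not part of the original answer.

```python
def pack(values, max_representable):
    """Concatenate values, treating each one as a digit in base max_representable + 1."""
    base = max_representable + 1
    number = 0
    for e in values:
        # Assumed guard: every element must fit in the chosen width.
        if not 0 <= e <= max_representable:
            raise ValueError(f"{e} does not fit in the chosen width")
        number = number * base + e
    return number


print(bin(pack([0b10101, 0b10001, 0b00010, 0b01100], 0b11111)))  # 0b10101100010001001100
print(pack([123, 55, 29], 999))  # 123055029
```

Multiplying by a power of two is the same as a left shift, so the binary case could use `<<` and `|` instead.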

What you are looking for is well known in functional programming land as fold or reduce. The basic idea is, that in a list
a,b,c,d, ..., x
we replace the commas with an operation we want (the operation being symbolized by $ here):
a $ (b $ (c $ (d $ ... (x $ Z)))) // right fold
and the empty list with some default value Z
A bit different is the left fold, where we start out with Z:
((((Z $ a) $ b) $ c) ... $ x)
The general imperative algorithm for left fold would be:
result = Z
for each e in list do result = result $ e
Now, the only problem left is to identify $ and Z, that is the function we want to apply subsequently to all list elements to reach the goal and the starting value. In your case, what you want is either:
append the stringified element to the result string. Z is the empty string.
or: add the element to the result so far multiplied with 2^5. Z would be 0.
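Both choices of $ and Z can be tried with `functools.reduce`, which is Python's left fold. This is only a sketch; the fixed width of 5 bits is taken from the question's example and the variable names are made up here.

```python
from functools import reduce

elements = [0b10101, 0b10001, 0b00010, 0b01100]
WIDTH = 5  # assumed fixed bit width per element, as in the example

# Numeric fold: Z = 0, result $ e = result * 2**WIDTH + e
as_int = reduce(lambda acc, e: acc * 2**WIDTH + e, elements, 0)

# String fold: Z = "", result $ e = result + zero-padded binary text of e
as_text = reduce(lambda acc, e: acc + format(e, f"0{WIDTH}b"), elements, "")

print(bin(as_int))  # 0b10101100010001001100
print(as_text)      # 10101100010001001100
```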

In ruby:
answer = 0
[0b10101, 0b10001, 0b00010, 0b01100].each do |x|
  answer <<= 5
  answer |= x
end
puts answer.to_s(2) # 10101100010001001100

In Python this would be:
a = [0b10101, 0b10001, 0b00010, 0b01100]
width = 5  # fixed bits per element; elem.bit_length() would drop leading zeros
b = 0
for elem in a:
    b <<= width
    b += elem
print(bin(b))  # 0b10101100010001001100

Related

Given integers X and Y, how do you find the largest permutation of X that is less than or equal to Y?

Given two positive integers X and Y, find the largest permutation of X
that is less than or equal to Y. Return the largest permutation that is
less than or equal to Y as an integer. If there is no permutation of X
that is less than or equal to Y, return -1.
Example 1:
Input: X = 123, Y = 321
Output: 321
Example 2:
Input: X = 1733, Y = 3311
Output: 3173
Example 3:
Input: X = 999, Y = 111
Output: -1
Got this problem for an online assessment earlier yesterday, couldn't find an efficient solution for it and have been thinking about it but still can't think of the right approach. I first tried greedy, in which I would iterate Y from left to right and I create a permutation of X by appending the largest digit in X that is less than or equal to the digit in Y. But for X = 1733 and Y = 3311, my implementation would return -1 because the greedy algorithm rearranged X to 3317. So I turned to recursion, but as you'd expect this very quickly reached stack limit.
I've read this thread that seems to discuss a similar problem, but I believe the top solution fails for example 2. How do you approach this problem?
A recursive solution.
Sort the digits of X in decreasing order. Then, as long as you find no solution,
take in turn every digit of X that is not larger than the leading digit of Y:
if the two digits are equal, recurse on X less this digit and the tail of Y;
if the digit of X is smaller (or X is empty), you are done: the remaining digits of X follow in decreasing order;
if there is no such digit, you have reached a dead end.
This works because you are trying the permutations of X by decreasing value.
321 vs. 321
3 21 vs. 3 21
21 vs. 21
1 vs. 1
Done
7331 vs. 3311
3 731 vs. 3 311
3 71 vs. 3 11
1 7 vs. 1 1
Dead end
1 73 vs. 3 11
Done
999 vs. 111
Dead end
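The recursive procedure traced above might look like this in Python. It is a sketch under two assumptions that the answer does not state: a length mismatch is handled up front (as another answer suggests), and a result with a leading zero is simply converted with `int`. The names `largest_perm_le` and `solve` are made up for the example.

```python
def largest_perm_le(x, y):
    digits = sorted(str(x), reverse=True)  # digits of X, largest first
    target = str(y)
    if len(digits) > len(target):
        return -1                          # every permutation is too long
    if len(digits) < len(target):
        return int("".join(digits))        # every permutation fits

    def solve(pool, tail):
        """Return the best digit string built from pool that is <= tail, or None."""
        if not pool:
            return ""
        tried = set()
        for i, d in enumerate(pool):
            if d > tail[0] or d in tried:
                continue
            tried.add(d)
            rest = pool[:i] + pool[i + 1:]
            if d < tail[0]:
                # Strictly smaller here: the rest can be as large as possible.
                return d + "".join(rest)
            sub = solve(rest, tail[1:])
            if sub is not None:
                return d + sub
        return None                        # dead end

    result = solve(digits, target)
    return -1 if result is None else int(result)


print(largest_perm_le(123, 321))    # 321
print(largest_perm_le(1733, 3311))  # 3173
print(largest_perm_le(999, 111))    # -1
```

Because only one distinct digit can equal the leading digit of Y, each level recurses at most once, so the depth is bounded by the number of digits.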
A non-recursive efficient solution, hinted at by @Stef.
The permutations of X can be ordered increasingly by sorting the digits, then picking every first digit in turn and recursing on the remaining ones. This establishes a bijection between the permutations and the integers in [0, d!) for d digits.
For an integer m, you can retrieve the corresponding permutation using a conversion from the factorial basis (take the quotient by (d-1)! and proceed recursively with the remainder). This takes d operations, and you can compare the permutation to Y in O(d) operations.
Now just implement a dichotomic search on the d! permutations, which takes O(d·log(d!)) = O(d²·log(d)) operations.
Update: the second solution only works for distinct digits; otherwise the permutations do not yield increasing numbers. I hope that there is a workaround.
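A minimal sketch of the factorial-basis idea, valid only under the restriction stated in the update (all digits of X distinct). The helper names are made up for the example, and the length checks are an added assumption.

```python
from math import factorial


def nth_permutation(sorted_digits, m):
    """Return permutation number m (0-based, increasing order) of distinct sorted digits."""
    pool = list(sorted_digits)
    out = []
    for k in range(len(pool) - 1, -1, -1):
        q, m = divmod(m, factorial(k))  # next digit of m in the factorial basis
        out.append(pool.pop(q))
    return "".join(out)


def largest_perm_distinct(x, y):
    digits = sorted(str(x))             # assumed: no repeated digits
    target = str(y)
    total = factorial(len(digits))
    if len(digits) > len(target):
        return -1
    if len(digits) < len(target):
        return int(nth_permutation(digits, total - 1))
    lo, hi = 0, total                   # find the first rank whose permutation exceeds Y
    while lo < hi:
        mid = (lo + hi) // 2
        if nth_permutation(digits, mid) <= target:
            lo = mid + 1
        else:
            hi = mid
    return -1 if lo == 0 else int(nth_permutation(digits, lo - 1))


print(largest_perm_distinct(123, 321))  # 321
print(largest_perm_distinct(132, 200))  # 132
```

Comparing equal-length digit strings lexicographically is the same as comparing the numbers, which is why the binary search can work on strings.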
If X has more digits then there is no solution. If Y has more digits then a descending sort of the digits of X is the solution. Assuming X and Y have the same number of digits:
Put the digits of X in a counting hash.
For each digit of Y, going from most to least significant (left to right), take the max digit of X that isn't greater than it and use that in your permutation.
If you ever place a digit lower than its counterpart in Y, place all remaining digits in descending order.
If there ever isn't a non-greater digit available then do the following: repeatedly unwind your prior move until you get to a digit where a lower digit was available. Select the max such lower digit. Then, all remaining digits can be placed in descending order from the map. If there is no such digit (where a lower digit could have been chosen) then there is no solution.
If you get through all the digits then you've produced the max solution.
This is linear in the number of digits if this is limited to base 10. If your base can vary, this is O(num_digits * base)
Here's Ruby code for this.
def get_perm(x, y)
  # hist keeps a count of each of the digits of x
  hist = Hash.new 0; x.digits.each { |d| hist[d] += 1 }
  # output_digits is the answer we're building
  output_digits = []
  y_digits = y.digits
  x_digits = x.digits
  # If x has fewer digits then all permutations are good so pick the largest
  if x.digits.length < y.digits.length
    9.downto(0) do |digit|
      output_digits += [digit] * hist[digit]
    end
    return output_digits
  end
  # If y has fewer digits then no permutation is good, return -1
  if y.digits.length < x.digits.length
    return -1
  end
  # parse the digits of y
  (y_digits.length - 1).downto(0) do |i|
    cur_y_digit = y_digits[i]
    # use the current digit of y if possible
    if hist[cur_y_digit] > 0
      hist[cur_y_digit] -= 1
      output_digits.append(cur_y_digit)
      return output_digits if i == 0
    # otherwise, use the largest smaller digit available if possible
    else
      (cur_y_digit - 1).downto(0) do |smaller_digit|
        if hist[smaller_digit] > 0
          # place the smaller digit, then all remaining digits in descending order
          hist[smaller_digit] -= 1
          output_digits.append(smaller_digit)
          9.downto(0) do |digit|
            output_digits += [digit] * hist[digit]
          end
          return output_digits
        end
      end
      # If we make it here then no digit was available; we need to unwind moves until we
      # can replace a digit of our solution with a smaller digit
      smallest_digit = hist.keys.min
      while i < (y.digits.length - 1) do
        i += 1
        cur_y_digit = y_digits[i]
        cur_unwound_digit = output_digits.pop
        hist[cur_unwound_digit] += 1
        smallest_digit = [smallest_digit, cur_unwound_digit].min
        if cur_y_digit > smallest_digit
          (cur_y_digit - 1).downto(smallest_digit) do |d|
            if hist[d] >= 1
              output_digits.append(d)
              hist[d] -= 1
              9.downto(0) do |digit|
                output_digits += [digit] * hist[digit]
              end
              return output_digits
            end
          end
        end
      end
      return -1
    end
  end
end
Outputs for OP sample cases:
> get_perm(123, 321)
=> [3, 2, 1]
> get_perm(1733, 3311)
=> [3, 1, 7, 3]
> get_perm(999, 111)
=> -1
If Z is the answer, and the numbers have n digits, you can show that there is an index i such that Z[:i] = Y[:i], Z[i]<Y[i], and Z[i+1:] is as large as possible given digits of X \ Z[:i+1] (I use python array slice notation, and the last expression means "the set of digits of X minus those already chosen in Z up to i+1").
Given this, you can easily loop over each candidate i and efficiently check whether it is feasible to choose such an i as above. The answer is the one with the largest feasible i.
The solution should be O(n*log(n)).
I'll leave the proof and implementation details, as I understand it's a homework :)
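One possible reading of this prefix argument, sketched in Python. The function name, the use of `collections.Counter`, and the handling of unequal lengths are assumptions; the answer itself leaves the implementation open.

```python
from collections import Counter


def largest_perm_by_prefix(x, y):
    xs, ys = str(x), str(y)
    if len(xs) > len(ys):
        return -1
    if len(xs) < len(ys):
        return int("".join(sorted(xs, reverse=True)))
    if sorted(xs) == sorted(ys):
        return y                              # Y itself is a permutation of X

    remaining = Counter(xs)
    best = None                               # (i, digit placed at i) for the largest feasible i
    for i, yd in enumerate(ys):
        smaller = [d for d, c in remaining.items() if c > 0 and d < yd]
        if smaller:
            best = (i, max(smaller))          # Z[:i] == Y[:i] and Z[i] < Y[i] is feasible
        if remaining[yd] == 0:
            break                             # the prefix Y[:i+1] cannot be matched
        remaining[yd] -= 1
    if best is None:
        return -1

    i, d = best
    rest = Counter(xs)
    rest.subtract(ys[:i])                     # digits consumed by the shared prefix
    rest[d] -= 1
    tail = "".join(sorted(rest.elements(), reverse=True))
    return int(ys[:i] + d + tail)


print(largest_perm_by_prefix(1733, 3311))  # 3173
```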

How can I generate de Bruijn sequences iteratively?

I am looking for a way to generate a de Bruijn sequence iteratively instead of with recursion. My goal is to generate it character by character.
I found some example code in Python for generating de Bruijn sequences and translated it into Rust. I am not yet able to comprehend this technique well enough to create my own method.
Translated into Rust:
fn gen(sequence: &mut Vec<usize>, a: &mut [usize], t: usize, p: usize, k: usize, n: usize) {
    if t > n {
        if n % p == 0 {
            for x in 1..(p + 1) {
                sequence.push(a[x])
            }
        }
    } else {
        a[t] = a[t - p];
        gen(sequence, a, t + 1, p, k, n);
        for x in (a[t - p] + 1)..k {
            a[t] = x;
            gen(sequence, a, t + 1, t, k, n);
        }
    }
}
fn de_bruijn<T: Clone>(alphabet: &[T], n: usize) -> Vec<T> {
    let k = alphabet.len();
    let mut a = vec![0; n + 1];
    let vecsize = k.checked_pow(n as u32).unwrap();
    let mut sequence = Vec::with_capacity(vecsize);
    gen(&mut sequence, &mut a, 1, 1, k, n);
    sequence.into_iter().map(|x| alphabet[x].clone()).collect()
}
However this is not able to generate iteratively - it goes through a whole mess of recursion and iteration which is impossible to untangle into a single state.
Consider this approach:
Choose the first (lexicographically) representative from every necklace class
Here is Python code for generation of representatives for (binary) necklaces containing d ones (it is possible to repeat for all d values). Sawada article link
Sort representatives in lexicographic order
Make periodic reduction for every representative (if possible): if string is periodic s = p^m like 010101, choose 01
To find the period, it is possible to use string doubling or z-algorithm (I expect it's faster for compiled languages)
Concatenate reductions
Example for n=3,k=2:
Sorted representatives: 000, 001, 011, 111
Reductions: 0, 001, 011, 1
Result: 00010111
The same essential method (with C code) is described in Jörg Arndt's book "Matters Computational", chapter 18
A similar way is mentioned in the wiki
An alternative construction involves concatenating together, in
lexicographic order, all the Lyndon words whose length divides n
You might look for effective way to generate appropriate Lyndon words
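The Lyndon-word construction quoted above lends itself to an iterative, symbol-by-symbol generator. Here is a minimal sketch in Python, assuming the alphabet is 0..k-1 and using Duval's algorithm to enumerate the Lyndon words of length at most n in lexicographic order. The function name is made up for this example.

```python
def de_bruijn_symbols(k, n):
    """Yield the symbols of a de Bruijn sequence B(k, n) one at a time, without recursion."""
    w = [-1]
    while w:
        w[-1] += 1                    # w is now the next Lyndon word
        if n % len(w) == 0:
            yield from w              # keep only words whose length divides n
        m = len(w)
        while len(w) < n:
            w.append(w[-m])           # extend periodically up to length n
        while w and w[-1] == k - 1:
            w.pop()                   # drop trailing maximal symbols


print(list(de_bruijn_symbols(2, 3)))  # [0, 0, 0, 1, 0, 1, 1, 1]
```

The only state between symbols is the list `w`, so the same loop should translate to a Rust iterator that owns a `Vec<usize>`.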
I am not familiar with Rust, so I programmed and tested it in Python. Since the poster translated the version in the question from a Python program, I hope it will not be a big issue.
# the following function treats list a as
# k-adic number with n digits
# and increments this number returning
# the index of the leftmost digit changed
def increment_a(a, k, n):
    digit= n-1
    a[digit]+= 1
    while a[digit] >= k and digit> 0:
        #a[digit]= 0
        a[digit]= a[0]+1
        a[digit-1]+= 1
        digit-= 1
    return digit

# the following function adds a to the sequence
# and takes into account, that the beginning of a
# could overlap with the end of sequence
# in that case, it just removes the overlapping digits
# from a before adding the remaining digits to sequence
def append_to_sequence(sequence, a, n):
    # here we can assume safely, that a
    # does not overlap completely with sequence[-n:]
    i= -1
    for i in range(n-1, -1, -1):
        found= True
        # check if the last i digits in sequence
        # overlap with the first i digits in a
        for j in range(i):
            if a[j] != sequence[-i+j]:
                # no, they don't overlap
                found= False
                break
        if found:
            # yes they overlap, so no need to
            # continue the check with a smaller i
            break
    # now we can just append everything from
    # digit i (digit 0 - i-1 are swallowed)
    sequence.extend(a[i:])
    return n-i

# during the operation we have to keep track of
# the k-adic numbers a, that already occurred in
# the sequence. We store them in a set called used
# every time we add something to the sequence
# we have to update it and add one entry for each
# digit inserted
def update_used(sequence, used, n, num_inserted):
    l= len(sequence)
    for i in range(num_inserted):
        used.add(tuple(sequence[-n-i:l-i]))

# the main work is done in the following function
# it creates and returns the generated sequence
def gen4(k, n):
    a= [0]*n
    sequence= a[:]
    used= set()
    # create a fake sequence to add the segments obtained by the cyclic nature
    fake= ([k-1] * (n-1))
    for i in range(n-1):
        fake.append(0)
        update_used(fake, used, n, 1)
    update_used(sequence, used, n, 1)
    valid= True
    while valid:
        # a is still a valid k-adic number
        # this means the generation process
        # has not ended
        # so construct a new number from the n-1
        # last digits of sequence
        # followed by a zero
        a= sequence[-n+1:]
        a.append(0)
        while valid and tuple(a) in used:
            # the constructed k-adic number a
            # was already used, so increment it
            # and try again
            increment_a(a, k, n)
            valid= a[0]<k
        if valid:
            # great, the number is still valid
            # and is not yet part of the sequence
            # so add it after removing the overlapping
            # digits and update the set with the segments
            # we already used
            num_inserted= append_to_sequence(sequence, a, n)
            update_used(sequence, used, n, num_inserted)
    return sequence
I tested the code above by generating some sequences with the original version of gen and this one using the same parameters. For all sets of parameters I tested, the result was the same in both versions.
Please note that this code is less efficient than the original version, especially if the sequence gets long. I guess the cost of the set operations has a non-linear influence on the runtime.
If you like, you can improve it further such as by using a more efficient way to store the used segments. Instead of operating on the k-adic representation (the a-list), you could use a multidimensional array instead.

Remove the inferior digits of a number

Given a number n of x digits, how can y digits be removed so that the remaining digits form the greatest possible number?
Examples:
1)x=7 y=3
n=7816295
-8-6-95
=8695
2)x=4 y=2
n=4213
4--3
=43
3)x=3 y=1
n=888
=88
Just to state: x > y > 0.
For each digit to remove: iterate through the digits left to right; if you find a digit that's less than the one to its right, remove it and stop, otherwise remove the last digit.
If the number of digits x is greater than the actual length of the number, it means there are leading zeros. Since those will be the first to go, you can simply reduce the count y by a corresponding amount.
Here's a working version in Python:
def remove_digits(n, x, y):
    s = str(n)
    if len(s) > x:
        raise ValueError
    elif len(s) < x:
        y -= x - len(s)
    if y <= 0:
        return n
    for r in range(y):
        for i in range(len(s)):
            if s[i] < s[i+1:i+2]:
                break
        s = s[:i] + s[i+1:]
    return int(s)
>>> remove_digits(7816295, 7, 3)
8695
>>> remove_digits(4213, 4, 2)
43
>>> remove_digits(888, 3, 1)
88
I hesitated to submit this, because it seems too simple. But I wasn't able to think of a case where it wouldn't work.
If x = y, we have to remove all the digits.
Otherwise, you need to find the maximum digit among the first y + 1 digits. Then remove all y0 digits that precede this maximum digit, add the maximum to the answer, and repeat the task on the rest of the number, where you now need to remove y - y0 digits.
A straightforward implementation will work in O(x^2) time in the worst case.
But finding the maximum in a given range can be done efficiently using a Segment Tree data structure. The time complexity will then be O(x * log(x)) in the worst case.
P.S. I just realized that it is also possible to solve this in O(x), using the fact that there are only 10 digits (though the algorithm may be a little more complicated). We need to find the maximum in a given range [L, R], but the ranges in this task only move from left to right (L and R always increase). So we just need to store 10 pointers (one per digit value), each pointing to the first position >= L in the number that holds that digit. Then, to find the maximum, we only need to check the 10 pointers. To update the pointers, we move them to the right.
So the time complexity will be O(10 * x) = O(x)
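The straightforward O(x^2) version of this window idea could be sketched as follows in Python. The function name is made up, and the result is returned as a string so that the x = y case can yield an empty result.

```python
def largest_after_removal(n, y):
    s = str(n)
    keep = len(s) - y                 # number of digits that survive
    out = []
    start = 0
    for _ in range(keep):
        window = s[start:start + y + 1]
        best = max(window)            # largest digit reachable with y removals left
        skipped = window.index(best)  # y0: digits dropped before the leftmost maximum
        out.append(best)
        y -= skipped
        start += skipped + 1
    return "".join(out)


print(largest_after_removal(7816295, 3))  # 8695
print(largest_after_removal(4213, 2))     # 43
print(largest_after_removal(888, 1))      # 88
```

Replacing `max(window)` with a segment-tree query or the ten-pointer trick described above would give the faster variants.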
Here's an O(x) solution. It builds an index that maps (i, d) to j, the smallest number > i such that the j'th digit of n is d. With this index, one can easily find the largest possible next digit in the solution in O(1) time.
def index(digits):
    # nxt[d] is the first position at or after the current one holding digit d
    nxt = [len(digits)+1] * 10
    for i in range(len(digits), 0, -1):
        nxt[ord(digits[i-1])-ord('0')] = i-1
        yield nxt[::-1]

def minseq(n, y):
    n = str(n)
    idx = list(index(n))[::-1]
    i, r = 0, []
    for ry in range(len(n)-y):
        # largest digit whose next occurrence is still within reach
        i = next(j for j in idx[i] if j <= y+ry) + 1
        r.append(n[i - 1])
    return ''.join(r)

print(minseq(7816295, 3))
print(minseq(4213, 2))
Pseudocode:
Number.toDigits().filter (sortedSet (Number.toDigits()). take (y))
Imho you don't need to know x.
For efficiency, Number.toDigits () could be precalculated
digits = Number.toDigits()
digits.filter (sortedSet (digits).take (y))
Depending on language and context, you either output the digits and are done or have to convert the result into a number again.
Working Scala-Code for example:
def toDigits (l: Long) : List [Long] = if (l < 10) l :: Nil else (toDigits (l /10)) :+ (l % 10)
val num = 734529L
val dig = toDigits (num)
dig.filter (_ > ((dig.sorted).take(2).last))
A sorted set is a set which is sorted, which means, every element is only contained once and then the resulting collection is sorted by some criteria, for example numerical ascending. => 234579.
We take two of them (23) and from that subset the last (3) and filter the number by the criteria, that the digits have to be greater than that value (3).
Your question does not explicitly say, that each digit is only contained once in the original number, but since you didn't give a criterion, which one to remove in doubt, I took it as an implicit assumption.
Other languages may of course have other expressions (x.sorted, x.toSortedSet, new SortedSet (num), ...) or lack certain classes, functions, which you would have to build on your own.
You might need to write your own filter method, which takes a predicate P and a collection C, and returns a new collection of all elements which satisfy P, P being a method which takes one T and returns a Boolean. Very useful stuff.

Can we write an algorithm which gives me two whole numbers X and Y when I want to get a desired fraction F such that F= X/Y?

I am preparing a test data set in which I have to check rounding. Suppose I want to check whether round, round_up and round_down work correctly at the 10th decimal place.
Then
if, X=100 and Y = 54 so, X/Y = 1.8518518518518518518518518518519 (test round equidistant)
if, X= 10 and Y = 7 so, 1.4285714285714285714285714285714 (test round_up)
if, X= 10 and Y = 3 so, 3.3333333333333333333333333333333 (test round_down)
Can we write an algorithm in which
input will be the rounding mode (round_up, round, round_down) and the decimal place I want to round at (10 in our example)
output will be X and Y like above?
If the required location is p (=10 in your example), then y=10^p and then you can choose any x you want.
Depending on the language you are using, p might be too big for you to do 10^p, so in the worst case just divide the result from x/y by 10, 100 or whatever is necessary.
Or you can do like this
# n = number of fractional digits to return
def getFraction(a, b, n):
    result = ""
    r = a % b  # remainder left after removing the integer part
    for i in range(n):
        r *= 10
        result += str(r // b)  # next decimal digit, integer arithmetic only
        r %= b
    return result

getFraction(10, 7, 11) # returns "42857142857", since 10/7 = 1.42857142857...
What I do is like what you have learnt in elementary school on how to do division by pen and paper.
Actually, if the required digit is d, then if d is not 9, the answer would be x=d,y=9 regardless of p which is the position of the digit. If d is 9, then if p is odd, the answer is x=10,y=11 and if p is even, x=1,y=11. If a trivial answer for d=0 won't do, the mirror answer for d=9 is suitable, that is, if d=0 and p is odd, the answer is x=1,y=11, and if p is even, x=10,y=11. A lot shorter than an answer with y=10^p and certainly fitting in nearly any architecture.
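The case analysis above can be checked mechanically. A small sketch, assuming positions are counted from 1 after the decimal point and choosing the non-trivial answer for d = 0; both function names are made up for the example.

```python
def fraction_with_digit(d, p):
    """Return (x, y) such that the p-th decimal digit of x/y is d."""
    if d == 9:
        return (10, 11) if p % 2 == 1 else (1, 11)   # 10/11 = 0.9090..., 1/11 = 0.0909...
    if d == 0:
        return (1, 11) if p % 2 == 1 else (10, 11)   # mirror of the d = 9 case
    return (d, 9)                                    # d/9 = 0.ddd...


def digit_at(x, y, p):
    """p-th decimal digit of x/y, computed with integers only."""
    return (x * 10**p // y) % 10


for d in range(10):
    for p in range(1, 13):
        assert digit_at(*fraction_with_digit(d, p), p) == d
print(fraction_with_digit(9, 10))  # (1, 11)
```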

Working with arbitrary inequalities and checking which, if any, are satisfied

Given a non-negative integer n and an arbitrary set of inequalities that are user-defined (in say an external text file), I want to determine whether n satisfies any inequality, and if so, which one(s).
Here is a points list.
n = 0: 1
n < 5: 5
n = 5: 10
If you draw a number n that's equal to 5, you get 10 points.
If n less than 5, you get 5 points.
If n is 0, you get 1 point.
The stuff left of the colon is the "condition", while the stuff on the right is the "value".
All entries will be of the form:
n1 op n2: val
In this system, equality takes precedence over inequality, so the order that they appear in will not matter in the end. The inputs are non-negative integers, though intermediate values and results may not be non-negative. The results may not even be numbers (eg: could be strings). I have designed it so that it will only accept the most basic inequalities, to make it easier to write a parser (and to see whether this idea is feasible).
My program has two components:
a parser that will read structured input and build a data structure to store the conditions and their associated results.
a function that will take an argument (a non-negative integer) and return the result (or, as in the example, the number of points I receive)
If the list was hardcoded, that is an easy task: just use a case-when or if-else block and I'm done. But the problem isn't as easy as that.
Recall the list at the top. It can contain an arbitrary number of (in)equalities. Perhaps there's only 3 like above. Maybe there are none, or maybe there are 10, 20, 50, or even 1000000. Essentially, you can have m inequalities, for m >= 0
Given a number n and a data structure containing an arbitrary number of conditions and results, I want to be able to determine whether it satisfies any of the conditions and return the associated value. So as with the example above, if I pass in 5, the function will return 10.
The condition/value pairs are not unique in their raw form. You may have multiple instances of the same (in)equality but with different values. eg:
n = 0: 10
n = 0: 1000
n > 0: n
Notice the last entry: if n is greater than 0, then it is just whatever you got.
If multiple inequalities are satisfied (eg: n > 5, n > 6, n > 7), all of them should be returned. If that is not possible to do efficiently, I can return just the first one that satisfied it and ignore the rest. But I would like to be able to retrieve the entire list.
I've been thinking about this for a while and I'm thinking I should use two hash tables: the first one will store the equalities, while the second will store the inequalities.
Equality is easy enough to handle: Just grab the condition as a key and have a list of values. Then I can quickly check whether n is in the hash and grab the appropriate value.
However, for inequality, I am not sure how it will work. Does anyone have any ideas how I can solve this problem in as few computational steps as possible? It's clear that I can easily accomplish this in O(m) time: just run it through each (in)equality one by one. But what happens if this checking is done in real-time? (eg: updated constantly)
For example, it is pretty clear that if I have 100 inequalities and 99 of them check for values > 100 while the other one checks for value <= 100, I shouldn't have to bother checking those 99 inequalities when I pass in 47.
You may use any data structure to store the data. The parser itself is not included in the calculation because that will be pre-processed and only needs to be done once, but it may be problematic if it takes too long to parse the data.
Since I am using Ruby, I likely have more flexible options when it comes to "messing around" with the data and how it will be interpreted.
class RuleSet
  Rule = Struct.new(:op1,:op,:op2,:result) do
    def <=>(r2)
      # Op of "=" sorts before others
      [op=="=" ? 0 : 1, op2.to_i] <=> [r2.op=="=" ? 0 : 1, r2.op2.to_i]
    end
    def matches(n)
      @op2i ||= op2.to_i
      case op
        when "=" then n == @op2i
        when "<" then n <  @op2i
        when ">" then n >  @op2i
      end
    end
  end
  def initialize(text)
    @rules = text.each_line.map do |line|
      Rule.new *line.split(/[\s:]+/)
    end.sort
  end
  def value_for( n )
    if rule = @rules.find{ |r| r.matches(n) }
      rule.result=="n" ? n : rule.result.to_i
    end
  end
end
set = RuleSet.new( DATA.read )
-1.upto(8) do |n|
  puts "%2i => %s" % [ n, set.value_for(n).inspect ]
end
#=> -1 => 5
#=> 0 => 1
#=> 1 => 5
#=> 2 => 5
#=> 3 => 5
#=> 4 => 5
#=> 5 => 10
#=> 6 => nil
#=> 7 => 7
#=> 8 => nil
__END__
n = 0: 1
n < 5: 5
n = 5: 10
n = 7: n
I would parse the input lines and separate them into predicate/result pairs and build a hash of callable procedures (using eval - oh noes!). The "check" function can iterate through each predicate and return the associated result when one is true:
class PointChecker
  def initialize(input)
    @predicates = Hash[input.split(/\r?\n/).map do |line|
      parts = line.split(/\s*:\s*/)
      [Proc.new {|n| eval(parts[0].sub(/=/,'=='))}, parts[1].to_i]
    end]
  end
  def check(n)
    # keep the result of every predicate that holds, drop the rest
    @predicates.map { |p,r| p.call(n) ? r : nil }.compact
  end
end
Here is sample usage:
p = PointChecker.new <<__HERE__
n = 0: 1
n = 1: 2
n < 5: 5
n = 5: 10
__HERE__
p.check(0) # => [1, 5]
p.check(1) # => [2, 5]
p.check(2) # => [5]
p.check(5) # => [10]
p.check(6) # => []
Of course, there are many issues with this implementation. I'm just offering a proof-of-concept. Depending on the scope of your application you might want to build a proper parser and runtime (instead of using eval), handle input more generally/gracefully, etc.
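As an illustration of what a parser without eval might look like, here is a small Python sketch. It assumes every line has the form `n op number: value` with op one of `=`, `<`, `>`; the regular expression, the names, and the treatment of the value `n` are choices made for this example only.

```python
import operator
import re

OPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt}
LINE = re.compile(r"^\s*n\s*(=|<|>)\s*(\d+)\s*:\s*(.+?)\s*$")


def parse_rules(text):
    """Turn the points list into (comparison, threshold, value) triples."""
    rules = []
    for lineno, line in enumerate(text.splitlines(), 1):
        if not line.strip():
            continue
        m = LINE.match(line)
        if m is None:
            raise ValueError(f"line {lineno}: cannot parse {line!r}")
        op, threshold, value = m.groups()
        # Numeric values become ints; anything else (e.g. "n") stays a string.
        rules.append((OPS[op], int(threshold), int(value) if value.isdigit() else value))
    return rules


def check(rules, n):
    """Return the values of all rules that n satisfies, in file order."""
    return [n if value == "n" else value
            for compare, threshold, value in rules if compare(n, threshold)]


rules = parse_rules("n = 0: 1\nn = 1: 2\nn < 5: 5\nn = 5: 10\n")
print(check(rules, 0))  # [1, 5]
print(check(rules, 5))  # [10]
```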
I'm not spending a lot of time on your problem, but here's my quick thought:
Since the points list is always in the format n1 op n2: val, I'd just model the points as an array of hashes.
So first step is to parse the input point list into the data structure, an array of hashes.
Each hash would have values n1, op, n2, value
Then, for each data input you run through all of the hashes (all of the points) and handle each (determining if it matches to the input data or not).
Some tricks of the trade
Spend time in your parser handling bad input. Eg
n < = 1000 # no colon
n < : 1000 # missing n2
x < 2 : 10 # n1, n2 and val are either number or "n"
n # too short, missing :, n2, val
n < 1 : 10x # val is not a number and is not "n"
etc
Also politely handle non-numeric input data
Added
Re: n1 doesn't matter. Be careful, this could be a trick. Why wouldn't
5 < n : 30
be a valid points list item?
Re: multiple arrays of hashes, one array per operator, one hash per point list item -- sure that's fine. Since each op is handled in a specific way, handling the operators one by one is fine. But....ordering then becomes an issue:
Since you want multiple results returned from multiple matching point list items, you need to maintain the overall order of them. Thus I think one array of all the point lists would be the easiest way to do this.
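If the single-array scan ever becomes too slow, one possible refinement is to keep the thresholds sorted and use binary search, while remembering each rule's position in the file so the overall order can be restored. The following Python sketch assumes rules arrive already parsed as `(op, threshold, value)` tuples; the class and method names are hypothetical.

```python
import bisect
from collections import defaultdict


class RuleIndex:
    def __init__(self, rules):
        self.eq = defaultdict(list)          # threshold -> [(order, value)]
        lt, gt = [], []
        for order, (op, t, value) in enumerate(rules):
            if op == "=":
                self.eq[t].append((order, value))
            elif op == "<":
                lt.append((t, order, value))
            elif op == ">":
                gt.append((t, order, value))
            else:
                raise ValueError(f"unsupported operator {op!r}")
        self.lt, self.gt = sorted(lt), sorted(gt)
        self.lt_keys = [t for t, _, _ in self.lt]
        self.gt_keys = [t for t, _, _ in self.gt]

    def matches(self, n):
        hits = list(self.eq.get(n, []))
        # "n < t" holds exactly for thresholds strictly greater than n
        cut = bisect.bisect_right(self.lt_keys, n)
        hits += [(o, v) for _, o, v in self.lt[cut:]]
        # "n > t" holds exactly for thresholds strictly less than n
        cut = bisect.bisect_left(self.gt_keys, n)
        hits += [(o, v) for _, o, v in self.gt[:cut]]
        return [v for _, v in sorted(hits, key=lambda h: h[0])]


index = RuleIndex([("=", 0, 1), ("<", 5, 5), ("=", 5, 10), (">", 0, "n")])
print(index.matches(0))  # [1, 5]
print(index.matches(5))  # [10, 'n']
```

The lookup cost is then logarithmic in the number of rules plus the number of rules that actually match.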

Resources