I'm looking for an algorithm that gives all possible combinations of letters.
Let me explain better. If I have
base-letters = ["a","b","c"];
depth = 2; //max chars allowed
then the expected result would be these 12 elements (3^1 + 3^2 = 12):
["a", "b", "c", "aa","ab","ac","ba","bb","bc","ca", "cb", "cc"]
If I had a depth value of 3, I would expect (3^1) + (3^2) + (3^3) = 39 elements:
["a", "b", ... , "aa", "ab", ... , "aaa", "aab", ..., "aba", ...]
Now, if I understood correctly, a permutation algorithm is similar, but it doesn't allow repeated letters (like "aa", "bb", "aab", "aba") and doesn't handle the variable depth value (which could be different than the length of base-letters).
You can define a recursive function F(s) which takes a string s of length less than or equal to your maximum length, and start by calling F(s) with s equal to the empty string. F computes the length of s; if it equals the maximum length, F prints s and returns. If the length is less than the maximum, F prints s and then iterates over every letter in the alphabet: for each letter, it appends the letter to s to produce a string s' that is one character longer, and calls F(s'). (The very first call prints the empty string, so skip that print if you don't want it in the output.) Memory usage is very low, since nothing beyond the current string is stored, and the running time is proportional to the size of the output, which is essentially the fastest possible, at least in asymptotic terms.
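The recursive function described above could be sketched in Python roughly as follows. The name all_strings and its parameters are made up for illustration, and the sketch yields the strings instead of printing them so that the result can be collected and counted; the empty starting string is skipped.

```python
def all_strings(alphabet, max_len, prefix=""):
    """Yield every non-empty string over `alphabet` of length <= max_len."""
    if prefix:                      # skip the empty starting string
        yield prefix
    if len(prefix) == max_len:      # maximum length reached: stop extending
        return
    for letter in alphabet:
        # Extend the prefix by one letter and recurse, as F(s') does.
        yield from all_strings(alphabet, max_len, prefix + letter)


result = list(all_strings("abc", 2))
print(result)
print(len(result))
```

The strings come out in depth-first order ("a", "aa", "ab", ...) rather than grouped by length; sort by length afterwards if the grouping matters.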
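In Python, use the permutations function from itertools (a code recipe is included if you need to translate the code to another language). Note that permutations never reuses an element, so strings with repeated letters such as "aa" are not produced: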
>>> import itertools
>>> base_elements = ['a', 'b', 'cow']
>>> max_depth = 2
>>> result = [''.join(element) for element in itertools.chain.from_iterable([itertools.permutations(base_elements, depth) for depth in range(1, max_depth+1)])]
>>> print(result)
['a', 'b', 'cow', 'ab', 'acow', 'ba', 'bcow', 'cowa', 'cowb']
If you only care about which elements appear together and not about their order, convert each output tuple to a frozenset instead of joining it into a string, and collect those in an outer set so that duplicates are dropped:
>>> result = frozenset([frozenset(element)
...     for element in itertools.chain.from_iterable(
...         [itertools.permutations(base_elements, depth)
...          for depth in range(1, max_depth+1)]
...     )])
Or more cleanly,
def permutations(base_elements, max_depth):
    result = set()
    for depth in range(1, max_depth+1):
        for element in itertools.permutations(base_elements, depth):
            result.add(frozenset(element))
    return result
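The question's expected output includes strings with repeated letters ("aa", "bb", ...), which itertools.permutations does not generate. A minimal sketch that allows repetition, assuming itertools.product with its repeat argument is acceptable (the function name strings_up_to is made up for illustration):

```python
import itertools


def strings_up_to(base_elements, max_depth):
    """All strings of 1..max_depth elements, with repetition allowed."""
    return [
        "".join(combo)
        for depth in range(1, max_depth + 1)
        for combo in itertools.product(base_elements, repeat=depth)
    ]


print(strings_up_to(["a", "b", "c"], 2))
# ['a', 'b', 'c', 'aa', 'ab', 'ac', 'ba', 'bb', 'bc', 'ca', 'cb', 'cc']
```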
It seems this code will give you what you need:
def all_strs(iterable, depth):
    """Return every string made of exactly `depth` items from `iterable`."""
    results = []
    if depth == 1:
        for item in iterable:
            results.append(str(item))
        return results
    for item in iterable:
        for s in all_strs(iterable, depth - 1):
            results.append(str(item) + s)
    return results

if __name__ == "__main__":
    print(all_strs('abc', 2))
    print(all_strs([1, 2, 3], 3))
    s = 'abc'
    results = []
    # Collect the strings of every length from 1 up to len(s).
    for i in range(len(s)):
        results += all_strs(s, i + 1)
    print(results)
The output is:
['aa', 'ab', 'ac', 'ba', 'bb', 'bc', 'ca', 'cb', 'cc']
['111', '112', '113', '121', '122', '123', '131', '132', '133', '211', '212', '213', '221', '222', '223', '231', '232', '233', '311', '312', '313', '321', '322', '323', '331', '332', '333']
['a', 'b', 'c', 'aa', 'ab', 'ac', 'ba', 'bb', 'bc', 'ca', 'cb', 'cc', 'aaa', 'aab', 'aac', 'aba', 'abb', 'abc', 'aca', 'acb', 'acc', 'baa', 'bab', 'bac', 'bba', 'bbb', 'bbc', 'bca', 'bcb', 'bcc', 'caa', 'cab', 'cac', 'cba', 'cbb', 'cbc', 'cca', 'ccb', 'ccc']
Related
I have an array containing repeating characters. Take
['A','B','C','C'] as example. I am writing a C++ program to output all the combinations of size N.
When N = 2, program should output AB, AC, BC and CC.
When N = 3, program should output ABC, ACC, BCC.
My approach was to treat each character of the array as unique. I generated all the combinations and saved them to a vector. I then iterated through the vector to remove duplicate combinations.
For N = 2, assuming all characters are unique gives 4C2 = 6 possibilities: AB, AC, AC, BC, BC, CC.
One AC and one BC are removed because they occur twice instead of once.
Is there a more efficient way to do this?
Note:
The characters in the array are not necessarily in ascending order.
In my initial approach, using find() for vector was insufficient to locate all duplicates. For array ['T','O','M','O','R','R','O','W'], the duplicates are permuted. For N = 4, two of the duplicates are TOMO and TMOO.
You can sort the source array and collect equal items into groups. In the code below, each item also stores the index where the next group begins, so that the search can jump straight to the next group when it decides not to use any more copies of the current item.
Python code and output:
a = ['T','O','M','O','R','R','O','W']
a.sort()
print(a)
n = len(a)
b = [[a[-1], n]]
next = n
for i in range(n-2,-1, -1):
    if a[i] != a[i+1]:
        next = i + 1
    b.insert(0, [a[i], next])
print(b)

def combine(lst, idx, n, k, res):
    if len(res) == k:
        print(res)
        return
    if idx >= n:
        return
    #use current item
    combine(lst, idx + 1, n, k, res + lst[idx][0])
    #omit current item and go to the next group
    combine(lst, lst[idx][1], n, k, res)

combine(b, 0, n, 3, "")
['M', 'O', 'O', 'O', 'R', 'R', 'T', 'W']
[['M', 1], ['O', 4], ['O', 4], ['O', 4], ['R', 6], ['R', 6], ['T', 7], ['W', 8]]
MOO MOR MOT MOW MRR MRT MRW MTW
OOO OOR OOT OOW ORR ORT ORW OTW
RRT RRW RTW
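The same grouping idea can also be expressed with per-letter counts instead of jump indices. The following is only a sketch of that variant, not the code used above: multiset_combinations and its helper are hypothetical names, and the results are returned as a list instead of being printed.

```python
from collections import Counter


def multiset_combinations(items, k):
    """Return the distinct size-k combinations of `items` as sorted strings."""
    groups = sorted(Counter(items).items())   # [(letter, count), ...]
    results = []

    def extend(group_idx, chosen):
        if len(chosen) == k:
            results.append("".join(chosen))
            return
        if group_idx == len(groups):
            return
        letter, count = groups[group_idx]
        # Decide how many copies of this letter to take, most copies first.
        max_take = min(count, k - len(chosen))
        for take in range(max_take, -1, -1):
            extend(group_idx + 1, chosen + [letter] * take)

    extend(0, [])
    return results


print(multiset_combinations(['A', 'B', 'C', 'C'], 2))
print(len(multiset_combinations(list("TOMORROW"), 3)))
```

Because each group is visited once and only the number of copies is chosen, no duplicate combination is ever generated, so no de-duplication pass is needed afterwards.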
I have an array with characters, numbers and float values.
a[] = {'a',2,2.5}
I have to multiply each integer and float by 2 and no operation should be done on character.
My solution -
def self.double_numbers(input)
  input.map do |input_element|
    if input_element.is_a? Integer
      input_element.to_i * 2
    elsif input_element.is_a? Float
      input_element.to_Float * 2
    end
  end
end
It is not working, for input
a[] = {'a',2,2.5}
It is returning
0 4 4
You could use map to check whether each element is Numeric and, if so, multiply it by 2; then use compact to drop the nil values left for the other elements:
p ['a', 2, 2.5].map{|e| e * 2 if e.is_a? Numeric}.compact
# => [4, 5.0]
If you want to keep the elements that the *2 operation isn't applied to, then:
p ['a', 2, 2.5].map{|e| e.is_a?(Numeric) ? e * 2 : e}
Also you could use grep to simplify the checking, and then map your only-numeric values:
p ['a', 2, 2.5].grep(Numeric).map{|e| e*2}
# => [4, 5.0]
I don't know what side effects this might have, but it looks good (as long as you only want the Numeric objects in the output):
class Numeric
  def duplicate
    self * 2
  end
end
p ['a', 2, 2.5].grep(Numeric).map(&:duplicate)
Or also:
p ['a', 2, 2.5].grep(Numeric).map(&2.method(:*))
# => [4, 5.0]
"I have to multiply each integer and float by 2 and no operation should be done on character."
Here you go:
> a = ["a", 2, 2.5]
> a.map{|e| e.is_a?(Numeric) ? e * 2 : e}
#=> ["a", 4, 5.0]
Note: a[] = {'a',2,2.5} is not valid Ruby syntax for an array.
I would like to find all combinations of a string, maintaining order, but of any length. For example:
string_combinations("wxyz")
# => ['w', 'wx', 'wxy', 'wxyz', 'wxz', 'wy', 'wyz', 'wz', 'x', 'xy', 'xyz', 'xz', 'y', 'yz', 'z']
I would prefer if you could use loops only and avoid using ruby methods like #combination as I am trying to find the cleanest way to implement this if I come across it in another language.
Is there a way to do this in less than O(n^3)? My initial thought is something like:
def string_combinations(str)
  result = []
  (0...str.length).each do |i|
    result << str[i]
    ((i+1)...str.length).each do |j|
      result << str[i] + str[j]
      ((j+1)...str.length).each do |k|
        result << str[i] + str[j..k]
        # Still not covering everything.
      end
    end
  end
  result
end
Here are two ways it could be done without making use of Array#combination. I've also included code for the case when combination is permitted (#3; see footnote 1).
1. Map each of the numbers between 1 and 2**n-1 (n being the length of the string) to a unique combination of characters from the string
def string_combinations(str)
  arr = str.chars
  (1..2**str.length-1).map do |n|
    pos = n.bit_length.times.map.with_object([]) { |i,a| a << i if n[i] == 1 }
    arr.values_at(*pos).join
  end.sort
end
string_combinations("wxyz")
# => ["w", "wx", "wxy", "wxyz", "wxz", "wy", "wyz", "wz",
# "x", "xy", "xyz", "xz", "y", "yz", "z"]
Combinatorics provides us with the identity
sum(i = 1 to n) C(n,i) == 2^n - 1
where C(n,i) is "the number of combinations of n things taken i at a time".
If the given string is "wxyz", n = "wxyz".length #=> 4, so there are 2**4 - 1 #=> 15 combinations of one or more characters from this string. Now consider any of the numbers between 1 and 15, say 11, which is 0b1011 in binary. Converting this to an array of binary digits, we obtain:
bin_arr = [1,0,1,1]
We now pluck out each character of wxyz for which the corresponding index position in bin_arr equals 1, namely
["w", "y", "z"]
and then join those elements to form a string:
["w", "y", "z"].join #=> "wyz"
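Since each number from 1 to 15 corresponds to a distinct combination of one or more characters from this string, we can obtain all such combinations by repeating the above calculation for each of the numbers between 1 and 15. (The code above reads the bits of n from the least significant end, so it maps 11 to "wxz" rather than "wyz"; either convention assigns a distinct combination to each number.)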
No matter which method you use, the resulting array will contain 2**n - 1 elements, so you are looking at O(2**str.length).
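Since the question mentions porting the idea to another language, here is a rough Python sketch of the same bit-mask mapping using only loops and bit operations. It reads the bits from the least significant end; the function name simply mirrors the Ruby one.

```python
def string_combinations(s):
    """Return all order-preserving, non-empty character subsets of `s`."""
    n = len(s)
    combos = []
    for mask in range(1, 2 ** n):            # 1 .. 2**n - 1
        # Bit i of `mask` (counting from the least significant bit)
        # decides whether s[i] is kept.
        combos.append("".join(s[i] for i in range(n) if (mask >> i) & 1))
    return sorted(combos)


print(string_combinations("wxyz"))
```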
2. Use recursion
def string_combinations(str)
  (combos(str) - [""]).sort
end

def combos(str)
  return [str, ""] if str.length==1
  forward = combos str[1..-1]
  [*forward, *[str[0]].product(forward).map(&:join)]
end
string_combinations("wxyz")
# => ["w", "wx", "wxy", "wxyz", "wxz", "wy", "wyz", "wz",
# "x", "xy", "xyz", "xz", "y", "yz", "z"]
Notice that
combos("wxyz")
#=> ["z", "", "yz", "y", "xz", "x", "xyz", "xy",
# "wz", "w", "wyz", "wy", "wxz", "wx", "wxyz", "wxy"]
includes an empty string, which must be removed, and the array needs sorting. Hence the need to separate out the recursive method combos.
3. Use Array#combination
Here we invoke arr.combination(n) for all values of n between 1 and arr.size and return a (flattened) array comprised of all n return values.
def string_combinations(str)
  a = str.chars
  (1..str.size).flat_map { |n| a.combination(n).map(&:join) }.sort
end
string_combinations "wxyz"
# => ["w", "wx", "wxy", "wxyz", "wxz", "wy", "wyz", "wz",
# "x", "xy", "xyz", "xz", "y", "yz", "z"]
1 Since I wrote it before realizing that's not what the OP wanted. ¯\_(ツ)_/¯
A pretty simple solution using a stack (I can't provide ruby-code though):
string inp
list result
//initialize stack
stack s
s.push(0)
while(!s.isEmpty())
    int tmp = s.peek()
    //the current value is higher than the max-index -> shorten prefix
    if tmp >= inp.length()
        s.pop()
        //increment the last character of the prefix
        if !s.isEmpty()
            s.push(s.pop() + 1)
        continue
    //build the result-string from the indices in the stack
    //note that the indices in the stack are reverse (highest first)!!!
    result.add(buildString(inp, s))
    //since we aren't at the end of the string, we can append another character to the stack
    s.push(tmp + 1)
The basic idea would be to maintain a stack of positions from which characters will be taken. This stack has the property that each element in the stack is a larger number than the next element in the stack. Thus the ordering of the stack is maintained. If we reach a number that is equal to the string-length, we eliminate that number and increment the next number, thus moving on to the next prefix.
E.g.:
stack       string
0 (init)    a b c (init)
0           a
0 1         a b
0 1 2       a b c
0 2         a c
1           b
1 2         b c
2           c
The peek of the stack would represent the character of the input-string that is modified, the rest of the stack represents the prefix.
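For readers who want something runnable, the pseudocode above might translate to Python along these lines. This is a sketch under the assumption that a plain list is used as the stack (top of the stack at the end of the list), which also removes the need for a separate buildString helper.

```python
def string_combinations(inp):
    """Iteratively list order-preserving combinations of `inp`'s characters."""
    result = []
    stack = [0]                      # indices into `inp`, strictly increasing
    while stack:
        top = stack[-1]
        if top >= len(inp):
            # Ran past the end: drop this position and advance the previous one.
            stack.pop()
            if stack:
                stack[-1] += 1
            continue
        # The stack (bottom to top) names the characters of the current string.
        result.append("".join(inp[i] for i in stack))
        # Try to extend the current string with the next character.
        stack.append(top + 1)
    return result


print(string_combinations("abc"))
# ['a', 'ab', 'abc', 'ac', 'b', 'bc', 'c']
```

The order of the results matches the trace in the table above.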
It seems like the algorithm could be expressed in English as:
"w", followed by "w" + all combinations of "xyz", followed by
"x", followed by "x" + all combinations of "yz", followed by
etc.
In other words, there is a notion of a "prefix" and then recursion on the "remaining chars". With that in mind, here is a Ruby solution:
def combine_with_prefix(prefix, chars)
  result = []
  chars.each_with_index do |ch, i|
    result << "#{prefix}#{ch}"
    result.concat(combine_with_prefix(result.last, chars[(i + 1)..-1]))
  end
  result
end

def string_combinations(str)
  combine_with_prefix(nil, str.chars)
end
string_combinations("wxyz")
# => ["w", "wx", "wxy", "wxyz", "wxz", "wy", "wyz", "wz", "x", "xy", "xyz", "xz", "y", "yz", "z"]
Here's another way to think about this: You start with an input string s having length n. From that you calculate the set of strings that can be produced by removing one character from s. That set has n members. For each of those members you perform the same operation: Calculate the set of strings that can be produced by removing one character. Each of those sets has n-1 members. For each of those n(n-1) members, perform the operation again, and so on until n is 1. The result is the union of all of the calculated sets.
For example, suppose your starting string is abcd (n = 4). The set of strings that can be produced by removing one character is (bcd, acd, abd, abc). That's 4 operations. Repeating the operation for each of those strings yields 4 sets of 3 members each (4x3 = 12 operations), each of which has length 2. Repeating again yields 12 sets of 2 members each (4x3x2 = 24 operations), each having length 1. That's the magic number, so we round up all of those strings, throw out the duplicates, and we've got our answers. In the end we did 4+4x3+4x3x2 = 40 operations.
That holds true for every length of string. If we have 5 characters we do 5+5x4+5x4x3+5x4x3x2 = 205 operations. For 6 characters it's 1,236 operations. I leave it to you to figure out what that equates to in big-O notation.
This boils down to a really simple recursive algorithm:
def comb(str)
  [ str,
    *if str.size > 1
      str.each_char.with_index.flat_map do |_,i|
        next_str = str.dup
        next_str.slice!(i)
        comb(next_str)
      end
    end
  ]
end
p comb("wxyz").uniq.sort
# => [ "w", "wx", "wxy", "wxyz", "wxz", "wy", "wyz", "wz",
# "x", "xy", "xyz", "xz", "y", "yz", "z" ]
We end up throwing out a lot with uniq, though, which tells us we can save a lot of cycles by memoizing:
def comb(str, memo={})
  return memo[str] if memo.key?(str)
  [ str,
    *if str.size > 1
      str.each_char.with_index.flat_map do |_,i|
        next_str = str.dup
        next_str.slice!(i)
        memo[str] = comb(next_str, memo)
      end
    end
  ]
end
p comb("wxyz").uniq.sort
In case you're curious, with memoization the inner loop is reached 23 times for a 4-character input versus 41 without memoization; 46 vs. 206 times for 5 characters; 87 vs. 1,237 times for 6; and 162 vs. 8,653 for 7. Fairly significant, I think.
Given some array such as the following:
x = ['a', 'b', 'b', 'c', 'a', 'a', 'a']
I want to end up with something that shows how many times each element repeats sequentially. So maybe I end up with the following:
[['a', 1], ['b', 2], ['c', 1], ['a', 3]]
The structure of the results isn't that important... it could be some other data type if needed.
Ruby 1.9 has Enumerable#chunk for just this purpose:
x.chunk{|y| y}.map{|y, ys| [y, ys.length]}
This is not a general solution, but if you only need to match single characters, it can be done like this:
x.join.scan(/(\w)(\1*)/).map{|x| [x[0], x.join.length]}
Here's a one-line solution. The logic is the same as Matt suggested, but it also works with nils at the front of x:
x.each_with_object([]) { |e, r| r[-1] && r[-1][0] == e ? r[-1][-1] +=1 : r << [e, 1] }
Here's my approach:
# Starting array
arr = [nil, nil, "a", "b", "b", "c", "a", "a", "a"]
# Array to hold final values as requested
counts = []
# Array of previous `count` element
previous = nil
arr.each do |letter|
  # If this letter matches the last one we checked, increment count
  if previous and previous[0] == letter
    previous[1] += 1
  # Otherwise push a new array for letter/count
  else
    previous = [letter, 1]
    counts.push previous
  end
end
I should note that this doesn't suffer from the same problem that Matt Sanders describes, since we're mindful of our first time through the iteration.
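For comparison, the same run-length grouping can be sketched in Python with itertools.groupby, which plays a role similar to Enumerable#chunk. The function name run_lengths is made up for illustration.

```python
from itertools import groupby


def run_lengths(items):
    """Return [element, run length] pairs for consecutive equal elements."""
    return [[key, len(list(group))] for key, group in groupby(items)]


print(run_lengths(['a', 'b', 'b', 'c', 'a', 'a', 'a']))
# [['a', 1], ['b', 2], ['c', 1], ['a', 3]]
```

Leading None values are grouped like any other element, so an input starting with two None entries yields [None, 2] as its first pair.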
Give a polynomial time algorithm that takes three strings, A, B and C, as input, and returns the longest sequence S that is a subsequence of A, B, and C.
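Let dp[i, j, k] = length of the longest common subsequence of the prefixes A[1..i], B[1..j], C[1..k], with dp[i, j, k] = 0 whenever i, j or k is 0.
We have:
dp[i, j, k] = dp[i - 1, j - 1, k - 1] + 1                              if A[i] = B[j] = C[k]
dp[i, j, k] = max(dp[i - 1, j, k], dp[i, j - 1, k], dp[i, j, k - 1])   otherwise
The sequence S itself is recovered by walking back through the table from dp[len A, len B, len C].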
Similar to the 2d case, except you have 3 dimensions. Complexity is O(len A * len B * len C).
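A bottom-up sketch of this recurrence in Python, including a walk back through the table to recover one longest common subsequence. The name lcs3 is made up for illustration, and the code uses 0-based string indexing, so A[i] in the recurrence corresponds to a[i - 1] here.

```python
def lcs3(a, b, c):
    """Longest common subsequence of three strings via a 3-D table."""
    la, lb, lc = len(a), len(b), len(c)
    # dp[i][j][k] = LCS length of a[:i], b[:j], c[:k]; index 0 means "empty".
    dp = [[[0] * (lc + 1) for _ in range(lb + 1)] for _ in range(la + 1)]
    for i in range(1, la + 1):
        for j in range(1, lb + 1):
            for k in range(1, lc + 1):
                if a[i - 1] == b[j - 1] == c[k - 1]:
                    dp[i][j][k] = dp[i - 1][j - 1][k - 1] + 1
                else:
                    dp[i][j][k] = max(dp[i - 1][j][k],
                                      dp[i][j - 1][k],
                                      dp[i][j][k - 1])
    # Walk back from the full-length corner to recover one optimal sequence.
    out = []
    i, j, k = la, lb, lc
    while i > 0 and j > 0 and k > 0:
        if a[i - 1] == b[j - 1] == c[k - 1]:
            out.append(a[i - 1])
            i, j, k = i - 1, j - 1, k - 1
        elif dp[i - 1][j][k] == dp[i][j][k]:
            i -= 1
        elif dp[i][j - 1][k] == dp[i][j][k]:
            j -= 1
        else:
            k -= 1
    return "".join(reversed(out))


print(lcs3("XMJYAUZ", "MZJAWXU", "AMBCJDEFAGHI"))
```

When several subsequences share the maximum length, which one is returned depends on the tie-breaking order in the walk back.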
Here's a solution in Python for an arbitrary number of sequences. You could use it to test your solution for 2D, 3D cases. It closely follows Wikipedia's algorithm:
#!/usr/bin/env python
import functools
from itertools import starmap

def memoize(func):
    cache = {}
    @functools.wraps(func)
    def wrapper(*args):
        try:
            return cache[args]
        except KeyError:
            r = cache[args] = func(*args)
            return r
    return wrapper

@memoize
def lcs(*seqs):
    """Find longest common subsequence of `seqs` sequences.

    Complexity: O(len(seqs)*min(seqs, key=len)*reduce(mul,map(len,seqs)))
    """
    if not all(seqs): return () # at least one sequence is empty
    heads, tails = zip(*[(seq[0], seq[1:]) for seq in seqs])
    if all(heads[0] == h for h in heads): # all seqs start with the same element
        return (heads[0],) + lcs(*tails)
    return max(starmap(lcs, (seqs[:i]+(tails[i],)+seqs[i+1:]
                             for i in range(len(seqs)))), key=len)
Note: without memoization it is an exponential algorithm (wolfram alpha):
$ RSolve[{a[n] == K a[n-1] + K, a[0] == K}, a[n], n]
a(n) = (K^(n + 1) - 1) K/(K - 1)
where K == len(seqs) and n == max(map(len, seqs))
Examples
>>> lcs("agcat", "gac")
('g', 'a')
>>> lcs("banana", "atana")
('a', 'a', 'n', 'a')
>>> lcs("abc", "acb")
('a', 'c')
>>> lcs("XMJYAUZ", "MZJAWXU")
('M', 'J', 'A', 'U')
>>> lcs("XMJYAUZ")
('X', 'M', 'J', 'Y', 'A', 'U', 'Z')
>>> lcs("XMJYAUZ", "MZJAWXU", "AMBCJDEFAGHI")
('M', 'J', 'A')
>>> lcs("XMJYAUZ", "MZJAWXU", "AMBCJDEFAGUHI", "ZYXJAQRU")
('J', 'A', 'U')
>>> lcs() #doctest: +IGNORE_EXCEPTION_DETAIL
Traceback (most recent call last):
...
ValueError:
>>> lcs(*"abecd acbed".split())
('a', 'b', 'e', 'd')
>>> lcs("acd", lcs("abecd", "acbed"))
('a', 'd')
>>> lcs(*"abecd acbed acd".split())
('a', 'c', 'd')
All you have to do is Google "longest subsequence".
This is the top link: http://en.wikipedia.org/wiki/Longest_common_subsequence_problem
If you have any particular problem understanding it then please ask here, preferably with a more specific question.