How to use less memory generating Array permutation? - ruby

So I need to get all possible permutations of a string.
What I have now is this:
def uniq_permutations string
string.split(//).permutation.map(&:join).uniq
end
Ok, now what is my problem: This method works fine for small strings but I want to be able to use it with strings with something like size of 15 or maybe even 20. And with this method it uses a lot of memory (>1gb) and my question is what could I change not to use that much memory?
Is there a better way to generate permutation? Should I persist them at the filesystem and retrieve when I need them (I hope not because this might make my method slow)?
What can I do?
Update:
I actually don't need to save the result anywhere I just need to lookup for each in a table to see if it exists.

Just to reiterate what Sawa said. You do understand the scope? The number of permutations for any n elements is n!. It's about the most aggressive mathematical progression operation you can get. The results for n between 1-20 are:
[1, 2, 6, 24, 120, 720, 5040, 40320, 362880, 3628800, 39916800, 479001600,
6227020800, 87178291200, 1307674368000, 20922789888000, 355687428096000,
6402373705728000, 121645100408832000, 2432902008176640000]
Where the last number is approximately 2 quintillion, which is 2 billion billion.
That is 2265820000 gigabytes.
You can save the results to disk all day long - unless you own all the Google datacenters in the world you're going to be pretty much out of luck here :)

Your call to map(&:join) is what is creating the array in memory, as map in effect turns an Enumerator into an array. Depending on what you want to do, you could avoid creating the array with something like this:
def each_permutation(string)
string.split(//).permutation do |permutaion|
yield permutation.join
end
end
Then use this method like this:
each_permutation(my_string) do |s|
lookup_string(s) #or whatever you need to do for each string here
end
This doesn’t check for duplicates (no call to uniq), but avoids creating the array. This will still likely take quite a long time for large strings.
However I suspect in your case there is a better way of solving your problem.
I actually don't need to save the result anywhere I just need to lookup for each in a table to see if it exists.
It looks like you’re looking for possible anagrams of a string in an existing word list. If you take any two anagrams and sort the characters in them, the resulting two strings will be the same. Could you perhaps change your data structures so that you have a hash, with keys being the sorted string and the values being a list of words that are anagrams of that string. Then instead of checking all permutations of a new string against a list, you just need to sort the characters in the string, and use that as the key to look up the list of all strings that are permutations of that string.

Perhaps you don't need to generate all elements of the set, but rather only a random or constrained subset. I have written an algorithm to generate the m-th permutation in O(n) time.
First convert the key to a list representation of itself in the factorial number system. Then iteratively pull out the item at each index specified by the new list and of the old.
module Factorial
def factorial num; (2..num).inject(:*) || 1; end
def factorial_floor num
tmp_1 = 0
1.upto(1.0/0.0) do |counter|
break [tmp_1, counter - 1] if (tmp_2 = factorial counter) > num
tmp_1 = tmp_2 #####
end # #
end # #
end # returns [factorial, integer that generates it]
# for the factorial closest to without going over num
class Array; include Factorial
def generate_swap_list key
swap_list = []
key -= (swap_list << (factorial_floor key)).last[0] while key > 0
swap_list
end
def reduce_swap_list swap_list
swap_list = swap_list.map { |x| x[1] }
((length - 1).downto 0).map { |element| swap_list.count element }
end
def keyed_permute key
apply_swaps reduce_swap_list generate_swap_list key
end
def apply_swaps swap_list
swap_list.map { |index| delete_at index }
end
end
Now, if you want to randomly sample some permutations, ruby comes with Array.shuffle!, but this will let you copy and save permutations or to iterate through the permutohedral space. Or maybe there's a way to constrain the permutation space for your purposes.
constrained_generator_thing do |val|
Array.new(sample_size) {array_to_permute.keyed_permute val}
end

Perhaps I am missing the obvious, but why not do
['a','a','b'].permutation.to_a.uniq!

Related

Ruby calling method in quick_sort gives wrong output

I am doing a quick sort in ruby and using quick_sort I am grabbing the last 3 elements of the array and multiplying it to get the value.
My quick sort works fine but only thing is when I call the my other method max_product_three in quick_search the sorted_array I pass into the method is only showing up as two random numbers like [-3,-2]. If I take out my method from quick_search it gives the correct output. What is happening that when I put my method the sorted array is wrong which is causing my max_product_three not to work.
quick_search.rb
def quick_search(array)
return array if array.length <= 1
len = array.length - 1
left = []
right = []
pivot = array.sample
array.delete_at(array.index(pivot))
array.each do |num|
if num < pivot
left << num
else
right << num
end
end
sorted = []
sorted << quick_search(left)
sorted << pivot
sorted << quick_search(right)
sorted_array = sorted.flatten
p sorted_array
max_product_three(sorted_array)
end
max_product_three.rb
def max_product_three(sorted_array)
len = sorted_array.length - 1
take_3 = len - 3
mulitple = sorted_array.drop(take_3)
p mulitple.inject(:*)
end
I am using this array as reference [-3,1,2,-2,5,6]
I assume you're writing this to learn how quick-sort works? Because ruby's got a #sort method right there... 😁
There are few things here. Is your intention to do two things: first, sort the array; second, multiply the three largest values in the sorted array? Because that's not what your code currently does. You call your max_product_three method from within your sort, which means it'll be called every time quick_sort is called.
Worse, it's the last line in the method. That means the result of calling max_product_three is what's returned each time you iterate, not the sorted array! So, for each sub-sort, what you get back is a single number instead of the sorted array.
Also, your max_product_three method multiplies the last 4 values, not the last 3 values. (You subtract 1 from its length and then subtract 3 from it, so you're dropping length - 4 values, leaving 4 values to multiply.)
You don't need to do p sorted_array at the end of your quick_search method (presumably, should be quick_sort!) but can just have sorted_array to return the array.
And, a smaller thing, your initial guard clause would be a bit better (and more ruby-ish) by not using an explicit operator, for example:
return array unless array.length.positive?
That's quite a lot, and I may have misinterpreted what you're trying to do here so let me know if I have!

No implicit conversion of Integer into Array

I am doing a ruby challenege in leetcode where I have to remove duplicates in a array and return the array with unique values. I keep getting:
Line 55: no implicit conversion of Integer into Array (TypeError) in serializer__.rb (-)
and my answer is:
def remove_duplicates(nums)
nums.sort.each_with_index do |num,i|
if nums[i + 1].eql? num
nums.delete_at(i)
end
end
nums
end
I was not able to find a clear solution to what I need to do to resolve this. Please let me know what I need to do.
Working example
you need to re-read the question ... you have to replace the nums, and return the count of the remaining figures:
def remove_duplicates(nums)
nums.uniq!
nums.count
end
you have been returning the array, not the count of the array.
Adjustments to original code
to do it with your original code (note that the array should already be sorted, according to the question):
def remove_duplicates(nums)
nums.each_with_index do |num,i|
if nums[i + 1].eql? num
nums.delete_at(i)
end
end
nums.count
end
Working example with Unsorted initial array
if you wanted an answer that worked when the code is not sorted (which is not the case with your particular question, but a commenter seems to ask about this), and you need to generate a sorted unique array, overwriting the original array, and returning the size of the new array, you would need to do something like the following:
def remove_duplicates(nums)
nums.replace(nums.uniq.sort)
nums.count
end

Is there a better way?: iterating over an array in ruby

I'm working on a mini project for a summer class. I'd like some feedback on the code I have written, especially part 3.
Here's the question:
Create an array called numbers containing the integers 1 - 10 and assign it to a variable.
Create an empty array called even_numbers.
Create a method that iterates over the array. Place all even numbers in the array even_numbers.
Print the array even_numbers.
Here's my code, so far:
numbers = [1,2,3,4,5,6,7,8,9,10]
print numbers[3]
even_numbers.empty?
def even_numbers
numbers.sort!
end
Rather than doing explicit iteration, the best way is likely Array#select thus:
even_numbers = numbers.select { |n| n.even? }
which will run the block given on each element in the array numbers and produce an array containing all elements for which the block returned true.
or an alternative solution following the convention of your problem:
def get_even_numbers(array)
even_num = []
array.each do |n|
even_num << n if n.even?
end
even_num
end
and of course going for the select method is always preferred.

Cannot understand what the following code does

Can somebody explain to me what the below code is doing. before and after are hashes.
def differences(before, after)
before.diff(after).keys.sort.inject([]) do |diffs, k|
diff = { :attribute => k, :before => before[k], :after => after[k] }
diffs << diff; diffs
end
end
It is from the papertrail differ gem.
It's dense code, no question. So, as you say before and after are hash(-like?) objects that are handed into the method as parameters. Calling before.diff(after) returns another hash, which then immediately has .keys called on it. That returns all the keys in the hash that diff returned. The keys are returned as an array, which is then immediately sorted.
Then we get to the most complex/dense bit. Using inject on that sorted array of keys, the method builds up an array (called diffs inside the inject block) which will be the return value of the inject method.
That array is made up of records of differences. Each record is a hash - built up by taking one key from the sorted array of keys from the before.diff(after) return value. These hashes store the attribute that's being diffed, what it looked like in the before hash and what it looks like in the after hash.
So, in a nutshell, the method gets a bunch of differences between two hashes and collects them up in an array of hashes. That array of hashes is the final return value of the method.
Note: inject can be, and often is, much, much simpler than this. Usually it's used to simply reduce a collection of values to one result, by applying one operation over and over again and storing the results in an accumlator. You may know inject as reduce from other languages; reduce is an alias for inject in Ruby. Here's a much simpler example of inject:
[1,2,3,4].inject(0) do |sum, number|
sum + number
end
# => 10
0 is the accumulator - the initial value. In the pair |sum, number|, sum will be the accumulator and number will be each number in the array, one after the other. What inject does is add 1 to 0, store the result in sum for the next round, add 2 to sum, store the result in sum again and so on. The single final value of the accumulator sum will be the return value. Here 10. The added complexity in your example is that the accumulator is different in kind from the values inside the block. That's less common, but not bad or unidiomatic. (Edit: Andrew Marshall makes the good point that maybe it is bad. See his comment on the original question. And #tokland points out that the inject here is just a very over-complex alternative for map. It is bad.) See the article I linked to in the comments to your question for more examples of inject.
Edit: As #tokland points out in a few comments, the code seems to need just a straightforward map. It would read much easier then.
def differences(before, after)
before.diff(after).keys.sort.map do |k|
{ :attribute => k, :before => before[k], :after => after[k] }
end
end
I was too focused on explaining what the code was doing. I didn't even think of how to simplify it.
It finds the entries in before and after that differ according to the underlying objects, then builds up a list of those differences in a more convenient format.
before.diff(after) finds the entries that differ.
keys.sort gives you the keys (of the map of differences) in sorted order
inject([]) is like map, but starts with diffs initialized to an empty array.
The block creates a diff line (a hash) for each of these differences, and then appends it to diffs.

Ruby find in array with offset

I'm looking for a way to do the following in Ruby in a cleaner way:
class Array
def find_index_with_offset(offset, &block)
[offset..-1].find &block
end
end
offset = array.find_index {|element| element.meets_some_criterion?}
the_object_I_want =
array.find_index_with_offset(offset+1) {|element| element.meets_another_criterion?}
So I'm searching a Ruby array for the index of some object and then I do a follow-up search to find the first object that matches some other criterion and has a higher index in the array. Is there a better way to do this?
What do I mean by cleaner: something that doesn't involve explicitly slicing the array. When you do this a couple of times, calculating the slicing indices gets messy fast. I'd like to keep operating on the original array. It's easier to understand and less error-prone.
NB. In my actual code I haven't monkey-patched Array, but I want to draw attention to the fact that I expect I'm duplicating existing functionality of Array/Enumerable
Edits
Fixed location of offset + 1 as per Mladen Jablanović's comment; rewrite error
Added explanation of 'cleaner' as per Mladen Jablanović's comment
Cleaner is here obviously subjective matter. If you aim for short, I don't think you could do better than that. If you want to be able to chain multiple such finds, or you are bothered by slicing, you can do something like this:
module Enumerable
def find_multi *procs
return nil if procs.empty?
find do |e|
if procs.first.call(e)
procs.shift
next true if procs.empty?
end
false
end
end
end
a = (1..10).to_a
p a.find_multi(lambda{|e| e % 5 == 0}, lambda{|e| e % 3 == 0}, lambda{|e| e % 4 == 0})
#=> 8
Edit: And if you're not concerned with the performance you could do something like:
array.drop_while{|element|
!element.meets_some_criterion?
}.drop(1).find{|element|
element.meets_another_criterion?
}

Resources