I'm trying to implement an algorithm for table verification, but I'm confused by the many options Ruby offers for working with arrays and hashes, and I need help putting it all together. Consider this table as an example:
| A | B | C |
-------------
| 1 | 2 | 3 |
| 1 | 2 | 4 |
| 5 | 6 | 7 |
| 1 | 1 | 3 |
My method should count the number of occurrences of a specific cell combination. For example:
match_line([A => 1, C => 3])
The results should be 2, as this combination exists in both the first row and the last one.
What I have done so far is create a hash that holds the column indexes, like so:
[A => 0, B => 1, C=> 2]
And I also have an array list which holds all the above table rows like so:
[[1, 2, 3], [1, 2, 4], [5, 6, 7], [1, 1, 3]]
The logic goes like this: the match_line call above specifies that the user wants to match rows where column A holds the value 1 and column C holds the value 3. Based on my index hash, the index of column A is 0 and the index of column C is 2. For each array (row) in the array list, if index 0 equals 1 and index 2 equals 3, as the user requested, I add 1 to a counter, and I keep going over the remaining rows until I reach the end.
I tried to turn this into code, but I ended up with something that seems very inefficient. I'm interested to see your code examples; perhaps Ruby has built-in Enumerable methods that I'm not aware of that would make it more elegant.
First, you should use the best available structure to describe your domain:
data = [[1, 2, 3], [1, 2, 4], [5, 6, 7], [1, 1, 3]]
@data_hashes = data.map do |sequence|
  { 'A' => sequence[0], 'B' => sequence[1], 'C' => sequence[2] }
end
Second, I think you should use a real Hash as input for match_line :
# replace match_line([A => 1, C => 3]) with
match_line({'A' => 1, 'C' => 3})
Now you're all set for an easy implementation using Array#count with a block, as pointed out by Keith Bennet (Enumerable#select followed by Array#size would also work):
def match_line(match)
  @data_hashes.count { |row|
    match.all? { |match_key, match_value|
      row[match_key] == match_value
    }
  }
end
EDIT: Dynamically create Hash from column names
columns = ['a', 'b', 'c']
data = [[1, 2, 3], [1, 2, 4], [5, 6, 7], [1, 1, 3]]
@data_hashes = data.map do |row|
  Hash[columns.zip(row)]
end
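Putting the pieces together, a minimal self-contained sketch might look like the following. It assumes the column names and rows from the question; count_matches is a hypothetical name, and it takes the row hashes as an argument instead of relying on an instance variable.

```ruby
columns = ['A', 'B', 'C']
data = [[1, 2, 3], [1, 2, 4], [5, 6, 7], [1, 1, 3]]

# Turn each row into a Hash keyed by column name.
rows = data.map { |row| columns.zip(row).to_h }

# Hypothetical helper: count the rows whose cells equal
# every key/value pair given in match.
def count_matches(rows, match)
  rows.count do |row|
    match.all? { |column, value| row[column] == value }
  end
end

puts count_matches(rows, { 'A' => 1, 'C' => 3 }) # prints 2
```

Passing the rows in explicitly keeps the helper easy to test, at the cost of one extra argument per call.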
I am trying to output an array of 10-element arrays containing permutations of 1 and 2, e.g.:
[[1,1,1,1,1,1,1,1,1,1],
[1,1,1,1,1,1,1,1,1,2],
[1,2,1,2,1,2,1,2,1,2],
...etc...
[2,2,2,2,2,2,2,2,2,2]]
I have done this with smaller arrays, but with (10):
a = [1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2]
a = a.permutation(10).to_a
print a.uniq
...this is apparently too big a calculation: after running for an hour it had not completed, and the ruby process was using 12GB of memory. Is there another way to approach this?
First, check size of that permutation
a.permutation(10).size
=> 670442572800
It's huge. What you can do instead is use Array#repeated_permutation on a smaller array. Check it here:
b = [1, 2] # Only unique elements
b.repeated_permutation(10) # Returns an Enumerator
b.repeated_permutation(10).to_a # Creates an array with all of the permutations
Those permutations are already unique (which you can check by printing the size with and without Array#uniq).
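As a rough check of that claim, the sketch below uses a length of 3 instead of 10 to keep things small. The variable names are illustrative only.

```ruby
b = [1, 2]

perms = b.repeated_permutation(3) # an Enumerator; nothing is generated yet
list = perms.to_a

p list.size                        # 8, i.e. 2**3
p list == list.uniq                # true: there are no duplicates to remove
p b.repeated_permutation(10).size  # 1024, without building the arrays
```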
The approach you've gone down does indeed generate a lot of data: 20!/10!, or about 670 billion arrays. This is clearly very wasteful when the output you are after only has 1024 arrays.
What you are after is closer to the product method on Array:
[1,2].product(*([[1,2]]*9))
The product method produces all the possible combinations by picking one element from each of the receiver and its arguments. The use of splats and the * method on array is just to avoid writing [1,2] 9 times.
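Here is a small sketch of the same idea with the length pulled out into a parameter. The helper name all_rows is hypothetical, and the last line simply recomputes the 20!/10! count mentioned above.

```ruby
# Hypothetical helper: every array of length n (n >= 1)
# built by picking each position from values.
def all_rows(values, n)
  values.product(*([values] * (n - 1)))
end

p all_rows([1, 2], 2)        # [[1, 1], [1, 2], [2, 1], [2, 2]]
p all_rows([1, 2], 10).size  # 1024, i.e. 2**10
p((11..20).reduce(:*))       # 670442572800, i.e. 20!/10!
```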
Edit: Don't use this... it's about 10 times slower than running: b = [1,2]; b.repeated_permutation(10).to_a
What you're looking for is actually a binary permutation array...
Taking that approach, we know we have 2**10 (== 1024) permutations, which correspond to the numbers between 0 and 1023.
Try this:
1024.times.with_object([]) {|i, array| array << ( ("%10b" % i).unpack("U*").map {|v| (v == 49) ? 2 : 1} ) }
Or this (slightly faster):
(0..1023).each.with_object([]) {|i, array| array << ( (0..9).each.with_object([]) {|j, p| p << (i[j] +1)} ) }
You take the number of options, 1024. Each option is identified by its number i (0 through 1023), and you extract the binary code that makes up that number.
The binary code is extracted, in the first example, by converting the number to a string 10 characters long using "%10b" % i and then unpacking that string to an array of character codes. I map that array, replacing white space (32) and zero (48) with the number 1, and one (49) with the number 2.
Voila.
There should be a better way to extract the binary representation of the numbers, but I couldn't think of one, as I'm running on very little sleep.
Here's another way that is similar to the approach taken by @Myst, except I've used Fixnum#to_s (Integer#to_s as of Ruby 2.4) to convert an integer to a string representation of its binary equivalent.
There are 2**n integers with n digits (including leading zeros), when each digit equals 1 or 2 (or equals 0 or 1). We therefore can 1-1 map each integer i between 0 and 2**n-1 to one of the integers containing only digits 1 and 2 by converting 0-bits to 1 and 1-bits to 2.
In Ruby, one way to convert 123 to binary is:
123.to_s(2).to_i #=> 1111011
As
2**9 #=> 512
(2**9-1).to_s(2) #=> "111111111"
(2**9-1).to_s(2).to_i #=> 111111111
we see that when comparing numbers between 0 and 511 (2**9-1), the binary representation of 123 would have two leading zeros. As we need those leading zeros to make the conversion to 1's and 2's, it is convenient to leave the binary representation of each number as a string, and pad the string with zeros:
str = 123.to_s(2).rjust(9,'0') #=> "001111011"
That allows us to write:
str.each_char.map { |c| c.eql?("0") ? 1 : 2 }
#=> [1, 1, 2, 2, 2, 2, 1, 2, 2]
We can wrap this in a method:
def combos(n)
(2**n).times.map { |i| i.to_s(2)
.rjust(n,'0')
.each_char
.map { |c| c.eql?("0") ? 1 : 2 } }
end
combos(1)
#=> [[1], [2]]
combos(2)
#=> [[1, 1], [1, 2], [2, 1], [2, 2]]
combos(3)
#=> [[1, 1, 1], [1, 1, 2], [1, 2, 1], [1, 2, 2],
# [2, 1, 1], [2, 1, 2], [2, 2, 1], [2, 2, 2]]
combos(4)
#=> [[1, 1, 1, 1], [1, 1, 1, 2], [1, 1, 2, 1], [1, 1, 2, 2],
# [1, 2, 1, 1], [1, 2, 1, 2], [1, 2, 2, 1], [1, 2, 2, 2],
# [2, 1, 1, 1], [2, 1, 1, 2], [2, 1, 2, 1], [2, 1, 2, 2],
# [2, 2, 1, 1], [2, 2, 1, 2], [2, 2, 2, 1], [2, 2, 2, 2]]
combos(10).size
#=> 1024
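As a sanity check, a sketch like the following compares the bit-string approach with Array#repeated_permutation. combos_from_bits is a hypothetical stand-alone copy of the method above, repeated here only so that the snippet runs by itself.

```ruby
# Hypothetical copy of combos: map each n-bit number to an array
# of 1s and 2s (0-bit => 1, 1-bit => 2).
def combos_from_bits(n)
  (2**n).times.map do |i|
    i.to_s(2).rjust(n, '0').each_char.map { |c| c == '0' ? 1 : 2 }
  end
end

p combos_from_bits(3) == [1, 2].repeated_permutation(3).to_a # true
p combos_from_bits(10).size                                   # 1024
```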
Say I have an array: [1, 2, 3, 4, 5].
Given another array ([2, 4], for example), I would like to have a new array (or the initial array modified, doesn't matter) that looks like: [1, 3, 5, 2, 4]. So selected elements are moved to the end of the array.
Pushing the elements back is quite straight-forward, but how do I pop specific elements from an array?
a = [1, 2, 3, 4, 5]
b = [2, 4]
(a - b) + (b & a)
#=> [1, 3, 5, 2, 4]
a - b gives the elements that are in a but not in b, while b & a gives the elements common to both arrays. Concatenating the two gives your expected result.
If the elements in a are not unique (as mentioned by eugen) and it's important to move only one occurrence of each element of b, you could do something like:
a = [1, 2, 3, 2, 4, 5, 4, 2]
b = [2, 4, 7]
p (b&a).unshift(a.map{|el|
b.include?(el) ? begin b = b -[el]; nil end : el
}.compact).flatten
#=> [1, 3, 2, 5, 4, 2, 2, 4]
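The same idea can be spelled out more explicitly. This is only a sketch: move_to_end is a hypothetical name, and it assumes that each entry in b allows one matching occurrence in a to be moved. Unlike the snippet above, it does not reassign b, and moved elements keep the order in which they appear in a.

```ruby
# Hypothetical helper: move one occurrence of each element of b
# (if present in a) to the end of a new array.
def move_to_end(a, b)
  # How many occurrences of each value may still be moved.
  remaining = b.each_with_object(Hash.new(0)) { |el, counts| counts[el] += 1 }
  moved = []
  kept = a.reject do |el|
    if remaining[el] > 0
      remaining[el] -= 1
      moved << el
      true # reject: this occurrence goes to the end
    end
  end
  kept + moved
end

p move_to_end([1, 2, 3, 2, 4, 5, 4, 2], [2, 4, 7]) # [1, 3, 2, 5, 4, 2, 2, 4]
p move_to_end([1, 2, 3, 4, 5], [2, 4])             # [1, 3, 5, 2, 4]
```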
Array#drop removes the first n elements of an array. What is a good way to remove the last m elements of an array? Alternately, what is a good way to keep the middle elements of an array (greater than n, less than m)?
This is exactly what Array#pop is for:
x = [1,2,3]
x.pop(2) # => [2,3]
x # => [1]
You can also use Array#slice method, e.g.:
[1,2,3,4,5,6].slice(1..4) # => [2, 3, 4, 5]
or
a = [1,2,3,4,5,6]
a.take 3 # => [1, 2, 3]
a.first 3 # => [1, 2, 3]
a.first a.size - 1 # to get rid of the last one
The most direct opposite of drop (drop the first n elements) would be take, which keeps the first n elements (there's also take_while which is analogous to drop_while).
Slice allows you to return a subset of the array either by specifying a range or an offset and a length. Array#[] behaves the same when passed a range as an argument or when passed 2 numbers
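A short sketch of the two argument forms side by side, using the same sample array as above:

```ruby
a = [1, 2, 3, 4, 5, 6]

p a[1..4]       # range form, indices 1 through 4: [2, 3, 4, 5]
p a[1, 4]       # start and length form, 4 elements from index 1: [2, 3, 4, 5]
p a.slice(1, 4) # slice accepts the same arguments as []: [2, 3, 4, 5]
p a.take(4)     # the first 4 elements: [1, 2, 3, 4]
```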
This will get rid of the last n elements:
a = [1,2,3,4,5,6]
n = 4
p a[0, (a.size-n)]
#=> [1, 2]
n = 2
p a[0, (a.size-n)]
#=> [1, 2, 3, 4]
Regarding "middle" elements (selected here by value, not by position):
min, max = 2, 5
p a.select {|v| (min..max).include? v }
#=> [2, 3, 4, 5]
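The select above filters by value. If "middle" is meant by position instead, a sketch along these lines may be closer. It assumes n elements are dropped from the front and m from the back; middle is a hypothetical helper name.

```ruby
# Hypothetical helper: drop the first n and the last m elements,
# returning a new array (empty if nothing is left).
def middle(array, n, m)
  array[n...(array.size - m)] || []
end

a = [1, 2, 3, 4, 5, 6]
p middle(a, 1, 2) # [2, 3, 4]
p middle(a, 0, 0) # [1, 2, 3, 4, 5, 6]
p middle(a, 4, 4) # []
```

The `|| []` covers the case where n is larger than the array size, in which Array#[] returns nil.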
I wanted the return value to be the array without the dropped elements. I found a couple solutions here to be okay:
count = 2
[1, 2, 3, 4, 5].slice 0..-(count + 1) # => [1, 2, 3]
[1, 2, 3, 4, 5].tap { |a| a.pop count } # => [1, 2, 3]
But I found another solution to be more readable if the order of the array isn't important (in my case I was deleting files):
count = 2
[1, 2, 3, 4, 5].reverse.drop count # => [3, 2, 1]
You could tack another .reverse on there if you need to preserve order but I think I prefer the tap solution at that point.
You can achieve the same as Array#pop in a non-destructive way, and without needing to know the length of the array:
a = [1, 2, 3, 4, 5, 6]
b = a[0..-2]
# => [1, 2, 3, 4, 5]
n = 3 # if we want drop the last n elements
c = a[0..-(n+1)]
# => [1, 2, 3]
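To make the destructive versus non-destructive difference concrete, here is a small sketch contrasting the range form with Array#pop on the same array. The variable names are illustrative.

```ruby
a = [1, 2, 3, 4, 5, 6]
n = 3

without_last = a[0..-(n + 1)] # builds a new array
p without_last                # [1, 2, 3]
p a                           # [1, 2, 3, 4, 5, 6], unchanged

removed = a.pop(n)            # removes in place, returns what was removed
p removed                     # [4, 5, 6]
p a                           # [1, 2, 3]
```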
Array#delete_at() is the simplest way to delete the last element of an array, like so:
arr = [1,2,3,4,5,6]
arr.delete_at(-1)
p arr # => [1,2,3,4,5]
For deleting a segment, or segments, of an array use methods in the other answers.
You can also add some methods
class Array
# Using slice
def cut(n)
slice(0..-n-1)
end
# Using pop
def cut2(n)
dup.tap{|x| x.pop(n)}
end
# Using take
def cut3(n)
length - n >=0 ? take(length - n) : []
end
end
[1,2,3,4,5].cut(2)
=> [1, 2, 3]
Demo (I expect result [3]):
[1,2] - [1,2,3] => [] # Hmm
[1,2,3] - [1,2] => [3] # I see
a = [1,2].to_set => #<Set: {1, 2}>
b = [1,2,3].to_set => #<Set: {1, 2, 3}>
a - b => #<Set: {}> WTF!
And:
[1,2,9] - [1,2,3] => [9] # Hmm. Would like [[9],[3]]
How is one to perform a real set difference regardless of order of the inputs?
Ps. As an aside, I need to do this for two 2000-element arrays. Usually, array #1 will have fewer elements than array #2, but this is not guaranteed.
The - operator applied to two arrays a and b gives the relative complement of b in a (items that are in a but not in b).
What you are looking for is the symmetric difference of two sets (the union of both relative complements between the two). This will do the trick:
a = [1, 2, 9]
b = [1, 2, 3]
a - b | b - a # => [9, 3]
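If the two sides need to stay separate, as in the [[9],[3]] result the question asks for, one possible sketch is to keep both relative complements as a pair. The variable names are illustrative only.

```ruby
a = [1, 2, 9]
b = [1, 2, 3]

only_in_a = a - b # elements of a that are not in b
only_in_b = b - a # elements of b that are not in a

p([only_in_a, only_in_b]) # [[9], [3]]
p(only_in_a | only_in_b)  # [9, 3]
```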
If you are operating on Set objects, you may use the overloaded ^ operator:
require 'set' # not needed on Ruby 3.2 and later
c = Set[1, 2, 9]
d = Set[1, 2, 3]
c ^ d # => #<Set: {3, 9}>
For extra fun, you could also find the relative complement of the intersection in the union of the two sets:
( c | d ) - ( c & d ) # => #<Set: {9, 3}>
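A quick sketch to confirm that the two Set formulations agree. Set equality ignores element order, so the comparison does not depend on how the result is printed.

```ruby
require 'set'

c = Set[1, 2, 9]
d = Set[1, 2, 3]

p((c ^ d) == ((c | d) - (c & d))) # true
p((c ^ d) == Set[3, 9])           # true
p((c - d).to_a)                   # [9]
p((d - c).to_a)                   # [3]
```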