Slice an array in a specific way - Ruby

Python has a convenient slice syntax, like
my_array[3:]
I'm aware there are slice methods in Ruby as well, but none of them seems to do exactly what Python's my_array[3:] does (that is, when I don't know the size of the array). Is that right?

class Array
  def sub_array(pos, len = -1)
    # With no explicit length, take the rest of the array starting at pos.
    len = size - pos if len == -1
    slice(pos, len)
  end
end
my_array = %w[a b c d e f]
p my_array.sub_array(3) #=> ["d", "e", "f"]
p my_array.sub_array(5) #=> ["f"]
p my_array.sub_array(9) #=> nil
p my_array.sub_array(3, 2) #=> ["d", "e"]
p my_array.sub_array(3, 9) #=> ["d", "e", "f"]
Actually this was originally a substring method for String.

Please have a look at the documentation for Ruby's slice methods (Array#slice, also available as Array#[]). As @Blender suggested, you can also pass a range:
my_array[3..-1]
EDIT:
range example
array = ["a", "b", "c", "d", "e"]
array[3..-1]
will result in ["d", "e"], because "d" is at index 3 and -1 refers to the last element, "e".
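For comparison, here is a minimal sketch of other ways to write "everything from index 3 onward". It assumes Ruby 2.6 or later for the endless range; Array#drop also works on older versions. The variable names are only illustrative.

```ruby
array = ["a", "b", "c", "d", "e"]

tail_range   = array[3..-1]  # explicit range up to the last element
tail_endless = array[3..]    # endless range (assumes Ruby 2.6+)
tail_drop    = array.drop(3) # skip the first three elements

# Past the end of the array the spellings differ:
# a range that starts beyond the end gives nil, while drop gives [].
past_range = array[9..-1]
past_drop  = array.drop(9)
```

Since Python's my_array[9:] also returns an empty list rather than None, drop may be the closer match when the start index can exceed the array size.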
more examples
a = [ "a", "b", "c", "d", "e" ]
a[2] + a[0] + a[1] #=> "cab"
a[6] #=> nil
a[1, 2] #=> [ "b", "c" ]
a[1..3] #=> [ "b", "c", "d" ]
a[4..7] #=> [ "e" ]
a[6..10] #=> nil
a[-3, 3] #=> [ "c", "d", "e" ]
# special cases
a[5] #=> nil
a[5, 1] #=> []
a[5..10] #=> []

Related

Intersections and Unions in Ruby for sets with repeated elements

How do we get intersections and unions in Ruby for sets that contain repeated elements?
# given the sets
a = ["A", "B", "B", "C", "D", "D"]
b = ["B", "C", "D", "D", "D", "E"]
# A union function that adds repetitions
union(a, b)
=> ["A", "B", "B", "C", "D", "D", "D", "E"]
# An intersection function that adds repetitions
intersection(a, b)
=> ["B", "C", "D", "D"]
The & and | operators seem to ignore repetitions and duplicates, as described in the documentation.
# union without duplicates
a | b
=> ["A", "B", "C", "D", "E"]
# intersections without duplicates
a & b
=> ["B", "C", "D"]
def union(a,b)
(a|b).flat_map { |s| [s]*[a.count(s), b.count(s)].max }
end
union(a,b)
# => ["A", "B", "B", "C", "D", "D", "D", "E"]
def intersection(a,b)
(a|b).flat_map { |s| [s]*[a.count(s), b.count(s)].min }
end
intersection(a,b)
#=> ["B", "C", "D", "D"]
Building upon Cary Swoveland's answer, you could create a temporary hash to count the number of occurrences of each member in each array: (I've generalized the number of arguments)
def multiplicities(*arrays)
m = Hash.new { |h, k| h[k] = Array.new(arrays.size, 0) }
arrays.each_with_index { |ary, idx| ary.each { |x| m[x][idx] += 1 } }
m
end
multiplicities(a, b)
#=> {"A"=>[1, 0], "B"=>[2, 1], "C"=>[1, 1], "D"=>[2, 3], "E"=>[0, 1]}
Implementing union and intersection is then straightforward:
def union(*arrays)
multiplicities(*arrays).flat_map { |x, m| Array.new(m.max, x) }
end
def intersection(*arrays)
multiplicities(*arrays).flat_map { |x, m| Array.new(m.min, x) }
end
union(a, b) #=> ["A", "B", "B", "C", "D", "D", "D", "E"]
intersection(a, b) #=> ["B", "C", "D", "D"]
With this approach each array has to be traversed only once.
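As a rough sketch of the same counting idea for exactly two arrays, Enumerable#tally (assuming Ruby 2.7 or later) can build the per-array counts. The method names multiset_union and multiset_intersection are hypothetical, chosen here so they do not clash with the definitions above.

```ruby
# Hypothetical helpers; Enumerable#tally requires Ruby 2.7+.
def multiset_union(a, b)
  count_a = a.tally
  count_b = b.tally
  # Every distinct element appears max(count in a, count in b) times.
  (count_a.keys | count_b.keys).flat_map do |x|
    [x] * [count_a.fetch(x, 0), count_b.fetch(x, 0)].max
  end
end

def multiset_intersection(a, b)
  count_a = a.tally
  count_b = b.tally
  # Only elements present in both arrays survive, min(count) times each.
  (count_a.keys & count_b.keys).flat_map do |x|
    [x] * [count_a[x], count_b[x]].min
  end
end

a = ["A", "B", "B", "C", "D", "D"]
b = ["B", "C", "D", "D", "D", "E"]
union_result        = multiset_union(a, b)
intersection_result = multiset_intersection(a, b)
```

Each array is still traversed only once to build its counts; the remaining work is proportional to the number of distinct elements.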

Convert array into a hash

I'm trying to learn map and group_by, but it's difficult...
My array of arrays:
a = [ [1, 0, "a", "b"], [1, 1, "c", "d"], [2, 0, "e", "f"], [3, 1, "g", "h"] ]
Expected result:
b= {
1=> {0=>["a", "b"], 1=>["c", "d"]} ,
2=> {0=>["e", "f"]} ,
3=> {1=>["g", "h"]}
}
I want to group by the first value; the second value can only be 0 or 1.
A starting point:
a.group_by{ |e| e.shift}.map { |k, v| {k=>v.group_by{ |e| e.shift}} }
=> [{1=>{0=>[["a", "b"]], 1=>[["c", "d"]]}},
{2=>{0=>[["e", "f"]]}}, {3=>{1=>[["g", "h"]]}}]
I want to look up "a" and "b" using the first two values, and a hash of hashes is the only solution I've found.
Not sure if group_by is the simplest solution here:
a = [ [1, 0, "a", "b"], [1, 1, "c", "d"], [2, 0, "e", "f"], [3, 1, "g", "h"] ]
result = a.inject({}) do |acc,(a,b,c,d)|
acc[a] ||= {}
acc[a][b] = [c,d]
acc
end
puts result.inspect
Will print:
{1=>{0=>["a", "b"], 1=>["c", "d"]}, 2=>{0=>["e", "f"]}, 3=>{1=>["g", "h"]}}
Also, avoid directly changing the items you're operating on (the shift calls). The collections you receive in your code might not be yours to change.
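To make the point about shift concrete, here is a small sketch. The names rows, copy, and nested are illustrative only. It first shows what shift does to the inner arrays (on a copy), then builds the nested hash by destructuring the block parameters, which leaves the input untouched.

```ruby
rows = [[1, 0, "a", "b"], [1, 1, "c", "d"], [2, 0, "e", "f"], [3, 1, "g", "h"]]

# shift removes the first element of each inner array in place,
# so demonstrate it on duplicated inner arrays.
copy = rows.map(&:dup)
copy.group_by { |e| e.shift }
# copy has now lost the first element of every inner array.

# Destructuring reads the two keys and the remaining values without mutation.
nested = rows.each_with_object({}) do |(k1, k2, *rest), acc|
  (acc[k1] ||= {})[k2] = rest
end
```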
If you want a somewhat custom group_by, I tend to just do it manually. group_by creates an Array of grouped values, so it produces [["a", "b"]] instead of ["a", "b"]. In addition, your code is destructive, i.e. it modifies a. That is only a problem if you plan on reusing a later in its original form, but it is important to note.
As I mentioned though, you might as well just loop through a once and build the desired structure instead of doing multiple group_bys.
b = {}
a.each do |aa|
(b[aa[0]] ||= {})[aa[1]] = aa[2..3]
end
b # => {1=>{0=>["a", "b"], 1=>["c", "d"]}, 2=>{0=>["e", "f"]}, 3=>{1=>["g", "h"]}}
With (b[aa[0]] ||= {}) we check for the existence of the key aa[0] in the Hash b. If it does not exist, we assign an empty Hash ({}) to that key. Following that, we insert the last two elements of aa (= aa[2..3]) into that Hash, with aa[1] as key.
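As an alternative sketch of the same "create the inner hash on first use" step, a Hash with a default block can replace the ||= idiom. This is one possible variation, not part of the answer above.

```ruby
a = [[1, 0, "a", "b"], [1, 1, "c", "d"], [2, 0, "e", "f"], [3, 1, "g", "h"]]

# The default block creates and stores an empty inner hash on first access.
b = Hash.new { |hash, key| hash[key] = {} }
a.each { |first, second, *rest| b[first][second] = rest }
```

One trade-off with this variant: merely reading a missing key, such as b[99], also stores an empty hash under that key, so the ||= form may be preferable if the result is handed to other code.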
Note that this does not account for duplicate primary + secondary keys. That is, if you have another entry [1, 1, "x", "y"] it will overwrite the entry of [1, 1, "c", "d"] because they both have keys 1 and 1. You can fix that by storing the values in an Array, but then you might as well just do a double group_by. For example, with destructive behavior on a, handling "duplicates":
# Added [1, 1, "x", "y"], removed some others
a = [ [1, 0, "a", "b"], [1, 1, "c", "d"], [1, 1, "x", "y"] ]
b = Hash[a.group_by(&:shift).map { |k, v| [k, v.group_by(&:shift) ] }]
#=> {1=>{0=>[["a", "b"]], 1=>[["c", "d"], ["x", "y"]]}}
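If mutating a is not acceptable, a non-destructive sketch of the same double grouping could use Hash#transform_values (assuming Ruby 2.4 or later) and drop(2) in place of the two shift calls:

```ruby
a = [[1, 0, "a", "b"], [1, 1, "c", "d"], [1, 1, "x", "y"]]

# Group by the first key, then by the second, keeping only the trailing values.
b = a.group_by(&:first).transform_values do |rows|
  rows.group_by { |row| row[1] }
      .transform_values { |group| group.map { |row| row.drop(2) } }
end
```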
[[1, 0, "a", "b"], [1, 1, "c", "d"], [2, 0, "e", "f"], [3, 1, "g", "h"]].
group_by{ |e| e.shift }.
map{ |k, v| [k, v.inject({}) { |h, v| h[v.shift] = v; h }] }.
to_h
#=> {1=>{0=>["a", "b"], 1=>["c", "d"]}, 2=>{0=>["e", "f"]}, 3=>{1=>["g", "h"]}}
Here's how you can do it (nondestructively) with two Enumerable#group_by's and an Object#tap. The elements of a (arrays) could vary in size, and the size of each could be two or greater.
Code
def convert(arr)
  h = arr.group_by(&:first)
  h.keys.each do |k|
    h[k] = h[k].group_by { |a| a[1] }.tap do |g|
      g.keys.each { |j| g[j] = g[j].first[2..-1] }
    end
  end
  h
end
Example
a = [ [1, 0, "a", "b"], [1, 1, "c", "d"], [2, 0, "e", "f"], [3, 1, "g", "h"] ]
convert(a)
#=> {1=>{0=>["a", "b"], 1=>["c", "d"]}, 2=>{0=>["e", "f"]}, 3=>{1=>["g", "h"]}}
Explanation
h = a.group_by(&:first)
#=> {1=>[[1, 0, "a", "b"], [1, 1, "c", "d"]],
# 2=>[[2, 0, "e", "f"]],
# 3=>[[3, 1, "g", "h"]]}
keys = h.keys
#=> [1, 2, 3]
The first value of keys passed into the block assigns the value 1 to the block variable k. We will set h[1] to a hash f, computed as follows.
f = h[k].group_by { |a| a[1] }
#=> [[1, 0, "a", "b"], [1, 1, "c", "d"]].group_by { |a| a[1] }
#=> {0=>[[1, 0, "a", "b"]], 1=>[[1, 1, "c", "d"]]}
We need to do further processing of this hash, so we capture it with tap and assign it to tap's block variable g (i.e., g will initially equal f above). g will be returned by the block after modification.
We have
g.keys #=> [0, 1]
so 0 is the first value passed into each's block and assigned to the block variable j. We then compute:
g[j] = g[j].first[2..-1]
#=> g[0] = [[1, 0, "a", "b"]].first[2..-1]
#=> ["a", "b"]
Similarly, when g's second key (1) is passed into the block,
g[j] = g[j].first[2..-1]
#=> g[1] = [[1, 1, "c", "d"]].first[2..-1]
#=> ["c", "d"]
Ergo,
h[1] = g
#=> {0=>["a", "b"], 1=>["c", "d"]}
h[2] and h[3] are computed similarly, giving us the desired result.

How to match bar, b-a-r, b--a--r etc in a string by Regexp

Given a string, I want to find the word bar, b-a-r, b--a--r etc., where - can be any letter. The interval between the letters must be the same, though.
All letters are lowercase and there are no spaces between them.
For example, bar, beayr, qbowarprr, and wbxxxxxayyyyyrzzz should all match.
I tried /b[a-z]*a[a-z]*r/ but this matches bxar which is wrong.
Can I achieve this with a regexp?
Here is one way to get all matches.
Code
def all_matches_with_spacers(word, str)
word_size = word.size
word_arr = word.chars
str_arr = str.chars
(0..(str.size - word_size)/(word_size-1)).each_with_object([]) do |n, arr|
regex = Regexp.new(word_arr.join(".{#{n}}"))
str_arr.each_cons(word_size + n * (word_size - 1))
.map(&:join)
.each { |substring| arr << substring if substring =~ regex }
end
end
This requires word.size > 1.
Example
all_matches_with_spacers('bar', 'bar') #=> ["bar"]
all_matches_with_spacers('bar', 'beayr') #=> ["beayr"]
all_matches_with_spacers('bar', 'qbowarprr') #=> ["bowarpr"]
all_matches_with_spacers('bar', 'wbxxxxxayyyyyrzzz') #=> ["bxxxxxayyyyyr"]
all_matches_with_spacers('bobo', 'bobobocbcbocbcobcodbddoddbddobddoddbddob')
#=> ["bobo", "bobo", "bddoddbddo", "bddoddbddo"]
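If only a yes/no answer is needed, the same per-spacer patterns can be combined into one alternation with Regexp.union. This is a sketch under the same assumption as above (word.size > 1); spaced_word_regex and spaced_word? are hypothetical names, and String#match? assumes Ruby 2.4 or later.

```ruby
# Hypothetical helper: one regex that tries every spacer count that could
# fit in a string of length max_len. Assumes word.size > 1.
def spaced_word_regex(word, max_len)
  chars = word.chars.map { |c| Regexp.escape(c) }
  max_spacers = (max_len - word.size) / (word.size - 1)
  # An empty list (string shorter than word) yields a regex that never matches.
  Regexp.union((0..max_spacers).map { |n| Regexp.new(chars.join(".{#{n}}")) })
end

# Hypothetical predicate built on the regex above.
def spaced_word?(word, str)
  str.match?(spaced_word_regex(word, str.size))
end

matches = %w[bar beayr qbowarprr wbxxxxxayyyyyrzzz].map { |s| spaced_word?('bar', s) }
rejects = spaced_word?('bar', 'bxar')
```

The regex grows with the length of the string being searched, so this is a convenience for short inputs rather than a general replacement for the method above.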
Explanation
Suppose
word = 'bobo'
str = 'bobobocbcbocbcobcodbddoddbddobddoddbddob'
then
word_size = word.size #=> 4
word_arr = word.chars #=> ["b", "o", "b", "o"]
str_arr = str.chars
#=> ["b", "o", "b", "o", "b", "o", "c", "b", "c", "b", "o", "c", "b", "c",
# "o", "b", "c", "o", "d", "b", "d", "d", "o", "d", "d", "b", "d", "d",
# "o", "b", "d", "d", "o", "d", "d", "b", "d", "d", "o", "b"]
If n is the number of spacers between each letter of word, we require
word.size + n * (word.size - 1) <= str.size
Hence (since str.size #=> 40),
n <= (str.size - word_size)/(word_size-1) #=> (40-4)/(4-1) => 12
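As a quick check of this bound, the following sketch computes the largest spacer count and the window length that each spacer count implies. The names max_spacers and window_sizes are illustrative.

```ruby
word = 'bobo'
str  = 'bobobocbcbocbcobcodbddoddbddobddoddbddob'

# Integer division gives the largest n with word.size + n * (word.size - 1) <= str.size.
max_spacers = (str.size - word.size) / (word.size - 1)

# Length of word once n spacers are placed between each pair of letters.
window_sizes = (0..max_spacers).map { |n| word.size + n * (word.size - 1) }
```

Here the largest window (n = 12) is exactly as long as str, so one more spacer would no longer fit.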
We therefore will iterate over zero to 12 spacers:
(0..12).each_with_object([]) do |n, arr| .. end
Enumerable#each_with_object creates an initially-empty array denoted by the block variable arr. The first value passed to block is zero (spacers), assigned to the block variable n.
We then have
regex = Regexp.new(word_arr.join(".{#{0}}")) #=> /b.{0}o.{0}b.{0}o/
which is the same as /bobo/. word with n spacers has length
word_size + n * (word_size - 1) #=> 4 (for n = 0)
To extract all sub-arrays of str_arr with this length, we invoke:
str_arr.each_cons(word_size + n * (word_size - 1))
Here, with n = 0, this is:
enum = str_arr.each_cons(4)
#=> #<Enumerator: ["b", "o", "b", "o", "b", "o",...,"b"]:each_cons(4)>
This enumerator passes 37 four-element arrays (40 - 4 + 1) into its block. The first 15 are:
enum.first(15)
#=> [["b", "o", "b", "o"], ["o", "b", "o", "b"], ["b", "o", "b", "o"],
# ["o", "b", "o", "c"], ["b", "o", "c", "b"], ["o", "c", "b", "c"],
# ["c", "b", "c", "b"], ["b", "c", "b", "o"], ["c", "b", "o", "c"],
# ["b", "o", "c", "b"], ["o", "c", "b", "c"], ["c", "b", "c", "o"],
# ["b", "c", "o", "b"], ["c", "o", "b", "c"], ["o", "b", "c", "o"]]
We next convert these to strings (again, only the first 15 are shown):
ar = enum.map(&:join)
#=> ["bobo", "obob", "bobo", "oboc", "bocb", "ocbc", "cbcb", "bcbo",
# "cboc", "bocb", "ocbc", "cbco", "bcob", "cobc", "obco",
# ...]
and append to the array arr each string (assigned to the block variable substring) for which
substring =~ regex
is truthy:
ar.each { |substring| arr << substring if substring =~ regex }
arr #=> ["bobo", "bobo"]
Next we increment the number of spacers to n = 1. This has the following effect:
regex = Regexp.new(word_arr.join(".{#{1}}")) #=> /b.{1}o.{1}b.{1}o/
str_arr.each_cons(4 + 1 * (4 - 1)) #=> str_arr.each_cons(7)
so we now examine the strings
ar = str_arr.each_cons(7).map(&:join)
#=> ["boboboc", "obobocb", "bobocbc", "obocbcb", "bocbcbo", "ocbcboc",
# "cbcbocb", "bcbocbc", "cbocbco", "bocbcob", "ocbcobc", "cbcobco",
# "bcobcod", "cobcodb", "obcodbd", "bcodbdd", "codbddo", "odbddod",
# "dbddodd", "bddoddb", "ddoddbd", "doddbdd", "oddbddo", "ddbddob",
# "dbddobd", "bddobdd", "ddobddo", "dobddod", "obddodd", "bddoddb",
# "ddoddbd", "doddbdd", "oddbddo", "ddbddob"]
ar.each { |substring| arr << substring if substring =~ regex }
There are no matches with one spacer, so arr remains unchanged:
arr #=> ["bobo", "bobo"]
For n = 2 spacers:
regex = Regexp.new(word_arr.join(".{#{2}}")) #=> /b.{2}o.{2}b.{2}o/
str_arr.each_cons(4 + 2 * (4 - 1)) #=> str_arr.each_cons(10)
ar = str_arr.each_cons(10).map(&:join)
#=> ["bobobocbcb", "obobocbcbo", "bobocbcboc", "obocbcbocb", "bocbcbocbc",
# "ocbcbocbco", "cbcbocbcob", "bcbocbcobc", "cbocbcobco", "bocbcobcod",
# ...
# "ddoddbddob"]
ar.each { |substring| arr << substring if substring =~ regex }
arr #=> ["bobo", "bobo", "bddoddbddo", "bddoddbddo"]
No matches are found for more than two spacers, so the method returns
["bobo", "bobo", "bddoddbddo", "bddoddbddo"]
For reference, there is a beautiful solution to the overall problem that is available in regex flavors that allow a capturing group to refer to itself:
^[^b]*bar|b(?:[^a](?=[^a]*a(\1?+.)))+a\1r
Sadly, Ruby doesn't allow this.
The interesting bit is on the right side of the alternation. After matching the initial b, we define a non-capturing group for the characters between b and a. This group is repeated with the +. Between the a and r, we inject capture group 1 with \1. That group was captured one character at a time, overwriting itself with each pass, as each character between b and a was added.
See Quantifier Capture, where the solution was demonstrated by @CasimiretHippolyte, who refers to the idea behind the technique as the "qtax trick".

Trying to understand Ruby arrays [duplicate]

This question already has answers here:
Array slicing in Ruby: explanation for illogical behaviour (taken from Rubykoans.com)
array = [:peanut, :butter, :and, :jelly]
Why does array[4,0] return [] while array[5,0] returns nil?
According to Array#[] documentation:
an empty array is returned when the starting index for an element
range is at the end of the array.
Returns nil if the index (or starting index) are out of range.
a = [ "a", "b", "c", "d", "e" ]
a[2] + a[0] + a[1] #=> "cab"
a[6] #=> nil
a[1, 2] #=> [ "b", "c" ]
a[1..3] #=> [ "b", "c", "d" ]
a[4..7] #=> [ "e" ]
a[6..10] #=> nil
a[-3, 3] #=> [ "c", "d", "e" ]
# special cases
a[5] #=> nil
a[6, 1] #=> nil
a[5, 1] #=> []
a[5..10] #=> []
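One way to picture the rule (an interpretation, not wording from the documentation) is that the start index names a position between elements, and the position just after the last element is still a valid place to begin an empty slice. A short sketch with the array from the question, where zero_length is an illustrative name:

```ruby
array = [:peanut, :butter, :and, :jelly]

# Zero-length slices starting at each position from 0 to 5.
# Positions 0..4 (including the end, 4) are valid; 5 is past the end.
zero_length = (0..5).map { |start| array[start, 0] }
```

Only the last entry, for start 5, is nil; every start from 0 through array.size yields an empty array.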

How does ruby handle array range accessing?

ruby-1.8.7-p174 > [0,1][2..3]
=> []
ruby-1.8.7-p174 > [0,1][3..4]
=> nil
In a 0-index setting where index 2, 3, and 4 are all in fact out of bounds of the 2-item array, why would these return different values?
This is a known ugly odd corner. Take a look at the examples in rdoc for Array#slice.
This specific issue is listed as a "special case"
a = [ "a", "b", "c", "d", "e" ]
a[2] + a[0] + a[1] #=> "cab"
a[6] #=> nil
a[1, 2] #=> [ "b", "c" ]
a[1..3] #=> [ "b", "c", "d" ]
a[4..7] #=> [ "e" ]
a[6..10] #=> nil
a[-3, 3] #=> [ "c", "d", "e" ]
# special cases
a[5] #=> nil
a[5, 1] #=> []
a[5..10] #=> []
If the start is exactly one item beyond the end of the array, then it will return [], an empty array. If the start is beyond that, nil. It's documented, though I'm not sure of the reason for it.
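When the nil result is inconvenient, a few common ways to normalise it to an empty array are sketched below. The variable names are illustrative, and the behaviour shown is what current Ruby versions do for these calls.

```ruby
pair = [0, 1]

at_end   = pair[2..3]           # start == size
past_end = pair[3..4]           # start > size

safe_or    = pair[3..4] || []   # fall back to an empty array
safe_array = Array(pair[3..4])  # Kernel#Array turns nil into []
safe_drop  = pair.drop(3)       # drop never returns nil
```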

Resources