How to map and remove nil values in Ruby - ruby

I have a map which either changes a value or sets it to nil. I then want to remove the nil entries from the list. The list doesn't need to be kept.
This is what I currently have:
# A simple example function, which returns a value or nil
def transform(n)
rand > 0.5 ? n * 10 : nil
end
items.map! { |x| transform(x) } # [1, 2, 3, 4, 5] => [10, nil, 30, 40, nil]
items.reject! { |x| x.nil? } # [10, nil, 30, 40, nil] => [10, 30, 40]
I'm aware I could just do a loop and conditionally collect in another array like this:
new_items = []
items.each do |x|
x = transform(x)
new_items.append(x) unless x.nil?
end
items = new_items
But it doesn't seem that idiomatic. Is there a nice way to map a function over a list, removing/excluding the nils as you go?

You could use compact:
[1, nil, 3, nil, nil].compact
=> [1, 3]
I'd like to remind people that if you're getting an array containing nils as the output of a map block, and that block tries to conditionally return values, then you've got code smell and need to rethink your logic.
For instance, if you're doing something that does this:
[1,2,3].map{ |i|
if i % 2 == 0
i
end
}
# => [nil, 2, nil]
Then don't. Instead, prior to the map, reject the stuff you don't want or select what you do want:
[1,2,3].select{ |i| i % 2 == 0 }.map{ |i|
i
}
# => [2]
I consider using compact to clean up a mess as a last-ditch effort to get rid of things we didn't handle correctly, usually because we didn't know what was coming at us. We should always know what sort of data is being thrown around in our program; unexpected or unknown data is bad. Any time I see nils in an array I'm working on, I dig into why they exist and see if I can improve the code generating the array, rather than allow Ruby to waste time and memory generating nils and then sifting through the array to remove them later.
'Just my $%0.2f.' % [2.to_f/100]

Try using reduce or inject.
[1, 2, 3].reduce([]) { |memo, i|
if i % 2 == 0
memo << i
end
memo
}
I agree with the accepted answer that we shouldn't map and compact, but not for the same reasons.
I feel deep inside that map then compact is equivalent to select then map. Consider: map is a one-to-one function. If you are mapping from some set of values, and you map, then you want one value in the output set for each value in the input set. If you are having to select before-hand, then you probably don't want a map on the set. If you are having to select afterwards (or compact) then you probably don't want a map on the set. In either case you are iterating twice over the entire set, when a reduce only needs to go once.
Also, in English, you are trying to "reduce a set of integers into a set of even integers".

Ruby 2.7+
There is now!
Ruby 2.7 is introducing filter_map for this exact purpose. It's idiomatic and performant, and I'd expect it to become the norm very soon.
For example:
numbers = [1, 2, 5, 8, 10, 13]
numbers.filter_map { |i| i * 2 if i.even? }
# => [4, 16, 20]
In your case, since the block returns a falsey value (nil) for the entries you want to drop, simply:
items.filter_map { |x| process_x url }
"Ruby 2.7 adds Enumerable#filter_map" is a good read on the subject, with some performance benchmarks against some of the earlier approaches to this problem:
N = 100_000
enum = 1.upto(1_000)
Benchmark.bmbm do |x|
x.report("select + map") { N.times { enum.select { |i| i.even? }.map{ |i| i + 1 } } }
x.report("map + compact") { N.times { enum.map { |i| i + 1 if i.even? }.compact } }
x.report("filter_map") { N.times { enum.filter_map { |i| i + 1 if i.even? } } }
end
# Rehearsal -------------------------------------------------
# select + map 8.569651 0.051319 8.620970 ( 8.632449)
# map + compact 7.392666 0.133964 7.526630 ( 7.538013)
# filter_map 6.923772 0.022314 6.946086 ( 6.956135)
# --------------------------------------- total: 23.093686sec
#
# user system total real
# select + map 8.550637 0.033190 8.583827 ( 8.597627)
# map + compact 7.263667 0.131180 7.394847 ( 7.405570)
# filter_map 6.761388 0.018223 6.779611 ( 6.790559)
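One caveat worth keeping in mind: filter_map drops every falsey block result, so false is removed along with nil. A minimal sketch, assuming Ruby 2.7+ and a made-up `flags` array used purely for illustration:

```ruby
# Assumes Ruby 2.7+; `flags` is a hypothetical input for illustration.
flags = [true, nil, false, true, nil]

# filter_map discards both nil and false block results.
only_truthy = flags.filter_map { |f| f }

# To drop only the nils and keep false, compact is the safer choice.
without_nils = flags.compact

p only_truthy   # => [true, true]
p without_nils  # => [true, false, true]
```

If the transformed values can legitimately be false, map followed by compact (or each_with_object) is probably the better fit.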

Definitely compact is the best approach for solving this task. However, we can achieve the same result just with a simple subtraction:
[1, nil, 3, nil, nil] - [nil]
=> [1, 3]

In your example:
items.map! { |x| process_x url } # [1, 2, 3, 4, 5] => [1, nil, 3, nil, nil]
it does not look like the values have changed other than being replaced with nil. If that is the case, then:
items.select{|x| process_x url}
will suffice.

If you wanted a looser criterion for rejection, for example to reject empty strings as well as nil, you could use blank? (which comes from Rails' ActiveSupport, not core Ruby):
[1, nil, 3, 0, ''].reject(&:blank?)
=> [1, 3, 0]
If you wanted to go further and reject zero values (or apply more complex logic to the process), you could pass a block to reject:
[1, nil, 3, 0, ''].reject do |value| value.blank? || value==0 end
=> [1, 3]
[1, nil, 3, 0, '', 1000].reject do |value| value.blank? || value==0 || value>10 end
=> [1, 3]

You can use #compact method on the resulting array.
[10, nil, 30, 40, nil].compact => [10, 30, 40]

each_with_object is probably the cleanest way to go here:
new_items = items.each_with_object([]) do |x, memo|
ret = process_x(x)
memo << ret unless ret.nil?
end
In my opinion, each_with_object is better than inject/reduce in conditional cases because you don't have to worry about the return value of the block.
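The difference in how the two methods treat the block's return value can be sketched as follows. Here `transform` is a deterministic stand-in for the question's method (an assumption for illustration, so the output is predictable):

```ruby
# Hypothetical, deterministic stand-in for the question's method:
# even numbers are multiplied by 10, odd numbers become nil.
def transform(n)
  n.even? ? n * 10 : nil
end

items = [1, 2, 3, 4, 5]

# inject: the block's return value becomes the next memo, so the
# accumulator has to be returned explicitly on every iteration.
via_inject = items.inject([]) do |memo, x|
  ret = transform(x)
  memo << ret unless ret.nil?
  memo
end

# each_with_object: the same array is handed to every iteration,
# whatever the block happens to return.
via_each_with_object = items.each_with_object([]) do |x, memo|
  ret = transform(x)
  memo << ret unless ret.nil?
end

p via_inject            # => [20, 40]
p via_each_with_object  # => [20, 40]
```

Leaving out the trailing `memo` in the inject version would make the memo nil after the first skipped element, which is the mistake each_with_object avoids.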

One more way to accomplish it will be as shown below. Here, we use Enumerable#each_with_object to collect values, and make use of Object#tap to get rid of temporary variable that is otherwise needed for nil check on result of process_x method.
items.each_with_object([]) {|x, obj| (process x).tap {|r| obj << r unless r.nil?}}
Complete example for illustration:
items = [1,2,3,4,5]
def process x
rand(10) > 5 ? nil : x
end
items.each_with_object([]) {|x, obj| (process x).tap {|r| obj << r unless r.nil?}}
Alternate approach:
Looking at the call process_x url, it is not clear what role the input x plays in that method. If I assume that you process each x by passing it some url and want to find out which of the xs really get processed into valid non-nil results, then maybe Enumerable#group_by is a better option than Enumerable#map.
h = items.group_by {|x| (process x).nil? ? "Bad" : "Good"}
#=> {"Bad"=>[1, 2], "Good"=>[3, 4, 5]}
h["Good"]
#=> [3,4,5]

Related

How to convert a three-line Ruby method into one

I have a simple method that iterates through an array and returns a duplicate. (Or duplicates)
def find_dup(array)
duplicate = 0
array.each { |element| duplicate = element if array.count(element) > 1}
duplicate
end
It works, but I'd like to express this more elegantly.
The reason it is three lines is that the variable "duplicate", which the method must return, is not visible to the method if I introduce it inside the block, i.e,
def find_dup(array)
array.each { |element| duplicate = element if array.count(element) > 1}
duplicate
end
I've tried a few ways to define "duplicate" as the result of a block, but to no avail.
Any thoughts?
It's a little too much to do cleanly in a one-liner, but this is a more
efficient solution.
def find_dups(arr)
counts = Hash.new { |hash,key| hash[key] = 0 }
arr.each_with_object(counts) do |x, memo|
memo[x] += 1
end.select { |key,val| val > 1 }.keys
end
The Hash.new call instantiates a hash where the default value is 0.
each_with_object modifies this hash to track the count of each element in arr, then at the
end the filter is used to select only those having a count greater than one.
The benefit of this approach over a solution using Array#include? or Array#count is that it only scans the array a single time. Thus it runs in O(N) time instead of O(N^2).
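On Ruby 2.7 or later, the same single-pass counting can be written with Enumerable#tally. This is a sketch of an alternative, not the code from the answer above:

```ruby
# Assumes Ruby 2.7+ for Enumerable#tally, which builds the
# element => count hash in one pass over the array.
def find_dups(arr)
  arr.tally.select { |_, count| count > 1 }.keys
end

p find_dups([8, 5, 6, 6, 5, 8, 1]) # => [8, 5, 6]
```

The result keeps the order in which each duplicated value first appears, because tally inserts keys in encounter order.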
Your method is only finding the last duplicate in the array. If you want all the duplicates, I would do something like this:
def find_dups(arr)
dups = Hash.new { |h, k| h[k] = 0 }
arr.each { |el| dups[el] += 1 }
dups.select { |k, v| v > 1 }.keys
end
If what you really want is a one-liner that isn't concerned with big-O complexity and only returns the last duplicate in the array, I would do this:
def find_last_dup(arr)
arr.reverse_each { |el| return el if arr.count(el) > 1 }
end
You can do this as one line and it flows a bit nicer. Though this would find the first instance of a duplicate whereas your code is returning the last instance of a duplicate, not sure if that's part of your requirement.
def find_dup(array)
array.group_by { |value| value }.find { |_, groups| groups.count > 1 }.first
end
Also, note that putting everything on one line doesn't necessarily make it better. I'd find the code more readable split over more lines, but that's just my opinion.
def find_dup(array)
array.group_by { |value|
value
}.find { |_, groups|
groups.count > 1
}.first
end
Just want to add one more approach to the mix.
def find_last_dup(arr)
arr.reverse_each.detect { |x| arr.count(x) > 1 }
end
Alternatively, you can get linear time complexity in two lines.
def find_last_dup(arr)
freq = arr.each_with_object(Hash.new(0)) { |x, obj| obj[x] += 1 }
arr.reverse_each.detect { |x| freq[x] > 1 }
end
For the sake of argument, the latter approach can be reduced to one line as well, but this would be unidiomatic and confusing.
def find_last_dup(arr)
arr.each_with_object(Hash.new(0)) { |x, obj| obj[x] += 1 }
.tap do |freq| return arr.reverse_each.detect { |x| freq[x] > 1 } end
end
Given:
> a
=> [8, 5, 6, 6, 5, 8, 6, 1, 9, 7, 2, 10, 7, 7, 3, 4]
You can group the dups together:
> a.uniq.each_with_object(Hash.new(0)) {|e, h| c=a.count(e); h[e]=c if c>1}
=> {8=>2, 5=>2, 6=>3, 7=>3}
Or,
> a.group_by{ |e| e}.select{|k,v| v if v.length>1}
=> {8=>[8, 8], 5=>[5, 5], 6=>[6, 6, 6], 7=>[7, 7, 7]}
In each case, the order of the result is based on the order of the elements in a that have dups. If you just want the first:
> a.group_by{ |e| e}.select{|k,v| v if v.length>1}.first
=> [8, [8, 8]]
Or last:
> a.group_by{ |e| e}.select{|k,v| v if v.length>1}.to_a.last
=> [7, [7, 7, 7]]
If you want to 'fast forward' to the first value that has a dup, you can use drop_while:
> b=[1,2,3,4,5,4,5,6]
> b.drop_while {|e| b.count(e)==1 }[0]
=> 4
Or the last:
> b.reverse.drop_while {|e| b.count(e)==1 }[0]
=> 5
def find_duplicates(array)
remaining = array.dup # work on a copy so the caller's array is untouched
array.uniq.each { |element| remaining.delete_at(remaining.index(element)) }
remaining.uniq
end
The above method find_duplicates duplicates the input array and deletes the first occurrence of every element from the copy, leaving only the remaining occurrences of the duplicated elements.
Example:
array = [1, 2, 3, 4, 3, 4, 3]
=> [1, 2, 3, 4, 3, 4, 3]
find_duplicates(array)
=> [3, 4]

idiomatic way to check if array contains ordered (but possibly non-continuous) set of elements

I was wondering if there is a more idiomatic way to get the functionality represented by the code below. Basically I just want to check if the array contains the elements in pattern in the order specified by pattern. It's okay for there to be gaps between these elements.
class Array
def has_pattern?(pattern)
offset = 0
pattern.each do |p|
offset = self[offset..-1].index(p)
return false if offset.nil?
end
return true
end
end
puts [1, 2, 3, 4, 5, 1].has_pattern?([1, 4, 5]) # true
puts [1, 2, 3, 4, 5, 1].has_pattern?([2, 3, 1]) # true
puts [1, 2, 3, 4, 5, 1].has_pattern?([1, 3, 2]) # false
The code above seems to work, but doesn't feel like idiomatic Ruby to me. Is there a nicer way to write this?
Here's my take on it:
class Array
def has_pattern?(ptn)
i = 0
self.each do |elem|
i += 1 if elem == ptn[i]
end
i >= ptn.size
end
end
It passes through the array only once, so it may make a difference when the array's big.
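The same single-pass idea can be sketched as a plain method instead of a monkey patch on Array. The name `subsequence?` is made up for this illustration, and the early `break` simply stops scanning once the whole pattern has been matched:

```ruby
# Hypothetical helper: true if `pattern` appears in `array` in order,
# with gaps allowed between the matched elements.
def subsequence?(array, pattern)
  matched = 0
  array.each do |elem|
    break if matched == pattern.size        # whole pattern already found
    matched += 1 if elem == pattern[matched]
  end
  matched == pattern.size
end

p subsequence?([1, 2, 3, 4, 5, 1], [1, 4, 5]) # => true
p subsequence?([1, 2, 3, 4, 5, 1], [2, 3, 1]) # => true
p subsequence?([1, 2, 3, 4, 5, 1], [1, 3, 2]) # => false
```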
Here's a different way to approach it:
class Array
def has_pattern?(pattern)
(self - (self - pattern))
.each_cons(pattern.length)
.any? { |p| p === pattern }
end
end
But, as I said in the comments above, I think your solution is superior.

Setting few hash parameters with same value but different keys

I have a construction in my application for which I need a hash like this:
{ 1 => [6,2,2], 2 => [7,4,5], (3..7) => [7,2,1] }
So I would like to have same value for keys 3, 4, 5, 6 and 7.
Of course the example above doesn't work, because Ruby does exactly what it is told and uses the Range itself as the key :) So I can only access my value as my_hash[(3..7)], while my_hash[3], my_hash[4] and so on are nil.
I could add a check or some construction outside the hash to do what I need, but I am curious whether it is possible to build a hash like this without any loops outside the hash declaration. If not, what is the most elegant alternative? Thanks!
You could subclass Hash to make it easier to construct such hashes:
class RangedHash < Hash
def []=(key, val)
if key.is_a? Range
key.each do |k|
super k, val
end
else
super key, val
end
end
end
It works the same as a normal hash, except when you use a Range key, it sets the given value at every point in the Range.
irb(main):014:0> h = RangedHash.new
=> {}
irb(main):015:0> h[(1..5)] = 42
=> 42
irb(main):016:0> h[1]
=> 42
irb(main):017:0> h[5]
=> 42
irb(main):018:0> h['hello'] = 24
=> 24
irb(main):019:0> h['hello']
=> 24
Is there anything especially wrong with this?
myhash = { 1 => [6,2,2], 2 => [7,4,5] }
(3..7).each { |k| myhash[k] = [7,2,1] }
I don't think there's a way to set multiple keys using literal hash syntax, or without some iteration, but here's a short way to do it with iteration:
irb(main):007:0> h = { 1 => [6,2,2], 2 => [7,4,5] }; (3..7).each {|n| h[n] = [7,2,1]}; h
=> {1=>[6, 2, 2], 2=>[7, 4, 5], 3=>[7, 2, 1], 4=>[7, 2, 1], 5=>[7, 2, 1], 6=>[7, 2, 1], 7=>[7, 2, 1]}
(Note that the trailing ; h is just for displaying purposes above.)
I don't like the idea of creating separate key/value pairs for every possible entry in a range. It's not scalable at all, especially for wide ranges. Consider this small range:
'a' .. 'zz'
which would result in 702 additional keys. Try ('a'..'zz').to_a for fun. Go ahead. I'll wait.
Instead of creating the keys, intercept the lookup. Reusing the RangedHash class name:
class RangedHash < Hash
def [](key)
return self.fetch(key) if self.key? key
self.keys.select{ |k| k.is_a? Range }.each do |r_k|
return self.fetch(r_k) if r_k === key
end
nil
end
end
foo = RangedHash.new
foo[1] = [6,2,2]
foo[2] = [7,4,5]
foo[3..7] = [7,2,1]
At this point foo looks like:
{1=>[6, 2, 2], 2=>[7, 4, 5], 3..7=>[7, 2, 1]}
Testing the method:
require 'pp'
3.upto(7) do |i|
pp foo[i]
end
Which outputs:
[7, 2, 1]
[7, 2, 1]
[7, 2, 1]
[7, 2, 1]
[7, 2, 1]
For any value in a range this outputs the value associated with that range. Values outside the range, but still defined in the hash, work normally, as does returning nil for keys that don't exist in the hash. And, it keeps the hash as small as possible.
The downside to this, or any solution to the question, is the keys that are ranges could overlap, causing collisions. In most of the proposed solutions, the keys would stomp on each other, which would/could end up returning bad values. This method won't do that because it'd take a direct conflict to overwrite a range-key.
To fix this would require deciding whether overlaps are allowed, and, if so, is it OK that the first one found is returned, or should there be logic that determines "best-fit", i.e., the smallest range that fits, or some other criteria entirely. Or, should overlaps be joined to make a larger range if the value is the same? It's a can of worms.
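As one possible answer to the "best-fit" question, here is a rough sketch that prefers the narrowest range covering the key. The helper name `ranged_fetch` is invented for this example, and it assumes the range keys are integer ranges (so Range#size is meaningful):

```ruby
# Hypothetical helper, not part of the RangedHash class above.
# Exact keys win; otherwise the smallest covering Range key is used.
def ranged_fetch(hash, key)
  return hash[key] if hash.key?(key)

  best = hash.keys
             .select { |k| k.is_a?(Range) && k.cover?(key) }
             .min_by(&:size) # assumes integer ranges
  best && hash[best]
end

h = { 1 => [6, 2, 2], (3..7) => [7, 2, 1], (4..5) => [0, 0, 0] }

p ranged_fetch(h, 1)  # => [6, 2, 2]
p ranged_fetch(h, 3)  # => [7, 2, 1]
p ranged_fetch(h, 4)  # => [0, 0, 0]
p ranged_fetch(h, 10) # => nil
```

Whether the narrowest range is the right tie-breaker depends on the application; it is just one of the policies mentioned above.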
Patching Hash directly, but otherwise the same idea as Luke's...
class Hash
alias_method :orig_assign, '[]='
def []= k, v
if k.is_a? Range
k.each { |i| orig_assign i, v }
v
else
orig_assign k, v
end
end
end
t = {}
t[:what] = :ever
t[3..7] = 123
p t # => {5=>123, 6=>123, 7=>123, 3=>123, 4=>123, :what=>:ever}
Here is one more approach:
h = { 1 => [6,2,2], 2 => [7,4,5], (3..7) => [7,2,1] }
def my_hash(h,y)
h.keys.each do |x|
if (x.instance_of? Range) and (x.include? y) then
return p h[x]
end
end
p h[y]
end
my_hash(h,2)
my_hash(h,3)
my_hash(h,1)
my_hash(h,10)
my_hash(h,5)
my_hash(h,(3..7))
Output:
[7, 4, 5]
[7, 2, 1]
[6, 2, 2]
nil
[7, 2, 1]
[7, 2, 1]

Skip over iteration in Enumerable#collect

(1..4).collect do |x|
next if x == 3
x + 1
end # => [2, 3, nil, 5]
# desired => [2, 3, 5]
If the condition for next is met, collect puts nil in the array, whereas what I'm trying to do is put no element in the returned array if the condition is met. Is this possible without calling delete_if { |x| x == nil } on the returned array?
My code excerpt is heavily abstracted, so looking for a general solution to the problem.
There is method Enumerable#reject which serves just the purpose:
(1..4).reject{|x| x == 3}.collect{|x| x + 1}
The practice of directly using an output of one method as an input of another is called method chaining and is very common in Ruby.
BTW, map (or collect) is used for direct mapping of input enumerable to the output one. If you need to output different number of elements, chances are that you need another method of Enumerable.
Edit: If you are bothered by the fact that some of the elements are iterated twice, you can use less elegant solution based on inject (or its similar method named each_with_object):
(1..4).each_with_object([]){|x,a| a << x + 1 unless x == 3}
I would simply call .compact on the resultant array, which removes any instances of nil in an array. .compact! modifies the existing array instead, but it returns nil when there was nothing to remove, so don't rely on its return value at the end of a chain:
(1..4).collect do |x|
next if x == 3
x
end.compact
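A short sketch of how the two methods differ in their return values, which matters whenever the result of the chain is used directly (the variable names are only for illustration):

```ruby
no_nils = [1, 2, 3]
copy_result = no_nils.compact   # new array, always returned
bang_result = no_nils.compact!  # nil, because nothing was removed

with_nils = [1, nil, 3]
with_nils.compact!              # removes the nil in place

p copy_result  # => [1, 2, 3]
p bang_result  # => nil
p with_nils    # => [1, 3]
```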
In Ruby 2.7+, it’s possible to use filter_map for this exact purpose. From the docs:
Returns an array containing truthy elements returned by the block.
(0..9).filter_map {|i| i * 2 if i.even? } #=> [0, 4, 8, 12, 16]
{foo: 0, bar: 1, baz: 2}.filter_map {|key, value| key if value.even? } #=> [:foo, :baz]
For the example in the question: (1..4).filter_map { |x| x + 1 unless x == 3 }.
See this post for comparison with alternative methods, including benchmarks.
Just a suggestion: why not do it this way?
result = []
(1..4).each do |x|
next if x == 3
result << x
end
result # => [1, 2, 4]
That way you avoid a second iteration to remove the nil elements from the array. Hope it helps =)
I would suggest using:
(1..4).to_a.delete_if {|x| x == 3}
instead of the collect + next statement.
You could pull the decision-making into a helper method, and use it via Enumerable#reduce:
def potentially_keep(list, i)
if i === 3
list
else
list.push i
end
end
# => :potentially_keep
(1..4).reduce([]) { |memo, i| potentially_keep(memo, i) }
# => [1, 2, 4]

Create two-dimensional arrays and access sub-arrays in Ruby

I wonder if there's a possibility to create a two dimensional array and to quickly access any horizontal or vertical sub array in it?
I believe we can access a horizontal sub array in the following case:
x = Array.new(10) { Array.new(20) }
x[6][3..8] = 'something'
But as far as I understand, we cannot access it like this:
x[3..8][6]
How can I avoid or hack this limit?
There are some problems with 2 dimensional Arrays the way you implement them.
a= [[1,2],[3,4]]
a[0][2]= 5 # works
a[2][0]= 6 # error
Hash as Array
I prefer to use Hashes for multi dimensional Arrays
a= Hash.new
a[[1,2]]= 23
a[[5,6]]= 42
This has the advantage, that you don't have to manually create columns or rows. Inserting into hashes is almost O(1), so there is no drawback here, as long as your Hash does not become too big.
You can even set a default value for all not specified elements
a= Hash.new(0)
So now about how to get subarrays
(3..5).to_a.product([2]).collect { |index| a[index] }
[2].product((3..5).to_a).collect { |index| a[index] }
(a..b).to_a runs in O(n). Retrieving an element from an Hash is almost O(1), so the collect runs in almost O(n). There is no way to make it faster than O(n), as copying n elements always is O(n).
Hashes can have problems when they are getting too big. So I would think twice about implementing a multidimensional Array like this, if I knew my amount of data is getting big.
rows, cols = x,y # your values
grid = Array.new(rows) { Array.new(cols) }
As for accessing elements, this article is pretty good for step by step way to encapsulate an array in the way you want:
How to ruby array
You didn't state your actual goal, but maybe this can help:
require 'matrix' # bundled with Ruby
m = Matrix[
[1, 2, 3],
[4, 5, 6]
]
m.column(0) # ==> Vector[1, 4]
(and Vectors act like arrays)
or, using a similar notation as you desire:
m.minor(0..1, 2..2) # => Matrix[[3], [6]]
Here's a 3D array case
class Array3D
def initialize(d1,d2,d3)
@data = Array.new(d1) { Array.new(d2) { Array.new(d3) } }
end
def [](x, y, z)
@data[x][y][z]
end
def []=(x, y, z, value)
@data[x][y][z] = value
end
end
You can access subsections of each array just like any other Ruby array.
@data[0..2][3..5][8..10] = 0
etc
x.transpose[6][3..8] or x[3..8].map {|r| r [6]} would give what you want.
Example:
a = [ [1, 2, 3, 4, 5],
[6, 7, 8, 9, 10],
[11, 12, 13, 14, 15],
[21, 22, 23, 24, 25]
]
#a[1..2][2] -> [8,13]
puts a.transpose[2][1..2].inspect # [8,13]
puts a[1..2].map {|r| r[2]}.inspect # [8,13]
I'm quite sure this can be very simple
2.0.0p247 :032 > list = Array.new(5)
=> [nil, nil, nil, nil, nil]
2.0.0p247 :033 > list.map!{ |x| x = [0] }
=> [[0], [0], [0], [0], [0]]
2.0.0p247 :034 > list[0][0]
=> 0
a = Array.new(4) { Array.new(4) }
0.upto(a.length-1) do |i|
0.upto(a[i].length-1) do |j|
a[i][j] = 1
end
end
0.upto(a.length-1) do |i|
0.upto(a[i].length-1) do |j|
print a[i][j] # index the row first, then the column (i[j] would read bit j of the integer i)
end
puts "\n"
end
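Here is the simple version (note that every row refers to the same array object, so this is only safe if the rows are never modified in place):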
#one
a = [[0]*10]*10
#two
row, col = 10, 10
a = [[0]*row]*col
Here is an easy way to create a "2D" array.
2.1.1 :004 > m=Array.new(3,Array.new(3,true))
=> [[true, true, true], [true, true, true], [true, true, true]]
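A word of caution on the forms that pass an array as the default value or multiply a nested array: all rows end up being the same object. A minimal sketch of the difference, using the block form of Array.new to get independent rows:

```ruby
# Every row is the same Array object, so one write shows up in all rows.
shared = Array.new(3, Array.new(3, 0))
shared[0][0] = 1

# The block runs once per row, so each row is a separate Array.
independent = Array.new(3) { Array.new(3, 0) }
independent[0][0] = 1

p shared       # => [[1, 0, 0], [1, 0, 0], [1, 0, 0]]
p independent  # => [[1, 0, 0], [0, 0, 0], [0, 0, 0]]
```

The shared form is fine for read-only grids, but the block form is usually what is wanted as soon as individual cells are assigned.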

Resources