How to uniq an array case insensitive - ruby

As far as I know, the result of
["a", "A"].uniq
is
["a", "A"]
My question is:
How do I make ["a", "A"].uniq give me either ["a"] or ["A"]?

There is another way you can do this. You can actually pass a block to uniq or uniq! that can be used to evaluate each element.
["A", "a"].uniq { |elem| elem.downcase } #=> ["A"]
or
["A", "a"].uniq { |elem| elem.upcase } #=> ["A"]
Either way the comparison is case-insensitive, and uniq keeps the first element it sees for each key, so both calls return ["A"].
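As a small sketch of how the block form behaves on a longer list (assuming Ruby 1.9.2 or later, where uniq accepts a block; the words array is made up for illustration):

```ruby
# Hypothetical sample data, not taken from the question.
words = ["Apple", "apple", "BANANA", "banana", "Cherry"]

# The block computes the comparison key. The first element seen for each
# key is the one that is kept, with its original casing.
first_seen = words.uniq { |w| w.downcase }

# Symbol#to_proc is an equivalent shorthand for the same block.
shorthand = words.uniq(&:downcase)

p first_seen
p shorthand
```

Both calls should keep "Apple", "BANANA" and "Cherry", since those are the first spellings encountered.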

Just make the case consistent first.
e.g:
["a","A"].map{|i| i.downcase}.uniq
Edit: If as mikej suggests, the elements returned must be exactly the same as in the original array, then this will do that for you:
a.inject([]) { |result,h| result << h unless result.map{|i| i.downcase}.include?(h.downcase); result }
Edit 2: A solution which should satisfy mikej :-)
downcased = []
a.inject([]) { |result, h|
  unless downcased.include?(h.downcase)
    result << h
    downcased << h.downcase
  end
  result
}

You may build a mapping (Hash) between the case-normalized (e.g. downcased) values and the actual values, and then take just the values from the hash:
["a", "b", "A", "C"]\
.inject(Hash.new){ |h,element| h[element.downcase] = element ; h }\
.values
This keeps the last occurrence of each word (compared case-insensitively):
["A", "b", "C"]
If you want the first occurrence:
["a", "b", "A", "C"]\
.inject(Hash.new){ |h,element| h[element.downcase] = element unless h[element.downcase] ; h }\
.values

["a", "A"].map{|x| x.downcase}.uniq
=> ["a"]
or
["a", "A"].map{|x| x.upcase}.uniq
=> ["A"]

If you are using ActiveSupport, you can use uniq_by.
It doesn't affect the case of the final output.
['A','a'].uniq_by(&:downcase) # => ['A']

A somewhat more efficient way is to rely on hash keys being unique:
["a", "A"].inject(Hash.new){ |hash,j| hash[j.upcase] = j; hash}.values
will return the last element, in this case
["A"]
whereas using ||= as the assignment operator:
["a", "A"].inject(Hash.new){ |hash,j| hash[j.upcase] ||= j; hash}.values
will return first element, in this case
["a"]
Especially for big arrays this should be faster, as we don't search the array each time using include?
cheers...
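The same hash-key idea can also be written with each_with_object, which avoids the trailing "; hash" in the block. This is only a sketch with made-up sample data; it relies on hashes preserving insertion order (Ruby 1.9 and later):

```ruby
# Hypothetical sample data.
letters = ["a", "b", "A", "C", "B"]

# ||= only assigns when the key is missing, so the first spelling wins.
first_wins = letters.each_with_object({}) { |s, seen| seen[s.downcase] ||= s }.values

# Plain assignment overwrites, so the last spelling wins. The key keeps
# the position of its first insertion.
last_wins = letters.each_with_object({}) { |s, seen| seen[s.downcase] = s }.values

p first_wins
p last_wins
```

Each element costs one hash lookup, instead of a scan of the result array as with include?.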

A more general solution (though not the most efficient):
class EqualityWrapper
  attr_reader :obj
  def initialize(obj, eq, hash)
    @obj = obj
    @eq = eq
    @hash = hash
  end
  def ==(other)
    @eq[@obj, other.obj]
  end
  alias :eql? :==
  def hash
    @hash[@obj]
  end
end
class Array
  def uniq_by(eq, hash = lambda{|x| 0 })
    map {|x| EqualityWrapper.new(x, eq, hash) }.
      uniq.
      map {|x| x.obj }
  end
  def uniq_ci
    eq = lambda{|x, y| x.casecmp(y) == 0 }
    hash = lambda{|x| x.downcase.hash }
    uniq_by(eq, hash)
  end
end
The uniq_by method takes a lambda that checks the equality, and a lambda that returns a hash, and removes duplicate objects as defined by those data.
Implemented on top of that, the uniq_ci method removes string duplicates using case insensitive comparisons.

Related

How to merge array index values and create a hash

I'm trying to convert an array into a hash by using some matching. Before converting the array into a hash, I want to merge the values like this
"Desc,X1XXSC,C,CCCC4524,xxxs,xswd"
and create a hash from it. The rule is that the first value of each pair in the array is the hash key. Some keys repeat in the array; for those keys I need to merge the values and place them under one key. Values such as "Desc" are the keys. My program looks like this:
p 'test sample application'
str = "Desc:X1:C:CCCC:Desc:XXSC:xxxs:xswd:C:4524"
arr = Array.new
arr = str.split(":")
p arr
test_hash = Hash[*arr]
p test_hash
I could not figure out how to do it. If anyone can guide me, I would be thankful.
Functional approach with Facets:
require 'facets'
str.split(":").each_slice(2).map_by { |k, v| [k, v] }.mash { |k, vs| [k, vs.join] }
#=> {"Desc"=>"X1XXSC", "C"=>"CCCC4524", "xxxs"=>"xswd"}
Not that you cannot do it without Facets, but it's longer because of some basic abstractions missing in the core:
Hash[str.split(":").each_slice(2).group_by(&:first).map { |k, gs| [k, gs.map(&:last).join] }]
#=> {"Desc"=>"X1XXSC", "C"=>"CCCC4524", "xxxs"=>"xswd"}
A small variation on @Sergio Tulentsev's solution:
str = "Desc:X1:C:CCCC:Desc:XXSC:xxxs:xswd:C:4524"
str.split(':').each_slice(2).each_with_object(Hash.new{""}){|(k,v),h| h[k] += v}
# => {"Desc"=>"X1XXSC", "C"=>"CCCC4524", "xxxs"=>"xswd"}
str.split(':') results in an array; there is no need for initializing with arr = Array.new
each_slice(2) feeds the elements of this array two by two to a block or to the method following it, like in this case.
each_with_object takes those two elements (as an array) and passes them on to a block, together with an object, specified by:
(Hash.new{""}) This object is an empty Hash with special behaviour: when a key is not found then it will respond with a value of "" (instead of the usual nil).
{|(k,v),h| h[k] += v} This is the block of code which does all the work. It takes the array with the two elements and destructures it into two strings, assigned to k and v; the special hash is assigned to h. For the first pair, h[k] asks the hash for the value of key "Desc". It responds with "", to which "X1" is appended, and the result is stored under "Desc". This is repeated until all elements are processed.
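To make the default-value behaviour described above concrete, here is a small sketch comparing the block form Hash.new { "" } with the value form Hash.new(""). The variable names are invented for illustration:

```ruby
block_default = Hash.new { "" }          # block runs on every missing-key lookup
value_default = Hash.new(String.new)     # one shared default object

# h[k] += v expands to h[k] = h[k] + v, so the default is only read and a
# brand-new string is stored under the key. Both forms behave the same here.
block_default["Desc"] += "X1"
block_default["Desc"] += "XXSC"
value_default["C"] += "CCCC"
value_default["C"] += "4524"

# Caveat: << mutates the shared default object in place and stores nothing,
# so the hash stays empty while every missing key now reads "oops".
shared = Hash.new(String.new)
shared["a"] << "oops"

p block_default["Desc"]
p value_default["C"]
p shared.empty?
p shared["b"]
```

With += either form is safe; the shared-default pitfall only shows up with mutating methods such as <<.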
I believe you're looking for each_slice and each_with_object here
str = "Desc:X1:C:CCCC:Desc:XXSC:xxxs:xswd:C:4524"
hash = str.split(':').each_slice(2).each_with_object({}) do |(key, value), memo|
memo[key] ||= ''
memo[key] += value
end
hash # => {"Desc"=>"X1XXSC", "C"=>"CCCC4524", "xxxs"=>"xswd"}
Enumerable#slice_before is a good way to go.
str = "Desc:X1:C:CCCC:Desc:XXSC:xxxs:xswd:C:4524"
a = ["Desc","C","xxxs"] # collect the keys in a separate collection.
# slice_before takes either a pattern or a block; the extra "" argument is no longer accepted by current Ruby versions.
str.split(":").slice_before{|i| a.include? i}.to_a
# => [["Desc", "X1"], ["C", "CCCC"], ["Desc", "XXSC"], ["xxxs", "xswd"], ["C", "4524"]]
hsh = str.split(":").slice_before{|i| a.include? i}.each_with_object(Hash.new("")) do |i,h|
  h[i[0]] += i[1]
end
hsh
# => {"Desc"=>"X1XXSC", "C"=>"CCCC4524", "xxxs"=>"xswd"}

Array in value of hash

How do I push input into a value of a hash? My problem is that I have multiple keys, and all of them reference arrays.
{"A"=>["C"], "B"=>["E"], "C"=>["D"], "D"=>["B"]}
How can I push another String onto one of these? For example I want to add a "Z" to the array of key "A"?
Currently I either overwrite the former array or all data is in one.
It's about converting an array ["AB3", "DC2", "FG4", "AC1", "AF4"] into a hash with {"A"=>["B", "C", "F"]}.
Any of <<, push, or unshift will do the job:
if h["A"]
h["A"] << "Z"
else
h["A"] = ["Z"]
end
You said your original problem is converting the array ["AB3", "DC2", "FG4", "AC1", "AF4"] into the hash {"A"=>["B", "C", "F"]}, which can be done like this:
Hash[a.group_by { |s| s[0] }.map { |k, v| [k, v.map { |s| s[1] }] }]
Or like this:
a.inject(Hash.new{|h, k| h[k]=[]}) { |h, s| h[s[0]] << s[1] ; h }
Note that Hash.new{|h, k| h[k]=[]} creates a hash whose default block stores a new empty array under any missing key, so you'll always be able to use << to add elements to it.
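A short sketch of that default-block hash applied to the array from the question, followed by the "add a Z to A" step that was asked about. The variable names codes and groups are illustrative only:

```ruby
codes = ["AB3", "DC2", "FG4", "AC1", "AF4"]

# The block stores a fresh array under each missing key, so << never hits nil.
groups = Hash.new { |hash, key| hash[key] = [] }

# First character is the key, second character is the value to collect.
codes.each { |code| groups[code[0]] << code[1] }

# Appending to an existing key mutates the stored array in place.
groups["A"] << "Z"

p groups["A"]
p groups.keys
```

Note that the full result also contains the keys "D" and "F", not just "A".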
Better approach:
Add a new instance method to Hash as below:
class Hash
  def add(k, v)
    if key?(k)
      self[k] << v   # append to the existing array
    else
      self[k] = [v]  # start a new array for this key
    end
    self
  end
end
h={}
h.add('A','B') #=> {"A"=>["B"]}
h.add('A','C') #=> {"A"=>["B", "C"]}
h.add('B','X') #=> {"A"=>["B", "C"], "B"=>["X"]}
Done.
This can be even more idiomatic according to your precise problem; say, you want to send multiple values at once, then code can be DRY-ed to handle multiple arguments.
Hope this helps.
All the best.

Delete contents of array based on a set of indexes

delete_at only takes a single index. What's a good way to achieve this using built-in methods?
Doesn't have to be a set, can be an array of indexes as well.
arr = ["a", "b", "c"]
set = Set.new [1, 2]
arr.delete_at set
# => arr = ["a"]
One-liner:
arr.delete_if.with_index { |_, index| set.include? index }
Re-open the Array class and add a new method for this.
class Array
def delete_at_multi(arr)
arr = arr.sort.reverse # delete highest indexes first.
arr.each do |i|
self.delete_at i
end
self
end
end
arr = ["a", "b", "c"]
set = [1, 2]
arr.delete_at_multi(set)
arr # => ["a"]
This could of course be written as a stand-alone method if you don't want to re-open the class. Making sure the indexes are in reverse order is very important, otherwise you change the position of elements later in the array that are supposed to be deleted.
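A quick sketch of why the order matters, using the same three-element array. Nothing here is specific to the method above; it only uses Array#delete_at:

```ruby
indexes = [1, 2]

# Ascending order: after deleting index 1, "c" slides down to index 1,
# so delete_at(2) finds nothing and "c" survives.
ascending = ["a", "b", "c"]
indexes.sort.each { |i| ascending.delete_at(i) }

# Descending order: each deletion leaves the lower positions untouched.
descending = ["a", "b", "c"]
indexes.sort.reverse.each { |i| descending.delete_at(i) }

p ascending
p descending
```

Only the descending pass removes both intended elements.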
Try this:
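arr.reject { |item| set.include? arr.index(item) } # => ["a"]
It's a bit ugly, I think ;) Note that arr.index(item) returns the first matching position, so this misbehaves when arr contains duplicate values, and that reject returns a new array instead of modifying arr. Maybe someone can suggest a better solution?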
Functional approach:
class Array
def except_values_at(*indexes)
([-1] + indexes + [self.size]).sort.each_cons(2).flat_map do |idx1, idx2|
self[idx1+1...idx2] || []
end
end
end
>> ["a", "b", "c", "d", "e"].except_values_at(1, 3)
=> ["a", "c", "e"]

Eliminate consecutive duplicates of list elements

What is the best solution to eliminate consecutive duplicates of list elements?
list = compress(['a','a','a','a','b','c','c','a','a','d','e','e','e','e'])
p list # => ['a','b','c','a','d','e']
I have this one:
def compress(list)
list.map.with_index do |element, index|
element unless element.equal? list[index+1]
end.compact
end
Ruby 1.9.2
Nice opportunity to use Enumerable#chunk, as long as your list doesn't contain nil:
list.chunk(&:itself).map(&:first)
For Ruby older than 2.2.x, you can require "backports/2.2.0/kernel/itself" or use {|x| x} instead of (&:itself).
For Ruby older than 1.9.2, you can require "backports/1.9.2/enumerable/chunk" to get a pure Ruby version of it.
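To see what chunk produces before map(&:first) is applied, here is a small sketch using the list from the question. It uses { |x| x } rather than &:itself so that it also runs on Ruby versions before 2.2:

```ruby
list = %w[a a a a b c c a a d e e e e]

# chunk groups consecutive elements that share the same block result.
# Each entry is a pair: [key, [elements of that run]].
runs = list.chunk { |x| x }.to_a

# Taking the key of every run leaves one element per run.
compressed = runs.map(&:first)

p runs.first
p compressed
```

The second run of "a" stays in the result because only adjacent duplicates are grouped together.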
Do this (provided that each element is a single character)
list.join.squeeze.split('')
Ruby 1.9+
list.select.with_index{|e,i| e != list[i+1]}
with respect to @sawa, who told me about with_index :)
As @Marc-André Lafortune noticed, if there is a nil at the end of your list it won't work for you. We can fix it with this ugly structure, which always keeps the last element:
list.select.with_index{|e,i| i == list.size-1 || e != list[i+1]}
# Requires Ruby 1.8.7+ due to Object#tap
def compress(items)
last = nil
[].tap do |result|
items.each{ |o| result << o unless last==o; last=o }
end
end
list = compress(%w[ a a a a b c c a a d e e e e ])
p list
#=> ["a", "b", "c", "a", "d", "e"]
arr = ['a','a','a','a','b','c','c','a','a','d','e','e','e','e']
enum = arr.each
#=> #<Enumerator: ["a", "a", "a", "a", "b", "c", "c", "a", "a", "d",
# "e", "e", "e", "e"]:each>
a = []
loop do
  n = enum.next
  a << n
  enum.next while enum.peek == n # skip the rest of the current run
end
a #=> ["a", "b", "c", "a", "d", "e"]
Enumerator#peek raises a StopIteration exception when it has already returned the last element of the enumerator. Kernel#loop handles that exception by breaking out of the loop.
See Array#each and Enumerator#next. Kernel#to_enum [1] can be used in place of Array#each.
[1] to_enum is an Object instance method that is defined in the Kernel module but documented in the Object class. Got that?

How To keep track of counter variables in ruby, block, for, each, do

I forget how to keep track of the position of the loops in Ruby. Usually I write in JavaScript, AS3, Java, etc.
each:
counter = 0
Word.each do |word,x|
counter += 1
#do stuff
end
for:
same thing
while:
same thing
block
Word.each {|w,x| }
This one I really don't know about.
In addition to Ruby 1.8's Array#each_with_index method, many enumerating methods in Ruby 1.9 return an Enumerator when called without a block; you can then call the with_index method to have the enumerator also pass along the index:
irb(main):001:0> a = *('a'..'g')
#=> ["a", "b", "c", "d", "e", "f", "g"]
irb(main):002:0> a.map
#=> #<Enumerator:0x28bfbc0>
irb(main):003:0> a.select
#=> #<Enumerator:0x28cfbe0>
irb(main):004:0> a.select.with_index{ |c,i| i%2==0 }
#=> ["a", "c", "e", "g"]
irb(main):005:0> Hash[ a.map.with_index{ |c,i| [c,i] } ]
#=> {"a"=>0, "b"=>1, "c"=>2, "d"=>3, "e"=>4, "f"=>5, "g"=>6}
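As a further sketch for Ruby 1.9 and later: with_index also takes an optional starting offset, which helps when the counter should start at 1. The letters array is made up for illustration:

```ruby
letters = ["a", "b", "c"]

# each_with_index always counts from 0.
zero_based = letters.each_with_index.map { |c, i| "#{i}:#{c}" }

# with_index accepts an offset, here starting the count at 1.
one_based = letters.each.with_index(1).map { |c, i| "#{i}:#{c}" }

p zero_based
p one_based
```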
If you want map.with_index or select.with_index (or the like) under Ruby 1.8.x, you can either do this boring-but-fast method:
i = 0
a.select do |c|
result = i%2==0
i += 1
result
end
or you can have more functional fun:
a.zip( (0...a.length).to_a ).select do |c,i|
i%2 == 0
end.map{ |c,i| c }
If you use each_with_index instead of each, you'll get an index along with the element. So you can do:
Word.each_with_index do |(word,x), counter|
#do stuff
end
For while loops you'll still have to keep track of the counter yourself.
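A minimal sketch of the manual counter in a while loop, for comparison. The words array and the lines accumulator are hypothetical names:

```ruby
words = ["alpha", "beta", "gamma"]  # hypothetical sample data
lines = []

index = 0
while index < words.length
  lines << "#{index}: #{words[index]}"
  index += 1  # forgetting this increment would loop forever
end

puts lines
```

Here the counter doubles as the array index, which is the usual reason to reach for while instead of each_with_index.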
A capital W would mean it's a constant, which most likely means it's a class or a module, not an instance of a class. I guess you could have a class return an enumerable using each, but that seems very bizarre.
To remove the confusing extra junk and the possibly incorrectly capitalized example, I would make my code look like this:
words = get_some_words()
words.each_with_index do |word, index|
puts "word[#{index}] = #{word}"
end
I'm not sure what Sepp2K was doing with the weird (word,x) thing.

Resources