Hash concatenation/merge without duplicate and select the max key value? - ruby

h1 = {"a"=> "121","b"=> "248","d"=> "192","e"=> "182"}
h2 = {"a"=> "458","b"=> "122","c"=> "562","f"=> "224","g"=> "352"}
This is my input. I tried to merge the two hashes, but I only got this output:
merge_hash = {"a"=>"121", "b"=>"248", "c"=>"562", "f"=>"224", "g"=>"352", "d"=>"192", "e"=>"182"}
But I want this
merge_hash = {"a"=>"458", "b"=>"248", "c"=>"562", "f"=>"224", "g"=>"352", "d"=>"192", "e"=>"182"}
I used this to merge the hashes:
merge_hash = h2.merge(h1)
Can anyone please help me with this issue?

It seems that you want to select the maximum value whenever there's an overlap. For this you can use the block form of merge:
h1.merge(h2) do |key, old_val, new_val|
[old_val, new_val].max
end
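The block runs only for keys that exist in both hashes, and its return value becomes the value stored for that key. Note that the values here are strings, so max compares them lexicographically; that gives the expected result only because they all have the same number of digits.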
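If the values can have different lengths, a numeric comparison is safer. The following is a minimal sketch using the hashes from the question; the choice of max_by(&:to_i) is an assumption about the intent (compare as integers, keep the original strings), not something stated in the question.

```ruby
h1 = { "a" => "121", "b" => "248", "d" => "192", "e" => "182" }
h2 = { "a" => "458", "b" => "122", "c" => "562", "f" => "224", "g" => "352" }

# The block is called only for keys present in both hashes ("a" and "b" here).
# max_by(&:to_i) compares the values as integers but returns the original string.
merged = h1.merge(h2) do |_key, old_val, new_val|
  [old_val, new_val].max_by(&:to_i)
end

p merged

# Why the numeric comparison matters: as strings, "99" sorts after "458".
p ["99", "458"].max             # string comparison picks "99"
p ["99", "458"].max_by(&:to_i)  # numeric comparison picks "458"
```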

I believe that what you are trying to accomplish can be done as follows:
def merge_hashes(h1:, h2:)
keys = h1.merge(h2).keys
hash_result = {}
keys.each do |key|
if h1[key].nil? && !h2[key].nil?
hash_result[key] = h2[key]
elsif !h1[key].nil? && h2[key].nil?
hash_result[key] = h1[key]
else
hash_result[key] = h1[key] > h2[key] ? h1[key] : h2[key]
end
end
hash_result
end
Then you can use it: merge_hashes(h1: h1, h2: h2)

h1.merge(h2) { |_, *value| value.max }

Related

Why does `String#delete` return an error in my ruby code?

The goal is to take a string and return the most common letter along with its count. For string 'hello', it would return ['l', 2].
I've written the following:
def most_common_letter(string)
list = []
bigcount = 0
while 0 < string.length
count = 0
for i in 0..string.length
if string[0] == string[i]
count += 1
end
end
if count > bigcount
bigcount = count
list = (string[0])
string.delete[string[0]]
end
end
return [list,bigcount]
end
I get the following error:
wrong number of arguments (0 for 1+)
(repl):14:in `delete'
(repl):14:in `most_common_letter'
(repl):5:in `initialize'
Please help me understand what I'm doing wrong with the delete statement, or what else is causing this to return an error.
I have a solution done another way, but I thought this would work just fine.
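You are calling delete incorrectly.
Use string.delete(string[0]) instead of string.delete[string[0]]. With square brackets, Ruby calls delete with no arguments and then tries to index the result, which is why you get "wrong number of arguments".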
EDIT
As for the infinite loop you mentioned:
Your condition for while is 0 < string.length,
and you expect the string.delete(string[0]) call to remove a character from string on each pass.
What it actually does is return a new string with every occurrence of that character removed; it never mutates the original string.
So try changing it to string = string.delete(string[0]), and move that line out of the if block, since otherwise the loop still never ends once a character's count is not greater than bigcount.
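The difference between the non-mutating and mutating forms can be seen in a short sketch. The variable names are only illustrative; String#delete and String#delete! are the standard library methods.

```ruby
s = "hello"
copy = s.delete("l")   # returns a new string; s itself is untouched
p s                    # => "hello"
p copy                 # => "heo"

t = "hello".dup        # dup so the string is safe to mutate
t.delete!("l")         # delete! modifies the receiver in place
p t                    # => "heo"

# delete! returns nil when nothing was removed, so avoid chaining on its result.
p t.delete!("z")       # => nil
```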
Apart from using delete() instead of delete[], which has already been answered:
Most of what you need is implemented natively in Ruby's String class, namely each_char and count.
def most_common_letter(string)
max = [ nil, 0 ]
string.each_char {|char|
char_count = string.count(char)
max = [ char, char_count ] if char_count > max[1]
}
return max
end
You may do this in a much easier way, if you allow me to say.
def most_common_letter(string)
h = Hash.new
string.chars.sort.map { |c|
h[c] = 0 if (h[c].nil?)
h[c] = h[c] + 1
}
maxk = nil
maxv = -1
mk = h.keys
mk.each do |k|
if (h[k] > maxv) then
maxk = k
maxv = h[k]
end
end
[ maxk , maxv ]
end
If you test this with
p most_common_letter("alcachofra")
the output will be
["a", 3]
Finally, remember you don't need a return at the end of a Ruby method. The value of the last evaluated expression is returned automatically.
Do Ruby in a Ruby way!
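For comparison, here is a shorter sketch of the same counting idea. It assumes Ruby 2.7 or later, where Enumerable#tally is available; on older versions a hash with a default of 0 would be needed instead.

```ruby
def most_common_letter(string)
  # tally builds a hash of character => count;
  # max_by then picks the [character, count] pair with the largest count.
  string.each_char.tally.max_by { |_char, count| count }
end

p most_common_letter("hello")       # => ["l", 2]
p most_common_letter("alcachofra")  # => ["a", 3]
```

For an empty string this returns nil, because there is no pair to pick.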

Merging Ranges using Sets - Error - Stack level too deep (SystemStackError)

I have a number of ranges that I want to merge together if they overlap. The way I’m currently doing this is by using Sets.
This is working. However, when I attempt the same code with larger ranges as follows, I get a stack level too deep (SystemStackError).
require 'set'
ranges = [Range.new(73, 856), Range.new(82, 1145), Range.new(116, 2914), Range.new(3203, 3241)]
set = Set.new
ranges.each { |r| set << r.to_set }
set.flatten!
sets_subsets = set.divide { |i, j| (i - j).abs == 1 } # this line causes the error
puts sets_subsets
The line that is failing is taken directly from the Ruby Set Documentation.
I would appreciate it if anyone could suggest a fix or an alternative that works for the above example
EDIT
I have put the full code I’m using here:
Basically it is used to add html tags to an amino acid sequence according to some features.
require 'set'
def calculate_formatting_classes(hsps, signalp)
merged_hsps = merge_ranges(hsps)
sp = format_signalp(merged_hsps, signalp)
hsp_class = (merged_hsps - sp[1]) - sp[0]
rank_format_positions(sp, hsp_class)
end
def merge_ranges(ranges)
set = Set.new
ranges.each { |r| set << r.to_set }
set.flatten
end
def format_signalp(merged_hsps, sp)
sp_class = sp - merged_hsps
sp_hsp_class = sp & merged_hsps # overlap regions between sp & merged_hsp
[sp_class, sp_hsp_class]
end
def rank_format_positions(sp, hsp_class)
results = []
results += sets_to_hash(sp[0], 'sp')
results += sets_to_hash(sp[1], 'sphsp')
results += sets_to_hash(hsp_class, 'hsp')
results.sort_by { |s| s[:pos] }
end
def sets_to_hash(set = nil, cl)
return nil if set.nil?
hashes = []
merged_set = set.divide { |i, j| (i - j).abs == 1 }
merged_set.each do |s|
hashes << { pos: s.min.to_i - 1, insert: "<span class=#{cl}>" }
hashes << { pos: s.max.to_i - 0.1, insert: '</span>' } # for ordering
end
hashes
end
working_hsp = [Range.new(7, 136), Range.new(143, 178)]
not_working_hsp = [Range.new(73, 856), Range.new(82, 1145),
Range.new(116, 2914), Range.new(3203, 3241)]
sp = Range.new(1, 20).to_set
# working
results = calculate_formatting_classes(working_hsp, sp)
# Not Working
# results = calculate_formatting_classes(not_working_hsp, sp)
puts results
Here is one way to do this:
ranges = [Range.new(73, 856), Range.new(82, 1145),
Range.new(116, 2914), Range.new(3203, 3241)]
ranges.size.times do
ranges = ranges.sort_by(&:begin)
t = ranges.each_cons(2).to_a
t.each do |r1, r2|
if (r2.cover? r1.begin) || (r2.cover? r1.end) ||
(r1.cover? r2.begin) || (r1.cover? r2.end)
ranges << Range.new([r1.begin, r2.begin].min, [r1.end, r2.end].max)
ranges.delete(r1)
ranges.delete(r2)
t.delete [r1,r2]
end
end
end
p ranges
#=> [73..2914, 3203..3241]
The other answers aren't bad, but I prefer a simple recursive approach:
def merge_ranges(*ranges)
range, *rest = ranges
return if range.nil?
# Find the index of the first range in `rest` that overlaps this one
other_idx = rest.find_index do |other|
range.cover?(other.begin) || other.cover?(range.begin)
end
if other_idx
# An overlapping range was found; remove it from `rest` and merge
# it with this one
other = rest.slice!(other_idx)
merged = ([range.begin, other.begin].min)..([range.end, other.end].max)
# Try again with the merged range and the remaining `rest`
merge_ranges(merged, *rest)
else
# No overlapping range was found; move on
[ range, *merge_ranges(*rest) ]
end
end
Note: This code assumes each range is ascending (e.g. 10..5 will break it).
Usage:
ranges = [ 73..856, 82..1145, 116..2914, 3203..3241 ]
p merge_ranges(*ranges)
# => [73..2914, 3203..3241]
ranges = [ 0..10, 5..20, 30..50, 45..80, 50..90, 100..101, 101..200 ]
p merge_ranges(*ranges)
# => [0..20, 30..90, 100..200]
I believe your resulting set has too many items (2881) to be used with divide. With a two-argument block, divide calls the block for every pair of elements (2881^2, roughly 8.3 million calls) and then, as far as I can tell, finds the connected groups with a recursive graph traversal (TSort), so a long run of consecutive integers recurses deeply enough to raise stack level too deep.
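If you want to keep the set of positions, the consecutive runs can be recovered without divide. This is a sketch under the assumption that the set holds integers only; it uses Enumerable#slice_when (Ruby 2.2+) on the sorted positions, which is a swapped-in technique rather than anything from the Set documentation.

```ruby
require 'set'

ranges = [73..856, 82..1145, 116..2914, 3203..3241]

# Collect every covered position into one set (duplicates collapse automatically).
positions = ranges.reduce(Set.new) { |set, range| set.merge(range) }

# Sort the positions and start a new run wherever two neighbours are not consecutive.
runs = positions.sort
                .slice_when { |a, b| b != a + 1 }
                .map { |run| run.first..run.last }

p positions.size  # => 2881
p runs            # => [73..2914, 3203..3241]
```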
Without using sets, you can use this code to merge the ranges:
module RangeMerger
def merge(range_b)
if cover?(range_b.first) && cover?(range_b.last)
self
elsif cover?(range_b.first)
self.class.new(first, range_b.last)
elsif cover?(range_b.last)
self.class.new(range_b.first, last)
else
nil # Unmergable
end
end
end
module ArrayRangePusher
def <<(item)
if item.kind_of?(Range)
item.extend RangeMerger
each_with_index do |own_item, idx|
own_item.extend RangeMerger
if new_range = own_item.merge(item)
self[idx] = new_range
return self
end
end
end
super
end
end
ranges = [Range.new(73, 856), Range.new(82, 1145), Range.new(116, 2914), Range.new(3203, 3241)]
new_ranges = Array.new
new_ranges.extend ArrayRangePusher
ranges.each do |range|
new_ranges << range
end
puts ranges.inspect
puts new_ranges.inspect
This will output:
[73..856, 82..1145, 116..2914, 3203..3241]
[73..2914, 3203..3241]
which I believe is the intended output for your original problem. It's a bit ugly, but I'm a bit rusty at the moment.
Edit: I don't think this has anything to do with your original problem before the edits which was about merging ranges.
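Another common way to merge overlapping ranges is to sort them by start and sweep once from left to right. The sketch below assumes inclusive, ascending ranges with comparable endpoints; merge_overlapping is a hypothetical name chosen for illustration.

```ruby
def merge_overlapping(ranges)
  ranges.sort_by(&:begin).each_with_object([]) do |range, merged|
    last = merged.last
    if last && range.begin <= last.end
      # Overlaps the previous merged range: extend it to the larger end.
      merged[-1] = last.begin..[last.end, range.end].max
    else
      # No overlap: start a new merged range.
      merged << range
    end
  end
end

p merge_overlapping([73..856, 82..1145, 116..2914, 3203..3241])
# => [73..2914, 3203..3241]
p merge_overlapping([0..10, 5..20, 30..50, 45..80, 50..90, 100..101, 101..200])
# => [0..20, 30..90, 100..200]
```

Unlike the set-based approach in the question, ranges that merely touch (such as 1..5 and 6..10) are left separate here; changing the comparison to range.begin <= last.end + 1 would join them for integer ranges.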

Ruby dot notation to nested hash keys

What's the best way of converting a dot notation path (or even an array of strings) into a nested hash key-value pair? For example, I need to convert 'foo.bar.baz' with the value 'qux' into this:
{
'foo' => {
'bar' => {
'baz' => 'qux'
}
}
}
I've done this in PHP, but I managed that by creating a key in the array and then setting a tmp variable to that array key's value by reference so any changes would also take place in the array.
Try this
f = "root/sub-1/sub-2/file"
f.split("/").reverse.inject{|a,n| {n=>a}} #=>{"root"=>{"sub-1"=>{"sub-2"=>"file"}}}
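The same reverse-and-inject idea can be adapted to a dotted path with a separate value, as asked in the question. This is a small sketch; nest is a hypothetical helper name.

```ruby
def nest(path, value)
  # Walk the keys from innermost to outermost, wrapping the accumulator
  # in a new single-key hash at each step. The value is the starting accumulator.
  path.split('.').reverse.inject(value) { |acc, key| { key => acc } }
end

result = nest('foo.bar.baz', 'qux')
p result
p result['foo']['bar']['baz']  # => "qux"
```

Passing the value as the initial argument to inject means a single-key path such as 'foo' still yields a hash rather than a bare string.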
I'd probably use recursion. For example:
def hasherizer(arr, value)
if arr.empty?
value
else
{}.tap do |hash|
hash[arr.shift] = hasherizer(arr, value)
end
end
end
This results in:
> hasherizer 'foo.bar.baz'.split('.'), 'qux'
=> {"foo"=>{"bar"=>{"baz"=>"qux"}}}
I like the method below, which is meant to be defined on Hash (or on your own hash class) so that deep_hash defaults to self. It creates new keys or reuses existing ones as it walks the path, then adds or updates the value.
# set a new or existing nested key's value by a dotted-string key
def dotkey_set(dottedkey, value, deep_hash = self)
keys = dottedkey.to_s.split('.')
first = keys.first
if keys.length == 1
deep_hash[first] = value
else
# in the case that we are creating a hash from a dotted key, we'll assign a default
deep_hash[first] = (deep_hash[first] || {})
dotkey_set(keys.slice(1..-1).join('.'), value, deep_hash[first])
end
end
Usage:
hash = {}
hash.dotkey_set('how.are.you', 'good')
# => "good"
hash
# => {"how"=>{"are"=>{"you"=>"good"}}}
hash.dotkey_set('how.goes.it', 'fine')
# => "fine"
hash
# => {"how"=>{"are"=>{"you"=>"good"}, "goes"=>{"it"=>"fine"}}}
I did something similar when I wrote an HTTP server that had to move all the parameters passed in the request into a multiple value hash which might contain arrays or strings or hashes...
You can look at the code for the Plezi server and framework... although the code over there deals with values surrounded with []...
It could possibly be adjusted like so:
def add_param_to_hash param_name, param_value, target_hash = {}
begin
a = target_hash
p = param_name.split(/[\/\.]/)
val = param_value
# the following, somewhat complex line, runs through the existing (?) tree, making sure to preserve existing values and add values where needed.
p.each_index { |i| p[i].strip! ; n = p[i].match(/^[0-9]+$/) ? p[i].to_i : p[i].to_sym ; p[i+1] ? [ ( a[n] ||= ( p[i+1].empty? ? [] : {} ) ), ( a = a[n]) ] : ( a.is_a?(Hash) ? (a[n] ? (a[n].is_a?(Array) ? (a << val) : a[n] = [a[n], val] ) : (a[n] = val) ) : (a << val) ) }
rescue Exception => e
warn "(Silent): parameters parse error for #{param_name} ... maybe conflicts with a different set?"
target_hash[param_name] = param_value
end
end
This should preserve existing values while adding new values if they exist.
The long line looks something like this when broken down:
def add_param_to_hash param_name, param_value, target_hash = {}
begin
# a will hold the object to which we Add.
# As we walk the tree we change `a`. we start at the root...
a = target_hash
p = param_name.split(/[\/\.]/)
val = param_value
# the following, somewhat complex line, runs through the existing (?) tree, making sure to preserve existing values and add values where needed.
p.each_index do |i|
p[i].strip!
# converts the current key string to either numbers or symbols... you might want to replace this with: n=p[i]
n = p[i].match(/^[0-9]+$/) ? p[i].to_i : p[i].to_sym
if p[i+1]
a[n] ||= ( p[i+1].empty? ? [] : {} ) # is the new object we'll add to
a = a[n] # move to the next branch.
else
if a.is_a?(Hash)
if a[n]
if a[n].is_a?(Array)
a << val
else
a[n] = [a[n], val]
end
else
a[n] = val
end
else
a << val
end
end
end
rescue Exception => e
warn "(Silent): parameters parse error for #{param_name} ... maybe conflicts with a different set?"
target_hash[param_name] = param_value
end
end
Brrr... Looking at the code like this, I wonder what I was thinking...

Ruby: get correct types on values in query string

Say I have
str = "a=2&b=3.05&c=testing"
I run
require 'cgi'
out = {}
CGI::parse(str).each { |k,v| out[k] = v[0] }
When I output a, the 2 is a string, but I want it to be an integer:
out['a'] # "2" (instead of int 2)
out['b'] # "3.05" (instead of float 3.05)
Is there any way to correct the types from the query string?
Update:
Added this method to test for numbers
def is_a_number?(s)
s.to_s.match(/\A[+-]?\d+?(\.\d+)?\Z/) == nil ? false : true
end
and during the parse
CGI::parse(url).each do |k,v|
val = v[0]
if is_a_number? val
val = val.include?('.') ? val.to_f : val.to_i
end
out[k] = val
end
Seems to work with basic examples. Is there anything unsafe about this?
The short answer is no, there's no way to just get the correct type out. You could write your own parser that tries to guess based on regex matches. The typical way this is handled is that you parse them manually based on the expected type of each parameter. You can call methods like to_i and to_f to convert them to the types you want.
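Parsing by expected type could look like the sketch below. The CASTS table and the parse_query name are hypothetical, and URI.decode_www_form is used here in place of CGI::parse purely as a substitute for splitting the string into pairs.

```ruby
require 'uri'

# Hypothetical schema: parameter name => conversion.
# Parameters that are not listed are left as strings.
CASTS = {
  'a' => ->(v) { Integer(v, 10) },  # raises ArgumentError for input like "2x"
  'b' => ->(v) { Float(v) }         # raises ArgumentError for input like "abc"
}.freeze

def parse_query(str, casts = CASTS)
  URI.decode_www_form(str).each_with_object({}) do |(key, value), out|
    cast = casts[key]
    out[key] = cast ? cast.call(value) : value
  end
end

out = parse_query("a=2&b=3.05&c=testing")
p out['a']  # => 2
p out['b']  # => 3.05
p out['c']  # => "testing"
```

Integer() and Float() are strict, so a malformed value raises instead of silently becoming 0 as it would with to_i or to_f. If a key is repeated, this sketch keeps the last value.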
Edited: This works
require 'cgi'
str = "a=2&b=3.05&c=testing"
out = {}
def typecasted(str)
[str.to_i, str.to_f, str].find { |cast| cast.to_s == str }
end
CGI::parse(str).each do |key, val|
out[key] = typecasted val.first
end
p out
# => {"a"=>2, "b"=>3.05, "c"=>"testing"}
If you parse it like this you shouldn't have a problem
out = {}
CGI::parse(str).each do |k, v|
v, v = (v = v.first), (v if v[/[a-zA-Z]/]) || [v.to_i, v.to_f].max
out.merge!(Hash[k, v])
end
Combined with the technique of AJcodez this gives
out = {}
CGI::parse(str).each do |k, v|
v, out[k] = (v = v.first), [v.to_i, v.to_f, v].find { |c| c.to_s == v }
end
Or as a one-liner
Hash[*CGI::parse(str).map {|k, v| v = v.first; [k, [v.to_i, v.to_f, v].find { |c| c.to_s == v }]}.flatten]
gives
{"a"=>2, "b"=>3.05, "c"=>"testing"}
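I can't think of a way with the input string you have, but if you can change it to
str = "a=2&b=3.05&c='testing'"
(notice the single quotes), you could use eval on each value and let Ruby guess the types. Be aware that eval runs whatever code the value contains, so this is only an option for input you fully trust, which a query string usually is not.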

Ruby: sorting 2d array and output similar field value to files

I have an array which I read from Excel (using ParseExcel) with the following code:
workbook = Spreadsheet::ParseExcel.parse("test.xls")
rows = workbook.worksheet(1).map() { |r| r }.compact
grid = rows.map() { |r| r.map() { |c| c.to_s('latin1') unless c.nil?}.compact rescue nil }
grid.sort_by { |k| k[2]}
test.xls has lots of rows and 6 columns. The code above sorts by column 3.
I would like to output the rows in the array "grid" to several text files like this:
- After sorting, all rows that have the same value in column 3 should be written to one file, with a separate file for each distinct value in column 3.
I hope I explained this right. Thanks for any help/tips.
P.S. I searched through most postings on this site but could not find a solution.
Instead of using your code above, I made a 100-row test array, each row being a 6-element array.
You pass in the array, and the column number you want matched, and this method prints into separate files rows that have the same nth element.
Since I used integers, I used the nth element of each row as the filename. You could use a counter, or the md5 of the element, or something like that, if your nth element does not make a good filename.
a = []
100.times do
b = []
6.times do
b.push rand(10)
end
a.push(b)
end
def print_files(a, column)
h = Hash.new
a.each do |element|
# group rows by the value in the requested column
(h[element[column]] ||= []) << element
end
h.each do |k, v|
File.open("output/" + k.to_s, 'w') do |f|
v.each do |line|
f.puts line.join(", ")
end
end
end
end
print_files(a, 2)
Here is the same code using braces instead of do .. end:
a = Array.new
100.times{b = Array.new;6.times{b.push rand(10)};a.push(b)}
def print_files(a, column)
h = Hash.new
a.each{|element| (h[element[column]] ||= []) << element}
h.map{|k, v| File.open("output/" + k.to_s, 'w'){|f| v.map{|line| f.puts line.join(", ")}}}
end
print_files(a, 2)
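The grouping step can also be expressed with Enumerable#group_by. The following is a sketch, not a drop-in replacement: write_groups, the dir parameter, and the .txt suffix are assumptions made for illustration, and the column value is assumed to be usable as a file name.

```ruby
require 'fileutils'

def write_groups(rows, column, dir)
  FileUtils.mkdir_p(dir)  # create the output directory if it is missing
  # group_by returns a hash of column value => rows sharing that value.
  rows.group_by { |row| row[column] }.map do |value, group|
    path = File.join(dir, "#{value}.txt")
    File.write(path, group.map { |row| row.join(', ') }.join("\n") + "\n")
    path  # collect the written paths as the return value
  end
end

rows = [[1, 2, 'x'], [3, 4, 'y'], [5, 6, 'x']]
paths = write_groups(rows, 2, 'output')
p paths.size  # => 2
```

Each distinct value in the chosen column produces one file containing all rows that share it, in their original order.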

Resources