"natural" sort an array of hashes in Ruby

There are workable answers for sorting an array of hashes and for natural sorting, but what is the best way to do both at once?
my_array = [ {"id":"some-server-1","foo":"bar"},{"id":"some-server-2","foo":"bat"},{"id":"some-server-10","foo":"baz"} ]
I would like to sort on "id" such that the final ordering is:
some-server-1
some-server-2
some-server-10
I feel like there must be a clever and efficient way to do this, though personally I don't need to break any speed records and will only be sorting a few hundred items. Can I implement a comparison function in sort_by?

First of all, your my_array is JavaScript/JSON so I'll assume that you really have this:
my_array = [
  {"id" => "some-server-1", "foo" => "bar"},
  {"id" => "some-server-2", "foo" => "bat"},
  {"id" => "some-server-10", "foo" => "baz"}
]
Then you just need to sort_by the numeric suffix of the 'id' values:
my_array.sort_by { |e| e['id'].sub(/^some-server-/, '').to_i }
If the "some-server-" prefixes aren't always "some-server-" then you could try something like this:
my_array.sort_by { |e| e['id'].scan(/\D+|\d+/).map { |x| x =~ /\d/ ? x.to_i : x } }
That would split the 'id' values into numeric and non-numeric pieces, convert the numeric pieces to integers, and then compare the mixed string/integers arrays using the Array <=> operator (which compares component-wise); this will work as long as the numeric and non-numeric components always match up. This approach would handle this:
my_array = [
  {"id" => "some-server-1", "foo" => "bar"},
  {"id" => "xxx-10", "foo" => "baz"}
]
but not this:
my_array = [
  {"id" => "11-pancakes-23", "foo" => "baz"},
  {"id" => "some-server-1", "foo" => "bar"}
]
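To see why that last array breaks, here is a small sketch. The helper name chunk_key is made up for illustration and just wraps the scan/map expression above. The first chunks are an Integer and a String, Integer#<=> returns nil for that pair, so Array#<=> returns nil as well and the sort raises ArgumentError.

```ruby
# chunk_key is a hypothetical helper wrapping the scan/map expression above.
def chunk_key(s)
  s.scan(/\D+|\d+/).map { |x| x =~ /\d/ ? x.to_i : x }
end

a = chunk_key("11-pancakes-23")  # first chunk is an Integer
b = chunk_key("some-server-1")   # first chunk is a String

comparison = a <=> b             # nil: an Integer and a String are not comparable

# Sorting the two keys fails because the comparison result is nil.
sort_failed =
  begin
    [a, b].sort
    false
  rescue ArgumentError
    true
  end
```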
If you need to handle this last case then you'd need to compare the arrays entry-by-entry by hand and adjust the comparison based on what you have. You could still get some of the advantages of the sort_by Schwartzian Transform with something like this (not very well tested code):
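class NaturalCmp
  include Comparable
  attr_accessor :chunks

  def initialize(s)
    # Split into digit and non-digit runs; digit runs become Integers.
    @chunks = s.scan(/\D+|\d+/).map { |x| x =~ /\d/ ? x.to_i : x }
  end

  def <=>(other)
    i = 0
    result = @chunks.inject(0) do |cmp, e|
      oe = other.chunks[i]
      i += 1
      if cmp == 0
        # Same class: use that class's <=>; otherwise compare as Strings.
        cmp = e.class == oe.class ? e <=> oe : e.to_s <=> oe.to_s
      end
      cmp
    end
    # If every shared chunk matched, the side with fewer chunks sorts first.
    result == 0 ? @chunks.length <=> other.chunks.length : result
  end
end
my_array.sort_by { |e| NaturalCmp.new(e['id']) }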
The basic idea here is to push the comparison noise off to another class to keep the sort_by from degenerating into an incomprehensible mess. Then we use the same scanning as before to break the strings into pieces and implement the array <=> comparator by hand. If two components have the same class, we let that class's <=> deal with them; otherwise we force both components to String and compare them as such. We only care about the first non-zero result.
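A different way to get the same effect while staying with a plain sort_by key is sketched below. It is not the approach of the class above: each chunk is tagged with its kind so that Array#<=> never has to compare an Integer with a String. The helper name natural_key is hypothetical, and sorting digit runs ahead of text is an arbitrary choice made for this sketch.

```ruby
# natural_key is a hypothetical helper, not part of the answer above.
# Each chunk becomes a two-element array: [0, Integer] for digit runs and
# [1, String] for everything else. The leading tag decides mixed cases, so
# values of different classes are never compared with each other.
def natural_key(s)
  s.scan(/\d+|\D+/).map { |chunk| chunk =~ /\A\d/ ? [0, chunk.to_i] : [1, chunk] }
end

my_array = [
  {"id" => "some-server-10", "foo" => "baz"},
  {"id" => "11-pancakes-23", "foo" => "baz"},
  {"id" => "some-server-2",  "foo" => "bat"},
  {"id" => "some-server-1",  "foo" => "bar"}
]

sorted_ids = my_array.sort_by { |e| natural_key(e["id"]) }.map { |e| e["id"] }
# sorted_ids => ["11-pancakes-23", "some-server-1", "some-server-2", "some-server-10"]
```

Because the keys are ordinary arrays, sort_by still computes each key only once per element.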

@mu gives a more than adequate answer for my case, but I also figured out the syntax for introducing arbitrary comparisons:
def compare_ids(a, b)
  # Whatever code you want here
  # Return -1, 0, or 1
end
sorted_array = my_array.sort { |a, b| compare_ids(a["id"], b["id"]) }
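As an illustration only, one possible body for such a comparator is sketched here. It assumes the ids end in a run of digits, as in the question, and it is not the asker's actual code.

```ruby
# One possible body for compare_ids (an assumption for illustration):
# compare the trailing run of digits numerically, then fall back to a plain
# String comparison when the numbers are equal.
def compare_ids(a, b)
  by_number = a[/\d+\z/].to_i <=> b[/\d+\z/].to_i
  by_number == 0 ? (a <=> b) : by_number
end

my_array = [
  {"id" => "some-server-10", "foo" => "baz"},
  {"id" => "some-server-1",  "foo" => "bar"},
  {"id" => "some-server-2",  "foo" => "bat"}
]

sorted_array = my_array.sort { |a, b| compare_ids(a["id"], b["id"]) }
sorted_ids = sorted_array.map { |e| e["id"] }
# sorted_ids => ["some-server-1", "some-server-2", "some-server-10"]
```

Unlike sort_by, the block form of sort re-runs the regex on every comparison, which is fine for a few hundred items.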

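I think that if the id values begin with their number, you could try this (note that String#to_i returns 0 for a string such as "some-server-1", so it will not order the ids in the question):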
my_array.sort { |a,b| a["id"].to_i <=> b["id"].to_i }

Related

Is there a better solution to partition a hash into two hashes?

I wrote a method to split a hash into two hashes based on a criterion (a particular hash value). My question is different from another question on Hash. Here is an example of what I expect:
h = {
  :a => "FOO",
  :b => "FOO",
  :c => "BAR",
  :d => "BAR",
  :e => "FOO"
}
h_foo, h_bar = partition(h)
I need h_foo and h_bar to be like:
h_foo = {
  :a => "FOO",
  :b => "FOO",
  :e => "FOO"
}
h_bar = {
  :c => "BAR",
  :d => "BAR"
}
My solution is:
def partition h
  h.group_by{|k,v| v=="FOO"}.values.collect{|ary| Hash[*ary.flatten]}
end
Is there a clever solution?

There's Enumerable#partition:
h.partition { |k, v| v == "FOO" }.map(&:to_h)
#=> [{:a=>"FOO", :b=>"FOO", :e=>"FOO"}, {:c=>"BAR", :d=>"BAR"}]
Or you could use Enumerable#each_with_object to avoid the intermediate arrays:
h.each_with_object([{}, {}]) { |(k, v), (h_foo, h_bar)|
  v == "FOO" ? h_foo[k] = v : h_bar[k] = v
}
#=> [{:a=>"FOO", :b=>"FOO", :e=>"FOO"}, {:c=>"BAR", :d=>"BAR"}]

I don't think there is a clever one-liner, but you can make it slightly more generic by doing something like:
def transpose(h,k,v)
  # Collect every key under the value it maps to.
  h[v] ||= []
  h[v] << k
end
def partition(h)
  n = {}
  h.each{|k,v| transpose(n,k,v)}
  # Rebuild one hash per distinct value.
  n.map{|k,v| Hash[v.map{|e| [e, k]}] }
end
which will yield
[{:a=>"FOO", :b=>"FOO", :e=>"FOO"}, {:c=>"BAR", :d=>"BAR"}]
when run against your initial hash h
Edit - TIL about partition. Wicked.

Why not use builtin partition, which is doing almost exactly what you are looking for?
h_foo, h_bar = h.partition { |key, value| value == 'FOO' }
The only downside is that you will get arrays instead of hashes (but you already know how to convert those). In Ruby 2.1+ you could simply call .map(&:to_h) at the end of the call chain.
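For Rubies older than 2.1, where to_h is not available, the conversion can be sketched with Hash[], which accepts an array of [key, value] pairs directly. The variable names foo_pairs and bar_pairs are only for illustration.

```ruby
h = { :a => "FOO", :b => "FOO", :c => "BAR", :d => "BAR", :e => "FOO" }

# partition returns two arrays of [key, value] pairs.
foo_pairs, bar_pairs = h.partition { |_key, value| value == "FOO" }

# Hash[] accepts an array of pairs, so no flatten or splat is needed.
h_foo = Hash[foo_pairs]
h_bar = Hash[bar_pairs]
# h_foo => {:a=>"FOO", :b=>"FOO", :e=>"FOO"}
# h_bar => {:c=>"BAR", :d=>"BAR"}
```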

Getting an array of hash values given specific keys

Given certain keys, I want to get an array of values from a hash (in the order I gave the keys). I had done this:
class Hash
  def values_for_keys(*keys_requested)
    result = []
    keys_requested.each do |key|
      result << self[key]
    end
    return result
  end
end
I modified the Hash class because I do plan to use it almost everywhere in my code.
But I don't really like the idea of modifying a core class. Is there a builtin solution instead? (couldn't find any, so I had to write this).

You should be able to use values_at:
values_at(key, ...) → array
Return an array containing the values associated with the given keys. Also see Hash.select.
h = { "cat" => "feline", "dog" => "canine", "cow" => "bovine" }
h.values_at("cow", "cat") #=> ["bovine", "feline"]
The documentation doesn't specifically say anything about the order of the returned array but:
The example implies that the array will match the key order.
The standard implementation does things in the right order.
There's no other sensible way for the method to behave.
For example:
>> h = { :a => 'a', :b => 'b', :c => 'c' }
=> {:a=>"a", :b=>"b", :c=>"c"}
>> h.values_at(:c, :a)
=> ["c", "a"]

I suggest you do this:
your_hash.select{|key,value| given_keys.include?(key)}.values
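The two suggestions are not equivalent, which a short comparison makes visible. This sketch assumes Ruby 1.9 or later, where Hash#select returns a Hash. The :missing key is added here only to show how absent keys are treated.

```ruby
h = { :a => 'a', :b => 'b', :c => 'c' }
given_keys = [:c, :a, :missing]

by_values_at = h.values_at(*given_keys)
# => ["c", "a", nil]  (requested order; nil for the absent key)

by_select = h.select { |key, _value| given_keys.include?(key) }.values
# => ["a", "c"]       (hash insertion order; absent keys are skipped)
```

So values_at is the one that keeps the order in which the keys were given, which is what the question asks for.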

Ruby: cleaner returns from loop iteration methods

I find that I frequently have methods that iterate through an enumerable in order to return a different enumerable or a hash. These methods almost always look like this simplistic example:
def build_hash(array)
  hash = {}
  array.each do |item|
    hash[item[:id]] = item
  end
  hash
end
This approach works, but I've often wondered if there's a cleaner way to do this, specifically without having to wrap the loop in a temporary object so that the return is correct.
Does anyone know of an improved and/or cleaner and/or faster way to do this, or is this pretty much the best way?

Here are a few ways, considering your specific example
arr = [{:id => 1, :name => :foo}, {:id => 2, :name => :bar}]
Hash[arr.map{ |o| [o[:id], o] }]
arr.each_with_object({}){ |o, h| h[o[:id]] = o }
arr.reduce({}){ |h, o| h[o[:id]] = o; h }
arr.reduce({}){ |h, o| h.merge o[:id] => o }
# each of these returns the same Hash
# {1=>{:id=>1, :name=>:foo}, 2=>{:id=>2, :name=>:bar}}

Well, in this case you can use inject and do something like this:
def build_hash(array)
  array.inject({}) { |init, item| init[item[:id]] = item; init }
end

{}.tap { |h| array.each { |a| h[a[:id]] = a } }

Here is another way to convert an Array into a Hash:
list_items = ["1", "Foo", "2", "Bar", "3" , "Baz"]
hss = Hash[*list_items]
# => {"1"=>"Foo", "2"=>"Bar", "3"=>"Baz"}
The number of arguments must be even; otherwise an ArgumentError is raised, because an odd number of arguments can't be mapped to a series of key/value pairs.

You can use ActiveSupport's index_by.
Your example becomes trivial:
def build_hash(array)
  array.index_by{|item| item[:id]}
end
There is no really great way to build a hash in Ruby currently, even in Ruby 2.0.
You can use Hash[], although I find that very ugly:
def build_hash(array)
  Hash[array.map{|item| [item[:id], item]}]
end
If we can convince Matz, you could at least:
def build_hash(array)
  array.map{|item| [item[:id], item]}.to_h
end
There are other requests for new ways to create hashes.
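For readers on later Ruby versions than the one this answer was written for: Array#to_h arrived in Ruby 2.1, and since Ruby 2.6 to_h also accepts a block that returns a [key, value] pair. A minimal sketch, assuming Ruby 2.6 or newer:

```ruby
def build_hash(array)
  # Requires Ruby 2.6+; the block returns a [key, value] pair for each item.
  array.to_h { |item| [item[:id], item] }
end

arr = [{:id => 1, :name => :foo}, {:id => 2, :name => :bar}]
result = build_hash(arr)
# result => {1=>{:id=>1, :name=>:foo}, 2=>{:id=>2, :name=>:bar}}
```

This avoids both the Hash[] call and the intermediate array of pairs that map would build.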

Bidirectional Hash table in Ruby

I need a bidirectional Hash table in Ruby. For example:
h = {:abc => 123, :xyz => 789, :qaz => 789, :wsx => [888, 999]}
h.fetch(:xyz) # => 789
h.rfetch(123) # => :abc
h.rfetch(789) # => [:xyz, :qaz]
h.rfetch(888) # => :wsx
Method rfetch means reversed fetch and is only my proposal.
Note three things:
If multiple keys map to the same value, then rfetch returns all of them, packed in an array.
If a value is an array, then rfetch looks for its param among the elements of that array.
Bidirectional Hash means that both fetch and rfetch should execute in constant time.
Does such a structure exist in Ruby (including external libraries)?
I thought about implementing it using two one-directional Hashes synchronized when one of them is modified (and packing it into class to avoid synchronization problems) but maybe I could use an already existing solution?

You could build something yourself pretty easily, just use a simple object that wraps two hashes (one for the forward direction, one for the reverse). For example:
class BiHash
  def initialize
    # Each side maps a key to the list of values stored under it.
    @forward = Hash.new { |h, k| h[k] = [ ] }
    @reverse = Hash.new { |h, k| h[k] = [ ] }
  end

  def insert(k, v)
    @forward[k].push(v)
    @reverse[v].push(k)
    v
  end

  def fetch(k)
    fetch_from(@forward, k)
  end

  def rfetch(v)
    fetch_from(@reverse, v)
  end

  protected

  def fetch_from(h, k)
    return nil if(!h.has_key?(k))
    v = h[k]
    v.length == 1 ? v.first : v.dup
  end
end
Look ups will behave just like normal hash lookups (because they are normal hash lookups). Add some operators and maybe decent to_s and inspect implementations and you're good.
Such a thing works like this:
b = BiHash.new
b.insert(:a, 'a')
b.insert(:a, 'b')
b.insert(:a, 'c')
b.insert(:b, 'a')
b.insert(:c, 'x')
puts b.fetch(:a).inspect # ["a", "b", "c"]
puts b.fetch(:b).inspect # "a"
puts b.rfetch('a').inspect # [:a, :b]
puts b.rfetch('x').inspect # :c
puts b.fetch(:not_there).inspect # nil
puts b.rfetch('not there').inspect # nil
There's nothing wrong with building your tools when you need them.

There is no such structure built-in in Ruby.
Note that Hash#rassoc does something similar, but it returns only the first match and is linear-time:
h = {:abc => 123, :xyz => 789, :qaz => 789, :wsx => [888, 999]}
h.rassoc(123) # => [:abc, 123]
Also, it isn't possible to fulfill your requirements in Ruby in a perfectly safe manner, as you won't be able to detect changes in values that are arrays. E.g.:
h = MyBidirectionalArray.new(:foo => 42, :bar => [:hello, :world])
h.rfetch(:world) # => :bar
h[:bar].shift
h[:bar] # => [:world]
h.rfetch(:world) # => should be nil, but how to detect this??
Computing a hash every time to detect a change will make your lookup linear-time. You could duplicate the array values and freeze them, though (like Ruby does for Hash keys that are strings!)
What you seem to need is a Graph class, which could have a different API than a Hash, no? You can check out rgl or similar, but I don't know how they're implemented.
Good luck.
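The duplicate-and-freeze idea can be sketched as follows. This is only an illustration under stated assumptions: the class name TwoWayIndex and its methods store and delete are made up, array values are indexed by element as the question asks, and only the array itself is frozen (its elements could still be mutated).

```ruby
# TwoWayIndex is a hypothetical name; this is a sketch, not a library class.
class TwoWayIndex
  def initialize
    @forward = {}
    @reverse = Hash.new { |hash, key| hash[key] = [] }
  end

  # Stores a frozen copy of array values so they cannot change behind the
  # index's back, then records the key under each element.
  def store(key, value)
    delete(key)
    stored = value.is_a?(Array) ? value.dup.freeze : value
    @forward[key] = stored
    elements_of(stored).each { |element| @reverse[element] << key }
    stored
  end

  # Removes the key from both directions; returns the stored value or nil.
  def delete(key)
    return nil unless @forward.key?(key)
    stored = @forward.delete(key)
    elements_of(stored).each do |element|
      @reverse[element].delete(key)
      @reverse.delete(element) if @reverse[element].empty?
    end
    stored
  end

  def fetch(key)
    @forward[key]
  end

  # Returns a single key, an array of keys, or nil when nothing maps here.
  def rfetch(value)
    keys = @reverse.fetch(value, nil)
    return nil if keys.nil?
    keys.length == 1 ? keys.first : keys.dup
  end

  private

  def elements_of(stored)
    stored.is_a?(Array) ? stored.uniq : [stored]
  end
end

index = TwoWayIndex.new
{ :abc => 123, :xyz => 789, :qaz => 789, :wsx => [888, 999] }.each do |key, value|
  index.store(key, value)
end
```

Both fetch and rfetch are plain hash lookups here. Updates cost a little more, because replacing a key has to remove it from the reverse lists it appeared in.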

There is a Hash#invert method (http://www.ruby-doc.org/core-2.1.0/Hash.html#method-i-invert) to achieve this. It won't map multiple values to an array though.

Try this:
class Hash
  def rfetch val
    select { |k,v| v.is_a?(Array) ? v.include?(val) : v == val }.map { |x| x[0] }
  end
end

If you're not doing lots of updates to this hash, you might be able to use inverthash.

Convert array-of-hashes to a hash-of-hashes, indexed by an attribute of the hashes

I've got an array of hashes representing objects as a response to an API call. I need to pull data from some of the hashes, and one particular key serves as an id for the hash object. I would like to convert the array into a hash with the keys as the ids, and the values as the original hash with that id.
Here's what I'm talking about:
api_response = [
  { :id => 1, :foo => 'bar' },
  { :id => 2, :foo => 'another bar' },
  # ..
]
ideal_response = {
  1 => { :id => 1, :foo => 'bar' },
  2 => { :id => 2, :foo => 'another bar' },
  # ..
}
There are two ways I could think of doing this.
Map the data to the ideal_response (below)
Use api_response.find { |x| x[:id] == i } for each record I need to access.
A method I'm unaware of, possibly involving a way of using map to build a hash, natively.
My method of mapping:
keys = data.map { |x| x[:id] }
mapped = Hash[*keys.zip(data).flatten]
I can't help but feel like there is a more performant, tidier way of doing this. Option 2 is very performant when there are a very minimal number of records that need to be accessed. Mapping excels here, but it starts to break down when there are a lot of records in the response. Thankfully, I don't expect there to be more than 50-100 records, so mapping is sufficient.
Is there a smarter, tidier, or more performant way of doing this in Ruby?

Ruby <= 2.0
> Hash[api_response.map { |r| [r[:id], r] }]
#=> {1=>{:id=>1, :foo=>"bar"}, 2=>{:id=>2, :foo=>"another bar"}}
However, Hash::[] is pretty ugly and breaks the usual left-to-right OOP flow. That's why Facets proposed Enumerable#mash:
> require 'facets'
> api_response.mash { |r| [r[:id], r] }
#=> {1=>{:id=>1, :foo=>"bar"}, 2=>{:id=>2, :foo=>"another bar"}}
This basic abstraction (convert enumerables to hashes) was asked to be included in Ruby long ago, alas, without luck.
Note that your use case is covered by Active Support: Enumerable#index_by
Ruby >= 2.1
[UPDATE] Still no love for Enumerable#mash, but now we have Array#to_h. It creates an intermediate array, but it's better than nothing:
> object = api_response.map { |r| [r[:id], r] }.to_h

Something like:
ideal_response = api_response.group_by{|i| i[:id]}
#=> {1=>[{:id=>1, :foo=>"bar"}], 2=>[{:id=>2, :foo=>"another bar"}]}
It uses Enumerable's group_by, which works on collections, returning matches for whatever key value you want. Because it expects to find multiple occurrences of matching key-value hits it appends them to arrays, so you end up with a hash of arrays of hashes. You could peel back the internal arrays if you wanted but could run a risk of overwriting content if two of your hash IDs collided. group_by avoids that with the inner array.
Accessing a particular element is easy:
ideal_response[1][0] #=> {:id=>1, :foo=>"bar"}
ideal_response[1][0][:foo] #=> "bar"
The way you show at the end of the question is another valid way of doing it. Both are reasonably fast and elegant.
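If the inner arrays are unwanted, peeling them back can be sketched like this. It assumes Ruby 2.4 or later for Hash#transform_values, and the duplicate check is an addition made here to avoid silently dropping records that share an id.

```ruby
api_response = [
  { :id => 1, :foo => 'bar' },
  { :id => 2, :foo => 'another bar' }
]

grouped = api_response.group_by { |record| record[:id] }

# Fail loudly if two records share an id, then unwrap each one-element array.
# Hash#transform_values needs Ruby 2.4 or later.
duplicates = grouped.select { |_id, records| records.length > 1 }.keys
raise ArgumentError, "duplicate ids: #{duplicates.inspect}" unless duplicates.empty?

ideal_response = grouped.transform_values(&:first)
# ideal_response[1][:foo] => "bar"
```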

For this I'd probably just go:
ideal_response = api_response.each_with_object(Hash.new) { |o, h| h[o[:id]] = o }
Not super pretty with the multiple brackets in the block but it does the trick with just a single iteration of the api_response.
