Groovy built-in sorting method vs spaceship operator

Groovy lists have a sort() method to which you can pass a criterion by which to sort the collection. For example:
def foo = ["abc", "de", "fghi"]
println foo.sort { it.size() }
displays "[de, abc, fghi]".
I thought that under the hood this mechanism calls Java's sort with the spaceship operator, because the following code snippets display the same result:
def foo = ["abc", "de", "fghi"]
def bar = ["abc", "de", "fghi"]
println foo.sort { it }
println bar.sort { a, b -> a <=> b }
However, this is not the case, because:
def someDate = new Wrapper(new Date())
def someString = new Wrapper("16")
def someNumber = new Wrapper(4)
def foo = [someDate, someString, someNumber]
def bar = [someDate, someString, someNumber]
println foo.sort { it.value } // Produces "valid" output: [4, 16, Wed Jun 01..]
println bar.sort { a, b -> a.value <=> b.value } // Throws ClassCastException
// Simulates the JSON object I am working with
class Wrapper {
Object value
Object value2
Wrapper(value) {
this.value = value
}
@Override
String toString() {
return value
}
}
So, how does it really work? Is there any way I can get the Groovy-style comparator that sort uses for a one-parameter closure and tune it a bit (for example, so that I can use the field value2 as a secondary sort key)?

When you call sort { x -> ... } with a one-parameter closure, an OrderBy comparator is used.
In my experience, the result is unpredictable when the list contains values of different types.
Just try to run the following script and predict its output:
println( ['xx', new Date(), 1199999999].sort{it} )
println( ['xx', new Date(), 11999999999].sort{it} )
However, if you still want to use it:
println foo.toSorted ( new OrderBy([{it.value},{it.value2}]) )

Accessing nested hashes with accessors in ruby

So I have a hash:
a = {
foo: {
bar: 1
}
}
Now I can access value 1 with a[:foo][:bar].
How would I go about generating methods from this automatically, so that I could access the value with a.foo.bar?
Is this even possible? If it is, how could I generate this for a predetermined hash?
This is doable with OpenStruct (https://ruby-doc.org/stdlib-3.0.0/libdoc/ostruct/rdoc/OpenStruct.html) from the standard library.
To make this work recursively we can use:
require 'ostruct'
def to_os(obj)
case obj
when Hash
OpenStruct.new(obj.transform_values { |h| to_os(h) })
when Array
obj.map { |o| to_os(o) }
else
obj
end
end
a = { foo: { bar: 1 } }
b = to_os(a)
puts b[:foo]
puts b.foo
puts b[:foo][:bar]
But as noted in the comments, every nested hash becomes an OpenStruct, which means that the output of the code above is:
#<OpenStruct bar=1>
#<OpenStruct bar=1>
1
So we can deduce from this that the nested hash is lost: it has been replaced by an OpenStruct.
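If the plain nested hash is needed again later, one possible workaround is to convert back. The following is only a sketch: it repeats to_os in a compact form and adds a hypothetical helper, to_plain (my own name, not part of OpenStruct), that undoes the conversion using OpenStruct#to_h.

```ruby
require 'ostruct'

# Same conversion as above: hashes become OpenStructs, recursively.
def to_os(obj)
  case obj
  when Hash  then OpenStruct.new(obj.transform_values { |v| to_os(v) })
  when Array then obj.map { |o| to_os(o) }
  else obj
  end
end

# Hypothetical inverse (an assumption for illustration):
# turns OpenStructs back into plain hashes, recursively.
def to_plain(obj)
  case obj
  when OpenStruct then obj.to_h.transform_values { |v| to_plain(v) }
  when Array      then obj.map { |o| to_plain(o) }
  else obj
  end
end

a = { foo: { bar: 1 }, list: [{ baz: 2 }] }
b = to_os(a)

puts b.foo.bar          # chained accessors work on nested hashes
puts b.list.first.baz   # hashes inside arrays are converted too
puts to_plain(b) == a   # the round trip restores the nested hash
```

Note that OpenStruct#to_h returns symbol keys, so the round trip only reproduces the original exactly when the original used symbol keys as well.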

Ruby - Safely navigated hash when a key could be a string or further hash

I'm stuck trying to safely navigate a hash parsed from JSON.
The JSON could have a string, e.g.:
h2 = { location: 'Australia' }
or it could be further nested:
h1 = { location: { formatted: 'Australia', code: 'AU' } }
Calling
h2.dig(:location, :formatted)
then fails with "String does not have #dig method".
Basically I'm trying to load the JSON and then populate the Rails model with whatever data is available, some of which may be optional. It seems backwards to check every nested step with an if.
Hash#dig has no magic. It walks the keys one at a time, calling #dig on whatever the previous step returned.
h1 = { location: { formatted: 'Australia', code: 'AU' } }
h1.dig :location, :code
#⇒ "AU"
It works, because h1[:location] returns a hash.
h2 = { location: 'Australia' }
h2.dig :location, :code
It raises a TypeError, because h2[:location] returns a String, which has no #dig method.
That said, the solution would be to reimplement Hash#dig, as usual :)
This is extremely trivial: just take a list of keys to dig and (surprise) reduce over it, returning either the value or nil.
%i|location code|.reduce(h2) do |acc, e|
acc.is_a?(Hash) ? acc[e] : nil
end
#⇒ nil
%i|location code|.reduce(h1) do |acc, e|
acc.is_a?(Hash) ? acc[e] : nil
end
#⇒ "AU"
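The reduce above can be wrapped in a small method so it is reusable. This is a minimal sketch; dig_or_nil is a hypothetical name chosen for illustration, not an existing Ruby method.

```ruby
# Hypothetical helper wrapping the reduce shown above: follows the keys
# while the current value is a Hash, and yields nil as soon as it is not.
def dig_or_nil(obj, *keys)
  keys.reduce(obj) { |acc, key| acc.is_a?(Hash) ? acc[key] : nil }
end

h1 = { location: { formatted: 'Australia', code: 'AU' } }
h2 = { location: 'Australia' }

p dig_or_nil(h1, :location, :code)   # nested hash: the value is found
p dig_or_nil(h2, :location, :code)   # String in the way: nil, no exception
p dig_or_nil(h2, :location)          # a single key still returns the String

# For comparison, the built-in dig raises when it hits the String.
begin
  h2.dig(:location, :code)
rescue TypeError => e
  puts "Hash#dig raised #{e.class}"
end
```

One design consequence: a missing key and a non-hash intermediate value both produce nil, so the caller cannot tell the two cases apart.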
Shameless plug: you might find the gem iteraptor, which I created for this exact purpose, useful.
You can use a simple piece of code like this:
def nested_value(hash, key)
return hash if key == ''
keys = key.split('.')
value = hash[keys.first] || hash[keys.first.to_sym]
return value unless value.is_a?(Hash)
nested_value(value, keys[1..-1].join('.'))
end
h1 = { location: { formatted: 'Australia', code: 'AU' } }
h2 = { 'location' => 'Australia' }
p nested_value(h1, 'location.formatted') # => "Australia"
p nested_value(h2, 'location.formatted') # => "Australia"
You can also use this method to get any nested value of a hash by providing the key in the format foo.bar.baz.qux. The method also doesn't care whether the hash has string keys or symbol keys.
I don't know if this leads to the expected behaviour (see the examples below), but you can define a patch for the Hash class as follows:
module MyHashPatch
def safe_dig(params) # ok, call as you like..
tmp = self
res = nil
params.each do |param|
if (tmp.is_a? Hash) && (tmp.has_key? param)
tmp = tmp[param]
res = tmp
else
break
end
end
res
end
end
Hash.include MyHashPatch
Then test on your hashes:
h1 = { location: { formatted: 'Australia', code: 'AU' } }
h2 = { location: 'Australia' }
h1.safe_dig([:location, :formatted]) #=> "Australia"
h2.safe_dig([:location, :formatted]) #=> "Australia"
h1.safe_dig([:location, :code]) #=> "AU"
h2.safe_dig([:location, :code]) #=> "Australia"

Ruby 2.5 efficient way to delete ruby key if it contains a hash with only one key/val pair

Assuming a data structure that looks like the following:
foo = {
'first': {
'bar': 'foo'
},
'second': {
'bar': 'foobar',
'foo': 'barfoo'
},
'third': {
'test': 'example'
}
}
I want to remove all keys from the Hash foo that contain an entry that has only one key/val pair. In this particular case, after the operation is done, foo should only have left:
foo = {
'second': {
'bar': 'foobar',
'foo': 'barfoo'
}
}
as foo[:first] and foo[:third] only contain one key/val pair (note that the 'first': syntax creates symbol keys).
Option 1 - delete_if
foo.delete_if { |_, inner| inner.one? }
delete_if is destructive, so it mutates the original hash.
Note that this lets empty hashes through.
Option 2 - reject
This version doesn't mutate the original:
foo = foo.reject { |_, inner| inner.one? }
This also lets empty hashes through.
Option 3 - select
No mutation, plus a different predicate:
foo = foo.select { |_, inner| inner.size > 1 }
Option 4 - many? - Rails only
foo = foo.select { |_, inner| inner.many? }
If you're using Rails, ActiveSupport defines #many? for you; it is true for any collection with more than one element.
Other Notes
I used _ for the unused block parameter, as that's a conventional way of showing "this is irrelevant".
I named the variable inner; I'm convinced there's a better name, but value could be confusing.
Just a couple more options, leaving aside how the condition itself is checked.
Using Hash#keep_if
foo.keep_if{ |_, v| v.size > 1 }
And, a bit more complicated, using Enumerable#each_with_object:
foo.each_with_object({}){ |(k,v), h| h[k] = v if v.size > 1 }
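To make the differences between these options concrete, here is a small sketch. The sample data extends the hash from the question with an extra :empty entry (my addition, purely for illustration) to show which predicates let empty hashes through and which calls mutate the receiver.

```ruby
# Sample data: the question's hash plus an empty inner hash (an assumption
# added here to exercise the "lets empty hashes through" remark).
def sample
  {
    first:  { bar: 'foo' },
    second: { bar: 'foobar', foo: 'barfoo' },
    third:  { test: 'example' },
    empty:  {}
  }
end

foo = sample

# Non-destructive variants return a new hash and leave foo alone.
kept_by_reject = foo.reject { |_, inner| inner.one? }
kept_by_select = foo.select { |_, inner| inner.size > 1 }

p kept_by_reject.keys   # one? is false for {}, so :empty survives here
p kept_by_select.keys   # size > 1 drops :empty as well
p foo.size              # the original still has all four entries

# Destructive variant: foo itself is changed.
foo.delete_if { |_, inner| inner.one? }
p foo.keys
```

If empty inner hashes should be removed too, the size > 1 predicate (or many? in Rails) is the safer choice.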

How do you check for matching keys in a ruby hash?

I'm learning to code, and one of the assignments is to return the names of people who like the same TV show.
I have managed to get it working and to pass the tests, but I'm wondering if I've taken the 'long way around' and whether there is a simpler solution.
Here is the setup and test:
class TestFriends < MiniTest::Test
def setup
@person1 = {
name: "Rick",
age: 12,
monies: 1,
friends: ["Jay","Keith","Dave", "Val"],
favourites: {
tv_show: "Friends",
things_to_eat: ["charcuterie"]
}
}
@person2 = {
name: "Jay",
age: 15,
monies: 2,
friends: ["Keith"],
favourites: {
tv_show: "Friends",
things_to_eat: ["soup","bread"]
}
}
@person3 = {
name: "Val",
age: 18,
monies: 20,
friends: ["Rick", "Jay"],
favourites: {
tv_show: "Pokemon",
things_to_eat: ["ratatouille", "stew"]
}
}
@people = [@person1, @person2, @person3]
end
def test_shared_tv_shows
expected = ["Rick", "Jay"]
actual = tv_show(@people)
assert_equal(expected, actual)
end
end
And here is the solution that I found:
def tv_show(people_list)
tv_friends = {}
for person in people_list
if tv_friends.key?(person[:favourites][:tv_show]) == false
tv_friends[person[:favourites][:tv_show]] = [person[:name]]
else
tv_friends[person[:favourites][:tv_show]] << person[:name]
end
end
for array in tv_friends.values()
if array.length() > 1
return array
end
end
end
It passes, but is there a better way of doing this?
I think you could replace those for loops with Array#each. But since you're building a hash from the values in people_list, you could use Enumerable#each_with_object with a new Hash as its object argument; that way the block receives each person hash from people_list along with a new, initially empty hash to fill as needed.
To check whether the hash already has the key person[:favourites][:tv_show], you can use the result of key? directly as a boolean; the comparison with false can be skipped, since the if statement already evaluates the value as true or false.
You can introduce the variables tv_show and name to shorten the code a little, and then select from the values of tv_friends those with a length greater than 1. As this gives you an array inside an array, you can take its first element with first (or [0]).
def tv_show(people_list)
tv_friends = people_list.each_with_object(Hash.new({})) do |person, hash|
tv_show = person[:favourites][:tv_show]
name = person[:name]
hash.key?(tv_show) ? hash[tv_show] << name : hash[tv_show] = [name]
end
tv_friends.values.select { |value| value.length > 1 }.first
end
Also you can omit parentheses when the method call doesn't have arguments.
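As an alternative to building the hash by hand, Enumerable#group_by can do the grouping. This is a different technique from the answer above, shown only as a sketch; the method name tv_show_fans is hypothetical, and returning an empty array when nobody shares a show is my own assumption about the desired behaviour.

```ruby
# Sketch: names of the first group of people who share a favourite TV show.
# Returns [] when no show is shared (an assumption, not from the assignment).
def tv_show_fans(people_list)
  shared = people_list
           .group_by { |person| person.dig(:favourites, :tv_show) }
           .values
           .find { |group| group.size > 1 }
  shared ? shared.map { |person| person[:name] } : []
end

people = [
  { name: 'Rick', favourites: { tv_show: 'Friends' } },
  { name: 'Jay',  favourites: { tv_show: 'Friends' } },
  { name: 'Val',  favourites: { tv_show: 'Pokemon' } }
]

p tv_show_fans(people)
```

group_by keeps the groups in order of first appearance, so the names come back in the same order as in people_list.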

Hash Enumerable methods: Inconsistent behavior when passing only one parameter

Ruby's enumerable methods for Hash expect 2 parameters, one for the key and one for the value:
hash.each { |key, value| ... }
However, I notice that the behavior is inconsistent among the enumerable methods when you only pass one parameter:
student_ages = {
"Jack" => 10,
"Jill" => 12,
}
student_ages.each { |single_param| puts "param: #{single_param}" }
student_ages.map { |single_param| puts "param: #{single_param}" }
student_ages.select { |single_param| puts "param: #{single_param}" }
student_ages.reject { |single_param| puts "param: #{single_param}" }
# results:
each...
param: ["Jack", 10]
param: ["Jill", 12]
map...
param: ["Jack", 10]
param: ["Jill", 12]
select...
param: Jack
param: Jill
reject...
param: Jack
param: Jill
As you can see, for each and map, the single parameter gets assigned to a [key, value] array, but for select and reject, the parameter is only the key.
Is there a particular reason for this behavior? The docs don't seem to mention this at all; all of the examples given just assume that you are passing in two parameters.
Just checked Rubinius behavior and it is indeed consistent with CRuby. So looking at the Ruby implementation - it is indeed because #select yields two values:
yield(item.key, item.value)
while #each yields an array with two values:
yield [item.key, item.value]
When two values are yielded to a block that declares one parameter, the block receives the first value and the second is ignored:
def foo
yield :bar, :baz
end
foo { |x| p x } # => :bar
A yielded array is assigned whole if the block has one parameter, or unpacked across the parameters (as if its elements had been yielded one by one) if there are two or more.
def foo
yield [:bar, :baz]
end
foo { |x| p x } # => [:bar, :baz]
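The two cases can be put side by side, including the unpacking case that the snippet above does not show. A minimal sketch; yield_two and yield_array are hypothetical method names used only for this comparison.

```ruby
# Yields two separate values, as Hash#select does internally.
def yield_two
  yield :bar, :baz
end

# Yields one two-element array, as Hash#each does for one-parameter blocks.
def yield_array
  yield [:bar, :baz]
end

p(yield_two   { |x| x })        # one parameter: only the first value
p(yield_array { |x| x })        # one parameter: the whole array
p(yield_array { |x, y| y })     # two parameters: the array is unpacked
p(yield_two   { |*all| all })   # a splat collects both yielded values
```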
As for why they made that decision: there probably isn't any good reason behind it; people just weren't expected to call these methods with a one-parameter block.
My guess is that internally map is just each with collect. Interesting they don't work quite the same way.
As to each...
The source code is below. It checks how many parameters the block you passed in declares. If more than one, it calls each_pair_i_fast, otherwise just each_pair_i.
static VALUE
rb_hash_each_pair(VALUE hash)
{
RETURN_SIZED_ENUMERATOR(hash, 0, 0, hash_enum_size);
if (rb_block_arity() > 1)
rb_hash_foreach(hash, each_pair_i_fast, 0);
else
rb_hash_foreach(hash, each_pair_i, 0);
return hash;
}
each_pair_i_fast yields two distinct values:
each_pair_i_fast(VALUE key, VALUE value)
{
rb_yield_values(2, key, value);
return ST_CONTINUE;
}
each_pair_i does not; it yields a single array:
each_pair_i(VALUE key, VALUE value)
{
rb_yield(rb_assoc_new(key, value));
return ST_CONTINUE;
}
rb_assoc_new returns a two-element array (at least I'm assuming that is what rb_ary_new3 does):
rb_assoc_new(VALUE car, VALUE cdr)
{
return rb_ary_new3(2, car, cdr);
}
select looks like this:
rb_hash_select(VALUE hash)
{
VALUE result;
RETURN_SIZED_ENUMERATOR(hash, 0, 0, hash_enum_size);
result = rb_hash_new();
if (!RHASH_EMPTY_P(hash)) {
rb_hash_foreach(hash, select_i, result);
}
return result;
}
and select_i looks like this:
select_i(VALUE key, VALUE value, VALUE result)
{
if (RTEST(rb_yield_values(2, key, value))) {
rb_hash_aset(result, key, value);
}
return ST_CONTINUE;
}
In select_i, the relevant call is rb_yield_values(2, key, value), which yields two distinct values just like each_pair_i_fast; rb_hash_aset merely stores the pair in the result hash.
Most importantly, notice that select and friends don't check the block arity at all.
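The logic of the C code can be imitated in plain Ruby. The class below is only an illustrative sketch: PairList is a hypothetical class of my own, not how Hash is implemented, but its each checks the block arity the way rb_hash_each_pair does, while its select always yields two values the way select_i does.

```ruby
# Hypothetical stand-in for Hash that mimics the yielding behaviour above.
class PairList
  def initialize(pairs)
    @pairs = pairs
  end

  # Like rb_hash_each_pair: decide how to yield based on the block's arity.
  def each(&block)
    if block.arity > 1
      @pairs.each { |key, value| block.call(key, value) }    # two values
    else
      @pairs.each { |key, value| block.call([key, value]) }  # one array
    end
    self
  end

  # Like select_i: always yields two values, with no arity check.
  def select
    @pairs.select { |key, value| yield key, value }
  end
end

ages = PairList.new([['Jack', 10], ['Jill', 12]])
ages.each   { |single| p single }         # single is a [key, value] array
ages.select { |single| p single; true }   # single is only the key
```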
Sources:
https://github.com/ruby/ruby/blob/d5c5d5c778a0e8d61ab07669132dc18fb1a2e874/hash.c
https://github.com/ruby/ruby/blob/9f44b77a18d4d6099174c6044261eb1611a147ea/array.c
