Is this expected behaviour for a Set of arrays in Ruby? - ruby

We're doing a bit of work in Ruby 1.8.7 that requires traversing and partitioning an undirected graph, and it has been failing weirdly in production. When I distil the failing code down to its barest components, I get this strangely failing test:
it 'should be able to clear a ruby set of arrays' do
a = ["2", "b", "d"]
b = ["1", "a", "c", "e", "f"]
set = Set.new([a, b])
a.concat(b)
p "before clear: #{set.inspect}"
set.clear
p "after clear: #{set.inspect}"
set.size.should == 0
end
The test fails with this output:
"before clear: #<Set: {[\"1\", \"a\", \"c\", \"e\", \"f\"], [\"2\", \"b\", \"d\", \"1\", \"a\", \"c\", \"e\", \"f\"]}>"
"after clear: #<Set: {[\"2\", \"b\", \"d\", \"1\", \"a\", \"c\", \"e\", \"f\"]}>"
expected: 0
got: 1 (using ==)
Attempts to delete from the set also behave in strange ways. I'm guessing that Ruby is getting hung up on the hash values of the set's elements changing under concat(), but surely I should still be able to clear the Set. Right?

There is a workaround for this: if you duplicate the set after you modify the keys, the new set will have the updated keys and will clear properly. So setting set = set.dup fixes the problem.

The .dup approach was indeed my first work-around, and did as advertised.
I ended up adding the following monkey-patch to Set:
class Set
def rehash
@hash.rehash
end
end
which allows me to rehash the set's keys after any operation that changes their hash values.
This appears to all be fixed in Ruby 1.9.
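The effect behind this can be sketched with a plain Hash, which is what Set stores its elements in for the Ruby versions discussed here. This is only an illustration under that assumption; the variable names are made up. A key that is mutated after insertion stays filed under its old hash value until the table is rehashed:

```ruby
# A mutated key is still filed under the hash value it had at insertion time.
key = ["2", "b", "d"]
table = { key => true }

key.concat(["1", "a", "c", "e", "f"])  # key.hash is now different

stale = table.key?(key)  # usually false: the entry sits under the old hash value
table.rehash             # re-index every entry by its current hash value
fresh = table.key?(key)  # true again after rehashing
```

Hash#rehash is the documented way to recover from this, which is why both the monkey-patch and the .dup workaround (which rebuilds the table) help.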

Related

Ruby: undefined method 'to_set' for array?

I have a large array of chars:
input = ["p", "f", "p", "t" ... "g"]
I am attempting to take a slice of the array and convert it into a set:
sub = input.slice(0, 4).to_set
But the interpreter bombs:
undefined method `to_set' for ["p", "f", "p", "t"]:Array (NoMethodError)
Why is this happening? In irb this code executes with no issues.
The Enumerable#to_set method is implemented by Ruby's Set library. It is not required by default, which is why you get the error if you try to use it.
But in irb Set is already required. You can verify that with:
require 'set' # => false
This is something that has been raised up as an issue in irb before.
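A minimal sketch of the fix in a script, assuming a shortened, made-up input array in place of the elided one from the question:

```ruby
require 'set'  # harmless if Set is already loaded

input = ["p", "f", "p", "t", "g"]  # sample data, not the original array
sub = input.slice(0, 4).to_set

puts sub.size  # 3, because the duplicate "p" collapses into one element
```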

Ruby - Assign key value hash pairs to existing variables

How do I assign key-value pairs to an existing hash? I have the following code and I want to append some key-value pairs to the result variable.
def extra_variables
result = ansible_vars_from_objects(@handle.object, {})
result = ansible_vars_from_options(result)
@handle.log(:info, "Extra vars is: #{result}")
ansible_vars_from_ws_values(result)
end
Here is the log output of the result variable:
[----] I, [2022-03-08T21:31:41.701307 #322:2acf0cb72fb8] INFO -- automation: Q-task_id([r345_miq_provision_1235]) <AEMethod launch_ansible_job> Extra vars is: {"ansible_ssh_user"=>"ubuntu"}
Use the Hash#merge! Method
There's a built-in method for merging a Hash object into another Hash in-place: Hash#merge!. The main caveat is that Hash keys are unique, so keep this in mind if both hashes share top-level keys: the value from the hash passed to merge! wins.
For example:
hash = {a: 1, b: 2}
other_hash = {c: 3}
hash.merge! other_hash
hash
#=> {:a=>1, :b=>2, :c=>3}
Watch Out for Parsing Issues When Merging Hash Literals
Also note that if you're trying to merge Hash literals, you'll need to enclose the Hash in parentheses so that the interpreter doesn't think you're trying to pass a block. For example, you'd need to use:
hash.merge!({d: 4})
to avoid Ruby thinking {d: 4} was a block passed to #merge!, but so far as I know this isn't a problem in any currently-supported Ruby when using a variable as the argument to #merge!. However, it's something to keep in mind if you get an exception like:
syntax error, unexpected ':', expecting '}' (SyntaxError)
which is pretty uninformative, but as of Ruby 3.1.1 that's the exception raised by this particular parsing issue.
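To make the duplicate-key caveat concrete, here is a small sketch. The "region" and "retries" keys are hypothetical and only there for illustration; "ansible_ssh_user" comes from the log output in the question:

```ruby
extra_vars = { "ansible_ssh_user" => "ubuntu" }

# Keys present in both hashes take the value from the argument.
extra_vars.merge!({ "ansible_ssh_user" => "admin", "region" => "eu-west-1" })

# With a block, the block picks the winner for keys present in both hashes.
# Here the existing value is kept and only new keys are added.
extra_vars.merge!({ "region" => "us-east-1", "retries" => 3 }) { |_key, old, _new| old }

p extra_vars
```

If overwriting is not what you want, the block form is probably the least surprising way to say so.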

Wrong output of Hash#keys method

Under certain conditions Hash#keys does not work correctly in Ruby before version 2.4
Demo code:
h = { a: 1, b: 2, c: 3 }
h.each do |k, v|
h.delete(:a)
p h
p h.keys
break
end
Ruby 2.3.8 output:
{:b=>2, :c=>3}
[:b]
Ruby 2.5.1 output:
{:b=>2, :c=>3}
[:b, :c]
I agree it is not good to modify a hash while iterating over it. But I do not see the relation between modifying the hash and how the keys method works.
Why is this happening?
Interesting question. This isn't an answer yet, but it's too long for a comment and it could help others answer the question.
Which Rubies are affected?
I created a GitHub repository with a very simple spec:
describe Hash do
it "should always know which keys are left" do
h = { a: 1, b: 2, c: 3 }
h.each do |k, v|
h.delete :a
expect(h.keys).to eq [:b, :c]
end
end
end
Thanks to Travis, it's easy to see which Ruby versions have this bug:
Ruby 2.1
Ruby 2.2
Ruby 2.3
When did the bug appear?
The bug wasn't in ruby-2.1.0-preview2
The bug was in ruby-2.1.0-rc1
When was the bug fixed?
https://github.com/ruby/ruby/tree/v2_4_0_preview2 was the last tag with this bug.
https://github.com/ruby/ruby/tree/v2_4_0_preview3 is the first tag without this bug.
I just spent an hour using git bisect and make install in order to find that the bug has been fixed in this commit (75775157).
Introduce table improvement by Vladimir Makarov.
[Feature #12142] See header of st.c for improvment details.
You can see all of code history here:
https://github.com/vnmakarov/ruby/tree/hash_tables_with_open_addressing
This improvement is discussed at
https://bugs.ruby-lang.org/issues/12142 with many people,
especially with Yura Sokolov.
st.c: improve st_table.
include/ruby/st.h: ditto.
internal.h, numeric.c, hash.c (rb_dbl_long_hash): extract a
function.
ext/-test-/st/foreach/foreach.c: catch up this change.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@56650
b2dd03c8-39d4-4d8f-98ff-823fe69b080e
It has been confirmed by @Vovan, who found this commit 1 minute before I did.
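None of this explains the bug itself, but on an affected Ruby one way to sidestep it is to avoid deleting from the hash while Hash#each is running. A minimal sketch, assuming the goal is simply to remove some entries while looping:

```ruby
h = { a: 1, b: 2, c: 3 }

# h.keys returns a new Array, so deleting from h inside the loop
# does not touch the collection being iterated.
h.keys.each do |key|
  h.delete(key) if h[key].odd?
end

remaining = h.keys
p remaining
```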

Idiomatic way of detecting duplicate keys in Ruby?

I've just noticed that Ruby doesn't raise an exception or even supply a warning if you supply duplicate keys to a hash:
$VERBOSE = true
key_value_pairs_with_duplicates = [[1,"a"], [1, "b"]]
# No warning produced
Hash[key_value_pairs_with_duplicates] # => {1=>"b"}
# Also no warning
hash_created_by_literal_with_duplicate_keys = {1 => "a", 1=> "b"} # => {1=>"b"}
For key_value_pairs_with_duplicates, I could detect duplicate keys by doing
keys = key_value_pairs_with_duplicates.map(&:first)
raise "Duplicate keys" unless keys.uniq == keys
Or by doing
procedurally_produced_hash = {}
key_value_pairs_with_duplicates.each do |key, value|
raise "Duplicate key" if procedurally_produced_hash.has_key?(key)
procedurally_produced_hash[key] = value
end
Or
hash = Hash[key_value_pairs_with_duplicates]
raise "Duplicate keys" unless hash.length == key_value_pairs_with_duplicates.length
But is there an idiomatic way to do it?
Hash#merge takes an optional block to define how to handle duplicate keys.
http://www.ruby-doc.org/core-1.9.3/Hash.html#method-i-merge
Taking advantage of the fact this block is only called on duplicate keys:
>> a = {a: 1, b: 2}
=> {:a=>1, :b=>2}
>> a.merge(c: 3) { |key, old, new| fail "Duplicate key: #{key}" }
=> {:a=>1, :b=>2, :c=>3}
>> a.merge(b: 10, c: 3) { |key, old, new| fail "Duplicate key: #{key}" }
RuntimeError: Duplicate key: b
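The same block can be used to build a hash from an array of pairs, which is the case in the question. This is a sketch, not a standard method; hash_from_pairs is a made-up name:

```ruby
# Builds a hash from [key, value] pairs and raises on the first repeated key.
def hash_from_pairs(pairs)
  pairs.each_with_object({}) do |(key, value), result|
    # The merge! block only runs when the key is already present in result.
    result.merge!({ key => value }) do |dup_key, _old, _new|
      raise ArgumentError, "Duplicate key: #{dup_key.inspect}"
    end
  end
end

p hash_from_pairs([[1, "a"], [2, "b"]])
```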
I think there are two idiomatic ways to handle this:
Use one of the Hash extensions that allow multiple values per key, or
Extend Hash (or patch w/ flag method) and implement []= to throw a dupe key exception.
You could also just decorate an existing hash with the []= that throws, or alias_method--either way, it's straight-forward, and pretty Ruby-ish.
I would simply build a hash from the array, checking for a value before overwriting a key. This way it avoids creating any unnecessary temporary collections.
def make_hash(key_value_pairs_with_duplicates)
result = {}
key_value_pairs_with_duplicates.each do |pair|
key, value = pair
raise "Duplicate key" if result.has_key?(key)
result[key] = value
end
result
end
But no, I don't think there is an "idiomatic" way of doing this. It just follows the last-in rule, and if you don't like that it's up to you to fix it.
In the literal form you are probably out of luck. But in the literal form why would you need to validate this? You are not getting it from a dynamic source if it's literal, so if you choose to dupe keys, it's your own fault. Just, uh... don't do that.
In other answers I've already stated my opinion that Ruby needs a standard method to build a hash from an enumerable. So, as you need your own abstraction for the task anyway, let's just take Facets' mash with the implementation you like the most (Enumerable#inject + Hash#update looks good to me) and add the check:
module Enumerable
def mash
inject({}) do |hash, item|
key, value = block_given? ? yield(item) : item
fail("Repeated key: #{key}") if hash.has_key?(key) # <- new line
hash.update(key => value)
end
end
end
I think most people here overthink the problem. To deal with duplicate keys, I'd simply do this:
arr = [ [:a,1], [:b,2], [:c,3] ]
hsh = {}
arr.each do |k,v|
raise("Whoa! I already have :#{k} key.") if hsh.has_key?(k)
hsh[k] = v
end
Or make a method out of this, maybe even extend a Hash class with it. Or create a child of Hash class (UniqueHash?) which would have this functionality by default.
But is it worth it? (I don't think so.) How often do we need to deal with duplicate keys in hash like this?
Latest Ruby versions do supply a warning when duplicating a key. However they still go ahead and re-assign the duplicate's value to the key, which is not always desired behaviour. IMO, the best way to deal with this is to override the construction/assignment methods. E.g. to override #[]=
class MyHash < Hash
def []=(key,val)
if self.has_key?(key)
puts("key: #{key} already has a value!")
else
super(key,val)
end
end
end
So when you run:
h = MyHash.new
h[:A] = ['red']
h[:B] = ['green']
h[:A] = ['blue']
it will output
key: A already has a value!
{:A=>["red"], :B=>["green"]}
Of course you can tailor the overridden behaviour any which way you want.
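As one possible variation, the subclass could raise instead of printing, so a duplicate cannot go unnoticed. StrictHash is a hypothetical name, and as far as I can tell the sketch only guards []=; other write paths such as store or merge! are separate methods that it does not cover:

```ruby
class StrictHash < Hash
  # Refuse to overwrite an existing key instead of printing a message.
  def []=(key, value)
    raise KeyError, "key already set: #{key.inspect}" if key?(key)
    super
  end
end

h = StrictHash.new
h[:A] = ['red']
begin
  h[:A] = ['blue']
rescue KeyError => e
  message = e.message
end
```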
I would avoid using an array to model a hash at all. In other words, don't construct the array of pairs in the first place. I'm not being facetious or dismissive. I'm speaking as someone who has used arrays of pairs and (even worse) balanced arrays many times, and always regretted it.

Ruby: How to loop through an object that may or may not be an array?

I have an each method that is run on some user-submitted data.
Sometimes it will be an array, other times it won't be.
Example submission:
<numbers>
<number>12345</number>
</numbers>
Another example:
<numbers>
<number>12345</number>
<number>09876</number>
</numbers>
I have been trying to do an each do on that, but when there is only one number I get a TypeError (Symbol as array index) error.
I recently asked a question that was tangentially similar. You can easily force any Ruby object into an array using Array.
p Array([1,2,3]) #-> [1,2,3]
p Array(123) #-> [123]
Of course, arrays respond to each. So if you force everything into an array, your problem should be solved.
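Applied to the question, that could look like the sketch below. The sample values come from the XML examples; the variable names are made up, and it assumes the parser hands back either a single string, an array of strings, or nil:

```ruby
single  = "12345"
several = ["12345", "09876"]

collected = []
[single, several, nil].each do |numbers|
  # Array() wraps a lone value, leaves an array alone, and turns nil into [].
  Array(numbers).each { |number| collected << number }
end

p collected
```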
A simple workaround is to just check if your object responds to :each; and if not, wrap it in an array.
irb(main):002:0> def foo x
irb(main):003:1> if x.respond_to? :each then x else [x] end
irb(main):005:1> end
=> nil
irb(main):007:0> (foo [1,2,3]).each { |x| puts x }
1
2
3
=> [1, 2, 3]
irb(main):008:0> (foo 5).each { |x| puts x }
5
=> [5]
It looks like the problem you want to solve is not the problem you are having.
TypeError (Symbol as array index)
That error tells me that you have an array, but are treating it like a hash and passing in a symbol key when it expects an integer index.
Also, most XML parsers provide child nodes as an array, even if there is only one. So this shouldn't be necessary.
In the case of arguments to a method, you can test the object type. This allows you to pass in a single object or an array, and converts to an array only if it's not one, so you can treat it identically from that point on.
def foo(obj)
obj = [obj] unless obj.is_a?(Array)
do_something_with(obj)
end
Or something a bit cleaner but more cryptic
def foo(obj)
obj = [*obj]
do_something_with(obj)
end
This takes advantage of the splat operator to splat out an array if it is one. So it splats it out (or doesn't change it) and you can then wrap it in an array and you're good to go.
I was in the same position recently except the object I was working with was either a hash or an array of hashes. If you are using Rails, you can use Array.wrap because Array(hash) converts hashes to an array.
Array({foo: "bar"}) #=> [[:foo, "bar"]]
Array.wrap({foo: "bar"}) #=> [{:foo=>"bar"}]
Array.wrap(123) #=> [123]
Array.wrap([123]) #=> [123]
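Outside Rails, similar behaviour can be sketched in plain Ruby. wrap_in_array is a hypothetical helper, not ActiveSupport's API; it is only written to match the Array.wrap outputs shown above:

```ruby
# Wraps anything that is not already array-like; nil becomes an empty array.
def wrap_in_array(object)
  if object.nil?
    []
  elsif object.respond_to?(:to_ary)
    object.to_ary || [object]
  else
    [object]
  end
end

p wrap_in_array({ foo: "bar" })
p wrap_in_array(123)
p wrap_in_array([123])
```

Checking to_ary rather than to_a is what keeps a hash intact here, since Hash responds to to_a but not to to_ary.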
I sometimes use this cheap little trick:
[might_be_an_array].flatten.each { |x| .... }
Use the splat operator:
[*1] # => [1]
[*[1,2]] # => [1,2]
Like Mark said, you're looking for "respond_to?" Another option would be to use the conditional operator like this:
foo.respond_to?(:each) ? foo.each { |x| dostuff(x) } : dostuff(foo)
What are you trying to do with each number?
You should try to avoid using the respond_to? message, as it is not a very object-oriented approach.
Check whether you can find where the XML generator code assigns an integer value when there is just one <number> tag, and modify it to return an array.
Maybe it is a complex task, but I would try to do this in order to get a better OO design.
I don't know much of anything about Ruby, but I'd assume you could cast (explicitly) the input to an array, especially given that if the input is simply one element longer it's automatically converted to an array.
Have you tried casting it?
If your input is x, use x.to_a to convert your input into an array.
[1,2,3].to_a
=> [1, 2, 3]
1.to_a
=> [1]
"sample string".to_a
=> ["sample string"]
Edit: Newer versions of Ruby seem to not define a default .to_a for some standard objects anymore. You can always use the "explicit cast" syntax Array(x) to achieve the same effect.
