chef 11: any way to turn attributes into a ruby hash? - ruby

I'm generating a config for my service in chef attributes. However, at some point, I need to turn the attribute mash into a simple ruby hash. This used to work fine in Chef 10:
node.myapp.config.to_hash
However, starting with Chef 11, this does not work. Only the top level of the attribute is converted to a hash, with the nested values remaining immutable mash objects. Modifying them leads to errors like this:
Chef::Exceptions::ImmutableAttributeModification
------------------------------------------------
Node attributes are read-only when you do not specify which precedence level to set. To
set an attribute use code like `node.default["key"] = "value"'
I've tried a bunch of ways to get around this issue which do not work:
node.myapp.config.dup.to_hash
JSON.parse(node.myapp.config.to_json)
The json parsing hack, which seems like it should work great, results in:
JSON::ParserError
unexpected token at '"#<Chef::Node::Attribute:0x000000020eee88>"'
Is there any actual reliable way, short of including a nested parsing function in each cookbook, to convert attributes to a simple, ordinary, good old ruby hash?

After a resounding lack of answers both here and on the Opscode Chef mailing list, I ended up using the following hack:
class Chef
  class Node
    class ImmutableMash
      # Recursively convert nested values that know how to become a hash.
      def to_hash
        h = {}
        self.each do |k, v|
          if v.respond_to?('to_hash')
            h[k] = v.to_hash
          else
            h[k] = v
          end
        end
        return h
      end
    end
  end
end
I put this into the libraries dir in my cookbook; now I can use attribute.to_hash in both Chef 10 (which already worked properly and is unaffected by this monkey-patch) and Chef 11. I've also reported this as a bug to Opscode.
If you don't want to have to monkey-patch your Chef, speak up on this issue:
http://tickets.opscode.com/browse/CHEF-3857
Update: monkey-patch ticket was marked closed by these PRs
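For anyone who would rather not reopen Chef's classes, the same recursion can live in a standalone helper. This is only a sketch: `deep_to_hash` is a made-up name, and it assumes the attribute objects behave like Hash and Array subclasses (which is my understanding of ImmutableMash and ImmutableArray, but check your Chef version). It is shown here against a plain nested hash.

```ruby
# Hypothetical helper (not part of Chef): recursively copies any
# hash-like / array-like structure into plain Hash and Array objects.
def deep_to_hash(obj)
  case obj
  when Hash
    # Build a brand new plain Hash, converting every value on the way.
    obj.each_with_object({}) { |(k, v), h| h[k] = deep_to_hash(v) }
  when Array
    obj.map { |v| deep_to_hash(v) }
  else
    obj # scalars (strings, numbers, ...) are returned as-is, not copied
  end
end

config = { "db" => { "hosts" => [{ "name" => "a" }] } }
copy = deep_to_hash(config)

# Changing the copy's structure leaves the original untouched.
copy["db"]["hosts"][0]["name"] = "b"
```

Note that leaf values are shared between the original and the copy, so mutating a string in place would still affect both.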

I hope I am not too late to the party but merging the node object with an empty hash did it for me:
chef (12.6.0)> {}.merge(node).class
=> Hash

I had the same problem and after much hacking around came up with this:
json_string = node[:attr_tree].inspect.gsub(/\=\>/,':')
my_hash = JSON.parse(json_string, {:symbolize_names => true})
inspect does the deep parsing that is missing from the other methods proposed and I end up with a hash that I can modify and pass around as needed.

This has been fixed for a long time now:
[1] pry(main)> require 'chef/node'
=> true
[2] pry(main)> node = Chef::Node.new
[....]
[3] pry(main)> node.default["fizz"]["buzz"] = { "foo" => [ { "bar" => "baz" } ] }
=> {"foo"=>[{"bar"=>"baz"}]}
[4] pry(main)> buzz = node["fizz"]["buzz"].to_hash
=> {"foo"=>[{"bar"=>"baz"}]}
[5] pry(main)> buzz.class
=> Hash
[6] pry(main)> buzz["foo"].class
=> Array
[7] pry(main)> buzz["foo"][0].class
=> Hash
[8] pry(main)>
It was probably fixed sometime in or around Chef 12.x or Chef 13.x; it is certainly no longer an issue in Chef 15.x/16.x/17.x.

The above answer is a little unnecessary. You can just do this:
json = node[:whatever][:whatever].to_hash.to_json
JSON.parse(json)
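To illustrate what the JSON round trip buys you, here is a small sketch using a plain nested hash in place of node attributes (the keys are made up). Whether `to_hash` on your attribute is enough to make `to_json` succeed depends on the Chef version, so treat this as a demonstration of the technique only.

```ruby
require 'json'

original = { "config" => { "ports" => [80, 443] } }

# Serialising and parsing again builds entirely new Hash/Array objects.
copy = JSON.parse(original.to_json)

copy["config"]["ports"] << 8080
# original["config"]["ports"] is still [80, 443]
```

Keep in mind that the round trip only preserves JSON types: symbol keys come back as strings unless you pass `:symbolize_names => true`.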

Related

Ruby - Matching Twitter URL from any html page using Regex

I am trying to fetch the Twitter URL from this page, for instance; however, my result is nil. I am pretty sure my regex is not too bad, but my code fails. Here it is:
doc = `(curl --url "http://www.rabbitreel.com/")`
twitter_url = ("/^(?i)[http|https]+:\/\/(?i)[twitter]+\.(?i)(com)\/?\S+").match(doc)
puts twitter_url
# => nil
Maybe, I misused regex syntax. My initial idea was simple: I wanted to match a regular Twitter url structure. I even tried http://rubular.com to test my regex, and it seemed to be fine when I entered a Twitter url.
http://ruby-doc.org/core-2.2.0/String.html#method-i-match
tells you that the object you're calling match on should be the string you're parsing, and the parameter should be the regex pattern. So if anything, you should call:
doc.match("/^(?i)[http|https]+:\/\/(?i)[twitter]+\.(?i)(com)\/?\S+")
I prefer
doc[/your_regex/]
syntax, because it directly delivers a String, and not a MatchData, which needs another step to get the information out of.
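A quick sketch of the difference, using a made-up one-line document instead of the real page. Also note that a quoted string passed to match is compiled as written, so the surrounding / characters become part of the pattern; a regex literal avoids that.

```ruby
# Sample markup invented for this example.
doc = '<a href="https://twitter.com/rabbitreel">Twitter</a>'
pattern = %r{https?://twitter\.com/\w+}

# String#match returns a MatchData object (or nil when nothing matches),
# so one more step is needed to get the matched text.
m = doc.match(pattern)
url_from_match = m && m[0]

# String#[] with a Regexp returns the matched String (or nil) directly.
url_from_index = doc[pattern]
```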
For Regexen, I always try to begin as simple as possible
[3] pry(main)> doc[/twitter/]
=> "twitter"
[4] pry(main)> doc[/twitter\.com/]
=> "twitter.com"
[5] pry(main)> doc[/twitter\.com\//]
=> "twitter.com/"
[6] pry(main)> doc[/twitter\.com\/\//] #OOPS. One \/ too many
=> nil
[7] pry(main)> doc[/twitter\.com\//]
=> "twitter.com/"
[8] pry(main)> doc[/twitter\.com\/\S+/]
=> "twitter.com/rabbitreel\""
[9] pry(main)> doc[/twitter\.com\/[^"]+/]
=> "twitter.com/rabbitreel"
[10] pry(main)> doc[/http:\/\/twitter\.com\/[^"]+/]
=> nil
[11] pry(main)> doc[/https?:\/\/twitter\.com\/[^"]+/]
=> "https://twitter.com/rabbitreel"
[12] pry(main)> doc[/https?:\/\/twitter\.com\/[^" ]+/]
=> "https://twitter.com/rabbitreel"
[13] pry(main)> doc[/https?:\/\/twitter\.com\/\w+/] #DONE
=> "https://twitter.com/rabbitreel"
EDIT:
Sure, Regexen cannot parse an entire HTML document.
Here, we only want to find the first occurrence of a Twitter URL. So, depending on the requirements, on possible input and the chosen platform, it could make sense to use a Regexp.
Nokogiri is a huge gem, and it might not be possible to install it.
Independently from this fact, it would be a very good idea to check that the returned String really is a correct Twitter URL.
I think this Regexp:
/https?:\/\/twitter\.com\/\w+/
is safe.
[31] pry(main)> malicious_doc = "https://twitter.com/userid#maliciouswebsite.com"
=> "https://twitter.com/userid#maliciouswebsite.com"
[32] pry(main)> malicious_doc[/https?:\/\/twitter\.com\/\w+/]
=> "https://twitter.com/userid"
Using Nokogiri doesn't prevent you from checking for malicious input.
The proposed solution from @mudasobwa is interesting, but isn't safe yet:
[33] pry(main)> Nokogiri::HTML('<html><body><a href="http://maliciouswebsitethatisnottwitter.com/">Link</a></body></html>').css('a').map { |e| e.attributes.values.first.value }.select {|e| e =~ /twitter.com/ }
=> ["http://maliciouswebsitethatisnottwitter.com/"]
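One way to do the check mentioned above is to parse the candidate link and compare its host, rather than searching for "twitter.com" anywhere in the string. A minimal sketch with the standard uri library; `twitter_url?` is a made-up helper name, and accepting subdomains is my own assumption about what counts as a Twitter URL.

```ruby
require 'uri'

# Hypothetical helper: accept a link only if its host is twitter.com
# or a subdomain such as www.twitter.com.
def twitter_url?(candidate)
  uri = URI.parse(candidate)
  # URI::HTTPS is a subclass of URI::HTTP, so this covers both schemes.
  return false unless uri.is_a?(URI::HTTP)

  host = uri.host.to_s.downcase
  host == 'twitter.com' || host.end_with?('.twitter.com')
rescue URI::InvalidURIError
  false
end

twitter_url?('https://twitter.com/rabbitreel')
twitter_url?('http://maliciouswebsitethatisnottwitter.com/')
```

The first call is accepted and the second is rejected, because the host comparison is anchored on the domain boundary.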
NB as of Nov 2021, the rabbitreel.com domain is on sale, so please read the comments about the possibility of its serving malicious content.
One should never use regexps to parse HTML and here is why.
Below is a robust solution using Nokogiri HTML parsing library:
require 'nokogiri'
doc = Nokogiri::HTML(`(curl --url "http://www.rabbitreel.com/")`)
doc.css('a').map { |e| e.attributes.values.first.value }
.select {|e| e =~ /twitter.com/ }
#⇒ [
# [0] "https://twitter.com/rabbitreel",
# [1] "https://twitter.com/rabbitreel"
# ]
Or, alternatively, with xpath:
require 'nokogiri'
doc = Nokogiri::HTML(`(curl --url "http://www.rabbitreel.com/")`)
doc.xpath('//a[contains(@href, "twitter.com")]')
.map { |e| e.attributes['href'].value }

Ruby access properties with dot-notation

I'm trying to build a class that will basically be used as a data structure for storing values/nested values. I want there to be two methods, get and set, that accept a dot-notated path to recursively set or get variables.
For example:
bag = ParamBag.new
bag.get('foo.bar') # => nil
bag.set('foo.bar', 'baz')
bag.get('foo.bar') # => 'baz'
The get method could also take a default return value if the value doesn't exist:
bag.get('foo.baz', false) # => false
I could also initialize a new ParamBag with a Hash.
How would I manage this in Ruby? I've done this in other languages, but in order to set a recursive path, I would take the value by reference, but I'm not sure how I'd do it in Ruby.
This was a fun exercise but still falls under the "you probably should not do this" category.
To accomplish what you want, OpenStruct can be used with some slight modifications.
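require 'ostruct'

# Relies on OpenStruct internals (modifiable, new_ostruct_member) as they
# exist in Ruby 2.2; these are not public API.
class ParamBag < OpenStruct
  def method_missing(name, *args, &block)
    if super.nil?
      modifiable[new_ostruct_member(name)] = ParamBag.new
    end
  end
end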
This class will let you chain however many method calls together you would like and set any number of parameters.
Tested with Ruby 2.2.1
2.2.1 :023 > p = ParamBag.new
=> #<ParamBag>
2.2.1 :024 > p.foo
=> #<ParamBag>
2.2.1 :025 > p.foo.bar
=> #<ParamBag>
2.2.1 :026 > p.foo.bar = {}
=> {}
2.2.1 :027 > p.foo.bar
=> {}
2.2.1 :028 > p.foo.bar = 'abc'
=> "abc"
Basically, take your get and set methods away and call methods like you would normally.
I do not advise you actually do this. I would instead suggest you use OpenStruct by itself to achieve some flexibility without going too crazy. If you find yourself needing to chain a ton of methods and have them never fail, maybe take a step backwards and ask "is this really the right way to approach this problem?". If the answer to that question is a resounding yes, then ParamBag might just be perfect.
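If you do want the get/set interface from the question, a plain nested Hash is enough, because Ruby hashes are mutated in place and no explicit "by reference" handling is needed. This is only a sketch under my own assumptions: keys are strings, and set overwrites any non-hash value it finds along the path.

```ruby
class ParamBag
  def initialize(data = {})
    @data = data
  end

  # Walk the dotted path; return the default as soon as a segment is missing.
  def get(path, default = nil)
    path.split('.').reduce(@data) do |node, key|
      return default unless node.is_a?(Hash) && node.key?(key)
      node[key]
    end
  end

  # Create intermediate hashes as needed, then store the value at the leaf.
  def set(path, value)
    *parents, last = path.split('.')
    node = parents.reduce(@data) do |hash, key|
      hash[key] = {} unless hash[key].is_a?(Hash)
      hash[key]
    end
    node[last] = value
  end
end

bag = ParamBag.new
before = bag.get('foo.bar')          # nil, nothing stored yet
bag.set('foo.bar', 'baz')
after = bag.get('foo.bar')           # 'baz'
missing = bag.get('foo.baz', false)  # false, the supplied default
```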

Weird behavior of #upcase! in Ruby

Consider the following code:
@person = { :email => 'hello@example.com' }
temp = @person.clone
temp[:email].upcase!
p temp[:email] # => HELLO@EXAMPLE.COM
p @person[:email] # => HELLO@EXAMPLE.COM, why?!
# But
temp[:email] = 'blah@example.com'
p @person[:email] # => HELLO@EXAMPLE.COM
Ruby version is: "ruby 2.1.0p0 (2013-12-25 revision 44422) [i686-linux]".
I have no idea why this is happening. Can anyone help, please?
In the clone documentation you can read:
Produces a shallow copy of obj—the instance variables of obj are
copied, but not the objects they reference. clone copies the frozen
and tainted state of obj.
Also pay attention to this:
This method may have class-specific behavior. If so, that behavior
will be documented under the #initialize_copy method of the class.
This means that some classes can override this behaviour.
So any object references will be kept, instead of creating new ones. What you want is a deep copy, for which you can use Marshal:
temp = Marshal.load(Marshal.dump(@person))
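The difference is easy to see by comparing object identity. A small sketch using a local variable instead of the instance variable from the question:

```ruby
person = { :email => 'hello@example.com' }

shallow = person.clone                         # new Hash, same String inside
deep    = Marshal.load(Marshal.dump(person))   # new Hash and new String

same_object  = shallow[:email].equal?(person[:email])  # true
other_object = deep[:email].equal?(person[:email])     # false

# Mutating the deep copy's string in place does not touch the original.
deep[:email].upcase!
```

Marshal only works for objects that can be serialised, so hashes holding procs, IO objects and the like need a different approach.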

how non existing hash key returns empty string?

Here is the exact problem;
$ hash
=> {:createAuthenticationTokenRequest=>{:playerSessionID=>"111"}}
$ hash[:attributes!]
=> "" (here is the crazy result)
$ hash.class
=> Hash
$ hash.keys
=> [:createAuthenticationTokenRequest]
What is going on here? Am I not supposed to get nil for non-existent hash keys?
Detailed problem:
I am using savon to send a webservice request and keep getting a "can't convert Symbol into Integer" error. Debugging with pry showed me that this line evaluates to an empty string, which it shouldn't:
attributes = hash[:attributes!] || {}
Help me out here!
thanks in advance, cheers!
Update:
Answer for how the hash is created;
class Gyoku::Hash
def self.iterate_with_xml(hash)
xml = Builder::XmlMarkup.new
attributes = hash[:attributes!] || {}
Update2:
This is the request i am sending
request(
createAuthenticationTokenRequest: {
playerSessionID: "111"
}
)
As I mentioned before, this is savon gem code that gets executed. I tried to keep the question from being boring, and I don't get why it gets downvoted :/
here is the source code that gets debugged.
https://github.com/savonrb/gyoku/blob/master/lib/gyoku/hash.rb
i guess i deserved to be downvoted......
Don't do this in your code:
def default_request_parameters
  @default_request_parameters || Hash.new('')
end
You can create a hash like this
way 1:
new_hash = Hash.new{|h,k| h[k] = ""}
new_hash['unknown_key']
it returns
=> ""
new_hash.keys
=> ['unknown_key']
this adds 'unknown_key' key to the hash.
way 2:
hash2 = Hash.new("")
hash2['unknown_key']
it returns
=> ""
but no keys are added.
hash2.keys
it returns
=> []
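If you run into a hash like this and cannot change how it is built, you can at least diagnose it and read from it safely. A short sketch; the hash below just stands in for the one from the question:

```ruby
hash = Hash.new('')   # a hash with '' as its default value
hash[:createAuthenticationTokenRequest] = { :playerSessionID => '111' }

looked_up = hash[:attributes!]            # '' (the default), not nil
present   = hash.key?(:attributes!)       # false, the key was never stored
default   = hash.default                  # '' reveals where the value comes from
fetched   = hash.fetch(:attributes!, {})  # {} because fetch ignores the hash default
```

Since '' is truthy in Ruby, `hash[:attributes!] || {}` never falls through to `{}` for such a hash.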

Exact opposite to Ruby's CGI.parse method?

I'd like to do some sanitization of query params.
I parse the query with CGI.parse, then I delete some params, but I can't find an opposite method to build the query.
I don't really want to do something like
params.map{|n,v| "#{CGI.escape n}=#{CGI.escape v.to_s}"}.join("&")
There's got to be a simpler way. Is there?
There is a nice method in URI module:
require 'uri'
URI.encode_www_form("q" => "ruby", "lang" => "en") #=> "q=ruby&lang=en"
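It also accepts array values, which is the shape CGI.parse produces, so the delete-then-rebuild flow from the question could look roughly like this. The params hash is written out by hand here to mimic what CGI.parse would return for a made-up query string.

```ruby
require 'uri'

# Same shape CGI.parse returns: every value is an array of strings.
params = { "q" => ["ruby"], "lang" => ["en"], "tag" => ["a", "b"] }

params.delete("lang")   # drop the parameter we do not want

# Array values are expanded into repeated key=value pairs.
query = URI.encode_www_form(params)
pairs = URI.decode_www_form(query)
```

Here query comes out as "q=ruby&tag=a&tag=b", and decode_www_form gives back the individual pairs.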
If you're using Rails (or don't mind pulling in ActiveSupport), then you can use to_param (AKA to_query):
{ :a => '&', :b => 'Where is pancake house?', :c => ['an', 'array'] }.to_param
# a=%26&b=Where+is+pancake+house%3F&c%5B%5D=an&c%5B%5D=array
to_param handles arrays a little differently than your version though, it'll put out c[]=an&c[]=array rather than just c=an&c=array.
While there's no better answer, I'll put up the method which I'm using now.
require 'cgi'

# Expects the hash-of-arrays shape that CGI.parse returns.
def build_query(params)
  params.map do |name, values|
    values.map do |value|
      "#{CGI.escape name}=#{CGI.escape value}"
    end
  end.flatten.join("&")
end
I am not sure if the following is a simplification, but it avoids expanding the (key, value) pairs of a hash.
params.map{|qq| qq.map{|q| CGI.escape(q)}.join('=')}.join('&')
