ruby - need help understanding this inject

I'd like to understand how the following code works:
def url
  @url ||= {
    "basename" => self.basename,
    "output_ext" => self.output_ext,
  }.inject("/:basename/") { |result, token|
    result.gsub(/:#{token.first}/, token.last)
  }.gsub(/\/\//, "/")
end
I know what it does; somehow it returns the URL corresponding to a file located in a dir on a server. So it returns strings similar to this: /path/to/my/file.html
I understand that if @url already has a value, it will be returned and the right-hand side of ||= will be discarded. I also understand that this begins by creating a hash of two elements.
I also think I understand the last gsub; it replaces backslashes by slashes (to cope with windows servers, I guess).
What amazes me is the inject part. I'm not able to understand it. I have used inject before, but this one is too much for me. I don't see how this could be done with an each, since I don't understand what it does.
I modified the original function slightly for this question; the original comes from this jekyll file.
Cheers!

foo.inject(bar) {|result, x| f(result,x) }
can always be written as:
result = bar
foo.each {|x| result = f(result, x)}
result
So for your case, the version with each would look like this:
result = "/:basename/"
{
"basename" => self.basename,
"output_ext" => self.output_ext,
}.each {|token|
result = result.gsub(/:#{token.first}/, token.last)
}
result
Meaning: for each key-value pair in the hash, every occurrence of ":key" in the string (which starts out as "/:basename/") is replaced with the corresponding value.
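To see the equivalence concretely, here is a minimal sketch that runs both forms side by side. The values "index" and ".html", the TOKENS constant, and the two method names are placeholders made up for illustration; they are not what Jekyll actually passes.

```ruby
# Hypothetical stand-ins for self.basename and self.output_ext.
TOKENS = { "basename" => "index", "output_ext" => ".html" }

# inject: the block's return value becomes `result` for the next pair.
def url_with_inject(template)
  TOKENS.inject(template) { |result, token|
    result.gsub(/:#{token.first}/, token.last)
  }.gsub(/\/\//, "/")
end

# each: the same accumulation written out by hand.
def url_with_each(template)
  result = template
  TOKENS.each { |token| result = result.gsub(/:#{token.first}/, token.last) }
  result.gsub(/\/\//, "/")
end

puts url_with_inject("/:basename/")                # prints /index/
puts url_with_each("/blog//:basename:output_ext")  # prints /blog/index.html
```

Note that the trailing gsub matches two consecutive forward slashes and collapses them into one, as the second call shows.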

Perhaps splitting the code up and tweaking it a little helps:
options = { "basename" => self.basename, "output_ext" => self.output_ext }
options.inject("/:basename") do |result, key_and_value|
  # Iterating over a hash yields two-element arrays, which I called key_and_value
  result.gsub(":#{key_and_value[0]}", key_and_value[1])
end.gsub(/\/\//, '/')
Basically, the inject code iterates over all your options and substitutes the actual value wherever it sees ":key" in the string.

Related

why return change variables while inside a class

I cannot understand this ruby behavior, the code explains better what I mean:
class DoNotUnderstand
  def initialize
    @tiny_array = [3,4]
    test
  end
  def messing(ary)
    return [ary[0]+=700, ary[1]+=999]
  end
  def test
    puts @tiny_array.join(",") # before => 3,4
    messing(@tiny_array)
    puts @tiny_array.join(",") # after => 703,1003
  end
end
question = DoNotUnderstand.new
@tiny_array was [3,4] and became [703,1003]
If I don't use a class, this happens:
@tiny = [1,2]
def messing(ary)
  return [ary[0]+693,ary[1]+999]
end
puts @tiny.join(",") # before => 1,2
messing(@tiny)
puts @tiny.join(",") # after => 1,2
The array simply remains [1,2].
Why?
The class is a red herring, and completely irrelevant to the issue.
In the first case, where the array was modified, you defined messing as:
def messing(ary)
return [ary[0]+=700, ary[1]+=999]
end
Whereas in the second case, where the array was not modified, you defined messing as:
def messing(ary)
return [ary[0]+693,ary[1]+999]
end
In one case you used +=, and in the other, you used merely +.
ary[0] += 700 is exactly equivalent to ary[0] = ary[0] + 700. In other words you are changing the value stored in the 0th index of ary.
In the second case you merely add to the values stored in the array and return the result, but in the first case you not only return the result, you also store it back in the array.
For an explanation of why modifying ary modifies @tiny_array, see this answer to the question Is Ruby pass by reference or by value?.
Your second code example (the one from outside the class) is missing the two characters in the first that make it work the way it does. In the first example, the += operator is used, modifying the array in place:
return [ary[0]+=700, ary[1]+=999]
In your second example, the + operator is used, leaving the array as is:
return [ary[0]+693,ary[1]+999]
If you change it to use the += operator, it works the same way as the first code snippet.
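A small side-by-side sketch of the difference, with no class involved. The method names messing_in_place and messing_copy are made up for this illustration.

```ruby
# Mutates the caller's array: ary[0] += 700 means ary[0] = ary[0] + 700.
def messing_in_place(ary)
  [ary[0] += 700, ary[1] += 999]
end

# Only reads from the array and builds a new one.
def messing_copy(ary)
  [ary[0] + 700, ary[1] + 999]
end

first = [3, 4]
messing_in_place(first)
p first   # prints [703, 1003]

second = [3, 4]
result = messing_copy(second)
p second  # prints [3, 4]
p result  # prints [703, 1003]
```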

Different results from Hash.each and Hash.inject implementations

I'm trying to understand why I'm getting different results from two expressions which I would have thought were functionally identical.
The each method:
matches = {}
@entries.each do |entry, definition|
  matches.merge!({entry => definition}) if entry.match(/^#{entry_to_find}/)
end
matches
The inject method:
@entries.inject({}) {|matches, (entry, definition)| matches.merge!({entry => definition}) if entry.match(/^#{entry_to_find}/)}
The each code block gives the correct answer when run, but inject keeps returning nil and I don't understand why. I'd hoped to use inject because it's a much shorter piece of code.
It's because the if will return nil in case the condition is not satisfied, and that will be used for the value of matches in the next iteration. Use Enumerable#each_with_object instead:
@entries.each_with_object({}) do |(entry, definition), matches|
  matches.merge!({entry => definition}) if entry.match(/^#{entry_to_find}/)
end
I think ndn's analysis of why your inject approach doesn't work is correct. As for shorter alternatives: since you want the unmodified key-value pairs of @entries that fulfill your condition, have you considered Hash#select?
matches = @entries.select { |entry, definition| entry.match(/^#{entry_to_find}/) }
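The nil problem can be reproduced with a few made-up entries. In this sketch the entries hash and the search prefix are invented sample data; the second inject shows one way to keep inject working, by returning the accumulator from the block on every iteration.

```ruby
# Invented sample data standing in for @entries and entry_to_find.
entries = { "apple" => "a fruit", "apricot" => "another fruit", "banana" => "yellow" }
entry_to_find = "ap"

# The modifier `if` makes the block evaluate to nil for a non-matching entry,
# and that nil becomes the accumulator (here it is also the final result).
broken = entries.inject({}) do |matches, (entry, definition)|
  matches.merge!(entry => definition) if entry.match(/^#{entry_to_find}/)
end

# Returning the accumulator explicitly keeps it alive across iterations.
fixed = entries.inject({}) do |matches, (entry, definition)|
  matches.merge!(entry => definition) if entry.match(/^#{entry_to_find}/)
  matches
end

p broken      # prints nil
p fixed.keys  # prints ["apple", "apricot"]
```

If a non-matching entry were followed by a matching one, the broken version would raise NoMethodError instead, because merge! would be called on nil.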

Why does Array.to_s return brackets?

For an array, when I type:
puts array[0]
==> text
Yet when I type
puts array[0].to_s
==> ["text"]
Why the brackets and quotes? What am I missing?
ADDENDUM: my code looks like this
page = open(url) {|f| f.read }
page_array = page.scan(/regex/) #pulls partial urls into an array
partial_url = page_array[0].to_s
full_url = base_url + partial_url #adds each partial url to a consistent base_url
puts full_url
what I'm getting looks like:
http://www.stackoverflow/["questions"]
This prints the array without brackets:
array.join(", ")
In Ruby 1.9 and later, Array#to_s is just an alias for Array#inspect.
So instead of a plain string of the contents, array.to_s gives you what array.inspect returns, which, as the method name suggests, is a debugging representation rather than what you are looking for.
If you want just the "guts" try:
page_array.join
If there are multiple elements to the array:
page_array.join(" ")
This will make:
page_array = ["test","this","function"]
return:
"test this function"
What to_s returns on an Array depends on the version of Ruby you are using, as mentioned above. 1.9.x returns:
"[\"test\"]"
You need to show us the regex to really fix this properly, but this will do it:
Replace this
partial_url = page_array[0].to_s
with this
partial_url = page_array[0][0]
This doesn't necessarily fix why you are getting a doubled-up array, but you can flatten it and then call the first element like this.
page_array = page.scan(/regex/).flatten
Flattening takes out stacked arrays and creates one level, so if you had [1,2,[3,[4,5,6]]] and called flatten on it, you would get [1,2,3,4,5,6]
It is also more robust than doing array[0][0], because, if you had more than two arrays nested in the first element, you would run into the same issue.
Iain is correct though, without seeing the regex, we can't suss out the root cause.
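One plausible cause, assuming the asker's regex contains a capture group (the question does not show it), is that String#scan returns an array of arrays whenever the pattern has groups. The sample page text and patterns below are made up to illustrate this.

```ruby
# Made-up sample input; the real page and regex are not shown in the question.
page = '<a href="/questions">Q</a> <a href="/tags">T</a>'

# No capture group: scan returns a flat array of whole matches.
whole = page.scan(/href="\/\w+"/)

# One capture group: scan returns an array of one-element arrays.
grouped = page.scan(/href="\/(\w+)"/)

p whole            # prints ["href=\"/questions\"", "href=\"/tags\""]
p grouped          # prints [["questions"], ["tags"]]
p grouped[0].to_s  # prints "[\"questions\"]"
p grouped.flatten  # prints ["questions", "tags"]
puts "http://example.com/" + grouped[0][0]  # prints http://example.com/questions
```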

parsing in ruby

I have this Hash:
cookie = {"fbs_138415639544444"=>["\"access_token=138415639544444|5c682220fa7ebccafd97ec58-503523340|9HHx3z7GzOBPdk444wtt&expires=0
&secret=64aa8b3327eafbfd22ba070b&session_key=5c682220fa7dsfdsafas3523340
&sig=4a494b851ff43d3a58dfa8757b702dfe&uid=503523340\""],
"_play_session"=>["fdasdfasdf"]}
I need to get the substring from right after access_token= to right before &expires. The problem is that the number in the key fbs_138415639544444 changes every time, just the part fbs_ remains constant.
Any idea how to only get:
"138415639544444|5c682220fa7ebccafd97ec58-503523340|9HHx3z7GzOBPdk444wtt"
This is a common task when decoding parameters and query strings in URLs. Here's a little method to break down the parameters into a hash. From there it's easy to get the value you want:
def get_params_hash(params)
Hash[ *params.split('&').map{ |q| q.split('=') }.flatten ]
end
p get_params_hash(cookie['fbs_138415639544444'].first)['"access_token']
# >> "138415639544444|5c682220fa7ebccafd97ec58-503523340|9HHx3z7GzOBPdk444wtt"
In Ruby 1.9+, hashes retain their insertion order, so if the hash always has the value you want as its first entry, you can use
cookie.keys.first #=> "fbs_138415639544444"
otherwise use:
cookie.keys.select{ |k| k[/^fbs_/] }.first #=> "fbs_138415639544444"
I never code in Ruby, but this sounds like a typical task for a split function.
You just need to split this
"\"access_token=138415639544444|5c682220fa7ebccafd97ec58-503523340|9HHx3z7GzOBPdk444wtt&expires=0
&secret=64aa8b3327eafbfd22ba070b&session_key=5c682220fa7dsfdsafas3523340
&sig=4a494b851ff43d3a58dfa8757b702dfe&uid=503523340\""
by the & symbol. The first element of the resulting array will be:
"\"access_token=138415639544444|5c682220fa7ebccafd97ec58-503523340|9HHx3z7GzOBPdk444wtt"
Then split that by =; the second element of the resulting array should be:
138415639544444|5c682220fa7ebccafd97ec58-503523340|9HHx3z7GzOBPdk444wtt
If you only need the access_token part, then a regex is probably easiest.
cookie["fbs_138415639544444"][0] =~ /access_token\=([-\w\d\|]*)&/
access_key = $1
Here the access token is in the first capture group, and you can get it with $1.
A better option, if you'll need other parts of the string (say the session_key), would probably be to use a couple of splits and parse the string into its own hash.
Edit: Just realized you need the key too.
key = cookie.each_key.find { |k| k.start_with? "fbs_" }
Then you can use key to get the value.
Since the key changes, the first step is to get right key:
key = cookie.keys.select {|k| k =~ /^fbs_/}.first
This matches them if they begin with the text "fbs_". The first match is returned.
Next you can get the other value by a few (ugly) splits:
cookie[key].first.split('=')[1].split('&').first
Using a regex might be a bit cleaner, but it depends on what the valid characters are in that string.
Regexes are brittle, so I wouldn't use them here. In the end you are parsing query-string parameters, so use the CGI library:
> require 'cgi'
=> true
> cookie = {"fbs_138415639544444"=>["\"access_token=138415639544444|5c682220fa7ebccafd97ec58-503523340|9HHx3z7GzOBPdk444wtt&expires=0&secret=64aa8b3327eafbfd22ba070b&session_key=5c682220fa7dsfdsafas3523340&sig=4a494b851ff43d3a58dfa8757b702dfe&uid=503523340\""], "_play_session"=>["fdasdfasdf"]}
> CGI.parse(cookie.select {|k,v| k =~ /^fbs_/}.first[1][0])["\"access_token"][0]
=> "138415639544444|5c682220fa7ebccafd97ec58-503523340|9HHx3z7GzOBPdk444wtt"
This is how I solved the problem...
access_token_key = cookies.keys.find{|item| item.start_with?('fbs_') }
token = cookies[access_token_key].first
access_token = token.split("&").find{|item| item.include?('access_token') }
fb_access_token = access_token.split("=").find{|item| !item.include?('access_token') }
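The two steps the answers above describe (find the key by its fbs_ prefix, then split the value into parameters) can be combined into one helper. This is only a sketch: extract_access_token is a made-up name, the sample cookie is shortened with invented values, and it assumes the value is a single string wrapped in literal double quotes as in the question.

```ruby
# Sample data shaped like the cookie in the question (values shortened and invented).
cookie = {
  "fbs_12345"     => ["\"access_token=12345|abc-678|XyZ&expires=0&uid=678\""],
  "_play_session" => ["fdasdfasdf"]
}

# Hypothetical helper, not part of any library.
def extract_access_token(cookie)
  # The numeric suffix varies, so look the key up by its constant prefix.
  key = cookie.keys.find { |k| k.start_with?("fbs_") }
  return nil unless key

  # Drop the literal double quotes that wrap the value, then split into pairs.
  raw = cookie[key].first.delete('"')
  pairs = raw.split("&").map { |pair| pair.split("=", 2) }
  params = Hash[pairs.select { |pair| pair.size == 2 }]
  params["access_token"]
end

puts extract_access_token(cookie)  # prints 12345|abc-678|XyZ
```

Stripping the quotes first means the lookup key is plain "access_token" rather than one with a leading quote character.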

Is there a way to pass a regex capture to a block in Ruby?

I have a hash with a regex for the key and a block for the value. Something like the following:
{ 'test (.+?)' => { puts $1 } }
Not exactly like that, obviously, since the block is being stored as a Proc, but that's the idea.
I then have a regex match later on that looks a lot like this
hash.each do |pattern, action|
if /#{pattern}/i.match(string)
action.call
end
end
The idea was to store the block away in the hash to make it a bit easier for me to expand upon in the future, but now the regex capture doesn't get passed to the block. Is there a way to do this cleanly that would support any number of captures I put in the regex (as in, some regex patterns may have 1 capture, others may have 3)?
What if you pass the match data into your procs?
hash.each do |pattern, action|
if pattern.match(string)
action.call($~)
end
end
Your hash would become:
{ /test (.+?)/i => lambda { |matchdata| puts matchdata[1] } }
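To support a varying number of captures, one option is to splat MatchData#captures into the lambda instead of passing the whole match object. The sketch below assumes each lambda is written with exactly as many parameters as its pattern has groups; the HANDLERS hash, its patterns, and the dispatch method are invented for illustration.

```ruby
# Hypothetical handlers: each lambda takes as many arguments as its pattern has captures.
HANDLERS = {
  /test (.+)/i        => lambda { |word| "one capture: #{word}" },
  /move (\d+) (\d+)/i => lambda { |x, y| "two captures: #{x}, #{y}" }
}

def dispatch(string)
  HANDLERS.each do |pattern, action|
    match = pattern.match(string)
    # MatchData#captures is an array of the groups; splat it into the lambda.
    return action.call(*match.captures) if match
  end
  nil # nothing matched
end

puts dispatch("test hello")  # prints one capture: hello
puts dispatch("move 3 4")    # prints two captures: 3, 4
```

Because lambdas check their argument count, a mismatch between groups and parameters raises ArgumentError rather than passing silently.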
I would use Hash#find, which walks the hash elements, passing them into a block one at a time. The first one for which the block returns a truthy value wins:
Something like this:
hash = {/foo/ => lambda { 'foo' }, /bar/ => lambda { 'bar' } }
str = 'foo'
puts hash.find{ |n,v| str =~ n }.to_a.last.call
Obviously I'm using lambdas rather than blocks, but it's close enough. If there is no match, find returns nil, and you need to handle that. For the example I chained to_a.last.call, but in real life you'd want to react to a nil, otherwise Ruby will raise a NoMethodError.
If you are searching through a lot of patterns, or processing a lot of text, your search will be slowed down by having to recompile the regex each time. I'd recommend storing the keys as regex objects to avoid that.
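A sketch of the find approach with the nil case handled explicitly. The ACTIONS constant, the run_first_match name, and the "no handler" fallback string are made up for this example; the keys are regex literals, so the hash holds Regexp objects rather than strings that need interpolating on every lookup.

```ruby
# Keys are Regexp objects, values are lambdas (both invented for illustration).
ACTIONS = { /foo/ => lambda { "foo" }, /bar/ => lambda { "bar" } }

def run_first_match(str)
  # find yields [pattern, action] pairs and returns the first matching pair, or nil.
  pair = ACTIONS.find { |pattern, _action| str =~ pattern }
  pair ? pair.last.call : "no handler"
end

puts run_first_match("foo")  # prints foo
puts run_first_match("xyz")  # prints no handler
```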

Resources