Is method cascading possible here? - ruby

I have three lines of code, shown below:
local = headers.zip(*data_rows).transpose
local = local[1..-1].map {|dataRow| local[0].zip(dataRow).to_h}
p local
In the three lines above, I have to store the result of the first line in a variable called local, because it is used in two places in the second line. Is there any way to chain the second line onto the first? I tried using tap like this:
local = headers.zip(*data_rows).transpose.tap{|h|h[1..-1].map {|dataRow| h[0].zip(dataRow).to_h}}
tap returns self, as explained in the documentation, so I can't get the final result when I use tap. Is there any other way to achieve this in a single line, so that I don't have to use a local variable?

If you're on Ruby 2.5.0 or later, you can use yield_self for this.
local = headers.zip(*data_rows).transpose.yield_self { |h| h[1..-1].map { |dataRow| h[0].zip(dataRow).to_h } }
yield_self is similar to tap in that they both yield self to the block. The difference is in what is returned by each of the two methods.
Object#tap yields self to the block and then returns self. Kernel#yield_self yields self to the block and then returns the result of the block.
Here's an answer to a previous question where I gave a couple of further examples of where each of these methods can be useful.
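As a rough illustration of that difference, the sketch below runs both methods on the same receiver. The sample headers and data_rows values are made up for the example and are not taken from the question.

```ruby
# Sample data (assumed for illustration only).
headers   = [1, 2, 3]
data_rows = [[11, 12, 13], [21, 22, 23]]

table = headers.zip(*data_rows).transpose
# table is [[1, 2, 3], [11, 12, 13], [21, 22, 23]]

# tap runs the block but discards its value and returns the receiver.
tapped = table.tap { |h| h[1..-1].map { |row| h[0].zip(row).to_h } }

# yield_self runs the block and returns the block's value.
yielded = table.yield_self { |h| h[1..-1].map { |row| h[0].zip(row).to_h } }

p tapped.equal?(table) # true: tap handed back the very same array
p yielded              # the array of hashes built inside the block
```

On Ruby 2.6 and later, then is available as an alias for yield_self.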

It's often helpful to execute working code with data to better understand what is being computed. Seeing transpose and zip used together was a clue that a simplification might be possible, since the two are often interchangeable: with a = [1,2,3] and b = [4,5,6], both a.zip(b) and [a,b].transpose return [[1, 4], [2, 5], [3, 6]].
Here's my data:
headers=[1,2,3]
data_rows=[[11,12,13],[21,22,23],[31,32,33],[41,42,43]]
and here's what the working code returns:
local = headers.zip(*data_rows).transpose
local[1..-1].map {|dataRow| local[0].zip(dataRow).to_h}
#=> [{1=>11, 2=>12, 3=>13}, {1=>21, 2=>22, 3=>23},
# {1=>31, 2=>32, 3=>33}, {1=>41, 2=>42, 3=>43}]
It would seem that this might be computed more simply:
data_rows.map { |row| headers.zip(row).to_h }
#=> [{1=>11, 2=>12, 3=>13}, {1=>21, 2=>22, 3=>23},
# {1=>31, 2=>32, 3=>33}, {1=>41, 2=>42, 3=>43}]

Related

Correct semantic usage of map

I am trying to understand the semantically correct way to use map. Since map can behave the same way as each, you could modify the array any way you like. But I've been told by my colleague that after map is applied, the array should have the same order and the same size.
For example, that would mean using the map to return an updated array won't be the right way to use map:
array = [1,2,3,4]
array.map{|num| num unless num == 2 || num == 4}.compact
I've been using map and other Enumerator methods for ages and never thought about this too much. Would appreciate advice from experienced Ruby Developers.
In Computer Science, map according to Wikipedia:
In many programming languages, map is the name of a higher-order
function that applies a given function to each element of a list,
returning a list of results in the same order
This statement implies the returned value of map should be of the same length (because we're applying the function to each element). And the returned-elements are to be in the same order. So when you use map, this is what the reader expects.
How not to use map
arr = [1, 2, 3]
arr.map {|i| arr.pop } #=> [3, 2]
This clearly betrays the intention of map since we have a different number of elements returned and they are not even in the original order of application. So don't use map like this. See "How to use ruby's value_at to get subhashes in a hash" and subsequent comments for further clarification and thanks to #meager for originally pointing this out to me.
Meditate on this:
array = [1,2,3,4]
array.map{|num| num unless num == 2 || num == 4} # => [1, nil, 3, nil]
.compact # => [1, 3]
The intermediate value is an array of the same size, however it contains undesirable values, forcing the use of compact. The fallout of this is CPU time is wasted generating the nil values, then deleting them. In addition, memory is being wasted generating another array that is the same size when it shouldn't be. Imagine the CPU and memory cost in a loop that is processing thousands of elements in an array.
Instead, using the right tool cleans up the code and avoids wasting CPU or memory:
array.reject { |num| num == 2 || num == 4 } # => [1, 3]
I've been using map and other Enumerator methods for ages and never thought about this too much.
I'd recommend thinking about it. It's the little things like this that can make or break code or a system, and everything we do when programming needs to be done deliberately, avoiding all negative side-effects we can foresee.
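To make the "right tool" point concrete, here is a small sketch comparing a few Enumerable methods on the array from the question. The filter_map call assumes Ruby 2.7 or later, and the variable names are only illustrative.

```ruby
array = [1, 2, 3, 4]

# map: one output element per input element, in the same order.
scaled = array.map { |num| num * 10 }

# reject / select: filtering without building an intermediate array of nils.
kept  = array.reject { |num| num == 2 || num == 4 }
evens = array.select(&:even?)

# filter_map (assumes Ruby 2.7+): filter and transform in a single pass,
# keeping only the truthy block results.
doubled_odds = array.filter_map { |num| num * 2 if num.odd? }

p scaled       # [10, 20, 30, 40]
p kept         # [1, 3]
p evens        # [2, 4]
p doubled_odds # [2, 6]
```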

Ruby: evaluate string with dynamic binding of variables

I have a database of "formulas" stored as strings. Let's assume for simplicity that each formula contains 2 variables denoted by a and b, that the formulas are all well-formed, and that each one is guaranteed to consist only of characters from the set ()ab+-*.
At runtime, formulas are fetched from this database, and from another source, numeric values for a and b are fetched, and the formulas are evaluated. The evaluation can be programmed like this:
# This is how it works right now
formula = fetch_formula(....)
a = fetch_left_arg(....)
b = fetch_right_arg(....)
result = eval(formula)
This design works, but I'm not entirely happy with it. It requires that my program names the free variables exactly the same as they are named in the formula, which is ugly.
If my "formula" would not be a string, but a Proc object or Lambda which accepts two parameters, I could do something like
# No explicitly named variables
result = fetch_proc(...).call(fetch_left_arg(....),fetch_right_arg(....))
but unfortunately, the formulas have to be strings.
I tried to experiment in the following way: What if the method, which fetches the formula from the database, would wrap the string into something, which behaves like a block, and where I could pass parameters to it?
# This does not work of course, but maybe you get the idea:
block_string = "|a,b| #{fetch_formula(....)}"
Of course I can't eval such a block_string, but is there something similar which I could use? I know that instance_eval can pass parameters, but what object should I apply it to? So this is perhaps not an option either....
This is a very nasty approach, but for simple formulas like the ones you've mentioned it should work:
▶ formula = 'a + b'
▶ vars = formula.scan(/[a-z]+/).uniq.join(',') # getting vars names
#⇒ "a,b"
▶ pr = eval("proc { |#{vars}| #{formula} }") # preparing proc
▶ pr.call 3, 5
#⇒ 8
Here we rely on the fact that parameters are passed to the proc in the same order in which they first appear in the formula.
If I get your question correctly, it is something that I have done recently, and is fairly easy. Given a string:
s = "{|x, y| x + y}"
You can create a proc by doing:
eval("Proc.new#{s}")
One way to avoid creating the variables in the local scope could be to use a Binding:
bind = binding
formula = fetch_formula(....)
bind.local_variable_set :a, fetch_left_arg(....)
bind.local_variable_set :b, fetch_right_arg(....)
result = bind.eval(formula)
The variables a and b now only exist in the binding, and do not pollute the rest of your code.
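A minimal sketch of how the Binding idea could be wrapped in a method, so that callers never name a or b themselves. The method name evaluate_formula, its parameters left and right, and the whitelist check are assumptions added for illustration. The check only mirrors the character set stated in the question (plus whitespace) and is not a general defence for eval.

```ruby
# Hypothetical helper: evaluates a formula string that uses the free
# variables a and b, binding them to the given values.
def evaluate_formula(formula, left, right)
  # Reject anything outside the character set the question describes.
  unless formula.match?(/\A[()ab+\-*\s]+\z/)
    raise ArgumentError, "formula contains characters outside ()ab+-*"
  end

  scope = binding                       # binding of this method call only
  scope.local_variable_set(:a, left)    # a exists only inside scope
  scope.local_variable_set(:b, right)   # b exists only inside scope
  scope.eval(formula)
end

p evaluate_formula("a + b", 3, 5)       # 8
p evaluate_formula("(a - b) * a", 5, 3) # 10
```

Because the binding belongs to a single method call, the variables disappear when the method returns.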
You can create a lambda from string, as shown below:
formula = "a + b"
lambda_template = "->(a,b) { %s }"
formula_lambda = eval(lambda_template % formula)
p formula_lambda.call(1,2)
#=> 3

Can I count on partition preserving order?

Say I have a sorted Array, such as this:
myArray = [1, 2, 3, 4, 5, 6]
Suppose I call Enumerable#partition on it:
p myArray.partition(&:odd?)
Must the output always be the following?
[[1, 3, 5], [2, 4, 6]]
The documentation doesn't state this; this is what it says:
partition { |obj| block } → [ true_array, false_array ]
partition → an_enumerator
Returns two arrays, the first containing the elements of enum for which the block evaluates to true, the second containing the rest.
If no block is given, an enumerator is returned instead.
But it seems logical to assume partition works this way.
Through testing Matz's interpreter, it appears to be the case that the output works like this, and it makes full sense for it to be like this. However, can I count on partition working this way regardless of the Ruby version or interpreter?
Note: I made implementation-agnostic because I couldn't find any other tag that describes my concern. Feel free to change the tag to something better if you know about it.
No, you can't rely on the order. The reason is parallelism.
A traditional serial implementation of partition would loop through each element of the array, evaluating the block one element at a time, in order. As each call to odd? returns, the element is immediately pushed into the appropriate true or false array.
Now imagine an implementation which takes advantage of multiple CPU cores. It still iterates through the array in order, but the calls to odd? can return out of order. The call for myArray[2] might return before the call for myArray[0], resulting in [[3, 1, 5], [2, 4, 6]].
List processing idioms such as partition which run a list through a function (most of Enumerable) benefit greatly from parallel processing, and most computers these days have multiple cores. I wouldn't be surprised if a future Ruby implementation took advantage of this. The writers of the API documentation for Enumerable likely carefully omitted any mention of process ordering to leave this optimization possibility open.
The documentation makes no explicit mention of this, but judging from the official code, it does retain ordering:
static VALUE
partition_i(RB_BLOCK_CALL_FUNC_ARGLIST(i, arys))
{
    struct MEMO *memo = MEMO_CAST(arys);
    VALUE ary;
    ENUM_WANT_SVALUE();

    if (RTEST(enum_yield(argc, i))) {
        ary = memo->v1;
    }
    else {
        ary = memo->v2;
    }
    rb_ary_push(ary, i);
    return Qnil;
}
This code gets called from the public interface.
Essentially, the ordering in which your enumerable emits objects gets retained with the above logic.
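The logic of that C function can be approximated in plain Ruby. The sketch below is an illustration, not the actual implementation: partition_sketch is a made-up name, and the comparison only shows what the interpreter running it does, not what every Ruby implementation guarantees.

```ruby
# Hypothetical Ruby rendering of the C logic: visit elements in the order
# the enumerable yields them and append each to one of two arrays.
def partition_sketch(enum)
  truthy = []
  falsy  = []
  enum.each { |item| (yield(item) ? truthy : falsy) << item }
  [truthy, falsy]
end

numbers = [5, 2, 9, 4, 1, 6]

odds, evens = numbers.partition(&:odd?)
p odds  # elements keep their relative order from numbers
p evens

# The sketch and the built-in agree on this interpreter.
p partition_sketch(numbers, &:odd?) == numbers.partition(&:odd?)
```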

Rails: control ordering of query parameters in url_for?

I'd like to generate a URL where the "p=1" query param appears at the end of the URL, like:
/path?foo=X&bar=Y&p=1
Is it possible to control the ordering of query parameters when generating URLs via:
url_for(params.merge({ p: page_num }))
?
Update:
I tried ChuckE's suggestion below. It turns out that in Ruby 1.9 Hashes are already ordered, so the code in ActiveSupport::OrderedHash is effectively no-op'd. You can verify with Ruby 1.9 that order is preserved:
>> h = {one: 1, two: 2, three: 3 }
{:one=>1, :two=>2, :three=>3}
>> f = h.except(:one)
{:two=>2, :three=>3}
>> f[:one] = 1
1
>> f
{:two=>2, :three=>3, :one=>1}
However, url_for still puts the "p" param first. It seems that any potential solution will need to address how url_for iterates the hash.
After further digging, I see that what's happening is that url_for is actually sorting the parameters by key lexicographically, independent of their insertion order in the hash. Apparently this is being done to aid caching, since URL params are often used for page cache keys.
In short, you can't do it without patching Hash. Specifically, you need to override activesupport/core_ext/object/to_param.rb so that Hash#to_param does not call .sort on the return value.
Related question: How to generate custom sorted query string URL in Rails link_to?.
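If patching Hash is not appealing, one possible workaround is to build the query string by hand. This is only a sketch using Ruby's standard library rather than any Rails helper: the "/path" prefix, the sample params hash, and page_num are assumptions for illustration. URI.encode_www_form emits pairs in the hash's insertion order.

```ruby
require "uri"

# Hypothetical incoming parameters and page number.
params   = { "p" => 3, "foo" => "X", "bar" => "Y" }
page_num = 1

# Drop any existing "p" and re-add it, so it becomes the last key inserted.
ordered = params.reject { |key, _| key == "p" }.merge("p" => page_num)

url = "/path?#{URI.encode_www_form(ordered)}"
puts url # /path?foo=X&bar=Y&p=1
```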
First question: why would you need something like that? The order in which the parameters appear in the URL doesn't influence the way the server reads them, since they are basic key/value associations. So no matter where a parameter appears, it will always be recognized by the server.
Nonetheless, to answer your question, yes, it is possible. You just have to use ordered hashes. They are available through active support.
opts = ActiveSupport::OrderedHash.new
opts[:foo] = 'X'
opts[:bar] = 'Y'
opts[:p] = 1
your_helper_url(opts)
Should do the trick for you.

Changing one array in an array of arrays changes them all; why?

a = Array.new(3,[])
a[1][0] = 5
a => [[5], [5], [5]]
This doesn't make sense to me!
Shouldn't it be a => [[], [5], []]?
Or is this some sort of Ruby feature?
Use this instead:
a = Array.new(3){ [] }
With your code the same object is used for the value of each entry; once you mutate one of the references you see all others change. With the above you instead invoke the block each time a new value is needed, which returns a new array each time.
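The difference can be checked by comparing object identities. The following sketch uses illustrative variable names (shared, distinct) that are not from the question.

```ruby
# Default-value form: every slot refers to the one array passed in.
shared = Array.new(3, [])
shared[1][0] = 5
p shared                             # [[5], [5], [5]]
p shared.map(&:object_id).uniq.size  # 1, a single underlying object

# Block form: the block runs once per slot and builds a fresh array each time.
distinct = Array.new(3) { [] }
distinct[1][0] = 5
p distinct                             # [[], [5], []]
p distinct.map(&:object_id).uniq.size  # 3, three separate objects
```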
This is similar in nature to the new user question about why the following does not work as expected:
str.gsub /<([a-z]+)>/, "-->#{$1}<--"
In the above, string interpolation occurs before the gsub method is ever called, so it cannot use the then-current value of $1 in your string. Similarly, in your question you create an object and pass it to Array.new before Ruby starts creating array slots. Yes, the runtime could call dup on the item by default…but that would be potentially disastrous and slow. Hence you get the block form to determine on your own how to create the initial values.
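For the gsub case, two common ways to defer the lookup until each match is made are the block form and a backreference in a single-quoted replacement. The sample string below is made up for illustration.

```ruby
str = "<b> and <i>"

# Block form: the block runs once per match, so $1 is current each time.
with_block = str.gsub(/<([a-z]+)>/) { "-->#{$1}<--" }

# Backreference form: \1 is expanded by gsub itself, not by interpolation.
with_backref = str.gsub(/<([a-z]+)>/, '-->\1<--')

p with_block   # "-->b<-- and -->i<--"
p with_backref # "-->b<-- and -->i<--"
```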
