What do blocks do? - ruby

If someone could shed a light for me, or show an equivalent (if one exists) in PHP-style code, I'd really love it.
This piece of code:
require 'sqlite3'
SQLite3::Database.new("metadata.db").execute("SELECT * from BOOKS") do |row|
puts row
end
uses execute method to issue an SQL query. It's doing a loop, but on what? On the returned value of that query? Can you append a block of code to any expression and it will work on its return value? It seems like the loop isn't connected to anything, and |row| appears from nowhere.
For comparison, in PHP to access a database I would write
$db = new mysqli('details');
$results = $db->query("SELECT * FROM books");
while ($row = $results->fetch()) {
echo $row[0];
}
which says to create a database, store the results of a query as results, then start a loop in which each row of that result is turned into an array to be accessed with array notation. Is that not how Rubyists would do it? What's going on here?

It's doing a loop, but on what? On the returned value of that query?
Right. In your example row is whatever is generated by
SQLite3::Database.new("metadata.db").execute("SELECT * from BOOKS")
In this case row a reasonable thing to call this because any actions you do in the block will be based on that row from the db, but you can choose to call this 'block argument' anything you want.
Is this a thing that happens in Ruby, where you can immediately append a block of code to any expression and it'll work on its return value?
Not really, no. There are some methods that take block arguments - each is a good example.
If you have a collection of, say, cats at the animal hospital, you can loop through them and do operations on each cat because each takes a block argument.
#pretend these cats are objects and not just a string name
cats = [ "mrs. muffin face", "princess cupcake", "deathlord 666", ...]
cats.each do |cat|
#do stuff with each cat here. print their name, find their shoe size, etc
end
Blocks are a very fundamental ruby concept, so it's probably worth reading a bit more about them - this answer links to Why's (poignant) guide to ruby which is generally a good basic reference. I would also suggest Eloquent Ruby for more thorough examples of Ruby.

You seem to be assuming that a block is something that loops on a result, but that is not necessarily correct. A block in Ruby is opposed to arguments. Whereas arguments are evaluated before the execution of the main method, a block is not evaluated in advance. Whether to evaluate that block, under what bindings it is to be evaluated, what timing it is to be evaluated, and how many times it is to be evaluated, all is up to the main method. The block variable in | | is a parameter that the main method passes to the block. Often, elements of the value the main method would have returned without the block are passed one after another, which is the case you had in mind, but that is not always the case.

Here is how execute might be coded in Ruby
class SQLite3::Database
def execute(query_str)
results = db.query(query_str)
while (row = results.fetch)
yield row
end
end
end
(assuming db.query and results.fetch work as you would expect from PHP). As you can see, this is nearly identical to what you're used to from PHP except for that weird yield. As the method loops through the rows of the result, yield passes each row to the block.
This lets you focus on the important stuff (what to do with each row) while hiding away the incidental details (like iterating over a result variable).

Related

Is it possible to write a single-statement for loop in two lines with Ruby?

Often when programming in Ruby, I find myself writing small for loops with a single statement in the body. For example...
for number in 1..10
puts number
end
In other languages like C, Java or Kotlin (for example), I'd be able to write the same code in two lines. For example...
// Kotlin
for (number in 1..10)
println(number)
In the above example, the ending of the loop body is inferred due to the lack of curly braces.
Does Ruby have a way to imitate this "single-statement" style of "for loop"?
Here are some of [my options/your potential replies], along with my thoughts on them.
You could append ; end to the body to end it on the same line.
This is true, and pretty sufficient, but I'd like to know if there's a more idiomatic approach.
This seems unnecessarily picky. Why would you ever want to do this?
You may think I'm being too picky. You may also think what I'm trying to do is un-idiomatic (if that's even a word). I totally understand, but I'd still love to know if it's do-able!
Doing this could let us write code that's even just a tiny bit nicer to read. And for programmers, readability matters.
Sure, you're looking for each, Range#each in this particular case:
(1..10).each { |number| puts number }
For more complex iterations use do - end block syntax. For example
(1..10).each do |number|
puts number
some_method_call(number)
Rails.logger.info("The #{number} is used")
something_else
end
To find more check out Ruby documentation, in particular, see Enumerable.
There is a vanishingly tiny number of cases, where any self-respecting Ruby programmer would even write an explicit loop at all. The number of cases where that loop is a for loop is exactly zero. There is no "more idiomatic" way to write a for loop, because for loops are non-idiomatic, period.
There is an even shorter syntax.
If you are just calling one method on each object you can use & syntax.
(1..3).collect(&:odd?) # => [true, false, true]
This is the same as
(1..3).collect { |each| each.odd? } # => [true, false, true]
This is the preferred way of writing loops in Ruby.
You'll quickly get used to both & and {} block syntax and the enumeration methods defined in Enumerable module. Some useful methods are
each which evaluates the block for each element
collect which create new array with the result from each block
detect which returns the first element for which block results true
select which create new array with elements for which block results true
inject which applies "folding" operation, eg sum = (1..10).inject { |a, b| a + b }
Fun fact, style guides for production code usually ban for loops at all because of a subtle but dangerous scoping issue. See more here, https://stackoverflow.com/a/41308451/24468

When to use callbacks vs just returning a value

I was looking at some Ruby code somewhere, and I saw the following line:
def do_something a, b, c, &callback
xyz = a + b + c
callback.call(xyz)
end
and then when it was called, they did something like this:
do_something a, b, c do |xyz|
puts xyz
end
Is this better practice to use this sort of callback as opposed to just returning the value made by the function? I can understand why it would be done if there are multiple values that need to be transferred, but this one has just one return.
Analysis
There is insufficient information in your original post to determine if this is useful or not. The intent of your first example seems to be that the method will be passed a block, which is then called as a Proc inside the method rather than yielded back to the block. There might be a valid use case for this, but your given example isn't one of them.
If the block is already there, why not just yield to the block? And what happens if no block is given?
Passing Proc or lambda objects around can certainly be a useful technique in certain cases, but unless it simplifies your code or makes it more readable you are creating additional complexity. The examples in your original post don't make a valid case for why it might be needed. Even if you update your post with better examples, "Is a Proc object necessary?" is almost certainly a subjective question based on the needs of the larger program.
Unless you need the features of a Proc or lambda (e.g. you need a closure or access to a specific Binding) then you are generally better off yielding to a block or returning a value. Your mileage may certainly vary.
Yield or Return
In the general case, you can choose to yield to a block or return a value depending on whether or not a block was given. For example:
def do_something(a, b, c)
xyz = a + b + c
block_given? ? yield(xyz) : xyz
end
Unless you need to pass around a closure, this is likely to be a more useful technique. However, as previously stated, your mileage (and code base) may vary.
I would call this bad practice since this method requires a block (you'll get a NoMethodError without one). It can be useful to have a mechanism for immediately passing the return value to a block, but I wouldn't make it mandatory.
A simple improvement would be to make the block optional
def do_something a, b, c
xyz = a + b + c
return yield(xyz) if block_given?
xyz
end

Why were ruby loops designed that way?

As is stated in the title, I was curious to know why Ruby decided to go away from classical for loops and instead use the array.each do ...
I personally find it a little less readable, but that's just my personal opinion. No need to argue about that. On the other hand, I suppose they designed it that way on purpose, there should be a good reason behind.
So, what are the advantages of putting loops that way? What is the "raison d'etre" of this design decision?
This design decision is a perfect example of how Ruby combines the object oriented and functional programming paradigms. It is a very powerful feature that can produce simple readable code.
It helps to understand what is going on. When you run:
array.each do |el|
#some code
end
you are calling the each method of the array object, which, if you believe the variable name, is an instance of the Array class. You are passing in a block of code to this method (a block is equivalent to a function). The method can then evaluate this block and pass in arguments either by using block.call(args) or yield args. each simply iterates through the array and for each element it calls the block you passed in with that element as the argument.
If each was the only method to use blocks, this wouldn't be that useful but many other methods and you can even create your own. Arrays, for example have a few iterator methods including map, which does the same as each but returns a new array containing the return values of the block and select which returns a new array that only contains the elements of the old array for which the block returns a true value. These sorts of things would be tedious to do using traditional looping methods.
Here's an example of how you can create your own method with a block. Let's create an every method that acts a bit like map but only for every n items in the array.
class Array #extending the built in Array class
def every n, &block #&block causes the block that is passed in to be stored in the 'block' variable. If no block is passed in, block is set to nil
i = 0
arr = []
while i < self.length
arr << ( block.nil? ? self[i] : block.call(self[i]) )#use the plain value if no block is given
i += n
end
arr
end
end
This code would allow us to run the following:
[1,2,3,4,5,6,7,8].every(2) #= [1,3,5,7] #called without a block
[1,2,3,4,5,6,7,8,9,10].every(3) {|el| el + 1 } #= [2,5,8,11] #called with a block
Blocks allow for expressive syntax (often called internal DSLs), for example, the Sinatra web microframework.
Sinatra uses methods with blocks to succinctly define http interaction.
eg.
get '/account/:account' do |account|
#code to serve of a page for this account
end
This sort of simplicity would be hard to achieve without Ruby's blocks.
I hope this has allowed you to see how powerful this language feature is.
I think it was mostly because Matz was interested in exploring what a fully object oriented scripting language would look like when he built it; this feature is based heavily on the CLU programming language's iterators.
It has turned out to provide some interesting benefits; a class that provides an each method can 'mix in' the Enumerable module to provide a huge variety of pre-made iteration routines to clients, which reduces the amount of tedious boiler-plate array/list/hash/etc iteration code that must be written. (Ever see java 4 and earlier iterators?)
I think you are kind of biased when you ask that question. Another might ask "why were C for loops designed that way?". Think about it - why would I need to introduce counter variable if I only want to iterate through array's elements? Say, compare these two (both in pseudocode):
for (i = 0; i < len(array); i++) {
elem = array[i];
println(elem);
}
and
for (elem in array) {
println(elem);
}
Why would the first feel more natural than the second, except for historical (almost sociological) reasons?
And Ruby, highly object-oriented as is, takes this even further, making it an array method:
array.each do |elem|
puts elem
end
By making that decision, Matz just made the language lighter for superfluous syntax construct (foreach loop), delegating its use to ordinary methods and blocks (closures). I appreciate Ruby the most just for this very reason - being really rational and economical with language features, but retaining expressiveness.
I know, I know, we have for in Ruby, but most of the people consider it unneccessary.
The do ... end blocks (or { ... }) form a so-called block (almost a closure, IIRC). Think of a block as an anonymous method, that you can pass as argument to another method. Blocks are used a lot in Ruby, and thus this form of iteration is natural for it: the do ... end block is passed as an argument to the method each. Now you can write various variations to each, for example to iterate in reverse or whatnot.
There's also the syntactic sugar form:
for element in array
# Do stuff
end
Blocks are also used for example to filter an array:
array = (1..10).to_a
even = array.select do |element|
element % 2 == 0
end
# "even" now contains [2, 4, 6, 8, 10]
I think it's because it emphasizes the "everything is an object" philosophy behind Ruby: the each method is called on the object.
Then switching to another iterator is much smoother than changing the logic of, for example, a for loop.
Ruby was designed to be expressive, to read as if it was being spoken... Then I think it just evolved from there.
This comes from Smalltalk, that implements control structures as methods, thus reducing the number of keywords and simplifying the parser. Thus allowing controll strucures to serve as proff of concept for the language definition.
In ST, even if conditions are methods, in the fashion:
boolean.ifTrue ->{executeIfBody()}, :else=>-> {executeElseBody()}
In the end, If you ignore your cultural bias, what will be easier to parse for the machine will also be easier to parse by yourself.

Ruby equivalent of C#'s 'yield' keyword, or, creating sequences without preallocating memory

In C#, you could do something like this:
public IEnumerable<T> GetItems<T>()
{
for (int i=0; i<10000000; i++) {
yield return i;
}
}
This returns an enumerable sequence of 10 million integers without ever allocating a collection in memory of that length.
Is there a way of doing an equivalent thing in Ruby? The specific example I am trying to deal with is the flattening of a rectangular array into a sequence of values to be enumerated. The return value does not have to be an Array or Set, but rather some kind of sequence that can only be iterated/enumerated in order, not by index. Consequently, the entire sequence need not be allocated in memory concurrently. In .NET, this is IEnumerable and IEnumerable<T>.
Any clarification on the terminology used here in the Ruby world would be helpful, as I am more familiar with .NET terminology.
EDIT
Perhaps my original question wasn't really clear enough -- I think the fact that yield has very different meanings in C# and Ruby is the cause of confusion here.
I don't want a solution that requires my method to use a block. I want a solution that has an actual return value. A return value allows convenient processing of the sequence (filtering, projection, concatenation, zipping, etc).
Here's a simple example of how I might use get_items:
things = obj.get_items.select { |i| !i.thing.nil? }.map { |i| i.thing }
In C#, any method returning IEnumerable that uses a yield return causes the compiler to generate a finite state machine behind the scenes that caters for this behaviour. I suspect something similar could be achieved using Ruby's continuations, but I haven't seen an example and am not quite clear myself on how this would be done.
It does indeed seem possible that I might use Enumerable to achieve this. A simple solution would be to us an Array (which includes module Enumerable), but I do not want to create an intermediate collection with N items in memory when it's possible to just provide them lazily and avoid any memory spike at all.
If this still doesn't make sense, then consider the above code example. get_items returns an enumeration, upon which select is called. What is passed to select is an instance that knows how to provide the next item in the sequence whenever it is needed. Importantly, the whole collection of items hasn't been calculated yet. Only when select needs an item will it ask for it, and the latent code in get_items will kick into action and provide it. This laziness carries along the chain, such that select only draws the next item from the sequence when map asks for it. As such, a long chain of operations can be performed on one data item at a time. In fact, code structured in this way can even process an infinite sequence of values without any kinds of memory errors.
So, this kind of laziness is easily coded in C#, and I don't know how to do it in Ruby.
I hope that's clearer (I'll try to avoid writing questions at 3AM in future.)
It's supported by Enumerator since Ruby 1.9 (and back-ported to 1.8.7). See Generator: Ruby.
Cliche example:
fib = Enumerator.new do |y|
y.yield i = 0
y.yield j = 1
while true
k = i + j
y.yield k
i = j
j = k
end
end
100.times { puts fib.next() }
Your specific example is equivalent to 10000000.times, but let's assume for a moment that the times method didn't exist and you wanted to implement it yourself, it'd look like this:
class Integer
def my_times
return enum_for(:my_times) unless block_given?
i=0
while i<self
yield i
i += 1
end
end
end
10000.my_times # Returns an Enumerable which will let
# you iterate of the numbers from 0 to 10000 (exclusive)
Edit: To clarify my answer a bit:
In the above example my_times can be (and is) used without a block and it will return an Enumerable object, which will let you iterate over the numbers from 0 to n. So it is exactly equivalent to your example in C#.
This works using the enum_for method. The enum_for method takes as its argument the name of a method, which will yield some items. It then returns an instance of class Enumerator (which includes the module Enumerable), which when iterated over will execute the given method and give you the items which were yielded by the method. Note that if you only iterate over the first x items of the enumerable, the method will only execute until x items have been yielded (i.e. only as much as necessary of the method will be executed) and if you iterate over the enumerable twice, the method will be executed twice.
In 1.8.7+ it has become to define methods, which yield items, so that when called without a block, they will return an Enumerator which will let the user iterate over those items lazily. This is done by adding the line return enum_for(:name_of_this_method) unless block_given? to the beginning of the method like I did in my example.
Without having much ruby experience, what C# does in yield return is usually known as lazy evaluation or lazy execution: providing answers only as they are needed. It's not about allocating memory, it's about deferring computation until actually needed, expressed in a way similar to simple linear execution (rather than the underlying iterator-with-state-saving).
A quick google turned up a ruby library in beta. See if it's what you want.
C# ripped the 'yield' keyword right out of Ruby- see Implementing Iterators here for more.
As for your actual problem, you have presumably an array of arrays and you want to create a one-way iteration over the complete length of the list? Perhaps worth looking at array.flatten as a starting point - if the performance is alright then you probably don't need to go too much further.

Ruby's yield feature in relation to computer science

I recently discovered Ruby's blocks and yielding features, and I was wondering: where does this fit in terms of computer science theory? Is it a functional programming technique, or something more specific?
Ruby's yield is not an iterator like in C# and Python. yield itself is actually a really simple concept once you understand how blocks work in Ruby.
Yes, blocks are a functional programming feature, even though Ruby is not properly a functional language. In fact, Ruby uses the method lambda to create block objects, which is borrowed from Lisp's syntax for creating anonymous functions — which is what blocks are. From a computer science standpoint, Ruby's blocks (and Lisp's lambda functions) are closures. In Ruby, methods usually take only one block. (You can pass more, but it's awkward.)
The yield keyword in Ruby is just a way of calling a block that's been given to a method. These two examples are equivalent:
def with_log
output = yield # We're calling our block here with yield
puts "Returned value is #{output}"
end
def with_log(&stuff_to_do) # the & tells Ruby to convert into
# an object without calling lambda
output = stuff_to_do.call # We're explicitly calling the block here
puts "Returned value is #{output}"
end
In the first case, we're just assuming there's a block and say to call it. In the other, Ruby wraps the block in an object and passes it as an argument. The first is more efficient and readable, but they're effectively the same. You'd call either one like this:
with_log do
a = 5
other_num = gets.to_i
#my_var = a + other_num
end
And it would print the value that wound up getting assigned to #my_var. (OK, so that's a completely stupid function, but I think you get the idea.)
Blocks are used for a lot of things in Ruby. Almost every place you'd use a loop in a language like Java, it's replaced in Ruby with methods that take blocks. For example,
[1,2,3].each {|value| print value} # prints "123"
[1,2,3].map {|value| 2**value} # returns [2, 4, 8]
[1,2,3].reject {|value| value % 2 == 0} # returns [1, 3]
As Andrew noted, it's also commonly used for opening files and many other places. Basically anytime you have a standard function that could use some custom logic (like sorting an array or processing a file), you'll use a block. There are other uses too, but this answer is already so long I'm afraid it will cause heart attacks in readers with weaker constitutions. Hopefully this clears up the confusion on this topic.
There's more to yield and blocks than mere looping.
The series Enumerating enumerable has a series of things you can do with enumerations, such as asking if a statement is true for any member of a group, or if it's true for all the members, or searching for any or all members meeting a certain condition.
Blocks are also useful for variable scope. Rather than merely being convenient, it can help with good design. For example, the code
File.open("filename", "w") do |f|
f.puts "text"
end
ensures that the file stream is closed when you're finished with it, even if an exception occurs, and that the variable is out of scope once you're finished with it.
A casual google didn't come up with a good blog post about blocks and yields in ruby. I don't know why.
Response to comment:
I suspect it gets closed because of the block ending, not because the variable goes out of scope.
My understanding is that nothing special happens when the last variable pointing to an object goes out of scope, apart from that object being eligible for garbage collection. I don't know how to confirm this, though.
I can show that the file object gets closed before it gets garbage collected, which usually doesn't happen immediately. In the following example, you can see that a file object is closed in the second puts statement, but it hasn't been garbage collected.
g = nil
File.open("/dev/null") do |f|
puts f.inspect # #<File:/dev/null>
puts f.object_id # Some number like 70233884832420
g = f
end
puts g.inspect # #<File:/dev/null (closed)>
puts g.object_id # The exact same number as the one printed out above,
# indicating that g points to the exact same object that f pointed to
I think the yield statement originated from the CLU language. I always wonder if the character from Tron was named after CLU too....
I think 'coroutine' is the keyword you're looking for.
E.g. http://en.wikipedia.org/wiki/Yield
Yield in computing and information science:
in computer science, a point of return (and re-entry) of a coroutine

Resources