Command-Query Separation "states that every method should either be a command that performs an action, or a query that returns data to the caller, but not both. In other words, asking a question should not change the answer."
a = [1, 2, 3]
last = a.pop
Here, in Ruby, the pop command returns the item is popped off the Array.
This an example of a command and query within one method, and it seems necessary for it to be so.
If this is the case, is it really a code smell to have a method that is essentially a query as well as a command?
Stack popping is a well-known exception to CQS. Martin Fowler notes it as a good place to break the rule.
I would say in this case it's not a code smell, but in general it is a code smell.
Related
I want to get the last element of a lazy but finite Seq in Raku, e.g.:
my $s = lazy gather for ^10 { take $_ };
The following don't work:
say $s[* - 1];
say $s.tail;
These ones work but don't seem too idiomatic:
say (for $s<> { $_ }).tail;
say (for $s<> { $_ })[* - 1];
What is the most idiomatic way of doing this while keeping the original Seq lazy?
What you're asking about ("get[ing] the last element of a lazy but finite Seq … while keeping the original Seq lazy") isn't possible. I don't mean that it's not possible with Raku – I mean that, in principle, it's not possible for any language that defines "laziness" the way Raku does with, for example, the is-lazy method.
If particular, when a Seq is lazy in Raku, that "means that [the Seq's] values are computed on demand and stored for later use." Additionally, one of the defining features of a lazy iterable is that it cannot know its own length while remaining lazy – that's why calling .elems on a lazy iterable throws an error:
my $s = lazy gather for ^10 { take $_ };
say $s.is-lazy; # OUTPUT: «True»
$s.elems; # THROWS: «Cannot .elems a lazy list onto a Seq»
Now, at this point, you might reasonably be thinking "well, maybe Raku doesn't know how long $s is, but I can tell that it has exactly 10 elements in it." And you're not wrong – with that code, $s is indeed guaranteed to have 10 elements. This means that, if you want to get the tenth (last) element of $s, you can do so with $s[9]. And accessing $s's tenth element like that won't change the fact that $s.is-lazy.
But, importantly, you can only do so because you know something "extra" about $s, and that extra info undoes a good chunk of the reason you might want a list to be lazy in practice.
To see what I mean, consider a very similar Seq
my $s2 = lazy gather for ^10 { last if rand > .95; take $_ };
say $s2.is-lazy; # OUTPUT: «True»
Now, $s2probably has 10 elements, but it might not – the only way to know is to iterate through it and find out. In turn, this means $s2[9] does not jump to the tenth element the way $s[9] did; it iterates through $s2 just like you'd need to. And, as a result, if you run $s2[9], then $s2 will no longer be lazy (i.e., $s2.is-lazy will return False).
And this is, in effect, what you did in the code in your question:
my $s = lazy gather for ^10 { take $_ };
say $s.is-lazy; # OUTPUT: «True»
say (for $s<> { $_ }).tail; # OUTPUT: «9»
say $s.is-lazy; # OUTPUT: «False»
Because Raku cannot ever know that it has reached the tail of a lazy Seq, the only way it could tell you the .tail is to fully iterate $s. And that necessarily means that $s is no longer lazy.
Two complications
It's worth mentioning two adjacent topics that aren't actually related but that are close enough that they trip some people up.
First, nothing I've said about lazy iterables not knowing their length precludes some non-lazy iterables from knowing their length. Indeed, a decent number of Raku types do both the Iterator role and the PredictiveIterator role – and the main point of a PredictiveIterator is that it does know how many elements it can produce without needing to produce/iterate them. But PredictiveIterators cannot be lazy.
The second potentially confusing topic is closely related to the first: while no PredictiveIterator can be lazy (that is, none will ever have an .is-lazy method that returns True), some PredictiveIterators have behavior that is very similar to laziness – and, in fact, may even be colloquially referred to as "lazy".
I can't do a great job explaining this distinction because, quite honestly, I don't fully understand it myself. But I can give you an example: the .lines method on an IO::Handle. It's certainly the case that reading the lines of a huge file behaves a lot like it's dealing with a lazy iterable. most obviously, you can process each line without ever having the whole file in memory. And the docs even say that "lines are read lazily" with the .lines method.
On the other hand:
my $l = 'some-file-with-100_000-lines.txt'.IO.lines;
say $l.is-lazy; # OUTPUT: «False»
say $l.iterator ~~ PredictiveIterator; # OUTPUT: «True»
say $l.elems; # OUTPUT: «100000»
So I'm not quite sure whether it's fair to say that $l "is a lazy iterable", but if it is, it's "lazy" in a different way than $s was.
I realize that was a lot, but I hope it is helpful. If you have a more specific use case in mind for laziness (I bet it wasn't gathering the numbers from zero to nine!), I'd be happy to address that more specifically. And if anyone else can fill in some of the details with .lines and other lazy-not-lazy PredictiveIterators, I'd really appreciate it!
Drop the lazy
Lazy sequences in Raku are designed to work well as is. You don't need to emphasize they're lazy by adding an explicit lazy.
If you add an explicit lazy, Raku interprets that as a request to block operations such as .tail because they will almost certainly immediately render laziness moot, and, if called on an infinite sequence, or even just a sufficiently large one, hang or OOM the program.
So, either drop the lazy, or don't invoke operations like .tail that will be blocked if you do.
Expanded version of my original answer
As noted by #ugexe, the idiomatic solution is to drop the lazy.
Quoting my answer to the SO About Laziness:
if a gather is asked if it's lazy, it returns False.
Aiui, something like the following applies:
Some lazy sequence producers may be actually or effectively infinite. If so, calling .tail etc on them will hang the calling program. Conversely, other lazy sequences perform fine when all their values are consumed in one go. How should Raku distinguish between these two scenarios?
A decision was made in 2015 to let value producing datatypes emphasize or deemphasize their laziness via their response to an .is-lazy call.
Returning True signals that a sequence is not only lazy but wants to be known to be lazy by consuming code that calls .is-lazy. (Not so much end-user code but instead built in consuming features such as # sigilled variables handling an assignment trying to determine whether or not to assign eagerly.) Built in consuming features take a True as a signal they ought block calls like .tail. If a dev knows this is overly conservative, they can add an eager (or remove an unneeded lazy).
Conversely, a datatype, or even a particular object instance, may return False to signal that it does not want to be considered lazy. This may be because the actual behaviour of a particular datatype or instance is eager, but it might instead be that it is lazy technically, but doesn't want a consumer to block operations such as .tail because it knows they will not be harmful, or at least prefers to have that be the default presumption. If a dev knows better (because, say, it hangs the program), or at least does not want to block potentially problematic operations, they can add a lazy (or remove an unneeded eager).
I think this approach works well, but it doc and error messages mentioning "lazy" may not have caught up with the shift made in 2015. So:
If you've been confused by some doc about laziness, please search for doc issues with "lazy" in them, or "laziness", and add comments to existing issues, or file a new doc issue (perhaps linking to this SO answer).
If you've been confused by a Rakudo error message mentioning laziness, please search for Rakudo issues with "lazy" in them, and tagged [LTA] (which means "Less Than Awesome"), and add comments, or file a new Rakudo issue (with an [LTA] tag, and perhaps a link to this SO answer).
Further discussion
the docs ... say “If you want to force lazy evaluation use the lazy subroutine or method. Binding to a scalar or sigilless container will also force laziness.”
Yes. Aiui this is correct.
[which] sounds like it implies “my $x := lazy gather { ... } is the same as my $x := gather { ... }”.
No.
An explicit lazy statement prefix or method adds emphasis to laziness, and Raku interprets that to mean it ought block operations like .tail in case they hang the program.
In contrast, binding to a variable alters neither emphasis nor deemphasis of laziness, merely relaying onward whatever the bound producer datatype/instance has chosen to convey via .is-lazy.
not only in connection with gather but elsewhere as well
Yes. It's about the result of .is-lazy:
my $x = (1, { .say; $_ + 1 } ... 1000);
my $y = lazy (1, { .say; $_ + 1 } ... 1000);
both act lazily ... but $x.tail is possible while $y.tail is not.
Yes.
An explicit lazy statement prefix or method forces the answer to .is-lazy to be True. This signals to a consumer that cares about the dangers of laziness that it should become cautious (eg rejecting .tail etc.).
(Conversely, an eager statement prefix or method can be used to force the answer to .is-lazy to be False, making timid consumers accept .tail etc calls.)
I take from this that there are two kinds of laziness in Raku, and one has to be careful to see which one is being used where.
It's two kinds of what I'll call consumption guidance:
Don't-tail-me If an object returns True from an .is-lazy call then it is treated as if it might be infinite. Thus operations like .tail are blocked.
You-can-tail-me If an object returns False from an .is-lazy call then operations like .tail are accepted.
It's not so much that there's a need to be careful about which of these two kinds is in play, but if one wants to call operations like tail, then one may need to enable that by inserting an eager or removing a lazy, and one must take responsibility for the consequences:
If the program hangs due to use of .tail, well, DIHWIDT.
If you suddenly consume all of a lazy sequence and haven't cached it, well, maybe you should cache it.
Etc.
What I would say is that the error messages and/or doc may well need to be improved.
I came across three ways of writing a loop.
the_count = [1, 2, 3, 4, 5]
for loop 1
for number in the_count
puts "This is count #{number}"
end
for loop 2
the_count.each do |count|
puts "The counter is at: #{count}"
end
for loop 3
the_count.each {|i| puts "I got #{i}"}
Are there situations in which one way is a good practice or better solution than the other two? The first one is the most similar to the ones in other languages, and for me, the third one looks unorderly.
The first option is generally discouraged. It is possible in ruby to be more friendly towards developers coming from other languages (as they recognize the syntax) but it behaves a bit strange regarding variable visibility. Generally, you should avoid this variant everywhere and use only one of the block variants.
The advantage of the two other variants is that it works the same for all methods accepting a block, e.g. map, reduce, take_while and others.
The two bottom variants are mostly equivalent You use the each method and provide it with a block. The each method calls the block once for each element in the array.
Which one you use is mostly up to preference. Most people tend to use the one with braces for simple blocks which don't require a line-break. If you want to use a line-break in your block, e.g. if you have multiple statements there, you should use the do...end variant. This makes your code more readable.
There are other slightly more nuanced opinions on when you should use one or the other (e.g. some always use the braces form when writing functional block, i.e. ones which don't affect the outside of the block even when they are longer), but if you follow this above advice, you will please at least 98% of all ruby developers reading your code.
Thus, in conclusion, avoid the for i in ... variants (the same counts for while, until, ...) and always use the block-form. Use the do...end of block for complex blocks and the braces-form for simple one-line blocks.
When you use the the block form, you should however be aware of the slight differences in priority when chaining methods.
This
foo bar { |i| puts i }
is equivalent to
foo(bar{|i| puts i})
while
foo bar do |i|
puts i
end
is equivalent to
foo(bar) { |i| puts i }
As you can see, in the braces form, the block is passed to the right-most method while in the do...end form, the block is passed to the left-most method. You can always resolve the ambiguity with parenthesis though.
It should be noted that this is trade-off between idiomatic Ruby (solutions 2 and 3) and performant Ruby (using while loops, because for …in uses each under the hood) as pointed out in Yet Another Language Speed Test: Counting Primes:
Notably, it should be mentioned that writing idiomatic Python and Ruby results in much slower code than that used here. Ranges bad. While loops good.
While it's generally encouraged to opt for idiomatic Ruby, there are perfectly valid situations where you want to ignore that advice.
For example, I've checked the documentation on certain methods, like sort. And it looks like the only difference between .sort and .sort! is that one sorts self in place and the other returns an array. I'm a little unclear on what that means - they seem to effectively do the same thing.
Can anyone help me understand this a little better?
When to Use Bang Methods
Technically, the exclamation point (or bang) doesn't intrinsically mean anything. It's simply an allowable character in the method name. In practice, however, so-called bang methods generally:
Changed objects in-place. For example, #sort! sorts self in place, while #sort returns a new array created by sorting self.
Some bang methods return nil if no changes were made, which can cause problems with method chains. For example:
'foo'.sub 'x', 'y'
# => "foo"
'foo'.sub! 'x', 'y'
#=> nil
Use bang methods when you want to mark a method as creating notable side effects, producing destructive operations, or otherwise requiring additional caution or attention. This is largely by convention, though, and you could make all your methods bang methods if you were so inclined.
Methods with a bang(!) are meant to signify a little more caution is required. So, either modification in place vs. not in place (if you are modifying the object in place - you better be sure that you really want to), or in other cases like find_by and find_by! (see here) where one causes an exception if no record is found and one doesn't cause an exception.
Can you guess which one does and which one does not cause an exception?
The methods with the exclamation point alter the actual object they're called on, where as the methods without will just return a new object that has been manipulated.
i.e.
pizza = 'pepperoni'
pizza.capitalize
Now the pizza variable will still equal 'pepperoni'.
If we then call
pizza.capitalize!
The pizza variable will now equal 'Pepperoni'
Enumerable#detect returns the first value of an array where the block evaluates to true. It has an optional argument that needs to respond to call and is invoked in this case, returning its value. So,
(1..10).detect(lambda{ "none" }){|i| i == 11} #=> "none"
Why do we need the lambda? Why don't we just pass the default value itself, since (in my tests) the lambda can't have any parameters anyway? Like this:
(1..10).detect("none"){|i| i == 11} #=> "none"
As with all things in Ruby, the "principle of least surprise" applies. Which is not to say "least surprise for you" of course. Matz is quite candid about what it actually means:
Everyone has an individual background. Someone may come from Python, someone else may come from Perl, and they may be surprised by different aspects of the language. Then they come up to me and say, 'I was surprised by this feature of the language, so Ruby violates the principle of least surprise.' Wait. Wait. The principle of least surprise is not for you only. The principle of least surprise means principle of least my surprise. And it means the principle of least surprise after you learn Ruby very well. For example, I was a C++ programmer before I started designing Ruby. I programmed in C++ exclusively for two or three years. And after two years of C++ programming, it still surprises me.
So, the rational here is really anyone's guess.
One possibility is that it allows for or is consistent with use-cases where you want to conditionally run something expensive:
arr.detect(lambda { do_something_expensive }) { |i| is_i_ok? i }
Or as hinted by #majioa, perhaps to pass a method:
arr.detect(method(:some_method)) { |i| is_i_ok? i }
Accepting a callable object allows allows "lazy" and generic solutions, for example in cases where you'd want to do something expensive, raise an exception, etc...
I can't see a reason why detect couldn't accept non callable arguments, though, especially now in Ruby 2.1 where it's easy to create cheap frozen litterals. I've opened a feature request to that effect.
It is probably so you can generate an appropiate result from the input. You can then do something like
arr = (1..10).to_a
arr.detect(lambda{ arr.length }){|i| i == 11} #=> 10
As you said, returning a constant value is very easy with a lambda anyway.
Really it is interestion question. I can understand why authors have added the feature with method call, you can just pass method variable, containing a Method object or similar, as an argument. I think it is simply voluntaristic solution for the :detect method, because it could be easy to add switch on type of passed argument to select weither it is the Method or not.
I've reverified the examples, and got:
(1..10).detect(proc {'wqw'}) { |i| i % 5 == 0 and i % 7 == 0 } #=> nil
# => "wqw"
(1..10).detect('wqw') { |i| i % 5 == 0 and i % 7 == 0 } #=> nil
# NoMethodError: undefined method `call' for "wqw":String
That is amazing. =)
The best use case for having a lambda is to raise a custom exception
arr = (1..10).to_a
arr.detect(lambda{ raise "not found" }){|i| i == 11} #=> RuntimeError: not found
So, as the K is trivial (just surround with ->{ }) there is not much sense in checking for fallback behavior.
The similar case of passing a &-ed symbol instead of a block is in fact not similar at all, as in that case it indicates something that will be called on the items of the enumerable.
I recently discovered Ruby's blocks and yielding features, and I was wondering: where does this fit in terms of computer science theory? Is it a functional programming technique, or something more specific?
Ruby's yield is not an iterator like in C# and Python. yield itself is actually a really simple concept once you understand how blocks work in Ruby.
Yes, blocks are a functional programming feature, even though Ruby is not properly a functional language. In fact, Ruby uses the method lambda to create block objects, which is borrowed from Lisp's syntax for creating anonymous functions — which is what blocks are. From a computer science standpoint, Ruby's blocks (and Lisp's lambda functions) are closures. In Ruby, methods usually take only one block. (You can pass more, but it's awkward.)
The yield keyword in Ruby is just a way of calling a block that's been given to a method. These two examples are equivalent:
def with_log
output = yield # We're calling our block here with yield
puts "Returned value is #{output}"
end
def with_log(&stuff_to_do) # the & tells Ruby to convert into
# an object without calling lambda
output = stuff_to_do.call # We're explicitly calling the block here
puts "Returned value is #{output}"
end
In the first case, we're just assuming there's a block and say to call it. In the other, Ruby wraps the block in an object and passes it as an argument. The first is more efficient and readable, but they're effectively the same. You'd call either one like this:
with_log do
a = 5
other_num = gets.to_i
#my_var = a + other_num
end
And it would print the value that wound up getting assigned to #my_var. (OK, so that's a completely stupid function, but I think you get the idea.)
Blocks are used for a lot of things in Ruby. Almost every place you'd use a loop in a language like Java, it's replaced in Ruby with methods that take blocks. For example,
[1,2,3].each {|value| print value} # prints "123"
[1,2,3].map {|value| 2**value} # returns [2, 4, 8]
[1,2,3].reject {|value| value % 2 == 0} # returns [1, 3]
As Andrew noted, it's also commonly used for opening files and many other places. Basically anytime you have a standard function that could use some custom logic (like sorting an array or processing a file), you'll use a block. There are other uses too, but this answer is already so long I'm afraid it will cause heart attacks in readers with weaker constitutions. Hopefully this clears up the confusion on this topic.
There's more to yield and blocks than mere looping.
The series Enumerating enumerable has a series of things you can do with enumerations, such as asking if a statement is true for any member of a group, or if it's true for all the members, or searching for any or all members meeting a certain condition.
Blocks are also useful for variable scope. Rather than merely being convenient, it can help with good design. For example, the code
File.open("filename", "w") do |f|
f.puts "text"
end
ensures that the file stream is closed when you're finished with it, even if an exception occurs, and that the variable is out of scope once you're finished with it.
A casual google didn't come up with a good blog post about blocks and yields in ruby. I don't know why.
Response to comment:
I suspect it gets closed because of the block ending, not because the variable goes out of scope.
My understanding is that nothing special happens when the last variable pointing to an object goes out of scope, apart from that object being eligible for garbage collection. I don't know how to confirm this, though.
I can show that the file object gets closed before it gets garbage collected, which usually doesn't happen immediately. In the following example, you can see that a file object is closed in the second puts statement, but it hasn't been garbage collected.
g = nil
File.open("/dev/null") do |f|
puts f.inspect # #<File:/dev/null>
puts f.object_id # Some number like 70233884832420
g = f
end
puts g.inspect # #<File:/dev/null (closed)>
puts g.object_id # The exact same number as the one printed out above,
# indicating that g points to the exact same object that f pointed to
I think the yield statement originated from the CLU language. I always wonder if the character from Tron was named after CLU too....
I think 'coroutine' is the keyword you're looking for.
E.g. http://en.wikipedia.org/wiki/Yield
Yield in computing and information science:
in computer science, a point of return (and re-entry) of a coroutine