Why does delete return the deleted element instead of the new array? - ruby

In Ruby, Array#delete(obj) will search for and remove the specified object from the array. However, maybe I'm missing something here, but I find the return value (the obj itself) quite strange and even a little bit useless.
My humble opinion is that, for consistency with methods like sort/sort! and map/map!, there should be two methods, e.g. delete/delete!, where
ary.delete(obj) -> new array, with obj removed
ary.delete!(obj) -> ary (after removing obj from ary)
There are several reasons. First, the current delete is non-pure, and it should warn the programmer about that just like many other methods in Array do (in fact the entire delete_??? family has this issue; they are quite dangerous methods!). Second, returning the obj is much less chainable than returning the new array. For example, if delete behaved as described above, I could do multiple deletions in one statement, or do something else after the deletion:
ary = [1,2,2,2,3,3,3,4]
ary.delete(2).delete(3) #=> [1,4], equivalent to "ary - [2,3]"
ary.delete(2).map{|x| x**2} #=> [1,9,9,9,16]
which is elegant and easy to read.
So I guess my question is: is this a deliberate design decision made for some reason, or is it just a legacy of the language?

If you already know that delete is always dangerous, there is no need to add a bang ! to further signal that it is dangerous. That is why it does not have one. Other methods like map may or may not be dangerous; that is why they have versions with and without the bang.
As for why it returns the extracted element: it provides access to information that would be cumbersome to get at otherwise. The original array after modification can easily be referred to by accessing the receiver, but the extracted element is not so easily accessible.
Perhaps you are comparing this to methods that add elements, like push or unshift. Those methods add elements irrespective of what elements the receiver array already has, so the returned element would always be the same as the argument you passed, which you already know; returning it would not be helpful. Therefore, the modified array is returned, which is more useful. For delete, whether the element is extracted depends on whether the receiver array has it, and you don't know that in advance, so it is useful to have it as the return value.
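A quick sketch of the difference:
a = [1, 2, 3]
a.push(4)     #=> [1, 2, 3, 4]  (returning the argument 4 would tell you nothing new)
a.delete(2)   #=> 2             (the element was present and has been removed)
a.delete(99)  #=> nil           (nothing matched, so nothing was removed)
a             #=> [1, 3, 4]     (the receiver is still at hand whenever you need the array)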

For anyone who might be asking the same question: I think I understand it a little better now, so I might as well share my take on it.
The short answer is that Ruby is not a language originally designed for functional programming, nor does it make method purity a priority.
On the other hand, for the particular applications described in my question, we do have alternatives. The - method can be used as a pure alternative to delete in most situations; for example, the code in my question can be written like this:
ary = [1,2,2,2,3,3,3,4]
ary.-([2]).-([3]) #=> [1,4], or simply ary.-([2,3])
ary.-([2]).map{|x| x**2} #=> [1,9,9,9,16]
and you can happily get all the benefits of the purity of -. For delete_if, I guess in most situations select (with the block's condition negated) could be a serviceable, if imperfect, pure candidate.
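For illustration, a sketch of that substitution (reject with the unmodified condition is equivalent):
ary = [1, 2, 2, 2, 3, 3, 3, 4]
ary.delete_if { |x| x == 2 }   #=> [1, 3, 3, 3, 4], and ary itself is mutated
ary = [1, 2, 2, 2, 3, 3, 3, 4]
ary.select { |x| x != 2 }      #=> [1, 3, 3, 3, 4], ary left untouched
ary.reject { |x| x == 2 }      #=> same result, arguably reads closest to delete_if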
As for why the delete family was designed like this, I think it's more a difference in point of view. These methods are meant more as shorthands for commonly needed non-pure procedures than as counterparts to the functional-flavored select, map, etc.

I’ve wondered some of these same things myself. What I’ve largely concluded is that the method simply has a misleading name that carries with it false expectations. Those false expectations are what trigger our curiosity as to why the method works like it does. Bottom line—I think it’s a super useful method that we wouldn’t be questioning if it had a name like “swipe_at” or “steal_at”.
Anyway, another alternative we have is values_at(*args), which is functionally the opposite of delete_at in that you specify what you want to keep and you get back the resulting array (as opposed to specifying what you want to remove and getting back the removed item).
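A small sketch of that mirror image:
a = [10, 20, 30, 40]
a.delete_at(1)        #=> 20            (the removed item; a is now [10, 30, 40])
a = [10, 20, 30, 40]
a.values_at(0, 2, 3)  #=> [10, 30, 40]  (the kept items; a itself is unchanged)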


Null checks for a complex dereference chain in Java 8

So I have a class generated from some contract (so no modifications allowed) with multiple data layers. I get it through a SOAP request, and then in my backend I have something like this:
value = bigRequest.getData().getSamples().get(0).getValuableData().getValue()
and every dereference in that chain can return null. The class itself has no logic, just pure data with accessors, but nonetheless. I'm getting sick of the thought of writing an ugly boilerplate of not-null checks for every single dereference, so I'm wondering about the best practice here:
1. Actually write the ugly boilerplate (either with ifs or with asserts). I assume that's what I've got to do, but I have faint hopes.
2. Do some Optional magic. But with no access to modify the source, it'll probably be even uglier.
3. Catch the NPE. It's ugly at heart, but in this particular case I feel it's the best option, just because it's part of the logic: either I have that value or I don't. But catching an NPE makes me shiver.
4. Something I can't see by myself now.
I actually feel a little uncomfortable asking this question because the NPE theme has been explored to the bone, but I had no success searching.
I agree with both of you that Andrew Vershinin’s suggestion is the best we can do here and thus deserves to be posted as an answer.
nullableValue = Optional.ofNullable(bigRequest)
.map(RequestCls::getData)
.map(DataCls::getSamples)
.filter(samples -> ! samples.isEmpty())
.map(samples -> samples.get(0))
.map(SampleCls::getValuableData)
.map(ValDataCls::getValue)
.orElse(null);
You will need to substitute the right class or interface names in the method references (or you may rewrite as lambdas if you prefer). Edit: If bigRequest itself cannot be null, the first method call should be just Optional.of(bigRequest).
It’s not the primary intended use of Optional, but I find it OK, and better than items 1. and 3. (and 4.) from your question.

Is it good in Rails view to define constant just to be used once?

To a partial in a HAML file, I am passing parameters whose values are long, or methods whose names are long, for example:
"Some quite long string"
quiteLongMethodNameHere(otherConstant)
To make them shorter, I wrapped them in a constant/variable:
- message = "Some quite long string"
- is_important = quiteLongMethodNameHere(otherConstant)
= render :some_component, msg: message, is_important: is_important
Is this a good practice? Or should I just put the value in the param directly, without wrapping it in a variable/constant?
It's a case-by-case decision. You want to balance the sometimes-competing interests of clarity and conciseness. For me, it depends on the expressiveness of both forms. If the long method name is clear, precise, and expressive, then I would be less interested in using an intermediate variable to hold its result than if it were not.
In other cases, where the long form is less expressive, I will often use intermediate variables as "living" documentation, even if they are used only on the next line of code. This more explicitly reveals your intention to the reader (who may be someone else, or you at some future point in time).
I find intermediate variables are much better than code comments because code comments can more easily become obsolete, and having the clarification in code makes it available for debuggers, etc. The performance hit of creating an extra variable is minimal, and significant in only the most unusual of cases.
Another factor: if you are aggregating things (in arrays, hashes, etc.) that include these function calls and values, then using intermediate variables makes the code neater, and possibly easier to understand, since you can customize each name to make the most sense in the context of that collection.
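For example (all names here are made up):
display_name  = customer.preferred_display_name_or_fallback
overdue_total = invoices.select(&:overdue?).sum(&:amount)

summary = {
  name:    display_name,
  overdue: overdue_total
}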
Regardless of the length of the string, it makes sense to assign it to a variable/constant rather than refer to it directly in a view file. If it is text, it makes even more sense to put it in an i18n file.
However, it is not good to do that in the main view file. If you are going to do it, do it in the controller or in a helper, for example:
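A rough sketch of that idea (the helper module name and i18n key are made up; the long method call is the one from the question):
# app/helpers/report_helper.rb
module ReportHelper
  def component_message
    t('components.some_component.message')  # the long literal lives in config/locales/*.yml
  end

  def component_important?
    quiteLongMethodNameHere(otherConstant)
  end
end
and the HAML line becomes:
= render :some_component, msg: component_message, is_important: component_important?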

Why does rubocop or the ruby style guide prefer not to use get_ or set_?

I was running rubocop on my project and fixing the complaints it raised.
One particular complaint bothered me
Do not prefix reader method names with get_
I could not understand much from this complaint, so I looked at the source code on GitHub.
I found this snippet
def bad_reader_name?(method_name, args)
  method_name.start_with?('get_') && args.to_a.empty?
end

def bad_writer_name?(method_name, args)
  method_name.start_with?('set_') && args.to_a.one?
end
So the advice or convention is as follows:
1) They advise us not to use get_ when the method takes no arguments; otherwise get_ is allowed.
2) They advise us not to use set_ when the method takes exactly one argument; otherwise set_ is allowed.
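In other words (a sketch with made-up method names), only the plain accessor shapes are flagged:
def get_name            # no arguments: flagged, prefer name
  @name
end

def set_name(value)     # exactly one argument: flagged, prefer name=(value)
  @name = value
end

def get_record(id)      # takes an argument, so it is not flagged
  # ...
end

def set_flags(a, b)     # takes two arguments, so it is not flagged
  # ...
end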
What is the reason behind this convention or rule or advice?
I think the point here is that Ruby devs prefer to always think of methods as getters, since they return something, and to use the equals "syntactic sugar" for setters (as in def self.dog=(params), which lets you write Class.dog = something). In essence, the point I've always seen made is that the get and set prefixes are redundant and verbose.
In contrast, get and set methods with multiple args are more like finder methods (particularly get; think of ActiveRecord's where).
Keep in mind that 'style guide' = pure opinion. Consistency is the higher goal of style guides in general, so unless something is arguably wrong or difficult to read, your goal should be more about keeping everything the same than about enforcing any particular style. Which is why RuboCop lets you turn this off.
Another way to see it: the getter/setter paradigm was, as far as I can tell, largely a convention specific to Java/C++ and the like; at least, I knew quite a few Java codebases in the very foggy past where Beans were littered with huge numbers of get_-style getters and set_-style setters. In those days, the private attribute would likely be called "name", with "set_name()" and "get_name()" accessors; since the attribute itself was called "name", the getter could not be "name()" as well.
Hence the very existence of "get_" and "set_" is due to a (trivial) technical shortcoming of languages that do not allow the "=" in method names.
In Ruby, we have quite a different array of possibilities:
First and foremost, we have name() and name=(), which immediately removes the need for the get_ and set_ prefixes. So we do have getters and setters; we just name them differently from Java (see the sketch after these points).
Also, the attribute itself is then not name but @name, which resolves that conflict as well.
Actually, you don't have attributes behind the plain "obj.name" syntax at all! For example, while Rails/ActiveRecord pretends that "person.name" is an "attribute", it is in fact simply a pair of auto-generated methods: the getter name() and the setter name=(). Conceptually, the caller is not supposed to care about what "name" is (attribute or method); it is an implementation detail of the class.
Finally, it saves three or four keystrokes for each call. This might seem like a joke, but writing concise yet easily comprehensible code is a trademark of Ruby, after all. :-)
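A minimal sketch of that accessor pair (exactly what attr_accessor :name would generate for you):
class Person
  def name           # reader: no get_ prefix needed
    @name
  end

  def name=(value)   # writer: the = is part of the method name
    @name = value
  end
end

person = Person.new
person.name = "Ada"  # actually calls name=("Ada")
person.name          #=> "Ada"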
The way I understand it is that it's because foo.get_value is imperative and foo.value is declarative.

Naming convention for syntactic sugar methods

I'm building a library for generic reporting to Excel (using Spreadsheet), and most of the time I'll be writing things out on the last created worksheet (or the active sheet, as I mostly refer to it).
So I'm wondering if there's a naming convention for methods that are mostly sugar for the normal/unsugared method.
For instance, a while ago I saw a blog post (scroll down to Composite) where the author used #method for the sugared version and #method! for the unsugared/manual one.
Could this be said to be a normal way of doing things, or just an odd implementation?
What I'm thinking of doing now is:
add_row(data)
add_row!(sheet, data)
This feels like a good fit to me, but is there a consensus on how these kinds of methods should be named?
Edit
I'm aware that the ! is used for "dangerous" methods and ? for query/boolean responses. Which is why I got curious whether the usage in Prawn (the blog post) could be said to be normal.
I think it's fair to say that your definitions:
add_row(data)
add_row!(sheet, data)
are going to confuse Ruby users. There are a good number of naming conventions in the Ruby community that are considered a de facto standard for naming. For example, bang methods are meant to modify the receiver; see map and map!. Another convention is to add ? as a suffix to methods that return a boolean; see all? or any? for reference.
I'm used to seeing bang methods as the more dangerous version of a regularly named method:
Array#reverse!, which modifies the array itself instead of returning a new array with the elements in reverse order.
ActiveRecord::Base#save! (from Rails), which validates the model and saves it if it's valid. But unlike the regular version, which returns true or false depending on whether the model was saved, it raises an exception if the model is invalid.
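For instance, with the first example:
a = [1, 2, 3]
a.reverse    #=> [3, 2, 1], a is still [1, 2, 3]
a.reverse!   #=> [3, 2, 1], a is now [3, 2, 1]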
I don't remember seeing bang methods used as sugared alternatives to regular methods. Maybe I'd give such methods their own distinct names rather than just adding a bang to the regular version's name.
Why have two separate methods? You could, for example, make the sheet an optional parameter:
def add_row(sheet = active_sheet, data)
  ...
end
Default values don't have to be static; in this case the default is a call to the active_sheet method. If my memory is correct, prior to Ruby 1.9 you'd have had to swap the parameters, since optional parameters couldn't be followed by non-optional ones.
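Under that definition both call styles work (a sketch assuming active_sheet and other_sheet are defined elsewhere):
add_row([1, 2, 3])               # one argument: it binds to data, sheet defaults to active_sheet
add_row(other_sheet, [1, 2, 3])  # explicit sheet first, then data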
I'd agree with other answers that ! has rather different connotations to me.

Syntax when dereferencing database-backed tree

I'm using MongoDB, so my clusters of data are in dictionaries. Some of these contain references to other Mongo objects. For example, say I have a Person document which has a separate Employer document. I would like to control element access so I can automatically dereference documents. I also have some data with dates, and since PyMongo can't store timezone info, I'd like to store a string timezone alongside the UTC time and have an accessor that easily returns the converted times.
Which of these options seems the best to you?
Person = {'employer': ObjectID}
Employer = {'name': str}
Option 1: Augmented operations are methods
Examples
print person.get_employer()['name']
person.get_employer()['name'] = 'Foo'
person.set_employer(new_employer)
Pro: Method syntax makes it clear that getting the employer is not just dictionary access
Con: Different syntaxes for referenced and non-referenced objects, making it hard to normalize the schema if necessary. Augmenting an element would require changing the callers
Option 2: Everything is an attribute
Examples
print person.employer.name
person.employer.name = 'Foo'
person.employer = new_employer
Pro: Uniform syntax for augmented and non-augmented
?: Makes it unclear that this is backed by a dictionary, but provides a layer of abstraction?
Con: Requires morphing a dictionary to an object, not pythonic?
Option 3: Everything is a dictionary item
Examples
print person['employer']['name']
person['employer']['name'] = 'Foo'
person['employer'] = new_employer
Pro: Uniform syntax for augmented and non-augmented
?: Makes it unclear that some of these accesses are actually method calls, but provides a layer of abstraction?
Con: Dictionary item syntax is error-prone to type IMHO.
Your first two options would require making a "Person" class and an "Employer" class, and using __dict__ to read values and setattr to write them. This approach will be slower, but more flexible (you can add new methods, validation, etc.).
The simplest way would be to use only dictionaries (option 3). It wouldn't require any OOP. Personally, I also find it the most readable of the three.
So, if I were you, I would use option 3. It is nice and simple, and easy to expand on later if you change your mind. If I had to choose between the first two, I would choose the second (I don't like overusing getters and setters).
P.S. I'd keep away from person.get_employer()['name'] = 'Foo', regardless of what you do.
Do not be afraid to write a custom class when that will make the subsequent code easier to write/read/debug/etc.
Option 1 is good when you're calling something that's slow/intensive/whatever -- you'll want to save the result so you can use option 2 for subsequent access.
Option 2 is your best bet -- less typing, easier to read, create your classes once then instantiate and away you go (no need to morph your dictionary).
Option 3 doesn't really buy you anything over option 2 (besides more typing, plus allowing typos to pass silently instead of erroring out).
