Loop vs closure, readable vs concise? - coding-style

In my answer to my own question here I posted some code, and @Dave Newton was kind enough to provide me with a gist and show me the error in my not-so-Groovy ways. <-- Groovy pun
I took his advice and revamped my code to be Groovier. Since then the link I am making (which Dave represents with the replaceWith variable) has changed. Now the closure representation of what I want to do would look like this:
int i = 1
errorList = errorLinksFile.readLines().grep { it.contains "href" }.collect { line ->
    def replaceWith = "<a href=\"${rooturl}${build.url}parsed_console/log_content.html#ERROR${i++}\">"
    line.replaceAll(pattern, replaceWith).minus("</font>")
}
And the for loop representation of what I want to do would look like this:
def errorList = []
def i = 1
for(line in errorLinksFile.getText().split("\n")){
    if(!line.contains("href")){
        continue
    }
    errorList.add(line.replaceAll(pattern, "<a href=\"${rooturl}${build.url}parsed_console/log_content.html#ERROR${i++}\">").minus("</font>"))
}
The closure version is definitely more concise, but I'm worried that if I always go the "Groovier" route, the code might be harder for other programmers to understand than a simple for loop. So when is Groovier better, and when should I opt for code that is likely to be understood by all programmers?

I believe that a development team should strive to be the best, and coding to the least knowledgeable/experienced developer does not support this. It is important, though, that more than one person on the team knows how to read the code that is developed. So if you're the only one who can read it, teach someone else. If you're worried about someone new to the team being able to read it, I feel that both versions would be equally hard to read, since the newcomer would lack domain knowledge. What I would do, though, is break it up a little bit:
def originalMethod() {
    //Do whatever happens before the given code
    errorList = getModifiedErrorsFromFile(errorLinksFile)
}
def getModifiedErrorsFromFile(errorLinksFile) {
    int i = 1
    getHrefsFromFile(errorLinksFile).collect { line ->
        def replaceWith = getReplacementTextForLine(i)
        i++
        line.replaceAll(pattern, replaceWith).minus("</font>")
    }
}
def getHrefsFromFile(errorLinksFile) {
    errorLinksFile.readLines().grep { it.contains "href" }
}
def getReplacementTextForLine(i) {
    "<a href=\"${rooturl}${build.url}parsed_console/log_content.html#ERROR${i}\">"
}
This way if the next person doesn't immediately understand what is going on they should be able to infer what is going on based on the method names. If that doesn't work adding tests would help the next person understand what is going on.
My 2 cents. Good topic though!!!
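For comparison, here is a rough sketch of the same filter-then-transform pipeline in Ruby rather than Groovy, with the running number supplied by the iterator instead of a mutable counter. The names build_error_links, LINK_PATTERN and base_url, and the sample data, are assumptions made for illustration; the question does not show what its pattern variable contains.

```ruby
# Hypothetical stand-in for the question's `pattern`, which is not shown there.
LINK_PATTERN = /<a href="[^"]*">/

# Keep only lines containing "href", then rewrite each link with a running
# 1-based error number supplied by with_index instead of a manual counter.
def build_error_links(lines, base_url)
  lines.grep(/href/).map.with_index(1) do |line, i|
    line.sub(LINK_PATTERN, %(<a href="#{base_url}#ERROR#{i}">)).sub("</font>", "")
  end
end

# Made-up input lines, for illustration only.
SAMPLE_LINES = [
  '<a href="old1">Error A</a></font>',
  'plain line without a link',
  '<a href="old2">Error B</a></font>'
].freeze

ERROR_LINKS = build_error_links(SAMPLE_LINES, "http://ci.example/log.html")
```

Whether the chained form or the loop reads better is still the judgment call discussed above; the sketch only shows that the counter does not have to live outside the closure.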

Idiomatic Groovy is good; people will learn the common idioms quickly. "Clever" Groovy, in my opinion, is more likely to be just confusing.

Related

let! vs let and before in Rspec

In RSpec, let uses lazy instantiation, so let(:foo) { create(...) } isn't initialised until something calls it. Usually this is good, because the object is only created when it is needed, which makes RSpec test runs much quicker.
Occasionally, however, you will have a spec that needs that variable but doesn't explicitly call it. With lazy instantiation, that spec will fail.
A solution is the bang version: let!(:foo) { create(...) } forces the variable to be initialised before each example.
Some developers seem to be very against this and prefer:
let(:foo) { create(...) }
before do
  foo
end
to force the initialisation.
Is there a reason for this? Is there any difference between the two methods?
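The lazy versus eager behaviour described in the question can be modelled in a few lines of plain Ruby. This is only a sketch of the idea, not RSpec's actual implementation; TinyExample, run_before_hooks, Spec and CREATED are hypothetical names invented for the illustration.

```ruby
class TinyExample
  # Names registered with let!, kept per class.
  def self.eager_names
    @eager_names ||= []
  end

  # let: define a memoized reader; the block runs on the first call only.
  def self.let(name, &block)
    define_method(name) do
      @memo ||= {}
      @memo.fetch(name) { @memo[name] = instance_exec(&block) }
    end
  end

  # let!: same as let, plus evaluation during the "before" phase.
  def self.let!(name, &block)
    let(name, &block)
    eager_names << name
  end

  # Stand-in for the before hooks that run ahead of each example.
  def run_before_hooks
    self.class.eager_names.each { |name| send(name) }
  end
end

CREATED = [] # records which blocks have actually run

class Spec < TinyExample
  let(:lazy_foo)   { CREATED << :lazy_foo; :lazy_foo }
  let!(:eager_foo) { CREATED << :eager_foo; :eager_foo }
end

SPEC = Spec.new
SPEC.run_before_hooks
AFTER_HOOKS = CREATED.dup # only the let! helper has run so far
SPEC.lazy_foo
SPEC.lazy_foo             # memoized: the block does not run a second time
AFTER_CALL = CREATED.dup
```

In this model, let! is just let plus a call made before the example body, which is the same effect the let-and-before pairing achieves by hand.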
I can think of one difference: before blocks would compound, and you can overwrite let! with let and vice versa. I'll give you an example:
context do
  let!(:foo) { create(:foo) }
  let(:bar) { create(:bar) }
  before { bar }
  context do
    # if in this context you wish to switch back to "lazy" versions you can
    # do that for :foo, just do:
    let(:foo) { create(:foo) }
    # but you can't "undo" before, even if you define an empty one:
    before { }
    # it does not cancel the `before` blocks defined in higher contexts
  end
end
Edit: I just realized this does not really answer the question why someone would prefer before to let!. Maybe: as mentioned in comments the order is different, but if you depend on such nuance in your specs - it's already too complicated.
In many situations it is a matter of style, or the developer is not fully aware of RSpec's main features, and often people's choices just don't make sense. Humans are not machines, and especially under time pressure, developers do things that they wouldn't do in ideal conditions :).
But the two cases presented are not strictly the same.
For example, if you are using subject, it is evaluated in a before hook before the let! initialization and not inside it. I didn't test this, but I believe these cases should show the difference:
let!(:car) { create(:car) }
let(:driver) { create(:driver) }
subject { driver.car() }
it { expect(subject).to eq car } # Fail:
This forces car to be created first, so that it is available to subject:
let(:driver) { create(:driver) }
subject { driver.car() }
before { create(:car) }
it { expect(subject).to eq car } # Success

Ruby method for values from all associations

This method works, but I'm sure the performance could be greatly improved. Also, I'm realizing how fun and awesome it is to take smelly code like this, and rubify it. But I need a little more help to get my Ruby skills to the level to refactor something like this.
An objective can have "preassign" objectives. These are prerequisites that must be completed before a student can try the objective in question.
ObjectiveStudent is the join model between an objective and a student. It has a method called "points_all_time" that finds the student's best score on that objective.
The check_if_ready method is the one that I'm trying to refactor in this question. It also belongs to the ObjectiveStudent model.
It needs to check whether the student has passed ALL of the preassigns for a given objective. If so, return true. Return false if the student has a less-than-passing score on any of the preassigns.
def check_if_ready
  self.objective.preassigns.each do |preassign|
    obj_stud = self.user.objective_students.find_by(objective_id: preassign.id)
    return false if obj_stud.points_all_time < 7
  end
  return true
end
Right now I suspect this method is making too many calls to the database. What I'm really hoping to find is some way to look at the scores for the pre-reqs with a single db call.
Thank you in advance for any insight.
The following should work for you:
def is_ready?
  user.objective_students
    .where(objective_id: objective.preassigns.select(:id))
    .none? { |obj_stud| obj_stud.points_all_time < 7 }
end
We collect all the objective_students for the user where the objective_id is in the list of objective.preassigns ids. This results in a single query being executed.
Then we use Enumerable#none? to make sure that none of the objective_students have points_all_time less than 7.
You could also use the inverse, .all? { |obj_stud| obj_stud.points_all_time >= 7 }, if you wanted.
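One edge case may be worth checking, assuming a student can lack an objective_students row for some preassign: in the original loop, find_by would return nil and the call to points_all_time would raise, whereas a where-based query never sees the missing row, so that preassign would not count against the student. A minimal in-memory sketch (plain Ruby, no ActiveRecord; ObjStud, PASSING_SCORE and ready? are hypothetical names) that treats a missing row as not passed:

```ruby
# In-memory stand-in for an objective_students row (illustration only).
ObjStud = Struct.new(:objective_id, :points_all_time)
PASSING_SCORE = 7

# True only if every preassign id has a row with a passing score.
# A preassign without any row counts as a score of 0, i.e. not passed.
def ready?(preassign_ids, objective_students)
  scores = objective_students.to_h { |os| [os.objective_id, os.points_all_time] }
  preassign_ids.all? { |id| scores.fetch(id, 0) >= PASSING_SCORE }
end

ROWS = [ObjStud.new(1, 8), ObjStud.new(2, 9), ObjStud.new(3, 6)].freeze
```

With these rows, ready?([1, 2], ROWS) is true, ready?([1, 3], ROWS) is false because of the score of 6, and ready?([1, 4], ROWS) is false because there is no row for objective 4. Whether a missing row should mean "not ready" is a decision for the application, not something the question settles.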
One way you could "rubify" this method is to rewrite the signature as:
def is_ready?
It is common practice to append ? to functions that return a boolean value in Ruby. (Note: I also don't really see a reason to have the word 'check' in the declaration, but that's just an opinion).
Furthermore, if objective_id is the primary key for the objective_students model, you can simply write objective_students.find(preassign.id) instead of the find_by method.
I would also suggest having a separate method for returning a student's points (especially since I suspect you will need to get a student's points more than just once):
def points_for(preassign_id)
  user.objective_students.find_by(objective_id: preassign_id).points_all_time
end
Then your main method can be written in a clearer, more self-describing manner as:
def is_ready?
  objective.preassigns.each { |preassign| return false if points_for(preassign.id) < 7 }
  true
end

Possible to reference the key in the value of a key-value pair?

For example, say I want to do this:
{
  :mytime => times[:mytime]
}
There is a bit of repetition here. Is it possible to just do something like { :mytime => times[$_key] } (made up syntax)?
The short answer is: no, there's no syntax that does anything like that. Without more context about what you're actually trying to do, there isn't really a good longer answer. Still, I'll try to be a little more useful. If you're really worried about repetition, you could do something like:
h = {}
[:mytime, :yourtime].each do |k|
  h[k] = times[k]
end
For only a few keys, I can't see that being worth it, personally.
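If times is a plain Hash (an assumption; the question does not say), Ruby's built-in Hash#slice, available since Ruby 2.5, copies the chosen keys without naming each one twice. A small sketch with made-up data:

```ruby
# Made-up data standing in for the question's `times`.
TIMES = { mytime: 10, yourtime: 20, theirtime: 30 }.freeze

# Hash#slice returns a new hash containing only the listed keys;
# keys that are absent from the source are simply left out.
PICKED = TIMES.slice(:mytime, :yourtime)

# Closer to the each loop above: every listed key appears in the result,
# with nil for keys that are missing from the source.
WANTED = %i[mytime histime].freeze
LOOPED = WANTED.to_h { |k| [k, TIMES[k]] }
```

The two forms differ only in how they treat keys that are missing from the source, so the choice depends on whether a nil entry is wanted.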

How much information hiding is necessary when doing code refactoring?

How much information hiding is necessary? I have boilerplate code that runs before I delete a record; it looks like this:
public override void OrderProcessing_Delete(Dictionary<string, object> pkColumns)
{
    var c = Connect();
    using (var cmd = new NpgsqlCommand("SELECT COUNT(*) FROM orders WHERE order_id = :_order_id", c)
        { Parameters = { {"_order_id", pkColumns["order_id"]} } } )
    {
        var count = (long)cmd.ExecuteScalar();
        // deletion's boilerplate code...
        if (count == 0) throw new RecordNotFoundException();
        else if (count > 1) throw new DatabaseStructureChangedException();
        // ...boiler plate code
    }
    // deleting of table(s) goes here...
}
NOTE: boilerplate code is code-generated, including the "using (var cmd = new NpgsqlCommand( ... )"
But I'm seriously thinking of refactoring the boilerplate code; I want more succinct code. This is how I envision the refactored code (made nicer with an extension method, though that is not the sole reason ;)):
using (var cmd = new NpgsqlCommand("SELECT COUNT(*) FROM orders WHERE order_id = :_order_id", c)
    { Parameters = { {"_order_id", pkColumns["order_id"]} } } )
{
    cmd.VerifyDeletion(); // [EDIT: was ExecuteWithVerification before]
}
I want the ExecuteScalar call and the boilerplate code to go inside the extension method.
For my code above, is this refactoring / information hiding warranted? Does my refactored operation look too opaque?
I would say that your refactor is extremely good, if your new single line of code replaces a handful of lines of code in many places in your program. Especially since the functionality is going to be the same in all of those places.
The programmer coming after you and looking at your code will simply look at the definition of the extension method to find out what it does, and now he knows that this code is defined in one place, so there is no possibility of it differing from place to place.
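The idea of keeping the check in one place can be sketched in Ruby (not the C# of the question). The method verify_deletion and the two error classes are hypothetical names that mirror the question's exceptions; the count is assumed to come from the same COUNT(*) query.

```ruby
# Hypothetical error classes mirroring the exceptions in the question.
class RecordNotFoundError < StandardError; end
class DatabaseStructureChangedError < StandardError; end

# Single home for the pre-delete check: exactly one matching row is expected.
def verify_deletion(count)
  raise RecordNotFoundError, "no matching row" if count.zero?
  raise DatabaseStructureChangedError, "key matched #{count} rows" if count > 1
  count
end
```

A caller would pass the row count and proceed with the delete only if no error is raised. If the rule ever changes, it changes here once, which is the benefit described above.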
Try it if you must, but my feeling is that it's not about succinctness but about whether you want to enforce the behavior every time or only most of the time. And, by extension, whether a change to the verify condition would likely apply across the board.
Basically, reducing a small chunk of boiler-plate code doesn't necessarily make things more succinct; it's just one more bit of abstractness the developer has to wade through and understand.
As a developer, I'd have no idea what "ExecuteWithVerify" means. What exactly are we verifying? I'd have to look it up and remember it. But with the boiler-plate code, I can look at the code and understand exactly what's going on.
And by NOT reducing it to a separate method I can also tune the boiler-plate code for cases where exceptions need to be thrown for differing conditions.
It's not information-hiding when you extract or refactor your code. It's only information-hiding when you start restricting access to your extension definition after refactoring.
"new" operator within a Class (except for the Constructor) should be Avoided at all costs. This is what you need to refactor here.

Flatten conditional as a refactoring

Consider:
if (something) {
    // Code...
}
With CodeRush installed it recommended doing:
if (!something) {
    return;
}
// Code...
Could someone explain how this is better? Surely there is no benefit whatsoever.
Isolated, as you've presented it - no benefit. But mark4o is right on: it's less nesting, which becomes very clear if you look at even, say a 4-level nesting:
public void foo() {
    if (a)
        if (b)
            if (c)
                if (d)
                    doSomething();
}
versus
public void foo() {
    if (!a)
        return;
    if (!b)
        return;
    if (!c)
        return;
    if (!d)
        return;
    doSomething();
}
Early returns like this improve readability.
In some cases, it's cleaner to validate all of your inputs at the beginning of a method and just bail out if anything is not correct. You can have a series of single-level if checks that check successively more and more specific things until you're confident that your inputs are good. The rest of the method will then be much easier to write, and will tend to have fewer nested conditionals.
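A minimal sketch of that up-front validation style, written in Ruby; validate_signup and its rules (including the eight-character minimum) are hypothetical and exist only to show the shape:

```ruby
# Each guard clause rejects one bad input and returns immediately;
# the checks get more specific as they go, and the happy path stays flat.
def validate_signup(username, password)
  return :missing_username if username.nil? || username.strip.empty?
  return :missing_password if password.nil? || password.empty?
  return :password_too_short if password.length < 8 # hypothetical rule
  :ok
end
```

By the time the last line is reached, every earlier assumption has already been checked, so no further nesting is needed.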
One less level of nesting.
This is a conventional refactoring meant for maintainability. See:
http://www.refactoring.com/catalog/replaceNestedConditionalWithGuardClauses.html
With one condition, it's not a big improvement. But it follows the "fail fast" principle, and you really start to notice the benefit when you have lots of conditions. If you grew up on "structured programming", which typically recommends functions have single exit points, it may seem unnatural, but if you've ever tried to debug code that has three levels or more of nested conditionals, you'll start to appreciate it.
It can be used to make the code more readable (by way of less nesting). See here for a good example, and here for a good discussion of the merits.
That sort of pattern is commonly used to replace:
void SomeMethod()
{
    if (condition_1)
    {
        if (condition_2)
        {
            if (condition_3)
            {
                // code
            }
        }
    }
}
With:
void SomeMethod()
{
    if (!condition_1) { return; }
    if (!condition_2) { return; }
    if (!condition_3) { return; }
    // code
}
Which is much easier on the eyes.
I don't think CodeRush is recommending it --- rather just offering it as an option.
IMO, it depends on whether something or !something is the exceptional case. If a significant amount of code runs when something holds, then using the !something conditional makes more sense for legibility and potential nesting reduction.
Well, look at it this way (I'll use PHP as an example):
You fill a form and go to this page: validate.php
example 1:
<?php
if (valid_data($_POST['username'])) {
    if (valid_data($_POST['password'])) {
        login();
    } else {
        die();
    }
} else {
    die();
}
?>
vs
<?php
if (!valid_data($_POST['username'])) {
    die();
}
if (!valid_data($_POST['password'])) {
    die();
}
login();
?>
Which one is better and easier to maintain? Remember this is just validating two things. Imagine this for a register page or something else.
I remember very clearly losing marks on a piece of college work because I had gone with the
if (!something) {
    return;
}
// Code...
format. My lecturer pontificated that it was bad practice to have more than one exit point in a function. I thought that was nuts and 20+ years of computer programming later, I still do.
To be fair, he lived in an era where the lingua franca was C and functions were often pages long and full of nested conditionals making it difficult to track what was going on.
Then and now, however, simplicity is king: Keeping functions small and commenting them well is the best way to make things readable and maintainable.

Resources