How much information hiding is necessary when doing code refactoring? - refactoring

How much information hiding is necessary? I have boilerplate code before I delete a record, it looks like this:
public override void OrderProcessing_Delete(Dictionary<string, object> pkColumns)
{
var c = Connect();
using (var cmd = new NpgsqlCommand("SELECT COUNT(*) FROM orders WHERE order_id = :_order_id", c)
{ Parameters = { {"_order_id", pkColumns["order_id"]} } } )
{
var count = (long)cmd.ExecuteScalar();
// deletion's boilerplate code...
if (count == 0) throw new RecordNotFoundException();
else if (count > 1) throw new DatabaseStructureChangedException();
// ...boiler plate code
}
// deleting of table(s) goes here...
}
NOTE: boilerplate code is code-generated, including the "using (var cmd = new NpgsqlCommand( ... )"
But I'm seriously thinking to refactor the boiler plate code, I wanted a more succint code. This is how I envision to refactor the code (made nicer with extension method (not the sole reason ;))
using (var cmd = new NpgsqlCommand("SELECT COUNT(*) FROM orders WHERE order_id = :_order_id", c)
{ Parameters = { {"_order_id", pkColumns["order_id"]} } } )
{
cmd.VerifyDeletion(); // [EDIT: was ExecuteWithVerification before]
}
I wanted the executescalar and the boilerplate code to goes inside the extension method.
For my code above, does it warrants code refactoring / information hiding? Is my refactored operation looks too opaque?

I would say that your refactor is extremely good, if your new single line of code replaces a handful of lines of code in many places in your program. Especially since the functionality is going to be the same in all of those places.
The programmer coming after you and looking at your code will simply look at the definition of the extension method to find out what it does, and now he knows that this code is defined in one place, so there is no possibility of it differing from place to place.

Try it if you must, but my feeling is it's not about succinctness but whether or not you want to enforce the behavior every time or most of the time. And by extension, if the verify-condition changes that it would likely change across the board.
Basically, reducing a small chunk of boiler-plate code doesn't necessarily make things more succinct; it's just one more bit of abstractness the developer has to wade through and understand.
As a developer, I'd have no idea what "ExecuteWithVerify" means. What exactly are we verifying? I'd have to look it up and remember it. But with the boiler-plate code, I can look at the code and understand exactly what's going on.
And by NOT reducing it to a separate method I can also tune the boiler-plate code for cases where exceptions need to be thrown for differing conditions.

It's not information-hiding when you extract or refactor your code. It's only information-hiding when you start restricting access to your extension definition after refactoring.

"new" operator within a Class (except for the Constructor) should be Avoided at all costs. This is what you need to refactor here.

Related

How to match a enum value with some enum values using linq

I want to know what could be the shortest linq query instead of following if statement.
public enum ErrorMessage { Error1=1, Error2=2, Error3=3, Error4=4 }
ErrorMessage error = ErrorMessage.Error4;
if (error == ErrorMessage.Error1 || error == ErrorMessage.Error2)
{
//do something
}
Linq will make this code complicated,
code you provide is readable, fast and maintainable more than Linq will be
You could use
if (new [] {ErrorMessage.Error1, ErrorMessage.Error2}.Contains(error))
{
//do something
}
or
var bad_errors = new [] {ErrorMessage.Error1, ErrorMessage.Error2};
if (bad_errors.Contains(error))
{
//do something
}
if a single call to an extension method is LINQ enough for you.
I guess to most C# developers such a pattern seems strange (and it totally is), but if you're already working on a dynamically created list of errors you want to check against...
Otherwise, stick with if.
It actually works nicer in languages with less boilerplate, e.g. Python, where this pattern is commonly used and looks a lot nicer:
if error in (Error1, Error2):
# do something

Loop vs closure, readable vs concise?

In my answer to my own question here I posted some code and #Dave Newton was kind enough to provide me with a gist and show me the error in my not-so-Groovy ways. <-- Groovy pun
I took his advice and revamped my code to be Groovier. Since then the link I am making (which Dave represents with the replaceWith variable) has changed. Now the closure representation of what I want to do would look like this:
int i = 1
errorList = errorLinksFile.readLines().grep { it.contains "href" }.collect { line ->
def replaceWith = "<a href=\"${rooturl}${build.url}parsed_console/log_content.html#ERROR${i++}\">"
line.replaceAll(pattern, replaceWith).minus("</font>")
}
And the for loop representation of what I want to do would look like this:
def errorList = []
def i = 1
for(line in errorLinksFile.getText().split("\n")){
if(!line.contains("href")){
continue
}
errorList.add(line.replaceAll(pattern, "<a href=\"${rooturl}${build.url}parsed_console/log_content.html#ERROR${i++}\">").minus("</font>"))
}
The closure version is definitely more concise, but I'm worried if I always go the "Groovier" route the code might be harder for other programmers to understand than a simple for loop. So when is Groovier better and when should I opt for code that is likely to be understood by all programmers?
I believe that a development team should strive to be the best and coding to the least knowledgeable/experienced developer does not support this. It is important that more than one person on the team knows how to read the code that is developed though. So if you're the only one that can read it, teach someone else. If you're worried about someone new to the team being able to read it I feel that they would be equally hard to read since there would be lack of domain knowledge. What I would do though is break it up a little bit:
def originalMethod() {
//Do whatever happens before the given code
errorList = getModifiedErrorsFromFile(errorLinksFile)
}
def getModifiedErrorsFromFile(errorLinksFile) {
int i = 1
getHrefsFromFile(errorLinksFile).collect { line ->
def replaceWith = getReplacementTextForLine(i)
i++
line.replaceAll(pattern, replaceWith).minus("</font>")
}
}
def getHrefsFromFile(errorLinksFile) {
errorLinksFile.readLines().grep { it.contains "href" }
}
def getReplacementTextForLine(i) {
"<a href=\"${rooturl}${build.url}parsed_console/log_content.html#ERROR${i}\">"
}
This way if the next person doesn't immediately understand what is going on they should be able to infer what is going on based on the method names. If that doesn't work adding tests would help the next person understand what is going on.
My 2 cents. Good topic though!!!
Idiomatic groovy is good, people will learn the common idioms quickly. "Clever" groovy, in my opinion, is more likely to be just confusing.

Large application design (WPF/Silverlight)

Aside from the MVVM, as well as MVC patterns for the overall structure of a WPF app, how exactly do you break up the model/controller aspect of an app into subcomponents? The reason I ask is that I have no problem architecting the solution from the perspective of the patterns mentioned above, but when it comes to actually writing the backend; I feel that i'm fudging a lot of it. I end up with high quality apps from the user perspective, but my design asthetics don't allow me accept this.
To clarify; a lot of my business logic cannot be refactored into a class (or class hierarchy, with all associated interfaces) in any easy or meaningful way without having to change the entire app. I've been developing professionally for a year and a half now, so it may be an issue of inexperience; but I feel that it's still no excuse. Any pointers to this admittedly open ended question?
Edit: code request (in Silverlight)- The following is a -snippet- from a mousebuttonup handler in a drag-drop allocation application that's part of a much larger app-
I just really don't like how blunt the logic is, and hate the way that it's all completely unfactorable, since everything is getting stuffed into event handlers.
//determine if there is a previously existing allocated sale corresponding to this purchase's ID
SaleWS allocSaleExisting = colltoaddsale.FirstOrDefault(s => (s.p_TRADEID == allocPurch.TRADEID));
if (allocSaleExisting != null && allocSale.TRADEID == allocSaleExisting.TRADEID)
{
PurchaseWS allocPurchExisting = colltoadd.First(p => p.TRADEID == allocPurch.TRADEID);
//allocPurchExisting.AMOUNT += allocPurch.AMOUNT;
allocSaleExisting.AMOUNT += allocSale.AMOUNT;
allocPurchExisting.AMOUNT += allocSale.AMOUNT;
allocPurch.AMOUNT -= allocSale.AMOUNT;
colltoaddsale.Remove(allocSale);
//colltoadd.Remove(allocPurch);
}
else
{
//Create new "split" item in the data source for the source table
PurchaseWS splitAllocPurch = new PurchaseWS { COMMODITY = allocPurch.COMMODITY, CONTRACTNUMBER = allocPurch.CONTRACTNUMBER, AMOUNT = allocPurch.AMOUNT - allocSale.AMOUNT, FORM = allocPurch.FORM, GRADE = allocPurch.GRADE, LOCATION = allocPurch.LOCATION, SHIP_DATE = allocPurch.SHIP_DATE, TRADEID = allocPurch.TRADEID, UNITS = allocPurch.UNITS };
//update the source table's selecteditem datacontext with the target allocation id
allocPurch.s_TRADEID = allocSale.TRADEID;
allocSale.p_TRADEID = allocPurch.TRADEID;
allocPurch.AMOUNT = allocSale.AMOUNT;
colltoadd.Insert(colltoadd.IndexOf(allocPurch) + 1, splitAllocPurch);
}
}
Take a look at the Composite Application Guidance from the Patterns and Practices group.
It's geared specifically towards this, including using MVVM for WPF/Silverlight in large scale applications, and how to handle business logic concerns, etc.
You should also check Caliburn.

Flatten conditional as a refactoring

Consider:
if (something) {
// Code...
}
With CodeRush installed it recommended doing:
if (!something) {
return;
}
// Code...
Could someone explain how this is better? Surely there is no benefit what so ever.
Isolated, as you've presented it - no benefit. But mark4o is right on: it's less nesting, which becomes very clear if you look at even, say a 4-level nesting:
public void foo() {
if (a)
if (b)
if (c)
if (d)
doSomething();
}
versus
public void foo() {
if (!a)
return;
if (!b)
return;
if (!c)
return;
if (!d)
return;
doSomething();
}
early returns like this improve readability.
In some cases, it's cleaner to validate all of your inputs at the beginning of a method and just bail out if anything is not correct. You can have a series of single-level if checks that check successively more and more specific things until you're confident that your inputs are good. The rest of the method will then be much easier to write, and will tend to have fewer nested conditionals.
One less level of nesting.
This is a conventional refactoring meant for maintainability. See:
http://www.refactoring.com/catalog/replaceNestedConditionalWithGuardClauses.html
With one condition, it's not a big improvement. But it follows the "fail fast" principle, and you really start to notice the benefit when you have lots of conditions. If you grew up on "structured programming", which typically recommends functions have single exit points, it may seem unnatural, but if you've ever tried to debug code that has three levels or more of nested conditionals, you'll start to appreciate it.
It can be used to make the code more readable (by way of less nesting). See here for a good example, and here for a good discussion of the merits.
That sort of pattern is commonly used to replace:
void SomeMethod()
{
if (condition_1)
{
if (condition_2)
{
if (condition_3)
{
// code
}
}
}
}
With:
void SomeMethod()
{
if (!condition_1) { return; }
if (!condition_2) { return; }
if (!condition_3) { return; }
// code
}
Which is much easier on the eyes.
I don't think CodeRush is recommending it --- rather just offering it as an option.
IMO, it depends on if something or !something is the exceptional case. If there is a significant amount of code if something happens, then using the !something conditional makes more sense for legibility and potential nesting reduction.
Well, look at it this way (I'll use php as an example):
You fill a form and go to this page: validate.php
example 1:
<?php
if (valid_data($_POST['username'])) {
if (valid_data($_POST['password'])) {
login();
} else {
die();
}
} else {
die();
}
?>
vs
<?php
if (!valid_data($_POST['username'])) {
die();
}
if (!valid_data($_POST['password'])) {
die();
}
login();
?>
Which one is better and easier to maintain? Remember this is just validating two things. Imagine this for a register page or something else.
I remember very clearly losing marks on a piece of college work because I had gone with the
if (!something) {
return;
}
// Code...
format. My lecturer pontificated that it was bad practice to have more than one exit point in a function. I thought that was nuts and 20+ years of computer programming later, I still do.
To be fair, he lived in an era where the lingua franca was C and functions were often pages long and full of nested conditionals making it difficult to track what was going on.
Then and now, however, simplicity is king: Keeping functions small and commenting them well is the best way to make things readable and maintainable.

What's so great about Func<> delegate?

Sorry if this is basic but I was trying to pick up on .Net 3.5.
Question: Is there anything great about Func<> and it's 5 overloads? From the looks of it, I can still create a similar delgate on my own say, MyFunc<> with the exact 5 overloads and even more.
eg: public delegate TResult MyFunc<TResult>() and a combo of various overloads...
The thought came up as I was trying to understand Func<> delegates and hit upon the following scenario:
Func<int,int> myDelegate = (y) => IsComposite(10);
This implies a delegate with one parameter of type int and a return type of type int. There are five variations (if you look at the overloads through intellisense). So I am guessing that we can have a delegate with no return type?
So am I justified in saying that Func<> is nothing great and just an example in the .Net framework that we can use and if needed, create custom "func<>" delegates to suit our own needs?
Thanks,
The greatness lies in establishing shared language for better communication.
Instead of defining your own delegate types for the same thing (delegate explosion), use the ones provided by the framework. Anyone reading your code instantly grasps what you are trying to accomplish.. minimizes the time to 'what is this piece of code actually doing?'
So as soon as I see a
Action = some method that just does something and returns no output
Comparison = some method that compares two objects of the same type and returns an int to indicate order
Converter = transforms Obj A into equivalent Obj B
EventHandler = response/handler to an event raised by some object given some input in the form of an event argument
Func = some method that takes some parameters, computes something and returns a result
Predicate = evaluate input object against some criteria and return pass/fail status as bool
I don't have to dig deeper than that unless it is my immediate area of concern. So if you feel the delegate you need fits one of these needs, use them before rolling your own.
Disclaimer: Personally I like this move by the language designers.
Counter-argument : Sometimes defining your delegate may help communicate intent better. e.g. System.Threading.ThreadStart over System.Action. So it’s a judgment call in the end.
The Func family of delegates (and their return-type-less cousins, Action) are not any greater than anything else you'd find in the .NET framework. They're just there for re-use so you don't have to redefine them. They have type parameters to keep things generic. E.g., a Func<T0,bool> is the same as a System.Predicate<T> delegate. They were originally designed for LINQ.
You should be able to just use the built-in Func delegate for any value-returning method that accepts up to 4 arguments instead of defining your own delegate for such a purpose unless you want the name to reflect your intention, which is cool.
Cases where you would absolutely need to define your delegate types include methods that accept more than 4 arguments, methods with out, ref, or params parameters, or recursive method signatures (e.g., delegate Foo Foo(Foo f)).
In addition to Marxidad's correct answer:
It's worth being aware of Func's related family, the Action delegates. Again, these are types overloaded by the number of type parameters, but declared to return void.
If you want to use Func/Action in a .NET 2.0 project but with a simple route to upgrading later on, you can cut and paste the declarations from my version comparison page. If you declare them in the System namespace then you'll be able to upgrade just by removing the declarations later - but then you won't be able to (easily) build the same code in .NET 3.5 without removing the declarations.
Decoupling dependencies and unholy tie-ups is one singular thing that makes it great. Everything else one can debate and claim to be doable in some home-grown way.
I've been refactoring slightly more complex system with an old and heavy lib and got blocked on not being able to break compile time dependency - because of the named delegate lurking on "the other side". All assembly loading and reflection didn't help - compiler would refuse to just cast a delegate() {...} to object and whatever you do to pacify it would fail on the other side.
Delegate type comparison which is structural at compile time turns nominal after that (loading, invoking). That may seem OK while you are thinking in terms of "my darling lib is going to be used forever and by everyone" but it doesn't scale to even slightly more complex systems. Fun<> templates bring a degree of structural equivalence back into the world of nominal typing . That's the aspect you can't achieve by rolling out your own.
Example - converting:
class Session (
public delegate string CleanBody(); // tying you up and you don't see it :-)
public static void Execute(string name, string q, CleanBody body) ...
to:
public static void Execute(string name, string q, Func<string> body)
Allows completely independent code to do reflection invocation like:
Type type = Type.GetType("Bla.Session, FooSessionDll", true);
MethodInfo methodInfo = type.GetMethod("Execute");
Func<string> d = delegate() { .....} // see Ma - no tie-ups :-)
Object [] params = { "foo", "bar", d};
methodInfo.Invoke("Trial Execution :-)", params);
Existing code doesn't notice the difference, new code doesn't get dependence - peace on Earth :-)
One thing I like about delegates is that they let me declare methods within methods like so, this is handy when you want to reuse a piece of code but you only need it within that method. Since the purpose here is to limit the scope as much as possible Func<> comes in handy.
For example:
string FormatName(string pFirstName, string pLastName) {
Func<string, string> MakeFirstUpper = (pText) => {
return pText.Substring(0,1).ToUpper() + pText.Substring(1);
};
return MakeFirstUpper(pFirstName) + " " + MakeFirstUpper(pLastName);
}
It's even easier and more handy when you can use inference, which you can if you create a helper function like so:
Func<T, TReturn> Lambda<T, TReturn>(Func<T, TReturn> pFunc) {
return pFunc;
}
Now I can rewrite my function without the Func<>:
string FormatName(string pFirstName, string pLastName) {
var MakeFirstUpper = Lambda((string pText) => {
return pText.Substring(0,1).ToUpper() + pText.Substring(1);
});
return MakeFirstUpper(pFirstName) + " " + MakeFirstUpper(pLastName);
}
Here's the code to test the method:
Console.WriteLine(FormatName("luis", "perez"));
Though it is an old thread I had to add that func<> and action<> also help us use covariance and contra variance.
http://msdn.microsoft.com/en-us/library/dd465122.aspx

Resources