I'm currently refactoring some legacy code and keep running into the same situation. I have a very large function and want to apply Extract Method to several of its blocks. Often the block I want to extract is wrapped entirely in a conditional. From a style / best-practice standpoint, does it make sense to leave the conditional in the caller, wrapping the new method call, or to move the conditional into the new method along with the code? Can you explain the pros / cons?
i.e. do you prefer
A)
Big_function() {
<...>
If (conditional) {
Small_function();
}
<...>
}
Small_function() {
Code
}
Or B)
Big_function() {
<...>
Small_function();
<...>
}
Small_function() {
If (conditional) {
Code
}
}
Maybe this is trivial, but since I see it a lot, thought I would ask. Let's assume small_function would not be reused in either form, simply used to extract this block.
thanks.
It depends on the context. For each instance, if you have a gut feeling either way, you're probably right. Just make sure the name of the inner function matches the implied if check, or lack thereof.
With that concern out of the way, the best thing to do is to put the if inside the inner function, because there is one big complicated outer function and many simple inner functions. One extra if check inside several small functions will not add much complexity to them individually, whereas adding lots of extra branches into the outer function will meaningfully affect its complexity and readability.
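As a rough illustration of that arrangement, here is a minimal Java sketch. The names (Order, Checkout, applyDiscountIfEligible and so on) are hypothetical and only stand in for the legacy code in the question. Each helper carries its own guard, and its name advertises that the check happens inside.

```java
// Hypothetical example: all names here are illustrative, not from the question.
class Order {
    int total;
    boolean loyaltyMember;

    Order(int total, boolean loyaltyMember) {
        this.total = total;
        this.loyaltyMember = loyaltyMember;
    }
}

class Checkout {
    // The outer function stays a flat sequence of steps with no branches of its own.
    static int process(Order order) {
        applyDiscountIfEligible(order);
        addShippingIfBelowThreshold(order);
        return order.total;
    }

    // The "IfEligible" suffix tells the caller that the check lives in here.
    static void applyDiscountIfEligible(Order order) {
        if (!order.loyaltyMember) {
            return; // guard clause: nothing to do
        }
        order.total -= order.total / 10;
    }

    static void addShippingIfBelowThreshold(Order order) {
        if (order.total >= 100) {
            return; // guard clause: no shipping fee at or above the threshold
        }
        order.total += 5;
    }
}
```

If a helper were named applyDiscount instead, a reader would reasonably expect it to act unconditionally, which is why the naming matters more than where the if sits.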
Lastly, there is a slight perf tradeoff to putting the check inside the inner function vs outside. You can sometimes incur additional costs of calling and returning from the inner function when the if evaluates to false, but on the other hand your outer code can be slightly smaller (fewer instructions). For this to make any meaningful impact either way, the outer function has to be VERY hot, and you have to be writing in a low-level language.
Related
Many times I've come across a situation where I am calling a method many times with one parameter always exactly the same, and I need to add some additional parameter defined in my main method. I'm never sure what to do about the additional parameter. It seems like it might be a better idea to make the parameter static to make the code cleaner, but then, static variables are not very good. Still, adding parameters can in some cases lead to very long lists of parameters that are in a way unnecessary, and I imagine that there will be a speed disadvantage as well when your function is short, called often, and has a lot of parameters.
Here's the most recent code (Kotlin) that has had me thinking about this problem, but I've run into the problem a lot in different languages.
tailrec fun getAncestor(ind: Int, parent: IntArray): Int {
if (parent[ind] == ind) return ind else return getAncestor(parent[ind], parent);
}
Is it better to make parent a static variable and not have it as a parameter to the method? (Assume that every time that this method is called, the second parameter will be the same. I'm usually only writing short 100 line code files for competitive programming so there is no chance the method will be reused with a different second parameter)
In general if you're working on a project, you'll want to stay away from static variables because when the project starts scaling up and getting bigger, it'll be very hard to keep track of the static variables and this will make it very hard to debug.
In competitive programming, I suggest going for the quicker approach.
In my opinion, it depends on your preferred coding paradigm. For instance, if you'd like your code to conform to the functional paradigm, it is fine as it is now (regarding the parameter). However, if you don't care about functional programming, extracting that parameter out as a constant is all right. I see no reason why it should be bad. If you are worried about performance, you could declare it as lazy, but otherwise it'd be fine, I guess.
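One possible middle ground between a static variable and a repeated parameter is to keep the array in an object field. The sketch below is in Java rather than Kotlin, AncestorFinder is a hypothetical name, and the lookup is written as a loop instead of tail recursion; treat it as an illustration of the idea, not as a recommendation from the answers above.

```java
// Sketch in Java rather than Kotlin; AncestorFinder is a hypothetical name.
class AncestorFinder {
    // Shared state lives in one instance field instead of a static variable
    // or a parameter repeated at every call site.
    private final int[] parent;

    AncestorFinder(int[] parent) {
        this.parent = parent;
    }

    // Iterative form of the recursive lookup: follow parent links to the root.
    int getAncestor(int ind) {
        while (parent[ind] != ind) {
            ind = parent[ind];
        }
        return ind;
    }
}
```

Call sites then read finder.getAncestor(i), and the array is still passed explicitly exactly once, in the constructor.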
How do you do "inline functions" in C#? I don't think I understand the concept. Are they like anonymous methods? Like lambda functions?
Note: The answers almost entirely deal with the ability to inline functions, i.e. "a manual or compiler optimization that replaces a function call site with the body of the callee." If you are interested in anonymous (a.k.a. lambda) functions, see @jalf's answer or What is this 'Lambda' everyone keeps speaking of?.
Finally in .NET 4.5, the CLR allows one to hint/suggest [1] method inlining using the MethodImplOptions.AggressiveInlining value. It is also available in Mono's trunk (committed today).
// The full attribute usage is in mscorlib.dll,
// so should not need to include extra references
using System.Runtime.CompilerServices;
...
[MethodImpl(MethodImplOptions.AggressiveInlining)]
void MyMethod(...)
1. Previously "force" was used here. I'll try to clarify the term. As stated in the comments and the documentation, the method should be inlined if possible. Especially considering Mono (which is open), there are some Mono-specific technical limitations on inlining, as well as more general ones (like virtual functions). Overall, yes, this is a hint to the compiler, but I guess that is what was asked for.
Inline methods are simply a compiler optimization where the code of a function is rolled into the caller.
There's no mechanism by which to do this in C#, and they're to be used sparingly in languages where they are supported -- if you don't know why they should be used somewhere, they shouldn't be.
Edit: To clarify, there are two major reasons they need to be used sparingly:
It's easy to make massive binaries by using inline in cases where it's not necessary
The compiler tends to know better than you do when something should, from a performance standpoint, be inlined
It's best to leave things alone and let the compiler do its work, then profile and figure out if inline is the best solution for you. Of course, some things just make sense to be inlined (mathematical operators particularly), but letting the compiler handle it is typically the best practice.
Update: Per konrad.kruczynski's answer, the following is true for versions of .NET up to and including 4.0.
You can use the MethodImplAttribute class to prevent a method from being inlined...
[MethodImpl(MethodImplOptions.NoInlining)]
void SomeMethod()
{
// ...
}
...but there is no way to do the opposite and force it to be inlined.
You're mixing up two separate concepts. Function inlining is a compiler optimization which has no impact on the semantics. A function behaves the same whether it's inlined or not.
On the other hand, lambda functions are purely a semantic concept. There is no requirement on how they should be implemented or executed, as long as they follow the behavior set out in the language spec. They can be inlined if the JIT compiler feels like it, or not if it doesn't.
There is no inline keyword in C#, because it's an optimization that can usually be left to the compiler, especially in JIT'ed languages. The JIT compiler has access to runtime statistics which enables it to decide what to inline much more efficiently than you can when writing the code. A function will be inlined if the compiler decides to, and there's nothing you can do about it either way. :)
Cody has it right, but I want to provide an example of what an inline function is.
Let's say you have this code:
private void OutputItem(string x)
{
Console.WriteLine(x);
//maybe encapsulate additional logic to decide
// whether to also write the message to Trace or a log file
}
public IList<string> BuildListAndOutput(IEnumerable<string> x)
{ // let's pretend IEnumerable<T>.ToList() doesn't exist for the moment
IList<string> result = new List<string>();
foreach(string y in x)
{
result.Add(y);
OutputItem(y);
}
return result;
}
The Just-In-Time optimizer could choose to alter the code to avoid repeatedly placing a call to OutputItem() on the stack, so that it would be as if you had written the code like this instead:
public IList<string> BuildListAndOutput(IEnumerable<string> x)
{
IList<string> result = new List<string>();
foreach(string y in x)
{
result.Add(y);
// full OutputItem() implementation is placed here
Console.WriteLine(y);
}
return result;
}
In this case, we would say the OutputItem() function was inlined. Note that it might do this even if the OutputItem() is called from other places as well.
Edited to show a scenario more-likely to be inlined.
Do you mean inline functions in the C++ sense? In which the contents of a normal function are automatically copied inline into the callsite? The end effect being that no function call actually happens when calling a function.
Example:
inline int Add(int left, int right) { return left + right; }
If so then no, there is no C# equivalent to this.
Or do you mean functions that are declared within another function? If so then yes, C# supports this via anonymous methods or lambda expressions.
Example:
static void Example() {
Func<int,int,int> add = (x,y) => x + y;
var result = add(4,6); // 10
}
Yes, exactly; the only distinction is the fact that it returns a value.
Simplification (not using expressions):
List<T>.ForEach takes an action; it doesn't expect a return result.
So an Action<T> delegate would suffice, say:
List<T>.ForEach(param => Console.WriteLine(param));
is the same as saying:
List<T>.ForEach(delegate(T param) { Console.WriteLine(param); });
The difference is that the param type and delegate declaration are inferred from usage, and the braces aren't required on a simple inline method.
Whereas
List<T>.Where takes a function, expecting a result.
So a Func<T, bool> would be expected:
List<T>.Where(param => param.Value == SomeExpectedComparison);
which is the same as:
List<T>.Where(delegate(T param) { return param.Value == SomeExpectedComparison; });
You can also declare these methods inline and assign them to variables, i.e.:
Action myAction = () => Console.WriteLine("I'm doing something Nifty!");
myAction();
or
Func<object, string> myFunction = theObject => theObject.ToString();
string myString = myFunction(someObject);
I hope this helps.
The statement "it's best to leave these things alone and let the compiler do the work..." (Cody Brocious) is complete rubbish. I have been programming high-performance game code for 20 years, and I have yet to come across a compiler that is 'smart enough' to know which code (functions) should be inlined and which should not. It would be useful to have an "inline" statement in C#; the truth is that the compiler just doesn't have all the information it needs to determine which functions should always be inlined without the "inline" hint. Sure, if the function is small (an accessor) then it might be inlined automatically, but what if it is a few lines of code? Nonsense; the compiler has no way of knowing, and you can't just leave optimized code (beyond algorithms) up to the compiler.
There are occasions where I do wish to force code to be in-lined.
For example, suppose I have a complex routine where a large number of decisions are made within a highly iterative block, and those decisions result in similar but slightly differing actions being carried out. Consider, for example, a complex (non-DB-driven) sort comparer where the sorting algorithm sorts the elements according to a number of different unrelated criteria, such as one might do when sorting words according to grammatical as well as semantic criteria for a fast language recognition system. I would tend to write helper functions to handle those actions in order to maintain the readability and modularity of the source code.
I know that those helper functions should be inlined because that is the way the code would be written if it never had to be understood by a human. I would certainly want to ensure in this case that there was no function-call overhead.
I know this question is about C#. However, you can write inline functions in .NET with F#. see: Use of `inline` in F#
No, there is no such construct in C#, but the .NET JIT compiler could decide to inline function calls at JIT time. But I actually don't know whether it really performs such optimizations.
(I think it should :-))
In case your assemblies will be ngen-ed, you might want to take a look at TargetedPatchingOptOut. This will help ngen decide whether to inline methods. MSDN reference
It is still only a declarative hint to optimize though, not an imperative command.
Lambda expressions are inline functions! I think that C# doesn't have an extra attribute like inline or something like that!
I hope that this is the right place to ask (if not, please tell me). I am currently trying to create a game while following Robert C. Martin's book "Clean Code" in an effort to improve the readability of my code. I am not totally happy with how many of my functions work, however, as more often than not I need to check various variables before I execute a command. For example:
private void checkScoreAndIncreaseDifficulty() {
if (eater.getScore() % 400 == 0) {
world.increaseDifficulty();
}
}
According to the book I am following, functions should only do one thing, but when an "if" statement is involved the function's purpose (to me) seems to naturally increase. The name of the above function is checkScoreAndIncreaseDifficulty, which is quite clearly two things, but I cannot think of how to reduce it. I feel the solution is very simple but it just is not coming to me. Any advice would be appreciated.
Names of functions are supposed to describe what the function does, not how it achieves it. So naming it checkScoreAndIncreaseDifficulty is wrong, in my opinion. It should be something similar to IncreaseDifficulty. Moreover, if you feel that the names of your code elements do not give enough insight into what a function / class / variable does, you can always comment your code. :)
Also, the "correct" place to write your conditional statement depends entirely on your application : if difficulty will increase only if the user satisfies some condition, then that condition should be included in the function that increases difficulty. If the difficulty might be increased by other means, then the function should only execute actions that increase difficulty and leave the decision whether to increase difficulty or not to the caller.
I don't see any way to make this method cleaner than it is. It checks the score, and then hands off the increaseDifficulty to another method. What more can you ask?
I think there's also a risk in trying to make methods as minimal as they can possibly be. At some point, you need to dig through a dozen methods and functions spread through your entire codebase in order to understand what a single method or function does. That's no good either.
As far as I'm concerned, your code hits the sweet spot. The cumbersome name is indeed an issue, but not a serious one, and easy enough to fix, if you really want to.
Why don't you use two separate functions (you could also do it with only the first function), like this:
private boolean isScoreEnough() {
return eater.getScore() % 400 == 0;
}
private void increaseDifficulty(){
world.increaseDifficulty();
}
and in your program:
if(isScoreEnough())
increaseDifficulty();
You are setting difficulty level in the function. Depending on score you increase/decrease the difficulty level. That is perfectly fine - as long as the function does the job of determining and setting difficulty level only. Function name has scope for improvement:
private void setDifficultyLevel() {
if (eater.getScore()%400 == 0) {
world.increaseDifficulty();
}
}
Obviously you won't find much code without any if statement, programs have to make decisions based on state, input or whatever. Avoiding if statements at all is also not the key message of "Clean Code". What you should avoid are for example endless if-else constructs if one class does the work of three logical cases - you should have three classes then.
Your method is already pretty short, but if you want to apply one of Martin's principles (a good name spares comments etc.) even further you could use something like:
if (scoreRequiresIncreaseInDifficulty()) {
increaseDifficulty();
}
and then implement the two methods. However, I don't think this makes much sense in your case unless you will need the same calls multiple times and want to maintain the flexibility to change when the score requires a more difficult game in a single place.
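For completeness, a minimal sketch of what implementing the two methods could look like. DemoEater and DemoWorld are hypothetical stand-ins for the question's eater and world objects (the original does not show them), and DifficultyRules is an assumed owner class.

```java
// Hypothetical stand-ins for the question's eater and world objects.
class DemoEater {
    private int score;
    int getScore() { return score; }
    void addScore(int points) { score += points; }
}

class DemoWorld {
    private int difficulty;
    int getDifficulty() { return difficulty; }
    void increaseDifficulty() { difficulty++; }
}

class DifficultyRules {
    private final DemoEater eater;
    private final DemoWorld world;

    DifficultyRules(DemoEater eater, DemoWorld world) {
        this.eater = eater;
        this.world = world;
    }

    // Reads as a sentence: if the score requires it, increase the difficulty.
    void updateDifficulty() {
        if (scoreRequiresIncreaseInDifficulty()) {
            increaseDifficulty();
        }
    }

    // The rule for "when" lives in exactly one place. It is the same rule as
    // in the question; note that a score of 0 also satisfies it.
    private boolean scoreRequiresIncreaseInDifficulty() {
        return eater.getScore() % 400 == 0;
    }

    private void increaseDifficulty() {
        world.increaseDifficulty();
    }
}
```

If the threshold later changes from "every 400 points" to something else, only scoreRequiresIncreaseInDifficulty needs to be touched.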
You want to increase difficulty every 400 score points gathered by the eater, right?
So, you would need the eater to allow subscribing for score changes:
interface Eater {
void addScoreChangedListener(ScoreChangedListener listener);
}
interface ScoreChangedListener {
public void onScoreChanged(Score previousScore, Score newScore);
}
The implementation of Eater would hold a list of all score change listeners and invoke the onScoreChanged on each of them when the score changes. See the Observer pattern.
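A minimal sketch of such an implementation, under a few assumptions: the Score class, the SimpleEater name and the addPoints method are not shown in the original and are invented here for illustration. The two interfaces are repeated only so the example compiles on its own.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal Score type assumed for this sketch; the original does not show it.
final class Score {
    private final int value;
    Score(int value) { this.value = value; }
    int value() { return value; }
}

interface ScoreChangedListener {
    void onScoreChanged(Score previousScore, Score newScore);
}

interface Eater {
    void addScoreChangedListener(ScoreChangedListener listener);
}

// Hypothetical implementation: keeps the listeners in a list and notifies
// each one, in registration order, whenever the score changes.
class SimpleEater implements Eater {
    private final List<ScoreChangedListener> listeners = new ArrayList<>();
    private Score score = new Score(0);

    @Override
    public void addScoreChangedListener(ScoreChangedListener listener) {
        listeners.add(listener);
    }

    // addPoints is an assumed entry point for score changes.
    void addPoints(int points) {
        Score previous = score;
        score = new Score(previous.value() + points);
        for (ScoreChangedListener listener : listeners) {
            listener.onScoreChanged(previous, score);
        }
    }
}
```

The eater knows nothing about difficulty here; it only reports that the score changed, and each listener decides what to do with that.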
And then when bootstrapping your game:
Eater eater = ....
eater.addScoreChangedListener(new DifficultyAdjustScoreChangeListener());
And you are done!
The implementation of DifficultyAdjustScoreChangeListener still needs to have the if, but that's OK:
class DifficultyAdjustScoreChangeListener implements ScoreChangedListener {
public void onScoreChanged(Score previousScore, Score newScore) {
if (newScore.value() % 400 == 0) {
world.increaseDifficulty();
}
}
}
The Problem
I'm writing an application where performance is very important. (Specifically: A raytracer)
I want to have an option for something called "Adaptive Supersampling" in my program. It's fairly simple to implement but I want to have an option to turn it on or off.
Unfortunately I only see two options, which are:
1. Put an if/else around two separate procedures (which are fairly similar), like so:
void renderLoop() {
if(adaptive) {
doAdaptiveLoop();
} else {
doNormalLoop();
}
}
2. Scatter specialized if/else statements around the code:
void renderLoop() {
if(adaptive) something();
else somethingElse();
for(int i = 0; i < LOOP_1; i++) {
if(adaptive) something1();
else somethingElse1();
for(int j = 0; j < LOOP_2; j++) {
if(adaptive) something2();
else somethingElse2();
}
}
//... So on and so forth
}
Both of these methods are fairly terrible. Method 1 repeats a lot of code, and Method 2 is both messy and inefficient. Both of them make it very difficult to add many features. If I want to add more features, the number of loop variants grows exponentially with Method 1 (one variant per combination of options), and the code gets very confusing with Method 2.
The Question
I've run across this problem many times in performance dependent applications, but I've never really been able to solve it.
The question is: How do I add toggleable features to my program which is centered around many loops, whose performance will decrease with many boolean tests, and whose code will become messy with those tests?
I also run into this problem on a regular basis, and I usually have to choose between options 1 and 2. If the number of options keeps growing, I usually fall back on the swiss-army-knife method: code generation.
In other words, I write program A to write functions B1, B2, etc. Program A takes the options as arguments and generates the appropriate function Bi as a text file, and then the final program includes all the Bi.
It's not pretty, but pretty is overrated.
The advantage is, the common code among all the Bi is only stated in one place, so if you make a change to it, you only have to do it in one place, so you have fewer opportunities to get it wrong.
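To make the idea concrete, here is a very small sketch of a "program A" in Java. RenderLoopGenerator is a hypothetical name, the emitted text reuses the placeholder names from the question (LOOP_1, something1, somethingElse1), and a real generator would write the result to a file that the final program includes.

```java
// Hypothetical generator ("program A"): emits the source text of one
// render-loop variant ("function Bi") for a given option value.
class RenderLoopGenerator {
    static String generate(String functionName, boolean adaptive) {
        StringBuilder src = new StringBuilder();
        src.append("void ").append(functionName).append("() {\n");
        // Shared code is written once here and appears in every variant.
        src.append("    for (int i = 0; i < LOOP_1; i++) {\n");
        // The option is resolved at generation time, so the emitted
        // loop body contains no run-time test of it.
        src.append(adaptive ? "        something1();\n" : "        somethingElse1();\n");
        src.append("    }\n");
        src.append("}\n");
        return src.toString();
    }
}
```

Calling generate("doAdaptiveLoop", true) and generate("doNormalLoop", false) yields two specialized functions from a single description of the loop.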
I think what you are searching for are function pointers or something analogous.
Let me elaborate: you specifically asked about performance, and I agree that chasing pointers might get in your way. BUT since you now seem to have a large number of branches depending on settings somewhere inside your application, you might as well wrap that code up in some function pointers. That does not mean you should call a function through a pointer when just adding two vectors.
On a side note, this is also relevant when parallelizing code: the work items must not be so large that they undermine the parallelization effort, but they need to be large enough to be processed efficiently.
Maybe use a global function pointer variable to hold the loop function your options tell you to use.
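A rough Java sketch of the same idea follows. Java has no raw function pointers, so a functional-interface reference stands in for one; PixelSampler, Renderer and the sampling rules are all hypothetical. The option is tested once, when the renderer is built, and the loops themselves contain no if (adaptive) tests.

```java
// Java has no raw function pointers; a functional interface reference
// plays the same role in this sketch. All names are hypothetical.
interface PixelSampler {
    int samplesFor(int x, int y);
}

class Renderer {
    private final PixelSampler sampler;

    // The option is tested once, here, rather than inside the loops.
    Renderer(boolean adaptive) {
        this.sampler = adaptive ? Renderer::adaptiveSamples : Renderer::normalSamples;
    }

    static int normalSamples(int x, int y) {
        return 1;
    }

    // Placeholder rule standing in for real adaptive supersampling.
    static int adaptiveSamples(int x, int y) {
        return (x + y) % 2 == 0 ? 4 : 1;
    }

    // Returns the total number of samples taken over the image.
    int renderLoop(int width, int height) {
        int totalSamples = 0;
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                totalSamples += sampler.samplesFor(x, y);
            }
        }
        return totalSamples;
    }
}
```

Here the indirection is taken once per pixel. Whether that granularity is coarse enough for a given raytracer is something to measure rather than assume.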
Which of these scenarios would be faster?
Scenario 1:
foreach (var file in directory.GetFiles())
{
if (file.Extension.ToLower() != ".txt" &&
file.Extension.ToLower() != ".bin")
continue;
// Do something cool.
}
Scenario 2:
var files = from file in directory.GetFiles()
where file.Extension.ToLower() == ".txt" ||
file.Extension.ToLower() == ".bin"
select file;
foreach (var file in files)
{
// Do something cool.
}
I know that they are logically the same because of delayed execution, but which would be the faster? And why?
Faster isn't usually the issue per se, especially in a scenario like this where there is not going to be a meaningful performance difference (and in general, if the code is not a bottleneck it just doesn't matter). The issue is which is more readable and more clearly expresses the intent of the code.
I think the second block of code more clearly expresses the intent of the code. It reads as "query a collection of file names for some file names with a certain property" and then "for each of those file names with that property, do something." It declares what is happening, rather than how it is going to happen. Separating the what from the mechanism is what makes the second block of code clearer and where LINQ really shines. Use LINQ to declare the what, and let LINQ implement the mechanism instead of in the past where the what would be muddled with the mechanism.
Is LINQ faster or just more convenient?
So, to answer the question in your title, LINQ usually does not materially hinder performance but it makes code more clear by allowing the coder to declare what they want done instead of having to focus on how they want something done. At the end of the day, we don't care about the how, we care about the what.
I know that they are logically the same because of delayed execution, but which would be the faster?
Probably the imperative version because there is a tiny amount of overhead in using LINQ. But if you really must know which is faster be sure to use a profiler, and be sure to test on real-world data.
And why?
Because LINQ adds a little bit of overhead. But the trade off is significantly clearer and more maintainable code. That is a huge win compared to the usually irrelevant performance loss.
It would be faster to do a GetFiles("*.txt") and a GetFiles("*.bin") if the directory contains lots of files or is on a network drive.
Compared to that the extra overhead for LINQ is just noise.
Linq isn't faster and it's not really about convenience. Rather, Linq pulls the higher-order functions Fold, Map, and Filter into .NET (with different names). These functions are valuable because they allow us to DRY-up our code. Every time you set up an iteration with a secondary collection or result, you open yourself up to a bug. Linq allows you to focus on what happens inside the iteration and feel fairly confident that the iteration mechanics are bug-free.
This doesn't mean that Linq is strictly slower than manual iteration. As others have mentioned, you'll have to benchmark case-by-case.
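The same contrast can be sketched outside C#. The example below is a Java analogue using the Stream API in place of LINQ; FileFilters is a hypothetical name and plain file-name strings stand in for the file objects in the question. It only illustrates the "filter as a higher-order function" point and says nothing about relative speed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// Java analogue of the two C# scenarios; FileFilters is a hypothetical name
// and plain file-name strings stand in for file objects.
class FileFilters {
    static boolean isWanted(String name) {
        String lower = name.toLowerCase();
        return lower.endsWith(".txt") || lower.endsWith(".bin");
    }

    // Scenario 1: the filter is interleaved with the iteration mechanics.
    static List<String> imperative(List<String> names) {
        List<String> result = new ArrayList<>();
        for (String name : names) {
            if (!isWanted(name)) {
                continue;
            }
            result.add(name);
        }
        return result;
    }

    // Scenario 2: the filter is declared; the library drives the iteration.
    static List<String> declarative(List<String> names) {
        return names.stream().filter(FileFilters::isWanted).collect(Collectors.toList());
    }
}
```

Both methods return the same elements in the same order; the second simply has no loop variable or secondary collection to get wrong.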
I wrote an article on Code Project that benchmarked LINQ against stored procedures, as well as compiled LINQ.
Please take a look.
http://www.codeproject.com/KB/cs/linqsql2.aspx
I understand you are looking at local file parsing, the article will give you some idea of what is involved and what linq is doing behind the scenes.