Early return statements and cyclomatic complexity - cyclomatic-complexity

I prefer this writing style with early returns:
public static Type classify(int a, int b, int c) {
if (!isTriangle(a, b, c)) {
return Type.INVALID;
}
if (a == b && b == c) {
return Type.EQUILATERAL;
}
if (b == c || a == b || c == a) {
return Type.ISOSCELES;
}
return Type.SCALENE;
}
Unfortunately, every return statement increases the cyclomatic complexity metric calculated by Sonar. Consider this alternative:
public static Type classify(int a, int b, int c) {
final Type result;
if (!isTriangle(a, b, c)) {
result = Type.INVALID;
} else if (a == b && b == c) {
result = Type.EQUILATERAL;
} else if (b == c || a == b || c == a) {
result = Type.ISOSCELES;
} else {
result = Type.SCALENE;
}
return result;
}
The cyclomatic complexity of this latter approach reported by Sonar is lower than the first, by 3. I have been told that this might be the result of a wrong implementation of the CC metrics. Or is Sonar correct, and this is really better? These related questions seem to disagree with that:
https://softwareengineering.stackexchange.com/questions/118703/where-did-the-notion-of-one-return-only-come-from
https://softwareengineering.stackexchange.com/questions/18454/should-i-return-from-a-function-early-or-use-an-if-statement
If I add support for a few more triangle types, the return statements will add up to make a significant difference in the metric and cause a Sonar violation. I don't want to stick a // NOSONAR on the method, as that might mask other problems by new features/bugs added to the method in the future. So I use the second version, even though I don't really like it. Is there a better way to handle the situation?

Your question relates to https://jira.codehaus.org/browse/SONAR-4857. For the time being all SonarQube analysers are mixing the cyclomatic complexity and essential complexity. From a theoretical point of view return statement should not increment the cc and this change is going to happen in the SQ ecosystem.

Not really an answer, but way too long for a comment.
This SONAR rule seems to be thoroughly broken. You could rewrite
b == c || a == b || c == a
as
b == c | a == b | c == a
and gain two points in this strange game (and maybe even some speed as branching is expensive; but this is on the discretion of the JITc, anyway).
The old rule claims, that the cyclomatic complexity is related to the number of tests. The new one doesn't, and that's a good thing as obviously the number of meaningfull tests for your both snippets is exactly the same.
Is there a better way to handle the situation?
Actually, I do have an answer: For each early return use | instead of || once. :D
Now seriously: There is a bug requesting annotations allowing to disable a single rule, which is marked as fixed. I din't look any further.

Since the question is also about early return statements as a coding style, it would be helpful to consider the effect of size on the return style. If the method or function is small, less than say 30 lines, early returns are no problem, because anyone reading the code can see the whole method at a glance including all of the returns. In larger methods or functions, an early return can be a trap unintentionally set for the reader. If the early return occurs above the code the reader is looking at, and the reader doesn't know the return is above or forgets that it is above, the reader will misunderstand the code. Production code can be too big to fit on one screen.
So whoever is managing a code base for complexity should be allowing for method size in cases where the complexity appears to be problem. If the code takes more than one screen, a more pedantic return style may be justified. If the method or function is small, don't worry about it.
(I use Sonar and have experienced this same issue.)

Related

Why can't dead code detection be fully solved by a compiler?

The compilers I've been using in C or Java have dead code prevention (warning when a line won't ever be executed). My professor says that this problem can never be fully solved by compilers though. I was wondering why that is. I am not too familiar with the actual coding of compilers as this is a theory-based class. But I was wondering what they check (such as possible input strings vs acceptable inputs, etc.), and why that is insufficient.
The dead code problem is related to the Halting problem.
Alan Turing proved that it is impossible to write a general algorithm that will be given a program and be able to decide whether that program halts for all inputs. You may be able to write such an algorithm for specific types of programs, but not for all programs.
How does this relate to dead code?
The Halting problem is reducible to the problem of finding dead code. That is, if you find an algorithm that can detect dead code in any program, then you can use that algorithm to test whether any program will halt. Since that has been proven to be impossible, it follows that writing an algorithm for dead code is impossible as well.
How do you transfer an algorithm for dead code into an algorithm for the Halting problem?
Simple: you add a line of code after the end of the program you want to check for halt. If your dead-code detector detects that this line is dead, then you know that the program does not halt. If it doesn't, then you know that your program halts (gets to the last line, and then to your added line of code).
Compilers usually check for things that can be proven at compile-time to be dead. For example, blocks that are dependent on conditions that can be determined to be false at compile time. Or any statement after a return (within the same scope).
These are specific cases, and therefore it's possible to write an algorithm for them. It may be possible to write algorithms for more complicated cases (like an algorithm that checks whether a condition is syntactically a contradiction and therefore will always return false), but still, that wouldn't cover all possible cases.
Well, let's take the classical proof of the undecidability of the halting problem and change the halting-detector to a dead-code detector!
C# program
using System;
using YourVendor.Compiler;
class Program
{
static void Main(string[] args)
{
string quine_text = #"using System;
using YourVendor.Compiler;
class Program
{{
static void Main(string[] args)
{{
string quine_text = #{0}{1}{0};
quine_text = string.Format(quine_text, (char)34, quine_text);
if (YourVendor.Compiler.HasDeadCode(quine_text))
{{
System.Console.WriteLine({0}Dead code!{0});
}}
}}
}}";
quine_text = string.Format(quine_text, (char)34, quine_text);
if (YourVendor.Compiler.HasDeadCode(quine_text))
{
System.Console.WriteLine("Dead code!");
}
}
}
If YourVendor.Compiler.HasDeadCode(quine_text) returns false, then the line System.Console.WriteLn("Dead code!"); won't be ever executed, so this program actually does have dead code, and the detector was wrong.
But if it returns true, then the line System.Console.WriteLn("Dead code!"); will be executed, and since there is no more code in the program, there is no dead code at all, so again, the detector was wrong.
So there you have it, a dead-code detector that returns only "There is dead code" or "There is no dead code" must sometimes yield wrong answers.
If the halting problem is too obscure, think of it this way.
Take a mathematical problem that is believed to be true for all positive integer's n, but hasn't been proven to be true for every n. A good example would be Goldbach's conjecture, that any positive even integer greater than two can be represented by the sum of two primes. Then (with an appropriate bigint library) run this program (pseudocode follows):
for (BigInt n = 4; ; n+=2) {
if (!isGoldbachsConjectureTrueFor(n)) {
print("Conjecture is false for at least one value of n\n");
exit(0);
}
}
Implementation of isGoldbachsConjectureTrueFor() is left as an exercise for the reader but for this purpose could be a simple iteration over all primes less than n
Now, logically the above must either be the equivalent of:
for (; ;) {
}
(i.e. an infinite loop) or
print("Conjecture is false for at least one value of n\n");
as Goldbach's conjecture must either be true or not true. If a compiler could always eliminate dead code, there would definitely be dead code to eliminate here in either case. However, in doing so at the very least your compiler would need to solve arbitrarily hard problems. We could provide problems provably hard that it would have to solve (e.g. NP-complete problems) to determine which bit of code to eliminate. For instance if we take this program:
String target = "f3c5ac5a63d50099f3b5147cabbbd81e89211513a92e3dcd2565d8c7d302ba9c";
for (BigInt n = 0; n < 2**2048; n++) {
String s = n.toString();
if (sha256(s).equals(target)) {
print("Found SHA value\n");
exit(0);
}
}
print("Not found SHA value\n");
we know that the program will either print out "Found SHA value" or "Not found SHA value" (bonus points if you can tell me which one is true). However, for a compiler to be able to reasonably optimise that would take of the order of 2^2048 iterations. It would in fact be a great optimisation as I predict the above program would (or might) run until the heat death of the universe rather than printing anything without optimisation.
I don't know if C++ or Java have an Eval type function, but many languages do allow you do call methods by name. Consider the following (contrived) VBA example.
Dim methodName As String
If foo Then
methodName = "Bar"
Else
methodName = "Qux"
End If
Application.Run(methodName)
The name of the method to be called is impossible to know until runtime. Therefore, by definition, the compiler cannot know with absolute certainty that a particular method is never called.
Actually, given the example of calling a method by name, the branching logic isn't even necessary. Simply saying
Application.Run("Bar")
Is more than the compiler can determine. When the code is compiled, all the compiler knows is that a certain string value is being passed to that method. It doesn't check to see if that method exists until runtime. If the method isn't called elsewhere, through more normal methods, an attempt to find dead methods can return false positives. The same issue exists in any language that allows code to be called via reflection.
Unconditional dead code can be detected and removed by advanced compilers.
But there is also conditional dead code. That is code that cannot be known at the time of compilation and can only be detected during runtime. For example, a software may be configurable to include or exclude certain features depending on user preference, making certain sections of code seemingly dead in particular scenarios. That is not be real dead code.
There are specific tools that can do testing, resolve dependencies, remove conditional dead code and recombine the useful code at runtime for efficiency. This is called dynamic dead code elimination. But as you can see it is beyond the scope of compilers.
A simple example:
int readValueFromPort(const unsigned int portNum);
int x = readValueFromPort(0x100); // just an example, nothing meaningful
if (x < 2)
{
std::cout << "Hey! X < 2" << std::endl;
}
else
{
std::cout << "X is too big!" << std::endl;
}
Now assume that the port 0x100 is designed to return only 0 or 1. In that case the compiler cannot figure out that the else block will never be executed.
However in this basic example:
bool boolVal = /*anything boolean*/;
if (boolVal)
{
// Do A
}
else if (!boolVal)
{
// Do B
}
else
{
// Do C
}
Here the compiler can calculate out the the else block is a dead code.
So the compiler can warn about the dead code only if it has enough data to to figure out the dead code and also it should know how to apply that data in order to figure out if the given block is a dead code.
EDIT
Sometimes the data is just not available at the compilation time:
// File a.cpp
bool boolMethod();
bool boolVal = boolMethod();
if (boolVal)
{
// Do A
}
else
{
// Do B
}
//............
// File b.cpp
bool boolMethod()
{
return true;
}
While compiling a.cpp the compiler cannot know that boolMethod always returns true.
The compiler will always lack some context information. E.g. you might know, that a double value never exeeds 2, because that is a feature of the mathematical function, you use from a library. The compiler does not even see the code in the library, and it can never know all features of all mathematical functions, and detect all weired and complicated ways to implement them.
The compiler doesn't necessarily see the whole program. I could have a program that calls a shared library, which calls back into a function in my program which isn't called directly.
So a function which is dead with respect to the library it's compiled against could become alive if that library was changed at runtime.
If a compiler could eliminate all dead code accurately, it would be called an interpreter.
Consider this simple scenario:
if (my_func()) {
am_i_dead();
}
my_func() can contain arbitrary code and in order for the compiler to determine whether it returns true or false, it will either have to run the code or do something that is functionally equivalent to running the code.
The idea of a compiler is that it only performs a partial analysis of the code, thus simplifying the job of a separate running environment. If you perform a full analysis, that isn't a compiler any more.
If you consider the compiler as a function c(), where c(source)=compiled code, and the running environment as r(), where r(compiled code)=program output, then to determine the output for any source code you have to compute the value of r(c(source code)). If calculating c() requires the knowledge of the value of r(c()) for any input, there is no need for a separate r() and c(): you can just derive a function i() from c() such that i(source)=program output.
Others have commented on the halting problem and so forth. These generally apply to portions of functions. However it can be hard/impossible to know whether even an entire type (class/etc) is used or not.
In .NET/Java/JavaScript and other runtime driven environments there's nothing stopping types being loaded via reflection. This is popular with dependency injection frameworks, and is even harder to reason about in the face of deserialisation or dynamic module loading.
The compiler cannot know whether such types would be loaded. Their names could come from external config files at runtime.
You might like to search around for tree shaking which is a common term for tools that attempt to safely remove unused subgraphs of code.
Take a function
void DoSomeAction(int actnumber)
{
switch(actnumber)
{
case 1: Action1(); break;
case 2: Action2(); break;
case 3: Action3(); break;
}
}
Can you prove that actnumber will never be 2 so that Action2() is never called...?
I disagree about the halting problem. I wouldn't call such code dead even though in reality it will never be reached.
Instead, lets consider:
for (int N = 3;;N++)
for (int A = 2; A < int.MaxValue; A++)
for (int B = 2; B < int.MaxValue; B++)
{
int Square = Math.Pow(A, N) + Math.Pow(B, N);
float Test = Math.Sqrt(Square);
if (Test == Math.Trunc(Test))
FermatWasWrong();
}
private void FermatWasWrong()
{
Press.Announce("Fermat was wrong!");
Nobel.Claim();
}
(Ignore the type and overflow errors) Dead code?
Look at this example:
public boolean isEven(int i){
if(i % 2 == 0)
return true;
if(i % 2 == 1)
return false;
return false;
}
The compiler can't know that an int can only be even or odd. Therefore the compiler must be able to understand the semantics of your code. How should this be implemented? The compiler can't ensure that the lowest return will never be executed. Therefore the compiler can't detect the dead code.

What's the difference between if(A) then if(B) and if (A and B)?

if(A) then if(B)
vs
if(A and B)
Which is better to use and why ?
Given:
if (a) {
if (b) {
/// E1
}
/// E2
} else {
// E3
}
One may be tempted to replace it with:
if (a && b) {
/// E1
} else {
// E3
}
but they are not equivalent(a = true and b = false shows the counter-argument for it)
Other than that, there is no reason not to chain them if the language allows short circuits operations like AND, OR. And most of them allows it. Expressions are equivalent and you can use the chained form to improve code readability.
It depends on your specific case but generally:
1) if (A and B) looks better/cleaner. It's immediately clear that the following block will execute if both A and B apply.
2) if(A) then if(B) is better in cases when you want to do something also when A applies, but B doesn't. In other words:
if (A):
if (B):
# something
else:
# something else
generally looks better than
if (A and B):
# something
if (A and not B):
# something else
You've tagged this 'algorithm' and 'logic' and from that perspective I would say little difference.
However in practical programming languages there may or may not be a question of efficiency (or even executability).
Programming languages like C, C++ and Java guarantee that in the expression A && B that A is evaluated first and if false B is not evaluated.
That can make a big difference if B is compuatationally expensive or is invalid if A is false.
Consider the following C snippet:
int*x
//....
if(x!=NULL&&(*x)>10) {
//...
Evaluating (*x) when x==NULL will very likely cause a fatal error.
This 'trick' (called short-circuit evaluation) is useful because it avoids the need to write the slightly more verbose:
if(x!=NULL){
if((*x)>10){
Older versions of VB such as VB6 are infamous for not making the short circuits.
The other one is that B in A || B will not be evaluated if A is true.
Discussion about support:
Do all programming languages have boolean short-circuit evaluation?
In any language that provides short-circuit and has an optimizing compiler you can assume there is unlikely to be any difference in code efficiency and go with the most readable.
That is normally if(A&&B).

Teacher wants only one if for 3 [(a OR b) AND c] questions

This is a weird question, I know, but I need to write a program with 3 questions basically [(a OR b) AND c] without using if. What my teacher wants us to ask the user if an animal is black and answer y or n. If n ask if it is white and answer y or n. If either statement is true, then ask if it is friendly, answering y or n. If it is black or white and friendly then we get a message that it can come home with me or else we get a sorry message My problem is that she says we can use only one if and must use compareToIgnoreCase and a function. I can do this with if, but I can't figure out even how to begin without using if. Please help, I've Googled, read all kinds of answers to anything sounding at all promising, and all I keep finding directs me to use if statements.
This seems like a question to teach you short-circuit evaluation. The idea is to have function answersYesTo(String question) and use that in your boolean expression (a || b) && c. Short-circuit evaluation will start with evaluating a and only evaluate b if a evalutes to false. The reason for this is that if a is true, then the we already know that a||b is true, so there is no need to evalute the last part of the subexpression.
Furthermore, c will NOT be evaluated if a||b evalutes to false, since we at that point know that the expression will evalute to false.
The following code shows one possible implementation:
import java.io.Console;
public class App
{
static public void main(String [] args) {
boolean allowedToBringHome =
(answersYesTo("Is the animal black?")||answersYesTo("Is the animal white?"))
&& answersYesTo("Is it friendly?");
if( allowedToBringHome ) {
print("You can bring the animal home.");
}
else {
print("Sorry, you can't bring the animal home.");
}
}
static boolean answersYesTo(String question) {
String answer = System.console().readLine(question);
return answer.compareToIgnoreCase("y")==0;
}
static void print(String msg) {
System.out.println(msg);
}
}
NOTE: When using short-circuit evaluation always consider readbility of your code. Complex expressions become difficult to read and grasp very quickly, which increases the risk of introducing bugs.
You can try using the ternary operator (http://en.wikipedia.org/wiki/%3F:)
You not using the actual "if" operator, but instead an if/else.
Example:
if (a == b)
return 1; else
return 0;
is the same as
return (a == b) ? 1 : 0
I don't think we're going to do your homework for you, but depending on your teacher's definition of using an 'if', you may be able to use a ternary operator.
i.e. you can write if (A) do x else do y as A ? x : y.
Alternatively, read up on switch/case statements. This isn't a great solution for this sort of thing but does work with your constraints.

Changing from recursion to iteration

I have my function that needs changing to iteration, because of very slow executing of it.
There are two recursive calls, depends which condition is true.
It's something like this:
(I have static array of classes outside function, let's say data)
void func(int a, int b, int c, int d) {
//something
//non-recursive
//here..
for (i=0; i<data.length; i++) {
if (!data[i].x) {
data[i].x = 1;
if (a == data[i].value1)
func(data[i].value2,b,c,d);
else if (a == data[i].value2)
func(data[i].value1,b,c,d);
data[i].x = 0;
}
}
}
!!
EDIT: Here is my algorithm: http://pastebin.com/F7UfzfHv
Function searches for every paths from one point to another in graph, but returns (to an array) only one path in which are only unique vertices.
!!
And I know that good way to manage that is to use stack... But I don't know how. Could someone give me a hint about it?
There is a good possibility that your function might be computing the same thing again and again, which in technical terms is called as overlapping sub-problems(if you are familiar with the dynamic programming then you might know). First try to identify whether that is the case or not. Since you haven't told anything about what function do and I can't tell the exact problem.
Also the thing you are talking about using a stack is not of great use since recursion uses a stack internally. Using external stack instead of internal is more error prone is not of great use.

How can I optimize a multiple (matrix) switch / case algorithm?

Is it possible to optimize this kind of (matrix) algorithm:
// | case 1 | case 2 | case 3 |
// ------|--------|--------|--------|
// | | | |
// case a| a1 | a2 | a3 |
// | | | |
// case b| b1 | b2 | b3 |
// | | | |
// case c| c1 | c2 | c3 |
// | | | |
switch (var)
{
case 1:
switch (subvar)
{
case a:
process a1;
case b:
process b1;
case c:
process c1;
}
case 2:
switch (subvar)
{
case a:
process a2;
case b:
process b2;
case c:
process c2;
}
case 3:
switch (subvar)
{
case a:
process a3;
case b:
process b3;
case c:
process c3;
}
}
The code is fairly simple but you have to imagine more complex with more "switch / case".
I work with 3 variables. According they take the values 1, 2, 3 or a, b, c or alpha, beta, charlie have different processes to achieve. Is it possible to optimize it any other way than through a series of "switch / case?
(Question already asked in french here).
Edit: (from Dran Dane's responses to comments below. These might as well be in this more prominent place!)
"optimize" is to be understood in terms of having to write less code, fewer "switch / case". The idea is to improve readability, maintainability, not performance.
There is maybe a way to write less code via a "Chain of Responsibility" but this solution is not optimal on all points, because it requires the creation of many objects in memory.
It sounds like what you want is a 'Finite State Machine' where using those cases you can activate different processes or 'states'. In C this is usually done with an array (matrix) of function pointers.
So you essentially make an array and put the right function pointers at the right indicies and then you use your 'var' as an index to the right 'process' and then you call it. You can do this in most languages. That way different inputs to the machine activate different processes and bring it to different states. This is very useful for numerous applications; I myself use it all of the time in MCU development.
Edit: Valya pointed out that I probably should show a basic model:
stateMachine[var1][var2](); // calls the right 'process' for input var1, var2
There are no good answers to this question :-(
because so much of the response depends on
The effective goals (what is meant by "optimize", what is unpleasing about the nested switches)
The context in which this construct is going to be applied (what are the ultimate needs implicit to the application)
TokenMacGuy was wise to ask about the goals. I took the time to check the question and its replies on the French site and I'm still puzzled as to the goals... Dran Dane latest response seems to point towards lessening the amount of code / improving readability but let's review for sure:
Processing Speed: not an issue the nested switches are quite efficient, possibly a tat less than 3 multiplications to get an index into a map table, but maybe not even.
Readability: yes possibly an issue, As the number of variables and level increases the combinatorial explosion kicks in, and also the format of the switch statement tends to spread the branching spot and associated values over a long vertical stretch. In this case a 3 dimension (or more) table initialized with fct. pointers puts back together the branching values and the function to be call on on a single line.
Writing less code: Sorry not much help here; at the end of the day we need to account for a relatively high number of combinations and the "map", whatever its form, must be written somewhere. Code generators such as TokenMacGuy's may come handy, it does seem a bit of an overkill in this case. Generators have their place, but I'm not sure it is the case here. One of two case: if the number of variables and level is small enough, the generator is not worth it (takes more time to set it up than to write the actual code in the first place), if the number of variables and levels is significant, the generated code is hard to read, hard to maintain...)
In a nutshell, my recommendation with regards to making the code more readable (and a bit faster to write) is the table/matrix approach described on the French site.
This solution is in two part:
a one time initialization of a 3 dimensional array (for 3 levels); (or a "fancier" container structure if preferred: a tree for example) . This is done with code like:
// This is positively more compact / readable
...
FctMap[1][4][0] = fctAlphaOne;
FctMap[1][4][1] = fctAlphaOne;
..
FctMap[3][0][0] = fctBravoCharlie4;
FctMap[3][0][1] = NULL; // impossible case
FctMap[3][0][2] = fctBravoCharlie4; // note how the same fct may serve in mult. places
And a relatively simple snippet wherever the functions need to be called:
if (FctMap[cond1][cond2][cond3]) {
retVal = FctMap[cond1][cond2][cond3](Arg1, Arg2);
if (retVal < 0)
DoSomething(); // anyway we're leveraging the common api to these fct not the switch alternative ....
}
A case which may prompt one NOT using the solution above are if the combination space is relatively sparsely populated (many "branches" in the switch "tree" are not used) or if some of the functions require a different set of parameters; For both of these cases, I'd like to plug a solution Joel Goodwin proposed first here, and which essentially combines the various keys for the several level into one longer key (with separator character if need be), essentially flattening the problem back to a long, but single level switch statement.
Now...
The real discussion should be about why we need such a mapping/decision-tree in the first place. To answer this unfortunately requires understanding the true nature of the underlying application. To be sure I'm not saying that this is indicative of bad design. A big dispatching section may make sense in some applications. However, even with the C language (which the French Site contributors seemed to disqualify to Object Oriented design), it is possible to adopt Object oriented methodology and patterns. Anyway I'm diverging...) It is possible that the application would overall be better served with alternative design patterns where the "information tree about what to call when" has been distributed in several modules and/or several objects.
Apologies to speak about this in rather abstract terms, it's just the lack of application specifics... The point remains: challenge the idea that we need this big dispatching tree; think of alternative approaches to the application at large.
Alors, bonne chance! ;-)
Depending on the language, some form of hash map with the pair (var, subvar) as the key and first-class functions as the values (or whatever your language offers to best approximate that, e.g. instances of classes extending some proper interface in Java) is likely to provide top performance -- and the utter conciseness of fetching the appropriate function (or whatever;-) from the map based on the key, and executing it, leads to high readability for readers familiar with the language and such functional idioms.
The idea of a function pointer is probably best (as per mjv, Shhnap). But, if the code under each case is fairly small, it may be overkill and result in more obfuscation than intended. In that case, I might implement something snappy and fast-to-read like this:
string decision = var1.ToString() + var2.ToString() + var3.ToString();
switch(decision)
{
case "1aa":
....
case "1ab":
....
}
Unfamiliar with your particular scenario so perhaps the previous suggestions are more appropriate.
I had exactly the same problem once, albeit for an immanent mess of a 5-parameter nested switch. I figured, why type all these O(N5) cases myself, why even invent 'nested' function names if the compiler can do this for me. And all this resulted in a 'nested specialized template switch' referring to a 'specialized template database'.
It's a little complicated to write. But I found it worth it: it results in a 'knowledge' database that is very easy to maintain, to debug, to add to etc... And I must admit: a sense of pride.
// the return type: might be an object actually _doing_ something
struct Result {
const char* value;
Result(): value(NULL){}
Result( const char* p ):value(p){};
};
Some variable types for switching:
// types used:
struct A { enum e { a1, a2, a3 }; };
struct B { enum e { b1, b2 }; };
struct C { enum e { c1, c2 }; };
A 'forward declaration' of the knowledge base: the 'api' of the nested switch.
// template database declaration (and default value - omit if not needed)
// specializations may execute code in stead of returning values...
template< A::e, B::e, C::e > Result valuedb() { return "not defined"; };
The actual switching logic (condensed)
// template layer 1: work away the first parameter, then the next, ...
struct Switch {
static Result value( A::e a, B::e b, C::e c ) {
switch( a ) {
case A::a1: return SwitchA<A::a1>::value( b, c );
case A::a2: return SwitchA<A::a2>::value( b, c );
case A::a3: return SwitchA<A::a3>::value( b, c );
default: return Result();
}
}
template< A::e a > struct SwitchA {
static Result value( B::e b, C::e c ) {
switch( b ) {
case B::b1: return SwitchB<a, B::b1>::value( c );
case B::b2: return SwitchB<a, B::b2>::value( c );
default: return Result();
}
}
template< A::e a, B::e b > struct SwitchB {
static Result value( C::e c ) {
switch( c ) {
case C::c1: return valuedb< a, b, C::c1 >();
case C::c2: return valuedb< a, b, C::c2 >();
default: return Result();
}
};
};
};
};
And the knowledge base itself
// the template database
//
template<> Result valuedb<A::a1, B::b1, C::c1 >() { return "a1b1c1"; }
template<> Result valuedb<A::a1, B::b2, C::c2 >() { return "a1b2c2"; }
This is how it can be used.
int main()
{
// usage:
Result r = Switch::value( A::a1, B::b2, C::c2 );
return 0;
}
Yes, there is definitely easier way to do that, both faster and simpler. The idea is basically the same as proposed by Alex Martelli. Instead of seeing you problem as bi-dimentional, see it as some one dimension lookup table.
It means combining var, subvar, subsubvar, etc to get one unique key and use it as your lookup table entry point.
The way to do it depends on the used language. With python combining var, subvar etc. to build a tuple and use it as key in a dictionnary is enough.
With C or such it's usually simpler to convert each keys to enums, then combine them using logical operators to get just one number that you can use in your switch (that's also an easy way to use switch instead of string comparizons with cascading ifs). You also get another benefit doing it. It's quite usual that several treatments in different branches of the initial switch are the same. With the initial form it's quite difficult to make that obvious. You'll probably have some calls to the same functions but it's at differents points in code. Now you can just group the identical cases when writing the switch.
I used such transformation several times in production code and it's easy to do and to maintain.
Summarily you can get something like this... the mix function obviously depends on your application specifics.
switch (mix(var, subvar))
{
case a1:
process a1;
case b1:
process b1;
case c1:
process c1;
case a2:
process a2;
case b2:
process b2;
case c2:
process c2;
case a3:
process a3;
case b3:
process b3;
case c3:
process c3;
}
Perhaps what you want is code generation?
#! /usr/bin/python
first = [1, 2, 3]
second = ['a', 'b', 'c']
def emit(first, second):
result = "switch (var)\n{\n"
for f in first:
result += " case {0}:\n switch (subvar)\n {{\n".format(f)
for s in second:
result += " case {1}:\n process {1}{0};\n".format(f,s)
result += " }\n"
result += "}\n"
return result
print emit(first,second)
#file("autogen.c","w").write(emit(first,second))
This is pretty hard to read, of course, and you might really want a nicer template language to do your dirty work, but this will ease some parts of your task.
If C++ is an option i would try using virtual function and maybe double dispatch. That could make it much cleaner. But it will only probably pay off only if you have many more cases.
This article on DDJ.com might be a good entry.
If you're just trying to eliminate the two-level switch/case statements (and save some vertical space), you can encode the two variable values into a single value, then switch on it:
// Assumes var is in [1,3] and subvar in [1,3]
// and that var and subvar can be cast to int values
switch (10*var + subvar)
{
case 10+1:
process a1;
case 10+2:
process b1;
case 10+3:
process c1;
//
case 20+1:
process a2;
case 20+2:
process b2;
case 20+3:
process c2;
//
case 30+1:
process a3;
case 30+2:
process b3;
case 30+3:
process c3;
//
default:
process error;
}
If your language is C#, and your choices are short enough and contain no special characters you can use reflection and do it with just a few lines of code. This way, instead of manually creating and maintaining an array of function pointers, use one that the framework provides!
Like this:
using System.Reflection;
...
void DispatchCall(string var, string subvar)
{
string functionName="Func_"+var+"_"+subvar;
MethodInfo m=this.GetType().GetMethod(fName);
if (m == null) throw new ArgumentException("Invalid function name "+ functionName);
m.Invoke(this, new object[] { /* put parameters here if needed */ });
}
void Func_1_a()
{
//executed when var=1 and subvar=a
}
void Func_2_charlie()
{
//executed when var=2 and subvar=charlie
}
Solution from developpez.com
Yes, you can optimize it and make it so much cleaner. You can not use such a "Chain of
Responsibility" with a Factory:
public class ProcessFactory {
private ArrayList<Process> processses = null;
public ProcessFactory(){
super();
processses = new ArrayList<Process>();
processses.add(new ProcessC1());
processses.add(new ProcessC2());
processses.add(new ProcessC3());
processses.add(new ProcessC4());
processses.add(new ProcessC5(6));
processses.add(new ProcessC5(22));
}
public Process getProcess(int var, int subvar){
for(Process process : processses){
if(process.canDo(var, subvar)){
return process;
}
}
return null;
}
}
Then just as your processes implement an interface process with canXXX you can easily use:
new ProcessFactory().getProcess(var,subvar).launch();

Resources