How to simplify/improve Erlang code? - refactoring

How would a good Erlang programmer write this code?
loop(expr0) ->
    case expr1 of
        true ->
            A = case expr2 of
                    true -> ...;
                    false -> ...
                end;
        false ->
            A = case expr3 of
                    true -> ...;
                    false -> ...
                end
    end,
    loop(expr4(A)).

Generally speaking, you want to make your code more readable. It's usually a good idea to extract bits of code into functions to avoid long or deeply nested functions, and to give them self-explanatory names that clarify the purpose of each piece of code:
loop(expr0) ->
    case expr1 of
        true ->
            A = do_something(expr2);
        false ->
            A = do_something_else(expr3)
    end,
    loop(expr4(A)).
do_something(E) ->
    case E of
        true -> ...;
        false -> ...
    end.
do_something_else(E) ->
    case E of
        true -> ...;
        false -> ...
    end.
Now, a casual reader knows that your function does something if expr1 is true and something else if expr1 is false. Good naming conventions help a lot here. You can also do that with comments, but comments can go stale while code never does, so well-named code is easier to maintain. I also find short functions easier to read than really long ones, even when the long ones have inline comments.
Once you've stated clearly what your function does, you may want to shorten the code. Short code is easier to read and maintain, but don't shorten too much using "clever" constructions, or you'll obscure it, which is the opposite of what you want. You can start by using pattern matching in function heads:
loop(expr0) ->
    case expr1 of
        true ->
            A = do_something(expr2);
        false ->
            A = do_something_else(expr3)
    end,
    loop(expr4(A)).
do_something(true) -> ...;
do_something(false) -> ....
do_something_else(true) -> ...;
do_something_else(false) -> ....
Then, you can avoid repeating A in the main function (as an aside, variables leaking out of nested statements is a feature I have always disliked):
loop(expr0) ->
    A = case expr1 of
            true -> do_something(expr2);
            false -> do_something_else(expr3)
        end,
    loop(expr4(A)).
do_something(true) -> ...;
do_something(false) -> ....
do_something_else(true) -> ...;
do_something_else(false) -> ....
And I think that's it for this piece of code. With more context you could also introduce some abstractions to reduce duplication, but take care when abstracting: if you overdo it you'll obscure the code again and lose the maintenance benefit you expected from removing similar code.

The code, as it is currently written, is hard to make simpler. The problem is that the ExprX entries are unknown, so there is no way to tell whether a given simplification would actually be beneficial. If you have a fuller example, we will have a much better chance of improving it.
The concrete problem is that we don't know, for instance, how Expr2 and Expr3 depend on Expr1. We also don't know what the purpose of Expr0 is, and we know nothing about Expr4's dependencies other than that it uses the returned A.

Why is expr0 needed in the loop function?

Related

How to chain multiple IF statements in logic programming?

I want to update my knowledge base after receiving a new state,
not((hasBeenVisited(X-1, Y)); not(wall(X-1, Y)) -> asserta(isDangerous(X-1, Y));
not(((hasBeenVisited(X+1, Y)); not(wall(X+1, Y)) -> asserta(isDangerous(X+1, Y));
not((hasBeenVisited(X, Y-1)); not(wall(X, Y-1)) -> asserta(isDangerous(X, Y-1));
not((hasBeenVisited(X, Y+1)); not(wall(X, Y+1)) -> asserta(isDangerous(X, Y+1));
The problem with my code is that if the first line evaluates to true, the next lines are not evaluated, because of the logical OR ";".
If I change the ";" to a logical AND ",", then if one of the conditions fails, the entire predicate returns false instead of true.
How do I chain multiple IF statements? In procedural programming, we can do something like this:
if condition1 then statements1;
if condition2 then statements2;
if condition3 then statements3;
...
Should I even do that in Prolog, or am I still thinking in terms of procedural programming?
Your imperative example:
if condition1 then statements1;
if condition2 then statements2;
if condition3 then statements3;
becomes this shape:
(condition1 -> statements1 ; true),
(condition2 -> statements2 ; true),
(condition3 -> statements3 ; true).
Each "code block" is wrapped in parentheses, and the "; true" branch makes the block succeed whether or not the condition holds, so the conjunction always continues to the next block.
You're right that coding this way, with imperative rules and state updates, is fighting against the design of Prolog rather than leaning on its strengths. Also, X+1 doesn't work the way you are using it: Prolog does not evaluate arithmetic in arguments, so X+1 stays the term +(X, 1) unless you compute it with is/2. And I wonder if you are thinking that isDangerous(X+1, Y) will return a value, as if it were a function call, and that the return value goes into asserta(<here>)? If so, that won't happen either.
Probably what you are being pushed towards is building a list of movements to get through a maze, and should be leaning on Prolog's backtracking to find the walls and dangerous places, step back from them and go another way.

What's the difference between if(A) then if(B) and if (A and B)?

if(A) then if(B)
vs
if(A and B)
Which is better to use, and why?
Given:
if (a) {
    if (b) {
        // E1
    }
    // E2
} else {
    // E3
}
One may be tempted to replace it with:
if (a && b) {
    // E1
} else {
    // E3
}
but they are not equivalent: a = true and b = false is a counterexample (the nested form runs E2, while the combined form runs E3).
Other than that, there is no reason not to chain them if the language supports short-circuit operators like AND and OR, and most languages do. When the outer if contains nothing but the inner if and there is no else branch, the two forms are equivalent, and you can use the chained form to improve code readability.
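To make the counterexample concrete, here is a minimal Python sketch. The function names nested and combined, and the idea of recording which of E1/E2/E3 ran in a list, are illustrative assumptions, not part of the original answer:

```python
def nested(a, b):
    """Shape: if (a) { if (b) { E1 } E2 } else { E3 }"""
    ran = []
    if a:
        if b:
            ran.append("E1")
        ran.append("E2")
    else:
        ran.append("E3")
    return ran


def combined(a, b):
    """Shape: if (a && b) { E1 } else { E3 }"""
    ran = []
    if a and b:
        ran.append("E1")
    else:
        ran.append("E3")
    return ran


# Print which blocks run for every combination of a and b.
for a in (True, False):
    for b in (True, False):
        print(a, b, nested(a, b), combined(a, b))
```

The two functions agree only when a is false; whenever a is true, the nested form also runs E2, which the combined form has nowhere to put.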
It depends on your specific case but generally:
1) if (A and B) looks better/cleaner. It's immediately clear that the following block will execute if both A and B apply.
2) if(A) then if(B) is better in cases when you want to do something also when A applies, but B doesn't. In other words:
if (A):
    if (B):
        # something
    else:
        # something else
generally looks better than
if (A and B):
    # something
if (A and not B):
    # something else
You've tagged this 'algorithm' and 'logic', and from that perspective I would say there is little difference.
However, in practical programming languages there may or may not be a question of efficiency (or even executability).
Programming languages like C, C++ and Java guarantee that in the expression A && B, A is evaluated first, and if it is false, B is not evaluated at all.
That can make a big difference if B is computationally expensive or is invalid when A is false.
Consider the following C snippet:
int *x;
//....
if (x != NULL && (*x) > 10) {
    //...
}
Evaluating (*x) when x==NULL will very likely cause a fatal error.
This 'trick' (called short-circuit evaluation) is useful because it avoids the need to write the slightly more verbose:
if (x != NULL) {
    if ((*x) > 10) {
Older versions of VB, such as VB6, are infamous for not short-circuiting.
The other short-circuit case is that B in A || B will not be evaluated if A is true.
Discussion about support:
Do all programming languages have boolean short-circuit evaluation?
In any language that provides short-circuit evaluation and has an optimizing compiler, you can assume there is unlikely to be any difference in code efficiency, so go with the most readable form.
That is normally if(A&&B).
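As a rough illustration of the same guarantee outside C, here is a small Python sketch (Python's and/or also short-circuit). The helper names check, evaluated_operands and is_big are made up for this example, and None stands in for the NULL pointer:

```python
calls = []


def check(name, result):
    """Record that this operand was evaluated, then return its value."""
    calls.append(name)
    return result


def evaluated_operands(use_and, a_value):
    """Return the names of the operands that were actually evaluated."""
    calls.clear()
    if use_and:
        check("A", a_value) and check("B", True)
    else:
        check("A", a_value) or check("B", True)
    return list(calls)


def is_big(x):
    # Mirrors `x != NULL && (*x) > 10`: the comparison is skipped when x is None,
    # so no TypeError is raised for a missing value.
    return x is not None and x > 10


print(evaluated_operands(True, False))   # A is false, so B is skipped
print(evaluated_operands(False, True))   # A is true, so B is skipped
print(is_big(None), is_big(11))
```

Without the guard, evaluating None > 10 would raise a TypeError in Python 3, which plays the role of the fatal error in the C snippet.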

When generalizing monad, performance drops nearly 50%

I have code that does some parsing of files according to specified rules. The whole parsing takes place in a monad that is a stack of ReaderT/STTrans/ErrorT.
type RunningRule s a = ReaderT (STRef s LocalVarMap) (STT s (ErrorT String Identity)) a
Because it would be handy to run some IO in the code (e.g. to query external databases), I thought I would generalize the parsing so that it could run with either Identity or IO as the base monad, depending on the functionality I need. This changed the signature to:
type RunningRule s m a = ReaderT (STRef s LocalVarMap) (STT s (ErrorT String m)) a
After changing the appropriate type signatures (and using some extensions to get around the types) I ran it again in the Identity monad and it was ~50% slower. Although essentially nothing changed, it is much slower. Is this normal behaviour? Is there some simple way to make this faster (e.g. combining the ErrorT and ReaderT (and possibly STT) stack into one monad transformer)?
To add a sample of the code: it takes a parsed input (given in a C-like language) and constructs a parser from it. The code looks like this:
compileRule :: forall m. (Monad m, Functor m) =>
       [Data -> m (Either String Data)] -- For tying the knot
    -> ParsedRule -- This is the rule we are compiling
    -> Data -> m (Either String Data) -- The real parsing
compileRule compiled (ParsedRule name parsedlines) =
    \input -> runRunningRule input $ do
        sequence_ compiledlines
  where
    compiledlines = map compile parsedlines
    compile (Expression expr) = compileEx expr >> return ()
    compile (Assignment var expr) =
        ...
    compileEx (Function "check" expr) = do
        value <- code -- run the compiled sub-expression
        case value of
            True -> return ()
            False -> fail "Check failed"
      where
        code = compileEx expr
This is not so unusual, no. You should try using SPECIALIZE pragmas to specialize to Identity, and maybe IO too. Use -ddump-simpl and watch for warnings about rule left hand sides being too complicated. When specialization doesn't happen as it should, GHC ends up passing around typeclass dictionaries at runtime. This is inherently somewhat inefficient, but more importantly it prevents GHC from inlining class methods to enable further simplification.

How to detect if(true) and other refactoring issues?

It is common in Java, when using "modern" IDEs, to inline variable values and perform heavy refactoring that can, as an example, transform this source code
boolean test = true;
//...
if(test) {
//...
}
Into this code
if(true) {
//...
}
Obviously, this code can be simplified, but Eclipse won't perform that simplification for me.
So, is there any way (using Eclipse or, even better, Maven) to detect and (possibly) simplify that code? (It would obviously be even better if such a tool could detect other wrong constructs, like empty for loops, ...)
What you want is a Program Transformation system (PTS).
These are tools that read source code, build compiler data structures (almost always including at least an AST), carry out customized analysis and modification of the compiler data structures, and then regenerate source text (for the modified program) from those modified data structures.
Many PTS allow you to express changes to code directly in source-to-source form as rules, written in terms of the language syntax, metavariables, etc. The point of such a rule language is to let you express complex code transformations more easily.
Our DMS Software Reengineering Toolkit is such a PTS. You can easily simplify code with boolean expressions containing boolean constants with the following simple rules:
default domain Java~v7;
simplify_not_true(): primary -> primary
" ! true" -> "false";
simplify_not_false(): primary -> primary
" ! false" -> "true";
simplify_not_not(x: primary): primary -> primary
" ! ! \x " -> "\x";
simplify_and_right_true(x: term): conjunction -> conjunction ;
" \x && true " -> "\x";
simplify_and_left_true(x: term): conjunction -> conjunction ;
" true && \x " -> "\x";
simplify_and_left_false(x: term): conjunction -> conjunction ;
" false && \x " -> "false";
simplify_and_right_false(x: term): conjunction -> conjunction ;
" \x && false " -> "false"
if no_side_effects_or_exceptions(x); -- note additional semantic check here
simplify_or_right_false(x: term): disjunction -> disjunction ;
" \x || false " -> "\x";
simplify_or_left_false(x: term): disjunction -> disjunction ;
" false || \x " -> "\x";
simplify_or_right_true(x: term): disjunction -> disjunction ;
" \x || true " -> "true"
if no_side_effects_or_exceptions(x);
simplify_or_left_true(x: term): disjunction -> disjunction ;
" true || \x " -> "true";
(The grammar names "term", "primary", "conjunction", "disjunction" are directly from the BNF used to drive Java source code parsing.)
These rules together will take boolean expressions involving known boolean constants and simplify them, sometimes all the way down to "true" or "false".
To eliminate if-conditionals whose expressions are boolean constants one would write these:
simplify_if_true(b: block): statement -> statement
" if (true) \b" -> " \b ";
simplify_if_false(b: block): statement -> statement
" if (false) \b" -> ";" -- null statement
Together with boolean simplification, these two rules would get rid of conditionals whose conditions are obviously true or obviously false.
To do what you want is a bit more complicated, because you wish to propagate information from one place in the program to another place possibly "far away". For that you need what amounts to a data flow analysis, showing where values can reach from their assignments:
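For readers without DMS, the general idea of rewriting an AST and regenerating source can be sketched with Python's standard ast module (ast.unparse needs Python 3.9+). This is only an analogue operating on Python source, not the DMS rules or a Java tool; BoolSimplifier and simplify are hypothetical names. It applies only the left-constant rules, because dropping a right-hand constant would need the side-effect check mentioned above (and, in Python, would also change the value of the expression):

```python
import ast


def is_bool_const(node):
    """True when node is the literal True or False."""
    return isinstance(node, ast.Constant) and isinstance(node.value, bool)


class BoolSimplifier(ast.NodeTransformer):
    def visit_UnaryOp(self, node):
        self.generic_visit(node)  # simplify the operand first
        if isinstance(node.op, ast.Not) and is_bool_const(node.operand):
            return ast.Constant(value=not node.operand.value)
        return node

    def visit_BoolOp(self, node):
        self.generic_visit(node)
        # Identity element: True for `and`, False for `or`.
        identity = isinstance(node.op, ast.And)
        values = list(node.values)
        # "true && x -> x" / "false || x -> x": drop leading identity constants.
        while (len(values) > 1 and is_bool_const(values[0])
               and values[0].value is identity):
            values.pop(0)
        # "false && x -> false" / "true || x -> true": x is never evaluated.
        if is_bool_const(values[0]) and values[0].value is not identity:
            return values[0]
        if len(values) == 1:
            return values[0]
        node.values = values
        return node

    def visit_If(self, node):
        self.generic_visit(node)  # fold the test and nested statements first
        if not is_bool_const(node.test):
            return node
        if node.test.value:
            return node.body  # "if (true) b -> b"
        return node.orelse or ast.Pass()  # "if (false) b -> null statement"


def simplify(source):
    """Parse Python source, apply the rewrites, and regenerate source text."""
    tree = BoolSimplifier().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


print(simplify("if not False and True or x:\n    y = 1\n"))
```

Like the rules above, this handles only constants that are literally present in the condition; it does nothing for a variable that merely happens to hold a constant.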
default domain Java~v7;
rule propagate_constant_variables(i:IDENTIFIER): term -> term
" \i " -> construct_reaching_constant()
if constant_reaches(i);
This rule depends on a built-in analysis providing data flow facts and a custom interface function "constant_reaches" that inspects this data.
(DMS has this for C, C++, Java and COBOL, and support for doing it for other languages; to my knowledge, none of the other PTS mentioned in the Wikipedia article have these flow facts available.) It also depends on a custom constructor "construct_reaching_constant" to build a primitive tree node containing a reaching constant. These would be coded in DMS's underlying metaprogramming language and require a few tens of lines of code. The same goes for the special condition discussed earlier, "no_side_effects_or_exceptions"; this can be a lot more complex, as the question about side effects may require an analysis of the full program.
There are tools such as Clang that can transform C++ code to some extent, but Clang does not have rewrite rules as PTS do; it is really a compiler with additional hooks.

Mathematica: How is OptionValue implemented?

The implementation of the built-in OptionValue contains some piece of magic so that OptionValue[name] is equivalent to OptionValue[f, name], where f is the head of the left-hand side of the transformation rule in which OptionValue[name] appears.
Does anybody have an idea for how to achieve something similar for Options, i.e. implement an autoOptions[] that would resolve to the options defined for the symbol on the left hand side of the transformation rule in which autoOptions[] appears?
For clarity, what I am looking for is a way to make
Options[foo]={bar->1};
foo[OptionsPattern[]]:=autoOptions[]
foo[]
output {bar->1}
The eventual goal is to do something like requested in this question without having to change anything but the RHS of a definition.
Here is a simple, very schematic version:
Module[{tried},
  Unprotect[SetDelayed];
  SetDelayed[f_[args___, optpt : OptionsPattern[]], rhs_] /;
     !FreeQ[Unevaluated[rhs], autoOptions[]] :=
   Block[{tried = True},
     f[args, optpt] :=
      Block[{autoOptions}, autoOptions[] = Options[f]; rhs]] /; ! TrueQ[tried];
  Protect[SetDelayed];]
Your usage:
In[8]:= Options[foo] = {bar -> 1};
foo[OptionsPattern[]] := autoOptions[]
foo[]
Out[10]= {bar -> 1}
Note that this won't work when explicit options are also passed; accounting for them is some more work. Overloading SetDelayed like this is also not generally good practice, but you asked for it and you get it.

Resources