How to implement simple validation in Scala

Suppose I need to validate request parameters. The validation result is either Success or Failure with NonEmptyList[String]. I can probably use ValidationNel[String, Unit] but it seems a bit overkill. I guess I need a simpler abstraction (see below).
sealed trait ValidationResult
case object Success extends ValidationResult
case class Failure(errors: NonEmptyList[String]) extends ValidationResult
and a binary operation andAlso to combine two results:
trait ValidationResult {
  def andAlso(other: ValidationResult): ValidationResult =
    (this, other) match {
      case (Success, Success) => Success
      case (Success, failure @ Failure(_)) => failure
      case (failure @ Failure(_), Success) => failure
      case (Failure(errors1), Failure(errors2)) => Failure(errors1 + errors2)
    }
}
Now if I validate three parameters with functions checkA, checkB, and checkC I can easily compose them as follows:
def checkA(a: A): ValidationResult = ...
def checkB(b: B): ValidationResult = ...
def checkC(c: C): ValidationResult = ...
def checkABC(a: A, b: B, c: C) = checkA(a) andAlso checkB(b) andAlso checkC(c)
Does it make sense?
Does this abstraction have a name? Maybe a Monoid?
Is it implemented in scalaz or any other Scala library?

It is indeed a Monoid, and you can be much more precise: it is a List[String] (up to isomorphism). ValidationResult is isomorphic to List[String], with Success corresponding to Nil and andAlso to concatenation (::: / ++).
This makes sense, a ValidationResult is a list of errors, and when there are none, that means success.
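The "list of errors" view can be sketched in a few lines. The following is a minimal Java sketch, not the scalaz API: the class ValidationResult and the methods success, failure, andAlso, isSuccess, and errors are hypothetical names chosen for illustration. An empty list plays the role of Success, and andAlso is plain list concatenation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: a validation result is just an immutable list of errors.
public final class ValidationResult {
    private final List<String> errors;

    private ValidationResult(List<String> errors) {
        this.errors = Collections.unmodifiableList(errors);
    }

    // Identity element of the monoid: no errors at all.
    public static ValidationResult success() {
        return new ValidationResult(new ArrayList<>());
    }

    // A failure carrying a single error message.
    public static ValidationResult failure(String error) {
        List<String> list = new ArrayList<>();
        list.add(error);
        return new ValidationResult(list);
    }

    // The monoid operation: concatenate both error lists, keeping their order.
    public ValidationResult andAlso(ValidationResult other) {
        List<String> combined = new ArrayList<>(errors);
        combined.addAll(other.errors);
        return new ValidationResult(combined);
    }

    public boolean isSuccess() {
        return errors.isEmpty();
    }

    public List<String> errors() {
        return errors;
    }

    public static void main(String[] args) {
        ValidationResult r = success()
                .andAlso(failure("a must be positive"))
                .andAlso(success())
                .andAlso(failure("c must not be empty"));
        System.out.println(r.isSuccess()); // false
        System.out.println(r.errors());    // [a must be positive, c must not be empty]
    }
}
```

Because concatenation is associative and the empty list is its identity, the order in which the checks are grouped does not matter, only the order in which they are listed.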
However, as you note right at the beginning, it all amounts to using ValidationNel[String, Unit], where Unit ("no data of interest") is the interesting part. It means you will handle the actual data separately. You may win a little here: you avoid the Applicative syntax that sprinkles your code with |@| and the like, and you also avoid a not-often-mentioned price of Monads and Co., so the code is easier to work with in a debugger. But there is a downside: as your code grows and the places where errors may occur multiply, managing the flow by hand will quickly become painful, and I would not go that way.
The usual alternative is exceptions.


Comparator.compareBoolean() the same as Comparator.compare()?

How can I rewrite this
Comparator<Item> sort = (i1, i2) -> Boolean.compare(i2.isOpen(), i1.isOpen());
as something like this (code does not work):
Comparator<Item> sort = Comparator.comparing(Item::isOpen).reversed();
There does not seem to be anything like Comparator.comparingBool(). Comparator.comparing returns int and not "Item".
Why can't you write it like this?
Comparator<Item> sort = Comparator.comparing(Item::isOpen);
Underneath, Boolean.compareTo is called, which in turn is the same as Boolean.compare:
public static int compare(boolean x, boolean y) {
return (x == y) ? 0 : (x ? 1 : -1);
}
And this: "Comparator.comparing returns int and not "Item"" makes little sense. Comparator.comparing must return a Comparator<T>; in your case it correctly returns a Comparator<Item>.
The overloads comparingInt, comparingLong, and comparingDouble exist for performance reasons only. They are semantically identical to the unspecialized comparing method, so using comparing instead of comparingXXX has the same outcome but might have boxing overhead; the actual implications depend on the particular execution environment.
In case of boolean values, we can predict that the overhead will be negligible, as the method Boolean.valueOf will always return either Boolean.TRUE or Boolean.FALSE and never create new instances, so even if a particular JVM fails to inline the entire code, it does not depend on the presence of Escape Analysis in the optimizer.
As you already figured out, reversing a comparator is implemented by swapping the argument internally, just like you did manually in your lambda expression.
Note that it is still possible to create a comparator fusing the reversal and an unboxed comparison without having to repeat the isOpen() expression:
Comparator<Item> sort = Comparator.comparingInt(i -> i.isOpen()? 0: 1);
but, as said, it’s unlikely to have a significantly higher performance than the Comparator.comparing(Item::isOpen).reversed() approach.
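To check that these variants really order items the same way, here is a small self-contained sketch. The Item class below is a hypothetical stand-in for the one in the question (only isOpen() is taken from it), and the helper names byLambda, byReversed, byInt, and sorted are made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class OpenFirstDemo {
    // Hypothetical stand-in for the question's Item class.
    static final class Item {
        final String name;
        final boolean open;

        Item(String name, boolean open) {
            this.name = name;
            this.open = open;
        }

        boolean isOpen() {
            return open;
        }

        @Override
        public String toString() {
            return name;
        }
    }

    // The hand-written lambda from the question: arguments swapped to reverse the order.
    static Comparator<Item> byLambda() {
        return (i1, i2) -> Boolean.compare(i2.isOpen(), i1.isOpen());
    }

    // The boxed key extractor, reversed.
    static Comparator<Item> byReversed() {
        return Comparator.comparing(Item::isOpen).reversed();
    }

    // The unboxed variant that fuses the reversal into the key.
    static Comparator<Item> byInt() {
        return Comparator.comparingInt(i -> i.isOpen() ? 0 : 1);
    }

    // Sorts a copy; List.sort is stable, so equal items keep their relative order.
    static List<Item> sorted(List<Item> items, Comparator<Item> order) {
        List<Item> copy = new ArrayList<>(items);
        copy.sort(order);
        return copy;
    }

    public static void main(String[] args) {
        List<Item> items = Arrays.asList(
                new Item("a", false), new Item("b", true),
                new Item("c", false), new Item("d", true));
        System.out.println(sorted(items, byLambda()));   // [b, d, a, c]
        System.out.println(sorted(items, byReversed())); // [b, d, a, c]
        System.out.println(sorted(items, byInt()));      // [b, d, a, c]
    }
}
```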
But note that if you have a boolean sort criterion and care about maximum performance, you may consider replacing the general-purpose sort algorithm with a bucket sort variant. E.g.
If you have a Stream, replace
List<Item> result = /* stream of Item */
.sorted(Comparator.comparing(Item::isOpen).reversed())
.collect(Collectors.toList());
with
Map<Boolean,List<Item>> map = /* stream of Item */
.collect(Collectors.partitioningBy(Item::isOpen,
Collectors.toCollection(ArrayList::new)));
List<Item> result = map.get(true);
result.addAll(map.get(false));
or, if you have a List, replace
list.sort(Comparator.comparing(Item::isOpen).reversed());
with
ArrayList<Item> temp = new ArrayList<>(list.size());
list.removeIf(item -> !item.isOpen() && temp.add(item));
list.addAll(temp);
etc.
Use the comparing overload that takes a key extractor and a key comparator:
Comparator<Item> comparator =
Comparator.comparing(Item::isOpen, Boolean::compare).reversed();

Early return statements and cyclomatic complexity

I prefer this writing style with early returns:
public static Type classify(int a, int b, int c) {
    if (!isTriangle(a, b, c)) {
        return Type.INVALID;
    }
    if (a == b && b == c) {
        return Type.EQUILATERAL;
    }
    if (b == c || a == b || c == a) {
        return Type.ISOSCELES;
    }
    return Type.SCALENE;
}
Unfortunately, every return statement increases the cyclomatic complexity metric calculated by Sonar. Consider this alternative:
public static Type classify(int a, int b, int c) {
    final Type result;
    if (!isTriangle(a, b, c)) {
        result = Type.INVALID;
    } else if (a == b && b == c) {
        result = Type.EQUILATERAL;
    } else if (b == c || a == b || c == a) {
        result = Type.ISOSCELES;
    } else {
        result = Type.SCALENE;
    }
    return result;
}
The cyclomatic complexity of this latter approach reported by Sonar is lower than the first, by 3. I have been told that this might be the result of a wrong implementation of the CC metrics. Or is Sonar correct, and this is really better? These related questions seem to disagree with that:
https://softwareengineering.stackexchange.com/questions/118703/where-did-the-notion-of-one-return-only-come-from
https://softwareengineering.stackexchange.com/questions/18454/should-i-return-from-a-function-early-or-use-an-if-statement
If I add support for a few more triangle types, the return statements will add up to make a significant difference in the metric and cause a Sonar violation. I don't want to stick a // NOSONAR on the method, as that might mask other problems by new features/bugs added to the method in the future. So I use the second version, even though I don't really like it. Is there a better way to handle the situation?
Your question relates to https://jira.codehaus.org/browse/SONAR-4857. For the time being, all SonarQube analysers mix cyclomatic complexity and essential complexity. From a theoretical point of view, a return statement should not increment the CC, and this change is going to happen in the SQ ecosystem.
Not really an answer, but way too long for a comment.
This SONAR rule seems to be thoroughly broken. You could rewrite
b == c || a == b || c == a
as
b == c | a == b | c == a
and gain two points in this strange game (and maybe even some speed, as branching is expensive; but this is at the discretion of the JIT compiler, anyway).
The old rule claims that the cyclomatic complexity is related to the number of tests. The new one doesn't, and that's a good thing, as obviously the number of meaningful tests for both of your snippets is exactly the same.
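The point about tests can be made concrete with a small sketch. The Type enum and the isTriangle helper are not shown in the question, so the versions below are assumptions; the two classify methods are the question's snippets under the made-up names classifyEarlyReturn and classifySingleReturn. Each has the same four outcomes, so the same four test cases cover both.

```java
public class TriangleClassifier {
    enum Type { INVALID, EQUILATERAL, ISOSCELES, SCALENE }

    // Assumption: the question does not show isTriangle. This version requires
    // positive sides and the strict triangle inequality (long avoids overflow).
    static boolean isTriangle(int a, int b, int c) {
        return a > 0 && b > 0 && c > 0
                && (long) a + b > c
                && (long) a + c > b
                && (long) b + c > a;
    }

    // Early-return style from the question.
    static Type classifyEarlyReturn(int a, int b, int c) {
        if (!isTriangle(a, b, c)) {
            return Type.INVALID;
        }
        if (a == b && b == c) {
            return Type.EQUILATERAL;
        }
        if (b == c || a == b || c == a) {
            return Type.ISOSCELES;
        }
        return Type.SCALENE;
    }

    // Single-return style from the question.
    static Type classifySingleReturn(int a, int b, int c) {
        final Type result;
        if (!isTriangle(a, b, c)) {
            result = Type.INVALID;
        } else if (a == b && b == c) {
            result = Type.EQUILATERAL;
        } else if (b == c || a == b || c == a) {
            result = Type.ISOSCELES;
        } else {
            result = Type.SCALENE;
        }
        return result;
    }

    public static void main(String[] args) {
        // One test case per outcome; the same four cases cover both versions.
        int[][] cases = { {1, 2, 3}, {2, 2, 2}, {2, 2, 3}, {3, 4, 5} };
        for (int[] t : cases) {
            System.out.println(classifyEarlyReturn(t[0], t[1], t[2])
                    + " " + classifySingleReturn(t[0], t[1], t[2]));
        }
        // INVALID INVALID
        // EQUILATERAL EQUILATERAL
        // ISOSCELES ISOSCELES
        // SCALENE SCALENE
    }
}
```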
Is there a better way to handle the situation?
Actually, I do have an answer: For each early return use | instead of || once. :D
Now seriously: There is a bug report requesting annotations that allow disabling a single rule; it is marked as fixed. I didn't look any further.
Since the question is also about early return statements as a coding style, it would be helpful to consider the effect of size on the return style. If the method or function is small, less than say 30 lines, early returns are no problem, because anyone reading the code can see the whole method at a glance including all of the returns. In larger methods or functions, an early return can be a trap unintentionally set for the reader. If the early return occurs above the code the reader is looking at, and the reader doesn't know the return is above or forgets that it is above, the reader will misunderstand the code. Production code can be too big to fit on one screen.
So whoever is managing a code base for complexity should allow for method size in cases where the complexity appears to be a problem. If the code takes more than one screen, a more pedantic return style may be justified. If the method or function is small, don't worry about it.
(I use Sonar and have experienced this same issue.)

Scala 2.10.1 and specialization (can't get it working right)

Sorry for asking about specialization a second time, but I still don't have a good understanding of what the heck is going on...
So, I have one project (Gomoku game with AI), and I decided to use my own simple and dirty @specialized ad-hoc collections in the hot part of it, because I must store primitive types without boxing. The problem is that this doesn't really help, because in jvisualvm's Sampler I clearly see
scala.runtime.BoxesRunTime.boxToShort()
eating up thousands of ms when the optimal move search starts running.
The project: https://github.com/magicgoose/Gomoku
The file with the poor "collections": https://github.com/magicgoose/Gomoku/blob/master/src/magicgoose/gomoku/ai/SpecializedCollections.scala
The method, which causes boxing (one of them, I think):
trait Indexed[@specialized T] extends Enumerable[T] {
  @inline def length: Int
  @inline def apply(i: Int): T
  // ...
  @inline final def findIndex(fun: T => Boolean) = {
    @tailrec def find(i: Int): Int = {
      if (i < length) {
        if (fun(this(i))) i
        else find(i + 1)
      } else -1
    }
    find(0)
  }
}
I have seen another project (debox: https://github.com/non/debox), which tries to accomplish a similar thing (data collections without primitive boxing), but I don't really understand how it is done there.
This has an easy answer: Function1 is not specialized on Short arguments, only Int, Long, Float, and Double. So when you call fun you need to box on the way in.
Either use your own function class--sadly lacking the convenient shorthand!--or make sure you are not using Short => Boolean but rather Int => Boolean (and the types know it). Note that when I said it was easy, I meant only easy to explain the problem: neither solution is all that easy to implement, but at the moment this is what's necessary.

Teacher wants only one if for 3 [(a OR b) AND c] questions

This is a weird question, I know, but I need to write a program with 3 questions, basically [(a OR b) AND c], without using if. My teacher wants us to ask the user whether an animal is black, answering y or n. If n, ask whether it is white, answering y or n. If either statement is true, then ask whether it is friendly, answering y or n. If it is black or white and friendly, we get a message that it can come home with me; otherwise we get a sorry message. My problem is that she says we can use only one if and must use compareToIgnoreCase and a function. I can do this with if, but I can't figure out how to even begin without using if. Please help: I've Googled and read all kinds of answers to anything that sounded at all promising, and everything I find directs me to use if statements.
This seems like a question to teach you short-circuit evaluation. The idea is to have a function answersYesTo(String question) and use it in your boolean expression (a || b) && c. Short-circuit evaluation starts by evaluating a and only evaluates b if a evaluates to false. The reason is that if a is true, we already know that a||b is true, so there is no need to evaluate the rest of the subexpression.
Furthermore, c will NOT be evaluated if a||b evaluates to false, since at that point we know that the whole expression will evaluate to false.
The following code shows one possible implementation:
import java.io.Console;

public class App
{
    public static void main(String[] args) {
        boolean allowedToBringHome =
            (answersYesTo("Is the animal black?") || answersYesTo("Is the animal white?"))
            && answersYesTo("Is it friendly?");
        if (allowedToBringHome) {
            print("You can bring the animal home.");
        }
        else {
            print("Sorry, you can't bring the animal home.");
        }
    }

    static boolean answersYesTo(String question) {
        String answer = System.console().readLine(question);
        return answer.compareToIgnoreCase("y") == 0;
    }

    static void print(String msg) {
        System.out.println(msg);
    }
}
NOTE: When using short-circuit evaluation, always consider the readability of your code. Complex expressions become difficult to read and grasp very quickly, which increases the risk of introducing bugs.
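To watch the evaluation order without typing answers at a console, the questions can be replaced by canned answers that log when they are "asked". This is only an illustrative sketch: the class ShortCircuitTrace and the helpers ask and trace are hypothetical and stand in for answersYesTo.

```java
import java.util.ArrayList;
import java.util.List;

public class ShortCircuitTrace {
    // Records which questions were actually evaluated, in order.
    static final List<String> asked = new ArrayList<>();

    // Hypothetical stand-in for answersYesTo: logs the question name
    // and returns a canned answer instead of reading the console.
    static boolean ask(String name, boolean answer) {
        asked.add(name);
        return answer;
    }

    // Evaluates (black || white) && friendly and returns the questions
    // that were asked, followed by the final result.
    static List<String> trace(boolean black, boolean white, boolean friendly) {
        asked.clear();
        boolean result = (ask("black", black) || ask("white", white))
                && ask("friendly", friendly);
        List<String> out = new ArrayList<>(asked);
        out.add("=> " + result);
        return out;
    }

    public static void main(String[] args) {
        // black is true, so white is never asked.
        System.out.println(trace(true, false, true));   // [black, friendly, => true]
        // Neither black nor white, so friendly is never asked.
        System.out.println(trace(false, false, true));  // [black, white, => false]
        // All three questions are needed here.
        System.out.println(trace(false, true, false));  // [black, white, friendly, => false]
    }
}
```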
You can try using the ternary operator (http://en.wikipedia.org/wiki/%3F:)
You are then not using an actual "if" statement, but an expression that behaves like an if/else.
Example:
if (a == b)
    return 1;
else
    return 0;
is the same as
return (a == b) ? 1 : 0;
I don't think we're going to do your homework for you, but depending on your teacher's definition of using an 'if', you may be able to use a ternary operator.
i.e. you can write if (A) do x else do y as A ? x : y.
Alternatively, read up on switch/case statements. This isn't a great solution for this sort of thing but does work with your constraints.

OR between two function call

What is the meaning of a || between two function calls,
like
{
//some code
return Find(n.left,req)||Find(n.right,req);
}
http://www.careercup.com/question?id=7560692
Can someone help me understand? Many thanks in advance.
It means that it returns true if at least one of the two function calls returns true.
Depending on the programming language, Find(n.left,req) is called first: if it returns true, the whole expression returns true. If it returns false, Find(n.right,req) is called and its Boolean value is returned.
In Java (and C and C#) || means "lazy or". The single stroke | also means "or", but operates slightly differently.
To calculate a||b, the computer calculates the truth value (true or false) of a, and if a is true it returns the value true without bothering to calculate b, hence the word "lazy". Only if a is false will it check b to see if it is true (and so whether a or b is true).
To calculate a|b, the computer works out the value of a and b first, then "ors" the answers together.
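The difference shows up as soon as the right-hand operand has a side effect. In the following Java sketch, foo is a hypothetical method that counts its own invocations; the class and helper names are made up for the example.

```java
public class EagerVsLazyOr {
    static int calls = 0;

    // Hypothetical method with a side effect: it counts how often it runs.
    static boolean foo() {
        calls++;
        return false;
    }

    // Number of times foo() runs when evaluating a || foo().
    static int callsWithLazyOr(boolean a) {
        calls = 0;
        boolean x = a || foo();
        return calls;
    }

    // Number of times foo() runs when evaluating a | foo().
    static int callsWithEagerOr(boolean a) {
        calls = 0;
        boolean x = a | foo();
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(callsWithLazyOr(true));   // 0: foo() is skipped
        System.out.println(callsWithLazyOr(false));  // 1
        System.out.println(callsWithEagerOr(true));  // 1: foo() runs even though a is true
        System.out.println(callsWithEagerOr(false)); // 1
    }
}
```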
The "lazy or" || is more efficient, because it sometimes does not need to calculate b at all. The only reason you might want to use a|b is if b is actually a function (method) call, and because of some side-effect of the method you want to be sure it executes exactly once. I personally consider this poor programming technique, and on the very few occasions that I want b to always be explicitly calculated I do this explicitly and then use a lazy or.
Eg consider a function or method foo() which returns a boolean. Instead of
boolean x = a|foo(something);
I would write
boolean c=foo(something);
boolean x = a||c;
Which explicitly calls foo() exactly once, so you know what is going on.
Much better programming practice, IMHO. Indeed the best practice would be to eliminate the side effect in foo() entirely, but sometimes that's hard to do.
If you are using lazy or || think about the order you evaluate it in. If a is easy to calculate and usually true, a||b will be more efficient than b||a, as most of the time a simple calculation for a is all that is needed. Conversely if b is usually false and is difficult to calculate, b||a will not be much more efficient than a|b. If one of a or b is a constant and the other a method call, you should have the constant as the first term a||foo() rather than foo()||a as a method call will always be slower than using a simple local variable.
Hope this helps.
Peter Webb
return Find(n.left,req)||Find(n.right,req);
means: execute the first call, Find(n.left,req), and return true if it returns true; otherwise
execute the second call, Find(n.right,req), and return true if it returns true, otherwise false.
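Put into a complete method, the pattern might look like the sketch below. Only the line with || comes from the question; the Node class, the null check, the value comparison, and the visit counter are assumptions added to make the example runnable.

```java
public class TreeFind {
    // Hypothetical binary tree node; the question only shows n.left and n.right.
    static final class Node {
        final int value;
        final Node left;
        final Node right;

        Node(int value, Node left, Node right) {
            this.value = value;
            this.left = left;
            this.right = right;
        }
    }

    // Counts visited nodes so the effect of short-circuiting is visible.
    static int visited = 0;

    static boolean find(Node n, int req) {
        if (n == null) {
            return false;              // assumed base case: empty subtree
        }
        visited++;
        if (n.value == req) {
            return true;               // assumed base case: value found here
        }
        // The right subtree is searched only if the left search returned false.
        return find(n.left, req) || find(n.right, req);
    }

    // Runs a search from the root and reports how many nodes were visited.
    static int countVisits(Node root, int req) {
        visited = 0;
        find(root, req);
        return visited;
    }

    //       1
    //      / \
    //     2   3
    //    / \
    //   4   5
    static Node sampleTree() {
        Node left = new Node(2, new Node(4, null, null), new Node(5, null, null));
        return new Node(1, left, new Node(3, null, null));
    }

    public static void main(String[] args) {
        Node root = sampleTree();
        System.out.println(find(root, 4));        // true
        System.out.println(countVisits(root, 4)); // 3: nodes 1, 2, 4; the rest is skipped
        System.out.println(countVisits(root, 3)); // 5: the left subtree is exhausted first
        System.out.println(find(root, 9));        // false
    }
}
```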
