Postfix Expression Calculator Validation

How can I edit this code slightly so that a loop tells the user their input was incorrect and then lets them retry typing in a postfix expression?
import java.util.Scanner;
public class Calculator {
public static void main(String[] args) {
Scanner scan = new Scanner(System.in) ;
System.out.println("Enter your two numbers and the operation with spaces between e.g 8 9 -");
String calculation=scan.nextLine();
String [] parts = calculation.split(" ");
try{
double part1 = Double.parseDouble(parts[0]);
double part2 = Double.parseDouble(parts[1]);
double answer = 0;
boolean incorrectOperation = false;
String operation = parts[2];
switch (operation) {
case "+":
answer = part1 + part2;
break;
case "-":
answer = part1 - part2;
break;
case "*":
answer = part1 * part2;
break;
case "/":
answer = part1 / part2;
break;
default:
incorrectOperation = true;
}
String ans;
if(incorrectOperation) {
ans = "Please use +, -, * or / for operation";
} else {
ans = String.valueOf(answer);
}
System.out.println(ans);
} catch (NumberFormatException e) {
System.out.println("Invalid expression " + calculation);
}
}
}
It needs to output some sort of message using 'System.out.println' to tell the user that what they have typed is incorrect.
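One way the retry could be structured, as a minimal sketch rather than a definitive answer: the parsing moves into a helper that returns null for bad input, and a while loop keeps prompting until the helper succeeds. The class name RetryingCalculator and the helper evaluate are hypothetical names introduced here; the parsing itself follows the code above.

```java
import java.util.Scanner;

// Hypothetical class name; the parsing logic mirrors the question's code.
class RetryingCalculator {

    /** Returns the result, or null when the line is not "number number operator". */
    static Double evaluate(String line) {
        String[] parts = line.trim().split("\\s+");
        if (parts.length != 3) {
            return null; // wrong number of tokens
        }
        try {
            double a = Double.parseDouble(parts[0]);
            double b = Double.parseDouble(parts[1]);
            switch (parts[2]) {
                case "+": return a + b;
                case "-": return a - b;
                case "*": return a * b;
                case "/": return a / b;
                default:  return null; // unknown operator
            }
        } catch (NumberFormatException e) {
            return null; // one of the operands is not a number
        }
    }

    public static void main(String[] args) {
        Scanner scan = new Scanner(System.in);
        Double answer = null;
        // Keep asking until a valid expression is entered or the input ends.
        while (answer == null) {
            System.out.println("Enter your two numbers and the operation with spaces between, e.g. 8 9 -");
            if (!scan.hasNextLine()) {
                return;
            }
            String line = scan.nextLine();
            answer = evaluate(line);
            if (answer == null) {
                System.out.println("Invalid expression \"" + line + "\", please try again");
            }
        }
        System.out.println(answer);
    }
}
```

Checking the token count before reading parts[2] also avoids an ArrayIndexOutOfBoundsException when fewer than three tokens are typed.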

Related

Check if two mathematical expressions are equivalent

I came across a question in an interview. I tried solving it but could not come up with a solution. Question is:
[Edited]
First part: You are given two expressions that use only the "+" operator; check whether the two expressions are mathematically equivalent.
For example, "A+B+C" is equivalent to "A+(B+C)".
Second part: You are given two expressions that use only the "+" and "-" operators; check whether the two expressions are mathematically equivalent.
For example, "A+B-C" is equivalent to "A-(-B+C)".
My thought process: I was thinking of building an expression tree out of each expression and looking for some kind of similarity, but I am unable to come up with a good way of checking whether two expression trees are equivalent.
Can someone help me with this? :) Thanks in advance!
As long as the operations are commutative, the solution I'd propose is to distribute the parenthesized operations, sort the terms by variable, and then run an aggregator across them, which gives you a string of factors and symbols. Then just compare the sets of factors.
Aggregate variable counts until encountering an opening parenthesis, treating subtraction as addition of the negated variable. Handle sub-expressions recursively.
The content of sub-expressions can be directly aggregated into the counts, you just need to take the sign into account properly -- there is no need to create an actual expression tree for this task. The TreeMap used in the code is just a sorted map implementation in the JDK.
The code takes advantage of the fact that the current position is part of the Reader state, so we can easily continue parsing after the closing bracket of the recursive call without needing to hand this information back to the caller explicitly somehow.
Implementation in Java (untested):
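import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
class Expression {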
// Count for each variable name
Map<String, Integer> counts = new TreeMap<>();
Expression(String s) throws IOException {
this(new StringReader(s));
}
Expression(Reader reader) throws IOException {
int sign = 1;
while (true) {
int token = reader.read();
switch (token) {
case -1: // Eof
case ')':
return;
case '(':
add(sign, new Expression(reader));
sign = 1;
break;
case '+':
break;
case '-':
sign = -sign;
break;
default:
add(sign, String.valueOf((char) token));
sign = 1;
break;
}
}
}
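void add(int factor, String variable) {
int count = counts.containsKey(variable) ? counts.get(variable) : 0;
// Drop variables whose count cancels to zero so that "A-A+B" equals "B".
if (count + factor == 0) {
counts.remove(variable);
} else {
counts.put(variable, count + factor);
}
}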
void add(int sign, Expression expr) {
for (Map.Entry<String,Integer> entry : expr.counts.entrySet()) {
add(sign * entry.getValue(), entry.getKey());
}
}
@Override
public boolean equals(Object o) {
return (o instanceof Expression)
&& ((Expression) o).counts.equals(counts);
}
@Override
public int hashCode() {
return counts.hashCode(); // keep hashCode consistent with equals
}
// Not needed for the task, just added for illustration purposes.
@Override
public String toString() {
StringBuilder sb = new StringBuilder();
for (Map.Entry<String,Integer> entry : counts.entrySet()) {
if (sb.length() > 0) {
sb.append(" + ");
}
sb.append(entry.getValue()); // count
sb.append(entry.getKey()); // variable name
}
return sb.toString();
}
}
Compare with
new Expression("A+B-C").equals(new Expression("A-(-B+C)"))
P.S: Added a toString() method to illustrate the data structure better.
Should print 1A + 1B + -1C for the example.
P.P.P.P.S.: Fixes, simplification, better explanation.
You can parse the expressions from left to right and reduce them to a canonical form for comparison in a straightforward way. The only complication is that when you encounter a closing bracket, you need to know whether its associated opening bracket had a plus or minus in front of it. You can use a stack for that, e.g.:
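function Dictionary() {
this.d = {}; // plain object used as a map; an array's length ignores string keys
}
Dictionary.prototype.add = function(key, value) {
if (!this.d.hasOwnProperty(key)) this.d[key] = value;
else this.d[key] += value;
// Drop variables that cancel out so they do not affect the comparison
if (this.d[key] === 0) delete this.d[key];
}
Dictionary.prototype.compare = function(other) {
for (var key in this.d) {
if (!other.d.hasOwnProperty(key) || other.d[key] != this.d[key]) return false;
}
return Object.keys(this.d).length == Object.keys(other.d).length;
}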
function canonize(expression) {
var tokens = expression.split('');
var variables = new Dictionary();
var sign_stack = [];
var total_sign = 1;
var current_sign = 1;
for (var i in tokens) {
switch(tokens[i]) {
case '(' : {
sign_stack.push(current_sign);
total_sign *= current_sign;
current_sign = 1;
break;
}
case ')' : {
total_sign *= sign_stack.pop();
break;
}
case '+' : {
current_sign = 1;
break;
}
case '-' : {
current_sign = -1;
break;
}
case ' ' : {
break;
}
default : {
variables.add(tokens[i], current_sign * total_sign);
}
}
}
return variables;
}
var a = canonize("A + B + (A - (A + C - B) - B) - C");
var b = canonize("-C - (-A - (B + (-C)))");
document.write(a.compare(b));

Error catching in stacks

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.StreamTokenizer;
import static java.lang.Math.pow;
public class InfixExpressionEvaluator {
// Tokenizer to break up our input into tokens
StreamTokenizer tokenizer;
// Stacks for operators (for converting to postfix) and operands (for
// evaluating)
StackInterface<Character> operatorStack;
StackInterface<Double> operandStack;
//counts brackets
int Bracket1=0, Bracket2=0, count = 0;
/**
* Initializes the evaluator to read an infix expression from an input
* stream.
* @param input the input stream from which to read the expression
*/
public InfixExpressionEvaluator(InputStream input) {
// Initialize the tokenizer to read from the given InputStream
tokenizer = new StreamTokenizer(new BufferedReader(
new InputStreamReader(input)));
// StreamTokenizer likes to consider - and / to have special meaning.
// Tell it that these are regular characters, so that they can be parsed
// as operators
tokenizer.ordinaryChar('-');
tokenizer.ordinaryChar('/');
// Allow the tokenizer to recognize end-of-line, which marks the end of
// the expression
tokenizer.eolIsSignificant(true);
// Initialize the stacks
operatorStack = new ArrayStack<Character>();
operandStack = new ArrayStack<Double>();
}
/**
* Parses and evaluates the expression read from the provided input stream,
* then returns the resulting value
* @return the value of the infix expression that was parsed
*/
public Double evaluate() throws ExpressionError {
// Get the first token. If an IO exception occurs, replace it with a
// runtime exception, causing an immediate crash.
try {
tokenizer.nextToken();
} catch (IOException e) {
throw new RuntimeException(e);
}
boolean preNum = false, preOp = false;
boolean preOpenbracket = false, preClosebracket = false;
int Bracket1 = 0, Bracket2 = 0, count = 0;
// Continue processing tokens until we find end-of-line
while (tokenizer.ttype != StreamTokenizer.TT_EOL) {
// Consider possible token type
switch (tokenizer.ttype) {
case StreamTokenizer.TT_NUMBER:
// Expression error handling
if ((preNum == true) && (count > 0)) {
throw new ExpressionError("Two operands in a row");
}
if ((preClosebracket == true) && (count > 0)) {
throw new ExpressionError("A close bracket cannot be followed by a number");
}
// If the token is a number, process it as a double-valued
// operand
processOperand((double) tokenizer.nval);
preOp = false;
preNum = true;
preOpenbracket = false;
preClosebracket = false;
break;
case '+':
case '-':
case '*':
case '/':
case '^':
// check for error in input
if (count == 0) {
throw new ExpressionError("Leading off with operator is illegal");
}
if ((preOp == true) && (count > 0)) {
throw new ExpressionError("Two operators in a row");
}
if ((preOpenbracket == true) && (count > 0)) {
throw new ExpressionError("An open bracket cannot be followed by an operator");
}
// If the token is any of the above characters, process it
// as an operator
processOperator((char) tokenizer.ttype);
preOp = true;
preNum = false;
preOpenbracket = false;
preClosebracket = false;
break;
case '(':
case '[':
// Expression error handling
if ((preNum == true) && (count > 0)) {
throw new ExpressionError("An open bracket cannot be preceded by a number");
}
// If the token is open bracket, process it as such. Forms
// of bracket are interchangeable but must nest properly.
processOpenBracket((char) tokenizer.ttype);
preOp = false;
preNum = false;
preOpenbracket = true;
preClosebracket = false;
Bracket1++;
break;
case ')':
case ']':
// Expression error handling
if ((preOp == true) && (count > 0)) {
throw new ExpressionError("A close bracket cannot be preceded by an operator");
}
// If the token is close bracket, process it as such. Forms
// of bracket are interchangeable but must nest properly.
processCloseBracket((char) tokenizer.ttype);
preOp = false;
preNum = false;
preOpenbracket = false;
preClosebracket = true;
Bracket2++;
break;
case StreamTokenizer.TT_WORD:
// If the token is a "word", throw an expression error
throw new ExpressionError("Unrecognized token: "
+ tokenizer.sval);
default:
// If the token is any other type or value, throw an
// expression error
throw new ExpressionError("Unrecognized token: "
+ String.valueOf((char) tokenizer.ttype));
}
count++;
// Read the next token, again converting any potential IO exception
try {
tokenizer.nextToken();
} catch (IOException e) {
throw new RuntimeException(e);
}
}
// Almost done now, but we may have to process remaining operators in
// the operators stack
processRemainingOperators();
if (Bracket1 != Bracket2) {
throw new ExpressionError("\nNot the same number of bracket");
}
// Return the result of the evaluation -
return operandStack.pop();
}
void processOperand(double operand) {
//push operand
operandStack.push(operand);
}
void processOperator(char operator) {
// Precedence order: ^*/ level 2, +- level 1, ( and [ have the lowest level 0
while (!operatorStack.isEmpty() && hasPrecedence(operator,
operatorStack.peek())) {
// perform step 1a,1b,1c,1d
operandStack.push(applyOp(operatorStack.pop(),
operandStack.pop(), operandStack.pop()));
}
// Step 2: thisOp has more precedence than one on top of
// operatorStack, push it
operatorStack.push(operator);
}
// Returns true if 'op2' has higher or same precedence as 'op1',
// otherwise returns false.
public static boolean hasPrecedence(char op1, char op2)
{
if (op2 == '(' || op2 == ')' || op2 == '[' || op2 == ']')
return false;
if ((op1 == '*' || op1 == '/' || op1 == '^') && (op2 == '+' || op2
== '-'))
return false;
else
return true;
}
void processOpenBracket(char openBracket) {
operatorStack.push(openBracket);
}
void processCloseBracket(char closeBracket) {
//Declare variables
char operator='0';
double a=0, b=0, c=0;
boolean correct = true;
//check for error before loop
if (operatorStack.isEmpty()) {
throw new ExpressionError("Brackets/parentheses are uneven");
}
if (operatorStack.peek()== '(' || operatorStack.peek()=='[') {
throw new ExpressionError("There was an empty set of brackets or unneeded use of brackets");
}
if (operandStack.isEmpty()) {
throw new ExpressionError("Too many operators");
}
//loop through stacks and pop off operators and operands
while (correct) {
operator = operatorStack.pop();
b = operandStack.pop();
a = operandStack.pop();
//check to see if next one is bracket
if (operatorStack.peek() == '(' || operatorStack.peek() == '[')
{
correct = false;
}
}
//check for errors
if (closeBracket == ')' && !operatorStack.isEmpty() &&
operatorStack.peek() != '(') {
throw new ExpressionError("Parenthesis do not match");
}
if (closeBracket == ']' && !operatorStack.isEmpty() &&
operatorStack.peek() != '[') {
throw new ExpressionError("Brackets do not match");
}
//calculate result and push it onto the stack, pop off bracket
c = applyOp(operator, b, a);
operandStack.push(c);
operatorStack.pop();
}
public static double applyOp(char op, double b, double a)
{
switch (op) {
case '+':
return a + b;
case '-':
return a - b;
case '*':
return a * b;
case '^':
return pow(a,b);
case '/':
//check for division by zero
if (b == 0)
throw new ExpressionError("No division by zero");
return a / b;
}
return 0;
}
/**
* This method is called when the evaluator encounters the end of an
* expression. It manipulates operatorStack and/or operandStack to process
* the operators that remain on the stack, according to the Infix-to-Postfix
* and Postfix-evaluation algorithms.
*/
void processRemainingOperators() {
//error check
double a;
if (!operatorStack.isEmpty()) {
if (operatorStack.peek()=='(' || operatorStack.peek()=='[') {
throw new ExpressionError("Uneven number of parenthesis");
}
//check for expression ending with operator
a=operandStack.pop();
if (operandStack.isEmpty()) {
throw new ExpressionError("You can not end with an operator");
}
operandStack.push(a);
}
//process the remaining operators and answer is placed in the stack alone
while (!operatorStack.isEmpty()) {
operandStack.push(applyOp(operatorStack.pop(),
operandStack.pop(), operandStack.pop()));
}
}
/**
* Creates an InfixExpressionEvaluator object to read from System.in, then
* evaluates its input and prints the result.
* @param args not used
*/
public static void main(String[] args) {
System.out.println("\nInfix expression:");
InfixExpressionEvaluator evaluator =
new InfixExpressionEvaluator(System.in);
Double value = null;
try {
value = evaluator.evaluate();
} catch (ExpressionError e) {
System.out.println("ExpressionError: " + e.getMessage());
}
if (value != null) {
System.out.println(value);
} else {
System.out.println("Evaluator returned null");
}
}
}
So for this program we're supposed to implement two stacks to do simple arithmetic. If the expression is entered correctly, the program works, but my error catching doesn't work. I tried using counter variables to count the parentheses and brackets, but that didn't work either. Here are some cases that didn't work:
2^(2+3*4)
2*14.5+6/5-(5*8-5/9)
10000 * [1+.20/12]^(12*4)
(4+3*2 (error catch here: the program should report an error because there is no closing parenthesis)
and many more... any ideas?

How to get a substring in some length for special chars like Chinese

For example, I can get 80 chars with ${description?substring(0, 80)} if description is in English, but for Chinese chars, I can get only about 10 chars, and there is always a garbage char at the end.
How can I get 80 chars for any language?
FreeMarker relies on String#substring to do the actual (UTF-16-chars-based?) substring calculation, which doesn't work well with Chinese characters. Instead, one should use Unicode code points. Based on this post and FreeMarker's own substring builtin, I hacked together a FreeMarker TemplateMethodModelEx implementation which operates on code points:
public class CodePointSubstring implements TemplateMethodModelEx {
@Override
public Object exec(List args) throws TemplateModelException {
int argCount = args.size(), left = 0, right = 0;
String s = "";
if (argCount != 3) {
throw new TemplateModelException(
"Error: Expecting 1 string and 2 numerical arguments here");
}
try {
TemplateScalarModel tsm = (TemplateScalarModel) args.get(0);
s = tsm.getAsString();
} catch (ClassCastException cce) {
String mess = "Error: Expecting string argument here";
throw new TemplateModelException(mess);
}
try {
TemplateNumberModel tnm = (TemplateNumberModel) args.get(1);
left = tnm.getAsNumber().intValue();
tnm = (TemplateNumberModel) args.get(2);
right = tnm.getAsNumber().intValue();
} catch (ClassCastException cce) {
String mess = "Error: Expecting numerical argument here";
throw new TemplateModelException(mess);
}
return new SimpleScalar(getSubstring(s, left, right));
}
private String getSubstring(String s, int start, int end) {
int[] codePoints = new int[end - start];
int length = s.length();
int i = 0;
// 'offset' counts UTF-16 chars, 'index' counts code points; start and end are code point indices
for (int offset = 0, index = 0; offset < length && i < codePoints.length; index++) {
int codepoint = s.codePointAt(offset);
if (index >= start) {
codePoints[i] = codepoint;
i++;
}
offset += Character.charCount(codepoint);
}
return new String(codePoints, 0, i);
}
}
You can put an instance of it into your data model root, e.g.
SimpleHash root = new SimpleHash();
root.put("substring", new CodePointSubstring());
template.process(root, ...);
and use the custom substring method in FTL:
${substring(description, 0, 80)}
I tested it with non-Chinese characters, which still worked, but so far I haven't tried it with Chinese characters. Maybe you want to give it a try.
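For comparison, the code point arithmetic can also be sketched with String.offsetByCodePoints from the JDK, which converts a code point index into a UTF-16 char index. This is only an illustrative sketch; the class name CodePoints and its substring helper are hypothetical and not part of FreeMarker.

```java
// Hypothetical helper class; only java.lang.String methods are used.
class CodePoints {

    /** Returns the code points in [start, end), clamped to the string's length. */
    static String substring(String s, int start, int end) {
        int total = s.codePointCount(0, s.length());
        int from = Math.max(0, Math.min(start, total));
        int to = Math.max(from, Math.min(end, total));
        // offsetByCodePoints converts a code point index into a UTF-16 char index.
        int fromChar = s.offsetByCodePoints(0, from);
        int toChar = s.offsetByCodePoints(fromChar, to - from);
        return s.substring(fromChar, toChar);
    }
}
```

Because the indices are clamped, asking for 80 code points of a shorter string simply returns the whole string instead of throwing.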

Basic Recursion, Check Balanced Parenthesis

I've written software in the past that uses a stack to check for balanced equations, but now I'm asked to write a similar algorithm recursively to check for properly nested brackets and parentheses.
Good examples: () [] ()
([]()[])
Bad examples: ( (] ([)]
Suppose my function is called: isBalanced.
Should each pass evaluate a smaller substring (until reaching a base case of 2 left)? Or, should I always evaluate the full string and move indices inward?
First, to your original question, just be aware that if you're working with very long strings, you don't want to be making exact copies minus a single letter each time you make a function call. So you should favor using indexes or verify that your language of choice isn't making copies behind the scenes.
Second, I have an issue with all the answers here that are using a stack data structure. I think the point of your assignment is for you to understand that with recursion your function calls create a stack. You don't need to use a stack data structure to hold your parentheses because each recursive call is a new entry on an implicit stack.
I'll demonstrate with a C program that matches ( and ). Adding the other types like [ and ] is an exercise for the reader. All I maintain in the function is my position in the string (passed as a pointer) because the recursion is my stack.
/* Search a string for matching parentheses. If the parentheses match, returns a
* pointer that addresses the nul terminator at the end of the string. If they
* don't match, the pointer addresses the first character that doesn't match.
*/
const char *match(const char *str)
{
if( *str == '\0' || *str == ')' ) { return str; }
if( *str == '(' )
{
const char *closer = match(++str);
if( *closer == ')' )
{
return match(++closer);
}
return str - 1;
}
return match(++str);
}
Tested with this code:
const char *test[] = {
"()", "(", ")", "", "(()))", "(((())))", "()()(()())",
"(() ( hi))) (())()(((( ))))", "abcd"
};
for( size_t index = 0; index < sizeof(test) / sizeof(test[0]); ++index ) {
const char *result = match(test[index]);
printf("%s:\t", test[index]);
*result == '\0' ? printf("Good!\n") :
printf("Bad @ char %d\n", (int)(result - test[index]) + 1);
}
Output:
(): Good!
(: Bad @ char 1
): Bad @ char 1
: Good!
(())): Bad @ char 5
(((()))): Good!
()()(()()): Good!
(() ( hi))) (())()(((( )))): Bad @ char 11
abcd: Good!
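The same idea, letting the call stack remember which bracket is currently open, can be sketched in Java for several bracket types. This is an illustrative sketch and not part of the answer above; the class name RecursiveBrackets and the helper parse are hypothetical. parse returns the index where it stopped (an unmatched closer or the end of the string), or -1 on a mismatch.

```java
// Hypothetical class; recursion is used only for nesting, so no explicit stack is needed.
class RecursiveBrackets {
    private static final String OPEN = "([{";
    private static final String CLOSE = ")]}";

    /**
     * Consumes balanced groups starting at pos. Returns the index of the first
     * closer that has no opener in this call (or s.length() at the end),
     * or -1 if an opener is not followed by its own closer.
     */
    static int parse(String s, int pos) {
        while (pos < s.length()) {
            char c = s.charAt(pos);
            int kind = OPEN.indexOf(c);
            if (kind >= 0) {
                // The recursive call plays the role of pushing the opener.
                int closer = parse(s, pos + 1);
                if (closer < 0 || closer >= s.length()
                        || s.charAt(closer) != CLOSE.charAt(kind)) {
                    return -1;
                }
                pos = closer + 1; // continue after the matched pair
            } else if (CLOSE.indexOf(c) >= 0) {
                return pos; // let the caller decide whether this closer matches
            } else {
                pos++; // skip non-bracket characters
            }
        }
        return pos;
    }

    static boolean isBalanced(String s) {
        // Balanced only if the top-level call reaches the end of the string.
        return parse(s, 0) == s.length();
    }
}
```

With the examples from the question, isBalanced accepts "() [] ()" and "([]()[])" and rejects "(", "(]" and "([)]".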
There are many ways to do this, but the simplest algorithm is to process forward, left to right, passing the stack as a parameter:
FUNCTION isBalanced(String input, String stack) : boolean
    IF isEmpty(input)
        RETURN isEmpty(stack)
    ELSE IF isOpen(firstChar(input))
        RETURN isBalanced(allButFirst(input), stack + firstChar(input))
    ELSE IF isClose(firstChar(input))
        RETURN NOT isEmpty(stack) AND isMatching(firstChar(input), lastChar(stack))
            AND isBalanced(allButFirst(input), allButLast(stack))
    ELSE
        ERROR "Invalid character"
Here it is implemented in Java. Note that I've switched it now so that the stack pushes in front instead of at the back of the string, for convenience. I've also modified it so that it just skips non-parenthesis symbols instead of reporting it as an error.
static String open = "([<{";
static String close = ")]>}";
static boolean isOpen(char ch) {
return open.indexOf(ch) != -1;
}
static boolean isClose(char ch) {
return close.indexOf(ch) != -1;
}
static boolean isMatching(char chOpen, char chClose) {
return open.indexOf(chOpen) == close.indexOf(chClose);
}
static boolean isBalanced(String input, String stack) {
return
input.isEmpty() ?
stack.isEmpty()
: isOpen(input.charAt(0)) ?
isBalanced(input.substring(1), input.charAt(0) + stack)
: isClose(input.charAt(0)) ?
!stack.isEmpty() && isMatching(stack.charAt(0), input.charAt(0))
&& isBalanced(input.substring(1), stack.substring(1))
: isBalanced(input.substring(1), stack);
}
Test harness:
String[] tests = {
"()[]<>{}",
"(<",
"]}",
"()<",
"(][)",
"{(X)[XY]}",
};
for (String s : tests) {
System.out.println(s + " = " + isBalanced(s, ""));
}
Output:
()[]<>{} = true
(< = false
]} = false
()< = false
(][) = false
{(X)[XY]} = true
The idea is to keep a list of the opened brackets, and when you find a closing bracket, check whether it closes the last opened one:
If those brackets match, remove the last opened bracket from the list of openedBrackets and continue checking recursively on the rest of the string.
Otherwise, you have found a bracket that closes one that was never opened, so the string is not balanced.
When the string is finally empty, return true if the list of brackets is empty too (so all the brackets have been closed), else false.
ALGORITHM (in Java):
public static boolean isBalanced(final String str1, final LinkedList<Character> openedBrackets, final Map<Character, Character> closeToOpen) {
if ((str1 == null) || str1.isEmpty()) {
return openedBrackets.isEmpty();
} else if (closeToOpen.containsValue(str1.charAt(0))) {
openedBrackets.add(str1.charAt(0));
return isBalanced(str1.substring(1), openedBrackets, closeToOpen);
} else if (closeToOpen.containsKey(str1.charAt(0))) {
if (!openedBrackets.isEmpty() && openedBrackets.getLast().equals(closeToOpen.get(str1.charAt(0)))) {
openedBrackets.removeLast();
return isBalanced(str1.substring(1), openedBrackets, closeToOpen);
} else {
return false;
}
} else {
return isBalanced(str1.substring(1), openedBrackets, closeToOpen);
}
}
TEST:
public static void main(final String[] args) {
final Map<Character, Character> closeToOpen = new HashMap<Character, Character>();
closeToOpen.put('}', '{');
closeToOpen.put(']', '[');
closeToOpen.put(')', '(');
closeToOpen.put('>', '<');
final String[] testSet = new String[] { "abcdefksdhgs", "[{aaa<bb>dd}]<232>", "[ff{<gg}]<ttt>", "{<}>" };
for (final String test : testSet) {
System.out.println(test + " -> " + isBalanced(test, new LinkedList<Character>(), closeToOpen));
}
}
OUTPUT:
abcdefksdhgs -> true
[{aaa<bb>dd}]<232> -> true
[ff{<gg}]<ttt> -> false
{<}> -> false
Note that I have imported the following classes:
import java.util.HashMap;
import java.util.LinkedList;
import java.util.Map;
public static boolean isBalanced(String str) {
if (str.length() == 0) {
return true;
}
if (str.contains("()")) {
return isBalanced(str.replaceFirst("\\(\\)", ""));
}
if (str.contains("[]")) {
return isBalanced(str.replaceFirst("\\[\\]", ""));
}
if (str.contains("{}")) {
return isBalanced(str.replaceFirst("\\{\\}", ""));
} else {
return false;
}
}
Balanced Parenthesis (JS)
The more intuitive solution is to use stack like so:
function isBalanced(str) {
const parentesis = {
'(': ')',
'[': ']',
'{': '}',
};
const closing = Object.values(parentesis);
const stack = [];
for (let char of str) {
if (parentesis[char]) {
stack.push(parentesis[char]);
} else if (closing.includes(char) && char !== stack.pop()) {
return false;
}
}
return !stack.length;
}
console.log(isBalanced('{[()]}')); // true
console.log(isBalanced('{[(]]}')); // false
console.log(isBalanced('([()]')); // false
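And using a recursive function (without using a stack), it might look something like the following. Note that this version only handles fully nested input such as '{[()]}'; sibling groups such as '()()' are reported as unbalanced: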
function isBalanced(str) {
const parenthesis = {
'(': ')',
'[': ']',
'{': '}',
};
if (!str.length) {
return true;
}
for (let i = 0; i < str.length; i++) {
const char = str[i];
if (parenthesis[char]) {
for (let j = str.length - 1; j >= i; j--) {
const _char = str[j];
if (parenthesis[_char]) {
return false;
} else if (_char === parenthesis[char]) {
return isBalanced(str.substring(i + 1, j));
}
}
} else if (Object.values(parenthesis).includes(char)) {
return false;
}
}
return true;
}
console.log(isBalanced('{[()]}')); // true
console.log(isBalanced('{[(]]}')); // false
console.log(isBalanced('([()]')); // false
* As @Adrian mentions, you can also use a stack in the recursive function without needing to look backwards.
It doesn't really matter from a logical point of view -- if you keep a stack of all currently un-balanced parens that you pass to each step of the recursion, you'll never need to look backwards, so it doesn't matter if you cut up the string on each recursive call, or just increment an index and only look at the current first character.
In most programming languages, which have non-mutable strings, it's probably more expensive (performance-wise) to shorten the string than it is to pass a slightly larger string on the stack. On the other hand, in a language like C, you could just increment a pointer within the char array. I guess it's pretty language-dependent which of these two approaches is more 'efficient'. They're both equivalent from a conceptual point of view.
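A minimal Java sketch of the index-based variant described above: the string is never copied, only an index advances, and the unmatched openers travel along in a Deque. The class name IndexedBrackets is hypothetical, and the choice of ArrayDeque is an assumption made for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical class; recursion advances an index instead of taking substrings.
class IndexedBrackets {
    private static final String OPEN = "([{";
    private static final String CLOSE = ")]}";

    static boolean isBalanced(String s) {
        return isBalanced(s, 0, new ArrayDeque<>());
    }

    private static boolean isBalanced(String s, int i, Deque<Character> open) {
        if (i == s.length()) {
            return open.isEmpty(); // every opener must have been closed
        }
        char c = s.charAt(i);
        if (OPEN.indexOf(c) >= 0) {
            open.push(c);
        } else {
            int kind = CLOSE.indexOf(c);
            // A closer must match the most recently opened bracket.
            if (kind >= 0 && (open.isEmpty() || open.pop() != OPEN.charAt(kind))) {
                return false;
            }
        }
        return isBalanced(s, i + 1, open);
    }
}
```

The recursion uses one call frame per character, so very long inputs could exhaust the call stack in Java; a loop over the same index avoids that.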
In the Scala programming language, I would do it like this:
def balance(chars: List[Char]): Boolean = {
def process(chars: List[Char], myStack: Stack[Char]): Boolean =
if (chars.isEmpty) myStack.isEmpty
else {
chars.head match {
case '(' => process(chars.tail, myStack.push(chars.head))
case ')' => if (myStack.nonEmpty && myStack.top == '(') process(chars.tail, myStack.pop)
else false
case '[' => process(chars.tail, myStack.push(chars.head))
case ']' => {
if (myStack.nonEmpty && myStack.top == '[') process(chars.tail, myStack.pop) else false
}
case _ => process(chars.tail, myStack)
}
}
val balancingAuxStack = new Stack[Char]
process(chars, balancingAuxStack)
}
Please edit to make it perfect.
I was only suggesting a conversion in Scala.
I would say this depends on your design. You could either use two counters or stack with two different symbols or you can handle it using recursion, the difference is in design approach.
func evalExpression(inStringArray:[String])-> Bool{
var status = false
var inStringArray = inStringArray
if inStringArray.count == 0 {
return true
}
// determine the complimentary bracket.
var complimentaryChar = ""
if (inStringArray.first == "(" || inStringArray.first == "[" || inStringArray.first == "{"){
switch inStringArray.first! {
case "(":
complimentaryChar = ")"
break
case "[":
complimentaryChar = "]"
break
case "{":
complimentaryChar = "}"
break
default:
break
}
}else{
return false
}
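// find the index of the matching closing bracket, skipping nested pairs of the same kind.
var index = 0
var subArray = [String]()
var depth = 0
for i in 0..<inStringArray.count{
if inStringArray[i] == inStringArray.first! {
depth += 1
} else if inStringArray[i] == complimentaryChar {
depth -= 1
if depth == 0 {
index = i
break
}
}
}
// if no complementary bracket is found, return false.
if index == 0{
return false
}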
// create a new sub array for evaluating the brackets.
for i in 0...index{
subArray.append(inStringArray[i])
}
subArray.removeFirst()
subArray.removeLast()
if evalExpression(inStringArray: subArray){
// if part of the expression evaluates to true continue with the rest.
for _ in 0...index{
inStringArray.removeFirst()
}
status = evalExpression(inStringArray: inStringArray)
}
return status
}
PHP Solution to check balanced parentheses
<?php
/**
* @param string $inputString
*/
function isBalanced($inputString)
{
if (0 == strlen($inputString)) {
echo 'String length should be greater than 0';
exit;
}
$stack = array();
for ($i = 0; $i < strlen($inputString); $i++) {
$char = $inputString[$i];
if ($char === '(' || $char === '{' || $char === '[') {
array_push($stack, $char);
}
if ($char === ')' || $char === '}' || $char === ']') {
$matchablePairBraces = array_pop($stack);
$isMatchingPair = isMatchingPair($char, $matchablePairBraces);
if (!$isMatchingPair) {
echo "$inputString is NOT Balanced." . PHP_EOL;
exit;
}
}
}
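// Any opener still on the stack was never closed.
if (!empty($stack)) {
echo "$inputString is NOT Balanced." . PHP_EOL;
exit;
}
echo "$inputString is Balanced." . PHP_EOL;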
}
/**
* @param string $char1
* @param string $char2
* @return bool
*/
function isMatchingPair($char1, $char2)
{
if ($char1 === ')' && $char2 === '(') {
return true;
}
if ($char1 === '}' && $char2 === '{') {
return true;
}
if ($char1 === ']' && $char2 === '[') {
return true;
}
return false;
}
$inputString = '{ Swatantra (() {} ()) Kumar }';
isBalanced($inputString);
?>
It should be a simple use of stack ..
private string tokens = "{([<})]>";
Stack<char> stack = new Stack<char>();
public bool IsExpressionVaild(string exp)
{
int mid = (tokens.Length / 2) ;
for (int i = 0; i < exp.Length; i++)
{
int index = tokens.IndexOf(exp[i]);
if (-1 == index) { continue; }
if(index<mid ) stack .Push(exp[i]);
else
{
if (stack.Count == 0 || stack.Pop() != tokens[index - mid]) { return false; }
}
}
return stack.Count == 0;
}
@indiv's answer is nice and enough to solve the parentheses grammar problem. If you want to use a stack or do not want to use the recursive method, you can look at the Python script on GitHub. It is simple and fast.
BRACKET_ROUND_OPEN = '('
BRACKET_ROUND__CLOSE = ')'
BRACKET_CURLY_OPEN = '{'
BRACKET_CURLY_CLOSE = '}'
BRACKET_SQUARE_OPEN = '['
BRACKET_SQUARE_CLOSE = ']'
TUPLE_OPEN_CLOSE = [(BRACKET_ROUND_OPEN,BRACKET_ROUND__CLOSE),
                    (BRACKET_CURLY_OPEN,BRACKET_CURLY_CLOSE),
                    (BRACKET_SQUARE_OPEN,BRACKET_SQUARE_CLOSE)]
BRACKET_LIST = [BRACKET_ROUND_OPEN,BRACKET_ROUND__CLOSE,BRACKET_CURLY_OPEN,BRACKET_CURLY_CLOSE,BRACKET_SQUARE_OPEN,BRACKET_SQUARE_CLOSE]
def balanced_parentheses(expression):
    stack = list()
    # 'left' tracks the bracket on top of the stack; starting empty also handles ''
    left = ''
    for exp in expression:
        if exp not in BRACKET_LIST:
            continue
        skip = False
        for bracket_couple in TUPLE_OPEN_CLOSE:
            if exp != bracket_couple[0] and exp != bracket_couple[1]:
                continue
            if left == bracket_couple[0] and exp == bracket_couple[1]:
                if len(stack) == 0:
                    return False
                stack.pop()
                skip = True
                left = ''
                if len(stack) > 0:
                    left = stack[len(stack) - 1]
        if not skip:
            left = exp
            stack.append(exp)
    return len(stack) == 0
if __name__ == '__main__':
    print(balanced_parentheses('(()())({})[]'))#True
    print(balanced_parentheses('((balanced)(parentheses))({})[]'))#True
    print(balanced_parentheses('(()())())'))#False

What is the best Java email address validation method? [closed]

What are the good email address validation libraries for Java? Are there any alternatives to commons validator?
Using the official Java email package (javax.mail) is the easiest:
public static boolean isValidEmailAddress(String email) {
boolean result = true;
try {
InternetAddress emailAddr = new InternetAddress(email);
emailAddr.validate();
} catch (AddressException ex) {
result = false;
}
return result;
}
Apache Commons is generally known as a solid project. Keep in mind, though, you'll still have to send a verification email to the address if you want to ensure it's a real email, and that the owner wants it used on your site.
EDIT: There was a bug where it was too restrictive on domain, causing it to not accept valid emails from new TLDs.
This bug was resolved on 03/Jan/15 02:48 in commons-validator version 1.4.1
Apache Commons validator can be used as mentioned in the other answers.
pom.xml:
<dependency>
<groupId>commons-validator</groupId>
<artifactId>commons-validator</artifactId>
<version>1.4.1</version>
</dependency>
build.gradle:
compile 'commons-validator:commons-validator:1.4.1'
The import:
import org.apache.commons.validator.routines.EmailValidator;
The code:
String email = "myName@example.com";
boolean valid = EmailValidator.getInstance().isValid(email);
and to allow local addresses
boolean allowLocal = true;
boolean valid = EmailValidator.getInstance(allowLocal).isValid(email);
Late answer, but I think it is simple and worthy:
public boolean isValidEmailAddress(String email) {
String ePattern = "^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@((\\[[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\])|(([a-zA-Z\\-0-9]+\\.)+[a-zA-Z]{2,}))$";
java.util.regex.Pattern p = java.util.regex.Pattern.compile(ePattern);
java.util.regex.Matcher m = p.matcher(email);
return m.matches();
}
For production purposes, domain name validation should be performed over the network.
If you are validating form input received from the client, or just doing bean validation, keep it simple.
It's better to do a loose email validation than a strict one that rejects some people (e.g. when they are trying to register for your web service).
With almost anything allowed in the username part of the email and so many new domains being added literally every month (e.g. .company, .entreprise, .estate), it's safer not to be restrictive:
Pattern pattern = Pattern.compile("^.+@.+\\..+$");
Matcher matcher = pattern.matcher(email);
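The loose pattern above can be wrapped into a small helper. This is only a sketch: the class name `LooseEmailCheck` and the method name `looksLikeEmail` are made up for illustration, and the separator is assumed to be the usual `@`.

```java
import java.util.regex.Pattern;

final class LooseEmailCheck {
    // Deliberately permissive: something, '@', something, '.', something.
    private static final Pattern LOOSE_EMAIL = Pattern.compile("^.+@.+\\..+$");

    // Hypothetical helper name; returns false for null instead of throwing.
    static boolean looksLikeEmail(String email) {
        return email != null && LOOSE_EMAIL.matcher(email).matches();
    }

    public static void main(String[] args) {
        System.out.println(looksLikeEmail("user@example.com"));  // true
        System.out.println(looksLikeEmail("user@new.company"));  // true
        System.out.println(looksLikeEmail("user.example.com"));  // false: no '@'
        System.out.println(looksLikeEmail("user@localhost"));    // false: no '.' after '@'
    }
}
```

A check this loose only filters out obvious typos; whether the address is real still has to be confirmed by sending a verification email.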
Late to the question, here, but: I maintain a class at this address: http://lacinato.com/cm/software/emailrelated/emailaddress
It is based on Les Hazlewood's class, but has numerous improvements and fixes a few bugs. Apache license.
I believe it is the most capable email parser in Java, and I have yet to see one more capable in any language, though there may be one out there. It's not a lexer-style parser, but uses some complicated Java regex, and thus is not as efficient as it could be, but my company has parsed well over 10 billion real-world addresses with it: it's certainly usable in a high-performance situation. Maybe once a year it'll hit an address that causes a regex stack overflow (appropriately), but these are spam addresses which are hundreds or thousands of characters long with many, many quotes and parentheses and the like.
RFC 2822 and the related specs are really quite permissive in terms of email addresses, so a class like this is overkill for most uses. For example, the following is a legitimate address, according to spec, spaces and all:
"<bob \" (here) " < (hi there) "bob(the man)smith" (hi) @ (there) example.com (hello) > (again)
No mail server would allow that, but this class can parse it (and rewrite it to a usable form).
We found the existing Java email parser options to be insufficiently durable (meaning each of them failed to parse some valid addresses), so we created this class.
The code is well-documented and has a lot of easy-to-change options to allow or disallow certain email forms. It also provides a lot of methods to access certain parts of the address (left-hand side, right-hand side, personal names, comments, etc), to parse/validate mailbox-list headers, to parse/validate the return-path (which is unique among the headers), and so forth.
The code as written has a javamail dependency, but it's easy to remove if you don't want the minor functionality it provides.
I'm just wondering why nobody came up with @Email from Hibernate Validator's additional constraints. The validator itself is EmailValidator.
Les Hazlewood has written a very thorough RFC 2822 compliant email validator class using Java regular expressions. You can find it at http://www.leshazlewood.com/?p=23. However, its thoroughness (or the Java RE implementation) leads to inefficiency - read the comments about parsing times for long addresses.
I ported some of the code in Zend_Validator_Email:
@FacesValidator("emailValidator")
public class EmailAddressValidator implements Validator {
private String localPart;
private String hostName;
private boolean domain = true;
Locale locale;
ResourceBundle bundle;
private List<FacesMessage> messages = new ArrayList<FacesMessage>();
private HostnameValidator hostnameValidator;
@Override
public void validate(FacesContext context, UIComponent component, Object value) throws ValidatorException {
setOptions(component);
String email = (String) value;
boolean result = true;
Pattern pattern = Pattern.compile("^(.+)@([^@]+[^.])$");
Matcher matcher = pattern.matcher(email);
locale = context.getViewRoot().getLocale();
bundle = ResourceBundle.getBundle("com.myapp.resources.validationMessages", locale);
boolean length = true;
boolean local = true;
if (matcher.find()) {
localPart = matcher.group(1);
hostName = matcher.group(2);
if (localPart.length() > 64 || hostName.length() > 255) {
length = false;
addMessage("enterValidEmail", "email.AddressLengthExceeded");
}
if (domain == true) {
hostnameValidator = new HostnameValidator();
hostnameValidator.validate(context, component, hostName);
}
local = validateLocalPart();
if (local && length) {
result = true;
} else {
result = false;
}
} else {
result = false;
addMessage("enterValidEmail", "invalidEmailAddress");
}
if (result == false) {
throw new ValidatorException(messages);
}
}
private boolean validateLocalPart() {
// First try to match the local part on the common dot-atom format
boolean result = false;
// Dot-atom characters are: 1*atext *("." 1*atext)
// atext: ALPHA / DIGIT / and "!", "#", "$", "%", "&", "'", "*",
// "+", "-", "/", "=", "?", "^", "_", "`", "{", "|", "}", "~"
String atext = "a-zA-Z0-9\\u0021\\u0023\\u0024\\u0025\\u0026\\u0027\\u002a"
+ "\\u002b\\u002d\\u002f\\u003d\\u003f\\u005e\\u005f\\u0060\\u007b"
+ "\\u007c\\u007d\\u007e";
Pattern regex = Pattern.compile("^["+atext+"]+(\\u002e+["+atext+"]+)*$");
Matcher matcher = regex.matcher(localPart);
if (matcher.find()) {
result = true;
} else {
// Try quoted string format
// Quoted-string characters are: DQUOTE *([FWS] qtext/quoted-pair) [FWS] DQUOTE
// qtext: Non white space controls, and the rest of the US-ASCII characters not
// including "\" or the quote character
String noWsCtl = "\\u0001-\\u0008\\u000b\\u000c\\u000e-\\u001f\\u007f";
String qText = noWsCtl + "\\u0021\\u0023-\\u005b\\u005d-\\u007e";
String ws = "\\u0020\\u0009";
regex = Pattern.compile("^\\u0022(["+ws+qText+"])*["+ws+"]?\\u0022$");
matcher = regex.matcher(localPart);
if (matcher.find()) {
result = true;
} else {
addMessage("enterValidEmail", "email.AddressDotAtom");
addMessage("enterValidEmail", "email.AddressQuotedString");
addMessage("enterValidEmail", "email.AddressInvalidLocalPart");
}
}
return result;
}
private void addMessage(String detail, String summary) {
String detailMsg = bundle.getString(detail);
String summaryMsg = bundle.getString(summary);
messages.add(new FacesMessage(FacesMessage.SEVERITY_ERROR, summaryMsg, detailMsg));
}
private void setOptions(UIComponent component) {
Boolean domainOption = Boolean.valueOf((String) component.getAttributes().get("domain"));
//domain = (domainOption == null) ? true : domainOption.booleanValue();
}
}
With a hostname validator as follows:
@FacesValidator("hostNameValidator")
public class HostnameValidator implements Validator {
private Locale locale;
private ResourceBundle bundle;
private List<FacesMessage> messages;
private boolean checkTld = true;
private boolean allowLocal = false;
private boolean allowDNS = true;
private String tld;
private String[] validTlds = {"ac", "ad", "ae", "aero", "af", "ag", "ai",
"al", "am", "an", "ao", "aq", "ar", "arpa", "as", "asia", "at", "au",
"aw", "ax", "az", "ba", "bb", "bd", "be", "bf", "bg", "bh", "bi", "biz",
"bj", "bm", "bn", "bo", "br", "bs", "bt", "bv", "bw", "by", "bz", "ca",
"cat", "cc", "cd", "cf", "cg", "ch", "ci", "ck", "cl", "cm", "cn", "co",
"com", "coop", "cr", "cu", "cv", "cx", "cy", "cz", "de", "dj", "dk",
"dm", "do", "dz", "ec", "edu", "ee", "eg", "er", "es", "et", "eu", "fi",
"fj", "fk", "fm", "fo", "fr", "ga", "gb", "gd", "ge", "gf", "gg", "gh",
"gi", "gl", "gm", "gn", "gov", "gp", "gq", "gr", "gs", "gt", "gu", "gw",
"gy", "hk", "hm", "hn", "hr", "ht", "hu", "id", "ie", "il", "im", "in",
"info", "int", "io", "iq", "ir", "is", "it", "je", "jm", "jo", "jobs",
"jp", "ke", "kg", "kh", "ki", "km", "kn", "kp", "kr", "kw", "ky", "kz",
"la", "lb", "lc", "li", "lk", "lr", "ls", "lt", "lu", "lv", "ly", "ma",
"mc", "md", "me", "mg", "mh", "mil", "mk", "ml", "mm", "mn", "mo",
"mobi", "mp", "mq", "mr", "ms", "mt", "mu", "museum", "mv", "mw", "mx",
"my", "mz", "na", "name", "nc", "ne", "net", "nf", "ng", "ni", "nl",
"no", "np", "nr", "nu", "nz", "om", "org", "pa", "pe", "pf", "pg", "ph",
"pk", "pl", "pm", "pn", "pr", "pro", "ps", "pt", "pw", "py", "qa", "re",
"ro", "rs", "ru", "rw", "sa", "sb", "sc", "sd", "se", "sg", "sh", "si",
"sj", "sk", "sl", "sm", "sn", "so", "sr", "st", "su", "sv", "sy", "sz",
"tc", "td", "tel", "tf", "tg", "th", "tj", "tk", "tl", "tm", "tn", "to",
"tp", "tr", "travel", "tt", "tv", "tw", "tz", "ua", "ug", "uk", "um",
"us", "uy", "uz", "va", "vc", "ve", "vg", "vi", "vn", "vu", "wf", "ws",
"ye", "yt", "yu", "za", "zm", "zw"};
private Map<String, Map<Integer, Integer>> idnLength;
private void init() {
Map<Integer, Integer> biz = new HashMap<Integer, Integer>();
biz.put(5, 17);
biz.put(11, 15);
biz.put(12, 20);
Map<Integer, Integer> cn = new HashMap<Integer, Integer>();
cn.put(1, 20);
Map<Integer, Integer> com = new HashMap<Integer, Integer>();
com.put(3, 17);
com.put(5, 20);
Map<Integer, Integer> hk = new HashMap<Integer, Integer>();
hk.put(1, 15);
Map<Integer, Integer> info = new HashMap<Integer, Integer>();
info.put(4, 17);
Map<Integer, Integer> kr = new HashMap<Integer, Integer>();
kr.put(1, 17);
Map<Integer, Integer> net = new HashMap<Integer, Integer>();
net.put(3, 17);
net.put(5, 20);
Map<Integer, Integer> org = new HashMap<Integer, Integer>();
org.put(6, 17);
Map<Integer, Integer> tw = new HashMap<Integer, Integer>();
tw.put(1, 20);
Map<Integer, Integer> idn1 = new HashMap<Integer, Integer>();
idn1.put(1, 20);
Map<Integer, Integer> idn2 = new HashMap<Integer, Integer>();
idn2.put(1, 20);
Map<Integer, Integer> idn3 = new HashMap<Integer, Integer>();
idn3.put(1, 20);
Map<Integer, Integer> idn4 = new HashMap<Integer, Integer>();
idn4.put(1, 20);
idnLength = new HashMap<String, Map<Integer, Integer>>();
idnLength.put("BIZ", biz);
idnLength.put("CN", cn);
idnLength.put("COM", com);
idnLength.put("HK", hk);
idnLength.put("INFO", info);
idnLength.put("KR", kr);
idnLength.put("NET", net);
idnLength.put("ORG", org);
idnLength.put("TW", tw);
idnLength.put("ایران", idn1);
idnLength.put("中国", idn2);
idnLength.put("公司", idn3);
idnLength.put("网络", idn4);
messages = new ArrayList<FacesMessage>();
}
public HostnameValidator() {
init();
}
@Override
public void validate(FacesContext context, UIComponent component, Object value) throws ValidatorException {
String hostName = (String) value;
locale = context.getViewRoot().getLocale();
bundle = ResourceBundle.getBundle("com.myapp.resources.validationMessages", locale);
Pattern ipPattern = Pattern.compile("^[0-9a-f:\\.]*$", Pattern.CASE_INSENSITIVE);
Matcher ipMatcher = ipPattern.matcher(hostName);
if (ipMatcher.find()) {
addMessage("hostname.IpAddressNotAllowed");
throw new ValidatorException(messages);
}
boolean result = false;
// removes last dot (.) from hostname
hostName = hostName.replaceAll("(\\.)+$", "");
String[] domainParts = hostName.split("\\.");
boolean status = false;
// Check input against DNS hostname schema
if ((domainParts.length > 1) && (hostName.length() > 4) && (hostName.length() < 255)) {
status = false;
dowhile:
do {
// First check TLD
int lastIndex = domainParts.length - 1;
String domainEnding = domainParts[lastIndex];
Pattern tldRegex = Pattern.compile("([^.]{2,10})", Pattern.CASE_INSENSITIVE);
Matcher tldMatcher = tldRegex.matcher(domainEnding);
if (tldMatcher.find() || domainEnding.equals("ایران")
|| domainEnding.equals("中国")
|| domainEnding.equals("公司")
|| domainEnding.equals("网络")) {
// Hostname characters are: *(label dot)(label dot label); max 254 chars
// label: id-prefix [*ldh{61} id-prefix]; max 63 chars
// id-prefix: alpha / digit
// ldh: alpha / digit / dash
// Match TLD against known list
tld = (String) tldMatcher.group(1).toLowerCase().trim();
if (checkTld == true) {
boolean foundTld = false;
for (int i = 0; i < validTlds.length; i++) {
if (tld.equals(validTlds[i])) {
foundTld = true;
}
}
if (foundTld == false) {
status = false;
addMessage("hostname.UnknownTld");
break dowhile;
}
}
/**
* Match against IDN hostnames
* Note: Keep label regex short to avoid issues with long patterns when matching IDN hostnames
*/
List<String> regexChars = getIdnRegexChars();
// Check each hostname part
int check = 0;
for (String domainPart : domainParts) {
// Decode Punycode domainnames to IDN
if (domainPart.indexOf("xn--") == 0) {
domainPart = decodePunycode(domainPart.substring(4));
}
// Check dash (-) does not start, end or appear in 3rd and 4th positions
if (domainPart.indexOf("-") == 0
|| (domainPart.length() > 2 && domainPart.indexOf("-", 2) == 2 && domainPart.indexOf("-", 3) == 3)
|| (domainPart.indexOf("-") == (domainPart.length() - 1))) {
status = false;
addMessage("hostname.DashCharacter");
break dowhile;
}
// Check each domain part
boolean checked = false;
for (int key = 0; key < regexChars.size(); key++) {
String regexChar = regexChars.get(key);
Pattern regex = Pattern.compile(regexChar);
Matcher regexMatcher = regex.matcher(domainPart);
status = regexMatcher.find();
if (status) {
int length = 63;
if (idnLength.containsKey(tld.toUpperCase())
&& idnLength.get(tld.toUpperCase()).containsKey(key)) {
length = idnLength.get(tld.toUpperCase()).get(key);
}
int utf8Length;
try {
utf8Length = domainPart.getBytes("UTF8").length;
if (utf8Length > length) {
addMessage("hostname.InvalidHostname");
} else {
checked = true;
break;
}
} catch (UnsupportedEncodingException ex) {
Logger.getLogger(HostnameValidator.class.getName()).log(Level.SEVERE, null, ex);
}
}
}
if (checked) {
++check;
}
}
// If one of the labels doesn't match, the hostname is invalid
if (check != domainParts.length) {
status = false;
addMessage("hostname.InvalidHostnameSchema");
}
} else {
// Hostname not long enough
status = false;
addMessage("hostname.UndecipherableTld");
}
} while (false);
if (status == true && allowDNS) {
result = true;
}
} else if (allowDNS == true) {
addMessage("hostname.InvalidHostname");
throw new ValidatorException(messages);
}
// Check input against local network name schema;
Pattern regexLocal = Pattern.compile("^(([a-zA-Z0-9\\x2d]{1,63}\\x2e)*[a-zA-Z0-9\\x2d]{1,63}){1,254}$", Pattern.CASE_INSENSITIVE);
boolean checkLocal = regexLocal.matcher(hostName).find();
if (allowLocal && !status) {
if (checkLocal) {
result = true;
} else {
// If the input does not pass as a local network name, add a message
result = false;
addMessage("hostname.InvalidLocalName");
}
}
// If local network names are not allowed, add a message
if (checkLocal && !allowLocal && !status) {
result = false;
addMessage("hostname.LocalNameNotAllowed");
}
if (result == false) {
throw new ValidatorException(messages);
}
}
private void addMessage(String msg) {
String bundlMsg = bundle.getString(msg);
messages.add(new FacesMessage(FacesMessage.SEVERITY_ERROR, bundlMsg, bundlMsg));
}
/**
* Returns a list of regex patterns for the matched TLD
* @param tld
* @return
*/
private List<String> getIdnRegexChars() {
List<String> regexChars = new ArrayList<String>();
regexChars.add("^[a-z0-9\\x2d]{1,63}$");
Document doc = null;
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setNamespaceAware(true);
try {
InputStream validIdns = getClass().getClassLoader().getResourceAsStream("com/myapp/resources/validIDNs_1.xml");
DocumentBuilder builder = factory.newDocumentBuilder();
doc = builder.parse(validIdns);
doc.getDocumentElement().normalize();
} catch (SAXException ex) {
Logger.getLogger(HostnameValidator.class.getName()).log(Level.SEVERE, null, ex);
} catch (IOException ex) {
Logger.getLogger(HostnameValidator.class.getName()).log(Level.SEVERE, null, ex);
} catch (ParserConfigurationException ex) {
Logger.getLogger(HostnameValidator.class.getName()).log(Level.SEVERE, null, ex);
}
// prepare XPath
XPath xpath = XPathFactory.newInstance().newXPath();
NodeList nodes = null;
String xpathRoute = "//idn[tld=\'" + tld.toUpperCase() + "\']/pattern/text()";
try {
XPathExpression expr;
expr = xpath.compile(xpathRoute);
Object res = expr.evaluate(doc, XPathConstants.NODESET);
nodes = (NodeList) res;
} catch (XPathExpressionException ex) {
Logger.getLogger(HostnameValidator.class.getName()).log(Level.SEVERE, null, ex);
}
for (int i = 0; i < nodes.getLength(); i++) {
regexChars.add(nodes.item(i).getNodeValue());
}
return regexChars;
}
/**
* Decode Punycode string
* @param encoded
* @return
*/
private String decodePunycode(String encoded) {
Pattern regex = Pattern.compile("([^a-z0-9\\x2d]{1,10})", Pattern.CASE_INSENSITIVE);
Matcher matcher = regex.matcher(encoded);
boolean found = matcher.find();
if (encoded.isEmpty() || found) {
// no punycode encoded string, return as is
addMessage("hostname.CannotDecodePunycode");
throw new ValidatorException(messages);
}
int separator = encoded.lastIndexOf("-");
List<Integer> decoded = new ArrayList<Integer>();
if (separator > 0) {
for (int x = 0; x < separator; ++x) {
decoded.add((int) encoded.charAt(x));
}
} else {
addMessage("hostname.CannotDecodePunycode");
throw new ValidatorException(messages);
}
int lengthd = decoded.size();
int lengthe = encoded.length();
// decoding
boolean init = true;
int base = 72;
int index = 0;
int ch = 0x80;
int indexeStart = (separator == 1) ? (separator + 1) : 0;
for (int indexe = indexeStart; indexe < lengthe; ++lengthd) {
int oldIndex = index;
int pos = 1;
for (int key = 36; true; key += 36) {
int hex = (int) encoded.charAt(indexe++);
int digit = (hex - 48 < 10) ? hex - 22
: ((hex - 65 < 26) ? hex - 65
: ((hex - 97 < 26) ? hex - 97
: 36));
index += digit * pos;
int tag = (key <= base) ? 1 : ((key >= base + 26) ? 26 : (key - base));
if (digit < tag) {
break;
}
pos = (int) (pos * (36 - tag));
}
int delta = (int) (init ? ((index - oldIndex) / 700) : ((index - oldIndex) / 2));
delta += (int) (delta / (lengthd + 1));
int key;
for (key = 0; delta > 910; key += 36) {
delta = (int) (delta / 35);
}
base = (int) (key + 36 * delta / (delta + 38));
init = false;
ch += (int) (index / (lengthd + 1));
index %= (lengthd + 1);
if (lengthd > 0) {
for (int i = lengthd; i > index; i--) {
decoded.set(i, decoded.get(i - 1));
}
}
decoded.set(index++, ch);
}
// convert decoded ucs4 to utf8 string
StringBuilder sb = new StringBuilder();
for (int i = 0; i < decoded.size(); i++) {
int value = decoded.get(i);
if (value < 128) {
sb.append((char) value);
} else if (value < (1 << 11)) {
sb.append((char) (192 + (value >> 6)));
sb.append((char) (128 + (value & 63)));
} else if (value < (1 << 16)) {
sb.append((char) (224 + (value >> 12)));
sb.append((char) (128 + ((value >> 6) & 63)));
sb.append((char) (128 + (value & 63)));
} else if (value < (1 << 21)) {
sb.append((char) (240 + (value >> 18)));
sb.append((char) (128 + ((value >> 12) & 63)));
sb.append((char) (128 + ((value >> 6) & 63)));
sb.append((char) (128 + (value & 63)));
} else {
addMessage("hostname.CannotDecodePunycode");
throw new ValidatorException(messages);
}
}
return sb.toString();
}
/**
* Eliminates empty values from input array
* @param data
* @return
*/
private String[] verifyArray(String[] data) {
List<String> result = new ArrayList<String>();
for (String s : data) {
if (!s.equals("")) {
result.add(s);
}
}
return result.toArray(new String[result.size()]);
}
}
And a validIDNs.xml with regex patterns for the different TLDs (too big to include in full):
<idnlist>
<idn>
<tld>AC</tld>
<pattern>^[\u002d0-9a-zà-öø-ÿāăąćĉċčďđēėęěĝġģĥħīįĵķĺļľŀłńņňŋőœŕŗřśŝşšţťŧūŭůűųŵŷźżž]{1,63}$</pattern>
</idn>
<idn>
<tld>AR</tld>
<pattern>^[\u002d0-9a-zà-ãç-êìíñ-õü]{1,63}$</pattern>
</idn>
<idn>
<tld>AS</tld>
<pattern>^[\u002d0-9a-zà-öø-ÿāăąćĉċčďđēĕėęěĝğġģĥħĩīĭįıĵķĸĺļľłńņňŋōŏőœŕŗřśŝşšţťŧũūŭůűųŵŷźż]{1,63}$</pattern>
</idn>
<idn>
<tld>AT</tld>
<pattern>^[\u002d0-9a-zà-öø-ÿœšž]{1,63}$</pattern>
</idn>
<idn>
<tld>BIZ</tld>
<pattern>^[\u002d0-9a-zäåæéöøü]{1,63}$</pattern>
<pattern>^[\u002d0-9a-záéíñóúü]{1,63}$</pattern>
<pattern>^[\u002d0-9a-záéíóöúüőű]{1,63}$</pattern>
</idn>
</idnlist>
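As an aside to the hand-ported decodePunycode method above: the JDK ships java.net.IDN, which converts between the ASCII ("xn--") and Unicode forms of a hostname. The sketch below is an alternative technique, not part of the ported Zend code; the class and method names are made up for illustration, and java.net.IDN follows the older IDNA 2003 rules (RFC 3490), which may or may not be acceptable for your use case.

```java
import java.net.IDN;

final class IdnHostDemo {
    // Decode "xn--" labels to Unicode. IDN.toUnicode returns its input
    // unchanged when a label cannot be decoded, rather than throwing.
    static String toUnicodeHost(String asciiHost) {
        return IDN.toUnicode(asciiHost);
    }

    // Encode a Unicode hostname to its ASCII-compatible form.
    // IDN.toASCII throws IllegalArgumentException for input it cannot convert.
    static String toAsciiHost(String unicodeHost) {
        return IDN.toASCII(unicodeHost);
    }

    public static void main(String[] args) {
        // \u00fc is the letter u with diaeresis, written as an escape
        // so the source file encoding does not matter.
        System.out.println(toAsciiHost("m\u00fcnchen.de"));      // xn--mnchen-3ya.de
        System.out.println(toUnicodeHost("xn--mnchen-3ya.de"));
    }
}
```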
public class Validations {
private Pattern regexPattern;
private Matcher regMatcher;
public String validateEmailAddress(String emailAddress) {
regexPattern = Pattern.compile("^[(a-zA-Z-0-9-\\_\\+\\.)]+@[(a-z-A-z)]+\\.[(a-zA-z)]{2,3}$");
regMatcher = regexPattern.matcher(emailAddress);
if(regMatcher.matches()) {
return "Valid Email Address";
} else {
return "Invalid Email Address";
}
}
public String validateMobileNumber(String mobileNumber) {
regexPattern = Pattern.compile("^\\+[0-9]{2,3}+-[0-9]{10}$");
regMatcher = regexPattern.matcher(mobileNumber);
if(regMatcher.matches()) {
return "Valid Mobile Number";
} else {
return "Invalid Mobile Number";
}
}
public static void main(String[] args) {
String emailAddress = "suryaprakash.pisay@gmail.com";
String mobileNumber = "+91-9986571622";
Validations validations = new Validations();
System.out.println(validations.validateEmailAddress(emailAddress));
System.out.println(validations.validateMobileNumber(mobileNumber));
}
}
If you're looking to verify whether an email address is valid, then VRFY will get you some of the way. I've found it's useful for validating intranet addresses (that is, email addresses for internal sites). However, it's less useful for internet mail servers (see the caveats at the top of this page).
Although there are many alternatives to Apache commons, their implementations are rudimentary at best (like Apache commons' implementation itself) and even dead wrong in other cases.
I'd also stay away from so-called simple 'non-restrictive' regexes; there's no such thing. For example, @ is allowed multiple times depending on context, so how do you know the required one is there? A simple regex won't understand it, even though the email should be valid. Anything more complex becomes error-prone or even contains hidden performance killers. How are you going to maintain something like this?
The only comprehensive RFC-compliant regex-based validator I'm aware of is email-rfc2822-validator with its 'refined' regex, appropriately named Dragons.java. It supports only the older RFC 2822 spec, though that is appropriate enough for modern needs (RFC 5322 updates it in areas already out of scope for daily use cases).
But really what you want is a lexer that properly parses a string and breaks it up into the component structure according to the RFC grammar. EmailValidator4J seems promising in that regard, but is still young and limited.
Another option you have is using a webservice such as Mailgun's battle-tested validation webservice or Mailboxlayer API (just took the first Google results). It is not strictly RFC compliant, but works well enough for modern needs.
What do you want to validate? The email address?
The email address can only be checked for its format conformance. See the standard: RFC 2822. The best way to do that is a regular expression. You will never know if it really exists without sending an email.
I checked the commons validator. It contains an org.apache.commons.validator.EmailValidator class. Seems to be a good starting point.
Current Apache Commons Validator version is 1.3.1.
Class that validates is org.apache.commons.validator.EmailValidator. It has an import for org.apache.oro.text.perl.Perl5Util which is from a retired Jakarta ORO project.
BTW, I found that there is a 1.4 version; here are the API docs. The site says "Last Published: 05 March 2008 | Version: 1.4-SNAPSHOT", so it is not final. The only way to use it is to build it yourself (keeping in mind that it is a snapshot, not a release) or to download it from here. This means 1.4 has not been made final for three years (2008-2011), which is not Apache's style.
I'm looking for a better option, but didn't find one that is widely adopted. I want to use something that is well tested and don't want to hit any bugs.
You may also want to check for the length - emails are a maximum of 254 chars long. I use the apache commons validator and it doesn't check for this.
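A length check like the one suggested above could be sketched as follows. The class and method names are hypothetical, the 254 limit is taken from the sentence above, and String.length() counts UTF-16 code units, which equals the character count only for plain ASCII addresses.

```java
final class EmailLengthCheck {
    // Overall limit mentioned above; treat it as an assumption to adjust if needed.
    static final int MAX_ADDRESS_LENGTH = 254;

    // Hypothetical pre-check to run before (not instead of) a format validator.
    static boolean hasAcceptableLength(String email) {
        return email != null
                && !email.isEmpty()
                && email.length() <= MAX_ADDRESS_LENGTH;
    }

    public static void main(String[] args) {
        System.out.println(hasAcceptableLength("user@example.com")); // true
        System.out.println(hasAcceptableLength(""));                 // false
    }
}
```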
There don't seem to be any perfect libraries or ways to do this yourself, unless you have the time to send an email to the address and wait for a response (this might not be an option though). I ended up using a suggestion from here http://blog.logichigh.com/2010/09/02/validating-an-e-mail-address/ and adjusting the code so it would work in Java.
public static boolean isValidEmailAddress(String email) {
boolean stricterFilter = true;
String stricterFilterString = "[A-Z0-9a-z._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,4}";
String laxString = ".+@.+\\.[A-Za-z]{2}[A-Za-z]*";
String emailRegex = stricterFilter ? stricterFilterString : laxString;
java.util.regex.Pattern p = java.util.regex.Pattern.compile(emailRegex);
java.util.regex.Matcher m = p.matcher(email);
return m.matches();
}
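To see how the two filters above differ in practice, here is a small comparison sketch. The class and method names are made up for illustration, and the separator is assumed to be the usual `@`. The stricter filter caps the top-level domain at four letters, so it rejects longer TLDs that the lax one accepts.

```java
import java.util.regex.Pattern;

final class FilterComparison {
    // Same two expressions as in the method above.
    private static final Pattern STRICT =
            Pattern.compile("[A-Z0-9a-z._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,4}");
    private static final Pattern LAX =
            Pattern.compile(".+@.+\\.[A-Za-z]{2}[A-Za-z]*");

    static boolean strictMatch(String email) {
        return email != null && STRICT.matcher(email).matches();
    }

    static boolean laxMatch(String email) {
        return email != null && LAX.matcher(email).matches();
    }

    public static void main(String[] args) {
        System.out.println(strictMatch("user@example.com"));    // true
        System.out.println(strictMatch("user@example.museum")); // false: TLD longer than 4 letters
        System.out.println(laxMatch("user@example.museum"));    // true
        System.out.println(laxMatch("user@example"));           // false: no dot after '@'
    }
}
```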
This is the best method:
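public static boolean isValidEmail(String enteredEmail){
// Reject null or empty input first; Pattern.matcher(null) would throw a NullPointerException
if (enteredEmail == null || enteredEmail.isEmpty()) {
return false;
}
// In a Java string literal, "\\w" is the regex \w; four backslashes would match a literal backslash instead
String EMAIL_REGIX = "^[\\w!#$%&'*+/=?`{|}~^-]+(?:\\.[\\w!#$%&'*+/=?`{|}~^-]+)*@(?:[a-zA-Z0-9-]+\\.)+[a-zA-Z]{2,6}$";
Pattern pattern = Pattern.compile(EMAIL_REGIX);
Matcher matcher = pattern.matcher(enteredEmail);
return matcher.matches();
}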
Sources:
http://howtodoinjava.com/2014/11/11/java-regex-validate-email-address/
http://www.rfc-editor.org/rfc/rfc5322.txt
Another option is to use the Hibernate email validator, using the annotation @Email or using the validator class programmatically, like:
import org.hibernate.validator.internal.constraintvalidators.hv.EmailValidator;
class Validator {
// code
private boolean isValidEmail(String email) {
EmailValidator emailValidator = new EmailValidator();
return emailValidator.isValid(email, null);
}
}
Here's my pragmatic approach, where I just want reasonably distinct blah@domain addresses using the allowable characters from the RFC. Addresses must be converted to lowercase beforehand.
public class EmailAddressValidator {
private static final String domainChars = "a-z0-9\\-";
private static final String atomChars = "a-z0-9\\Q!#$%&'*+-/=?^_`{|}~\\E";
private static final String emailRegex = "^" + dot(atomChars) + "@" + dot(domainChars) + "$";
private static final Pattern emailPattern = Pattern.compile(emailRegex);
private static String dot(String chars) {
return "[" + chars + "]+(?:\\.[" + chars + "]+)*";
}
public static boolean isValidEmailAddress(String address) {
return address != null && emailPattern.matcher(address).matches();
}
}

Resources