Retrieving keys with maximum values in a Hashmap in java 8 [duplicate] - java-8

As of Java 1.5, you can pretty much interchange Integer with int in many situations.
However, I found a potential defect in my code that surprised me a bit.
The following code:
Integer cdiCt = ...;
Integer cdsCt = ...;
...
if (cdiCt != null && cdsCt != null && cdiCt != cdsCt)
mismatch = true;
appeared to be incorrectly setting mismatch when the values were equal, although I can't determine under what circumstances. I set a breakpoint in Eclipse and saw that the Integer values were both 137, and I inspected the boolean expression and it said it was false, but when I stepped over it, it was setting mismatch to true.
Changing the conditional to:
if (cdiCt != null && cdsCt != null && !cdiCt.equals(cdsCt))
fixed the problem.
Can anyone shed some light on why this happened? So far, I have only seen the behavior on my localhost on my own PC. In this particular case, the code successfully made it past about 20 comparisons, but failed on 2. The problem was consistently reproducible.
If it is a prevalent problem, it should be causing errors on our other environments (dev and test), but so far, no one has reported the problem after hundreds of tests executing this code snippet.
Is it still not legitimate to use == to compare two Integer values?
In addition to all the fine answers below, the following stackoverflow link has quite a bit of additional information. It actually would have answered my original question, but because I didn't mention autoboxing in my question, it didn't show up in the selected suggestions:
Why can't the compiler/JVM just make autoboxing “just work”?

The JVM is caching Integer values. Hence the comparison with == only works for numbers between -128 and 127.
Refer: #Immutable_Objects_.2F_Wrapper_Class_Caching

You can't compare two Integers with a simple ==. They're objects, so most of the time the references won't be the same.
There is one catch: for Integers between -128 and 127, the references will be the same, because autoboxing uses Integer.valueOf(), which caches small integers.
If the value p being boxed is true, false, a byte, a char in the range \u0000 to \u007f, or an int or short number between -128 and 127, then let r1 and r2 be the results of any two boxing conversions of p. It is always the case that r1 == r2.
Resources :
JLS - Boxing
On the same topic :
autoboxing vs manual boxing java
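A minimal sketch of the caching behavior described above. The class and variable names are illustrative only. The one guarantee it relies on is the JLS rule quoted above for values from -128 to 127; what == yields for 137 depends on the JVM's cache settings, so the sketch prints that result rather than assuming it.

```java
// Illustrative demo; the class and variable names are hypothetical.
class IntegerCacheDemo {
    public static void main(String[] args) {
        Integer small1 = 127, small2 = 127; // autoboxed, inside the guaranteed cache range
        Integer big1 = 137, big2 = 137;     // autoboxed, outside the guaranteed range

        // Guaranteed by the JLS boxing rule: both references denote the same cached object.
        System.out.println(small1 == small2);  // prints true

        // Reference comparison: usually false on a default JVM, but not specified.
        System.out.println(big1 == big2);

        // Value comparison: does not depend on caching at all.
        System.out.println(big1.equals(big2)); // prints true
    }
}
```

This would explain why the original code passed about 20 comparisons and failed on 2: only the counts outside the cached range were compared as distinct objects.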

For object references, "==" compares the references, that is, whether both point to the same object. The equals method compares the values. Integer.equals does use "==" internally, but on the unboxed int values.
Integer uses a cache to store the values from -128 to +127. If the == operator is used on two autoboxed Integers holding the same value in that range, it returns true. For equal values outside that range it usually returns false, because each boxing may create a new object.
Refer the link for some additional info

An Integer variable holds a reference, so when you compare two Integers with == you are checking whether they point to the same object, not whether they hold the same value. Hence the issue you're seeing. The reason it works with plain int types is that the value contained in the Integer is unboxed first.
May I add that if you're doing what you're doing, why have the if statement to begin with?
mismatch = ( cdiCt != null && cdsCt != null && !cdiCt.equals( cdsCt ) );

The issue is that your two Integer objects are just that, objects. They do not match because you are comparing your two object references, not the values within. Obviously .equals is overridden to provide a value comparison as opposed to an object reference comparison.

Besides the great answers already given, what I have learned is this:
NEVER compare objects with == unless you intend to compare them by their references.

Alternatively, to use == correctly, you can unbox one of the compared Integer values before doing the == comparison, like:
if ( firstInteger.intValue() == secondInteger ) {..
The second will be auto-unboxed (of course, you have to check for nulls first).
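As a rough sketch of the unboxing approach with the null checks included: the class name IntegerMismatch and the method name mismatch are made up for illustration, and the treatment of null (no mismatch reported) simply mirrors the condition in the question.

```java
// Illustrative helper; the class and method names are hypothetical.
class IntegerMismatch {
    // True only when both values are present and numerically different.
    static boolean mismatch(Integer first, Integer second) {
        if (first == null || second == null) {
            return false; // mirrors the null checks in the question
        }
        // Unboxing one side turns != into a primitive int comparison;
        // the other side is auto-unboxed.
        return first.intValue() != second;
    }

    public static void main(String[] args) {
        System.out.println(mismatch(137, 137));  // prints false
        System.out.println(mismatch(137, 138));  // prints true
        System.out.println(mismatch(null, 137)); // prints false
    }
}
```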

Related

Microsoft technical interview: Matrix Algorithm

I recently had an interview in which the interviewer gave me some pseudocode and asked questions related to it. Unfortunately, I was not able to answer his questions due to lack of preparation. Due to time constraint, I could not ask him the solution for that problem. I would really appreciate if someone could guide me and help me understand the problem so I can improve for the future. Below is the pseudocode:
A sample state of ‘a’:
[[ 2, NULL, 2, NULL],
[ 2, NULL, 2, NULL],
[NULL, NULL, NULL, NULL],
[NULL, NULL, NULL, NULL]]
FUNCTION foo()
    FOR y = 0 to 3
        FOR x = 0 to 3
            IF a[x+1][y] != NULL
                IF a[x+1][y] = a[x][y]:
                    a[x][y] := a[x][y]*2
                    a[x+1][y] := NULL
                END IF
                IF a[x][y] = NULL
                    a[x][y] := a[x+1][y]
                    a[x+1][y] := NULL
                END IF
            END IF
        END FOR
    END FOR
END FUNCTION
The interviewer asked me:
What is the issue with the above code and how would I fix it?
Once corrected, what does function foo do? Please focus on the result of the function, not the details of the implementation.
How could you make foo more generic? Explain up to three possible generalization directions and describe a strategy for each, no need to write the code!
I mentioned to him:
The state of the matrix looks incorrect because an integer matrix cannot have null values. By default they are assigned 0, false for Boolean and null for the reference type.
Another issue with the above code is at IF a[x+1][y] != NULL, the condition will produce an array index out-of-bounds error when x equals 3.
But I felt the interviewer was looking for something else in my answer and was not satisfied with the explanation.
Have you played the game "2048" (link to game)? If not, this question will likely not make much intuitive sense to you, and because of that, I think it's a poor interview question.
What this attempts to do is simulate one step of the 2048 game where the numbers go upward. Numbers will move upward by one cell unless they hit another number or the matrix border (think of gravity pulling all numbers upward). If the two numbers are equal, they combine and produce a new number (their sum).
Note: this isn't exactly one step of the 2048 game because numbers only move one cell upward, while in the game they move "all the way" until they hit something else. To get a step of the 2048 game, you'd repeat the given function until no more changes occur.
The issue in the code is, as you mentioned, the array index out-of-bounds. It should be fixed by iterating over x = 0 to 2 instead.
To make this more general, you have to be creative:
The main generalization is that it should take a "direction" parameter. (Again you wouldn't know this if you haven't played the 2048 game yourself.) Instead of gravity pulling numbers upward, gravity can pull numbers in any of the 4 cardinal directions.
Maybe the algorithm shouldn't check for NULL but should check against some other sentinel value (which is another input).
It's also pretty easy to generalize this to larger matrices.
Maybe there should be some other rule that dictates when numbers get combined, and how precisely they get combined (not necessarily 2 times the first). These rules can be given in the form of lambdas.
As for this part of your answer:
integer matrix cannot have null values, by default they are assigned 0, false for Boolean and null for the reference type
That is largely dependent on the language being used, so I wouldn't say this is an error in the pseudocode (which isn't supposed to be in any particular language). For instance, in weakly-typed languages you can certainly have a matrix with int and NULL values.
You don't mention what you said about the function's behavior. If I were the interviewer, I would want to see someone "think out loud" and realize at least the following:
The code is trying to compare each element with the one below it.
Nothing happens if the lower element is NULL.
If the two elements are equal, then the lower one is replaced with NULL and the upper element becomes twice as large.
If the top element is NULL, then the lower non-NULL element "moves" to the top element's place.
These observations about the code are straightforward to obtain just by reading the source code. Whether or not you make sense of these "rules" and notice that it's (similar to) the 2048 game is largely dependent on whether you've played the game before.
Here's the python code for the same program. I have fixed the index out of bound issue in this code. Hope this helps.
null = 0  # sentinel standing in for NULL
array = [[2,null,2,null],[2,null,2,null],[null,null,null,null],[null,null,null,null]]
rows = [0,1,2]  # x stops at 2 so that x+1 never leaves the matrix
cols = [0,1,2,3]  # y covers all four columns
for y in cols:
    for x in rows:
        if array[x+1][y] != null:
            if array[x+1][y] == array[x][y]:
                array[x][y] = array[x][y]*2
                array[x+1][y] = null
            if array[x][y] == null:
                array[x][y] = array[x+1][y]
                array[x+1][y] = null
print(array)
Once corrected, what does function foo do? Please focus on the result of the function, not the details of the implementation
The output will be :
4 null 4 null
null null null null
null null null null
null null null null
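For comparison, here is a hedged Java sketch of the corrected function. It assumes an Integer[][] matrix in which null plays the role of NULL, and it derives the loop bounds from the array size, which is one of the generalizations mentioned above. The class name MergeUpSketch is hypothetical.

```java
import java.util.Arrays;

// Illustrative sketch; the class name is hypothetical.
class MergeUpSketch {
    // One pass of the corrected pseudocode: x stops one row early so a[x + 1] stays in bounds.
    static void foo(Integer[][] a) {
        for (int y = 0; y < a[0].length; y++) {
            for (int x = 0; x < a.length - 1; x++) {
                if (a[x + 1][y] != null) {
                    // equals() compares values and returns false when a[x][y] is null.
                    if (a[x + 1][y].equals(a[x][y])) {
                        a[x][y] = a[x][y] * 2;
                        a[x + 1][y] = null;
                    }
                    if (a[x][y] == null) {
                        a[x][y] = a[x + 1][y];
                        a[x + 1][y] = null;
                    }
                }
            }
        }
    }

    public static void main(String[] args) {
        Integer[][] a = {
            {2, null, 2, null},
            {2, null, 2, null},
            {null, null, null, null},
            {null, null, null, null}
        };
        foo(a);
        // prints [[4, null, 4, null], [null, null, null, null], [null, null, null, null], [null, null, null, null]]
        System.out.println(Arrays.deepToString(a));
    }
}
```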

How to write a hash function that makes such an expression true?

pseudocode:
// deprecated x!=y && hash(x) == hash(y) // how to make this true?
x!=y && hash(x) == hash(y) && (z!=x && z!=y) && (hash(x) != hash(z)) && (hash(y) != hash(z)) // how to make this true?
x and y can be any readable values.
The language doesn't matter; the pseudocode is only there to help explain what I mean.
I just wonder how to implement such a hash function.
PS: I am hopeless at math, so I cannot tell whether an algorithm exists that can do this.
UPDATE 1:
The pseudocode had a bug, so I updated it (it actually still has a bug, but never mind, I will explain).
My original requirement is a hash function that returns the same value for different parameters, where the parameter values follow some rule. That is, only parameter values in the same category get the same hash code; others do not.
e.g.
The following expressions should clearly hold (you can treat '0' as a placeholder):
hash("1.1") == hash("1.0") == hash("0.1")
hash("2.2") == hash("2.0") == hash("0.2")
and
hash("2.2") != hash("2.1") != hash("1.2")
I think the question can be described like this:
There are two or more different values that implicitly share the same attribute.
Only these values in the world have that attribute.
The attribute can be obtained in some way (maybe by a function), which hash() will call internally.
Hashing any one of the values retrieves the attribute, from which you get the unique hash code.
It looks like a hash collision, but we know exactly which values collide. It also looks like a many-to-one model.
How should the collision rules be designed? The values could be any characters or numbers. And how should the design be implemented?
PPS: This is a question full of bugs; maybe the updated parts cannot explain the problem either. Or maybe this is a false proposition. I want to abstract my issue into a general model, but it makes my mind overflow. If necessary, I will post the actual issue that I am facing.
Any constant hash trivially satisfies your condition:
hash(v) = 42
A less constant answer than yuri kilocheck's would be to use the mod operator:
hash(v) = v % 10;
Then you'll have:
hash(1) = 1
hash(2) = 2
hash(3) = 3
...
hash(11) = 1
hash(12) = 2
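One possible reading of the "same category, same hash" requirement is to map each value to a canonical representative of its category first and hash that. The sketch below assumes keys of the form "a.b" and a made-up rule inferred from the question's examples: a "0" part is a placeholder that takes the value of the other part. Both the rule and the names CategoryHash, category and hash are assumptions, not something the question specifies.

```java
// Illustrative sketch; the names and the canonicalization rule are assumptions.
class CategoryHash {
    // Maps a key such as "1.0" or "0.1" to its category representative "1.1".
    // Assumes the key has exactly the form "a.b".
    static String category(String key) {
        String[] parts = key.split("\\.");
        String left = parts[0];
        String right = parts[1];
        if (left.equals("0")) {
            left = right;  // placeholder on the left takes the right value
        }
        if (right.equals("0")) {
            right = left;  // placeholder on the right takes the left value
        }
        return left + "." + right;
    }

    // Keys in the same category always hash alike, because the category is what gets hashed.
    static int hash(String key) {
        return category(key).hashCode();
    }

    public static void main(String[] args) {
        System.out.println(hash("1.1") == hash("1.0")); // prints true
        System.out.println(hash("1.0") == hash("0.1")); // prints true
        System.out.println(hash("2.2") == hash("2.1")); // prints false
        System.out.println(hash("2.1") == hash("1.2")); // prints false
    }
}
```

Note that different categories are not guaranteed to get different hash codes in general, since String.hashCode can collide. If categories must never be confused, compare the category strings themselves.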

incomparable types: int and Number in java 8

Suppose I have the following code:
class proba {
boolean fun(Number n) {
return n == null || 0 == n;
}
}
This compiles without problem using openjdk 7 (debian wheezy), but fails to compile when using openjdk 8, with the following error (even when using -source 7):
proba.java:3: error: incomparable types: int and Number
return n == null || 0 == n;
^
1 error
How can I work around this:
Is there a compiler option for this construct to continue working in java 8?
Should I make lots of consecutive ifs with instanceof checks of all of Number's subclasses and casting and then comparing one-by one? This seems ugly...
Other suggestions?
This is actually a bugfix (see JDK-8013357): the Java-7 behavior contradicted the JLS §15.21:
The equality operators may be used to compare two operands that are convertible (§5.1.8) to numeric type, or two operands of type boolean or Boolean, or two operands that are each of either reference type or the null type. All other cases result in a compile-time error.
In your case one operand is numeric type, while other is reference type (Number is not convertible to the numeric type), so it should be a compile-time error, according to the specification.
This change is mentioned in Compatibility Guide for Java 8 (search for "primitive").
Note that while your code compiles in Java-7 it works somewhat strangely:
System.out.println(new proba().fun(0)); // compiles, prints true
System.out.println(new proba().fun(0.0)); // compiles, prints false
System.out.println(new proba().fun(new Integer(0))); // compiles, prints false
That's because Java-7 boxes 0 into an Integer object (via autoboxing) and then compares the two objects by reference, which is unlikely to be what you want.
To fix your code, you may convert Number to some predefined primitive type like double:
boolean fun(Number n) {
return n == null || 0 == n.doubleValue();
}
If you want to compare Number and int - call Number.intValue() and then compare.
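A small usage sketch of the doubleValue() approach, showing that the three calls that gave mixed results under Java 7 now agree. The class name NumberZeroCheck and method name isNullOrZero are illustrative only.

```java
// Illustrative sketch; the class and method names are hypothetical.
class NumberZeroCheck {
    // Compares by numeric value instead of by reference.
    static boolean isNullOrZero(Number n) {
        return n == null || n.doubleValue() == 0;
    }

    public static void main(String[] args) {
        System.out.println(isNullOrZero(0));                  // prints true
        System.out.println(isNullOrZero(0.0));                // prints true
        System.out.println(isNullOrZero(Integer.valueOf(0))); // prints true
        System.out.println(isNullOrZero(5));                  // prints false
    }
}
```

Going through double is a design choice that treats 0, 0L and 0.0 alike. For very large long values or arbitrary-precision types, doubleValue() can lose precision, so a type-specific comparison may be preferable there.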

MongoDB comparison operators with null

In MongoDB I would like to use the $gt and $lt comparison operators where the value could be null. When the operators did not work with null, I looked for documentation but found none. In both cases it returned no documents (even though $ne, $gte, and $lte did return documents; meaning there were documents that were both equal to and not equal to null).
I would expect $gt to essentially operate like $ne (as the null type is so low in Mongo's comparison order) and $lt to return nothing for the same reason.
I was hoping this would work as the value I pass to the query is variable (potentially null), and I don't want to have to write a special case for null.
Example of what I was expecting, given the following collection:
{
id: 1,
colNum: null
}
{
id: 2,
colNum: 72
}
{
id: 3
}
I would expect the following query:
db.testtable.find( { "colNum" : { $gt : null } } )
To return:
{
id: 2,
colNum: 72
}
However, nothing was returned.
Is there a reason that $gt and $lt don't seem to work with null, or is it a MongoDB bug, or is it actually supposed to work and there is likely a user error?
Nitty-Gritty Details
Reading through the latest Mongo source, there's basically 2 cases when doing comparisons involving null:
If the canonical types of the BSON elements being compared are different, only equality comparisons (==, >=, <=) of null & undefined will return true; otherwise any comparison with null will return false.
Note: No other BSON type has the same canonical type as null.
If the canonical types are the same (i.e., both elements are null), then compareElementValues is called. For null, this just returns the difference between the canonical type of both BSON elements and then carries out the requested comparison against 0.
For example, null > null would translate into (5-5) > 0 --> False because the canonical type of null is 5.
Similarly, null < null would translate into (5-5) < 0 --> False.
This means null can only ever be equal to null or undefined. Any other comparison involving null will always return false.
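The two cases can be summarized in a toy model. This is not MongoDB code: it is a Java sketch that restates the rule exactly as described in this answer, with hypothetical names (NullComparisonModel, Kind, Op, compareNullWith). The only value taken from the answer is the canonical type 5 for null.

```java
// Toy model of the rule described above; not MongoDB source code.
class NullComparisonModel {
    enum Kind { NULL, UNDEFINED, OTHER }
    enum Op { EQ, GT, GTE, LT, LTE }

    static final int NULL_CANONICAL_TYPE = 5; // value quoted in the answer

    // Result of "null <op> other" according to the two cases in the answer.
    static boolean compareNullWith(Kind other, Op op) {
        boolean includesEquality = op == Op.EQ || op == Op.GTE || op == Op.LTE;
        if (other != Kind.NULL) {
            // Different canonical types: only equality-style comparisons
            // against undefined succeed; everything else is false.
            return other == Kind.UNDEFINED && includesEquality;
        }
        // Same canonical type: compare the difference of the types against 0.
        int diff = NULL_CANONICAL_TYPE - NULL_CANONICAL_TYPE;
        switch (op) {
            case EQ:  return diff == 0;
            case GT:  return diff > 0;
            case GTE: return diff >= 0;
            case LT:  return diff < 0;
            default:  return diff <= 0; // LTE
        }
    }

    public static void main(String[] args) {
        System.out.println(compareNullWith(Kind.NULL, Op.GT));   // prints false
        System.out.println(compareNullWith(Kind.NULL, Op.GTE));  // prints true
        System.out.println(compareNullWith(Kind.OTHER, Op.GT));  // prints false
    }
}
```

Under this model, a query with $gt or $lt against null never matches, while $gte and $lte match only null-like values, which is consistent with what the question reports.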
Is this a Bug?
Updated Answer:
The documentation for the comparison operators ($gt, $lt) references the documentation which you originally linked, which implies that the comparison operators should work with null. Furthermore, query sorting (i.e., db.find().sort()) does accurately follow the documented Comparison/Sort behavior.
This is, at the very least, inconsistent. I think it would be worth submitting a bug report to MongoDB's JIRA site.
Original Answer:
I don't think this behavior is a bug.
The general consensus for Javascript is that undefined means unassigned while null means assigned but otherwise undefined. Value comparisons against undefined, aside from equality, don't make sense, at least in a mathematical sense.
Given that BSON draws heavily from JavaScript, this applies to MongoDB too.

Convert Enum to Binary (via Integer or something similar)

I have an Ada enum with 2 values type Polarity is (Normal, Reversed), and I would like to convert them to 0, 1 (or True, False--as Boolean seems to implicitly play nice as binary) respectively, so I can store their values as specific bits in a byte. How can I accomplish this?
An easy way is a lookup table:
Bool_Polarity : constant Array(Polarity) of Boolean
:= (Normal=>False, Reversed => True);
then use it as
B : Boolean := Bool_Polarity(P);
Of course there is nothing wrong with using the 'Pos attribute, but the LUT makes the mapping readable and very obvious.
As it is constant, you'd like to hope it optimises away during the constant folding stage, and it seems to: I have used similar tricks compiling for AVR with very acceptable executable sizes (down to 0.6k to independently drive 2 stepper motors)
3.5.5 Operations of Discrete Types include the function S'Pos(Arg : S'Base), which "returns the position number of the value of Arg, as a value of type universal integer." Hence,
Polarity'Pos(Normal) = 0
Polarity'Pos(Reversed) = 1
You can change the numbering using 13.4 Enumeration Representation Clauses.
...and, of course:
Boolean'Val(Polarity'Pos(Normal)) = False
Boolean'Val(Polarity'Pos(Reversed)) = True
I think what you are looking for is a record type with a representation clause:
with Ada.Unchecked_Conversion;
procedure Main is
   type Byte_T is mod 2**8;       -- full 8-bit range, 0 .. 255
   for Byte_T'Size use 8;
   type Filler7_T is mod 2**7;    -- full 7-bit range, 0 .. 127
   for Filler7_T'Size use 7;
   type Polarity_T is (Normal,Reversed);
   for Polarity_T use (Normal => 0, Reversed => 1);
   for Polarity_T'Size use 1;
   type Byte_As_Record_T is record
      Filler : Filler7_T;
      Polarity : Polarity_T;
   end record;
   -- Place the filler in bits 0 .. 6 and the polarity in bit 7.
   for Byte_As_Record_T use record
      Filler at 0 range 0 .. 6;
      Polarity at 0 range 7 .. 7;
   end record;
   for Byte_As_Record_T'Size use 8;
   -- Two overloaded conversions, one for each direction.
   function Convert is new Ada.Unchecked_Conversion
     (Source => Byte_As_Record_T,
      Target => Byte_T);
   function Convert is new Ada.Unchecked_Conversion
     (Source => Byte_T,
      Target => Byte_As_Record_T);
begin
   -- TBC
   null;
end Main;
As Byte_As_Record_T & Byte_T are the same size, you can use unchecked conversion to convert between the types safely.
The representation clause for Byte_As_Record_T allows you to specify which bits/bytes to place your polarity_t in. (i chose the 8th bit)
My definition of Byte_T might not be what you want, but as long as it is 8 bits long the principle should still be workable. From Byte_T you can also safely upcast to Integer or Natural or Positive. You can also use the same technique to go directly to/from a 32 bit Integer to/from a 32 bit record type.
Two points here:
1) Enumerations are already stored as binary. Everything is. In particular, your enumeration, as defined above, will be stored as a 0 for Normal and a 1 for Reversed, unless you go out of your way to tell the compiler to use other values.
If you want to get that value out of the enumeration as an Integer rather than an enumeration value, you have two options. The 'pos() attribute will return a 0-based number for that enumeration's position in the enumeration, and Unchecked_Conversion will return the actual value the computer stores for it. (There is no difference in the value, unless an enumeration representation clause was used).
2) Enumerations are nice, but don't reinvent Boolean. If your enumeration can only ever have two values, you don't gain anything useful by making a custom enumeration, and you lose a lot of useful properties that Boolean has. Booleans can be directly selected off of in loops and if checks. Booleans have and, or, xor, etc. defined for them. Booleans can be put into packed arrays, and then those same operators are defined bitwise across the whole array.
A particular pet peeve of mine is when people end up defining themselves a custom boolean with the logic reversed (so its true condition is 0). If you do this, the ghost of Ada Lovelace will come back from the grave and force you to listen to an exhaustive explanation of how to calculate Bernoulli sequences with a Difference Engine. Don't let this happen to you!
So if it would never make sense to have a third enumeration value, you just name objects something appropriate describing the True condition (eg: Reversed_Polarity : Boolean;), and go on your merry way.
It seems all I needed to do was pragma Pack([type name]); (in which 'type name' is the type composed of Polarity) to compress the value down to a single bit.
