In MongoDB I would like to use the $gt and $lt comparison operators where the value could be null. When the operators did not work with null, I looked for documentation but found none. In both cases the query returned no documents (even though $ne, $gte, and $lte did return documents, meaning there were documents that were both equal to and not equal to null).
I would expect $gt to essentially operate like $ne (as the null type sits so low in Mongo's comparison order) and $lt to return nothing for the same reason.
I was hoping this would work as the value I pass to the query is variable (potentially null), and I don't want to have to write a special case for null.
Example of what I was expecting, given the following collection:
{
id: 1,
colNum: null
}
{
id: 2,
colNum: 72
}
{
id: 3
}
I would expect the following query:
db.testtable.find( { "colNum": { $gt: null } } )
To return:
{
id: 2,
colNum: 72
}
However, nothing was returned.
Is there a reason that $gt and $lt don't seem to work with null, or is it a MongoDB bug, or is it actually supposed to work and there is likely a user error?
Nitty-Gritty Details
Reading through the latest Mongo source, there are basically two cases when doing comparisons involving null:
If the canonical types of the BSON elements being compared are different, only equality comparisons (==, >=, <=) of null & undefined will return true; otherwise any comparison with null will return false.
Note: No other BSON type has the same canonical type as null.
If the canonical types are the same (i.e., both elements are null), then compareElementValues is called. For null, this just returns the difference between the canonical type of both BSON elements and then carries out the requested comparison against 0.
For example, null > null would translate into (5-5) > 0 --> False because the canonical type of null is 5.
Similarly, null < null would translate into (5-5) < 0 --> False.
This means null can only ever be equal to null or undefined. Any other comparison involving null will always return false.
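The two cases above can be modelled in a few lines. This is only a toy Python sketch of the behaviour as described here, not MongoDB's implementation: MISSING, OPS, compare_to_null and find_ids are hypothetical names, and the only number taken from the text is the canonical type 5 for null.

```python
import operator

# Hypothetical stand-in for an absent field (treated like BSON undefined).
MISSING = object()

# Canonical type of null, as quoted in the explanation above.
NULL_CANONICAL = 5

OPS = {
    "$eq": operator.eq,
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
}


def compare_to_null(op, field_value):
    """Model of matching `{field: {op: null}}` against one field value."""
    if field_value is None:
        # Same canonical type: compare the difference of the types against 0.
        diff = NULL_CANONICAL - NULL_CANONICAL
        return OPS[op](diff, 0)
    if field_value is MISSING:
        # null vs undefined: only the equality-style operators succeed.
        return op in ("$eq", "$gte", "$lte")
    # Any other type compared with null never matches.
    return False


def find_ids(docs, field, op):
    """Return the ids of the documents that the modelled query would match."""
    return [d["id"] for d in docs if compare_to_null(op, d.get(field, MISSING))]


# The collection from the question.
collection = [
    {"id": 1, "colNum": None},
    {"id": 2, "colNum": 72},
    {"id": 3},
]

print(find_ids(collection, "colNum", "$gt"))   # []
print(find_ids(collection, "colNum", "$gte"))  # [1, 3]
```

Under this model $gt and $lt against null can never match anything, which is consistent with what the question observed.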
Is this a Bug?
Updated Answer:
The documentation for the comparison operators ($gt, $lt) references the documentation you originally linked, which implies that the comparison operators should work with null. Furthermore, query sorting (i.e., db.collection.find().sort()) does accurately follow the documented Comparison/Sort behavior.
This is, at the very least, inconsistent. I think it would be worth submitting a bug report to MongoDB's JIRA site.
Original Answer:
I don't think this behavior is a bug.
The general consensus for JavaScript is that undefined means unassigned while null means assigned but otherwise undefined. Value comparisons against undefined, aside from equality, don't make sense, at least in a mathematical sense.
Given that BSON draws heavily from JavaScript, this applies to MongoDB too.
Related
I recently had an interview in which the interviewer gave me some pseudocode and asked questions related to it. Unfortunately, I was not able to answer his questions due to lack of preparation. Due to time constraint, I could not ask him the solution for that problem. I would really appreciate if someone could guide me and help me understand the problem so I can improve for the future. Below is the pseudocode:
A sample state of ‘a’:
[[ 2, NULL, 2, NULL],
[ 2, NULL, 2, NULL],
[NULL, NULL, NULL, NULL],
[NULL, NULL, NULL, NULL]]
FUNCTION foo()
    FOR y = 0 to 3
        FOR x = 0 to 3
            IF a[x+1][y] != NULL
                IF a[x+1][y] = a[x][y]:
                    a[x][y] := a[x][y]*2
                    a[x+1][y] := NULL
                END IF
                IF a[x][y] = NULL
                    a[x][y] := a[x+1][y]
                    a[x+1][y] := NULL
                END IF
            END IF
        END FOR
    END FOR
END FUNCTION
The interviewer asked me:
What is the issue with the above code and how would I fix it?
Once corrected, what does function foo do? Please focus on the result of the function, not the details of the implementation.
How could you make foo more generic? Explain up to three possible generalization directions and describe a strategy for each, no need to write the code!
I mentioned to him:
The state of the matrix looks incorrect because an integer matrix cannot have null values. By default they are assigned 0, false for Boolean and null for the reference type.
Another issue with the above code is at IF a[x+1][y] != NULL, the condition will produce an array index out-of-bounds error when x equals 3.
But I felt the interviewer was looking for something else in my answer and was not satisfied with the explanation.
Have you played the game "2048"? If not, this question will likely not make much intuitive sense to you, and because of that, I think it's a poor interview question.
What this attempts to do is simulate one step of the 2048 game where the numbers go upward. Numbers will move upward by one cell unless they hit another number or the matrix border (think of gravity pulling all numbers upward). If the two numbers are equal, they combine and produce a new number (their sum).
Note: this isn't exactly one step of the 2048 game because numbers only move one cell upward, while in the game they move "all the way" until they hit something else. To get a step of the 2048 game, you'd repeat the given function until no more changes occur.
The issue in the code is, as you mentioned, the array index out-of-bounds. It should be fixed by iterating over x = 0 to 2 instead.
To make this more general, you have to be creative:
The main generalization is that it should take a "direction" parameter. (Again you wouldn't know this if you haven't played the 2048 game yourself.) Instead of gravity pulling numbers upward, gravity can pull numbers in any of the 4 cardinal directions.
Maybe the algorithm shouldn't check for NULL but should check against some other sentinel value (which is another input).
It's also pretty easy to generalize this to larger matrices.
Maybe there should be some other rule that dictates when numbers get combined, and how precisely they get combined (not necessarily 2 times the first). These rules can be given in the form of lambdas.
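To make these directions concrete, here is a rough Python sketch of a generalized single pass. It is one possible design under my own assumptions, not the interviewer's expected answer: the name step and the parameters direction, empty, can_merge and merge are all hypothetical. Like the corrected pseudocode, it moves or merges each number by at most one cell per call.

```python
def step(grid, direction="up", empty=None,
         can_merge=lambda a, b: a == b,
         merge=lambda a, b: a + b):
    """Apply one single-cell move/merge pass and return a new grid."""
    # (row delta, column delta) from a destination cell to the neighbouring
    # source cell that may slide into it.
    offsets = {"up": (1, 0), "down": (-1, 0), "left": (0, 1), "right": (0, -1)}
    dr, dc = offsets[direction]
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]  # leave the caller's grid untouched

    # Visit destination cells starting from the edge the numbers move towards.
    row_order = range(rows) if dr >= 0 else range(rows - 1, -1, -1)
    col_order = range(cols) if dc >= 0 else range(cols - 1, -1, -1)

    for r in row_order:
        for c in col_order:
            sr, sc = r + dr, c + dc
            if not (0 <= sr < rows and 0 <= sc < cols):
                continue  # no neighbour on that side: this is the bounds fix
            src, dst = out[sr][sc], out[r][c]
            if src == empty:
                continue
            if dst == empty:
                out[r][c], out[sr][sc] = src, empty  # slide into the gap
            elif can_merge(dst, src):
                out[r][c], out[sr][sc] = merge(dst, src), empty  # combine
    return out


sample = [
    [2, None, 2, None],
    [2, None, 2, None],
    [None, None, None, None],
    [None, None, None, None],
]
print(step(sample, "up")[0])  # [4, None, 4, None]
```

The grid size is read from the input, the sentinel is a parameter, and the combine rule is passed in as two small functions, so each of the generalizations listed above maps onto one argument.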
As for this part of your answer:
integer matrix cannot have null values, by default they are assigned 0, false for Boolean and null for the reference type
That is largely dependent on the language being used, so I wouldn't say this is an error in the pseudocode (which isn't supposed to be in any particular language). For instance, in weakly-typed languages you can certainly have a matrix with int and NULL values.
You don't mention what you said about the function's behavior. If I were the interviewer, I would want to see someone "think out loud" and realize at least the following:
The code is trying to compare each element with the one below it.
Nothing happens if the lower element is NULL.
If the two elements are equal, then the lower one is replaced with NULL and the upper element becomes twice as large.
If the top element is NULL, then the lower non-NULL element "moves" to the top element's place.
These observations about the code are straightforward to obtain just by reading the source code. Whether or not you make sense of these "rules" and notice that it's (similar to) the 2048 game is largely dependent on whether you've played the game before.
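Here's the Python code for the same program, with the index-out-of-bounds issue fixed. Hope this helps.
null = 0  # 0 stands in for NULL (an empty cell)
array = [
    [2, null, 2, null],
    [2, null, 2, null],
    [null, null, null, null],
    [null, null, null, null],
]
for y in range(4):      # every column, as in the pseudocode
    for x in range(3):  # stop one row early so array[x+1] stays in bounds
        if array[x+1][y] != null:
            if array[x+1][y] == array[x][y]:
                # equal neighbours: double the upper cell, clear the lower one
                array[x][y] = array[x][y]*2
                array[x+1][y] = null
            if array[x][y] == null:
                # empty upper cell: move the lower value up
                array[x][y] = array[x+1][y]
                array[x+1][y] = null
print(array)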
Once corrected, what does function foo do? Please focus on the result of the function, not the details of the implementation
The output will be:
4 null 4 null
null null null null
null null null null
null null null null
Laravel's Collection class (v5.5) has a sortBy() method that sorts items much as a SQL ORDER BY clause would, with one striking difference: a relational database will put NULL values at the end when sorting ascending (or at least PostgreSQL does), but Laravel's sortBy() method puts NULL values first.
So how do I get those NULL values to be sorted last instead of first when using the Collection::sortBy() method? PLEASE NOTE: I cannot change the database query itself! MY ONLY OPTION is to sort the Collection itself within PHP. I absolutely cannot do this at the database level in my situation.
There is a similar question on Stack Overflow here but the solution OP found was kind of a hack and does not work in my situation, because I need it to work for varchars.
NOTE: I realize sortBy() accepts a Closure for the first argument but there is zero explanation in the documentation about the arguments this closure receives (which "key" is $key?), nor does it explicitly say what the closure is supposed to return in order to determine the sort order. (I'm assuming it should return an integer representing the order, but I do not know how to make that work for me with multiple NULL values.)
The sortBy method accepts a field on which to sort, in ascending order. So if you had a collection of App\User objects, you could pass in say first_name and that would sort by each user's first name.
If your logic is more complex, and you wanted to sort on something that isn't strictly a field of each item in your collection you may pass a closure instead. The first parameter passed to the closure is the actual item. You should return a value that you want to be sorted from the closure. Let's say in your collection of App\User objects, you wanted to sort by the last letter of each person's first name:
$users->sortBy(function ($item) {
return substr($item->first_name, -1);
});
The second parameter is the key, which is the key of the collection, or the underlying array it represents. If you've retrieved a collection from the database, this will likely be a numeric index, but if you had a different collection or you decided to re-key the collection by say, the user's email address (Using keyBy), then that is what is passed.
When it comes to sticking all of your null values at the end of the sorted result set, I would suggest using the sort method instead of sortBy. When you pass a closure to sort, it accepts two arguments, representing two items from your collection to be compared. In the case of your collection of App\User objects, each item would be an instance of App\User. This gives you total control over how two objects are compared and therefore total control over the order.
You should return -1 (or a negative value) if the first item is considered to be less than the second, you should return 0 if the two items are to be considered equal and 1 (or a positive value) if the first item is considered to be greater than the second.
$users->sort(function ($a, $b) {
// Return -1, 0 or 1 here
});
That should allow you to implement your logic. To ensure null values are moved to the end, first return 0 if whatever you're interested in is null for both $a and $b; otherwise return 1 if it is null for $a, and -1 if it is null for $b. Only when neither is null do you fall through to your normal comparison.
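The return-value rule described above is language-neutral, so here it is sketched in Python rather than PHP, purely to show the order of the checks. This is not Laravel code; nulls_last and the sample list are made up for illustration.

```python
from functools import cmp_to_key


def nulls_last(a, b):
    """Three-way comparison that orders None after every other value."""
    if a is None and b is None:
        return 0   # two nulls are considered equal
    if a is None:
        return 1   # a is "greater", so it sorts after b
    if b is None:
        return -1  # b is "greater", so a sorts first
    return (a > b) - (a < b)  # ordinary comparison for real values


names = ["mia", None, "bob", None, "zed"]
print(sorted(names, key=cmp_to_key(nulls_last)))
# ['bob', 'mia', 'zed', None, None]
```

Because the null checks happen before any value comparison, the same shape works for strings (varchars) as well as numbers.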
As of Java 1.5, you can pretty much interchange Integer with int in many situations.
However, I found a potential defect in my code that surprised me a bit.
The following code:
Integer cdiCt = ...;
Integer cdsCt = ...;
...
if (cdiCt != null && cdsCt != null && cdiCt != cdsCt)
mismatch = true;
appeared to be incorrectly setting mismatch when the values were equal, although I can't determine under what circumstances. I set a breakpoint in Eclipse and saw that the Integer values were both 137, and I inspected the boolean expression and it said it was false, but when I stepped over it, it was setting mismatch to true.
Changing the conditional to:
if (cdiCt != null && cdsCt != null && !cdiCt.equals(cdsCt))
fixed the problem.
Can anyone shed some light on why this happened? So far, I have only seen the behavior on my localhost on my own PC. In this particular case, the code successfully made it past about 20 comparisons, but failed on 2. The problem was consistently reproducible.
If it is a prevalent problem, it should be causing errors on our other environments (dev and test), but so far, no one has reported the problem after hundreds of tests executing this code snippet.
Is it still not legitimate to use == to compare two Integer values?
In addition to all the fine answers below, the following stackoverflow link has quite a bit of additional information. It actually would have answered my original question, but because I didn't mention autoboxing in my question, it didn't show up in the selected suggestions:
Why can't the compiler/JVM just make autoboxing “just work”?
The JVM is caching Integer values. Hence the comparison with == only works for numbers between -128 and 127.
Refer: #Immutable_Objects_.2F_Wrapper_Class_Caching
You can't compare two Integers with a simple ==. They're objects, so most of the time the references won't be the same.
There is a catch: for Integers between -128 and 127, the references will be the same, because autoboxing uses Integer.valueOf(), which caches small integers.
If the value p being boxed is true, false, a byte, a char in the range \u0000 to \u007f, or an int or short number between -128 and 127, then let r1 and r2 be the results of any two boxing conversions of p. It is always the case that r1 == r2.
Resources :
JLS - Boxing
On the same topic :
autoboxing vs manual boxing java
When applied to objects, "==" always compares object references (memory locations), while the equals method compares values. Integer.equals does use "==" internally, but on the unboxed int values.
Integer uses a cache to store the values from -128 to +127. If == is used on two autoboxed Integers holding the same value in that range, it returns true. For equal values outside that range it usually returns false, because each boxing may create a distinct object.
Refer the link for some additional info
An Integer variable holds a reference, so comparing two of them with == checks whether they point to the same object, not whether they hold the same value. Hence the issue you're seeing. It works with plain int types because the value contained in the Integer is unboxed before the comparison.
May I add that if you're doing what you're doing, why have the if statement to begin with?
mismatch = ( cdiCt != null && cdsCt != null && !cdiCt.equals( cdsCt ) );
The issue is that your two Integer objects are just that, objects. They do not match because you are comparing your two object references, not the values within. Obviously .equals is overridden to provide a value comparison as opposed to an object reference comparison.
Besides the great answers already given, what I have learned is this:
NEVER compare objects with == unless you intend to compare them by their references.
Alternatively, to use == correctly you can unbox one of the compared Integer values before the comparison, like:
if ( firstInteger.intValue() == secondInteger ) {..
The second will be auto unboxed (of course you have to check for nulls first).
I have the code like this:
var query = repository.Where(item => item.UserId == userId && item.LoanNumber != loanNumber)
which is transformed to SQL (repository is IQueryable).
loanNumber is a string parameter in the method. The problem is that the inequality check fails (it is ignored). If I use a constant with the same value instead of the variable, it works properly.
What the... ?
A number should be a NUMBER DATA TYPE, and not a string. Storing it as a string violates normalization rules. So please tell us the data types of the values being compared on both sides of the expression in the predicate.
If you compare values of the same data type, you get correct results, because you do not (and should not) rely on implicit conversion.
So make sure you have the correct data type.
I have trouble understanding how Apache Pig (version r0.9.2) handles negation of null values.
I have an expression like this:
nonEmpty = FILTER dataFields BY NOT IsEmpty(children);
If children is null, the IsEmpty function will return null. What confuses me is how the NOT operator will behave, since I would effectively have an expression like this:
nonEmpty = FILTER dataFields BY NOT NULL;
The documentation for Pig Latin r0.9.2 says the following:
"Pig does not support a boolean data type. However, the result of a boolean expression (an expression that includes boolean and comparison operators) is always of type boolean (true or false)."
which doesn't do anything more than confuse me totally.
Thanks for the help in advance.
Testing a NULL for emptiness is probably not a good idea regardless. In fact, I tried it on 0.10.0, and it threw an error saying exactly that. Instead, filter by not null and not empty:
nonEmpty = FILTER dataFields BY (children IS NOT NULL) AND (NOT IsEmpty(children));