Use hashCode() for sorting objects in Java, not in a Hashtable etc. - sorting

I need your help.
If I want to sort a PriorityQueue in Java without reference to its elements' attributes, could I use the objects' hashCode values to compare them?
This is how I did it:
comp = new Comparator<Person>() {
@Override
public int compare(Person p1, Person p2) {
if(p1.hashCode() < p2.hashCode()) return 1;
if(p1.hashCode() == p2.hashCode()) return 0;
return -1;
}
};
collector = new PriorityQueue<Person>(comp);

It doesn't sound like a good approach.
The default hashCode() is typically implemented by converting the internal address of the object into an integer, so the order of objects will differ between application executions.
Also, two objects with the same set of attribute values will not return the same hashCode value unless you override the implementation, and two unrelated objects can happen to share a hash code and would then compare as equal. Such an ordering is inconsistent with equals, which is not what users of a Comparator normally expect.
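As an alternative to the hash-based comparator, here is a minimal sketch of a comparator built from attributes. The Person fields (name, age) and the class PersonQueueDemo are assumptions made for illustration; the question does not show the real class.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

// Hypothetical Person; the fields are assumed for this example only.
class Person {
    final String name;
    final int age;

    Person(String name, int age) {
        this.name = name;
        this.age = age;
    }
}

class PersonQueueDemo {
    // Orders by age first, then by name, so the result is the same on every run.
    static final Comparator<Person> BY_AGE_THEN_NAME =
            Comparator.comparingInt((Person p) -> p.age)
                      .thenComparing(p -> p.name);

    static PriorityQueue<Person> newQueue() {
        return new PriorityQueue<>(BY_AGE_THEN_NAME);
    }
}
```

Unlike an ordering based on the default hashCode(), this one depends only on the data, so it does not change between executions.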

Related

LINQ Distinct does not invoke IEquatable<T>.Equals

I have a set of domain objects, deriving from a base, where I've overridden Equals, IEquatable<T>.Equals and the equality operators. I've successfully used Contains, but now I am trying to use Distinct, and it behaves differently. Here's a look at some sample code:
var a = new Test { Id = 1 };
var a2 = new Test { Id = 1 };
var list = new List<Test> { a, a2 };
var distinct = list.Distinct().ToList(); // both objects, Equal implementations not called
var containsA = list.Contains(a); // true, Equal implementations called
var containsA2 = list.Contains(a2); // true
var containsNewObjectWithSameId = list.Contains(new Test { Id = 1 }); // true
public class Test : IEquatable<Test>
{
public int Id { get; init; }
public bool Equals(Test other)
{
if (ReferenceEquals(null, other))
return false;
if (ReferenceEquals(this, other))
return true;
if (this.GetType() != other.GetType())
return false;
return this.Id == other.Id;
}
public override int GetHashCode() => base.GetHashCode() + this.Id;
}
Contains finds matches, but Distinct is feeling very inclusive and keeps them both. From MS docs:
The first search does not specify any equality comparer, which means FindFirst uses
EqualityComparer.Default to determine equality of boxes. That in turn uses the implementation
of the IEquatable.Equals method in the Box class.
What am I missing?
Thanks @JonSkeet for your insight in the comments.
The problem in this case is the way I wrote my GetHashCode method. It has nothing to do with LINQ, as I originally thought.
Explanation
GetHashCode has to return identical values for objects that compare as equal. In my case, the base implementation in object is tied to reference identity and I am comparing two separate objects, a and a2, so their base.GetHashCode() calls return different values. Distinct compares hash codes first, so it treats the two objects as not equal and never calls Equals.
Solution
In this case, simply returning the Id value is enough as is shown in MS docs:
One of the simplest ways to compute a hash code for a numeric value that has the same or a smaller range than the Int32 type is to simply return that value.
So changing the above code sample like this:
public override int GetHashCode() => this.Id;
would solve the issue. Please keep in mind that if the value of Id alone does not identify an object, this will cause incorrect behavior. In such cases you'll need another property to check, and you will have to compose GetHashCode from ALL the properties that Equals compares. For further info, refer to the MS docs.

An algorithm to track the status of a number

I need to design an API with two operations:
get(): returns a random number. The number must not be a duplicate, i.e. it must be unique among the numbers currently handed out.
put(randomvalue): puts back a number previously returned by get(). Once a number has been put back, get() may return it again.
It has to be efficient and must not use too many resources.
Is there any way to implement this algorithm? Using a hash map is not recommended, because if this API serves billions of requests, storing every generated number takes too much space.
I could not work out this algorithm. Please give me a clue. Thanks in advance!
I cannot think of any solution without extra space. With extra space, one option is to use a TreeMap: first add all the elements to the TreeMap with the value false. When an element is handed out by get(), mark it as true. Similarly, for put, change the value back to false.
Code snippet below...
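import java.util.Random;
import java.util.TreeMap;

public class RandomNumber {
    public static final int SIZE = 100000;
    private final Random rand = new Random();
    // Maps each candidate number to true while it is handed out.
    private final TreeMap<Integer, Boolean> treeMap = new TreeMap<>();

    public RandomNumber() {
        // Start with every number in [0, SIZE) marked as available.
        for (int i = 0; i < SIZE; i++) {
            treeMap.put(i, false);
        }
    }

    public int getRandom() {
        // Retries until it hits a free number; never returns if all are taken.
        while (true) {
            int random = rand.nextInt(SIZE);
            if (!treeMap.get(random)) {
                treeMap.put(random, true);
                return random;
            }
        }
    }

    public void putRandom(int number) {
        treeMap.put(number, false);
    }
}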
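The retry loop above slows down as more numbers are handed out. A possible alternative, sketched here under the same assumption of a bounded range, keeps the free numbers in the front part of an array so that both operations take constant time. The class name RandomPool and its methods are hypothetical, and put() assumes its argument was previously returned by get() and has not already been put back.

```java
import java.util.Random;

// Hypothetical sketch: pool[0..available) always holds the numbers not handed out.
class RandomPool {
    private final int[] pool;
    private int available;
    private final Random rand = new Random();

    RandomPool(int size) {
        pool = new int[size];
        for (int i = 0; i < size; i++) {
            pool[i] = i;
        }
        available = size;
    }

    // Picks a random free number and removes it by moving the last free one into its slot.
    int get() {
        if (available == 0) {
            throw new IllegalStateException("no numbers left");
        }
        int i = rand.nextInt(available);
        int value = pool[i];
        pool[i] = pool[--available];
        return value;
    }

    // Assumes number came from get() and has not been put back yet.
    void put(int number) {
        pool[available++] = number;
    }
}
```

This still uses one int per candidate number, so it trades memory for predictable speed and does not remove the space cost the question is worried about.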

Custom comparator for a priority queue without defining a nested class

I have a class called getout (no constructor). Within that class I have some private variables that are priority queues. The priority queues are initialized with a custom comparator function that I am supposed to create:
priority_queue<tile, vector<tile>, ****insert comparator here***> primary;
I understand that custom comparators can be written using a class or struct. However, I cannot do it this way (I'm sure there's a way), because within this comparator I use functions belonging to my class getout. I decided to write my comparator as a regular bool function as follows:
class escape{
public:
//grabs row of the tile
int get_row(int index){
return floor(index/size);
}
//grabs column of the tile
int get_col(int index){
return index - get_row(index)*size;
}
//stores information about each tile of the grid
struct tile{
int index;
};
//returns the index provided a row and column
int get_index(int row, int col){
return row*size + col;
}
//comparator less_than
bool less_than(const tile &t1, const tile &t2)
{
if(t1.rubble_amount == t2.rubble_amount){
//if column values are the same, return object with lower row
if(get_col(t1.index) == get_col(t2.index)){
return get_row(t1.index) > get_row(t2.index);
}
//otherwise return object with lower column value
else if(get_col(t1.index) > get_col(t2.index)){
return true;
}
}//if
return t1.rubble_amount > t2.rubble_amount;
}//comparator less_than
};
The functions belonging to my class that I am using are get_row() and get_col(). I do not want to resolve this by making them member variables of my tile structs.
How do I define the comparator of my priority queue when it takes the form of a bool function?
Everything is in my class getout.
I have tried:
priority_queue<tile, vector<tile>, function<bool(tile, tile)>> primary(less_than);
But I am getting the error "Unknown type name less_than". Am I implementing the above code correctly? Is there another way I can do this?
(all necessary libraries are included)
Thanks!!

Spring SpEL chooses wrong method to invoke

I'm trying to evaluate the following SpEL expression (Spring-expression version 3.1.1):
T(com.google.common.collect.Lists).newArrayList(#iterable)
where #iterable is of type java.lang.Iterable.
Google Guava com.google.common.collect.Lists (version 14.0) does have a method newArrayList(Iterable) but for some reason SpEL chooses to invoke a different method: newArrayList(Object[])
I dug into the code and found the issue to be in the org.springframework.expression.spel.support.ReflectiveMethodResolver implementation: it seems to be sensitive to the order in which java.lang.Class::getMethods returns the methods.
If two methods match the invocation (in this case one of them is varargs), the later method in that order is invoked, instead of the non-varargs method, which is more specific.
The JDK doesn't seem to guarantee the order in which the methods are returned: different runs show different orders.
Is there a way to overcome this issue?
You can use the collection selection syntax of Spring EL to select all elements of the iterable and convert the result to a list:
"#iterable.?[true]"
A simple example to test:
Iterable<Integer> it = () -> new Iterator<Integer>() {
private int[] a = new int[]{1, 2, 3};
private int index = 0;
@Override
public boolean hasNext() {
return index < a.length;
}
@Override
public Integer next() {
return a[index++];
}
};
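// Tmp is assumed here to be a simple bean with a property "o" (getO/setO).
Tmp tmp = new Tmp();
tmp.setO(it);
ExpressionParser parser = new SpelExpressionParser();
StandardEvaluationContext context = new StandardEvaluationContext(tmp);
ArrayList<Integer> list = parser.parseExpression("o.?[true]").getValue(context,
ArrayList.class);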

String set implementation

I have to implement a set ADT for a pair of strings. The interface I want is (in Java):
public interface StringSet {
void add(String a, String b);
boolean contains(String a, String b);
void remove(String a, String b);
}
The data access pattern has the following properties:
The contains operation is far more frequent than the add and remove ones.
More often than not, contains returns true, i.e. the search is successful.
A simple implementation I can think of is to use a two-level hashtable, i.e. HashMap<String, HashMap<String, Boolean>>. But this data structure makes no use of the two peculiarities of the access pattern. I am wondering if there is something more efficient than the hashtable, maybe by leveraging the access pattern peculiarities.
Personally, I would design this in terms of a standard Set<> interface:
public class StringPair {
public StringPair(String a, String b) {
a_ = a;
b_ = b;
hash_ = (a_ + b_).hashCode();
}
public boolean equals(StringPair pair) {
return (a_.equals(pair.a_) && b_.equals(pair.b_));
}
@Override
public boolean equals(Object obj) {
if (obj instanceof StringPair) {
return equals((StringPair) obj);
}
return false;
}
@Override
public int hashCode() {
return hash_;
}
private String a_;
private String b_;
private int hash_;
}
public class StringSetImpl implements StringSet {
public StringSetImpl(SetFactory factory) {
pair_set_ = factory.<StringPair>createSet();
}
// ...
private Set<StringPair> pair_set_ = null;
}
Then you could leave it up to the user of StringSetImpl to use the preferred Set type. If you are attempting to optimize access, though, it's hard to do better than a HashSet<> (at least with respect to runtime complexity), given that access is O(1), whereas tree-based sets have O(log N) access times.
That contains() usually returns true may make it worth considering a Bloom filter, although this would require that some number of false positives for contains() are allowed (don't know if that is the case).
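The HashSet-based option mentioned above can be sketched as follows. This is only an illustration: the names Pair and HashStringSet are hypothetical, and the hash combines the two strings' hash codes rather than hashing the concatenation a + b, under which pairs such as ("ab", "c") and ("a", "bc") always land on the same hash code.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical immutable pair; equals and hashCode use both strings.
final class Pair {
    private final String a;
    private final String b;

    Pair(String a, String b) {
        this.a = Objects.requireNonNull(a);
        this.b = Objects.requireNonNull(b);
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof Pair)) return false;
        Pair other = (Pair) obj;
        return a.equals(other.a) && b.equals(other.b);
    }

    @Override
    public int hashCode() {
        // Combine the two hash codes; the order of a and b matters.
        return 31 * a.hashCode() + b.hashCode();
    }
}

// Hypothetical set of string pairs backed by a HashSet.
class HashStringSet {
    private final Set<Pair> pairs = new HashSet<>();

    void add(String a, String b) {
        pairs.add(new Pair(a, b));
    }

    boolean contains(String a, String b) {
        return pairs.contains(new Pair(a, b));
    }

    void remove(String a, String b) {
        pairs.remove(new Pair(a, b));
    }
}
```

Each contains call allocates a short-lived Pair, which is the extra allocation that the two-level approach avoids.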
Edit
To avoid the extra allocation, you can do something like this, which is similar to your two-level approach, except using a set rather than a map for the second level:
public class StringSetImpl implements StringSet {
public StringSetImpl() {
elements_ = new HashMap<String, Set<String>>();
}
public boolean contains(String a, String b) {
if (!elements_.containsKey(a)) {
return false;
}
Set<String> set = elements_.get(a);
if (set == null) {
return false;
}
return set.contains(b);
}
public void add(String a, String b) {
if (!elements_.containsKey(a) || elements_.get(a) == null) {
elements_.put(a, new HashSet<String>());
}
elements_.get(a).add(b);
}
public void remove(String a, String b) {
if (!elements_.containsKey(a)) {
return;
}
Set<String> set = elements_.get(a);
if (set == null) {
elements_.remove(a);
return;
}
set.remove(b);
if (set.isEmpty()) {
elements_.remove(a);
}
}
private Map<String, Set<String>> elements_ = null;
}
Since it's 4:20 AM where I'm located, the above is definitely not my best work (too tired to refresh myself on the treatment of null by these different collections types), but it sketches the approach.
Do not use normal trees (most standard library data structures) for this. There is one simple assumption, which will hurt you in this case:
The usual O(log(n)) bound for operations on trees assumes that comparisons are in O(1). This is true for integers and most other keys, but not for strings. For strings, each comparison is in O(k), where k is the length of the string. This makes all operations dependent on the length, which will most likely hurt you if you need to be fast, and is easily overlooked.
Especially if you most often return true there will be k comparisons for each string at each level, so with this access pattern you will experience the full drawback of strings in trees.
Your access pattern is easily handled by a trie. Testing whether a string is contained is in O(k) in the worst case (not just the average case, as in a hash map). Adding a string is also in O(k). Since you are storing two strings, I would suggest that you don't index your trie by characters but by some larger type, so you can add two special index values: one value for the end of the first string, and one value for the end of both strings.
In your case, using these two extra symbols would also allow for simple removal: just delete the final node containing the end symbol and your string will not be found anymore. You will waste some memory, because the deleted strings are still in your structure. If this is a problem, you could keep track of the number of deleted strings and rebuild your trie when it gets too bad.
P.S. A trie can be thought of as a combination of a tree and several hashtables, so this gives you the best of both data structures.
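A minimal sketch of the trie described above, assuming each node keeps its children in a HashMap keyed by int so that the two extra symbols fit next to ordinary characters. The names PairTrie, END_OF_FIRST and END_OF_BOTH are hypothetical, and removal only deletes the end marker, as the answer suggests.

```java
import java.util.HashMap;
import java.util.Map;

class PairTrie {
    // Special symbols outside the char range (0..65535).
    private static final int END_OF_FIRST = -1;
    private static final int END_OF_BOTH = -2;

    private static final class Node {
        final Map<Integer, Node> children = new HashMap<>();
    }

    private final Node root = new Node();

    // Follows the symbols a, END_OF_FIRST, b; creates missing nodes when create is true.
    // Returns null if the path does not exist and create is false.
    private Node walk(String a, String b, boolean create) {
        Node node = root;
        int total = a.length() + 1 + b.length();
        for (int i = 0; i < total && node != null; i++) {
            int symbol;
            if (i < a.length()) {
                symbol = a.charAt(i);
            } else if (i == a.length()) {
                symbol = END_OF_FIRST;
            } else {
                symbol = b.charAt(i - a.length() - 1);
            }
            node = create
                    ? node.children.computeIfAbsent(symbol, k -> new Node())
                    : node.children.get(symbol);
        }
        return node;
    }

    void add(String a, String b) {
        walk(a, b, true).children.put(END_OF_BOTH, new Node());
    }

    boolean contains(String a, String b) {
        Node node = walk(a, b, false);
        return node != null && node.children.containsKey(END_OF_BOTH);
    }

    // Deletes only the end marker; the character nodes stay in the trie.
    void remove(String a, String b) {
        Node node = walk(a, b, false);
        if (node != null) {
            node.children.remove(END_OF_BOTH);
        }
    }
}
```

Every lookup walks at most a.length() + b.length() + 1 nodes, which matches the O(k) bound given above, although each step here still pays for a hash map lookup.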
I'd second the approach of Michael Aaron Safyan to use a StringPair type. Perhaps with a more specific name, or as a generic tuple type: Tuple<A,B> instantiated to Tuple<String,String>. But I would strongly suggest to use one of the provided set implementations, either a HashSet or a TreeSet.
A red-black tree implementation of the set would be a good option. In C++, std::set is typically implemented as a red-black tree.
