Need help overriding compareTo - sorting

I'm a Java novice who is having some trouble overriding the compareTo method of the Comparable interface. My code creates a HashMap that maps strings to ints. I would like to override compareTo so that the strings in the ArrayList keys are sorted by their HashMap values rather than alphabetically. With the implementation below, however, the strings are still sorted alphabetically.
Oh, and to clarify, nameWeight is the HashMap of String,Integer pairs.
Any ideas?
List<String> keys = new ArrayList<String>(nameWeight.keySet());
System.out.println(keys);
Collections.sort(keys);
public int compareTo(String that){
int gtr = 1;
int less = -1;
int eql = 0;
System.out.print(this);
System.out.print(that);
if(that=="JOHN")
return less;
int valThis = nameWeight.get(this);
int valThat = nameWeight.get(that);
if(valThis==valThat)
return eql;
if(valThis>valThat)
return gtr;
if(valThis<valThat)
return less;
return gtr;
}

You're sorting a list of strings, so the compareTo method that gets called is the one defined in class String. A compareTo method that you write in your own class is never consulted. To change the ordering that way you would have to subclass String, override compareTo in the subclass and use a List<StringSubClass>, but String is final, so it cannot be subclassed (thanks @pst).
Alternatively you don't use String objects in the list but objects of the type you created and in which you override compareTo (don't forget to add implements Comparable to the class definition).
Or (shameless plug from @pst again), and this is probably the best solution, you pass a Comparator to the sort method; it will then be used to order the strings instead of their natural ordering.
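The comparator approach could look roughly like the sketch below. It is only an illustration: the class name SortKeysByWeight, the helper keysSortedByWeight and the sample weights are made up, and it assumes every key in the list is present in nameWeight.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SortKeysByWeight {

    // Returns the map's keys ordered by their associated values (ascending).
    static List<String> keysSortedByWeight(final Map<String, Integer> nameWeight) {
        List<String> keys = new ArrayList<String>(nameWeight.keySet());
        Collections.sort(keys, new Comparator<String>() {
            @Override
            public int compare(String a, String b) {
                // Compare the weights looked up in the map, not the strings themselves.
                return nameWeight.get(a).compareTo(nameWeight.get(b));
            }
        });
        return keys;
    }

    public static void main(String[] args) {
        // Sample data, invented for illustration.
        Map<String, Integer> nameWeight = new HashMap<String, Integer>();
        nameWeight.put("JOHN", 30);
        nameWeight.put("ALICE", 50);
        nameWeight.put("BOB", 10);
        System.out.println(keysSortedByWeight(nameWeight)); // prints [BOB, JOHN, ALICE]
    }
}
```

On Java 8 or later the anonymous class can probably be shortened to keys.sort(Comparator.comparing(nameWeight::get)).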

Related

How to work inline with custom IEqualityComparer<T> parameters

Several times I have needed to call LINQ's Distinct on different IEnumerables.
These calls often need equality criteria that I use only once in the whole codebase.
I found it really annoying to have to create a class implementing IEqualityComparer just to perform each Distinct, so I decided to close the gap with a generic class that takes a lambda expression as a parameter and uses it for the comparison.
To pass custom IEqualityComparer logic as a parameter, I developed the following class:
public class InlineComparer<T>
{
private class LambdaBasedComparer : IEqualityComparer<T>
{
public LambdaBasedComparer(Func<T, int> getHashCode)
{
fGetHashCode = getHashCode;
}
public bool Equals(T x, T y)
{
return x?.GetHashCode() == y?.GetHashCode();
}
private Func<T, int> fGetHashCode { get; set; }
public int GetHashCode(T obj)
{
return fGetHashCode(obj);
}
}
public static IEqualityComparer<T> GetComparer(Func<T, int> getHashCode)
{
return new LambdaBasedComparer(getHashCode);
}
}
What do you think about it? I hope it may be helpful!
Of course, a complete implementation of this helper makes use of this class into an extension method similar to "IEnumerable.Distinct(Func getHashCode)", but I wanted to highlight the possibility to work with lambdas to pass the distinct code.
Your equality comparer will declare two objects to be equal if they return the same value for GetHashCode(). Of course, when defining your own equality comparer you are free to define the concept of equality any way you want, as long as your equality is reflexive, symmetric and transitive (x==x; if x==y then y==x; if x==y and y==z, then x==z).
Your equality comparer fits these rules, so you can use it.
However! Will it be a useful comparer?
You want to use a special equality comparer instead of the default equality comparer because you want some special definition of (un)equality of two objects.
Normally, during your design process you should first define equality of your objects. If you've done that and you want to use your LambdaBasedComparer you'll have to create a Hash Function that will return different values for different objects.
Normally hash functions have one requirement: two equal objects should return the same hash value. There is no requirement upon two different objects.
An int hash code can take only 2^32 distinct values, so if you've designed a class with more possible unequal instances than that, you can't use your comparer: at least two unequal objects must share a hash value. An easy example: try to create a LambdaBasedComparer<long> that uses normal equality.
But even if your class can only produce half that many instances, it will be very difficult to create a proper hash function that generates unique hash values for different objects.
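The collision problem can be illustrated with a small Java analogue (Java rather than C#, purely as a sketch; the class HashCollisionDemo and the helper equalByHash are hypothetical names). It defines equality by hash code alone, as the comparer above does, and shows two different long values being treated as equal.

```java
public class HashCollisionDemo {

    // Equality defined purely by hash code, mirroring the comparer discussed above.
    static boolean equalByHash(long x, long y) {
        return Long.hashCode(x) == Long.hashCode(y);
    }

    public static void main(String[] args) {
        long a = 0L;
        long b = 0x100000001L; // 2^32 + 1; its upper and lower halves cancel out in the hash

        System.out.println(a == b);            // prints false: the values differ
        System.out.println(equalByHash(a, b)); // prints true: both hash to 0
    }
}
```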
Finally your equality will not be very intuitive if you use it to compare derived classes. Consider class Person and derived classes Employee and Customer.
IEqualityComparer<Person> personComparer = InlineComparer<Person>.GetComparer(...);
Person p = new Person(...);
Person e = new Employee(...);
Person c = new Customer(...);
Now I can say that a certain Person who isn't an Employee can equal one of your Employees. But would you ever say that Employees will equal Customers?
In summary: you think that you have a simple solution for your comparers, but it will be very difficult to create a proper hash function for your definition of equality, and it will probably be even more difficult to test this hash function.

Creating a stream of booleans from a boolean array? [duplicate]

Is there really no nice way to convert a given boolean[] foo array into a stream in Java 8 in one statement, or am I missing something?
(I will not ask why?, but it is really incomprehensible: why not add stream support for all primitive types?)
Hint: Arrays.stream(foo) will not work, there is no such method for boolean[] type.
Given boolean[] foo use
Stream<Boolean> stream = IntStream.range(0, foo.length)
.mapToObj(idx -> foo[idx]);
Note that every boolean value will be boxed, but it's usually not a big problem as boxing for boolean does not allocate additional memory (just uses one of predefined values - Boolean.TRUE or Boolean.FALSE).
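As a self-contained sketch of the IntStream approach (the class name BooleanArrayStream and the helpers stream and countTrue are illustrative, not part of any library):

```java
import java.util.stream.IntStream;
import java.util.stream.Stream;

public class BooleanArrayStream {

    // Streams the array by index; each element is boxed to Boolean.TRUE or Boolean.FALSE.
    static Stream<Boolean> stream(boolean[] foo) {
        return IntStream.range(0, foo.length).mapToObj(idx -> foo[idx]);
    }

    // Example consumer: counts how many elements are true.
    static long countTrue(boolean[] foo) {
        return stream(foo).filter(b -> b).count();
    }

    public static void main(String[] args) {
        boolean[] foo = {true, false, true};
        System.out.println(countTrue(foo)); // prints 2
    }
}
```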
You can use Guava's Booleans class:
Stream<Boolean> stream = Booleans.asList(foo).stream();
This is a pretty efficient way because Booleans.asList returns a wrapper for the array and does not make any copies.
Of course, you could also build the stream directly:
Stream.Builder<Boolean> builder = Stream.builder();
for (int i = 0; i < foo.length; i++)
builder.add(foo[i]);
Stream<Boolean> stream = builder.build();
…or by wrapping an AbstractList around foo
Stream<Boolean> stream = new AbstractList<Boolean>() {
public Boolean get(int index) {return (foo[index]);}
public int size() {return foo.length;}
}.stream();
Skimming through the early-access JavaDoc (i.e. the java.base module) of Java 15, there is still no neat way to make a primitive boolean array work well with the Stream API. Nothing has been added to the API for handling primitive boolean arrays since Java 8.
Note that IntStream, DoubleStream and LongStream exist, but there is nothing like a BooleanStream that would represent a sequence of primitive booleans. Likewise, Stream offers Stream::mapToInt, Stream::mapToDouble and Stream::mapToLong, but no Stream::mapToBoolean returning such a hypothetical BooleanStream.
Oracle seems to keep following this pattern, which can also be found in Collectors. There is no such support for float primitives either (only for double). In my opinion, unlike float, boolean support would make sense to implement.
Back to the code... if you have a boxed boolean array (i.e. Boolean[] array), things get easier:
Boolean[] array = ...
Stream<Boolean> streamOfBoxedBoolean1 = Arrays.stream(array);
Stream<Boolean> streamOfBoxedBoolean2 = Stream.of(array);
Otherwise you have to use more than one statement, as described in the other answers.
However, you asked (emphasis mine):
way to convert given boolean[] foo array into stream in Java-8 in one statement.
... there is actually a way to achieve this in one statement, using a Spliterator made from an Iterator. It is definitely not nice, but:
boolean[] array = ...
Stream<Boolean> stream = StreamSupport.stream(
    Spliterators.spliteratorUnknownSize(
        // Explicit type argument: the diamond on anonymous classes needs Java 9+.
        new Iterator<Boolean>() {
            int index = 0;
            @Override public boolean hasNext() { return index < array.length; }
            @Override public Boolean next() { return array[index++]; }
        }, 0), false);

Creating composite key class for Secondary Sort

I am trying to create a composite key class made of a String uniqueCarrier and an int month for Secondary Sort. Can anyone tell me what steps are needed?
Looks like you have an equality problem, since you're not using uniqueCarrier in your compareTo method. You need to use uniqueCarrier in both your compareTo and equals methods (and define an equals method if you haven't). From the java.lang.Comparable documentation:
The natural ordering for a class C is said to be consistent with equals if and only if e1.compareTo(e2) == 0 has the same boolean value as e1.equals(e2) for every e1 and e2 of class C. Note that null is not an instance of any class, and e.compareTo(null) should throw a NullPointerException even though e.equals(null) returns false.
You can also implement a RawComparator so that you can compare them without deserializing for some faster performance.
However, I recommend (as I always do) to not write things like Secondary Sort yourself. These have been implemented (as well as dozens of other optimizations) in projects like Pig and Hive. E.g. if you were using Hive, all you need to write is:
SELECT ...
FROM my_table
ORDER BY month, carrier;
The above is a lot simpler to write than trying to figure out how to write Secondary Sorts (and eventually when you need to use it again, how to do it in a generic fashion). MapReduce should be considered a low level programming paradigm and should only be used (IMHO) when you need high performance optimizations that you don't get from higher level projects like Pig or Hive.
EDIT: Forgot to mention about Grouping comparators, see Matt's answer
Your compareTo() implementation is incorrect. You need to sort first on uniqueCarrier, then on month to break equality:
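@Override
public int compareTo(CompositeKey other) {
    // Order by carrier first; use month only to break ties between equal carriers.
    int byCarrier = this.getUniqueCarrier().compareTo(other.getUniqueCarrier());
    if (byCarrier != 0) {
        return byCarrier;
    }
    // Assumes getMonth() returns an int, as described in the question.
    return Integer.compare(this.getMonth(), other.getMonth());
}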
One suggestion though: I typically choose to implement my attributes directly as Writable types if possible (for example, IntWritable month and Text uniqueCarrier). This allows me to call write and readFields directly on them, and also use their compareTo. Less code to write is always good...
Speaking of less code, you don't have to call the parent constructor for your composite key.
Now for what is left to be done:
My guess is you are still missing a hashCode() method, which should only return the hash of the attribute you want to group on, in this case uniqueCarrier. This method is called by the default Hadoop partitioner to distribute work across reducers.
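To make the relationship between compareTo, equals and hashCode concrete, here is a dependency-free sketch. It deliberately leaves out WritableComparable, write and readFields so that it runs without Hadoop on the classpath; the class name CompositeKeySketch is hypothetical, and the fields are assumed to be a String carrier and an int month as in the question.

```java
import java.util.Objects;

public class CompositeKeySketch implements Comparable<CompositeKeySketch> {
    private final String uniqueCarrier;
    private final int month;

    public CompositeKeySketch(String uniqueCarrier, int month) {
        this.uniqueCarrier = Objects.requireNonNull(uniqueCarrier);
        this.month = month;
    }

    // Sort on carrier first, then on month to break ties.
    @Override
    public int compareTo(CompositeKeySketch other) {
        int byCarrier = uniqueCarrier.compareTo(other.uniqueCarrier);
        return byCarrier != 0 ? byCarrier : Integer.compare(month, other.month);
    }

    // Consistent with compareTo: equal exactly when both fields match.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof CompositeKeySketch)) return false;
        CompositeKeySketch other = (CompositeKeySketch) o;
        return month == other.month && uniqueCarrier.equals(other.uniqueCarrier);
    }

    // Hash only the grouping attribute, so all months of one carrier
    // produce the same hash (and would land in the same partition).
    @Override
    public int hashCode() {
        return uniqueCarrier.hashCode();
    }

    public static void main(String[] args) {
        CompositeKeySketch jan = new CompositeKeySketch("AA", 1);
        CompositeKeySketch feb = new CompositeKeySketch("AA", 2);
        System.out.println(jan.hashCode() == feb.hashCode()); // prints true
        System.out.println(jan.compareTo(feb) < 0);           // prints true
    }
}
```

Hashing only uniqueCarrier still honours the hashCode contract, because two keys that are equal necessarily have the same carrier.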
I would also write custom GroupingComparator and SortingComparator to make sure grouping happens only on uniqueCarrier, and that sorting behaves according to CompositeKey compareTo():
public class CompositeGroupingComparator extends WritableComparator {
public CompositeGroupingComparator() {
super(CompositeKey.class, true);
}
@Override
public int compare(WritableComparable a, WritableComparable b) {
CompositeKey first = (CompositeKey) a;
CompositeKey second = (CompositeKey) b;
return first.getUniqueCarrier().compareTo(second.getUniqueCarrier());
}
}
public class CompositeSortingComparator extends WritableComparator {
public CompositeSortingComparator()
{
super (CompositeKey.class, true);
}
@Override
public int compare (WritableComparable a, WritableComparable b){
CompositeKey first = (CompositeKey) a;
CompositeKey second = (CompositeKey) b;
return first.compareTo(second);
}
}
Then, tell your Driver to use those two:
job.setSortComparatorClass(CompositeSortingComparator.class);
job.setGroupingComparatorClass(CompositeGroupingComparator.class);
Edit: Also see Pradeep's suggestion of implementing RawComparator to prevent having to unmarshall to an Object each time, if you want to optimize further.

compareTo in WritableComparable is intended for?

I am trying to emit 2 matrices as my key and value. One matrix as key and the other as value.
I wrote my class which implements WritableComparable.
But I am confused what to write with in:
@Override
public int compareTo(MW o) {
// TODO Auto-generated method stub
return 0;
}
What is this CompareTo() intended for?
To cite the Java documentation:
This interface imposes a total ordering on the objects of each class
that implements it. This ordering is referred to as the class's
natural ordering, and the class's compareTo method is referred to as
its natural comparison method.
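Usually you return 0 if both objects are equal, a negative number if this object sorts before the other one, and a positive number if it sorts after it. In Hadoop, this ordering is what the framework uses to sort keys before they reach the reducers.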
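For a matrix-valued key, one possible ordering is sketched below. Everything here is an assumption made for illustration: the class name MatrixKey, the int[][] representation, and the choice to order by row count first and then row by row (row length, then cell values). It also omits the Writable methods so that it runs without Hadoop. Any consistent total ordering would do.

```java
public class MatrixKey implements Comparable<MatrixKey> {
    private final int[][] cells;

    public MatrixKey(int[][] cells) {
        this.cells = cells;
    }

    @Override
    public int compareTo(MatrixKey other) {
        // Fewer rows sorts first.
        int byRows = Integer.compare(cells.length, other.cells.length);
        if (byRows != 0) return byRows;

        for (int r = 0; r < cells.length; r++) {
            // Within a row, the shorter row sorts first.
            int byCols = Integer.compare(cells[r].length, other.cells[r].length);
            if (byCols != 0) return byCols;

            // Same shape so far: the first differing cell decides.
            for (int c = 0; c < cells[r].length; c++) {
                int byCell = Integer.compare(cells[r][c], other.cells[r][c]);
                if (byCell != 0) return byCell;
            }
        }
        return 0; // same shape and same contents
    }

    public static void main(String[] args) {
        MatrixKey a = new MatrixKey(new int[][] {{1, 2}, {3, 4}});
        MatrixKey b = new MatrixKey(new int[][] {{1, 2}, {3, 5}});
        System.out.println(a.compareTo(a));     // prints 0
        System.out.println(a.compareTo(b) < 0); // prints true
    }
}
```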

Can we sort an IList partially?

IList<A_Desc,A_premium,B_Desc,B_Premium>
Can I sort two columns A_Desc,A_premium...based on A_Desc ?
And let B_Desc, B_Premium remain in the same order as before sorting?
First off, a list can only be of one type, and only has one "column" of data, so you actually want two lists and a data type that holds "desc" and "premium". "desc" sounds like a String to me; I don't know what Premium is, but I'll pretend it's a double for lack of better ideas. I don't know what this data is supposed to represent, so to me, it's just some thingie.
public class Thingie{
public String desc;
public double premium;
}
That is, of course, a terrible way to define the class- I should instead have desc and premium be private, and Desc and Premium as public Properties with Get and Set methods. But this is the fastest way for me to get the point across.
It's more canonical to make Thingie implement IComparable, and compare itself to other Thingie objects. But I'm editing an answer I wrote before I knew you needed to write a custom type, and had the freedom to just make it implement IComparable. So here's the IComparer approach, which lets you sort objects that don't sort themselves by telling C# how to sort them.
Implement an IComparer that operates over your custom type.
public class ThingieSorter: IComparer<Thingie>{
public int Compare(Thingie t1, Thingie t2){
// Compare field to field: first by desc, then by premium to break ties.
int r = t1.desc.CompareTo(t2.desc);
if(r != 0){return r;}
return t1.premium.CompareTo(t2.premium);
}
}
C# doesn't require IList to implement Sort- it might be inefficient if it's a LinkedList. So let's make a new list, based on arrays, which does sort efficiently, and sort it:
public List<Thingie> sortedOf(IList<Thingie> list){
List<Thingie> ret = new List<Thingie>(list);
ret.Sort(new ThingieSorter());
return ret;
}
List<Thingie> implements the interface IList<Thingie>, so replacing your original list with this one shouldn't break anything, as long as you have nothing holding onto the original list and magically expecting it to be sorted. If that's happening, refactor your code so it doesn't grab the reference until after your list has been sorted, since it can't be sorted in place.
