Collections.sort() issue in java7 - sorting

Is there a sort issue with java7? I am using Collections.sort(list, comparator)
When I switched over to java7, I noticed that the sorting resulted in a different list compared to the result when I was using java6.
Example: List = [d, e, b, a, c, f, g, h]
In java6 Collections.sort(List, comparator) resulted in [a, b, c, d, e, f, g, h]
In java7 Collections.sort(List, comparator) resulted in [b, a, c, d, e, f, g, h]
The first two values in the list have been swapped.

Java 7 switched from Merge sort to Tim sort. It might result in slight changes in order with "broken comparators" (quoting comment in source code of Arrays class):
/**
* Old merge sort implementation can be selected (for
* compatibility with broken comparators) using a system property.
* Cannot be a static boolean in the enclosing class due to
* circular dependencies. To be removed in a future release.
*/
Try running your JVM with:
java -Djava.util.Arrays.useLegacyMergeSort=true
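A "broken comparator" here means one that violates the Comparator contract, for example one where compare(a, b) and compare(b, a) disagree, or one that is not transitive. With such a comparator the two algorithms can produce different orders, and TimSort may even throw IllegalArgumentException ("Comparison method violates its general contract!").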
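One common reading of "broken comparator" is a comparison function that violates the usual contract (sign symmetry and transitivity). The following is a minimal Python sketch of a brute-force check over a few sample values; the names contract_violations, never_equal and three_way are made up for this illustration and are not part of any Java or Python API.

```python
from itertools import product


def sign(n):
    """Return -1, 0 or 1 depending on the sign of n."""
    return (n > 0) - (n < 0)


def contract_violations(cmp, samples):
    """Brute-force check of two comparator rules over the given samples."""
    problems = []
    # Rule 1: sign(cmp(x, y)) must be the opposite of sign(cmp(y, x)).
    for x, y in product(samples, repeat=2):
        if sign(cmp(x, y)) != -sign(cmp(y, x)):
            problems.append(("antisymmetry", x, y))
    # Rule 2: x > y and y > z must imply x > z.
    for x, y, z in product(samples, repeat=3):
        if cmp(x, y) > 0 and cmp(y, z) > 0 and not cmp(x, z) > 0:
            problems.append(("transitivity", x, y, z))
    return problems


def never_equal(x, y):
    # Hypothetical broken comparator: it never returns 0 for equal values.
    return -1 if x < y else 1


def three_way(x, y):
    # Well-behaved comparator for comparison.
    return (x > y) - (x < y)


print(contract_violations(three_way, [1, 2, 3]))    # no violations
print(contract_violations(never_equal, [1, 2, 3]))  # equal pairs are reported
```

A comparator that never returns 0 for equal elements is a typical case: a sort may happen to work with it, but the resulting order is not guaranteed to be the same across algorithms.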

One thing to note that might be causing confusion: Collections.sort is a stable sort. This means that elements which compare as equal keep their original relative order, so:
if a and b compare as equal, then
Collections.sort([d, e, b, a, c, f, g, h]) = [b, a, c, d, e, f, g, h]
and
Collections.sort([d, e, a, b, c, f, g, h]) = [a, b, c, d, e, f, g, h]
It seems likely to me that either this is what you're seeing, or the Comparator in question (or the natural ordering of the objects being sorted) isn't working the way you expect it to.
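The stability effect described above can be reproduced in a few lines. This is a sketch in Python rather than Java (Python's sorted is also guaranteed to be stable); the key function same_rank_for_a_and_b is a hypothetical stand-in for a comparator that treats a and b as equal.

```python
def same_rank_for_a_and_b(letter):
    # Hypothetical ordering in which 'a' and 'b' compare as equal.
    return "a" if letter in ("a", "b") else letter


first = sorted(["d", "e", "b", "a", "c", "f", "g", "h"], key=same_rank_for_a_and_b)
second = sorted(["d", "e", "a", "b", "c", "f", "g", "h"], key=same_rank_for_a_and_b)

print(first)   # ['b', 'a', 'c', 'd', 'e', 'f', 'g', 'h']
print(second)  # ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
```

In both runs the two "equal" elements come out in the order in which they appeared in the input, which is exactly what a stable sort promises.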

Related

Prolog: Debugging Recursive Source Removal Algorithm

I am currently working on implementing a source-removal topological sorting algorithm for a directed graph. Basically the algorithm goes like this:
1. Find a node in the graph with no incoming edges
2. Remove that node and all edges coming out of it, and write its value down
3. Repeat 1 and 2 until you eliminate all nodes
So, for example, the graph [figure not included] would have a topological sort of a,e,b,f,c,g,d,h. (Note: topological sorts aren't unique, so other valid orders exist as well.)
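For comparison, the three steps above can be sketched in Python. This is only an illustration under assumptions: the graph is stored as a dict mapping each node to its list of destinations (the same information as the list form used in this question), in-degrees are counted instead of physically deleting edges, and the name source_removal is reused purely for readability.

```python
from collections import deque


def source_removal(graph):
    """Return one topological order of graph (dict: node -> list of targets)."""
    indegree = {node: 0 for node in graph}
    for targets in graph.values():
        for target in targets:
            indegree[target] += 1

    # Step 1: nodes with no incoming edges are the current sources.
    sources = deque(node for node, degree in indegree.items() if degree == 0)
    order = []
    while sources:
        node = sources.popleft()
        order.append(node)                # Step 2: write the value down ...
        for target in graph[node]:        # ... and drop its outgoing edges.
            indegree[target] -= 1
            if indegree[target] == 0:
                sources.append(target)

    # Step 3 ends when no source is left; leftovers mean there is a cycle.
    if len(order) != len(graph):
        raise ValueError("graph contains a cycle")
    return order


GRAPH = {
    "a": ["b", "e", "f"], "b": ["f", "g"], "c": ["g", "d"], "d": ["h"],
    "e": ["f"], "f": [], "g": ["h"], "h": [],
}
print(source_removal(GRAPH))
```

This variant returns a single order; which one depends on the order in which sources are taken from the queue.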
I am currently working on a Prolog implementation of this with the graph being represented in list form as follows:
[ [a,[b,e,f]], [b,[f,g]], [c,[g,d]], [d,[h]], [e,[f]], [f,[]],
[g,[h]], [h,[]] ]
Where the [a, [b,e,f] ] term for example represents the edges going from a to b, e, and f respectively, and the [b, [f,g] ] term represents the edges going from b to f and g. In other words, the first item in the array "tuple" is the "from" node and the following array contains the destinations of edges coming from the "from" node.
I am also operating under the assumption that each vertex has a unique name, so when I find it I can delete it without worrying about potential duplicates.
I wrote the following code
% depends_on shows that D is adjacent to A, i.e. I travel from A to D on the graph
% returns true if A ----> D
depends_on(G,A,D) :- member([A,Ns],G), member(D,Ns).
% doesnt_depend_on shows that node D doesnt have paths leading to it
doesnt_depend_on(G, D) :- \+ depends_on(G, _, D).
% removes node from a graph with the given value
remove_graph_node([ [D,_] | T], D, T). % base case -- FOUND IT return the tail only since we already popped it
remove_graph_node([ [H,Ns] | T], D, R) :- \+ H=D,remove_graph_node( T, D, TailReturn), append([[H,Ns]], TailReturn, R).
%----------------------------------------------------
source_removal([], []]). % Second parameter is empty list due to risk of a cycle
source_removal(G,Toposort):-member([D,_], G),
doesnt_depend_on(G,D),
remove_graph_node(G,D,SubG),
source_removal(SubG, SubTopoSort),
append([D], SubTopoSort, AppendResult),
Toposort is AppendResult.
And I tested the depends_on, doesnt_depend_on, and remove_graph_node by hand using the graph [ [a,[b,e,f]], [b,[f,g]], [c,[g,d]], [d,[h]], [e,[f]], [f,[]], [g,[h]], [h,[]] ] and manually changing the parameter variables (especially when it comes to node names like a, b, c and etc). I can vouch after extensive testing that they work.
However, my issue is debugging the source_removal command. In it, I repeatedly remove a node with no directed edge pointing towards it along with its outgoing edges and then try to add the node's name to the Toposort list I am building.
At the end of the function's running, I expect to get an array of output like [a,e,b,f,c,g,d,h] for its Toposort parameter. Instead, I got
?- source_removal([ [a,[b,e,f]], [b,[f,g]], [c,[g,d]], [d,[h]], [e,[f]], [f,[]], [g,[h]], [h,[]] ], Result).
false.
I got false as an output instead of the list I am trying to build.
I have spent hours trying to debug the source_removal predicate but failed to come up with anything. I would greatly appreciate it if anyone would be willing to take a look at this with a fresh pair of eyes and help me figure out what the issue in source_removal is.
Thanks for the time spent reading this post and in advance.
The first clause for source_removal/2 contained a typo (one superfluous closing square bracket).
The last line of the second clause in your code says Toposort is AppendResult. Note that is/2 is used in Prolog to evaluate an arithmetic expression, e.g., X is 3+4 yields X = 7 (instead of just unifying the variable X with the term 3+4). Since AppendResult is a list and not an arithmetic expression, that goal cannot succeed. When I change that line to use = (unification, which is not the same as assignment) instead of is (arithmetic evaluation) like so
source_removal([], []). % Second parameter is empty list due to risk of a cycle
source_removal(G,Toposort):-member([D,_], G),
doesnt_depend_on(G,D),
remove_graph_node(G,D,SubG),
source_removal(SubG, SubTopoSort),
append([D], SubTopoSort, AppendResult),
Toposort = AppendResult.
I get the following result:
?- source_removal([ [a,[b,e,f]], [b,[f,g]], [c,[g,d]], [d,[h]], [e,[f]], [f,[]], [g,[h]], [h,[]] ], Result).
Result = [a, b, c, d, e, f, g, h] ;
Result = [a, b, c, d, e, g, f, h] ;
Result = [a, b, c, d, e, g, h, f] ;
Result = [a, b, c, d, g, e, f, h] ;
Result = [a, b, c, d, g, e, h, f] ;
Result = [a, b, c, d, g, h, e, f] ;
Result = [a, b, c, e, d, f, g, h] ;
Result = [a, b, c, e, d, g, f, h] ;
Result = [a, b, c, e, d, g, h, f] ;
Result = [a, b, c, e, f, d, g, h] ;
Result = [a, b, c, e, f, g, d, h] ;
...
Result = [c, d, a, e, b, g, h, f] ;
false.
(Shortened, it shows 140 solutions in total.)
Edit: I didn't check all the solutions, but among the ones it finds is the one you gave in your example ([a,e,b,f,c,g,d,h]), and they look plausible in the sense that each either starts with a or with c.
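As a cross-check of the solution count, here is a small Python sketch that enumerates every topological order by backtracking, in the same spirit as the Prolog clauses: pick any remaining node that no remaining node points to, remove it, and recurse. The dict representation and the function name all_topological_orders are assumptions made for this illustration.

```python
def all_topological_orders(graph):
    """Yield every topological order of graph (dict: node -> list of targets)."""

    def extend(remaining, prefix):
        if not remaining:
            yield list(prefix)
            return
        for node in remaining:
            # A node is a source if no remaining node has an edge to it.
            has_incoming = any(node in graph[other] for other in remaining)
            if not has_incoming:
                rest = [n for n in remaining if n != node]
                yield from extend(rest, prefix + [node])

    yield from extend(list(graph), [])


GRAPH = {
    "a": ["b", "e", "f"], "b": ["f", "g"], "c": ["g", "d"], "d": ["h"],
    "e": ["f"], "f": [], "g": ["h"], "h": [],
}
orders = list(all_topological_orders(GRAPH))
print(len(orders))  # 140
print(orders[0])    # ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
```

The count matches the 140 solutions reported above, and the order from the question, [a,e,b,f,c,g,d,h], is among them.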

Create an order from several incomplete orders

Here is the motivation and how it should behave, but I need help with how to implement it.
I have several (typically) incomplete orders given as ordered values, for example:
1. A, C, D
2. D, E
3. X, B
4. B, C
5. C, F
6. C, A
and the resulting order should be:
A, X, B, C, D, E, F or A, X, B, C, F, D, E or A, X, B, C, D, F, E
The idea behind it is to sort the result based on first-seen order. I will try to explain it on the example in steps:
order A, C, D
D, E - D seen, so add E after D, so order A, C, D, E
X, B - no value seen yet, so we can not determine the order now, so create 2nd temporary order X, B
B, C - C already seen, so order A, B, C, D, E
and 2nd order can be merged via B, so A, X, B, C, D, E
C, F - C seen, so order A, X, B, C, D, E, F
C, A - ignore, both values are part of already defined order (by first incomplete order A, C, D)
But what if an additional incomplete order such as F, D or F, E is appended to the input? The step-by-step mental algorithm would fail, because F has already been placed.
How can this idea be implemented?
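One possible way to read the steps above is as a constraint graph. This is a swapped-in technique, not the step-by-step insertion described in the question: every adjacent pair in an incomplete order becomes a "before" edge, an edge that would contradict earlier ones (such as C, A) is ignored, and the result is a topological sort that prefers values seen first. The sketch below is in Python, and merge_orders is a hypothetical name.

```python
import heapq
from collections import defaultdict


def merge_orders(orders):
    """Combine incomplete orders into one order; earlier input wins conflicts."""
    first_seen = {}              # value -> index of its first appearance
    successors = defaultdict(set)

    def reaches(start, goal):
        # Depth-first search: is goal reachable from start via accepted edges?
        stack, visited = [start], set()
        while stack:
            node = stack.pop()
            if node == goal:
                return True
            if node not in visited:
                visited.add(node)
                stack.extend(successors[node])
        return False

    for order in orders:
        for value in order:
            first_seen.setdefault(value, len(first_seen))
        for before, after in zip(order, order[1:]):
            # Ignore a pair that contradicts what is already known.
            if not reaches(after, before):
                successors[before].add(after)

    indegree = {value: 0 for value in first_seen}
    for before in list(successors):
        for after in successors[before]:
            indegree[after] += 1

    # Topological sort; among ready values take the one seen first.
    ready = [(first_seen[v], v) for v, d in indegree.items() if d == 0]
    heapq.heapify(ready)
    result = []
    while ready:
        _, value = heapq.heappop(ready)
        result.append(value)
        for after in successors[value]:
            indegree[after] -= 1
            if indegree[after] == 0:
                heapq.heappush(ready, (first_seen[after], after))
    return result


ORDERS = [["A", "C", "D"], ["D", "E"], ["X", "B"], ["B", "C"], ["C", "F"], ["C", "A"]]
print(merge_orders(ORDERS))                 # ['A', 'X', 'B', 'C', 'D', 'E', 'F']
print(merge_orders(ORDERS + [["F", "D"]]))  # ['A', 'X', 'B', 'C', 'F', 'D', 'E']
```

Because all constraints are collected before anything is placed, a late order such as F, D simply adds one more edge instead of invalidating an earlier placement. Whether "first seen wins" is the right tie-break and conflict rule is an assumption that may need adjusting.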

How to get structure name?

If I have struct_name(a, b, c, d, e)., how can I get the name of the struct? In this case, it would be struct_name.
Is there a specific predicate for this, or should I somehow transform the term into a list (I tried, and atom_chars doesn't work) and collect the characters until I reach the ( ?
One solution is to use functor/3.
Example:
?- Term = struct_name(a, b, c, d, e),
functor(Term, F, Arity).
Term = struct_name(a, b, c, d, e),
F = struct_name,
Arity = 5.
Related term inspection predicates are arg/3 and (=..)/2.
The use of such predicates often indicates a problem with your data structure design, and typically severely limits the generality of your relations.
Note in particular that you can use them only if their arguments are sufficiently instantiated.
For example:
?- functor(Term, F, A).
ERROR: Arguments are not sufficiently instantiated
This means that you can no longer use such predicates for generating answers.
You can use the (=..)/2 predicate (this is an ISO predicate, so it should work on (almost) all Prolog interpreters). It has a term on the left side, and on the right side a list consisting of the name of the functor followed by its arguments.
So:
?- struct_name(a, b, c, d, e) =.. L.
L = [struct_name, a, b, c, d, e].
You can thus obtain the name of the struct with:
get_name(A,N) :-
A =.. [N|_].
When you then call it with struct_name(a, b, c, d, e), it will give you:
?- get_name(struct_name(a, b, c, d, e),N).
N = struct_name.

Prolog ensure a rule's return parameters are unique and in canonical order

I have some data declared in a Prolog file that looks like the following:
gen1(grass).
gen1(poison).
gen1(psychic).
gen1(bug).
gen1(rock).
...
gen1((poison, flying)).
gen1((ghost, poison)).
gen1((water, ice)).
...
weak1(grass, poison).
weak1(grass, bug).
weak1(poison, rock).
strong1(grass, rock).
strong1(poison, grass).
strong1(bug, grass).
strong1(poison, bug).
strong1(psychic, poison).
strong1(bug, poison).
strong1(bug, psychic).
strong1(rock, bug).
Note that the data does not define strong1 or weak1 for compound gen1(...). Those are determined by rules which do not contribute to the minimal working example. I mention them because it might be useful to know they exist.
I am trying to find relations between these terms that form a cycle. Here's one sample function:
triangle1(A, B, C) :-
setof(A-B-C, (
gen1(A), gen1(B), gen1(C), A \= B, A \= C, B \= C,
strong1(A, B), strong1(B, C), strong1(C, A)
), Tris),
member(A-B-C, Tris).
This setup does remove duplicates where A, B, and C are in the same order. However, it doesn't remove duplicates in different orders. For example:
?- triangle1(A, B, C),
member(A, [bug, grass, rock]),
member(B, [bug, rock, grass]),
member(C, [bug, rock, grass]).
A = bug,
B = grass,
C = rock ;
A = grass,
B = rock,
C = bug ;
A = rock,
B = bug,
C = grass ;
false.
That query should only return one set of [A, B, C].
I've thought about using sort/2, but there are cases where simply sorting changes the meaning of the answer:
?- triangle1(A, B, C),
sort([A, B, C], [D, E, F]),
\+member([D, E, F], [[A, B, C], [B, C, A], [C, A, B]]).
A = D, D = bug,
B = F, F = psychic,
C = E, E = poison .
I also tried < and >, but those don't work on atoms, apparently.
Any thoughts?
(I looked at the similar questions, but have no idea how what I'm doing here compares to what other people are doing)
EDIT: As per comment about minimal working example.
You can impose an order inside the setof/3 call, so that triples in the wrong order are never generated.
I mean: when calling setof/3, instead of
A \= B, A \= C, B \= C,
try with
A @< B, A @< C, B \= C,
Note that @< compares terms in the standard order of terms, so it works on atoms and on compound terms such as (poison, flying), whereas < and the CLP(FD) operator #< expect numbers.
This way you require A to be lower than both B and C, so only one rotation of each cycle is kept; you avoid duplicates and still keep the correct solutions.
The full triangle1/3
triangle1(A, B, C) :-
setof(A-B-C, (
gen1(A), gen1(B), gen1(C), A @< B, A @< C, B \= C,
strong1(A, B), strong1(B, C), strong1(C, A)
), Tris),
member(A-B-C, Tris).
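The idea of keeping only the rotation whose first element is the smallest can be illustrated outside Prolog as well. The Python sketch below uses the strong1 facts from the question as a set of pairs; the names STRONG and triangles are made up for this example, and the compound types are left out.

```python
from itertools import permutations

# The strong1/2 facts from the question, as (attacker, defender) pairs.
STRONG = {
    ("grass", "rock"), ("poison", "grass"), ("bug", "grass"), ("poison", "bug"),
    ("psychic", "poison"), ("bug", "poison"), ("bug", "psychic"), ("rock", "bug"),
}


def triangles(strong):
    """Return each 3-cycle once, rotated so that its smallest member is first."""
    types = sorted({t for pair in strong for t in pair})
    found = []
    for a, b, c in permutations(types, 3):
        # Requiring a to be the smallest member rejects the other two rotations.
        if a < b and a < c and (a, b) in strong and (b, c) in strong and (c, a) in strong:
            found.append((a, b, c))
    return found


print(triangles(STRONG))
# [('bug', 'grass', 'rock'), ('bug', 'psychic', 'poison')]
```

Only the first element is constrained, so the direction of the cycle is preserved: bug, psychic, poison is kept as is rather than being sorted into bug, poison, psychic, which would change its meaning.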

Random distribution between evenly sized buckets without repetition

Problem
I have N items of various types evenly distributed into their own buckets determined by type. I want to create a new list that:
randomly picks from each bucket
does not pick from the same bucket twice in a row
each bucket must have (if possible) an equal amount of representation in the final list
not using language specific libraries (not easily implemented in another language)
Example
I have 12 items of 4 distinct types which means I have 4 buckets:
Bucket A - [a, a, a]
Bucket B - [b, b, b]
Bucket C - [c, c, c]
Bucket D - [d, d, d]
What I want
A list of the above items in a random distribution without any characters repeating with a size between 1 and N.
12 Items: a, d, c, a, b, a, c, d, c, b, d, b
8 Items: c, a, d, a, b, d, c, b
4 Items: c, b, d, a
3 Items: b, c, a (Skipping D)
I was trying to do this with a while loop that generates random integers until the next bucket isn't equal to the previously used bucket, but that seems inefficient, and was hoping someone else might have a better algorithm to solve this problem.
You could generate a random list of the buckets, and then randomly pick from them in order, removing the bucket from the list when you pick from it. When the list is empty, regenerate a random list of buckets, repeating until you have picked the desired number of items.
Can you repeat items from the buckets? So if you pick the 1st "a" from bucket A the first time around, can you pick it a 2nd time? That'll change the solution.
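The round-based idea above can be sketched as follows. This is a minimal Python illustration under stated assumptions: items are not reused once picked, buckets are evenly sized as in the question, and a new round is adjusted when it would start with the bucket that ended the previous round (a detail the description leaves open). The name pick_without_repeats and the rng parameter are made up for this example; only the standard random module is used.

```python
import random


def pick_without_repeats(buckets, count, rng=random):
    """Pick up to count items, never twice in a row from the same bucket.

    buckets maps a bucket label to its list of items.
    """
    pools = {label: list(items) for label, items in buckets.items()}
    for pool in pools.values():
        rng.shuffle(pool)

    picked = []
    last_label = None
    while len(picked) < count:
        # One round: every non-empty bucket once, in random order.
        round_order = [label for label in pools if pools[label]]
        if not round_order:
            break  # nothing left to pick from
        rng.shuffle(round_order)
        if len(round_order) > 1 and round_order[0] == last_label:
            # Avoid a repeat across the round boundary (slightly biases the order).
            round_order[0], round_order[-1] = round_order[-1], round_order[0]
        for label in round_order:
            if len(picked) == count:
                break
            picked.append(pools[label].pop())
            last_label = label
    return picked


BUCKETS = {"A": ["a"] * 3, "B": ["b"] * 3, "C": ["c"] * 3, "D": ["d"] * 3}
for size in (3, 4, 8, 12):
    print(size, pick_without_repeats(BUCKETS, size))
```

Because every round visits each bucket once, the buckets stay equally represented, and no retry loop is needed. With a single bucket, or with uneven buckets where only one is left, repeats cannot be avoided.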
Edited in response to the constraint that no two consecutive draws may come from the same bucket. It's simple to throw away permutations that don't meet your criteria. Note that this will fail (as is) if two buckets have identical "labels".
A little hack with itertools and random.sample for a permutation:
import random
import itertools as itr
from math import ceil
def buckets_choice(N, labels):
    # Number of full rounds of labels needed to supply at least N draws.
    items = ceil(N / len(labels))
    b = list(itr.chain(*(labels for _ in range(items))))
    while True:
        p = random.sample(b, len(b))
        # Compare every element with its predecessor; reject on any repeat.
        cond = map((lambda x, y: x == y), p[1:], p[:-1])
        if not any(cond): return p[:N]
L = ['a', 'b', 'c', 'd']
for _ in range(5):
    print(buckets_choice(3, L), buckets_choice(8, L), buckets_choice(12, L))
A sample run gives (quote marks removed for clarity). These rows come from a run that compared p[1:] with p[:1], which checks only the first pair, so a few of them still show adjacent repeats; with p[:-1] as above, such rows are rejected:
(a, b, d) (b, d, c, a, d, a, b, c) (b, c, d, c, d, a, d, b, a, c, b, a)
(b, a, d) (d, a, c, c, a, b, b, d) (c, b, a, b, a, c, b, d, d, a, d, c)
(b, d, a) (b, c, c, a, b, a, d, d) (a, d, a, d, c, b, d, c, a, b, c, b)
(d, c, b) (c, d, a, b, c, b, a, d) (c, b, a, a, b, c, d, c, b, a, d, d)
(b, d, a) (c, b, b, d, c, a, d, a) (c, b, d, a, d, b, b, d, c, a, a, c)
