Comparing pairs of objects

Comparing pairs of objects - algorithm

This is a fairly simple question, but I can't remember all my coding and data structures and feel a little blank.
Let's say I have a list/array of things (e.g. structures or objects). There is a certain property (true or false) that needs to hold between all pairs of these objects. What would be the fastest method to check whether the property is violated between any pair of objects?

Unless you have additional information about the property (for example, that it is transitive), your only option is to check the property for every pair from the list, with two nested loops:
    for (int i = 0 ; i != N ; i++)
        for (int j = 0 ; j != N ; j++)
            if (i != j) // Skip self-comparison: the property is not assumed to be reflexive
                // This checks the property both ways, i.e.
                // it does not assume that the property is symmetric.
                checkProperty(list[i], list[j]);
For symmetric (commutative) properties, i.e. when A ? B implies B ? A, where ? denotes the property check, you can do it in half the comparisons by starting the second loop at j = i+1.
If the property is transitive (i.e. when A ? B and B ? C imply A ? C) you can build a faster check.
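One hedged illustration of the transitive case: if the property only has to hold for pairs taken in list order (every list[i] ? list[j] with i < j), checking neighbouring elements is enough, because transitivity chains the neighbouring results together. The Python sketch below assumes exactly that situation; `holds_for_ordered_pairs` and `related` are made-up names, not part of the answer above.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def holds_for_ordered_pairs(items: Sequence[T],
                            related: Callable[[T, T], bool]) -> bool:
    """Return True if related(items[i], items[j]) holds for all i < j.

    Assumption: `related` is transitive, so checking the len(items) - 1
    neighbouring pairs is sufficient. Without transitivity this is wrong.
    """
    return all(related(items[k], items[k + 1])
               for k in range(len(items) - 1))
```

With `related` set to "strictly less than", this is the familiar O(N) "is the list sorted?" test rather than an O(N^2) scan.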

You'll need a double loop to compare every item. Assuming you just need to check every 2-item combination (i.e. order doesn't matter), you can just loop through the remaining items in the inner loop.
    for (int i = 0 ; i < N ; i++)
        for (int j = i+1 ; j < N ; j++)
            checkProperty(list[i], list[j]);
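The same "each unordered pair once" loop can be sketched in Python with `itertools.combinations`, stopping at the first violating pair. This assumes the property is symmetric; `first_violation` and `holds` are illustrative names only.

```python
from itertools import combinations
from typing import Callable, Optional, Sequence, Tuple, TypeVar

T = TypeVar("T")


def first_violation(items: Sequence[T],
                    holds: Callable[[T, T], bool]) -> Optional[Tuple[T, T]]:
    """Return the first pair (in list order) that violates the property,
    or None if every unordered pair satisfies it.

    Assumption: `holds` is symmetric, so each pair is checked only once.
    """
    for a, b in combinations(items, 2):
        if not holds(a, b):
            return (a, b)  # early exit: one violation is enough
    return None
```

The worst case is still N(N-1)/2 checks, but the early return avoids extra work as soon as a violation turns up.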

Related

List intersection with O(n·m)

Suppose we have two lists of lengths n and m respectively:
val l1 = Seq(1,2,3,4,5,6,7,8,9,0)
val l2 = Seq(2,4,6,8,10,12)
Is there a way to calculate their intersection in less than O(n·m) time?
That is
val result = Seq(2,4,6,8)
EDIT: we can assume our lists are sorted.

For sorted lists the following algorithm should work.
Keep two pointers, i into l1 and j into l2, and advance them as follows:
    while (i < l1.size && j < l2.size) {
        if (l1[i] < l2[j])
            i++
        else if (l1[i] == l2[j]) {
            output = output U {l1[i]}   // record the match before advancing
            i++; j++
        } else
            j++
    }
This runs in O(n + m), because every step advances at least one of the pointers.
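A minimal Python sketch of the two-pointer walk described above, assuming both inputs are sorted in ascending order (the function name `intersect_sorted` is made up for illustration):

```python
from typing import List, Sequence


def intersect_sorted(xs: Sequence[int], ys: Sequence[int]) -> List[int]:
    """Intersect two ascending sequences in O(len(xs) + len(ys))."""
    i, j = 0, 0
    out: List[int] = []
    while i < len(xs) and j < len(ys):
        if xs[i] < ys[j]:
            i += 1                # xs[i] is too small to match anything left in ys
        elif xs[i] > ys[j]:
            j += 1                # ys[j] is too small to match anything left in xs
        else:
            out.append(xs[i])     # record the match, then advance both pointers
            i += 1
            j += 1
    return out
```

Note that the question's l1 ends with 0, so it would have to be sorted first for this approach to apply.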

Put the items of the shorter list into a hash set. This takes O(min(n,m)) time.
var set2 = new HashSet<int>(){2,4,6,8,10,12};
Then take each item of the other list and check whether it exists in the hash set. Each lookup is O(1) on average, and because the hash set was built from the shorter list, this pass over the longer list takes O(max(n,m)). If the lookup returns true, add the item to your results.
The result is O(n+m) in time and O(min(n,m)) in memory.
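As a rough Python sketch of the hash-set approach (no sorting required; `intersect_hashed` is an illustrative name, and the lists are assumed to contain no repeated values):

```python
from typing import List, Sequence


def intersect_hashed(xs: Sequence[int], ys: Sequence[int]) -> List[int]:
    """Intersect two lists in expected O(n + m) time, O(min(n, m)) extra memory."""
    # Build the hash set from the shorter list to keep memory small.
    small, large = (xs, ys) if len(xs) <= len(ys) else (ys, xs)
    lookup = set(small)
    # Scan the longer list; each membership test is O(1) on average.
    return [value for value in large if value in lookup]
```

The result comes back in the order of the longer list, which may or may not matter for a given use.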

How to find all pairs of sets (in a collection) that share at least M elements?

Given a large set of sets, all the same size (call it S={s1,...,sn}), I want to find all pairs (si,sj) that have an overlap of at least M.
So if M=2 and S consists of
s1 = (3,4,8,9)
s2 = (1,3,7,8)
s3 = (1,2,5,6)
s4 = (1,6,7,8)
I want to identify the pairs (s1,s2), (s2,s4), and (s3,s4).
The straightforward approach compares every pair and checks the size of the intersection, but this is prohibitively slow given the number of sets and the size of the sets I am using (something like O(log(m) n^2), where m is the size of the sets?).
I've searched around and haven't found a similar question (though this answer is probably relevant). Any help would be greatly appreciated!

Go by value.
Pseudo code:
    typedef V => type of values;
    list<Set> S;
    Map<V, list<int>> val2sets;
    for(int setIdx=0; setIdx < S.length; ++setIdx){
        foreach(V v in S[setIdx]) {
            val2sets[v].push(setIdx);
        }
    }
    int minOverlap = 2;          // the required overlap (M in the question)
    int[][] setIntersectCount;   // all entries start at 0
    foreach(V as v, list<int> as l in val2sets) {
        for(int i=0; i < l.length; ++i){
            for(int j=i+1; j < l.length; ++j){
                // l[i] and l[j] are the indices of two sets that both contain v
                setIntersectCount[l[i]][l[j]]++;
                if(setIntersectCount[l[i]][l[j]] == minOverlap){
                    printf("set {0} and {1} are a pair\n", l[i], l[j]);
                }
            }
        }
    }
As for complexity:
let S = # of sets.
let M = # of members in a set (not the overlap threshold from the question).
let V = # of unique values in the sets.
Generating val2sets is O(SM).
Reading all the values and matching every two sets that share one is bounded by O(V x S^2), but in practice it will probably be closer to O(SM), because with reasonably uniform data most sets will not have many elements in common.
P.S. I hope I didn't make any mistakes in my calculations.
Worth noting: I did not calculate the complexity of the naive approach either :)
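The "go by value" idea can be sketched in Python as an inverted index from each value to the sets containing it, followed by counting co-occurrences. This is only a sketch under the assumption that each input is a genuine set (no repeated values); `pairs_sharing_at_least` and `min_overlap` are made-up names.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, Hashable, List, Sequence, Set, Tuple


def pairs_sharing_at_least(sets: Sequence[Set[Hashable]],
                           min_overlap: int) -> Set[Tuple[int, int]]:
    """Return index pairs (a, b), a < b, whose sets share >= min_overlap values."""
    # Inverted index: value -> indices of the sets that contain it.
    value_to_sets: Dict[Hashable, List[int]] = defaultdict(list)
    for idx, members in enumerate(sets):
        for value in members:
            value_to_sets[value].append(idx)

    # Every value shared by two sets adds one to that pair's counter.
    shared: Dict[Tuple[int, int], int] = defaultdict(int)
    for owners in value_to_sets.values():
        for a, b in combinations(owners, 2):  # owners is ascending, so a < b
            shared[(a, b)] += 1

    return {pair for pair, count in shared.items() if count >= min_overlap}
```

On the question's example (indices 0 to 3 standing for s1 to s4) with `min_overlap=2`, this yields the index pairs for (s1,s2), (s2,s4) and (s3,s4). Pairs that share nothing are never touched, which is where the saving over the all-pairs comparison comes from.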

Arranging integers in a specific order

Given a set of distinct unsorted integers s1, s2, .., sn, how do you arrange the integers such that s1 < s2 > s3 < s4...
I know this can be solved by scanning the array from left to right and, wherever the condition is not satisfied, swapping those two elements. Can someone explain to me why this algorithm works?

Given any three successive numbers in the array, there are four possible relationships:
a < b < c
a < b > c
a > b < c
a > b > c
In the first case we know that a < c. Since the first condition is met, we can swap b and c to meet the second condition, and the first condition is still met.
In the second case, both conditions are already met.
In the third case, we have to swap a and b to give b < a ? c. If a > c, the second condition is already met. If a < c, we swap a and c to give b < c > a; we already know that b < c, so the swap that fixes the second condition doesn't invalidate the first.
In the last case we know that a > c, so swapping a and b to meet the first condition maintains the validity of the second condition.
Now, you add a fourth number to the sequence. You have:
a < b > c ? d
If c < d then there's no need to change anything. But if we have to swap c and d, the prior condition is still met. Because if b > c and c > d, then we know that b > d. So swapping c and d gives us b > d < c.
You can use similar reasoning when you add the fifth number. You have a < b > c < d ? e. If d > e, then there's no need to change anything. If d < e, then by transitivity (c < d < e) we have c < e as well, so swapping d and e maintains the prior condition.
Pseudo code that implements the algorithm:
    for i = 0 to n-2
        if i is even
            if (a[i] > a[i+1])
                swap(a[i], a[i+1])
            end if
        else
            if (a[i] < a[i+1])
                swap(a[i], a[i+1])
            end if
        end if
    end for
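For reference, here is a small Python sketch of that single left-to-right pass, together with a checker for the s1 < s2 > s3 < s4... pattern. The names `zigzag` and `is_zigzag` are made up, and the input is assumed to contain distinct values.

```python
from typing import List, Sequence


def zigzag(values: Sequence[int]) -> List[int]:
    """Rearrange distinct values so that a[0] < a[1] > a[2] < a[3] ..."""
    a = list(values)  # work on a copy
    for i in range(len(a) - 1):
        want_less = (i % 2 == 0)  # even positions must be smaller than the next
        if (want_less and a[i] > a[i + 1]) or (not want_less and a[i] < a[i + 1]):
            a[i], a[i + 1] = a[i + 1], a[i]
    return a


def is_zigzag(a: Sequence[int]) -> bool:
    """Check the alternating pattern pair by pair."""
    return all(a[i] < a[i + 1] if i % 2 == 0 else a[i] > a[i + 1]
               for i in range(len(a) - 1))
```

For example, `zigzag([1, 2, 3, 4, 5])` gives `[1, 3, 2, 5, 4]`.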

Here is the code to the suggested solution in java.
    public static int[] alternatingList(int[] list) {
        int first, second, third;
        for (int i = 0; i < list.length - 2; i += 2) {
            first = list[i];
            second = list[i+1];
            third = list[i+2];
            if (first > second && first > third) {
                list[i+1] = first;
                list[i] = second;
            }
            else if (third > first && third > second) {
                list[i+1] = third;
                list[i+2] = second;
            }
        }
        return list;
    }
In this code, since all the numbers are distinct, there will always be a bigger number to put into the "peaks". Swapping the numbers will not break the part you have already arranged, because the number you swap out will always be smaller than the one you put into the new peak.
Keep in mind that this code doesn't handle some edge cases, such as even-length lists and lists shorter than three. I wrote it pretty fast :) and only to illustrate the concept of the solution.
In addition, this solution is better than the one in the proposed dupe because it makes a single pass. The solution in the dupe uses Hoare's selection algorithm, which is O(n) but requires multiple passes of decreasing size over the list, and it also needs another O(n) pass over the list after running Hoare's (or median of medians).
A more mathematical proof:
For every three consecutive numbers a, b, c there are three options:
a > b && a > c
b > c && b > a
c > a && c > b
In the first case you switch a into the middle because it's the largest; in the second case you do nothing (the largest is already in the middle); and in the third case c goes to the middle.
Now you have a < b > c d e, where for now d and e are unknown. The new a, b, c are c, d, e, and you do the same operation. This is guaranteed not to mess up the order, since c will only be changed if it is larger than d and e; the number moved into c's spot will therefore be smaller than b and will not break the ordering. This continues to the end of the list with the order never breaking.
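A hedged Python rendering of the "largest of each triple goes to the middle" idea follows. The final step for even-length input is an addition of mine to cover the edge case the answer says it leaves out; it is not part of the original code. The name `peaks_by_triples` is illustrative.

```python
from typing import List, Sequence


def peaks_by_triples(values: Sequence[int]) -> List[int]:
    """Arrange distinct values as a[0] < a[1] > a[2] < a[3] ... in one pass."""
    a = list(values)  # work on a copy
    n = len(a)
    for i in range(0, n - 2, 2):
        first, second, third = a[i], a[i + 1], a[i + 2]
        if first > second and first > third:
            a[i], a[i + 1] = second, first        # move the largest into the peak
        elif third > first and third > second:
            a[i + 1], a[i + 2] = third, second    # move the largest into the peak
    # Assumption (not in the original answer): for even lengths the last
    # element is never part of a triple, so fix the final pair directly.
    if n >= 2 and n % 2 == 0 and a[n - 2] > a[n - 1]:
        a[n - 2], a[n - 1] = a[n - 1], a[n - 2]
    return a
```

The final swap is safe for the same reason as in the proof: it only ever puts a smaller value into the last valley, so the peak before it stays a peak.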

How can I efficiently determine if two lists contain elements ordered in the same way?

I have two ordered lists of the same element type, each list having at most one element of each value (say ints and unique numbers), but otherwise with no restrictions (one may be a subset of the other, they may be completely disjoint, or they may share some elements but not others).
How do I efficiently determine whether A orders any two items differently than B does? For example, if A has the items 1, 2, 10 and B the items 2, 10, 1, the property would not hold, as A lists 1 before 10 but B lists it after 10. However, 1, 2, 10 vs 2, 10, 5 would be perfectly valid, as A never mentions 5 at all. I cannot rely on any given sorting rule shared by both lists.

You can get O(n) as follows. First, find the intersection of the two sets using hashing. Second, test whether A and B are identical if you only consider elements from the intersection.
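Those two steps might look like this in Python; this is a sketch, with `same_relative_order` as a made-up name and the elements assumed to be hashable and unique within each list.

```python
from typing import Hashable, Sequence


def same_relative_order(a: Sequence[Hashable], b: Sequence[Hashable]) -> bool:
    """True if the elements common to a and b appear in the same order in both."""
    common = set(a) & set(b)                      # step 1: hash-based intersection
    a_common = [x for x in a if x in common]      # step 2: keep shared elements only
    b_common = [x for x in b if x in common]
    return a_common == b_common
```

Both steps are linear in expected time, so the whole check is O(n) under the usual hashing assumptions.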

My approach would be to first make sorted copies of A and B which also record the positions of elements in the original lists:
    for i in 1 .. length(A):
        Apos[i] = (A[i], i)
    sortedApos = sort(Apos[] by first element of each pair)
    for i in 1 .. length(B):
        Bpos[i] = (B[i], i)
    sortedBpos = sort(Bpos[] by first element of each pair)
Now find those elements in common using a standard list merge that records the positions in both A and B of the shared elements:
    i = 1
    j = 1
    shared = []
    while i <= length(A) && j <= length(B)
        if sortedApos[i][1] < sortedBpos[j][1]
            ++i
        else if sortedApos[i][1] > sortedBpos[j][1]
            ++j
        else    // They're equal
            append(shared, (sortedApos[i][2], sortedBpos[j][2]))
            ++i
            ++j
Finally, sort shared by its first element (position in A) and check that all its second elements (positions in B) are increasing. This will be the case iff the elements common to A and B appear in the same order:
    sortedShared = sort(shared[] by first element of each pair)
    for i = 2 .. length(sortedShared)
        if sortedShared[i][2] < sortedShared[i-1][2]
            return DIFFERENT
    return SAME
Time complexity: 2*(O(n) + O(n log n)) + O(n) + O(n log n) + O(n) = O(n log n).

General approach: store all the values and their positions in B as keys and values in a HashMap. Iterate over the values in A and look them up in B's HashMap to get their position in B (or null). If this position is before the largest position value you've seen previously, then you know that something in B is in a different order than A. Runs in O(n) time.
Rough, totally untested code:
    boolean valuesInSameOrder(int[] A, int[] B)
    {
        Map<Integer, Integer> bMap = new HashMap<Integer, Integer>();
        for (int i = 0; i < B.length; i++)
        {
            bMap.put(B[i], i);
        }
        int maxPosInB = 0;
        for (int i = 0; i < A.length; i++)
        {
            if (bMap.containsKey(A[i]))
            {
                int currPosInB = bMap.get(A[i]);
                if (currPosInB < maxPosInB)
                {
                    // B has something in a different order than A
                    return false;
                }
                else
                {
                    maxPosInB = currPosInB;
                }
            }
        }
        // All of B's values are in the same order as A
        return true;
    }
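The same position-map idea, sketched in Python for comparison. The name `values_in_same_order` mirrors the Java method above; values are assumed to be unique within each list, as the question states.

```python
from typing import Hashable, Sequence


def values_in_same_order(a: Sequence[Hashable], b: Sequence[Hashable]) -> bool:
    """True if every value shared by a and b appears in the same relative order."""
    position_in_b = {value: index for index, value in enumerate(b)}
    last_position = -1  # largest position in b seen so far
    for value in a:
        position = position_in_b.get(value)
        if position is None:
            continue              # value is not in b, so it imposes no constraint
        if position < last_position:
            return False          # b places this value before an earlier one from a
        last_position = position
    return True
```

Because only the largest position seen so far is tracked, the check is a single pass over A after one pass over B.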

Algorithm to get a new list containing no duplicate items by adding any 2 elements of a big array

I can only think of this naive algorithm. Is there a better way? C/C++, Ruby, or Haskell is OK.
    arry = [1,5,.....4569895] // 1000000 elements, sorted, no duplicates
    newArray = Hash.new
    for (i = 0 ; i < arry.length ; i++)
    {
        for (j = 0 ; j < arry.length ; j++)
        {
            elem = arry[i] + arry[j]
            if (! newArray.key?(elem))
            {
                newArray[elem] = arry[i] + arry[j]
            }
        }
    }
EDIT: sorry, I have discrete values in the array, instead of [1..1000000].

It would be more efficient to separate the algorithm into two distinct steps. (Warning: pseudocode ahead)
First, create one list per element by adding the ith element to itself and to each of the elements after it. This can be done in parallel for each list. Note that the resulting lists will be sorted.
    newArray = array(array.length);
    for (i = 0 ; i < array.length ; i++) {
        newArray[i] = array(array.length - i);
        for (j = 0; j < array.length - i; j++) {
            newArray[i][j] = array[i] + array[j + i];
        }
    }
Second, use the merge step of merge sort to merge the resulting lists, dropping duplicates as you go. You can do this in parallel: merge the lists pairwise, then merge the results pairwise again, until you only have one list.
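A small Python sketch of the two steps, using the standard library's `heapq.merge` for the k-way merge instead of hand-written pairwise merges (a substitution on my part, not what the answer prescribes). It materialises all the per-element lists, so it is only meant to illustrate the idea on small inputs; `merged_pair_sums` is a made-up name.

```python
import heapq
from typing import List, Sequence


def merged_pair_sums(sorted_values: Sequence[int]) -> List[int]:
    """Distinct sums a[i] + a[j] (i <= j) of an ascending list, in ascending order."""
    n = len(sorted_values)
    # Step 1: one ascending list of sums per starting element.
    rows = [[sorted_values[i] + sorted_values[j] for j in range(i, n)]
            for i in range(n)]
    # Step 2: merge the sorted rows; equal sums come out adjacent, so
    # duplicates can be dropped by comparing with the last emitted value.
    result: List[int] = []
    for total in heapq.merge(*rows):
        if not result or result[-1] != total:
            result.append(total)
    return result
```

The total work is still quadratic in the input size, since every pair sum is produced once.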

If the condition says that you should be able to add any item in the range, then the only way I can think of is to check whether the sum is already in the result list, since for any number x there are x different additions that lead to x (or x/2 if you consider 1 + 2 and 2 + 1 to be the same addition).

There is one obvious optimization: make the second loop start at index i, so that you avoid computing both x+y and y+x.
Then, if you don't want to use a set, you could use the fact that the items are sorted: build N lists and merge them while removing the duplicates.

I'm afraid the best worst-case time complexity is O(n^2). For the input {2^0, 2^1, 2^2, ...}, you won't get any duplicates when adding these numbers. Assuming hash insertions are O(1), you already have the best algorithm...
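To make the worst case concrete, here is a minimal Python sketch of the hash-based approach with the inner loop starting at index i. With powers of two as input, every one of the n(n+1)/2 pair sums is distinct, so the output alone is quadratic in size. `distinct_pair_sums` is an illustrative name.

```python
from typing import Sequence, Set


def distinct_pair_sums(values: Sequence[int]) -> Set[int]:
    """All distinct sums values[i] + values[j] with i <= j."""
    sums: Set[int] = set()
    for i, x in enumerate(values):
        for y in values[i:]:      # start at i so x+y and y+x are not both computed
            sums.add(x + y)       # set insertion is O(1) on average
    return sums
```

For six powers of two, `len(distinct_pair_sums([2**k for k in range(6)]))` is 21, which is 6*7/2: no sum repeats.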
