product of two number in a array - algorithm

Given a array of n distinct integer. Find all pairs of x,y in the array such that z(given) = x * y...do it without sorting and in a most efficient manner..
[edit] Integer are within range of int i.e 0-65536 and numbers are non negative if that helps.
Dont want to sort coz it will take a lot of time. Storage space is not a issue.

Here is linear time hash based solution:
Let hash be an array of size 65537 initilized to 0.
foreach element ele in Array
if ele != 0
hash[product/ele] = ele
end-if
if hash[ele] != 0 AND ele * hash[ele] == product
print ele, product/ele
end-if
end-foreach

There aren't any super efficient ways of doing this. The best I can think of is O(n^2):
Have an auxiliary function that takes a number (a) and a list, and goes through every element (b) checking a*b = z and saving the pair if it is.
Go through every element of your original list, and if a particular element (x) divides z (ie z % x = 0) then send x and the remainder of the list after x to the auxiliary function.
UPDATE:
I'm giving an O(n^2) solution because the question did not specify unique pairs. If only unique pairs are desired, this should be added to the question. Also, my solution assumes the order of pairs doesn't matter, which is another detail that should be clarified.

Iterate through the array...if an element x can divide z (ie z % x == 0), check if it's other factor y=(z/x) exists in the HashTable....
If it does, then you found a pair...else just add it to the hashTable and continue...

Related

4 values in an array with sum = x

I don't need the code for this. I was wondering if someone could show how to find 4 integers in an array of size n that could equal a value x in time complexity n^2logn. I tried looking for videos but found them hard to follow. From my understanding you create a helper array of all the possible pairs? Sort them and then find a sum of x from the smaller arrays?
Eg. Array = { 1,3,4,5,6,9,4}
Can someone very explain in steps how to check for a sum of say 8. Just need help visualizing this process.
# your input array 'df' and the sought sum 'x'
df = [1,3,4,5,6,9,4]
x = 12
def findSumOfFour(df, x):
# create the dictionary of all pairs key (as sum of a pair) value (pair of indices of 'df')
myDict = { (df[i]+df[j]):[i,j] for i in range(0,len(df)) for j in range(i,len(df)) if not i == j}
# verify if 'x' - a key 'k' is again a key in the dictionary 'myDict',
# if so we only need to verify that the indices differ
for k in myDict.keys():
if x-k in myDict.keys() and not set(myDict[k]) & set(myDict[x-k]):
return [df[myDict[k][0]],df[myDict[k][1]],df[myDict[x-k][0]],df[myDict[x-k][1]]]
return []
solution = findSumOfFour(df,x)
print(solution)
Note, this approach follows the description of גלעד-ברקן (comments) explained with more detail and on your example.
Note 2, the runtime is in O(n^2 log n) since the insert and lookup operations in a map/dictionary in general is O(log n).

Divide an odd size array into into two equal sets of same size and same sum after deleting any one element from the array

Given an array of odd size. You have to delete any one element from the array and then find whether it is possible to divide the remaining even size array into two sets of equal size and having same sum of their elements. It is mandatory to remove any one element from the array.
So Here I am assuming that it is necessary to remove 1 element from the array.
Please look at the code snippet below.
int solve(int idx, int s, int cntr, int val) {
if(idx == n)
if(cntr != 1)
return INT_MAX;
else
return abs((sum-val)-2*s);
int ans = INT_MAX;
if(cntr == 0)
ans = min(ans, solve(idx+1, s, cntr+1, arr[idx]));
else
ans = min(ans, min(solve(idx+1,s+arr[idx], cntr, val), solve(idx+1, s, cntr, val)));
return ans;
}
Here sum is the total sum of original array,
val is the
value of the element at any position which u want to delete, and cntr to keep track whether any value is removed from the array or not.
So the algo goes like this.
Forget that you need to delete any value, Then the problem becomes whether is it possible to divide the array into 2 equi-sum halves. Now we can think of this problem such as divide the array into 2 parts such that abs(sum-2*sum_of_any_half_part) is minimized. So With this idea Lets say I initially have a bucket s which can be the part of array which we are concerned about. So at each step we can either put any element into this part or leave it for the other part.
Now if we introduce the deletion part in to this problem, its just one small changes which is required. Now at each step instead of 2 you have 3 options.
To delete this particular element and then increase the cntr to 1 and the val to the value of the element at that index in the array.
don't do any thing with this element. This is equal to putting this element into other bucket/half
put this element into bucket s, i.e. increase value of s by arr[idx];
Now recursively check which gives the best result.
P.S. Look at the base case in the code snippet to have better idea.
In the end if the above solve function gives ans = 0 then that means yes we can divide the array into 2 equi-sum parts after deleting any element.
Hope this helps.

Generate a number is range (1,n) but not in a list (i,j)

How can I generate a random number that is in the range (1,n) but not in a certain list (i,j)?
Example: range is (1,500), list is [1,3,4,45,199,212,344].
Note: The list may not be sorted
Rejection Sampling
One method is rejection sampling:
Generate a number x in the range (1, 500)
Is x in your list of disallowed values? (Can use a hash-set for this check.)
If yes, return to step 1
If no, x is your random value, done
This will work fine if your set of allowed values is significantly larger than your set of disallowed values:if there are G possible good values and B possible bad values, then the expected number of times you'll have to sample x from the G + B values until you get a good value is (G + B) / G (the expectation of the associated geometric distribution). (You can sense check this. As G goes to infinity, the expectation goes to 1. As B goes to infinity, the expectation goes to infinity.)
Sampling a List
Another method is to make a list L of all of your allowed values, then sample L[rand(L.count)].
The technique I usually use when the list is length 1 is to generate a random
integer r in [1,n-1], and if r is greater or equal to that single illegal
value then increment r.
This can be generalised for a list of length k for small k but requires
sorting that list (you can't do your compare-and-increment in random order). If the list is moderately long, then after the sort you can start with a bsearch, and add the number of values skipped to r, and then recurse into the remainder of the list.
For a list of length k, containing no value greater or equal to n-k, you
can do a more direct substitution: generate random r in [1,n-k], and
then iterate through the list testing if r is equal to list[i]. If it is
then set r to n-k+i (this assumes list is zero-based) and quit.
That second approach fails if some of the list elements are in [n-k,n].
I could try to invest something clever at this point, but what I have so far
seems sufficient for uniform distributions with values of k much less than
n...
Create two lists -- one of illegal values below n-k, and the other the rest (this can be done in place).
Generate random r in [1,n-k]
Apply the direct substitution approach for the first list (if r is list[i] then set r to n-k+i and go to step 5).
If r was not altered in step 3 then we're finished.
Sort the list of larger values and use the compare-and-increment method.
Observations:
If all values are in the lower list, there will be no sort because there is nothing to sort.
If all values are in the upper list, there will be no sort because there is no occasion on which r is moved into the hazardous area.
As k approaches n, the maximum size of the upper (sorted) list grows.
For a given k, if more value appear in the upper list (the bigger the sort), the chance of getting a hit in the lower list shrinks, reducing the likelihood of needing to do the sort.
Refinement:
Obviously things get very sorty for large k, but in such cases the list has comparatively few holes into which r is allowed to settle. This could surely be exploited.
I might suggest something different if many random values with the same
list and limits were needed. I hope that the list of illegal values is not the
list of results of previous calls to this function, because if it is then you
wouldn't want any of this -- instead you would want a Fisher-Yates shuffle.
Rejection sampling would be the simplest if possible as described already. However, if you didn't want use that, you could convert the range and disallowed values to sets and find the difference. Then, you could choose a random value out of there.
Assuming you wanted the range to be in [1,n] but not in [i,j] and that you wanted them uniformly distributed.
In Python
total = range(1,n+1)
disallowed = range(i,j+1)
allowed = list( set(total) - set(disallowed) )
return allowed[random.randrange(len(allowed))]
(Note that this is not EXACTLY uniform since in all likeliness, max_rand%len(allowed) != 0 but this will in most practical applications be very close)
I assume that you know how to generate a random number in [1, n) and also your list is ordered like in the example above.
Let's say that you have a list with k elements. Make a map(O(logn)) structure, which will ensure speed if k goes higher. Put all elements from list in map, where element value will be the key and "good" value will be the value. Later on I'll explain about "good" value. So when we have the map then just find a random number in [1, n - k - p)(Later on I'll explain what is p) and if this number is in map then replace it with "good" value.
"GOOD" value -> Let's start from k-th element. It's good value is its own value + 1, because the very next element is "good" for us. Now let's look at (k-1)th element. We assume that its good value is again its own value + 1. If this value is equal to k-th element then the "good" value for (k-1)th element is k-th "good" value + 1. Also you will have to store the largest "good" value. If the largest value exceed n then p(from above) will be p = largest - n.
Of course I recommend you this only if k is big number otherwise #Timothy Shields' method is perfect.

How can I randomly iterate through a large Range?

I would like to randomly iterate through a range. Each value will be visited only once and all values will eventually be visited. For example:
class Array
def shuffle
ret = dup
j = length
i = 0
while j > 1
r = i + rand(j)
ret[i], ret[r] = ret[r], ret[i]
i += 1
j -= 1
end
ret
end
end
(0..9).to_a.shuffle.each{|x| f(x)}
where f(x) is some function that operates on each value. A Fisher-Yates shuffle is used to efficiently provide random ordering.
My problem is that shuffle needs to operate on an array, which is not cool because I am working with astronomically large numbers. Ruby will quickly consume a large amount of RAM trying to create a monstrous array. Imagine replacing (0..9) with (0..99**99). This is also why the following code will not work:
tried = {} # store previous attempts
bigint = 99**99
bigint.times {
x = rand(bigint)
redo if tried[x]
tried[x] = true
f(x) # some function
}
This code is very naive and quickly runs out of memory as tried obtains more entries.
What sort of algorithm can accomplish what I am trying to do?
[Edit1]: Why do I want to do this? I'm trying to exhaust the search space of a hash algorithm for a N-length input string looking for partial collisions. Each number I generate is equivalent to a unique input string, entropy and all. Basically, I'm "counting" using a custom alphabet.
[Edit2]: This means that f(x) in the above examples is a method that generates a hash and compares it to a constant, target hash for partial collisions. I do not need to store the value of x after I call f(x) so memory should remain constant over time.
[Edit3/4/5/6]: Further clarification/fixes.
[Solution]: The following code is based on #bta's solution. For the sake of conciseness, next_prime is not shown. It produces acceptable randomness and only visits each number once. See the actual post for more details.
N = size_of_range
Q = ( 2 * N / (1 + Math.sqrt(5)) ).to_i.next_prime
START = rand(N)
x = START
nil until f( x = (x + Q) % N ) == START # assuming f(x) returns x
I just remembered a similar problem from a class I took years ago; that is, iterating (relatively) randomly through a set (completely exhausting it) given extremely tight memory constraints. If I'm remembering this correctly, our solution algorithm was something like this:
Define the range to be from 0 to
some number N
Generate a random starting point x[0] inside N
Generate an iterator Q less than N
Generate successive points x[n] by adding Q to
the previous point and wrapping around if needed. That
is, x[n+1] = (x[n] + Q) % N
Repeat until you generate a new point equal to the starting point.
The trick is to find an iterator that will let you traverse the entire range without generating the same value twice. If I'm remembering correctly, any relatively prime N and Q will work (the closer the number to the bounds of the range the less 'random' the input). In that case, a prime number that is not a factor of N should work. You can also swap bytes/nibbles in the resulting number to change the pattern with which the generated points "jump around" in N.
This algorithm only requires the starting point (x[0]), the current point (x[n]), the iterator value (Q), and the range limit (N) to be stored.
Perhaps someone else remembers this algorithm and can verify if I'm remembering it correctly?
As #Turtle answered, you problem doesn't have a solution. #KandadaBoggu and #bta solution gives you random numbers is some ranges which are or are not random. You get clusters of numbers.
But I don't know why you care about double occurence of the same number. If (0..99**99) is your range, then if you could generate 10^10 random numbers per second (if you have a 3 GHz processor and about 4 cores on which you generate one random number per CPU cycle - which is imposible, and ruby will even slow it down a lot), then it would take about 10^180 years to exhaust all the numbers. You have also probability about 10^-180 that two identical numbers will be generated during a whole year. Our universe has probably about 10^9 years, so if your computer could start calculation when the time began, then you would have probability about 10^-170 that two identical numbers were generated. In the other words - practicaly it is imposible and you don't have to care about it.
Even if you would use Jaguar (top 1 from www.top500.org supercomputers) with only this one task, you still need 10^174 years to get all numbers.
If you don't belive me, try
tried = {} # store previous attempts
bigint = 99**99
bigint.times {
x = rand(bigint)
puts "Oh, no!" if tried[x]
tried[x] = true
}
I'll buy you a beer if you will even once see "Oh, no!" on your screen during your life time :)
I could be wrong, but I don't think this is doable without storing some state. At the very least, you're going to need some state.
Even if you only use one bit per value (has this value been tried yes or no) then you will need X/8 bytes of memory to store the result (where X is the largest number). Assuming that you have 2GB of free memory, this would leave you with more than 16 million numbers.
Break the range in to manageable batches as shown below:
def range_walker range, batch_size = 100
size = (range.end - range.begin) + 1
n = size/batch_size
n.times do |i|
x = i * batch_size + range.begin
y = x + batch_size
(x...y).sort_by{rand}.each{|z| p z}
end
d = (range.end - size%batch_size + 1)
(d..range.end).sort_by{rand}.each{|z| p z }
end
You can further randomize solution by randomly choosing the batch for processing.
PS: This is a good problem for map-reduce. Each batch can be worked by independent nodes.
Reference:
Map-reduce in Ruby
you can randomly iterate an array with shuffle method
a = [1,2,3,4,5,6,7,8,9]
a.shuffle!
=> [5, 2, 8, 7, 3, 1, 6, 4, 9]
You want what's called a "full cycle iterator"...
Here is psudocode for the simplest version which is perfect for most uses...
function fullCycleStep(sample_size, last_value, random_seed = 31337, prime_number = 32452843) {
if last_value = null then last_value = random_seed % sample_size
return (last_value + prime_number) % sample_size
}
If you call this like so:
sample = 10
For i = 1 to sample
last_value = fullCycleStep(sample, last_value)
print last_value
next
It would generate random numbers, looping through all 10, never repeating If you change random_seed, which can be anything, or prime_number, which must be greater than, and not be evenly divisible by sample_size, you will get a new random order, but you will still never get a duplicate.
Database systems and other large-scale systems do this by writing the intermediate results of recursive sorts to a temp database file. That way, they can sort massive numbers of records while only keeping limited numbers of records in memory at any one time. This tends to be complicated in practice.
How "random" does your order have to be? If you don't need a specific input distribution, you could try a recursive scheme like this to minimize memory usage:
def gen_random_indices
# Assume your input range is (0..(10**3))
(0..3).sort_by{rand}.each do |a|
(0..3).sort_by{rand}.each do |b|
(0..3).sort_by{rand}.each do |c|
yield "#{a}#{b}#{c}".to_i
end
end
end
end
gen_random_indices do |idx|
run_test_with_index(idx)
end
Essentially, you are constructing the index by randomly generating one digit at a time. In the worst-case scenario, this will require enough memory to store 10 * (number of digits). You will encounter every number in the range (0..(10**3)) exactly once, but the order is only pseudo-random. That is, if the first loop sets a=1, then you will encounter all three-digit numbers of the form 1xx before you see the hundreds digit change.
The other downside is the need to manually construct the function to a specified depth. In your (0..(99**99)) case, this would likely be a problem (although I suppose you could write a script to generate the code for you). I'm sure there's probably a way to re-write this in a state-ful, recursive manner, but I can't think of it off the top of my head (ideas, anyone?).
[Edit]: Taking into account #klew and #Turtle's answers, the best I can hope for is batches of random (or close to random) numbers.
This is a recursive implementation of something similar to KandadaBoggu's solution. Basically, the search space (as a range) is partitioned into an array containing N equal-sized ranges. Each range is fed back in a random order as a new search space. This continues until the size of the range hits a lower bound. At this point the range is small enough to be converted into an array, shuffled, and checked.
Even though it is recursive, I haven't blown the stack yet. Instead, it errors out when attempting to partition a search space larger than about 10^19 keys. I has to do with the numbers being too large to convert to a long. It can probably be fixed:
# partition a range into an array of N equal-sized ranges
def partition(range, n)
ranges = []
first = range.first
last = range.last
length = last - first + 1
step = length / n # integer division
((first + step - 1)..last).step(step) { |i|
ranges << (first..i)
first = i + 1
}
# append any extra onto the last element
ranges[-1] = (ranges[-1].first)..last if last > step * ranges.length
ranges
end
I hope the code comments help shed some light on my original question.
pastebin: full source
Note: PW_LEN under # options can be changed to a lower number in order to get quicker results.
For a prohibitively large space, like
space = -10..1000000000000000000000
You can add this method to Range.
class Range
M127 = 170_141_183_460_469_231_731_687_303_715_884_105_727
def each_random(seed = 0)
return to_enum(__method__) { size } unless block_given?
unless first.kind_of? Integer
raise TypeError, "can't randomly iterate from #{first.class}"
end
sample_size = self.end - first + 1
sample_size -= 1 if exclude_end?
j = coprime sample_size
v = seed % sample_size
each do
v = (v + j) % sample_size
yield first + v
end
end
protected
def gcd(a,b)
b == 0 ? a : gcd(b, a % b)
end
def coprime(a, z = M127)
gcd(a, z) == 1 ? z : coprime(a, z + 1)
end
end
You could then
space.each_random { |i| puts i }
729815750697818944176
459631501395637888351
189447252093456832526
919263002791275776712
649078753489094720887
378894504186913665062
108710254884732609237
838526005582551553423
568341756280370497598
298157506978189441773
27973257676008385948
757789008373827330134
487604759071646274309
217420509769465218484
947236260467284162670
677052011165103106845
406867761862922051020
136683512560740995195
866499263258559939381
596315013956378883556
326130764654197827731
55946515352016771906
785762266049835716092
515578016747654660267
...
With a good amount of randomness so long as your space is a few orders smaller than M127.
Credit to #nick-steele and #bta for the approach.
This isn't really a Ruby-specific answer but I hope it's permitted. Andrew Kensler gives a C++ "permute()" function that does exactly this in his "Correlated Multi-Jittered Sampling" report.
As I understand it, the exact function he provides really only works if your "array" is up to size 2^27, but the general idea could be used for arrays of any size.
I'll do my best to sort of explain it. The first part is you need a hash that is reversible "for any power-of-two sized domain". Consider x = i + 1. No matter what x is, even if your integer overflows, you can determine what i was. More specifically, you can always determine the bottom n-bits of i from the bottom n-bits of x. Addition is a reversible hash operation, as is multiplication by an odd number, as is doing a bitwise xor by a constant. If you know a specific power-of-two domain, you can scramble bits in that domain. E.g. x ^= (x & 0xFF) >> 5) is valid for the 16-bit domain. You can specify that domain with a mask, e.g. mask = 0xFF, and your hash function becomes x = hash(i, mask). Of course you can add a "seed" value into that hash function to get different randomizations. Kensler lays out more valid operations in the paper.
So you have a reversible function x = hash(i, mask, seed). The problem is that if you hash your index, you might end up with a value that is larger than your array size, i.e. your "domain". You can't just modulo this or you'll get collisions.
The reversible hash is the key to using a technique called "cycle walking", introduced in "Ciphers with Arbitrary Finite Domains". Because the hash is reversible (i.e. 1-to-1), you can just repeatedly apply the same hash until your hashed value is smaller than your array! Because you're applying the same hash, and the mapping is one-to-one, whatever value you end up on will map back to exactly one index, so you don't have collisions. So your function could look something like this for 32-bit integers (pseudocode):
fun permute(i, length, seed) {
i = hash(i, 0xFFFF, seed)
while(i >= length): i = hash(i, 0xFFFF, seed)
return i
}
It could take a lot of hashes to get to your domain, so Kensler does a simple trick: he keeps the hash within the domain of the next power of two, which makes it require very few iterations (~2 on average), by masking out the unnecessary bits. The final algorithm looks like this:
fun next_pow_2(length) {
# This implementation is for clarity.
# See Kensler's paper for one way to do it fast.
p = 1
while (p < length): p *= 2
return p
}
permute(i, length, seed) {
mask = next_pow_2(length)-1
i = hash(i, mask, seed) & mask
while(i >= length): i = hash(i, mask, seed) & mask
return i
}
And that's it! Obviously the important thing here is choosing a good hash function, which Kensler provides in the paper but I wanted to break down the explanation. If you want to have different random permutations each time, you can add a "seed" value to the permute function which then gets passed to the hash function.

Algorithm to find matching pairs in a list

I will phrase the problem in the precise form that I want below:
Given:
Two floating point lists N and D of the same length k (k is multiple of 2).
It is known that for all i=0,...,k-1, there exists j != i such that D[j]*D[i] == N[i]*N[j]. (I'm using zero-based indexing)
Return:
A (length k/2) list of pairs (i,j) such that D[j]*D[i] == N[i]*N[j].
The pairs returned may not be unique (any valid list of pairs is okay)
The application for this algorithm is to find reciprocal pairs of eigenvalues of a generalized palindromic eigenvalue problem.
The equality condition is equivalent to N[i]/D[i] == D[j]/N[j], but also works when denominators are zero (which is a definite possibility). Degeneracies in the eigenvalue problem cause the pairs to be non-unique.
More generally, the algorithm is equivalent to:
Given:
A list X of length k (k is multiple of 2).
It is known that for all i=0,...,k-1, there exists j != i such that IsMatch(X[i],X[j]) returns true, where IsMatch is a boolean matching function which is guaranteed to return true for at least one j != i for all i.
Return:
A (length k/2) list of pairs (i,j) such that IsMatch(i,j) == true for all pairs in the list.
The pairs returned may not be unique (any valid list of pairs is okay)
Obviously, my first problem can be formulated in terms of the second with IsMatch(u,v) := { (u - 1/v) == 0 }. Now, due to limitations of floating point precision, there will never be exact equality, so I want the solution which minimizes the match error. In other words, assume that IsMatch(u,v) returns the value u - 1/v and I want the algorithm to return a list for which IsMatch returns the minimal set of errors. This is a combinatorial optimization problem. I was thinking I can first naively compute the match error between all possible pairs of indexes i and j, but then I would need to select the set of minimum errors, and I don't know how I would do that.
Clarification
The IsMatch function is reflexive (IsMatch(a,b) implies IsMatch(b,a)), but not transitive. It is, however, 3-transitive: IsMatch(a,b) && IsMatch(b,c) && IsMatch(c,d) implies IsMatch(a,d).
Addendum
This problem is apparently identically the minimum weight perfect matching problem in graph theory. However, in my case I know that there should be a "good" perfect matching, so the distribution of edge weights is not totally random. I feel that this information should be used somehow. The question now is if there is a good implementation to the min-weight-perfect-matching problem that uses my prior knowledge to arrive at a solution early in the search. I'm also open to pointers towards a simple implementation of any such algorithm.
I hope I got your problem.
Well, if IsMatch(i, j) and IsMatch(j, l) then IsMatch(i, l). More generally, the IsMatch relation is transitive, commutative and reflexive, ie. its an equivalence relation. The algorithm translates to which element appears the most times in the list (use IsMatch instead of =).
(If I understand the problem...)
Here is one way to match each pair of products in the two lists.
Multiply each pair N and save it to a structure with the product, and the subscripts of the elements making up the product.
Multiply each pair D and save it to a second instance of the structure with the product, and the subscripts of the elements making up the product.
Sort both structions on the product.
Make a merge-type pass through both sorted structure arrays. Each time you find a product from one array that is close enough to the other, you can record the two subscripts from each sorted list for a match.
You can also use one sorted list for an ismatch function, doing a binary search on the product.
well。。Multiply each pair D and save it to a second instance of the structure with the product, and the subscripts of the elements making up the product.
I just asked my CS friend, and he came up with the algorithm below. He doesn't have an account here (and apparently unwilling to create one), but I think his answer is worth sharing.
// We will find the best match in the minimax sense; we will minimize
// the maximum matching error among all pairs. Alpha maintains a
// lower bound on the maximum matching error. We will raise Alpha until
// we find a solution. We assume MatchError returns an L_1 error.
// This first part finds the set of all possible alphas (which are
// the pairwise errors between all elements larger than maxi-min
// error.
Alpha = 0
For all i:
min = Infinity
For all j > i:
AlphaSet.Insert(MatchError(i,j))
if MatchError(i,j) < min
min = MatchError(i,j)
If min > Alpha
Alpha = min
Remove all elements of AlphaSet smaller than Alpha
// This next part increases Alpha until we find a solution
While !AlphaSet.Empty()
Alpha = AlphaSet.RemoveSmallest()
sol = GetBoundedErrorSolution(Alpha)
If sol != nil
Return sol
// This is the definition of the helper function. It returns
// a solution with maximum matching error <= Alpha or nil if
// no such solution exists.
GetBoundedErrorSolution(Alpha) :=
MaxAssignments = 0
For all i:
ValidAssignments[i] = empty set;
For all j > i:
if MatchError <= Alpha
ValidAssignments[i].Insert(j)
ValidAssignments[j].Insert(i)
// ValidAssignments[i].Size() > 0 due to our choice of Alpha
// in the outer loop
If ValidAssignments[i].Size() > MaxAssignments
MaxAssignments = ValidAssignments[i].Size()
If MaxAssignments = 1
return ValidAssignments
Else
G = graph(ValidAssignments)
// G is an undirected graph whose vertices are all values of i
// and edges between vertices if they have match error less
// than or equal to Alpha
If G has a perfect matching
// Note that this part is NP-complete.
Return the matching
Else
Return nil
It relies on being able to compute a perfect matching of a graph, which is NP-complete, but at least it is reduced to a known problem. It is expected that the solution be NP-complete, but this is OK since in practice the size of the given lists are quite small. I'll wait around for a better answer for a few days, or for someone to expand on how to find the perfect matching in a reasonable way.
You want to find j such that D(i)*D(j) = N(i)*N(j) {I assumed * is ordinary real multiplication}
assuming all N(i) are nonzero, let
Z(i) = D(i)/N(i).
Problem: find j, such that Z(i) = 1/Z(j).
Split set into positives and negatives and process separately.
take logs for clarity. z(i) = log Z(i).
Sort indirectly. Then in the sorted view you should have something like -5 -3 -1 +1 +3 +5, for example. Read off +/- pairs and that should give you the original indices.
Am I missing something, or is the problem easy?
Okay, I ended up using this ported Fortran code, where I simply specify the dense upper triangular distance matrix using:
complex_t num = N[i]*N[j] - D[i]*D[j];
complex_t den1 = N[j]*D[i];
complex_t den2 = N[i]*D[j];
if(std::abs(den1) < std::abs(den2)){
costs[j*(j-1)/2+i] = std::abs(-num/den2);
}else if(std::abs(den1) == 0){
costs[j*(j-1)/2+i] = std::sqrt(std::numeric_limits<double>::max());
}else{
costs[j*(j-1)/2+i] = std::abs(num/den1);
}
This works great and is fast enough for my purposes.
You should be able to sort the (D[i],N[i]) pairs. You don't need to divide by zero -- you can just multiply out, as follows:
bool order(i,j) {
float ni= N[i]; float di= D[i];
if(di<0) { di*=-1; ni*=-1; }
float nj= N[j]; float dj= D[j];
if(dj<0) { dj*=-1; nj*=-1; }
return ni*dj < nj*di;
}
Then, scan the sorted list to find two separation points: (N == D) and (N == -D); you can start matching reciprocal pairs from there, using:
abs(D[i]*D[j]-N[i]*N[j])<epsilon
as a validity check. Leave the (N == 0) and (D == 0) points for last; it doesn't matter whether you consider them negative or positive, as they will all match with each other.
edit: alternately, you could just handle (N==0) and (D==0) cases separately, removing them from the list. Then, you can use (N[i]/D[i]) to sort the rest of the indices. You still might want to start at 1.0 and -1.0, to make sure you can match near-zero cases with exactly-zero cases.

Resources