How would you use dynamic programming to find the list of positive integers in an array whose sum is closest to but not equal to some positive integer K?
I'm a little stuck thinking about this.
The usual phrasing for this is that you're looking for the value closest to, but not exceeding K. If you mean "less than K", it just means that your value of K is one greater than the usual. If you truly mean just "not equal to K", then you'd basically run through the algorithm twice, once finding the largest sum less than K, then again finding the smallest sum greater than K, then picking the one of those whose absolute difference from K is the smallest.
For the moment I'm going to assume you really mean the largest sum less than or equal to K, since that's the most common formulation, and the other possibilities don't really have much effect on the algorithm.
The basic idea is fairly simple, though it (at least potentially) uses a lot of storage. We build a table with K+1 columns and N+1 rows (where N = number of inputs). We initialize the first row in the table to 0's.
Then we start walking through the table, and building the best value we can for each possible maximum value up to the real maximum, going row by row so we start with only a single input, then two possible inputs, then three, and so on. At each spot in the table, there are only two possibilities for the best value: the previous best value that doesn't use the current input, or else the current input plus the previous best value for the maximum minus the current input (and since we compute the table values in order, we'll always already have that value).
We also usually want to keep track of which items were actually used to produce the result. To do that, we set a Boolean for a given spot in the table to true if and only if we compute a value for that spot in the table using the new input for that row (rather than just copying the previous row's best value). The best result is in the bottom, right-hand corner, so we start there, and walk backward through the table one row at a time. When we get to a row where the Boolean for that column was set to true, we know we found an input that was used. We print out that item, and then subtract that from the total to get the next column to the left where we'll find the next input that was used to produce this output.
Here's an implementation that's technically in C++, but written primarily in a C-like style to make each step as explicit as possible.
#include <iostream>

#define elements(array) (sizeof(array)/sizeof(array[0]))

int main() {
    // Since we're assuming subscripts from 1..N, I've inserted a dummy value
    // for v[0].
    int v[] = {0, 7, 15, 2, 1};

    // For the moment I'm assuming a maximum <= MAX.
    const int MAX = 17;
    // ... but if you want to specify K as the question implies, where sum<K,
    // you can get rid of MAX and just specify K directly:
    const int K = MAX + 1;

    const int rows = elements(v);
    int table[rows][K] = {0};
    bool used[rows][K] = {false};

    for (int i = 1; i < rows; i++)
        for (int c = 0; c < K; c++) {
            int prev_val = table[i-1][c];
            int new_val;
            // we compute new_val inside the if statement so we won't
            // accidentally try to use a negative column from the table if v[i]>c
            if (v[i] <= c && (new_val = v[i] + table[i-1][c-v[i]]) > prev_val) {
                table[i][c] = new_val;
                used[i][c] = true;
            }
            else
                table[i][c] = prev_val;
        }

    std::cout << "Result: " << table[rows-1][MAX] << "\n";
    std::cout << "Used items were:\n";

    // Walk back up the rows (row 0 is the dummy), reconstructing the items used.
    int column = MAX;
    for (int i = rows - 1; i > 0; i--)
        if (used[i][column]) {
            std::cout << "\tv[" << i << "] = " << v[i] << "\n";
            column -= v[i];
        }
    return 0;
}
There are a couple of things you'd normally optimize in this (that I haven't for the sake of readability). First, if you reach an optimum sum, you can stop searching, so in this case we could actually break out of the loop before considering the final input of 1 at all (since 15 and 2 give the desired result of 17).
Second, in the table itself we only really use two rows at any given time: one current row and one previous row. The rows before that (in the main table) are never used again (i.e., to compute row[n] we need the values from row[n-1], but not row[n-2], row[n-3], ..., row[0]). To reduce storage, we can make the main table only two rows and swap between the first and second rows. A very C-like trick to do that would be to use only the least significant bit of the row number, so you'd replace table[i] and table[i-1] with table[i&1] and table[(i-1)&1] respectively (but only for the main table -- not when addressing the used table).
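For illustration, here is a minimal Python sketch of that two-row trick (it tracks only the best values; as noted, the used table would still need all of its rows):

def best_sum_not_exceeding(v, max_sum):
    # rows[(i-1) & 1] is the previous row, rows[i & 1] is the current row
    K = max_sum + 1
    rows = [[0] * K, [0] * K]
    for i, x in enumerate(v, start=1):
        prev, curr = rows[(i - 1) & 1], rows[i & 1]
        for c in range(K):
            curr[c] = max(prev[c], x + prev[c - x] if x <= c else 0)
    return rows[len(v) & 1][max_sum]

print(best_sum_not_exceeding([7, 15, 2, 1], 17))   # 17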
Here is an example in python:
def closestSum(a, k):
    s = {0: []}
    for x in a:
        ns = dict(s)
        for j in s:
            ns[j+x] = s[j] + [x]
        s = ns
    if k in s:
        del s[k]
    return s[min(s, key=lambda i: abs(i-k))]
Example:
>>> closestSum([1, 2, 5, 6], 10)
[1, 2, 6]
The idea is simply to keep track of what sums can be made from all previous elements as you go through the array, as well as one way to make that sum. At the end, you just pick the closest to what you want. It is a dynamic programming solution because it breaks the overall problem down into sub-problems, and uses a table to remember the results of the sub-problems instead of recalculating them.
Cato's idea in Racket:
#lang racket
(define (closest-sum xs k)
(define h (make-hash '([0 . ()])))
(for* ([x xs] [(s ys) (hash-copy h)])
(hash-set! h (+ x s) (cons x ys))
(hash-set! h x (list x)))
(when (hash-ref h k #f) (hash-remove! h k))
(cdr (argmin (λ (a) (abs (- k (car a)))) (hash->list h))))
To get an even terser program, one can grab terse-hash.rkt from GitHub and write:
(define (closest-sum xs k)
(define h {make '([0 . ()])})
(for* ([x xs] [(s ys) {copy h}])
{! h (+ x s) (cons x ys)}
{! h x (list x)})
(when {h k #f} {remove! h k})
(cdr (argmin (λ (a) (abs (- k (car a)))) {->list h})))
P.S. I have referred to this as Random, but this is a Seed-Based Random Shuffle, where the Seed will be generated by a PRNG; with the same Seed, the same "random" ordering will be observed.
I am currently trying to find a method to assist in doing 2 things:
1) Generate Non-Repeating Sequence
This will take 2 arguments: Seed and N. It will generate a sequence of size N, populated with the numbers between 1 and N, with no repetitions.
I have found a few good methods to do this, but most of them become infeasible once the second requirement comes into play.
2) Extract an entry from the Sequence
This will take 3 arguments: Seed, N, and I. This is for determining what value would appear at position I in a sequence that would be generated with Seed and N. However, in order to work with what I have in mind, it absolutely cannot generate the whole sequence and then pick out an element.
I initially worked with pre-calculating the sequence and then querying it, but this only really works in test cases, as the number of Seeds and the values of N that will be used would create a database running into the petabytes.
From what I can tell, having a method that implements requirement 1 by using requirement 2 would be the ideal approach.
i.e. a sequence is generated by:
function Generate_Sequence(int S, int N) {
    int[] sequence = new int[N];
    for (int i = 0; i < N; i++) {
        sequence[i] = Extract_From_Sequence(S, N, i);
    }
    return sequence;
}
For Example
GS = Generate Sequence
ES = Extract from Sequence
for:
S = 1
N = 5
I = 4
GS(S, N) = { 4, 2, 5, 1, 3 }
ES(S, N, I) = 1
let S = 2
GS(S, N) = { 3, 5, 2, 4, 1 }
ES(S, N, I) = 4
One way to do this is to make a permutation over the bit positions of the number. Assume that N is a power of two (I will discuss the general case later!).
Use the seed S to generate a permutation \sigma over the set {1, 2, ..., log(N)}. Then permute the bits of I according to \sigma to obtain I'. In other words, the bit of I' at position \sigma(x) is taken from the bit of I at position x.
One problem with this method is its linearity (it is closed under the XOR operation). To overcome this, you can find a number p with gcd(p, N) = 1 (this can be done easily even for very large N) and generate a random number q < N using the seed S. The output of Extract_From_Sequence(S, N, I) would then be (p*I' + q) mod N.
Now for the case where N is not a power of two. The problem arises when I' falls outside the range [1, N]. In that case, we return the most significant bits of I to their initial positions until the resulting value falls into the desired range: first swap the \sigma(log(N)) bit of I' back with the log(N) bit, and so on.
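Here is a minimal Python sketch of the power-of-two case, using 0-based positions and values; deriving sigma, p and q from random.Random(seed) and the function names are my own choices, not part of the answer above:

import random

def extract_from_sequence(seed, n, i):
    # maps position i (0 <= i < n) to a value in 0..n-1 without
    # materializing the whole sequence; n must be a power of two here
    b = n.bit_length() - 1
    assert n == 1 << b, "this sketch only handles n that is a power of two"
    rng = random.Random(seed)
    sigma = list(range(b))
    rng.shuffle(sigma)               # seed-derived permutation of bit positions
    p = rng.randrange(1, n, 2)       # odd, so gcd(p, n) = 1
    q = rng.randrange(n)
    permuted = 0
    for pos in range(b):
        permuted |= ((i >> pos) & 1) << sigma[pos]
    return (p * permuted + q) % n    # affine step breaks the XOR-linearity

def generate_sequence(seed, n):
    return [extract_from_sequence(seed, n, i) for i in range(n)]

print(generate_sequence(1, 8))       # a permutation of 0..7, deterministic for seed 1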
I am trying to count the different sequences of fixed length which can be generated using the numbers from a given set (distinct elements) such that each element from the set appears in the sequence. Below is my logic:
e.g. Let the set consist of S elements, and we have to generate sequences of length K (K >= S).
1) First we have to choose S places out of K and place each element from the set in random order. So, C(K,S)*S!
2) After that, the remaining places can be filled with any values from the set. So, the factor (K-S)^S should be multiplied.
So, the overall result is
C(K,S)*S!*((K-S)^S)
But I am getting the wrong answer. Please help.
PS: C(K,S) is the number of ways of selecting S elements out of K elements (K >= S) irrespective of order. Also, ^ is the power symbol, i.e. 2^3 = 8.
Here is my code in python:
# m is the no. of elements to select from a set of n elements
# fact is a list containing factorial values, i.e. fact[0] = 1, fact[3] = 6 and so on.
def ways(m, n):
    res = fact[n]/fact[n-m+1]*((n-m)**m)
    return res
What you are looking for is the number of surjective functions whose domain is a set of K elements (the K positions that we are filling out in the output sequence) and the image is a set of S elements (your input set). I think this should work:
static int Count(int K, int S)
{
    int sum = 0;
    for (int i = 1; i <= S; i++)
    {
        sum += Pow(-1, (S-i)) * Fact(S) / (Fact(i) * Fact(S - i)) * Pow(i, K);
    }
    return sum;
}
...where Pow and Fact are what you would expect.
Check out this math.se question.
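As a quick sanity check, here is the same inclusion-exclusion formula in Python (math.comb needs Python 3.8+), cross-checked against brute force for small K and S:

from itertools import product
from math import comb

def count_sequences(K, S):
    # inclusion-exclusion count of surjections from K positions onto S values
    return sum((-1) ** (S - i) * comb(S, i) * i ** K for i in range(S + 1))

def brute_force(K, S):
    # enumerate all length-K sequences over S values and keep the ones
    # that use every value at least once
    return sum(1 for seq in product(range(S), repeat=K) if len(set(seq)) == S)

print(count_sequences(4, 3), brute_force(4, 3))   # both print 36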
Here's why your approach won't work. I didn't check the code, just your explanation of the logic behind it, but I'm pretty sure I understand what you're trying to do. Let's take for example K = 4, S = {7,8,9}. Let's examine the sequence 7,8,9,7. It is a unique sequence, but you can get to it by:
Randomly choosing positions 1,2,3, filling them randomly with 7,8,9 (your step 1), then randomly choosing 7 for the remaining position 4 (your step 2).
Randomly choosing positions 2,3,4, filling them randomly with 8,9,7 (your step 1), then randomly choosing 7 for the remaining position 1 (your step 2).
By your logic, you will count it both ways, even though it should be counted only once as the end result is the same. And so on...
I have a few million datapoints, each with a time and a value. I'm interested in knowing all of the sliding windows (i.e., chunks of 4000 datapoints) where the range from high to low of the window exceeds a constant threshold.
For example, assume a window of length 3 and a threshold where high - low > 3. Then the series [10 12 14 13 10 11 16 14 17] would result in [0, 2, 4, 5], because those are the indexes where the 3-period window's high - low range exceeded the threshold.
I have a window size of 4000 and a dataset size of millions.
The naive approach is to just calculate every possible window range, ie 1-4000, 2-4001, 3-4002, etc, and accumulate those sets that breached the threshold. This takes forever as you might imagine for large datasets.
So, the algorithm I think would be better is the following:
Calculate the range of the first window (1-4000), and store the index of the high/low of the window range. Then, iterate to (2-4001, 3-4002) etc. Only update the high/low index if the NEW value on the far right of the window is higher/lower than the old cached value.
Now, let's say the high/low indexes of the 1-4000 window is 333 and 666 respectively. I iterate and continue updating new highs/lows as I see them on the right, but as soon as the window is at 334-4333 (as soon as the cached high/low is outside of the current window) I recalculate the high/low for the current window (334-4333), cache, and continue iterating.
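For what it's worth, here is a rough Python sketch of that caching approach (0-based indexes, strict "greater than the threshold" as in your example), just to make the idea concrete:

def breached_windows(values, w, threshold):
    # cache the indexes of the current window's low and high; rescan only
    # when a cached extreme slides out of the window
    n = len(values)
    if n < w:
        return []
    lo = min(range(w), key=lambda j: values[j])
    hi = max(range(w), key=lambda j: values[j])
    out = []
    for i in range(n - w + 1):
        if lo < i:   # cached low left the window: recalculate it
            lo = min(range(i, i + w), key=lambda j: values[j])
        if hi < i:   # cached high left the window: recalculate it
            hi = max(range(i, i + w), key=lambda j: values[j])
        j = i + w - 1   # the new value on the far right of the window
        if values[j] < values[lo]:
            lo = j
        if values[j] > values[hi]:
            hi = j
        if values[hi] - values[lo] > threshold:
            out.append(i)
    return out

print(breached_windows([10, 12, 14, 13, 10, 11, 16, 14, 17], 3, 3))   # [0, 2, 4, 5]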
My question is:
1.) Is there a mathematical formula for this that eliminates the need for an algorithm at all? I know there are formulas for weighted and exponential moving averages over a window period that don't require recalculation of the window.
2.) Is my algorithm sensible? Accurate? Is there a way it could be greatly simplified or improved?
Thanks a lot.
If the data length is n and window size m, then here's an O(n log m) solution using sorted-maps.
(defn freqs
  "Like frequencies but uses a sorted map"
  [coll]
  (reduce (fn [counts x]
            (assoc counts x (inc (get counts x 0))))
          (sorted-map) coll))

(defn rng
  "Return max - min value of a sorted-map (log time)"
  [smap]
  (- (ffirst (rseq smap)) (ffirst smap)))

(defn slide-threshold [v w t]
  (loop [q (freqs (subvec v 0 w)), i 0, j (+ i w), a []]
    (let [a* (if (> (rng q) t) (conj a i) a)]   ; check the window starting at i
      (if (= (count v) j)
        a*                                       ; last window has now been checked
        (let [q* (merge-with + q {(v i) -1} {(v j) 1})
              q* (if (zero? (q* (v i))) (dissoc q* (v i)) q*)]
          (recur q* (inc i) (inc j) a*))))))
(slide-threshold [10 12 14 13 10 11 16 14 17] 3 3)
;=> [0 2 4 5]
The naive version is not linear. Linear would be O(n). The naive algorithm is O(n*k), where k is the window size. Your improvement also is O(n * k) in the worst case (imagine a sorted array), but in the general case you should see a big improvement in running time because you'll avoid a large number of recalculations.
You can solve this in O(n log k) by using a Min-max heap (or two heaps), but you have to use a type of heap that can remove an arbitrary node in O(log k). You can't use a standard binary heap because although removing an arbitrary node is O(log k), finding the node is O(k).
Assuming you have a Min-max heap, the algorithm looks like this:
heap = create empty heap
add first k items to the heap
for (i = k; i < n; ++i)
{
    if (heap.MaxItem - heap.MinItem) > threshold
        output range
    remove item i-k from the heap
    add item i to the heap
}
check the last window once more after the loop
The problem, of course, is removing item i-k from the heap. Actually, the problem is finding it efficiently. The way I've done this in the past is to modify my binary heap so that it stores nodes that contain an index and a value. The heap comparisons use the value, of course. The index is the node's position in the backing array, and is updated by the heap whenever the node is moved. When an item is added to the heap, the Add method returns a reference to the node, which I maintain in an array. Or in your case you can maintain it in a queue.
So the algorithm looks like this:
queue = create empty queue of heap nodes
heap = create empty heap
for (i = 0; i < k; ++i)
{
    node = heap.Add(array[i]);
    queue.Add(node);
}
for (i = k; i < n; ++i)
{
    if (heap.MaxItem - heap.MinItem) > threshold
        output range
    node = queue.Dequeue()
    remove item at position node.Index from the heap
    node = heap.Add(array[i])
    queue.Add(node)
}
check the last window once more after the loop
This is provably O(n log k). Every item is read and added to the heap. Actually, it's also removed from the heap. In addition, every item is added to the queue and removed from the queue, but those two operations are O(1).
For those of you who doubt me, it is possible to remove an arbitrary element from a heap in O(log k) time, provided that you know where it is. I explained the technique here: https://stackoverflow.com/a/8706363/56778.
So, if you have a window of size 4,000, running time will be roughly proportional to: 3n * 2(log k). Given a million items and a window size of 4,000, that works out to 3,000,000 * (12 * 2), or about 72 million. That's roughly equivalent to having to recompute the full window in your optimized naive method 200 times.
As I said, your optimized method can end up taking a long time if the array is sorted, or nearly so. The heap algorithm I outlined above doesn't suffer from that.
You should give your "better" algorithm a try and see if it's fast enough. If it is, and you don't expect pathological data, then great. Otherwise take a look at this technique.
There are some algorithms that maintain the minimum (or maximum) value in a sliding window with amortized complexity O(1) per element (O(N) for the whole data set). This is one of them, using a deque data structure which contains value/index pairs. To track both Min and Max you have to keep two deques (each at most 4000 long).
at every step:
    if (!Deque.Empty) and (Deque.Head.Index <= CurrentIndex - T) then
        Deque.ExtractHead;
        //Head is too old, it is leaving the window
    while (!Deque.Empty) and (Deque.Tail.Value > CurrentValue) do
        Deque.ExtractTail;
        //remove elements that have no chance to become minimum in the window
    Deque.AddTail(CurrentValue, CurrentIndex);
    CurrentMin = Deque.Head.Value
    //Head value is minimum in the current window
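For reference, here is a minimal Python sketch of this two-deque idea applied to the question's range check (the function name and the strict "exceeds" comparison are my choices):

import collections

def breached_indexes(values, w, threshold):
    # min_dq holds indexes of an increasing run of values, max_dq a decreasing run
    min_dq, max_dq, out = collections.deque(), collections.deque(), []
    for j, x in enumerate(values):
        # drop indexes that have left the window starting at j - w + 1
        if min_dq and min_dq[0] <= j - w:
            min_dq.popleft()
        if max_dq and max_dq[0] <= j - w:
            max_dq.popleft()
        # discard values that can no longer become the window min/max
        while min_dq and values[min_dq[-1]] > x:
            min_dq.pop()
        while max_dq and values[max_dq[-1]] < x:
            max_dq.pop()
        min_dq.append(j)
        max_dq.append(j)
        i = j - w + 1   # start index of the current window
        if i >= 0 and values[max_dq[0]] - values[min_dq[0]] > threshold:
            out.append(i)
    return out

print(breached_indexes([10, 12, 14, 13, 10, 11, 16, 14, 17], 3, 3))   # [0, 2, 4, 5]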
Another approach uses stacks
Here is the python code for this:
import heapq

l = [10, 12, 14, 13, 10, 11, 16, 14, 17]
w = 3
threshold = 3
breached_indexes = []

# set up the heaps for the initial window
min_values = [(l[i], i) for i in range(0, w)]
max_values = [(-l[i], i) for i in range(0, w)]
heapq.heapify(min_values)
heapq.heapify(max_values)

# check if the first window violates the threshold and record the index
if -max_values[0][0] - min_values[0][0] > threshold:
    breached_indexes.append(0)

for i in range(1, len(l) - w + 1):
    # push the element that newly entered the window
    heapq.heappush(min_values, (l[i + w - 1], i + w - 1))
    heapq.heappush(max_values, (-l[i + w - 1], i + w - 1))
    # lazily remove elements that fall before the current window
    while min_values[0][1] < i:
        heapq.heappop(min_values)
    while max_values[0][1] < i:
        heapq.heappop(max_values)
    # check the breach (strictly greater, i.e. the range "exceeds" the threshold)
    if -max_values[0][0] - min_values[0][0] > threshold:
        breached_indexes.append(i)

print(breached_indexes)
Explanation:
Maintain 2 heaps, a min-heap and a max-heap.
At every step when we move the window, do the following:
a. Push the element that newly entered the window into both heaps.
b. Pop items from the heaps until the items at the top fall inside the current window.
c. Check if the threshold is violated by comparing the top elements of the two heaps, and record the index if needed.
* I use negated values for max_values, since Python's heapq only implements a min-heap.
The worst-case complexity of this algorithm would be O(n log n).
Just wanted to play with an idea inspired by the Simple Moving Average concept.
Let's consider 9 points with a sliding window of size 4. At any point, we'll keep track of the maximum values for all windows of size 4, 3, 2, and 1 respectively that end at that point. Suppose we store them in arrays...
At position 1 (p1), we have one value (v1) and one window {p1}, the array A1 contains max(v1)
At position 2 (p2), we have two values (v1, v2) and two windows {p1, p2} and {p2}, the array A2 contains max(v1, v2) and max(v2)
At position 3 (p3), following the same pattern, the array A3 contains max(v1, v2, v3) = max(max(v1, v2), v3), max(v2, v3), and max(v3). Observe that we already know max(v1, v2) from A2
Let's jump a bit and look at position 6 (p6), the array A6 contains max(v3, v4, v5, v6), max(v4, v5, v6), max(v5, v6), and max(v6). Again, we already know max(v3, v4, v5), max(v4, v5), and max(v5) from A5.
Roughly, it looks something like this:
1  2  3  4  5  6  7  8  9
1  1  1  1
x  2  2  2  2
x  x  3  3  3  3
x  x  x  4  4  4  4
            5  5  5  5
               6  6  6  6
                  7  7  7
                     8  8
                        9
This can be generalized as follows:
Let
n number of datapoints
s window size, 1 <= s <= n
i current position / datapoint, 1 <= i <= n
Vi value at position i
Ai array at position i (note: the array starts at 1 in this definition)
then
Ai (i <= s) has elements
aj = max(Vi, Ai-1[j]) for j in (1..i-1)
aj = Vi for j = i
aj = undefined/unimportant for j in (i+1..s)
Ai (i > s) has elements
aj = max(Vi, Ai-1[j+1]) for j in (1..s-1)
aj = Vi for j = s
The max value for the window of size s at position i is given by Ai[1]. Further, one gets as a bonus the max value for a window of any size x (0 < x <= s ) given by Ai[s - x + 1].
In my opinion the following is true:
Computational/time complexity is minimal. There is no sorting, insertion, deletion, or searching; however, the max function is called n*s times.
Space complexity is bigger (we are storing at least s arrays of size s) but only if we want to persist the result for future queries which run in O(1). Otherwise, only two arrays are necessary, Ai-1 and Ai; all we need in order to fill in the array at position i is the array at position i-1
We still cannot easily make this algorithm run in parallel processes
Using this algorithm to calculate min and max values, we can efficiently accumulate sliding-window percentage changes over a large dataset
I added a sample implementation / test bed in Javascript for it on github - SlidingWindowAlgorithm. Here is a copy of the algorithm itself (Please note that in this implementation the array is indexed at 0):
var evalMaxInSlidingWindow = function(datapoints, windowsize){
    var Aprev = [];
    var Acurr = [];
    var Aresult = [];

    for (var i = 0, len = datapoints.length; i < len; i++)
    {
        if (i < windowsize)
        {
            for(var j = 0; j < windowsize; j++)
            {
                if (j < i)
                {
                    Acurr[j] = Math.max(datapoints[i], Aprev[j]);
                }
                if (j == i)
                {
                    Acurr[j] = datapoints[i];
                }
            }
        }
        else
        {
            for(var j = 0; j < windowsize; j++)
            {
                if (j < windowsize - 1)
                {
                    Acurr[j] = Math.max(datapoints[i], Aprev[j + 1]);
                }
                if (j == windowsize - 1)
                {
                    Acurr[j] = datapoints[i];
                }
            }
        }

        Aresult.push(Acurr[0]);
        Aprev = [].concat(Acurr);
    }

    return Aresult;
};
After a discussion with Scott, it seems that this algorithm does nothing special. Well, it was fun playing with it. : )
Let me be clear at start that this is a contrived example and not a real world problem.
Say I have the problem of creating a random number between 0 and 10. I do this 11 times, making sure that a previously drawn number is not drawn again. If I get a repeated number,
I create another random number to make sure it has not been seen earlier. So essentially I get a sequence of unique numbers from 0 - 10 in a random order,
e.g. 3 1 2 0 5 9 4 8 10 6 7 and so on
Now to come up with logic to make sure that the random numbers are unique and not one which we have drawn before, we could use many approaches
Use a C++ std::bitset and set the bit whose index equals the value of each random number drawn, then check it the next time a new random number is drawn.
Or
Use a std::map<int,int> to count occurrences, or even a simple C array with sentinel values stored in it to indicate whether a number has occurred or not.
If I have to avoid the methods above and use some mathematical/logical/bitwise operation to find whether a random number has been drawn before or not, is there a way?
You don't want to do it the way you suggest. Consider what happens when you have already selected 10 of the 11 items; your random number generator will cycle until it finds the missing number, which might be never, depending on your random number generator.
A better solution is to create a list of the numbers 0 to 10 in order, then shuffle the list into a random order. The standard algorithm for doing this is due to Knuth, Fisher and Yates: working backward from the last element, swap each element with a randomly chosen element at a position no greater than the current one.
function shuffle(a, n)
    for i from n-1 to 1 step -1
        j = randint(i)
        swap(a[i], a[j])
We assume an array with indices 0 to n-1, and a randint function that returns a random integer j in the range 0 <= j <= i.
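As a runnable sketch of that pseudocode in Python (using random.randint for randint):

import random

def shuffle(a):
    # Fisher-Yates: walk backward, swapping each element with a random
    # element at an index no greater than the current one
    for i in range(len(a) - 1, 0, -1):
        j = random.randint(0, i)   # 0 <= j <= i
        a[i], a[j] = a[j], a[i]
    return a

print(shuffle(list(range(11))))    # e.g. [3, 1, 2, 0, 5, 9, 4, 8, 10, 6, 7]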
Use an array and add all possible values to it. Then pick one out of the array and remove it. Next time, pick again until the array is empty.
Yes, there is a mathematical way to do it, but it is a bit expensive.
Have an array primes[] where primes[i] is the i'th prime number, so its beginning will be [2,3,5,7,11,...].
Also store a number mult. Now, once you draw a number (let it be i) you check if mult % primes[i] == 0. If it is, the number was drawn before; if it isn't, it was not, so choose it and do mult = mult * primes[i].
However, it is expensive because it might require a lot of space for large ranges (the possible values of mult increase exponentially).
(This is a nice mathematical approach, because we are really looking at a set of primes p_i; the array of primes is only the implementation of the abstract set of primes.)
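A toy sketch of this prime-product bookkeeping in Python, for the 0-10 range of the question (the function names are mine):

# primes[i] is the i'th prime, one per possible value 0..10
primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31]
mult = 1

def seen_before(i):
    return mult % primes[i] == 0

def mark_seen(i):
    global mult
    mult *= primes[i]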
A bit manipulation alternative for small values is using an int or long as a bitset.
With this approach, to check that a candidate i is not in the set you only need to check:
if ((set & (1 << i)) == 0) // not in the set
else // already in the set
To add an element i to the set:
set = set | (1 << i)
A better approach would be to populate a list with all the numbers, shuffle it with a Fisher-Yates shuffle, and iterate over it to generate new random numbers.
If I have to avoid these methods above and use some mathematical/logical/bitwise operation to find whether a random number has been drawn before or not, is there a way?
Subject to your contrived constraints yes, you can imitate a small bitset using bitwise operations:
You can choose different integer types on the right according to what size you need.
bitset code              bitwise code
std::bitset<32> x;       unsigned long x = 0;
if (x[i]) { ... }        if (x & (1UL << i)) { ... }
                         // assuming v is 0 or 1
x[i] = v;                x = (x & ~(1UL << i)) | ((unsigned long)v << i);
x[i] = true;             x |= (1UL << i);
x[i] = false;            x &= ~(1UL << i);
For a larger set (beyond the size in bits of unsigned long long), you will need an array of your chosen integer type. Divide the index by the width of each value to know what index to look up in the array, and use the modulus for the bit shifts. This is basically what bitset does.
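The same index arithmetic, sketched in Python with a list of 64-bit words standing in for the array of integers (the helper names are mine, not a real library API):

WORD = 64   # bits per backing word

def make_bitset(nbits):
    return [0] * ((nbits + WORD - 1) // WORD)

def test_bit(bits, i):
    # divide by the word width to find the word, use the remainder for the shift
    return (bits[i // WORD] >> (i % WORD)) & 1

def set_bit(bits, i):
    bits[i // WORD] |= 1 << (i % WORD)

def clear_bit(bits, i):
    bits[i // WORD] &= ~(1 << (i % WORD))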
I'm assuming that the various answers that tell you how best to shuffle 10 numbers are missing the point entirely: that your contrived constraints are there because you do not in fact want or need to know how best to shuffle 10 numbers :-)
Keep a variable to map the drawn numbers. The i'th bit of that variable will be 1 if the number was drawn before:
int mapNumbers = 0;

int generateRand() {
    // all 11 numbers (bits 0..10) have already been generated
    if ((mapNumbers & ((1 << 11) - 1)) == ((1 << 11) - 1)) return -1;
    int x;
    do {
        x = newVal();
    } while (mapNumbers & (1 << x));  // redraw while x has been seen before
    mapNumbers |= (1 << x);
    return x;
}
Given an array of n distinct integers, find all pairs x, y in the array such that z (given) = x * y. Do it without sorting and in the most efficient manner.
[edit] The integers are within the range of int, i.e. 0-65536, and the numbers are non-negative, if that helps.
I don't want to sort because it will take a lot of time. Storage space is not an issue.
Here is a linear-time hash-based solution:
Let hash be an array of size 65537, initialized to 0.

foreach element ele in Array
    if hash[ele] != 0 AND ele * hash[ele] == product
        print hash[ele], ele
    end-if
    if ele != 0 AND product % ele == 0 AND product/ele <= 65536
        hash[product/ele] = ele
    end-if
end-foreach
There aren't any super efficient ways of doing this. The best I can think of is O(n^2):
Have an auxiliary function that takes a number (a) and a list, goes through every element (b) checking whether a*b == z, and saves the pair if it is.
Go through every element of your original list, and if a particular element (x) divides z (i.e. z % x == 0), then send x and the remainder of the list after x to the auxiliary function.
UPDATE:
I'm giving an O(n^2) solution because the question did not specify unique pairs. If only unique pairs are desired, this should be added to the question. Also, my solution assumes the order of pairs doesn't matter, which is another detail that should be clarified.
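For what it's worth, a small Python sketch of the quadratic approach described above (the names are mine):

def pairs_quadratic(xs, z):
    # for each element x that divides z, scan the rest of the list
    # for a co-factor y with x * y == z
    out = []
    for idx, x in enumerate(xs):
        if x != 0 and z % x == 0:
            for y in xs[idx + 1:]:
                if x * y == z:
                    out.append((x, y))
    return out

print(pairs_quadratic([2, 4, 8, 16, 3], 32))   # [(2, 16), (4, 8)]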
Iterate through the array... if an element x divides z (i.e. z % x == 0), check whether its other factor y = z/x exists in the hash table.
If it does, then you found a pair... else just add x to the hash table and continue...
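A short Python sketch of that approach, checking for the co-factor before inserting the current element so a perfect square doesn't pair with itself (the names are mine):

def find_pairs(arr, z):
    # one pass: look up the co-factor among previously seen elements,
    # then remember the current element
    seen = set()
    pairs = []
    for x in arr:
        if x != 0 and z % x == 0 and z // x in seen:
            pairs.append((z // x, x))
        seen.add(x)
    return pairs

print(find_pairs([2, 4, 8, 16, 3], 32))   # [(4, 8), (2, 16)]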