I have six arrays that are each given a (not necessarily unique) value from one to fifty. I am also given a number of items to split between them. The value of each item is defined by the array it is in. Arrays can hold infinite or zero items, but the sum of items in all arrays must equal the original number of items given.
I want to find the best configuration of items in arrays where the sum of item values in each individual array are as close as possible to each other.
For instance, let's say that I have three arrays with a value of 10 and three arrays with a value of 20. For nine items, one would go in each of the '20' arrays and two would go into each of the '10' arrays so that the sum of each array is 20 and the total number of items is nine.
I can't add a fractional number of items to an array, and the numbers are hardly ever perfectly divisible like that example, but there always exists a solution where the difference between the sums is minimal.
I'm currently using brute force to solve this problem, but performance suffers with larger numbers of items. I feel like there is a mathematical answer to this problem, but I wouldn't even know where to begin.
It is easy to write a greedy algorithm that comes up with an approximate solution. Just always add the next item to the array with the lowest sum of values.
The array with the highest value should be within 1 item of being correct.
For each count of items in the array with the highest value, you can repeat the exercise. Getting the array with the second highest value to within 1.
Continue through all of them, and with 6 arrays you'll wind up with 3^5 = 243 possible arrangements of items (note that the number of items in the last array is entirely determined by the first 5). Pick the best of these and your combinatorial explosion is contained.
(This approach should work if you're trying to minimize the value difference between the largest and smallest array, and have a fixed number of arrays. )
I am having trouble figuring out an algorithim to best assign values to different points on a diagram based on the distance between the points.
Essentially, I am given a diagram with a block and a dynamic amount of points. It should look something like this:
I am then given a list of values to assign to each point. Here are the rules and info:
I know the Lat,Long values for each point and the central block. In other words, I can get the direct distance from every object to another.
The list of values may be shorter that the total number of points. In this case, values can be repeated multiple times.
In the case where values must be repeated, the duplicate values should be as far away as possible from one another.
Here is an example using a value list of {1,2}:
In reality, this is a very simple example. In truth, there may be thousands of points.
Find out how many values you need to repeat, in your example you have 2 values and 5 points so, you need to have 2 repetition for 2 values, then you will have 2x2=4 positions [call this pNum] (you have to use different pairs as much as possible so that they are far apart from each other).
Calculate a distance array then find the max pNum values in the array, in other words find the greates 4 values in the array in your example.
assigne the repeated values for the the points found most far apart, and assign the rest of the points based on the array distance values.
I have an array with, for example, 1000000000000 of elements (integers). What is the best approach to pick, for example, only 3 random and unique elements from this array? Elements must be unique in whole array, not in list of N (3 in my example) elements.
I read about Reservoir sampling, but it provides only method to pick random numbers, which can be non-unique.
If the odds of hitting a non-unique value are low, your best bet will be to select 3 random numbers from the array, then check each against the entire array to ensure it is unique - if not, choose another random sample to replace it and repeat the test.
If the odds of hitting a non-unique value are high, this increases the number of times you'll need to scan the array looking for uniqueness and makes the simple solution non-optimal. In that case you'll want to split the task of ensuring unique numbers from the task of making a random selection.
Sorting the array is the easiest way to find duplicates. Most sorting algorithms are O(n log n), but since your keys are integers Radix sort can potentially be faster.
Another possibility is to use a hash table to find duplicates, but that will require significant space. You can use a smaller hash table or Bloom filter to identify potential duplicates, then use another method to go through that smaller list.
counts = [0] * (MAXINT-MININT+1)
for value in Elements:
counts[value] += 1
uniques = [c for c in counts where c==1]
result = random.pick_3_from(uniques)
I assume that you have a reasonable idea what fraction of the array values are likely to be unique. So you would know, for instance, that if you picked 1000 random array values, the odds are good that one is unique.
Step 1. Pick 3 random hash algorithms. They can all be the same algorithm, except that you add different integers to each as a first step.
Step 2. Scan the array. Hash each integer all three ways, and for each hash algorithm, keep track of the X lowest hash codes you get (you can use a priority queue for this), and keep a hash table of how many times each of those integers occurs.
Step 3. For each hash algorithm, look for a unique element in that bucket. If it is already picked in another bucket, find another. (Should be a rare boundary case.)
That is your set of three random unique elements. Every unique triple should have even odds of being picked.
(Note: For many purposes it would be fine to just use one hash algorithm and find 3 things from its list...)
This algorithm will succeed with high likelihood in one pass through the array. What is better yet is that the intermediate data structure that it uses is fairly small and is amenable to merging. Therefore this can be parallelized across machines for a very large data set.
I am using the rand() function in my iPhone project to generate a random array index. I generate several random indexes and then get the objects from those indexes. However I don't want to get one object more than once so is there a way to say generate a random number within the range of the array count (which I am already doing) excluding previously picked numbers.
i.e. something like this:
int one = rand() % arrayCount
int two = rand() % arrayCount != one
Thanks
Three possibilities:
Shuffling
Shuffle your array and extract the elements in their order.
Remember
Extract a random element and store it into a NSSet. If you extract one the next time check if it's already in the set. (This is linear time.)
Delete
Use a NSMutableArray and remove already extracted elements from the array. If you don't want to modify the original one make a mutable copy.
Which one's the best depends on your needs.
Background:
I'm working with permutations of the sequence of integers {0, 1, 2 ... , n}.
I have a local search algorithm that transforms a permutation in some systematic way into another permutation. The point of the algorithm is to produce a permutation that minimises a cost function. I'd like to work with a wide range of problems, from n=5 to n=400.
The problem:
To reduce search effort I need to be able to check if I've processed a particular permutation of integers before. I'm using a hash table for this and I need to be able to generate an id for each permutation which I can use as a key into the table. However, I can't think of any nice hash function that maps a set of integers into a key such that collisions do not occur too frequently.
Stuff I've tried:
I started out by generating a sequence of n prime numbers and multiplying the ith number in my permutation with the ith prime then summing the results. The resulting key however produces collisions even for n=5.
I also thought to concatenate the values of all numbers together and take the integer value of the resulting string as a key but the id quickly becomes too big even for small values of n. Ideally, I'd like to be able to store each key as an integer.
Does stackoverflow have any suggestions for me?
Zobrist hashing might work for you. You need to create an NxN matrix of random integers, each cell representing that element i is in the jth position in the current permutation.
For a given permutation you pick the N cell values, and xor them one by one to get the permutation's key (note that key uniqueness is not guaranteed).
The point in this algorithm is, that if you swap to elements in your permutations, you can easily generate the new key from the current permutation by simply xor-ing out the old and xor-ing in the new positions.
Judging by your question, and the comments you've left, I'd say your problem is not possible to solve.
Let me explain.
You say that you need a unique hash from your combination, so let's make that rule #1:
1: Need a unique number to represent a combination of an arbitrary number of digits/numbers
Ok, then in a comment you've said that since you're using quite a few numbers, storing them as a string or whatnot as a key to the hashtable is not feasible, due to memory constraints. So let's rewrite that into another rule:
2: Cannot use the actual data that were used to produce the hash as they are no longer in memory
Basically, you're trying to take a large number, and store that into a much smaller number range, and still have uniqueness.
Sorry, but you can't do that.
Typical hashing algorithms produce relatively unique hash values, so unless you're willing to accept collisions, in the sense that a new combination might be flagged as "already seen" even though it hasn't, then you're out of luck.
If you were to try a bit-field, where each combination has a bit, which is 0 if it hasn't been seen, you still need large amounts of memory.
For the permutation in n=20 that you left in a comment, you have 20! (2,432,902,008,176,640,000) combinations, which if you tried to simply store each combination as a 1-bit in a bit-field, would require 276,589TB of storage.
You're going to have to limit your scope of what you're trying to do.
As others have suggested, you can use hashing to generate an integer that will be unique with high probability. However, if you need the integer to always be unique, you should rank the permutations, i.e. assign an order to them. For example, a common order of permutations for set {1,2,3} is the lexicographical order:
1,2,3
1,3,2
2,1,3
2,3,1
3,1,2
3,2,1
In this case, the id of a permutation is its index in the lexicographical order. There are other methods of ranking permutations, of course.
Making ids a range of continuous integers makes it possible to implement the storage of processed permutations as a bit field or a boolean array.
How fast does it need to be?
You could always gather the integers as a string, then take the hash of that, and then just grab the first 4 bytes.
For a hash you could use any function really, like MD5 or SHA-256.
You could MD5 hash a comma separated string containg your ints.
In C# it would look something like this (Disclaimer: I have no compiler on the machine I'm using today):
using System;
using System.Security.Cryptography;
using System.Text;
public class SomeClass {
static Guid GetHash(int[] numbers) {
string csv = string.Join(',', numbers);
return new Guid(new MD5CryptoServiceProvider().ComputeHash(Encoding.ASCII.GetBytes(csv.Trim())));
}
}
Edit: What was I thinking? As stated by others, you don't need a hash. The CSV should be sufficient as a string Id (unless your numbers array is big).
Convert each number to String, concatenate Strings (via StringBuffer) and take contents of StringBuffer as a key.
Not relates directly to the question, but as an alternative solution you may use Trie tree as a look up structure. Trie trees are very good for strings operations, its implementation relatively easy and it should be more faster (max of n(k) where k is length of a key) than hashset for a big amount of long strings. And you aren't limited in key size( such in a regular hashset in must int, not bigger). Key in your case will be a string of all numbers separated by some char.
Prime powers would work: if p_i is the ith prime and a_i is the ith element of your tuple, then
p_0**a_0 * p_1**a_1 * ... * p_n**a_n
should be unique by the Fundamental Theorem of Arithmetic. Those numbers will get pretty big, though :-)
(e.g. for n=5, (1,2,3,4,5) will map to 870,037,764,750 which is already more than 32 bits)
Similar to Bojan's post it seems like the best way to go is to have a deterministic order to the permutations. If you process them in that order then there is no need to do a lookup to see if you have already done any particular permutation.
get two permutations of same series of numbers {1,.., n}, construct a mapping tupple, (id, permutation1[id], permutation2[id]), or (id, f1(id), f2(id)); you will get an unique map by {f3(id)| for tuple (id, f1(id), f2(id)) , from id, we get a f2(id), and find a id' from tuple (id',f1(id'),f2(id')) where f1(id') == f2(id)}