Divide a group of people into two disjoint subgroups (of arbitrary size) and find some values

Divide a group of people into two disjoint subgroups (of arbitrary size) and find some values - algorithm

As we know from programming, sometimes a slight change in a problem can
significantly alter the form of its solution.
Firstly, I want to create a simple algorithm for solving
the following problem and classify it using bigtheta
notation:
Divide a group of people into two disjoint subgroups
(of arbitrary size) such that the
difference in the total ages of the members of
the two subgroups is as large as possible.
Now I need to change the problem so that the desired
difference is as small as possible and classify
my approach to the problem.
Well,first of all I need to create the initial algorithm.
For that, should I make some kind of sorting in order to separate the teams, and how am I suppose to continue?
EDIT: for the first problem,we have ruled out the possibility of a set being an empty set. So all we have to do is just a linear search to find the min age and then put it in a set B. SetA now has all the other ages except the age of setB, which is the min age. So here is the max difference of the total ages of the two sets, as high as possible

The way you described the first problem, it is trivial in the way that it requires you to find only the minimum element (in case the subgroups should contain at least 1 member), otherwise it is already solved.
The second problem can be solved recursively the pseudo code would be:
// compute sum of all elem of array and store them in sum
min = sum;
globalVec = baseVec;
fun generate(baseVec, generatedVec, position, total)
if (abs(sum - 2*total) < min){ // check if the distribution is better
min = abs(sum - 2*total);
globalVec = generatedVec;
}
if (position >= baseVec.length()) return;
else{
// either consider elem at position in first group:
generate(baseVec,generatedVec.pushback(baseVec[position]), position + 1, total+baseVec[position]);
// or consider elem at position is second group:
generate(baseVec,generatedVec, position + 1, total);
}
And now just start the function with generate(baseVec,"",0,0) where "" stand for an empty vector.
The algo can be drastically improved by applying it to a sorted array, hence adding a test condition to stop branching, but the idea stays the same.

Related

Is there an elegant and efficient way to implement weighted random choices in golang? Details on current implementation and issues inside

tl;dr: I'm looking for methods to implement a weighted random choice based on the relative magnitude of values (or functions of values) in an array in golang. Are there standard algorithms or recommendable packages for this? Is so how do they scale?
Goals
I'm trying to write 2D and 3D markov process programs in golang. A simple 2D example of such is the following: Imagine one has a lattice, and on each site labeled by index (i,j) there are n(i,j) particles. At each time step, the program chooses a site and moves one particle from this site to a random adjacent site. The probability that a site is chosen is proportional to its population n(i,j) at that time.
Current Implementation
My current algorithm, e.g. for the 2D case on an L x L lattice, is the following:
Convert the starting array into a slice of length L^2 by concatenating rows in order, e.g. cdfpop[i L +j]=initialpopulation[i][j].
Convert the 1D slice into a cdf by running a for loop over cdfpop[i]+=cdfpop[i-1].
Generate a two random numbers, Rsite whose range is from 1 to the largest value in the cdf (this is just the last value, cdfpop[L^2-1]), and Rhop whose range is between 1 and 4. The first random number chooses a weighted random site, and the second number a random direction to hop in
Use a binary search to find the leftmost index indexhop of cdfpop that is greater than Rsite. The index being hopped to is either indexhop +-1 for x direction hops or indexhop +- L for y direction hops.
Finally, directly change the values of cdfpop to reflect the hop process. This means subtracting one from (adding one to) all values in cdfpop between the index being hopped from (to) and the index being hopped to (from) depending on order.
Rinse and repeat in for loop. At the end reverse the cdf to determine the final population.
Edit: Requested Pseudocode looks like:
main(){
//import population LxL array
population:= import(population array)
//turn array into slice
for i number of rows{
cdf[ith slice of length L] = population[ith row]
}
//compute cumulant array
for i number of total sites{
cdf[i] = cdf[i-1]+cdf[i]
}
for i timesteps{
site = Randomhopsite(cdf)
cdf = Dohop(cdf, site)
}
Convertcdftoarrayandsave(cdf)
}
Randomhopsite(cdf) site{
//Choose random number in range of the cummulant
randomnumber=RandomNumber(Range 1 to Max(cdf))
site = binarysearch(cdf) // finds leftmost index such that
// cdf[i] > random number
return site
}
Dohop(cdf,site) cdf{
//choose random hop direction and calculate coordinate
randomnumber=RandomNumber(Range 1 to 4)
case{
randomnumber=1 { finalsite= site +1}
randomnumber=2 { finalsite= site -1}
randomnumber=3 { finalsite= site + L}
randomnumber=4 { finalsite= site - L}
}
//change the value of the cumulant distribution to reflect change
if finalsite > site{
for i between site and finalsite{
cdf[i]--
}
elseif finalsite < site{
for i between finalsite and site{
cdf[i]++
}
else {error: something failed}
return cdf
}
This process works really well for simple problems. For this particular problem, I can run about 1 trillion steps on a 1000x 1000 lattice in about 2 minutes on average with my current set up, and I can compile population data to gifs every 10000 or so steps by spinning a go routine without a huge slowdown.
Where efficiency breaks down
The trouble comes when I want to add different processes, with real-valued coefficients, whose rates are not proportional to site population. So say I now have a hopping rate at k_hop *n(i,j) and a death rate (where I simply remove a particle) at k_death *(n(i,j))^2. There are two slow-downs in this case:
My cdf will be double the size (not that big of a deal). It will be real valued and created by cdfpop[i*L+j]= 4 *k_hop * pop[i][j] for i*L+j<L*L and cdfpop[i*L+j]= k_death*math. Power(pop[i][j],2) for L*L<=i*L+j<2*L*L, followed by cdfpop[i]+=cdfpop[i-1]. I would then select a random real in the range of the cdf.
Because of the squared n, I will have to dynamically recalculate the part of the cdf associated with the death process weights at each step. This is a MAJOR slow down, as expected. Timing for this is about 3 microseconds compared with the original algorithm which took less than a nanosecond.
This problem only gets worse if I have rates calculated as a function of populations on neighboring sites -- e.g. spontaneous particle creation depends on the product of populations on neighboring sites. While I hope to work out a way to just modify the cdf without recalculation by thinking really hard, as I try to simulate problems of increasing complexity, I can't help but wonder if there is a universal solution with reasonable efficiency I'm missing that doesn't require specialized code for each random process.
Thanks for reading!

Incorrect Recursive approach to finding combinations of coins to produce given change

I was recently doing a project euler problem (namely #31) which was basically finding out how many ways we can sum to 200 using elements of the set {1,2,5,10,20,50,100,200}.
The idea that I used was this: the number of ways to sum to N is equal to
(the number of ways to sum N-k) * (number of ways to sum k), summed over all possible values of k.
I realized that this approach is WRONG, namely due to the fact that it creates several several duplicate counts. I have tried to adjust the formula to avoid duplicates, but to no avail. I am seeking the wisdom of stack overflowers regarding:
whether my recursive approach is concerned with the correct subproblem to solve
If there exists one, what would be an effective way to eliminate duplicates
how should we approach recursive problems such that we are concerned with the correct subproblem? what are some indicators that we've chosen a correct (or incorrect) subproblem?

When trying to avoid duplicate permutations, a straightforward strategy that works in most cases is to only create rising or falling sequences.
In your example, if you pick a value and then recurse with the whole set, you will get duplicate sequences like 50,50,100 and 50,100,50 and 100,50,50. However, if you recurse with the rule that the next value should be equal to or smaller than the currently selected value, out of those three you will only get the sequence 100,50,50.
So an algorithm that counts only unique combinations would be e.g.:
function uniqueCombinations(set, target, previous) {
for all values in set not greater than previous {
if value equals target {
increment count
}
if value is smaller than target {
uniqueCombinations(set, target - value, value)
}
}
}
uniqueCombinations([1,2,5,10,20,50,100,200], 200, 200)
Alternatively, you can create a copy of the set before every recursion, and remove the elements from it that you don't want repeated.
The rising/falling sequence method also works with iterations. Let's say you want to find all unique combinations of three letters. This algorithm will print results like a,c,e, but not a,e,c or e,a,c:
for letter1 is 'a' to 'x' {
for letter2 is first letter after letter1 to 'y' {
for letter3 is first letter after letter2 to 'z' {
print [letter1,letter2,letter3]
}
}
}

m69 gives a nice strategy that often works, but I think it's worthwhile to better understand why it works. When trying to count items (of any kind), the general principle is:
Think of a rule that classifies any given item into exactly one of several non-overlapping categories. That is, come up with a list of concrete categories A, B, ..., Z that will make the following sentence true: An item is either in category A, or in category B, or ..., or in category Z.
Once you have done this, you can safely count the number of items in each category and add these counts together, comfortable in the knowledge that (a) any item that is counted in one category is not counted again in any other category, and (b) any item that you want to count is in some category (i.e., none are missed).
How could we form categories for your specific problem here? One way to do it is to notice that every item (i.e., every multiset of coin values that sums to the desired total N) either contains the 50-coin exactly zero times, or it contains it exactly once, or it contains it exactly twice, or ..., or it contains it exactly RoundDown(N / 50) times. These categories don't overlap: if a solution uses exactly 5 50-coins, it pretty clearly can't also use exactly 7 50-coins, for example. Also, every solution is clearly in some category (notice that we include a category for the case in which no 50-coins are used). So if we had a way to count, for any given k, the number of solutions that use coins from the set {1,2,5,10,20,50,100,200} to produce a sum of N and use exactly k 50-coins, then we could sum over all k from 0 to N/50 and get an accurate count.
How to do this efficiently? This is where the recursion comes in. The number of solutions that use coins from the set {1,2,5,10,20,50,100,200} to produce a sum of N and use exactly k 50-coins is equal to the number of solutions that sum to N-50k and do not use any 50-coins, i.e. use coins only from the set {1,2,5,10,20,100,200}. This of course works for any particular coin denomination that we could have chosen, so these subproblems have the same shape as the original problem: we can solve each one by simply choosing another coin arbitrarily (e.g. the 10-coin), forming a new set of categories based on this new coin, counting the number of items in each category and summing them up. The subproblems become smaller until we reach some simple base case that we process directly (e.g. no allowed coins left: then there is 1 item if N=0, and 0 items otherwise).
I started with the 50-coin (instead of, say, the largest or the smallest coin) to emphasise that the particular choice used to form the set of non-overlapping categories doesn't matter for the correctness of the algorithm. But in practice, passing explicit representations of sets of coins around is unnecessarily expensive. Since we don't actually care about the particular sequence of coins to use for forming categories, we're free to choose a more efficient representation. Here (and in many problems), it's convenient to represent the set of allowed coins implicitly as simply a single integer, maxCoin, which we interpret to mean that the first maxCoin coins in the original ordered list of coins are the allowed ones. This limits the possible sets we can represent, but here that's OK: If we always choose the last allowed coin to form categories on, we can communicate the new, more-restricted "set" of allowed coins to subproblems very succinctly by simply passing the argument maxCoin-1 to it. This is the essence of m69's answer.

There's some good guidance here. Another way to think about this is as a dynamic program. For this, we must pose the problem as a simple decision among options that leaves us with a smaller version of the same problem. It boils out to a certain kind of recursive expression.
Put the coin values c0, c1, ... c_(n-1) in any order you like. Then define W(i,v) as the number of ways you can make change for value v using coins ci, c_(i+1), ... c_(n-1). The answer we want is W(0,200). All that's left is to define W:
W(i,v) = sum_[k = 0..floor(200/ci)] W(i+1, v-ci*k)
In words: the number of ways we can make change with coins ci onward is to sum up all the ways we can make change after a decision to use some feasible number k of coins ci, removing that much value from the problem.
Of course we need base cases for the recursion. This happens when i=n-1: the last coin value. At this point there's a way to make change if and only if the value we need is an exact multiple of c_(n-1).
W(n-1,v) = 1 if v % c_(n-1) == 0 and 0 otherwise.
We generally don't want to implement this as a simple recursive function. The same argument values occur repeatedly, which leads to an exponential (in n and v) amount of wasted computation. There are simple ways to avoid this. Tabular evaluation and memoization are two.
Another point is that it is more efficient to have the values in descending order. By taking big chunks of value early, the total number of recursive evaluations is minimized. Additionally, since c_(n-1) is now 1, the base case is just W(n-1)=1. Now it becomes fairly obvious that we can add a second base case as an optimization: W(n-2,v) = floor(v/c_(n-2)). That's how many times the for loop will sum W(n-1,1) = 1!
But this is gilding a lilly. The problem is so small that exponential behavior doesn't signify. Here is a little implementation to show that order really doesn't matter:
#include <stdio.h>
#define n 8
int cv[][n] = {
{200,100,50,20,10,5,2,1},
{1,2,5,10,20,50,100,200},
{1,10,100,2,20,200,5,50},
};
int *c;
int w(int i, int v) {
if (i == n - 1) return v % c[n - 1] == 0;
int sum = 0;
for (int k = 0; k <= v / c[i]; ++k)
sum += w(i + 1, v - c[i] * k);
return sum;
}
int main(int argc, char *argv[]) {
unsigned p;
if (argc != 2 || sscanf(argv[1], "%d", &p) != 1 || p > 2) p = 0;
c = cv[p];
printf("Ways(%u) = %d\n", p, w(0, 200));
return 0;
}
Drumroll, please...
$ ./foo 0
Ways(0) = 73682
$ ./foo 1
Ways(1) = 73682
$ ./foo 2
Ways(2) = 73682

Algorithm / Data structure for largest set intersection in a collection of sets with a given set

I have a large collection of several million sets, C. The elements of my sets come from a universe of about 2000 possible elements. I need to know, for a given set, s, which set in C has the largest intersection with s? (Or the k sets in C with the k-largest intersections). I will be making many of these queries, sequentially, for different s.
I know that the obvious way to do this is to just to loop over every set in C and compute the intersection and take the max. Are there any smart data structures / programming tricks that can speed up my search? It would be great if I could do this faster than O(C).
EDIT: approximate answers would be alright too

I don't think there's a clever data structure that will help with asymptotic performance. But this is a perfect map reduce problem. A GPGPU would do nicely. For a universe of 2048 elements, a set as a bitmap is only 256 bytes. 4 million is only a gigabyte. Even a modestly spec'ed Nvidia has that. E.g. programming in CUDA, you'd copy C to graphics card RAM, map a chunk of the gigabyte to each GPU core for searching and then reduce across cores to find the final answer. This ought to take on the order of a very few milliseconds. Not fast enough? Just buy hotter hardware.
If you re-phrase your question along these lines, you'll probably get answers from experts in this kind of programming, which I'm not.

One simple trick is to sort the list of sets C in decreasing order by size, then proceed with brute force intersection tests as usual. As you go along, keep track of the set b with the biggest intersection so far. If you find a set whose intersection with the query set s has size |s| (or equivalently, has intersection equal to s -- use whichever of these tests is faster), you can immediately stop and return it as this is the best possible answer. Otherwise, if the next set from C has fewer than |b| elements, you can immediately stop and return b. This can easily be generalised to finding the top k matches.

I don't see any way to do this in less than O(C) per query, but I have some ideas on how to maximize efficiency. The idea is basically to build a lookup table for each element. If some elements are rare and some are common, you can have positive and negative lookup tables:
s[i] // your query, an array of size 2 thousand, true/false
sign[i] // whether the ith element is positive/negative lookup. +/- 1
sets[i] // a list of all the sets that the ith element belongs/(doesn't) to
query(s):
overlaps[i] // an array of size C, initialized to 0's
for i in len(s):
if s[i]:
for j in sets[i]:
overlaps[j] += sign[i]
return max_index(overlaps)
Especially if many of your elements are of widely differing probabilities (as you said), this approach should save you some time: very rare or very common elements can be dealt with almost instantly.
To further optimize: you can sort the structure so that the elements that are most common/most rare are dealt with first. After you have done the first e.g. 3/4, you can do a quick pass to see if the closest matching set is so far ahead of the next set that it is not necessary to continue, though again whether that is worthwhile depends on the details of your data's distribution.
Yet another refinement: make sets[i] one of two possible structures: if the element is very rare or common, sets[i] is just a list of the sets that the ith element is in/not in. However, suppose the ith element is in half the sets. Then sets[i] is just a list of indices half as long as the number of sets, looping through it and incrementing overlaps is wasteful. Have a third value for sign[i]: if sign[i] == 0, then the ith element is relatively close to 50% commonality (this may just mean between 5% and 95%, or anything else), and instead of a list of sets in which it appears, it will simply be an array of 1's and 0's with length equal to C. Then you would just add the array in its entirety to overlaps which would be faster.

Put all of your elements, from the million sets into a Hashtable. The key will be the element, the value will be a set of indexes that point to a containing set.
HashSet<Element>[] AllSets = ...
// preprocess
Hashtable AllElements = new Hashtable(2000);
for(var index = 0; index < AllSets.Count; index++) {
foreach(var elm in AllSets[index]) {
if(!AllElements.ContainsKey(elm)) {
AllElements.Add(elm, new HashSet<int>() { index });
} else {
((HashSet<int>)AllElements[elm]).Add(index);
}
}
}
public List<HashSet<Element>> TopIntersect(HashSet<Element> set, int top = 1) {
// <index, count>
Dictionar<int, int> counts = new Dictionary<int, int>();
foreach(var elm in set) {
var setIndices = AllElements[elm] As HashSet<int>;
if(setIndices != null) {
foreach(var index in setIndices) {
if(!counts.ContainsKey(index)) {
counts.Add(index, 1);
} else {
counts[index]++;
}
}
}
}
return counts.OrderByDescending(kv => kv.Value)
.Take(top)
.Select(kv => AllSets[kv.Key]).ToList();
}

What data structure is conducive to discrete sampling? [duplicate]

Recently I needed to do weighted random selection of elements from a list, both with and without replacement. While there are well known and good algorithms for unweighted selection, and some for weighted selection without replacement (such as modifications of the resevoir algorithm), I couldn't find any good algorithms for weighted selection with replacement. I also wanted to avoid the resevoir method, as I was selecting a significant fraction of the list, which is small enough to hold in memory.
Does anyone have any suggestions on the best approach in this situation? I have my own solutions, but I'm hoping to find something more efficient, simpler, or both.

One of the fastest ways to make many with replacement samples from an unchanging list is the alias method. The core intuition is that we can create a set of equal-sized bins for the weighted list that can be indexed very efficiently through bit operations, to avoid a binary search. It will turn out that, done correctly, we will need to only store two items from the original list per bin, and thus can represent the split with a single percentage.
Let's us take the example of five equally weighted choices, (a:1, b:1, c:1, d:1, e:1)
To create the alias lookup:
Normalize the weights such that they sum to 1.0. (a:0.2 b:0.2 c:0.2 d:0.2 e:0.2) This is the probability of choosing each weight.
Find the smallest power of 2 greater than or equal to the number of variables, and create this number of partitions, |p|. Each partition represents a probability mass of 1/|p|. In this case, we create 8 partitions, each able to contain 0.125.
Take the variable with the least remaining weight, and place as much of it's mass as possible in an empty partition. In this example, we see that a fills the first partition. (p1{a|null,1.0},p2,p3,p4,p5,p6,p7,p8) with (a:0.075, b:0.2 c:0.2 d:0.2 e:0.2)
If the partition is not filled, take the variable with the most weight, and fill the partition with that variable.
Repeat steps 3 and 4, until none of the weight from the original partition need be assigned to the list.
For example, if we run another iteration of 3 and 4, we see
(p1{a|null,1.0},p2{a|b,0.6},p3,p4,p5,p6,p7,p8) with (a:0, b:0.15 c:0.2 d:0.2 e:0.2) left to be assigned
At runtime:
Get a U(0,1) random number, say binary 0.001100000
bitshift it lg2(p), finding the index partition. Thus, we shift it by 3, yielding 001.1, or position 1, and thus partition 2.
If the partition is split, use the decimal portion of the shifted random number to decide the split. In this case, the value is 0.5, and 0.5 < 0.6, so return a.
Here is some code and another explanation, but unfortunately it doesn't use the bitshifting technique, nor have I actually verified it.

A simple approach that hasn't been mentioned here is one proposed in Efraimidis and Spirakis. In python you could select m items from n >= m weighted items with strictly positive weights stored in weights, returning the selected indices, with:
import heapq
import math
import random
def WeightedSelectionWithoutReplacement(weights, m):
elt = [(math.log(random.random()) / weights[i], i) for i in range(len(weights))]
return [x[1] for x in heapq.nlargest(m, elt)]
This is very similar in structure to the first approach proposed by Nick Johnson. Unfortunately, that approach is biased in selecting the elements (see the comments on the method). Efraimidis and Spirakis proved that their approach is equivalent to random sampling without replacement in the linked paper.

Here's what I came up with for weighted selection without replacement:
def WeightedSelectionWithoutReplacement(l, n):
"""Selects without replacement n random elements from a list of (weight, item) tuples."""
l = sorted((random.random() * x[0], x[1]) for x in l)
return l[-n:]
This is O(m log m) on the number of items in the list to be selected from. I'm fairly certain this will weight items correctly, though I haven't verified it in any formal sense.
Here's what I came up with for weighted selection with replacement:
def WeightedSelectionWithReplacement(l, n):
"""Selects with replacement n random elements from a list of (weight, item) tuples."""
cuml = []
total_weight = 0.0
for weight, item in l:
total_weight += weight
cuml.append((total_weight, item))
return [cuml[bisect.bisect(cuml, random.random()*total_weight)] for x in range(n)]
This is O(m + n log m), where m is the number of items in the input list, and n is the number of items to be selected.

I'd recommend you start by looking at section 3.4.2 of Donald Knuth's Seminumerical Algorithms.
If your arrays are large, there are more efficient algorithms in chapter 3 of Principles of Random Variate Generation by John Dagpunar. If your arrays are not terribly large or you're not concerned with squeezing out as much efficiency as possible, the simpler algorithms in Knuth are probably fine.

It is possible to do Weighted Random Selection with replacement in O(1) time, after first creating an additional O(N)-sized data structure in O(N) time. The algorithm is based on the Alias Method developed by Walker and Vose, which is well described here.
The essential idea is that each bin in a histogram would be chosen with probability 1/N by a uniform RNG. So we will walk through it, and for any underpopulated bin which would would receive excess hits, assign the excess to an overpopulated bin. For each bin, we store the percentage of hits which belong to it, and the partner bin for the excess. This version tracks small and large bins in place, removing the need for an additional stack. It uses the index of the partner (stored in bucket[1]) as an indicator that they have already been processed.
Here is a minimal python implementation, based on the C implementation here
def prep(weights):
data_sz = len(weights)
factor = data_sz/float(sum(weights))
data = [[w*factor, i] for i,w in enumerate(weights)]
big=0
while big<data_sz and data[big][0]<=1.0: big+=1
for small,bucket in enumerate(data):
if bucket[1] is not small: continue
excess = 1.0 - bucket[0]
while excess > 0:
if big==data_sz: break
bucket[1] = big
bucket = data[big]
bucket[0] -= excess
excess = 1.0 - bucket[0]
if (excess >= 0):
big+=1
while big<data_sz and data[big][0]<=1: big+=1
return data
def sample(data):
r=random.random()*len(data)
idx = int(r)
return data[idx][1] if r-idx > data[idx][0] else idx
Example usage:
TRIALS=1000
weights = [20,1.5,9.8,10,15,10,15.5,10,8,.2];
samples = [0]*len(weights)
data = prep(weights)
for _ in range(int(sum(weights)*TRIALS)):
samples[sample(data)]+=1
result = [float(s)/TRIALS for s in samples]
err = [a-b for a,b in zip(result,weights)]
print(result)
print([round(e,5) for e in err])
print(sum([e*e for e in err]))

The following is a description of random weighted selection of an element of a
set (or multiset, if repeats are allowed), both with and without replacement in O(n) space
and O(log n) time.
It consists of implementing a binary search tree, sorted by the elements to be
selected, where each node of the tree contains:
the element itself (element)
the un-normalized weight of the element (elementweight), and
the sum of all the un-normalized weights of the left-child node and all of
its children (leftbranchweight).
the sum of all the un-normalized weights of the right-child node and all of
its chilren (rightbranchweight).
Then we randomly select an element from the BST by descending down the tree. A
rough description of the algorithm follows. The algorithm is given a node of
the tree. Then the values of leftbranchweight, rightbranchweight,
and elementweight of node is summed, and the weights are divided by this
sum, resulting in the values leftbranchprobability,
rightbranchprobability, and elementprobability, respectively. Then a
random number between 0 and 1 (randomnumber) is obtained.
if the number is less than elementprobability,
remove the element from the BST as normal, updating leftbranchweight
and rightbranchweight of all the necessary nodes, and return the
element.
else if the number is less than (elementprobability + leftbranchweight)
recurse on leftchild (run the algorithm using leftchild as node)
else
recurse on rightchild
When we finally find, using these weights, which element is to be returned, we either simply return it (with replacement) or we remove it and update relevant weights in the tree (without replacement).
DISCLAIMER: The algorithm is rough, and a treatise on the proper implementation
of a BST is not attempted here; rather, it is hoped that this answer will help
those who really need fast weighted selection without replacement (like I do).

This is an old question for which numpy now offers an easy solution so I thought I would mention it. Current version of numpy is version 1.2 and numpy.random.choice allows the sampling to be done with or without replacement and with given weights.

Suppose you want to sample 3 elements without replacement from the list ['white','blue','black','yellow','green'] with a prob. distribution [0.1, 0.2, 0.4, 0.1, 0.2]. Using numpy.random module it is as easy as this:
import numpy.random as rnd
sampling_size = 3
domain = ['white','blue','black','yellow','green']
probs = [.1, .2, .4, .1, .2]
sample = rnd.choice(domain, size=sampling_size, replace=False, p=probs)
# in short: rnd.choice(domain, sampling_size, False, probs)
print(sample)
# Possible output: ['white' 'black' 'blue']
Setting the replace flag to True, you have a sampling with replacement.
More info here:
http://docs.scipy.org/doc/numpy/reference/generated/numpy.random.choice.html#numpy.random.choice

We faced a problem to randomly select K validators of N candidates once per epoch proportionally to their stakes. But this gives us the following problem:
Imagine probabilities of each candidate:
0.1
0.1
0.8
Probabilities of each candidate after 1'000'000 selections 2 of 3 without replacement became:
0.254315
0.256755
0.488930
You should know, those original probabilities are not achievable for 2 of 3 selection without replacement.
But we wish initial probabilities to be a profit distribution probabilities. Else it makes small candidate pools more profitable. So we realized that random selection with replacement would help us – to randomly select >K of N and store also weight of each validator for reward distribution:
std::vector<int> validators;
std::vector<int> weights(n);
int totalWeights = 0;
for (int j = 0; validators.size() < m; j++) {
int value = rand() % likehoodsSum;
for (int i = 0; i < n; i++) {
if (value < likehoods[i]) {
if (weights[i] == 0) {
validators.push_back(i);
}
weights[i]++;
totalWeights++;
break;
}
value -= likehoods[i];
}
}
It gives an almost original distribution of rewards on millions of samples:
0.101230
0.099113
0.799657

Permutations with extra restrictions

I have a set of items, for example: {1,1,1,2,2,3,3,3}, and a restricting set of sets, for example {{3},{1,2},{1,2,3},{1,2,3},{1,2,3},{1,2,3},{2,3},{2,3}. I am looking for permutations of items, but the first element must be 3, and the second must be 1 or 2, etc.
One such permutation that fits is:
{3,1,1,1,2,2,3}
Is there an algorithm to count all permutations for this problem in general? Is there a name for this type of problem?
For illustration, I know how to solve this problem for certain types of "restricting sets".
Set of items: {1,1,2,2,3}, Restrictions {{1,2},{1,2,3},{1,2,3},{1,2},{1,2}}. This is equal to 2!/(2-1)!/1! * 4!/2!/2!. Effectively permuting the 3 first, since it is the most restrictive and then permuting the remaining items where there is room.
Also... polynomial time. Is that possible?
UPDATE: This is discussed further at below links. The problem above is called "counting perfect matchings" and each permutation restriction above is represented by a {0,1} on a matrix of slots to occupants.
https://math.stackexchange.com/questions/519056/does-a-matrix-represent-a-bijection
https://math.stackexchange.com/questions/509563/counting-permutations-with-additional-restrictions
https://math.stackexchange.com/questions/800977/parking-cars-and-vans-into-car-van-and-car-van-parking-spots

All of the other solutions here are exponential time--even for cases that they don't need to be. This problem exhibits similar substructure, and so it should be solved with dynamic programming.
What you want to do is write a class that memoizes solutions to subproblems:
class Counter {
struct Problem {
unordered_multiset<int> s;
vector<unordered_set<int>> v;
};
int Count(Problem const& p) {
if (m.v.size() == 0)
return 1;
if (m.find(p) != m.end())
return m[p];
// otherwise, attack the problem choosing either choosing an index 'i' (notes below)
// or a number 'n'. This code only illustrates choosing an index 'i'.
Problem smaller_p = p;
smaller_p.v.erase(v.begin() + i);
int retval = 0;
for (auto it = p.s.begin(); it != p.s.end(); ++it) {
if (smaller_p.s.find(*it) == smaller_p.s.end())
continue;
smaller_p.s.erase(*it);
retval += Count(smaller_p);
smaller_p.s.insert(*it);
}
m[p] = retval;
return retval;
}
unordered_map<Problem, int> m;
};
The code illustrates choosing an index i, which should be chosen at a place where there are v[i].size() is small. The other option is to choose a number n, which should be one for which there are few locations v that it can be placed in. I'd say the minimum of the two deciding factors should win.
Also, you'll have to define a hash function for Problem -- that shouldn't be too hard using boost's hash stuff.
This solution can be improved by replacing the vector with a set<>, and defining a < operator for unordered_set. This will collapse many more identical subproblems into a single map element, and further reduce mitigate exponential blow-up.
This solution can be further improved by making Problem instances that are the same except that the numbers are rearranged hash to the same value and compare to be the same.

You might consider a recursive solution that uses a pool of digits (in the example you provide, it would be initialized to {1,1,1,2,2,3,3,3}), and decides, at the index given as a parameter, which digit to place at this index (using, of course, the restrictions that you supply).
If you like, I can supply pseudo-code.

You could build a tree.
Level 0: Create a root node.
Level 1: Append each item from the first "restricting set" as children of the root.
Level 2: Append each item from the second restricting set as children of each of the Level 1 nodes.
Level 3: Append each item from the third restricting set as children of each of the Level 2 nodes.
...
The permutation count is then the number of leaf nodes of the final tree.
Edit
It's unclear what is meant by the "set of items" {1,1,1,2,2,3,3,3}. If that is meant to constrain how many times each value can be used ("1" can be used 3 times, "2" twice, etc.) then we need one more step:
Before appending a node to the tree, remove the values used on the current path from the set of items. If the value you want to append is still available (e.g. you want to append a "1", and "1" has only been used twice so far) then append it to the tree.

To save space, you could build a directed graph instead of a tree.
Create a root node.
Create a node for each item in the
first set, and link from the root to
the new nodes.
Create a node for each item in the
second set, and link from each first
set item to each second set item.
...
The number of permutations is then the number of paths from the root node to the nodes of the final set.

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio