Retrieve known index of a pseudo-random sequence - ruby

I have a set of exactly 16,704,200 unique objects. I need to construct a function f such that:
f(x) returns a seemingly random object from the list (but always the same object for a given value of x)
f(0) through f(16704199) returns the complete set of objects (no duplicates) in that seemingly random order
f doesn't need to store a list of 16,704,200 ordered integers
I've looked at several SO answers about using pseudo-random number generators or linear feedback shift registers to generate sequences of random numbers. The disadvantage there would be the only way to find the value of f(7000) would be to initialize the register, loop 7000 times, and then return the number. (Unless I stored the entire pre-generated sequence, which as stated above I'd prefer not to do.)
Are there any algorithms better suited to finding the 7000th (xth) entry in a randomized sequence?

You can use a Linear Congruential Generator - this type of PRNG is considered very crude nowadays for any purpose requiring statistical randomness, but does have an advantage in your case that it can be made to repeat a specific sequence of known size. It also happens to be reversible, and this is related to your requirement of 1-to-1 mapping between sequence id and selected index id.
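First, pick a couple of prime numbers, somewhere between 60% and 80% of your total size N. The multiplier A must share no factor with N for the mapping to be one-to-one; a prime that does not divide N guarantees that.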
N = 16_704_200
A = 9_227_917
C = 11_979_739
You can use the Prime module to find your numbers. You can even select them using a PRNG, and only store the prime numbers that you need.
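The one-to-one property depends on the multiplier having no common factor with N, which is cheap to check before committing to a pair of constants. A minimal sketch of that check (the helper name one_to_one_multiplier? is hypothetical and not part of the answer; N and A are the values shown above):

```ruby
N = 16_704_200
A = 9_227_917

# A multiplier a maps 0...n onto itself without collisions
# only when a and n have no common factor.
def one_to_one_multiplier?(a, n)
  a.gcd(n) == 1
end

puts one_to_one_multiplier?(A, N)   # => true
puts one_to_one_multiplier?(17, N)  # => false, because 17 divides N

# With a bad multiplier two different inputs collide:
collides = (17 * 0) % N == (17 * (N / 17)) % N
puts collides                       # => true
```

The offset C does not affect whether the mapping is one-to-one; it only shifts the outputs.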
Now you have these values, you can implement the LCG algorithm, which is your desired f(x):
def lcg x
  ( A * x + C ) % N
end
A quick test:
lcg( 0 )
# => 11979739
lcg( 12345 )
# => 7971104
(0..9).map { |x| lcg( x) }
# => [ 11979739, 4503456, 13731373, 6255090, 15483007,
# 8006724, 530441, 9758358, 2282075, 11509992 ]
... well, it looks random, and if you feed back the output as the next input parameter then you have an "old school" (and very low quality) PRNG. But you can just use it for index_id = lcg( sequence_id ) to fetch your objects in a random-looking sequence.
Does it map the whole set of input values to the same set of output values?
(0...N).map { |x| lcg( x ) }.uniq.count
# => 16704200
Yes!
Although you don't need it, the algorithm can be reversed. Here's how to do it:
The tricky bit is figuring out the multiplicative inverse of A modulo N. Here is an example of how to do that I found.
AINVERSE = 9257653
# Test it:
( A * AINVERSE ) % N
# => 1
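One common way to obtain such an inverse is the extended Euclidean algorithm. The following is a minimal sketch under that assumption, not necessarily the method the answer linked to; the name mod_inverse is hypothetical, and N and A are the constants from above:

```ruby
N = 16_704_200
A = 9_227_917

# Extended Euclidean algorithm: returns x in 0...n with (a * x) % n == 1.
# Raises if a and n share a factor, because then no inverse exists.
def mod_inverse(a, n)
  old_r, r = a, n
  old_s, s = 1, 0
  until r.zero?
    q = old_r / r
    old_r, r = r, old_r - q * r
    old_s, s = s, old_s - q * s
  end
  raise ArgumentError, "#{a} has no inverse modulo #{n}" unless old_r == 1
  old_s % n
end

inverse = mod_inverse(A, N)
check = (A * inverse) % N
puts check  # => 1
```

Because the inverse is unique in the range 0...N, any correct method yields the same constant for a given A and N.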
Now you have these values, you can implement the LCG algorithm forwards and backwards:
def lcg_fwd x
  ( A * x + C ) % N
end
def lcg_rev x
  ( AINVERSE * ( x - C ) ) % N
end
Test it:
lcg_fwd( 0 )
# => 11979739
lcg_rev( 11979739 )
# => 0
lcg_fwd( 12345 )
# => 7971104
lcg_rev( 7971104 )
# => 12345

Perhaps a pre-seeded Random object might do the trick?
prng1 = Random.new(1234)
prng1.seed #=> 1234
prng1.rand(100) #=> 47
prng1.rand(99) #=> 83
prng2 = Random.new(prng1.seed)
prng2.rand(100) #=> 47
prng2.rand(99)   #=> 83
http://www.ruby-doc.org/core-2.1.1/Random.html
If you pick values large enough, duplicates become unlikely, although nothing guarantees uniqueness:
(1..1_000_000).map {|i| prng1.rand(1_000_000_000_000+i)}.uniq.size
=> 1000000

Related

How to get the partition with the least number of subsets that respects other partitions?

I have a set of elements and some arbitrary partitions of it.
I want to get a partition that divides the set into the smallest number of subsets and "respects" the previously existing partitions. By respecting, I mean:
Let A and B be partitions of set X. Partition A respects B if, for every two elements of X, e and f, if e and f are not in the same subset of X according to partition B, they are also not in the same subset of X according to partition A.
Example:
Set is {1,2,3,4,5,6}
Partition1 is {{1,2,3}, {4,5,6}}
Partition2 is {{1,2}, {3,4}, {5,6}}
A partition that would respect Partition1 and Partition2 (and any other partition) is the "every element in its subset" {{1},{2},{3},{4},{5},{6}} partition. However, I want the partition with the least number of subsets, in this case {{1,2}, {3},{4}, {5,6}}.
Is there an algorithm for this problem? I have googled quite a bit, but couldn’t find anything, probably because I am not able to describe it well. However, it sounds like a generic problem that should be common.
(another way to describe the problem from this Wikipedia article would be: “how to get the coarsest partition that is finer than a set of arbitrary partitions”)
https://en.wikipedia.org/wiki/Partition_of_a_set
I'll call the partition we're looking for the answer.
We'll build the answer as follows:
Take any element not in the answer.
Take the intersection of the subset containing this element from each partition.
Add this subset to the answer.
Repeat.
We'll have to go through these steps once per subset in the answer. At the end, every element will be in a unique subset in the answer, and these will be as coarse as possible.
If the allocation of elements to partitions is random, it is extremely unlikely that any respectful partition has any subsets with more than one element, given 8000 elements, 10 partitions, and 100 subsets per partition.
What are the odds of a particular pair of elements, say 1 & 2, being in the same subset in all 10 partitions? Well, in each partition the odds are about 1/100, and there are 10 of these, so 1 in 100 ^ 10 = 1 in 10 ^ 20.
But there are only choose(8000,2) pairs, which is just under 3.2 * 10 ^ 7, so the expected number of pairs that stay together in every partition is about 3.2 * 10 ^ -13.
TL;DR: Unless your partitions aren't random and something about their construction puts the same elements together in subsets far more often than pure chance, the respectful set is almost certain to be 8000 single-element subsets.
Here's the code I used. It's Ruby. The first method generates random partitions, and the second implements the algorithm above.
require 'set'
def get_partitions(num_elts, num_partitions, num_subsets_per_partition)
  elements = 0.upto(num_elts - 1).to_a
  partitions = []
  num_partitions.times do
    elements.shuffle!
    partition = []
    # pick distinct split points, always including both ends
    splits = Set.new([0, num_elts])
    while splits.size < num_subsets_per_partition + 1 do
      splits.add(rand(num_elts))
    end
    splits_arr = splits.to_a.sort
    0.upto(splits_arr.size - 2) do |i|
      cur_split = splits_arr[i]
      next_split = splits_arr[i+1]
      cur_set = (elements.slice(cur_split, next_split - cur_split)).to_set
      partition.append(cur_set)
    end
    partitions.append(partition)
  end
  return partitions
end
def find_respectful_partition(num_elts, partitions)
  elements_set = 0.upto(num_elts - 1).to_set
  elt_to_subsets = Hash.new { |h, k| h[k] = [] }
  partitions.each do |partition|
    partition.each do |subset|
      subset.each do |elt|
        elt_to_subsets[elt].append(subset)
      end
    end
  end
  answer = []
  while elements_set.size > 0 do
    elt = elements_set.first
    subsets_with_elt = elt_to_subsets[elt]
    respectful_subset = subsets_with_elt[0]
    subsets_with_elt.each do |subset_with_elt|
      respectful_subset = respectful_subset.intersection(subset_with_elt)
      break if respectful_subset.size == 1
    end
    answer.append(respectful_subset)
    elements_set.subtract(respectful_subset)
  end
  return answer
end
Here is some working Python:
def partition_to_lookup(partition):
    which_partition = {}
    i = 0
    for part in partition:
        for x in part:
            which_partition[x] = i
        i += 1
    return which_partition
def combine_partitions (partitions):
    lookups = [partition_to_lookup(partition) for partition in partitions]
    reverse_lookup = {}
    for x in lookups[0].keys():
        key = tuple((lookup[x] for lookup in lookups))
        if key in reverse_lookup:
            reverse_lookup[key].add(x)
        else:
            reverse_lookup[key] = {x}
    return list(reverse_lookup.values())
print(combine_partitions([[{1,2,3}, {4,5,6}], [{1,2}, {3,4}, {5,6}]]))
If N is the size of the universe, m is the number of partitions, and k the total number of all sets in all partitions, then this will be O(N*m + k).
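For comparison with the Ruby answer above, the same grouping idea can be sketched in Ruby: give each element a signature made of its subset index in every partition, then group elements by signature. This is an illustrative sketch only; it uses plain arrays instead of sets and reuses the name combine_partitions for convenience.

```ruby
# Each partition is an array of subsets; each subset is an array of elements.
def combine_partitions(partitions)
  # One lookup per partition: element => index of the subset containing it.
  lookups = partitions.map do |partition|
    lookup = {}
    partition.each_with_index do |subset, index|
      subset.each { |element| lookup[element] = index }
    end
    lookup
  end
  # Elements with identical signatures are never separated by any partition.
  lookups.first.keys.group_by { |element| lookups.map { |lookup| lookup[element] } }.values
end

p combine_partitions([[[1, 2, 3], [4, 5, 6]], [[1, 2], [3, 4], [5, 6]]])
# => [[1, 2], [3], [4], [5, 6]]
```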

How can I generate de Bruijn sequences iteratively?

I am looking for a way to generate a de Bruijn sequence iteratively instead of with recursion. My goal is to generate it character by character.
I found some example code in Python for generating de Bruijn sequences and translated it into Rust. I am not yet able to comprehend this technique well enough to create my own method.
Translated into Rust:
fn gen(sequence: &mut Vec<usize>, a: &mut [usize], t: usize, p: usize, k: usize, n: usize) {
    if t > n {
        if n % p == 0 {
            for x in 1..(p + 1) {
                sequence.push(a[x])
            }
        }
    } else {
        a[t] = a[t - p];
        gen(sequence, a, t + 1, p, k, n);
        for x in (a[t - p] + 1)..k {
            a[t] = x;
            gen(sequence, a, t + 1, t, k, n);
        }
    }
}
fn de_bruijn<T: Clone>(alphabet: &[T], n: usize) -> Vec<T> {
    let k = alphabet.len();
    let mut a = vec![0; n + 1];
    let vecsize = k.checked_pow(n as u32).unwrap();
    let mut sequence = Vec::with_capacity(vecsize);
    gen(&mut sequence, &mut a, 1, 1, k, n);
    sequence.into_iter().map(|x| alphabet[x].clone()).collect()
}
However this is not able to generate iteratively - it goes through a whole mess of recursion and iteration which is impossible to untangle into a single state.
Consider this approach:
Choose the first (lexicographically) representative from every necklace class
Here is Python code for generation of representatives for (binary) necklaces containing d ones (it is possible to repeat for all d values). Sawada article link
Sort representatives in lexicographic order
Make periodic reduction for every representative (if possible): if string is periodic s = p^m like 010101, choose 01
To find the period, it is possible to use string doubling or z-algorithm (I expect it's faster for compiled languages)
Concatenate reductions
Example for n=3,k=2:
Sorted representatives: 000, 001, 011, 111
Reductions: 0, 001, 011, 1
Result: 00010111
The same essential method (with C code) is described in Jörg Arndt's book "Matters Computational", chapter 18
A similar way is mentioned in the wiki:
"An alternative construction involves concatenating together, in lexicographic order, all the Lyndon words whose length divides n"
You might look for an efficient way to generate the appropriate Lyndon words.
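The Lyndon-word construction quoted above can be done with a loop and a single small array of state, with no recursion. The sketch below is written in Ruby rather than Rust and uses the standard successor rule for Lyndon words (usually attributed to Duval); the method name each_de_bruijn_digit is hypothetical, and digits are the integers 0...k rather than alphabet characters.

```ruby
# Yields the digits of a de Bruijn sequence B(k, n) one at a time.
# The only state kept between digits is the current word (at most n digits).
def each_de_bruijn_digit(k, n)
  return enum_for(:each_de_bruijn_digit, k, n) unless block_given?

  word = [-1]
  until word.empty?
    word[-1] += 1                              # next Lyndon word in lexicographic order
    word.each { |digit| yield digit } if (n % word.length).zero?
    period = word.length
    word << word[-period] while word.length < n              # repeat the word up to length n
    word.pop while !word.empty? && word.last == k - 1        # drop trailing maximal digits
  end
end

puts each_de_bruijn_digit(2, 3).to_a.join  # => 00010111
```

Calling the method without a block returns an Enumerator, so each_de_bruijn_digit(k, n).next hands out one digit per call, and mapping a digit to a character is a lookup into the alphabet.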
I am not familiar with Rust, so I programmed and tested it in Python. Since the poster translated the version in the question from a Python program, I hope it will not be a big issue.
# the following function treats list a as
# k-adic number with n digits
# and increments this number returning
# the index of the leftmost digit changed
def increment_a(a, k, n):
    digit= n-1
    a[digit]+= 1
    while a[digit] >= k and digit> 0:
        #a[digit]= 0
        a[digit]= a[0]+1
        a[digit-1]+= 1
        digit-= 1
    return digit
# the following function adds a to the sequence
# and takes into account, that the beginning of a
# could overlap with the end of sequence
# in that case, it just removes the overlapping digits
# from a before adding the remaining digits to sequence
def append_to_sequence(sequence, a, n):
    # here we can assume safely, that a
    # does not overlap completely with sequence[-n:]
    i= -1
    for i in range(n-1, -1, -1):
        found= True
        # check if the last i digits in sequence
        # overlap with the first i digits in a
        for j in range(i):
            if a[j] != sequence[-i+j]:
                # no, they don't overlap
                found= False
                break
        if found:
            # yes they overlap, so no need to
            # continue the check with a smaller i
            break
    # now we can just append everything from
    # digit i (digit 0 - i-1 are swallowed)
    sequence.extend(a[i:])
    return n-i
# during the operation we have to keep track of
# the k-adic numbers a, that already occurred in
# the sequence. We store them in a set called used
# every time we add something to the sequence
# we have to update it and add one entry for each
# digit inserted
def update_used(sequence, used, n, num_inserted):
    l= len(sequence)
    for i in range(num_inserted):
        used.add(tuple(sequence[-n-i:l-i]))
# the main work is done in the following function
# it creates and returns the generated sequence
def gen4(k, n):
    a= [0]*n
    sequence= a[:]
    used= set()
    # create a fake sequence to add the segments obtained by the cyclic nature
    fake= ([k-1] * (n-1))
    for i in range(n-1):
        fake.append(0)
        update_used(fake, used, n, 1)
    update_used(sequence, used, n, 1)
    valid= True
    while valid:
        # a is still a valid k-adic number
        # this means the generation process
        # has not ended
        # so construct a new number from the n-1
        # last digits of sequence
        # followed by a zero
        a= sequence[-n+1:]
        a.append(0)
        while valid and tuple(a) in used:
            # the constructed k-adic number a
            # was already used, so increment it
            # and try again
            increment_a(a, k, n)
            valid= a[0]<k
        if valid:
            # great, the number is still valid
            # and is not yet part of the sequence
            # so add it after removing the overlapping
            # digits and update the set with the segments
            # we already used
            num_inserted= append_to_sequence(sequence, a, n)
            update_used(sequence, used, n, num_inserted)
    return sequence
I tested the code above by generating some sequences with the original version of gen and this one using the same parameters. For all sets of parameters I tested, the result was the same in both versions.
Please note that this code is less efficient than the original version, especially if the sequence gets long. I guess the cost of the set operations has a non-linear influence on the runtime.
If you like, you can improve it further such as by using a more efficient way to store the used segments. Instead of operating on the k-adic representation (the a-list), you could use a multidimensional array instead.

Generating random number of length 6 with SecureRandom in Ruby

I tried SecureRandom.random_number(9**6) but it sometimes returns 5 and sometimes 6 digits. I'd want it to be a length of 6 consistently. I would also prefer it in the format like SecureRandom.random_number(9**6) without using syntax like 6.times.map so that it's easier to stub in my controller test.
You can do it with math:
(SecureRandom.random_number(9e5) + 1e5).to_i
Then verify:
100000.times.map do
  (SecureRandom.random_number(9e5) + 1e5).to_i
end.map { |v| v.to_s.length }.uniq
# => [6]
This produces values in the range 100000..999999:
10000000.times.map do
  (SecureRandom.random_number(9e5) + 1e5).to_i
end.minmax
# => [100000, 999999]
If you need this in a more concise format, just roll it into a method:
def six_digit_rand
  (SecureRandom.random_number(9e5) + 1e5).to_i
end
To generate a random, 6-digit string:
# This generates a 6-digit string, where the
# minimum possible value is "000000", and the
# maximum possible value is "999999"
SecureRandom.random_number(10**6).to_s.rjust(6, '0')
Here's more detail of what's happening, shown by breaking the single line into multiple lines with explaining variables:
# Calculate the upper bound for the random number generator
# upper_bound = 1,000,000
upper_bound = 10**6
# n will be an integer with a minimum possible value of 0,
# and a maximum possible value of 999,999
n = SecureRandom.random_number(upper_bound)
# Convert the integer n to a string
# unpadded_str will be "0" if n == 0
# unpadded_str will be "999999" if n == 999999
unpadded_str = n.to_s
# Pad the string with leading zeroes if it is less than
# 6 digits long.
# "0" would be padded to "000000"
# "123" would be padded to "000123"
# "999999" would not be padded, and remains unchanged as "999999"
padded_str = unpadded_str.rjust(6, '0')
See the docs for Ruby's SecureRandom; there are a lot of cool tricks there.
Specific to this question I would say: (SecureRandom.random_number * 1000000).to_i
Docs: random_number(n=0)
If 0 is given or an argument is not given, ::random_number returns a float: 0.0 <= ::random_number < 1.0.
Then shift by six decimal places (* 1000000) and truncate the decimals (.to_i). The result ranges from 0 to 999999, so it can have fewer than six digits unless you pad it.
If letters are okay, I prefer .hex:
SecureRandom.hex(3) #=> "e15b05"
Docs:
hex(n=nil)
::hex generates a random hexadecimal string.
The argument n specifies the length, in bytes, of the random number to
be generated. The length of the resulting hexadecimal string is twice
n.
If n is not specified or is nil, 16 is assumed. It may be larger in
future.
The result may contain 0-9 and a-f.
Other options:
SecureRandom.uuid #=> "3f780c86-6897-457e-9d0b-ef3963fbc0a8"
SecureRandom.urlsafe_base64 #=> "UZLdOkzop70Ddx-IJR0ABg"
For Rails apps creating a barcode or uid with an object you can do something like this in the object model file:
before_create :generate_barcode
def generate_barcode
  begin
    return if self.barcode.present?
    self.barcode = SecureRandom.hex.upcase
  end while self.class.exists?(barcode: barcode)
end
SecureRandom.random_number(n) gives a random value x with 0 <= x < n. You can get a fixed length with the rand function instead, although Kernel#rand is not cryptographically secure.
2.3.1 :025 > rand(10**5..10**6-1)
=> 742840
rand(a..b) gives a random number between a and b. Here, you always get a 6 digit random number between 10^5 and 10^6-1.
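If the secure source matters, the same range form can probably be kept with SecureRandom. A minimal sketch, assuming a Ruby version in which SecureRandom.random_number accepts a Range (older releases may only accept a number); the method name six_digit_secure is hypothetical:

```ruby
require 'securerandom'

# Returns an integer between 100000 and 999999 inclusive,
# so the decimal representation always has six digits.
def six_digit_secure
  SecureRandom.random_number(100_000..999_999)
end

code = six_digit_secure
puts code.to_s.length  # => 6
```

Because it is still a single SecureRandom.random_number call, it should be as easy to stub in a controller test as the original one-liner.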

Efficient partial permutation sort in Julia

I am dealing with a problem that requires a partial permutation sort by magnitude in Julia. If x is a vector of dimension p, then what I need are the first k indices corresponding to the k components of x that would appear first in a partial sort by absolute value of x.
Refer to Julia's sorting functions here. Basically, I want a cross between sortperm and select!. When Julia 0.4 is released, I will be able to obtain the same answer by applying sortperm! (this function) to the vector of indices and choosing the first k of them. However, using sortperm! is not ideal here because it will sort the remaining p-k indices of x, which I do not need.
What would be the most memory-efficient way to do the partial permutation sort? I hacked a solution by looking at the sortperm source code. However, since I am not versed in the ordering modules that Julia uses there, I am not sure if my approach is intelligent.
One important detail: I can ignore repeats or ambiguities here. In other words, I do not care about the ordering by abs() of indices for two components 2 and -2. My actual code uses floating point values, so exact equality never occurs for practical purposes.
# initialize a vector for testing
x = [-3,-2,4,1,0,-1]
x2 = copy(x)
k = 3 # num components desired in partial sort
p = 6 # num components in x, x2
# what are the indices that sort x by magnitude?
indices = sortperm(x, by = abs, rev = true)
# now perform partial sort on x2
select!(x2, k, by = abs, rev = true)
# check if first k components are sorted here
# should evaluate to "true"
isequal(x2[1:k], x[indices[1:k]])
# now try my partial permutation sort
# I only need indices2[1:k] at end of day!
indices2 = [1:p]
select!(indices2, 1:k, 1, p, Base.Perm(Base.ord(isless, abs, true, Base.Forward), x))
# same result? should evaluate to "true"
isequal(indices2[1:k], indices[1:k])
EDIT: With the suggested code, we can briefly compare performance on much larger vectors:
p = 10000; k = 100; # asking for largest 1% of components
x = randn(p); x2 = copy(x);
# run following code twice for proper timing results
@time {indices = sortperm(x, by = abs, rev = true); indices[1:k]};
@time {indices2 = [1:p]; select!(indices2, 1:k, 1, p, Base.Perm(Base.ord(isless, abs, true, Base.Forward), x))};
@time selectperm(x,k);
My output:
elapsed time: 0.048876901 seconds (19792096 bytes allocated)
elapsed time: 0.007016534 seconds (2203688 bytes allocated)
elapsed time: 0.004471847 seconds (1657808 bytes allocated)
The following version appears to be relatively space-efficient because it uses only an integer array of the same length as the input array:
function selectperm(x,k)
    # select! needs a scalar index rather than the range 1:1 when k == 1
    if k > 1
        kk = 1:k
    else
        kk = 1
    end
    z = collect(1:length(x))
    return select!(z, kk, by = (i)->abs(x[i]), rev = true)
end
x = [-3,-2,4,1,0,-1]
k = 3 # num components desired in partial sort
print (selectperm(x,k))
The output is:
[3,1,2]
... as expected.
I'm not sure if it uses less memory than the originally-proposed solution (though I suspect the memory usage is similar) but the code may be clearer and it does produce only the first k indices whereas the original solution produced all p indices.
(Edit)
selectperm() has been edited to deal with the BoundsError that occurs if k=1 in the call to select!().

Working with arbitrary inequalities and checking which, if any, are satisfied

Given a non-negative integer n and an arbitrary set of inequalities that are user-defined (in say an external text file), I want to determine whether n satisfies any inequality, and if so, which one(s).
Here is a points list.
n = 0: 1
n < 5: 5
n = 5: 10
If you draw a number n that's equal to 5, you get 10 points.
If n less than 5, you get 5 points.
If n is 0, you get 1 point.
The stuff left of the colon is the "condition", while the stuff on the right is the "value".
All entries will be of the form:
n1 op n2: val
In this system, equality takes precedence over inequality, so the order that they appear in will not matter in the end. The inputs are non-negative integers, though intermediate values and results may be negative. The results may not even be numbers (eg: could be strings). I have designed it so that it will only accept the most basic inequalities, to make it easier to write a parser (and to see whether this idea is feasible).
My program has two components:
a parser that will read structured input and build a data structure to store the conditions and their associated results.
a function that will take an argument (a non-negative integer) and return the result (or, as in the example, the number of points I receive)
If the list was hardcoded, that is an easy task: just use a case-when or if-else block and I'm done. But the problem isn't as easy as that.
Recall the list at the top. It can contain an arbitrary number of (in)equalities. Perhaps there's only 3 like above. Maybe there are none, or maybe there are 10, 20, 50, or even 1000000. Essentially, you can have m inequalities, for m >= 0
Given a number n and a data structure containing an arbitrary number of conditions and results, I want to be able to determine whether it satisfies any of the conditions and return the associated value. So as with the example above, if I pass in 5, the function will return 10.
The condition/value pairs are not unique in their raw form. You may have multiple instances of the same (in)equality but with different values. eg:
n = 0: 10
n = 0: 1000
n > 0: n
Notice the last entry: if n is greater than 0, then it is just whatever you got.
If multiple inequalities are satisfied (eg: n > 5, n > 6, n > 7), all of them should be returned. If that is not possible to do efficiently, I can return just the first one that satisfied it and ignore the rest. But I would like to be able to retrieve the entire list.
I've been thinking about this for a while and I'm thinking I should use two hash tables: the first one will store the equalities, while the second will store the inequalities.
Equality is easy enough to handle: Just grab the condition as a key and have a list of values. Then I can quickly check whether n is in the hash and grab the appropriate value.
However, for inequality, I am not sure how it will work. Does anyone have any ideas how I can solve this problem in as few computational steps as possible? It's clear that I can easily accomplish this in O(m) time for m (in)equalities: just run it through each (in)equality one by one. But what happens if this checking is done in real-time? (eg: updated constantly)
For example, it is pretty clear that if I have 100 inequalities and 99 of them check for values > 100 while the other one checks for value <= 100, I shouldn't have to bother checking those 99 inequalities when I pass in 47.
You may use any data structure to store the data. The parser itself is not included in the calculation because that will be pre-processed and only needs to be done once, but it may be problematic if it takes too long to parse the data.
Since I am using Ruby, I likely have more flexible options when it comes to "messing around" with the data and how it will be interpreted.
class RuleSet
  Rule = Struct.new(:op1,:op,:op2,:result) do
    def <=>(r2)
      # Op of "=" sorts before others
      [op=="=" ? 0 : 1, op2.to_i] <=> [r2.op=="=" ? 0 : 1, r2.op2.to_i]
    end
    def matches(n)
      @op2i ||= op2.to_i
      case op
        when "=" then n == @op2i
        when "<" then n < @op2i
        when ">" then n > @op2i
      end
    end
  end
  def initialize(text)
    @rules = text.each_line.map do |line|
      Rule.new *line.split(/[\s:]+/)
    end.sort
  end
  def value_for( n )
    if rule = @rules.find{ |r| r.matches(n) }
      rule.result=="n" ? n : rule.result.to_i
    end
  end
end
set = RuleSet.new( DATA.read )
-1.upto(8) do |n|
  puts "%2i => %s" % [ n, set.value_for(n).inspect ]
end
#=> -1 => 5
#=> 0 => 1
#=> 1 => 5
#=> 2 => 5
#=> 3 => 5
#=> 4 => 5
#=> 5 => 10
#=> 6 => nil
#=> 7 => 7
#=> 8 => nil
__END__
n = 0: 1
n < 5: 5
n = 5: 10
n = 7: n
I would parse the input lines and separate them into predicate/result pairs and build a hash of callable procedures (using eval - oh noes!). The "check" function can iterate through each predicate and return the associated result when one is true:
class PointChecker
  def initialize(input)
    @predicates = Hash[input.split(/\r?\n/).map do |line|
      parts = line.split(/\s*:\s*/)
      [Proc.new {|n| eval(parts[0].sub(/=/,'=='))}, parts[1].to_i]
    end]
  end
  def check(n)
    # collect the result of every predicate that holds for n
    @predicates.map { |p,r| p.call(n) ? r : nil }.compact
  end
end
Here is sample usage:
p = PointChecker.new <<__HERE__
n = 0: 1
n = 1: 2
n < 5: 5
n = 5: 10
__HERE__
p.check(0) # => [1, 5]
p.check(1) # => [2, 5]
p.check(2) # => [5]
p.check(5) # => [10]
p.check(6) # => []
Of course, there are many issues with this implementation. I'm just offering a proof-of-concept. Depending on the scope of your application you might want to build a proper parser and runtime (instead of using eval), handle input more generally/gracefully, etc.
I'm not spending a lot of time on your problem, but here's my quick thought:
Since the points list is always in the format n1 op n2: val, I'd just model the points as an array of hashes.
So first step is to parse the input point list into the data structure, an array of hashes.
Each hash would have values n1, op, n2, value
Then, for each data input you run through all of the hashes (all of the points) and handle each (determining if it matches to the input data or not).
Some tricks of the trade
Spend time in your parser handling bad input. Eg
n < = 1000 # no colon
n < : 1000 # missing n2
x < 2 : 10 # n1, n2 and val are either number or "n"
n # too short, missing :, n2, val
n < 1 : 10x # val is not a number and is not "n"
etc
Also politely handle non-numeric input data
Added
Re: n1 doesn't matter. Be careful, this could be a trick. Why wouldn't
5 < n : 30
be a valid points list item?
Re: multiple arrays of hashes, one array per operator, one hash per point list item -- sure that's fine. Since each op is handled in a specific way, handling the operators one by one is fine. But....ordering then becomes an issue:
Since you want multiple results returned from multiple matching point list items, you need to maintain the overall order of them. Thus I think one array of all the point lists would be the easiest way to do this.
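On the efficiency part of the question (not checking 99 rules that cannot match), one option is to keep the thresholds of each operator sorted and binary-search them, so a lookup costs roughly O(log m) plus the number of matches. The sketch below is only an illustration under stated assumptions: the class name ThresholdIndex and the [op, threshold, value] triple format are hypothetical, the left operand is always n, and results come back grouped by operator (equalities first) rather than in input order.

```ruby
# Hypothetical index over rules given as [op, threshold, value] triples.
class ThresholdIndex
  def initialize(rules)
    @equal = Hash.new { |hash, key| hash[key] = [] }
    less = []
    greater = []
    rules.each do |op, threshold, value|
      case op
      when "=" then @equal[threshold] << value
      when "<" then less << [threshold, value]
      when ">" then greater << [threshold, value]
      else raise ArgumentError, "unsupported operator: #{op.inspect}"
      end
    end
    @less = less.sort_by(&:first)
    @greater = greater.sort_by(&:first)
  end

  # Returns the values of every rule that n satisfies.
  def values_for(n)
    # "n < t" holds for every threshold t > n: a suffix of the sorted list.
    first_less = @less.bsearch_index { |threshold, _| threshold > n } || @less.length
    # "n > t" holds for every threshold t < n: a prefix of the sorted list.
    greater_count = @greater.bsearch_index { |threshold, _| threshold >= n } || @greater.length
    @equal.fetch(n, []) +
      @less.drop(first_less).map(&:last) +
      @greater.take(greater_count).map(&:last)
  end
end

index = ThresholdIndex.new([["=", 0, 1], ["<", 5, 5], ["=", 5, 10], [">", 100, "big"]])
p index.values_for(0)   # => [1, 5]
p index.values_for(47)  # => []
```

If the original input order of matching rules must be preserved, each triple could also carry its line number and the matches could be sorted by it before returning.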
