Storing a specific set of numbers in ruby - ruby

How do you generate some sort of checksum of an array of 5 numbers that would distinguish a set of numbers from another?
For example:
[ 1, 2, 3, 4, 5] has the same checksum as [ 2, 3, 4, 5, 1]
I want to generate millions of 5 digit combinations and compare them against a predetermined set of numbers. I want to be able to checksum the ones I generate, and then compare them against a bank of numbers I've already generated.
Let me explain:
I create an array of numbers as an array
I generate 6 numbers in an array using Rand()
I compare the generated numbers to the array I created, exit if they match
If the numbers don't match, create a hash that I can compare future arrays to. The arrangement of the numbers inside of the array do not matter.
I thought about using an md5sum, but then if the elements change inside, then the md5 would be the same.
I could just store the arrays in memory, but I'm trying to minimize the amount of numbers I store in memory

[1,2,3,4,5].sort.hash #=> 1777030444607087813
[2,3,4,5,1].sort.hash #=> 1777030444607087813
should make the sets distinguishable.
Also should be a more memory-friendly solution because
5.size #=> 8
1777030444607087813.size #=> 8

The problem with hashes is that you always need to worry about collisions. Here is a way to make sure each value is unique, (and it's even O(N))
require 'prime'
pr = Prime.take(10)
[ 1, 2, 3, 4, 5].map{|x| pr[x]}.reduce(&:*)
=> 15015
[ 2, 3, 4, 5, 1].map{|x| pr[x]}.reduce(&:*)
=> 15015

Related

How do I solve this question about Pigeonhole Principle (Discrete Mathematics)?

I am not understanding the following question. I mean I want to know the sample input output for this problem question: "The pigeonhole principle states that if a function f has n distinct inputs but less than n distinct outputs,then there exist two inputs a and b such that a!=b and f(a)=f(b). Present an algorithm to find a and b such that f(a)=f(b). Assume that the function inputs are 1,2,......,and n.?"
I am unable to solve this problem as I am not understanding the question clearly. looking for your help.
The pigeonhole principle says that if you have more items than boxes, at least one of the boxes must have multiple items in it.
If you want to find which items a != b have the property f(a) == f(b), a straightforward approach is to use a hashmap data structure. Use the function value f(x) as key to store the item value x. Iterate through the items, x=1,...,n. If there is no entry at f(x), store x. If there is, the current value of x and the value stored at f(x) are a pair of the type you're seeking.
In pseudocode:
h = {} # initialize an empty hashmap
for x in 1,...,n
if h[f(x)] is empty
h[f(x)] <- x # store x in the hashmap indexed by f(x)
else
(x, h[f(x)]) qualify as a match # do what you want with them
If you want to identify all pigeons who have roommates, initialize the hashmap with empty sets. Then iterate through the values and append the current value x to the set indexed by f(x). Finally, iterate through the hashmap and pick out all sets with more than one element.
Since you didn't specify a language, for the fun of it I decided to implement the latter algorithm in Ruby:
N = 10 # number of pigeons
# Create an array of value/function pairs.
# Using N-1 for range of rand guarantees at least one duplicate random
# number, and with the nature of randomness, quite likely more.
value_and_f = Array.new(N) { |index| [index, rand(N-1)]}
h = {} # new hash
puts "Value/function pairs..."
p value_and_f # print the value/function pairs
value_and_f.each do |x, key|
h[key] = [] unless h[key] # create an array if none exists for this key
h[key] << x # append the x to the array associated with this key
end
puts "\nConfirm which values share function mapping"
h.keys.each { |key| p h[key] if h[key].length > 1 }
Which produces the following output, for example:
Value/function pairs...
[[0, 0], [1, 3], [2, 1], [3, 6], [4, 7], [5, 4], [6, 0], [7, 1], [8, 0], [9, 3]]
Confirm which values share function mapping
[0, 6, 8]
[1, 9]
[2, 7]
Since this implementation uses randomness, it will produce different results each time you run it.
Well let's go step by step.
I have 2 boxes. My father gave me 3 chocolates....
And I want to put those chocolates in 2 boxes. For our benefit let's name the chocolate a,b,c.
So how many ways we can put them?
[ab][c]
[abc][]
[a][bc]
And you see something strange? There is atleast one box with more than 1 chocolate.
So what do you think?
You can try this with any number of boxes and chocolates ( more than number of boxes) and try this. You will see that it's right.
Well let's make it more easy:
I have 5 friends 3 rooms. We are having a party. And now let's see what happens. (All my friends will sit in any of the room)
I am claiming that there will be atleast one room where there will be more than 1 friend.
My friends are quite mischievious and knowing this they tried to prove me wrong.
Friend-1 selects room-1.
Friend-2 thinks why room-1? Then I will be correct so he selects room-2
Friend-3 also thinks same...he avoids 1 and 2 room and get into room-3
Friend-4 now comes and he understands that there is no other empty room and so he has to enter some room. And thus I become correct.
So you understand the situation?
There n friends (funtions) but unfortunately or (fortunately) their rooms (output values) are less than n. So ofcourse one of the there exists 2 friend of mine a and b who shares the same room.( same value f(a)=f(b))
Continuing what https://stackoverflow.com/a/42254627/7256243 said.
Lets say that you map an array A of length N to an array B with length N-1.
Than the result could be an array B; were for 1 index you would have 2 elements.
A = {1,2,3,4,5,6}
map A -> B
Were a possible solution could be.
B= {1,2,{3,4},5,6}
The mapping of A -> could be done in any number of ways.
Here in this example both input index of 3 and 4 in Array A have the same index in array B.
I hope this usefull.

Dividing an array into equally weighted subarrays

Algorithm question here.
I have an unordered array containing product weights, e.g. [3, 2, 5, 5, 8] which need to be divided up into smaller arrays.
Rules:
REQUIRED: Should return 1 or more arrays.
REQUIRED: No array should sum to more than 12.
REQUIRED: Return minimum possible number of arrays, ex. total weight of example above is 23, which can fit into two arrays.
IDEALLY: Arrays should be weighted as evenly as possible.
In the example above, the ideal return would be [ [3, 8], [2, 5, 5] ]
My current thoughts:
Number of arrays to return will be (sum(input_array) / 12).ceil
A greedy algorithm could work well enough?
This is a combination of the bin packing problem and multiprocessor scheduling problem. Both are NP-hard.
Your three requirements constitute the bin packing problem: find the minimal number of bins of a fixed size (12) that fit all the numbers.
Once you solve that, you have the multiprocessor scheduling problem: given a fixed number of bins, what is the most even way to distribute the numbers among them.
There are number of well-known approximate algorithms for both problems.
How about something with an entirely different take on it. Something really simple. Like this, which is based on common horse sense:
module Splitter
def self.split(values, max_length = 12)
# if the sum of all values is lower than the max_length there
# is no point in continuing
return values unless self.sum(values) > max_length
optimized = []
current = []
# start off by ordering the values. perhaps it's a good idea
# to start off with the smallest values first; this will result
# in gathering as much values as possible in the first array. This
# in order to conform to the rule "Should return minimum possible
# number of arrays"
ordered_values = values.sort
ordered_values.each do |v|
if self.sum(current) + v > max_length
# finish up the current iteration if we've got an optimized pair
optimized.push(current)
# reset for the next iteration
current = []
end
current.push(v)
end
# push the last iteration
optimized.push(current)
return optimized
end
# calculates the sum of a collection of numbers
def self.sum(numbers)
if numbers.empty?
return 0
else
return numbers.inject{|sum,x| sum + x }
end
end
end
Which can be used like so:
product_weights = [3, 2, 5, 5, 8]
p Splitter.split(product_weights)
The output will be:
[[2, 3, 5], [5], [8]]
Now, as said before, this is a really simple sample. And I've excluded the validations for empty or non-numeric values in the array for brevity. But it does seem to conform to your primary requirements:
Splitting the (expected: all numeric) values into arrays
With a ceiling per array, defaulting in the sample to 12
Return minimum amount of arrays by collection the smallest numbers first EDIT: after the edit from the comments, this indeed doesn't work
I do have some doubts regarding the comment on "returning minimum possible number of arrays, and balance the weights throughout those as evenly as possible". I'm sure someone else will come up with an implementation of a better and math-proven algorithm that conforms to that requirement, but perhaps this is at least a suitable example for the discussion?

Data Structure / Hash Function to link Sets of Ints to Value

Given n integer id's, I wish to link all possible sets of up to k id's to a constant value. What I'm looking for is a way to translate sets (e.g. {1, 5}, {1, 3, 5} and {1, 2, 3, 4, 5, 6, 7}) to unique values.
Guarantees:
n < 100 and k < 10 (again: set sizes will range in [1, k]).
The order of id's doesn't matter: {1, 5} == {5, 1}.
All combinations are possible, but some may be excluded.
All sets and values are constant and made only once. No deletes or inserts, no value updates.
Once generated, the only operations taking place will be look-ups.
Look-ups will be frequent and one-directional (given set, look up value).
There is no need to sort (or otherwise organize) the values.
Additionally, it would be nice (but not obligatory) if "neighboring" sets (drop one id, add one id, swap one id, etc) are easy to reach, as well as "all sets that include at least this set".
Any ideas?
Enumerate using the product of primes.
a -> 2
b -> 3
c -> 5
d -> 7
et cetera
Now hash(ab) := 6, and hash (abc) := 30
And a nice side effect is that, if "ab" is a subset of "abc", then:
hash(abc) % hash(ab) == 0
and
hash(abc) / hash(ab) == hash(c)
The bad news: You might run into overflow, the 100th prime will probably be around 1000, and 64 bits cannot accomodate 1000**10. This will not affect the functioning as a hash function; only the subset thingy will fail to work. the same method applied to anagrams
The other option is Zobrist-hashing. It is equivalent to the the primes method, but instead of primes you use a fixed set of (random) numbers, and instead of multiplying you use XOR.
For a fixed small (it needs << ~70 bits) set like yours, it might be possible to tune the zobrist tables to totally avoid collisions (yielding a perfect hash).
And the final (and simplest) way is to use a (100bit) bitmap, and treat that as a hashvalue (maybe after modulo table size)
And a totally unrelated method is to just build a decision tree on the bits of the bitmap. (the tree would have a maximal depth of k) a related kD tree on bit values
May be not the best solution, but you can do the following:
Sort the set from Lowest to highest with a simple IntegerComparator
Add each item of the set to a String
so if you have {2,5,9,4} first Step->{2,4,5,9}; second->"2459"
This way you will get a unique String from a unique set. If you really need to map them to an integer value, you can hash the string after that.
A second way I can think of is to store them in a java Set and simply map it against a HashMap with set as keys
Calculate a 'diff' from each set {1, 6, 87, 89} = {1,5,81,2,0,0,...}
{1,2,3,4} = { 1,1,1,1,0,0,0,0... };
Then binary encode each number with a variable length encoding and concatenate the bits.
It's hard to compare the sets (except for the first few equal bits), but because there can't be many large intervals in a set, all possible values just might fit into 64 bits. (slack of 16 bits at least...)

Sum of cells with several possibilities

I'm programming a Killer Sudoku Solver in Ruby and I try to take human strategies and put them into code. I have implemented about 10 strategies but I have a problem on this one.
In killer sudoku, we have "zones" of cells and we know the sum of these cells and we know possibilities for each cell.
Example :
Cell 1 can be 1, 3, 4 or 9
Cell 2 can be 2, 4 or 5
Cell 3 can be 3, 4 or 9
The sum of all cells must be 12
I want my program to try all possibilities to eliminate possibilities. For instance, here, cell 1 can't be 9 because you can't make 3 by adding two numbers possible in cells 2 and 3.
So I want that for any number of cells, it removes the ones that are impossible by trying them and seeing it doesn't work.
How can I get this working ?
There's multiple ways to approach the general problem of game solving, and emulating human strategies is not always the best way. That said, here's how you can solve your question:
1st way, brute-forcy
Basically, we want to try all possibilities of the combinations of the cells, and pick the ones that have the correct sum.
cell_1 = [1,3,4,9]
cell_2 = [2,4,5]
cell_3 = [3,4,9]
all_valid_combinations = cell_1.product(cell_2,cell_3).select {|combo| combo.sum == 12}
# => [[1, 2, 9], [3, 5, 4], [4, 4, 4], [4, 5, 3]]
#.sum isn't a built-in function, it's just used here for convenience
to pare this down to individual cells, you could do:
cell_1 = all_valid_combinations.map {|combo| combo[0]}.uniq
# => [1, 3, 4]
cell_2 = all_valid_combinations.map {|combo| combo[1]}.uniq
# => [2, 5, 4]
. . .
if you don't have a huge large set of cells, this way is easier to code. it can get a bit inefficienct though. For small problems, this is the way I'd use.
2nd way, backtracking search
Another well known technique takes the problem from the other approach. Basically, for each cell, ask "Can this cell be this number, given the other cells?"
so, starting with cell 1, can the number be 1? to check, we see if cells 2 and 3 can sum to 11. (12-1)
* can cell 2 have the value 2? to check, can cell 3 sum to 9 (11-1)
and so on. In very large cases, where you could have many many valid combinations, this will be slightly faster, as you can return 'true' on the first time you find a valid number for a cell. Some people find recursive algorithms a bit harder to grok, though, so your mileage may vary.

Searching through arrays in Ruby using a range

i've seen a lot of other questions touch on the subject but nothing as on topic as to provide an answer for my particular problem. Is there a way to search an array and return values within a given range...
for clarity I have one array = [0,5,12]
I would like to compare array to another array (array2) using a range of numbers.
Using array[0] as a starting point how would I return all values from array2 +/- 4 of array[0].
In this particular case the returned numbers from array2 will be within the range of -4 and 4.
Thanks for the help ninjas.
Build a Range that is your target ±4 and then use Enumerable#select (remember that Array includes Enumerable) and Range#include?.
For example, let us look for 11±4 in an array that contains the integers between 1 and 100 (inclusive):
a = (1..100).to_a
r = 11-4 .. 11+4
a.select { |i| r.include?(i) }
# [7, 8, 9, 10, 11, 12, 13, 14, 15]
If you don't care about preserving order in your output and you don't have any duplicates in your array you could do it this way:
a & (c-w .. c+w).to_a
Where c is the center of your interval and w is the interval's width. Using Array#& treats the arrays as sets so it will remove duplicates and is not guaranteed to preserver order.

Resources