Summing and comparing arrays in MongoDB - ruby

I'm very new to MongoDB; so far I've only done simple things like storing and retrieving documents.
I have a collection of documents (thousands and growing), each with an embedded array of integers between 0 and 255 (the array can hold as many as 5000 integers).
Example Mongo Collection Data:
{
"name": "item1",
"values": [1, 93, 45, 67, 89, 1, 2, 32, 45]
},
{
"name": "item2",
"values": [1, 23, 45, 123, 1, 5, 89, 14, 22]
},
{
"name": "item3",
"values": [23, 1, 44, 78, 89, 22, 150, 23, 12]
},
{
"name": "item4",
"values": [90, 23, 11, 67, 29, 1, 2, 1, 45]
}
Comparison would be:
pseudo code:
distance = 0
for a in passed_in_item
for b in mongo_collection
distance += a - b
end
end
An example passed-in array (same shape as the ones in the Mongo documents; they will always be the same length):
[1, 93, 45, 67, 89, 1, 2, 32, 45]
I'd like to pass in an array of integers as a query and difference it against the array in the document to find the one with the least difference. Is this the sort of thing map reduce is good at and how would I roughly go about it? An example would be great. Also eventually I'd like the passed in array to come from another document in Mongo in a different collection.
Thanks!
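To make the intended comparison concrete, here is a minimal client-side sketch in plain Ruby. It does not query MongoDB; it assumes the documents have already been fetched as hashes. The `distance` and `closest` helper names are hypothetical, and using the sum of absolute element-wise differences is an assumption (the pseudo code above only says `a - b`).

```ruby
# Sum of absolute element-wise differences between two equal-length arrays.
# Taking the absolute value is an assumption; the question's pseudo code uses a - b.
def distance(a, b)
  a.zip(b).sum { |x, y| (x - y).abs }
end

# Return the document whose "values" array is closest to the query array.
def closest(docs, query)
  docs.min_by { |doc| distance(doc["values"], query) }
end

# Sample data copied from the question, as plain Ruby hashes.
docs = [
  { "name" => "item1", "values" => [1, 93, 45, 67, 89, 1, 2, 32, 45] },
  { "name" => "item2", "values" => [1, 23, 45, 123, 1, 5, 89, 14, 22] },
  { "name" => "item3", "values" => [23, 1, 44, 78, 89, 22, 150, 23, 12] },
  { "name" => "item4", "values" => [90, 23, 11, 67, 29, 1, 2, 1, 45] }
]
query = [1, 93, 45, 67, 89, 1, 2, 32, 45]

puts closest(docs, query)["name"]  # item1 matches the query exactly (distance 0)
```

This scans every document, so it only illustrates the metric; whether to push the work to the server is the open part of the question.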


Ruby: using `.each` or `.step`, step forward a random amount for each iteration

(Also open to other similar non-Rails methods)
Given (0..99), return entries that are randomly picked in-order.
Example results:
0, 5, 11, 13, 34..
3, 12, 45, 67, 87
0, 1, 2, 3, 4, 5.. (very unlikely, of course)
Current thought:
(0..99).step(rand(0..99)).each do |subindex|
array.push(subindex)
end
However, this sets a single random value for all the steps whereas I'm looking for each step to be random.
Get a random value for the number of elements to pick, randomly get this number of elements, sort.
(0..99).to_a.sample((0..99).to_a.sample).sort
#⇒ [7, 20, 22, 29, 45, 48, 57, 61, 62, 76, 80, 82]
Or, shorter (credits to @Stefan):
(0..99).to_a.sample(rand(0..99)).sort
#⇒ [7, 20, 22, 29, 45, 48, 57, 61, 62, 76, 80, 82]
Or, in more functional manner:
λ = (0..99).to_a.method(:sample)
λ.(λ.()).sort
To feed exactly N numbers:
N = 10
(0..99).to_a.sample(N).sort
#⇒ [1, 5, 8, 12, 45, 54, 60, 65, 71, 91]
There are many ways to achieve this.
For example, here's a slow yet simple one:
# given `array`
random_indexes = (0...array.size).to_a.sample(rand(array.size))
random_indexes.sort.each { |i| puts array[i] }
Or why don't you just:
array.each do |value|
next if rand(2).zero?
puts value
end
Or you could use Enumerator#next random number of times.
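A minimal sketch of that idea, assuming a hypothetical helper name `pick_with_random_steps` and a hypothetical `max_step` cap on how far each step may advance (neither comes from the answer above). It calls `Enumerator#next` a random number of times per pick and relies on `Kernel#loop` rescuing the `StopIteration` raised at the end of the range.

```ruby
# Walk the range with an external enumerator, advancing a random number of
# positions (1..max_step) before each pick. max_step is an assumed parameter.
def pick_with_random_steps(range, max_step: 10, rng: Random.new)
  enum = range.each            # external enumerator over the range
  picked = []
  loop do                      # Kernel#loop rescues StopIteration from enum.next
    value = nil
    rng.rand(1..max_step).times { value = enum.next }
    picked << value            # only reached if the whole step fit in the range
  end
  picked
end

p pick_with_random_steps(0..99)
```

Because the enumerator only moves forward, the picks come out in order without a separate sort.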
The example below returns a sorted array of random entries from the given range, keeping each entry only when a value sampled from [true, false] is true:
(0..99).select { [true, false].sample }
=> [0, 3, 12, 13, 14, 17, 20, 24, 26, 28, 30, 32, 34, 35, 36, 38, 39, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 53, 54, 55, 56, 58, 59, 60, 61, 62, 65, 67, 69, 70, 71, 79, 81, 84, 86, 91, 93, 94, 95, 98, 99]
To reduce the chances of a bigger array being returned, you can modify your true/false array to include more falsey values:
(0..99).select { ([true] + [false] * 9).sample }
=> [21, 22, 28, 33, 37, 58, 59, 63, 77, 85, 86]

Algorithm - longest wiggle subsequence

Algorithm:
A sequence of numbers is called a wiggle sequence if the differences
between successive numbers strictly alternate between positive and
negative. The first difference (if one exists) may be either positive
or negative. A sequence with fewer than two elements is trivially a
wiggle sequence.
For example, [1,7,4,9,2,5] is a wiggle sequence because the
differences (6,-3,5,-7,3) are alternately positive and negative. In
contrast, [1,4,7,2,5] and [1,7,4,5,5] are not wiggle sequences, the
first because its first two differences are positive and the second
because its last difference is zero.
Given a sequence of integers, return the length of the longest
subsequence that is a wiggle sequence. A subsequence is obtained by
deleting some number of elements (possibly zero) from the
original sequence, leaving the remaining elements in their original
order.
Examples:
Input: [1,7,4,9,2,5]
Output: 6
The entire sequence is a wiggle sequence.
Input: [1,17,5,10,13,15,10,5,16,8]
Output: 7
There are several subsequences that achieve this length. One is [1,17,10,13,10,16,8].
Input: [1,2,3,4,5,6,7,8,9]
Output: 2
My soln:
def wiggle_max_length(nums)
  [ build_seq(nums, 0, 0, true, -1.0/0.0),
    build_seq(nums, 0, 0, false, 1.0/0.0)
  ].max
end
def build_seq(nums, index, len, wiggle_up, prev)
  return len if index >= nums.length
  if wiggle_up && nums[index] - prev > 0 || !wiggle_up && nums[index] - prev < 0
    build_seq(nums, index + 1, len + 1, !wiggle_up, nums[index])
  else
    build_seq(nums, index + 1, len, wiggle_up, prev)
  end
end
This works for smaller inputs (e.g. [1,1,1,3,2,4,1,6,3,10,8]) and for all the sample inputs, but it fails for very large inputs (which are harder to debug), such as:
[33,53,12,64,50,41,45,21,97,35,47,92,39,0,93,55,40,46,69,42,6,95,51,68,72,9,32,84,34,64,6,2,26,98,3,43,30,60,3,68,82,9,97,19,27,98,99,4,30,96,37,9,78,43,64,4,65,30,84,90,87,64,18,50,60,1,40,32,48,50,76,100,57,29,63,53,46,57,93,98,42,80,82,9,41,55,69,84,82,79,30,79,18,97,67,23,52,38,74,15]
The expected output is 67, but my solution outputs 57. Does anyone know what is wrong here?
The approach tried is a greedy solution (because it always uses the current element if it satisfies the wiggle condition), but this does not always work.
I will try illustrating this with this simpler counter-example: 1 100 99 6 7 4 5 2 3.
One best sub-sequence is: 1 100 6 7 4 5 2 3, but the two build_seq calls from the algorithm will produce these sequences:
1 100 99
1
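The failure can be reproduced directly. The sketch below copies the question's two methods unchanged and runs them on the counter-example; only the `counter` variable name is mine.

```ruby
# The question's greedy code, copied as-is.
def build_seq(nums, index, len, wiggle_up, prev)
  return len if index >= nums.length
  if wiggle_up && nums[index] - prev > 0 || !wiggle_up && nums[index] - prev < 0
    build_seq(nums, index + 1, len + 1, !wiggle_up, nums[index])
  else
    build_seq(nums, index + 1, len, wiggle_up, prev)
  end
end

def wiggle_max_length(nums)
  [ build_seq(nums, 0, 0, true, -1.0/0.0),   # first difference must go up
    build_seq(nums, 0, 0, false, 1.0/0.0)    # first difference must go down
  ].max
end

counter = [1, 100, 99, 6, 7, 4, 5, 2, 3]
puts wiggle_max_length(counter)  # prints 3, although an 8-element wiggle exists
```

Once the greedy pass commits to 99 (or gets stuck at 1), no later element can satisfy the next required direction, so the remaining zig-zag 6 7 4 5 2 3 is never used.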
Edit: A slightly modified greedy approach does work -- see this link, thanks Peter de Rivaz.
Dynamic Programming can be used to obtain an optimal solution.
Note: I wrote this before seeing the article mentioned by @PeterdeRivaz. While dynamic programming (O(n^2)) works, the article presents a superior (O(n)) "greedy" algorithm ("Approach #5"), which is also far easier to code than a dynamic programming solution. I have added a second answer that implements that method.
Code
def longest_wiggle(arr)
  best = [{ pos_diff: { length: 0, prev_ndx: nil },
            neg_diff: { length: 0, prev_ndx: nil } }]
  (1..arr.size-1).each do |i|
    calc_best(arr, i, :pos_diff, best)
    calc_best(arr, i, :neg_diff, best)
  end
  unpack_best(best)
end
def calc_best(arr, i, diff, best)
  curr = arr[i]
  prev_indices = (0..i-1).select { |j|
    (diff==:pos_diff) ? (arr[j] < curr) : (arr[j] > curr) }
  best[i] = {} if best.size == i
  best[i][diff] =
    if prev_indices.empty?
      { length: 0, prev_ndx: nil }
    else
      prev_diff = previous_diff(diff)
      j = prev_indices.max_by { |j| best[j][prev_diff][:length] }
      { length: (1 + best[j][prev_diff][:length]), prev_ndx: j }
    end
end
def previous_diff(diff)
  diff==:pos_diff ? :neg_diff : :pos_diff
end
def unpack_best(best)
  last_idx, last_diff =
    best.size.times.to_a.product([:pos_diff, :neg_diff]).
         max_by { |i,diff| best[i][diff][:length] }
  return [0, []] if best[last_idx][last_diff][:length].zero?
  best_path = []
  loop do
    best_path.unshift(last_idx)
    prev_index = best[last_idx][last_diff][:prev_ndx]
    break if prev_index.nil?
    last_idx = prev_index
    last_diff = previous_diff(last_diff)
  end
  best_path
end
Examples
longest_wiggle([1, 4, 2, 6, 8, 3, 2, 5])
#=> [0, 1, 2, 3, 5, 7]
The length of the longest wiggle is 6 and consists of the elements at indices 0, 1, 2, 3, 5 and 7, that is, [1, 4, 2, 6, 3, 5].
A second example uses the larger array given in the question.
arr = [33, 53, 12, 64, 50, 41, 45, 21, 97, 35, 47, 92, 39, 0, 93, 55, 40, 46,
69, 42, 6, 95, 51, 68, 72, 9, 32, 84, 34, 64, 6, 2, 26, 98, 3, 43, 30,
60, 3, 68, 82, 9, 97, 19, 27, 98, 99, 4, 30, 96, 37, 9, 78, 43, 64, 4,
65, 30, 84, 90, 87, 64, 18, 50, 60, 1, 40, 32, 48, 50, 76, 100, 57, 29,
63, 53, 46, 57, 93, 98, 42, 80, 82, 9, 41, 55, 69, 84, 82, 79, 30, 79,
18, 97, 67, 23, 52, 38, 74, 15]
arr.size
#=> 100
longest_wiggle(arr).size
#=> 67
longest_wiggle(arr)
#=> [0, 1, 2, 3, 5, 6, 7, 8, 9, 10, 12, 14, 16, 17, 19, 21, 22, 23, 25,
# 27, 28, 29, 30, 32, 34, 35, 36, 37, 38, 39, 41, 42, 43, 44, 47, 49, 50,
# 52, 53, 54, 55, 56, 57, 58, 62, 63, 65, 66, 67, 70, 72, 74, 75, 77, 80,
# 81, 83, 84, 90, 91, 92, 93, 95, 96, 97, 98, 99]
As indicated, the longest wiggle comprises 67 elements of arr. Solution time was essentially instantaneous.
The values of arr at those indices are as follows.
[33, 53, 12, 64, 41, 45, 21, 97, 35, 47, 39, 93, 40, 46, 42, 95, 51, 68, 9,
84, 34, 64, 6, 26, 3, 43, 30, 60, 3, 68, 9, 97, 19, 27, 4, 96, 37, 78, 43,
64, 4, 65, 30, 84, 18, 50, 1, 40, 32, 76, 57, 63, 53, 57, 42, 80, 9, 41, 30,
79, 18, 97, 23, 52, 38, 74, 15]
Explanation
I had intended to provide an explanation of the algorithm and its implementation, but having since learned there is a superior approach (see my note at the beginning of my answer), I have decided against doing that, but would of course be happy to answer any questions. The link in my note explains, among other things, how dynamic programming can be used here.
Let Wp[i] be the longest wiggle sequence starting at element i, and where the first difference is positive. Let Wn[i] be the same, but where the first difference is negative.
Then:
Wp[k] = max(1+Wn[k'] for k<k'<n, where A[k'] > A[k]) (or 1 if no such k' exists)
Wn[k] = max(1+Wp[k'] for k<k'<n, where A[k'] < A[k]) (or 1 if no such k' exists)
This gives an O(n^2) dynamic programming solution, here in pseudocode
Wp = [1, 1, ..., 1] -- length n
Wn = [1, 1, ..., 1] -- length n
for k = n-1, n-2, ..., 0
    for k' = k+1, k+2, ..., n-1
        if A[k'] > A[k]
            Wp[k] = max(Wp[k], Wn[k']+1)
        else if A[k'] < A[k]
            Wn[k] = max(Wn[k], Wp[k']+1)
result = max(max(Wp[i], Wn[i]) for i = 0, 1, ..., n-1)
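The pseudocode translates almost line for line into Ruby. This is a sketch rather than a tuned implementation; the method name `wiggle_length_dp` and the choice to return 0 for an empty array are my assumptions.

```ruby
# O(n^2) dynamic programming, following the Wp/Wn recurrences above.
# wp[k]: longest wiggle starting at k whose first difference is positive.
# wn[k]: longest wiggle starting at k whose first difference is negative.
def wiggle_length_dp(a)
  n = a.size
  return 0 if n.zero?          # assumption: an empty input has length 0
  wp = Array.new(n, 1)
  wn = Array.new(n, 1)
  (n - 1).downto(0) do |k|
    (k + 1...n).each do |j|    # j plays the role of k' in the pseudocode
      if a[j] > a[k]
        wp[k] = [wp[k], wn[j] + 1].max
      elsif a[j] < a[k]
        wn[k] = [wn[k], wp[j] + 1].max
      end
    end
  end
  [wp.max, wn.max].max
end

puts wiggle_length_dp([1, 17, 5, 10, 13, 15, 10, 5, 16, 8])  # prints 7
```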
In a comment on @quertyman's answer, @PeterdeRivaz provided a link to an article that considers various approaches to solving the "longest wiggle subsequence" problem. I have implemented "Approach #5", which has a time-complexity of O(n).
The algorithm is simple as well as fast. The first step is to remove one element from each pair of consecutive elements that are equal, and continue to do so until there are no consecutive elements that are equal. For example, [1,2,2,2,3,4,4] would be converted to [1,2,3,4]. The longest wiggle subsequence includes the first and last elements of the resulting array, a, and every element a[i], 0 < i < a.size-1 for which a[i-1] < a[i] > a[i+1] or a[i-1] > a[i] < a[i+1]. In other words, it includes the first and last elements and all peaks and valley bottoms. Those elements are A, D, E, G, H, I in the graph below (taken from the above-referenced article, with permission).
Code
def longest_wiggle(arr)
  arr.each_cons(2).
      reject { |a,b| a==b }.
      map(&:first).
      push(arr.last).
      each_cons(3).
      select { |triple| [triple.min, triple.max].include? triple[1] }.
      map { |_,n,_| n }.
      unshift(arr.first).
      push(arr.last)
end
Example
arr = [33, 53, 12, 64, 50, 41, 45, 21, 97, 35, 47, 92, 39, 0, 93, 55, 40,
46, 69, 42, 6, 95, 51, 68, 72, 9, 32, 84, 34, 64, 6, 2, 26, 98, 3,
43, 30, 60, 3, 68, 82, 9, 97, 19, 27, 98, 99, 4, 30, 96, 37, 9, 78,
43, 64, 4, 65, 30, 84, 90, 87, 64, 18, 50, 60, 1, 40, 32, 48, 50, 76,
100, 57, 29, 63, 53, 46, 57, 93, 98, 42, 80, 82, 9, 41, 55, 69, 84,
82, 79, 30, 79, 18, 97, 67, 23, 52, 38, 74, 15]
a = longest_wiggle(arr)
#=> [33, 53, 12, 64, 41, 45, 21, 97, 35, 92, 0, 93, 40, 69, 6, 95, 51, 72,
# 9, 84, 34, 64, 2, 98, 3, 43, 30, 60, 3, 82, 9, 97, 19, 99, 4, 96, 9,
# 78, 43, 64, 4, 65, 30, 90, 18, 60, 1, 40, 32, 100, 29, 63, 46, 98, 42,
# 82, 9, 84, 30, 79, 18, 97, 23, 52, 38, 74, 15]
a.size
#=> 67
Explanation
The steps are as follows.
arr = [3, 4, 4, 5, 2, 3, 7, 4]
enum1 = arr.each_cons(2)
#=> #<Enumerator: [3, 4, 4, 5, 2, 3, 7, 4]:each_cons(2)>
We can see the elements that will be generated by this enumerator by converting it to an array.
enum1.to_a
#=> [[3, 4], [4, 4], [4, 5], [5, 2], [2, 3], [3, 7], [7, 4]]
Continuing, remove all but one of each group of successive equal elements.
d = enum1.reject { |a,b| a==b }
#=> [[3, 4], [4, 5], [5, 2], [2, 3], [3, 7], [7, 4]]
e = d.map(&:first)
#=> [3, 4, 5, 2, 3, 7]
Add the last element.
f = e.push(arr.last)
#=> [3, 4, 5, 2, 3, 7, 4]
Next, find the peaks and valley bottoms.
enum2 = f.each_cons(3)
#=> #<Enumerator: [3, 4, 5, 2, 3, 7, 4]:each_cons(3)>
enum2.to_a
#=> [[3, 4, 5], [4, 5, 2], [5, 2, 3], [2, 3, 7], [3, 7, 4]]
g = enum2.select { |triple| [triple.min, triple.max].include? triple[1] }
#=> [[4, 5, 2], [5, 2, 3], [3, 7, 4]]
h = g.map { |_,n,_| n }
#=> [5, 2, 7]
Lastly, add the first and last values of arr.
i = h.unshift(arr.first)
#=> [3, 5, 2, 7]
i.push(arr.last)
#=> [3, 5, 2, 7, 4]
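When only the length is needed, the same peaks-and-valleys idea can be written as a single pass that counts changes of direction. This is a sketch under my own naming (`wiggle_length` is hypothetical); unlike the chain above, it also treats arrays with fewer than two elements, or with all elements equal, as special cases.

```ruby
# O(n) length-only variant: count each change of direction between
# consecutive elements, ignoring equal neighbours.
def wiggle_length(nums)
  return nums.size if nums.size < 2
  length = 1                   # the first element always counts
  prev_sign = 0                # 0 means no direction established yet
  nums.each_cons(2) do |a, b|
    sign = b <=> a             # 1 for a rise, -1 for a fall, 0 for equal
    next if sign.zero? || sign == prev_sign
    length += 1                # direction changed: a new peak or valley
    prev_sign = sign
  end
  length
end

puts wiggle_length([3, 4, 4, 5, 2, 3, 7, 4])  # prints 5, matching [3, 5, 2, 7, 4]
```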

Make a square multiplication table in Ruby

I got this question in an interview and got almost all the way to the answer but got stuck on the last part. If I want to get the multiplication table for 5, for instance, I want to get the output to be formatted like so:
1, 2, 3, 4, 5
2, 4, 6, 8, 10
3, 6, 9, 12, 15
4, 8, 12, 16, 20
5, 10, 15, 20, 25
My answer to this is:
def make_table(n)
  s = ""
  1.upto(n).each do |i|
    1.upto(n).each do |j|
      s += (i*j).to_s
    end
    s += "\n"
  end
  p s
end
But the output for make_table(5) is:
"12345\n246810\n3691215\n48121620\n510152025\n"
I've tried variations with array but I'm getting similar output.
What am I missing or how should I think about the last part of the problem?
You can use map and join to get a String in one line :
n = 5
puts (1..n).map { |x| (1..n).map { |y| x * y }.join(', ') }.join("\n")
It iterates over rows (x=1, x=2, ...). For each row, it iterates over cells (y=1, y=2, ...) and calculates x*y. It joins the cells of each row with ", " and joins the rows of the table with a newline :
1, 2, 3, 4, 5
2, 4, 6, 8, 10
3, 6, 9, 12, 15
4, 8, 12, 16, 20
5, 10, 15, 20, 25
If you want to keep the commas aligned, you can use rjust :
puts (1..n).map { |x| (1..n).map { |y| (x * y).to_s.rjust(3) }.join(',') }.join("\n")
It outputs :
  1,  2,  3,  4,  5
  2,  4,  6,  8, 10
  3,  6,  9, 12, 15
  4,  8, 12, 16, 20
  5, 10, 15, 20, 25
You could even go fancy and calculate the width of n**2 before aligning commas :
n = 11
width = Math.log10(n**2).ceil + 1
puts (1..n).map { |x| (1..n).map { |y| (x * y).to_s.rjust(width) }.join(',') }.join("\n")
It outputs :
   1,   2,   3,   4,   5,   6,   7,   8,   9,  10,  11
   2,   4,   6,   8,  10,  12,  14,  16,  18,  20,  22
   3,   6,   9,  12,  15,  18,  21,  24,  27,  30,  33
   4,   8,  12,  16,  20,  24,  28,  32,  36,  40,  44
   5,  10,  15,  20,  25,  30,  35,  40,  45,  50,  55
   6,  12,  18,  24,  30,  36,  42,  48,  54,  60,  66
   7,  14,  21,  28,  35,  42,  49,  56,  63,  70,  77
   8,  16,  24,  32,  40,  48,  56,  64,  72,  80,  88
   9,  18,  27,  36,  45,  54,  63,  72,  81,  90,  99
  10,  20,  30,  40,  50,  60,  70,  80,  90, 100, 110
  11,  22,  33,  44,  55,  66,  77,  88,  99, 110, 121
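As a variation on the width calculation, the sketch below counts the digits of the largest cell directly instead of using a logarithm. This is my assumption, not the answer's method; it sidesteps floating-point arithmetic and gives a consistent width when n*n is an exact power of ten. The method name `aligned_table` is hypothetical, and it returns the table as a string rather than printing it.

```ruby
# Build an n x n multiplication table with right-aligned columns.
# The column width is the number of digits in the largest cell, n * n.
def aligned_table(n)
  width = (n * n).to_s.length
  (1..n).map do |x|
    (1..n).map { |y| (x * y).to_s.rjust(width) }.join(', ')
  end.join("\n")
end

puts aligned_table(5)
```

Returning the string keeps the formatting separate from the output, which also makes the method easy to check in a test.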
Without spaces between the figures, the result is indeed unreadable. Have a look at the % operator, which formats strings and numbers. Instead of
s += (i*j).to_s
you could write
s += '%3d' % (i*j)
If you really want the output formatted the way you described in your post (which I don't find very readable), you could use
s += "#{i*j}, "
This leaves you with two extra characters at the end of the line, which you have to remove. An alternative would be to use an array. Instead of the inner loop, you would have then something like
s += 1.upto(n).to_a.map {|j| i*j}.join(', ') + "\n"
You don't need to construct a string if you're only interested in printing the table and not in returning it (as a string).
(1..n).each do |a|
(1..n-1).each { |b| print "#{a * b}, " }
puts a * n
end
This is how I'd do it.
require 'matrix'
n = 5
puts Matrix.build(n) { |i,j| (i+1)*(j+1) }.to_a.map { |row| row.join(', ') }
1, 2, 3, 4, 5
2, 4, 6, 8, 10
3, 6, 9, 12, 15
4, 8, 12, 16, 20
5, 10, 15, 20, 25
See Matrix::build.
You can make it much shorter but here's my version.
range = Array(1..12)
range.each do |element|
range.map { |item| print "#{element * item} " } && puts
end

Stop Pry from putting each value of a returned array on a new line?

I watched the RubyConf 2013 talk on Pry and I have decided I ought to give it a good try.
I am working with some large arrays. It would be easier to work with my code if Pry would display returned arrays the way IRB does. What seems odd is that pry will not add newlines if the number of chars in the displayed array is small but it will add them when the number of chars in the displayed array surpasses some threshold (appears to be 26 chars in my case). Does anybody know how to make Pry stop doing this?
IRB:
main 001(0) > a = [] #=> []
main 002(0) > (1..100).each{|i| a << i} #=> 1..100
main 003(0) > a
=> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100]
Pry:
[1] pry(main)> a = []
=> []
[2] pry(main)> (1..26).each{ a << 1 }
=> 1..26
[3] pry(main)> a
=> [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
[4] pry(main)> a << 1
=> [1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1]
Edit your .pryrc file to include
Pry.config.print = proc { |output, value| output.puts "=> #{value.inspect}" }
To get pry to print inline like robertjlooby's answer, but retain the pretty-printing:
# in your .pryrc
Pry.config.print = lambda do |output, value, _pry_|
  _pry_.pager.open do |pager|
    pager.print _pry_.config.output_prefix
    Pry::ColorPrinter.pp(value, pager, 9e99)
  end
end
adapted from pry source

Ordering things in python...?

I was under the impression that set() would order a collection much like .sort().
However, it seems that it doesn't; what puzzles me is why it reorders the collection at all.
>>> h = '321'
>>> set(h)
set(['1', '3', '2'])
>>> h
'321'
>>> h = '22311'
>>> set(h)
set(['1', '3', '2'])
Why doesn't it return set(['1', '2', '3'])? It also seems that no matter how many instances of each number I use, or in what order I use them, it always returns set(['1', '3', '2']). Why?
Edit:
So I have read your answers and my counter to that is this.
>>> l = [1,2,3,3]
>>> set(l)
set([1, 2, 3])
>>> l = [3,3,2,3,1,1,3,2,3]
>>> set(l)
set([1, 2, 3])
Why does it order numbers and not strings?
Also
import random
l = []
for itr in xrange(101):
    l.append(random.randint(1,101))
print set(l)
Outputs
>>>
set([1, 2, 4, 5, 6, 8, 10, 11, 12, 14, 15, 16, 18, 19, 23, 24, 25, 26, 29, 30, 31, 32, 34, 40, 43, 45, 46, 47, 48, 49, 50, 51, 53, 54, 55, 57, 58, 59, 60, 61, 62, 63, 64, 66, 67, 69, 70, 74, 75, 77, 79, 80, 83, 84, 85, 87, 88, 89, 90, 93, 94, 96, 97, 99, 101])
A Python set is unordered, so there is no guarantee that the elements will come back in the order in which you supplied them.
If you want a sorted output, then call sorted:
sorted(set(h))
Responding to your edit: it comes down to the implementation of set. In CPython, it boils down to two things:
1) the set will be sorted by hash (the __hash__ function) modulo a limit
2) the limit is generally the next largest power of 2
So let's look at the int case:
x=1
type(x) # int
x.__hash__() # 1
For ints, the hash equals the original value:
[x==x.__hash__() for x in xrange(1000)].count(False) # = 0
Hence, when all the values are ints, it will use the integer hash value and everything works smoothly.
For the string representations, the hashes don't work the same way:
x='1'
type(x)
# str
x.__hash__()
# 6272018864
To understand why the sort breaks for ['1','2','3'], look at those hash values:
[str(x).__hash__() for x in xrange(1,4)]
# [6272018864, 6400019251, 6528019634]
In our example, the mod value is 4 (3 elts, 2^1 = 2, 2^2 = 4) so
[str(x).__hash__()%4 for x in xrange(1,4)]
# [0, 3, 2]
[(str(x).__hash__()%4,str(x)) for x in xrange(1,4)]
# [(0, '1'), (3, '2'), (2, '3')]
Now if you sort this beast, you get the ordering that you see in set:
[y[1] for y in sorted([(str(x).__hash__()%4,str(x)) for x in xrange(1,4)])]
# ['1', '3', '2']
From the python documentation of the set type:
A set object is an unordered collection of distinct hashable objects.
This means that the set doesn't have a concept of the order of the elements in it. You should not be surprised when the elements are printed on your screen in an unusual order.
A set in Python tries to be a "set" in the mathematical sense of the term. No duplicates, and order shouldn't matter.
