In MongoDB, how can I replicate this simple query using map/reduce in ruby? - ruby

So using the regular MongoDB library in Ruby I have the following query to find average filesize across a set of 5001 documents:
avg = 0
total = collection.count()
Rails.logger.info "#{total} asset creation stats in the system"
collection.find().each {|row| avg += (row["filesize"] * (1/total.to_f)) if row["filesize"]}
It's pretty simple, so I'm trying to do the same using map/reduce as a learning exercise. This is what I came up with:
map = 'function(){emit("filesizes", {size: this.filesize, num: 1});}'
reduce = 'function(k, vals){
var result = {size: 0, num: 0};
for(var x in vals) {
var new_total = result.num + vals[x].num;
result.num = new_total
result.size = result.size + (vals[x].size * (vals[x].num / new_total));
}
return result;
}'
@results = collection.map_reduce(map, reduce)
However the two queries come back with two different results!
What am I doing wrong?

You're weighting the results by doing the division in every reduce function.
Say you had [{size : 5, num : 1}, {size : 5, num : 1}, {size : 5, num : 1}]. Your reduce would calculate:
result.size = 0 + (5*(1/1)) = 5
result.size = 5 + (5*(1/2)) = 7.5
result.size = 7.5 + (5*(1/3)) = 9.17
As you can see, the result (about 9.17) overshoots the true average of 5, because this weights the results towards the earliest elements.
Fortunately, there's a simple solution. Have reduce only sum the sizes and the counts, and add a finalize function, which will be run once after the reduce step is finished, to do the division.
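A minimal sketch of that idea, written in Python rather than MongoDB JavaScript so it can be run standalone. The function names (`map_doc`, `reduce_values`, `finalize`) and the in-memory driver `average_filesize` are illustrative assumptions, not the MongoDB or Ruby driver API. The point is only that reduce keeps running sums and the division happens once at the end:

```python
def map_doc(doc):
    """Emit one (key, value) pair per document that has a filesize."""
    if doc.get("filesize") is not None:
        yield "filesizes", {"size": doc["filesize"], "num": 1}


def reduce_values(key, values):
    """Only add things up, so re-reducing partial results is safe."""
    result = {"size": 0, "num": 0}
    for v in values:
        result["size"] += v["size"]
        result["num"] += v["num"]
    return result


def finalize(key, value):
    """Runs once per key after all reducing is done."""
    return value["size"] / value["num"] if value["num"] else None


def average_filesize(docs, chunk=2):
    """Hypothetical in-memory driver standing in for the database."""
    emitted = [v for d in docs for _, v in map_doc(d)]
    # Reduce in chunks, then reduce the partial results again,
    # mimicking an engine that may call reduce repeatedly.
    partials = [reduce_values("filesizes", emitted[i:i + chunk])
                for i in range(0, len(emitted), chunk)]
    return finalize("filesizes", reduce_values("filesizes", partials))


docs = [{"filesize": 5}, {"filesize": 5}, {"filesize": 5}, {"name": "no size"}]
print(average_filesize(docs))  # 5.0
```

Because reduce only adds, the result does not depend on how the values are grouped into reduce calls.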

StatsBase.sample() can't draw without replacement if FrequencyWeights() are provided

I'm trying to sample without replacement using StatsBase.sample() in Julia. Because I have my data in the following form I can use my counts as FrequencyWeights():
using StatsBase
data = ["red", "blue", "green"]
counts = [2000, 2000, 1]
balls = StatsBase.sample(data, FrequencyWeights(counts), 1000)
One problem with this is that StatsBase.sample() implicitly sets replace=true so this is possible:
countmap(balls)
Dict("blue" => 478,
"green" => 2, # <= two green balls?
"red" => 520)
Explicitly setting replace=false throws an error.
balls = StatsBase.sample(data, FrequencyWeights(counts), 1000, replace=false)
Cannot draw 3 samples from 1000 samples without replacement.
error(::String)@error.jl:33
var"#sample!#174"(::Bool, ::Bool, ::typeof(StatsBase.sample!), ::Random._GLOBAL_RNG, ::Vector{String}, ::StatsBase.FrequencyWeights{Int64, Int64, Vector{Int64}}, ::Vector{String})@sampling.jl:858
#sample#175@sampling.jl:871[inlined]
#sample#176@sampling.jl:874[inlined]
top-level scope@Local: 2[inlined]
Is my only solution here to reformat my data to a wide form like this? Because that seems very inefficient as my actual data set has a lot of counts.:
wide_data = [fill("red", 2000)..., fill("blue", 2000)..., "green"]
sample(wide_data, 1000, replace=false)
You could use something like this:
function mysample(data::AbstractVector, counts::AbstractVector, n::Integer)
@assert n <= sum(counts)
@assert firstindex(data) == 1
@assert firstindex(counts) == 1
res = similar(data, n)
fw = FrequencyWeights(copy(counts))
for i in 1:n
j = sample(axes(data, 1), fw)
res[i] = data[j]
fw.sum -= 1
fw.values[j] -= 1
end
return res
end
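For comparison, the same decrement-the-count idea can be sketched in Python using only the standard library. The function name `sample_without_replacement` is made up for this illustration; it is not part of StatsBase or any other library:

```python
import random


def sample_without_replacement(data, counts, n, rng=random):
    """Draw n items, treating counts[i] as the number of copies of data[i]."""
    if n > sum(counts):
        raise ValueError("cannot draw more items than the counts allow")
    remaining = list(counts)  # copy so the caller's counts are untouched
    result = []
    for _ in range(n):
        # Pick an index with probability proportional to what is left.
        j = rng.choices(range(len(data)), weights=remaining)[0]
        result.append(data[j])
        remaining[j] -= 1  # one fewer copy of that item is available
    return result


balls = sample_without_replacement(["red", "blue", "green"], [2000, 2000, 1], 1000)
print(balls.count("green"))  # never more than 1
```

An item whose remaining count reaches zero has weight zero and cannot be drawn again, which is what makes this sampling without replacement.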

Function to find X numbers that add up to a certain value

I need a function that finds a variable amount of numbers, which together must add up to a certain value. In this case it is 8.
The numbers which can be added together are predefined in a table, to make things easier.
Current approach: Shuffle the table using a small algorithm, add first X values together, if they don't add up to 8, start over (including shuffling again) until the first X values add up to 8.
My code does work, just 2 problems: It takes a long time to process (obviously) and it can cause a stack overflow error if I don't add a cooldown.
The code can be dirty; it's not for live production. Also, I'm only an intermediate Lua developer at best...
function sleep (a) -- random sleep function I found
local sec = tonumber(os.clock() + a);
while (os.clock() < sec) do
end
end
function shuffle(tbl) -- random shuffle function I found
for i = #tbl, 2, -1 do
math.randomseed( os.time() )
math.random();math.random();math.random();math.random();
local j = math.random(i)
tbl[i], tbl[j] = tbl[j], tbl[i]
end
return tbl
end
local times = {
0.5,
1.0,
1.5,
2.0,
2.5,
3.0,
3.5,
4.0
}
local timeunits = {} --refer to line 49, I did not want to do it like that...
function nnumbersto8(amount)
local sum = 0
local numbs = {}
times = shuffle(times) --reshuffle the set
for i = 1,amount,1 do --add first x values together
sum = sum + times[i]
numbs[i] = times[i]
end
if sum ~= 8 then sleep(0.1) nnumbersto8(amount) return end --if they are not 8, repeat process with cooldown to avoid stack overflow
--return numbs -- This doesn't work for some reason, nothing gets returned outside the function
timeunits = numbs
end
nnumbersto8(5) -- manual run it for now
print(unpack(timeunits))
There must be a simpler way, right?
Thanks in advance, any help is appreciated!
Here is a method that will work for large numbers of elements, and will pick a random solution with theoretically even likelihood for each.
function solution_node (value, count, remainder)
local node = {}
node.value = value
node.count = count
node.remainder = remainder
return node
end
function choose_solutions (node1, node2)
if node1 == nil then
return node2
elseif node2 == nil then
return node1
else
-- Make a random choice of which solution to pick.
if node1.count < math.random(node1.count + node2.count) then
node2.count = node1.count + node2.count
return node2
else
node1.count = node1.count + node2.count
return node1
end
end
end
function decode_solution (node)
if node == nil then
return nil
end
local answer = {}
while node.value ~= nil do
table.insert(answer, node.value)
-- This causes the solution to be randomly shuffled.
local i = math.random(#answer)
answer[#answer], answer[i] = answer[i], answer[#answer]
node = node.remainder
end
return answer
end
function random_sum(tbl, count, target)
local choices = {}
-- Normally arrays are not 0-based in Lua but this is very convenient.
for j = 0,count do
choices[j] = {}
end
-- Make sure that the empty set is there.
choices[0][0.0] = solution_node(nil, 1, nil)
for i = 1,#tbl do
for j = count,1,-1 do
for this_sum, node in pairs(choices[j-1]) do
local next_sum = this_sum + tbl[i]
local next_node = solution_node(tbl[i], node.count, node)
-- Try adding this value in to a solution.
if next_sum <= target then
choices[j][next_sum] = choose_solutions(next_node, choices[j][next_sum])
end
end
end
end
return decode_solution(choices[count][target])
end
local times = {
0.2,
0.3,
0.5,
1.0,
1.2,
1.3,
1.5,
2.0,
2.5,
3.0,
3.5,
4.0
}
math.randomseed( os.time() )
local result = random_sum(times, 5, 8.0)
print("answer")
for k, v in pairs(result) do print(v) end
Sorry for my code. I haven't coded in Lua for a few years.
This is the subset sum problem with an extra restriction on the number of elements you are allowed to choose.
The solution is to use Dynamic Programming similar to regular Subset Sum, but add an extra variable that indicates how many items you have used.
This should go something along the lines of:
Failing stop clauses:
DP[-1][x][n] = false, for all x,n>0 // out of elements
DP[i][-1][n] = false, for all i,n>0 // exceeded X items
DP[i][x][n] = false, for n < 0 // Passed the sum limit. This is an optimization only if all elements are non negative.
Successful stop clause:
DP[i][0][0] = true for all i >= 0
Recursive formula:
DP[i][x][n] = DP[i-1][x][n] OR DP[i-1][x-1][n-item[i]] // Watch for n<item[i] case here.
                    ^                     ^
          Did not take the item     Used the item
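The recurrence above can be sketched in Python as follows. This is an illustration under stated assumptions, not a definitive implementation: the function name `pick_exactly` is made up, all items are assumed non-negative, and the values are assumed to be exactly representable (multiples of 0.5 are; otherwise scale everything to integers first). Instead of a 3D boolean table it stores the reachable (items used, sum) states with a back-pointer so one solution can be recovered:

```python
def pick_exactly(items, count, target):
    """Return `count` distinct entries of `items` summing to `target`, or None."""
    # reachable[(used, total)] = (previous state, item added to get here)
    reachable = {(0, 0): None}
    for item in items:
        # Iterate over a snapshot so each item is used at most once.
        for used, total in list(reachable):
            state = (used + 1, total + item)
            if used < count and total + item <= target and state not in reachable:
                reachable[state] = ((used, total), item)

    state = (count, target)
    if state not in reachable:
        return None
    chosen = []
    while reachable[state] is not None:  # walk the back-pointers
        state, item = reachable[state]
        chosen.append(item)
    return chosen


times = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
print(pick_exactly(times, 5, 8))
```

This returns one fixed solution rather than a random one; shuffling `items` before the call is one simple way to vary the answer, though it does not make every solution equally likely.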
There are no solutions for 1, 2 and for values greater than 5, so the function only accepts 3, 4 and 5.
Here we are doing a shallow copy of the times table then we get a random index from the copy and begin searching for the solution, removing values we use as we go.
local times = {
0.5,
1.0,
1.5,
2.0,
2.5,
3.0,
3.5,
4.0
}
function nNumbersTo8(amount)
if amount < 3 or amount > 5 then
return {}
end
local sum = 0
local numbers = {}
local set = {table.unpack(times)}
for i = 1, amount - 1, 1 do
local index = math.random(#set)
local value = set[index]
if not (8 < (sum + value)) then
sum = sum + value
table.insert(numbers, value)
table.remove(set, index)
else
break
end
end
local reminder = 8 - sum
for _,v in ipairs(set)do
if v == reminder then
sum = sum + v
table.insert(numbers, v)
break
end
end
if #numbers == amount then
return numbers
else
return nNumbersTo8(amount)
end
end
for i=1,100 do
print(table.unpack(nNumbersTo8(5)))
end
Example response:
1.5 0.5 3 2 1
3 0.5 1.5 1 2
2 3 1.5 0.5 1
3 2 1.5 1 0.5
0.5 1 2 3 1.5

How to model a mixture of 3 Normals in PyMC?

There is a question on CrossValidated on how to use PyMC to fit two Normal distributions to data. The answer of Cam.Davidson.Pilon was to use a Bernoulli distribution to assign data to one of the two Normals:
size = 10
p = Uniform( "p", 0 , 1) #this is the fraction that come from mean1 vs mean2
ber = Bernoulli( "ber", p = p, size = size) # produces 1 with proportion p.
precision = Gamma('precision', alpha=0.1, beta=0.1)
mean1 = Normal( "mean1", 0, 0.001 )
mean2 = Normal( "mean2", 0, 0.001 )
@deterministic
def mean( ber = ber, mean1 = mean1, mean2 = mean2):
    return ber*mean1 + (1-ber)*mean2
Now my question is: how to do it with three Normals?
Basically, the issue is that you can't use a Bernoulli distribution and 1-Bernoulli anymore. But how to do it then?
edit: With the CDP's suggestion, I wrote the following code:
import numpy as np
import pymc as mc
n = 3
ndata = 500
dd = mc.Dirichlet('dd', theta=(1,)*n)
category = mc.Categorical('category', p=dd, size=ndata)
precs = mc.Gamma('precs', alpha=0.1, beta=0.1, size=n)
means = mc.Normal('means', 0, 0.001, size=n)
@mc.deterministic
def mean(category=category, means=means):
    return means[category]
@mc.deterministic
def prec(category=category, precs=precs):
    return precs[category]
v = np.random.randint( 0, n, ndata)
data = (v==0)*(50+ np.random.randn(ndata)) \
+ (v==1)*(-50 + np.random.randn(ndata)) \
+ (v==2)*np.random.randn(ndata)
obs = mc.Normal('obs', mean, prec, value=data, observed = True)
model = mc.Model({'dd': dd,
'category': category,
'precs': precs,
'means': means,
'obs': obs})
The traces with the following sampling procedure look good as well. Solved!
mcmc = mc.MCMC( model )
mcmc.sample( 50000,0 )
mcmc.trace('means').gettrace()[-1,:]
There is a mc.Categorical object that does just this.
p = [0.2, 0.3, .5]
t = mc.Categorical('test', p )
t.random()
#array(2, dtype=int32)
It returns an int between 0 and len(p)-1. To model the 3 Normals, you make p a mc.Dirichlet object (it accepts a k length array as the hyperparameters; setting the values in the array to be the same is setting the prior probabilities to be equal). The rest of the model is nearly identical.
This is a generalization of the model I suggested above.
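To make the structure of this model concrete, here is a forward simulation of the same generative story in plain Python (standard library only, not PyMC): Dirichlet weights, one category per data point, and a Normal whose mean is looked up by category. The function and parameter names are made up for illustration, and the sketch samples from the model rather than fitting it:

```python
import random


def simulate_mixture(n_points, means, sds, alpha=None, rng=random):
    """Draw data from a k-component Normal mixture with Dirichlet weights."""
    k = len(means)
    alpha = alpha or [1.0] * k  # equal values = equal prior probabilities
    # A Dirichlet draw is a vector of Gamma draws, normalized to sum to 1.
    gammas = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(gammas)
    weights = [g / total for g in gammas]
    # Each data point gets a category in 0..k-1, like mc.Categorical.
    categories = rng.choices(range(k), weights=weights, k=n_points)
    # The mean (and spread) of each point is indexed by its category.
    data = [rng.gauss(means[c], sds[c]) for c in categories]
    return weights, categories, data


weights, categories, data = simulate_mixture(500, [50.0, -50.0, 0.0], [1.0, 1.0, 1.0])
print([round(w, 3) for w in weights])
```

The `means[c]` lookup plays the same role as `means[category]` in the deterministic function.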
Update:
Okay, so instead of having different means, we can collapse them all into 1:
means = Normal( "means", 0, 0.001, size=3 )
...
@mc.deterministic
def mean(categorical=categorical, means = means):
    return means[categorical]

Checking for termination when converting real to rational

Recently I found this in some code I wrote a few years ago. It was used to rationalize a real value (within a tolerance) by determining a suitable denominator and then checking if the difference between the original real and the rational was small enough.
Edit to clarify: I actually don't want to convert all real values. For instance, I could choose a max denominator of 14, and a real value that equals 7/15 would stay as-is. That isn't obvious here because the max denominator is an outside variable in the algorithms I wrote.
The algorithm to get the denominator was this (pseudocode):
denominator(x)
frac = fractional part of x
recip = 1/frac
if (frac < tol)
return 1
else
return recip * denominator(recip)
end
end
Seems to be based on continued fractions although it became clear on looking at it again that it was wrong. (It worked for me because it would eventually just spit out infinity, which I handled outside, but it would be often really slow.) The value for tol doesn't really do anything except in the case of termination or for numbers that end up close. I don't think it's relatable to the tolerance for the real - rational conversion.
I've replaced it with an iterative version that is not only faster but I'm pretty sure it won't fail theoretically (d = 1 to start with and fractional part returns a positive, so recip is always >= 1) :
denom_iter(x d)
return d if d > maxd
frac = fractional part of x
recip = 1/frac
if (frac = 0)
return d
else
return denom_iter(recip d*recip)
end
end
What I'm curious to know is whether there's a way to pick the maxd that will ensure that it converts all values that are possible for a given tolerance. I'm assuming 1/tol but don't want to miss something. I'm also wondering if there's a way in this approach to actually limit the denominator size, since this allows some denominators larger than maxd.
This can be considered a 2D minimization problem on error:
ArgMin ( r - q / p ), where r is real, q and p are integers
I suggest the use of Gradient Descent algorithm . The gradient in this objective function is:
f'(q, p) = (-1/p, q/p^2)
The initial guess r_o can be q being the closest integer to r, and p being 1.
The stopping condition can be thresholding of the error.
The pseudo-code of GD can be found in wiki: http://en.wikipedia.org/wiki/Gradient_descent
If the initial guess is close enough, the objective function should be convex.
As Jacob suggested, this problem can be better solved by minimizing the following error function:
ArgMin ( p * r - q ), where r is real, q and p are integers
This is linear programming, which can be efficiently solved by any ILP (Integer Linear Programming) solver. GD works on non-linear cases, but lacks efficiency in linear problems.
Initial guesses and stopping condition can be similar to stated above. Better choice can be obtained for individual choice of solver.
I suggest you should still assume convexity near the local minimum, which can greatly reduce cost. You can also try Simplex method, which is great on linear programming problem.
I give credit to Jacob on this.
A problem similar to this is solved in the Approximations section beginning ca. page 28 of Bill Gosper's Continued Fraction Arithmetic document. (Ref: postscript file; also see text version, from line 1984.) The general idea is to compute continued-fraction approximations of the low-end and high-end range limiting numbers, until the two fractions differ, and then choose a value in the range of those two approximations. This is guaranteed to give a simplest fraction, using Gosper's terminology.
The python code below (program "simpleden") implements a similar process. (It probably is not as good as Gosper's suggested implementation, but is good enough that you can see what kind of results the method produces.) The amount of work done is similar to that for Euclid's algorithm, ie O(n) for numbers with n bits, so the program is reasonably fast. Some example test cases (ie the program's output) are shown after the code itself. Note, function simpleratio(vlo, vhi) as shown here returns -1 if vhi is smaller than vlo.
#!/usr/bin/env python
def simpleratio(vlo, vhi):
    rlo, rhi, eps = vlo, vhi, 0.0000001
    if vhi < vlo: return -1
    num = denp = 1
    nump = den = 0
    while 1:
        klo, khi = int(rlo), int(rhi)
        if klo != khi or rlo-klo < eps or rhi-khi < eps:
            tlo = denp + klo * den
            thi = denp + khi * den
            if tlo < thi:
                return tlo + (rlo-klo > eps)*den
            elif thi < tlo:
                return thi + (rhi-khi > eps)*den
            else:
                return tlo
        nump, num = num, nump + klo * num
        denp, den = den, denp + klo * den
        rlo, rhi = 1/(rlo-klo), 1/(rhi-khi)
def test(vlo, vhi):
    den = simpleratio(vlo, vhi);
    fden = float(den)
    ilo, ihi = int(vlo*den), int(vhi*den)
    rlo, rhi = ilo/fden, ihi/fden;
    izok = 'ok' if rlo <= vlo <= rhi <= vhi else 'wrong'
    print '{:4d}/{:4d} = {:0.8f} vlo:{:0.8f} {:4d}/{:4d} = {:0.8f} vhi:{:0.8f} {}'.format(ilo,den,rlo,vlo, ihi,den,rhi,vhi, izok)
test (0.685, 0.695)
test (0.685, 0.7)
test (0.685, 0.71)
test (0.685, 0.75)
test (0.685, 0.76)
test (0.75, 0.76)
test (2.173, 2.177)
test (2.373, 2.377)
test (3.484, 3.487)
test (4.0, 4.87)
test (4.0, 8.0)
test (5.5, 5.6)
test (5.5, 6.5)
test (7.5, 7.3)
test (7.5, 7.5)
test (8.534537, 8.534538)
test (9.343221, 9.343222)
Output from program:
> ./simpleden
8/ 13 = 0.61538462 vlo:0.68500000 9/ 13 = 0.69230769 vhi:0.69500000 ok
6/ 10 = 0.60000000 vlo:0.68500000 7/ 10 = 0.70000000 vhi:0.70000000 ok
6/ 10 = 0.60000000 vlo:0.68500000 7/ 10 = 0.70000000 vhi:0.71000000 ok
2/ 4 = 0.50000000 vlo:0.68500000 3/ 4 = 0.75000000 vhi:0.75000000 ok
2/ 4 = 0.50000000 vlo:0.68500000 3/ 4 = 0.75000000 vhi:0.76000000 ok
3/ 4 = 0.75000000 vlo:0.75000000 3/ 4 = 0.75000000 vhi:0.76000000 ok
36/ 17 = 2.11764706 vlo:2.17300000 37/ 17 = 2.17647059 vhi:2.17700000 ok
18/ 8 = 2.25000000 vlo:2.37300000 19/ 8 = 2.37500000 vhi:2.37700000 ok
114/ 33 = 3.45454545 vlo:3.48400000 115/ 33 = 3.48484848 vhi:3.48700000 ok
4/ 1 = 4.00000000 vlo:4.00000000 4/ 1 = 4.00000000 vhi:4.87000000 ok
4/ 1 = 4.00000000 vlo:4.00000000 8/ 1 = 8.00000000 vhi:8.00000000 ok
11/ 2 = 5.50000000 vlo:5.50000000 11/ 2 = 5.50000000 vhi:5.60000000 ok
5/ 1 = 5.00000000 vlo:5.50000000 6/ 1 = 6.00000000 vhi:6.50000000 ok
-7/ -1 = 7.00000000 vlo:7.50000000 -7/ -1 = 7.00000000 vhi:7.30000000 wrong
15/ 2 = 7.50000000 vlo:7.50000000 15/ 2 = 7.50000000 vhi:7.50000000 ok
8030/ 941 = 8.53347503 vlo:8.53453700 8031/ 941 = 8.53453773 vhi:8.53453800 ok
24880/2663 = 9.34284641 vlo:9.34322100 24881/2663 = 9.34322193 vhi:9.34322200 ok
If, rather than the simplest fraction in a range, you seek the best approximation given some upper limit on denominator size, consider code like the following, which replaces all the code from def test(vlo, vhi) forward.
def smallden(target, maxden):
    global pas
    pas = 0
    tol = 1/float(maxden)**2
    while 1:
        den = simpleratio(target-tol, target+tol);
        if den <= maxden: return den
        tol *= 2
        pas += 1
# Test driver for smallden(target, maxden) routine
import random
totalpass, trials, passes = 0, 20, [0 for i in range(20)]
print 'Maxden Num Den Num/Den Target Error Passes'
for i in range(trials):
    target = random.random()
    maxden = 10 + round(10000*random.random())
    den = smallden(target, maxden)
    num = int(round(target*den))
    got = float(num)/den
    print '{:4d} {:4d}/{:4d} = {:10.8f} = {:10.8f} + {:12.9f} {:2}'.format(
        int(maxden), num, den, got, target, got - target, pas)
    totalpass += pas
    passes[pas-1] += 1
print 'Average pass count: {:0.3}\nPass histo: {}'.format(
    float(totalpass)/trials, passes)
In production code, drop out all the references to pas (etc.), ie, drop out pass-counting code.
The routine smallden is given a target value and a maximum value for allowed denominators. Given maxden possible choices of denominators, it's reasonable to suppose that a tolerance on the order of 1/maxden² can be achieved. The pass-counts shown in the following typical output (where target and maxden were set via random numbers) illustrate that such a tolerance was reached immediately more than half the time, but in other cases tolerances 2 or 4 or 8 times as large were used, requiring extra calls to simpleratio. Note, the last two lines of output from a 100000-number test run (the histogram entries sum to 100000) are shown following the complete output of a 20-number test run.
Maxden Num Den Num/Den Target Error Passes
1198 32/ 509 = 0.06286837 = 0.06286798 + 0.000000392 1
2136 115/ 427 = 0.26932084 = 0.26932103 + -0.000000185 1
4257 839/2670 = 0.31423221 = 0.31423223 + -0.000000025 1
2680 449/ 509 = 0.88212181 = 0.88212132 + 0.000000486 3
2935 440/1853 = 0.23745278 = 0.23745287 + -0.000000095 1
6128 347/1285 = 0.27003891 = 0.27003899 + -0.000000077 3
8041 1780/4243 = 0.41951449 = 0.41951447 + 0.000000020 2
7637 3926/7127 = 0.55086292 = 0.55086293 + -0.000000010 1
3422 27/ 469 = 0.05756930 = 0.05756918 + 0.000000113 2
1616 168/1507 = 0.11147976 = 0.11147982 + -0.000000061 1
260 62/ 123 = 0.50406504 = 0.50406378 + 0.000001264 1
3775 52/3327 = 0.01562970 = 0.01562750 + 0.000002195 6
233 6/ 13 = 0.46153846 = 0.46172772 + -0.000189254 5
3650 3151/3514 = 0.89669892 = 0.89669890 + 0.000000020 1
9307 2943/7528 = 0.39094049 = 0.39094048 + 0.000000013 2
962 206/ 225 = 0.91555556 = 0.91555496 + 0.000000594 1
2080 564/1975 = 0.28556962 = 0.28556943 + 0.000000190 1
6505 1971/2347 = 0.83979548 = 0.83979551 + -0.000000022 1
1944 472/ 833 = 0.56662665 = 0.56662696 + -0.000000305 2
3244 291/1447 = 0.20110574 = 0.20110579 + -0.000000051 1
Average pass count: 1.85
Pass histo: [12, 4, 2, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
The last two lines of output from a 100000-number test run (the histogram entries sum to 100000):
Average pass count: 1.77
Pass histo: [56659, 25227, 10020, 4146, 2072, 931, 497, 233, 125, 39, 33, 17, 1, 0, 0, 0, 0, 0, 0, 0]
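If Python is an option, the bounded-denominator variant discussed above is also available in the standard library as `fractions.Fraction.limit_denominator`, which returns the closest fraction whose denominator does not exceed the given limit. The sketch below wraps it with the tolerance check from the question; the wrapper name `rationalize` and its "return None when the tolerance is not met" convention are assumptions made for this example:

```python
from fractions import Fraction
import math


def rationalize(x, max_den, tol):
    """Closest fraction to x with denominator <= max_den, or None if it
    is not within tol of x (the value should then stay as a real)."""
    candidate = Fraction(x).limit_denominator(max_den)
    return candidate if abs(float(candidate) - x) <= tol else None


print(Fraction(math.pi).limit_denominator(1000))  # 355/113
print(rationalize(0.75, 14, 1e-9))                # 3/4
print(rationalize(7 / 15, 14, 1e-6))              # None: 7/15 needs denominator 15
```

Unlike the recursive denominator search, the denominator can never exceed `max_den` here, so the size limit and the tolerance are two independent knobs.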

Bridge crossing puzzle

Four men have to cross a bridge at night. Any party who crosses, either one or two men, must carry the flashlight with them. The flashlight must be walked back and forth; it cannot be thrown, etc. Each man walks at a different speed. One takes 1 minute to cross, another 2 minutes, another 5, and the last 10 minutes. If two men cross together, they must walk at the slower man's pace. There are no tricks--the men all start on the same side, the flashlight cannot shine a long distance, no one can be carried, etc.
And the question is: what's the fastest they can all get across? I am basically looking for a generalized approach to this kind of problem. A friend told me that this can be solved with the Fibonacci series, but that solution does not work in all cases.
Please note this is not homework.
There is an entire PDF (alternate link) that solves the general case of this problem (in a formal proof).
17 minutes - this is a classic MS question.
1,2 => 2 minutes passed.
1 returns => 3 minutes passed.
5,10 => 13 minutes passed.
2 returns => 15 minutes passed.
1,2 => 17 minutes passed.
In general the largest problem / slowest people should always be put together, and sufficient trips of the fastest made to be able to bring the light back each time without using a slow resource.
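One way to turn that rule of thumb into code is the commonly cited greedy below, sketched in Python. The function name `min_crossing_time` is made up, and the sketch assumes at most two people cross at a time. At each round it moves the two slowest people across using whichever of two standard plans is cheaper:

```python
def min_crossing_time(times):
    """Total time for everyone to cross, two at a time, with one flashlight."""
    t = sorted(times)
    total = 0
    while len(t) > 3:
        a, b = t[0], t[1]    # two fastest
        y, z = t[-2], t[-1]  # two slowest
        # Plan 1: a+b cross, a returns, y+z cross together, b returns.
        shuttle = a + 2 * b + z
        # Plan 2: a escorts z, returns, escorts y, returns.
        escort = 2 * a + y + z
        total += min(shuttle, escort)
        t = t[:-2]           # the two slowest are now across
    if len(t) == 3:
        total += sum(t)      # a+c cross, a returns, a+b cross
    elif len(t) == 2:
        total += t[1]
    elif len(t) == 1:
        total += t[0]
    return total


print(min_crossing_time([1, 2, 5, 10]))  # 17
```

Plan 1 is the "put the slowest people together" idea; plan 2 wins when the second-fastest person is comparatively slow, which is why both are compared each round.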
I would solve this problem by placing a fake job ad on Dice.com, and then asking this question in the interviews until someone gets it right.
As per Wikipedia
The puzzle is known to have appeared as early as 1981, in the book Super Strategies For Puzzles and Games. In this version of the puzzle, A, B, C and D take 5, 10, 20, and 25 minutes, respectively, to cross, and the time limit is 60 minutes
This question was however popularized after its appearance in the book "How Would You Move Mount Fuji?"
The question can be generalized to N people, each with their own time to cross the bridge.
The program below works for any number N of people and their crossing times.
using System;
using System.Collections.Generic;
class Program
{
public static int TotalTime(List<int> band, int n)
{
if (n < 3)
{
return band[n - 1];
}
else if (n == 3)
{
return band[0] + band[1] + band[2];
}
else
{
int temp1 = band[n - 1] + band[0] + band[n - 2] + band[0];
int temp2 = band[1] + band[0] + band[n - 1] + band[1];
if (temp1 < temp2)
{
return temp1 + TotalTime(band, n - 2);
}
else if (temp2 < temp1)
{
return temp2 + TotalTime(band, n - 2);
}
else
{
return temp2 + TotalTime(band, n - 2);
}
}
}
static void Main(string[] args)
{
// change the no of people crossing the bridge
// add or remove corresponding time to the list
int n = 4;
List<int> band = new List<int>() { 1, 2, 5, 10 };
band.Sort();
Console.WriteLine("The total time taken to cross the bridge is: " + Program.TotalTime(band, n));
Console.ReadLine();
}
}
OUTPUT:
The total time taken to cross the bridge is: 17
For,
int n = 5;
List<int> band = new List<int>() { 1, 2, 5, 10, 12 };
OUTPUT:
The total time taken to cross the bridge is: 25
For,
int n = 4;
List<int> band = new List<int>() { 5, 10, 20, 25 };
OUTPUT
The total time taken to cross the bridge is: 60
Here's the response in ruby:
@values = [1, 2, 5, 10]
# @values = [1, 2, 5, 10, 20, 25, 30, 35, 40]
@values.sort!
@position = @values.map { |v| :first }
@total = 0
def send_people(first, second)
  first_time = @values[first]
  second_time = @values[second]
  @position[first] = :second
  @position[second] = :second
  p "crossing #{first_time} and #{second_time}"
  first_time > second_time ? first_time : second_time
end
def send_lowest
  value = nil
  @values.each_with_index do |v, i|
    if @position[i] == :second
      value = v
      @position[i] = :first
      break
    end
  end
  p "return #{value}"
  return value
end
def highest_two
  first = nil
  second = nil
  first_arr = @position - [:second]
  if (first_arr.length % 2) == 0
    @values.each_with_index do |v, i|
      if @position[i] == :first
        first = i unless first
        second = i if !second && i != first
      end
      break if first && second
    end
  else
    @values.reverse.each_with_index do |v, i|
      real_index = @values.length - i - 1
      if @position[real_index] == :first
        first = real_index unless first
        second = real_index if !second && real_index != first
      end
      break if first && second
    end
  end
  return first, second
end
#we first send the first two
@total += send_people(0, 1)
#then we get the lowest one from there
@total += send_lowest
#we loop through the rest with highest 2 always being sent
while @position.include?(:first)
  first, second = highest_two
  @total += send_people(first, second)
  @total += send_lowest if @position.include?(:first)
end
p "Total time: #{@total}"
Another Ruby implementation inspired by @roc-khalil's solution
@values = [1,2,5,10]
# @values = [1,2,5,10,20,25]
@left = @values.sort
@right = []
@total_time = 0
def trace(moving)
  puts moving
  puts "State: #{@left} #{@right}"
  puts "Time: #{@total_time}"
  puts "-------------------------"
end
# move right the fastest two
def move_fastest_right!
  fastest_two = @left.shift(2)
  @right = @right + fastest_two
  @right = @right.sort
  @total_time += fastest_two.max
  trace "Moving right: #{fastest_two}"
end
# move left the fastest runner
def move_fastest_left!
  fastest_one = @right.shift
  @left << fastest_one
  @left.sort!
  @total_time += fastest_one
  trace "Moving left: #{fastest_one}"
end
# move right the slowest two
def move_slowest_right!
  slowest_two = @left.pop(2)
  @right = @right + slowest_two
  @right = @right.sort
  @total_time += slowest_two.max
  trace "Moving right: #{slowest_two}"
end
def iterate!
  move_fastest_right!
  return if @left.length == 0
  move_fastest_left!
  move_slowest_right!
  return if @left.length == 0
  move_fastest_left!
end
puts "State: #{@left} #{@right}"
puts "-------------------------"
while @left.length > 0
  iterate!
end
Output:
State: [1, 2, 5, 10] []
-------------------------
Moving right: [1, 2]
State: [5, 10] [1, 2]
Time: 2
-------------------------
Moving left: 1
State: [1, 5, 10] [2]
Time: 3
-------------------------
Moving right: [5, 10]
State: [1] [2, 5, 10]
Time: 13
-------------------------
Moving left: 2
State: [1, 2] [5, 10]
Time: 15
-------------------------
Moving right: [1, 2]
State: [] [1, 2, 5, 10]
Time: 17
-------------------------
An exhaustive search of all possibilities is simple with such a small problem space. Breadth or depth first would work. It is a simple CS problem.
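A sketch of such an exhaustive search in Python, assuming at most two people cross at a time. Because crossings have different costs, this version uses uniform-cost (Dijkstra-style) search instead of plain breadth-first search. The state encoding and the function name `min_time_exhaustive` are choices made for this illustration:

```python
import heapq
from itertools import combinations


def min_time_exhaustive(times):
    """Cheapest way to get everyone across, found by searching all states."""
    n = len(times)
    everyone = (1 << n) - 1
    # State: (bitmask of people still on the start side, flashlight side),
    # where side 0 is the start bank and side 1 is the far bank.
    best = {(everyone, 0): 0}
    heap = [(0, everyone, 0)]
    while heap:
        cost, left, side = heapq.heappop(heap)
        if left == 0:
            return cost  # nobody remains on the start side
        if cost > best.get((left, side), float("inf")):
            continue     # stale queue entry
        on_start = 1 if side == 0 else 0
        here = [i for i in range(n) if ((left >> i) & 1) == on_start]
        for size in (1, 2):
            for group in combinations(here, size):
                mask = sum(1 << i for i in group)
                new_cost = cost + max(times[i] for i in group)
                state = (left ^ mask, 1 - side)  # group switches banks
                if new_cost < best.get(state, float("inf")):
                    best[state] = new_cost
                    heapq.heappush(heap, (new_cost, state[0], state[1]))
    return None


print(min_time_exhaustive([1, 2, 5, 10]))  # 17
```

With n people there are at most 2 * 2^n states, so this is practical only for small groups, but it makes no assumptions about which strategy is best.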
I prefer the missionary and cannibal problems myself
17 -- a very common question
-> 1-2 = 2
<- 2 = 2
-> 5,10 = 10 (none of them has to return)
<- 1 = 1
-> 1,2 = 2
all on the other side
total = 2+2+10+1+2 = 17
usually people get it as 19 in the first try
Considering there will be 2 sides, side 1 and side 2, and N number of people should cross from side 1 to side 2. The logic to cross the bridge by a limit of L number of people would be -
Step 1 : Move L number of the fastest members from side 1 to side 2
Step 2 : Bring back the fastest person back from Side 2 to Side 1
Step 3 : Move L number of slowest members from side 1 to side 2
Step 4 : Bring back the fastest person among the ones present in Side 2
Repeat these steps until you will be left with no one in Side 1, either at the end of step 2 or at the end of step 4.
Here is C# code for N people with just 2 persons crossing at a time. It takes the number of people N at runtime, then reads a name and crossing time for each of the N people. The output lists each crossing step for the lowest time it finds, followed by the total.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace RiverCrossing_Problem
{
class Program
{
static void Main(string[] args)
{
Dictionary<string, int> Side1 = new Dictionary<string, int>();
Dictionary<string, int> Side2 = new Dictionary<string, int>();
Console.WriteLine("Enter number of persons");
int n = Convert.ToInt32(Console.ReadLine());
Console.WriteLine("Enter the name and time taken by each");
for(int a =0; a<n; a++)
{
string tempname = Console.ReadLine();
int temptime = Convert.ToInt32(Console.ReadLine());
Side1.Add(tempname, temptime);
}
Console.WriteLine("Shortest time and logic:");
int totaltime = 0;
int i = 1;
do
{
KeyValuePair<string, int> low1, low2, high1, high2;
if (i % 2 == 1)
{
LowestTwo(Side1, out low1, out low2);
Console.WriteLine("{0} and {1} goes from side 1 to side 2, time taken = {2}", low1.Key, low2.Key, low2.Value);
Side1.Remove(low2.Key);
Side1.Remove(low1.Key);
Side2.Add(low2.Key, low2.Value);
Side2.Add(low1.Key, low1.Value);
totaltime += low2.Value;
low1 = LowestOne(Side2);
Console.WriteLine("{0} comes back to side 1, time taken = {1}", low1.Key, low1.Value);
totaltime += low1.Value;
Side1.Add(low1.Key, low1.Value);
Side2.Remove(low1.Key);
i++;
}
else
{
HighestTwo(Side1, out high1, out high2);
Console.WriteLine("{0} and {1} goes from side 1 to side 2, time taken = {2}", high1.Key, high2.Key, high1.Value);
Side1.Remove(high1.Key);
Side1.Remove(high2.Key);
Side2.Add(high1.Key, high1.Value);
Side2.Add(high2.Key, high2.Value);
totaltime += high1.Value;
low1 = LowestOne(Side2);
Console.WriteLine("{0} comes back to side 1, time taken = {1}", low1.Key, low1.Value);
Side2.Remove(low1.Key);
Side1.Add(low1.Key, low1.Value);
totaltime += low1.Value;
i++;
}
} while (Side1.Count > 2);
KeyValuePair<string, int> low3, low4;
LowestTwo(Side1, out low3, out low4);
Console.WriteLine("{0} and {1} goes from side 1 to side 2, time taken = {2}", low3.Key, low4.Key, low4.Value);
Side2.Add(low4.Key, low4.Value);
Side2.Add(low3.Key, low3.Value);
totaltime += low4.Value;
Console.WriteLine("\n");
Console.WriteLine("Total Time taken = {0}", totaltime);
}
public static void LowestTwo(Dictionary<string, int> a, out KeyValuePair<string, int> low1, out KeyValuePair<string, int> low2)
{
Dictionary<string, int> b = a;
low1 = b.OrderBy(kvp => kvp.Value).First();
b.Remove(low1.Key);
low2 = b.OrderBy(kvp => kvp.Value).First();
}
public static void HighestTwo(Dictionary<string,int> a, out KeyValuePair<string,int> high1, out KeyValuePair<string,int> high2)
{
Dictionary<string, int> b = a;
high1 = b.OrderByDescending(k => k.Value).First();
b.Remove(high1.Key);
high2 = b.OrderByDescending(k => k.Value).First();
}
public static KeyValuePair<string, int> LowestOne(Dictionary<string,int> a)
{
Dictionary<string, int> b = a;
return b.OrderBy(k => k.Value).First();
}
}
}
Sample output for a random input provided which is 7 in this case, and 2 persons to cross at a time will be:
Enter number of persons
7
Enter the name and time taken by each
A
2
B
5
C
3
D
7
E
9
F
4
G
6
Shortest time and logic:
A and C goes from side 1 to side 2, time taken = 3
A comes back to side 1, time taken = 2
E and D goes from side 1 to side 2, time taken = 9
C comes back to side 1, time taken = 3
A and C goes from side 1 to side 2, time taken = 3
A comes back to side 1, time taken = 2
G and B goes from side 1 to side 2, time taken = 6
C comes back to side 1, time taken = 3
A and C goes from side 1 to side 2, time taken = 3
A comes back to side 1, time taken = 2
A and F goes from side 1 to side 2, time taken = 4
Total Time taken = 40
I mapped out the possible solutions algebraically to find the fastest time. Label the crossing times A, B, C, D, where A is the smallest and D is the biggest.
The formula for the shortest time is B+A+D+B+B, or 3B+A+D.
In words: three times the second-fastest time, plus the fastest and the slowest.
The program above also raised the question of more items. I haven't worked through it, but I am guessing the formula still applies: take the second item times 3 and add the remaining items, skipping the 2nd slowest of each pair.
e.g. since 4 items are 3 x second + first and fourth.
then 5 items are 3 x second + first, third and fifth.
would like to check this out using the program.
also i just looked at the pdf shared above, so for more items it is the sum of
3 x second + fastest + sum of slowest of each subsequent pair.
looking at the steps for the optimized solution, the idea is
-right - for two items going to the right the fastest is 1st and 2nd fastest ,
-left - then plus the fastest going back for a single item is the fastest item
-right - bring the slowest 2 items, which will account for only the slowest item and disregard the second slowest.
-left - the 2nd fastest item.
-final right - the 1st and 2nd fastest again
so again summing up = 2nd fastest goes 3 times, fastest goes once, and slowest goes with 2nd slowest.
A simple algorithm, assuming 'N' is the number of people who can cross at the same time and one person has to cross back bearing the torch:
1. When moving people from the first side to the second side, preference should be given to the 'N' slowest walkers.
2. Always use the fastest walker to take the torch from the second side to the first side.
3. When moving people from the first side to the second side, take into consideration who will bring back the torch in the next step. If the speed of the torch bearer in the next step will be equal to the fastest walker among the 'N' slowest walkers in the current step, then instead of choosing the 'N' slowest walkers, as given in '1', choose the 'N' fastest walkers.
Here is a sample python script which does this: https://github.com/meowbowgrr/puzzles/blob/master/bridgentorch.py
