Trying to find the lowest average height in this .dat file of numbers - Ruby

I'm trying to fit a swimming pool onto this piece of terrain. The first value in the file is the terrain size (10x10 in this case) and the last value is the size of the pool (2x2).
I've figured out how to read in the terrain and get its mean and standard deviation, but now I need to find the lowest average height. I know I need to use a while loop, but I don't know how to go about this. Can anyone help me?
10
1 1 3 4 5 6 7 8 9 10
1 2 3 4 5 6 7 8 9 10
1 2 3 4 5 6 7 8 9 10
1 2 3 4 5 6 7 8 9 10
1 2 3 4 5 6 7 8 9 10
1 2 3 4 5 6 7 8 9 10
1 2 3 4 5 6 7 8 9 10
1 2 3 4 5 6 7 12 12 12
1 2 3 4 5 6 7 12 12 12
1 2 3 4 5 6 7 12 12 12
21

Here are two answers showing different styles. The first is faster (only important for HUGE terrain sizes), but less "Ruby-esque"; the second is more functional, but creates extra intermediary data. For your own best education, I encourage you to ensure that you understand these thoroughly, and choose how to proceed in a way that is best for you.
Also, I've assumed that the 21 you have in your question is a mistake, and you meant to have a 2 there.
First, both solutions start with the same code that creates an array of arrays for the terrain:
# Load the text file as an array of strings
lines = IO.readlines('pool.txt')
# Turn it into an array of arrays of numbers
terrain = lines.map{ |s| s.scan(/\d+/).map(&:to_i) }
# Throw out the silly grid size; we'll infer it from real data instead!
terrain.shift
# Take the last line (pool size) out of the terrain
pool_size = terrain.pop.first
The first solution walks through the terrain and calculates the average for each sub-grid, keeping track of the lowest number:
# For fun, we'll allow terrain that doesn't have to be square
rows = terrain.length
cols = terrain.first.length
best_size = Float::INFINITY
# The last valid upper-left corner is at (size - pool_size), inclusive
0.upto(rows-pool_size) do |y|
  0.upto(cols-pool_size) do |x|
    # x,y is the upper left corner of a valid pool_size × pool_size grid
    average = 0.0
    0.upto(pool_size-1) do |m|
      0.upto(pool_size-1) do |n|
        # Add up each point in the sub-grid
        average += terrain[y+n][x+m]
      end
    end
    # The number of points we added is the square of the size
    average /= (pool_size*pool_size)
    # Mark this as the best seen so far
    best_size = average if average < best_size
  end
end
p best_size
#=> 1.25
The second solution finds all the sub-grids, and then uses the Enumerable#min_by method to find the best. We also create a method for calculating the average on an array of numbers, just for fun and more self-describing code:
# See http://ruby-doc.org/stdlib-1.9.3/libdoc/matrix/rdoc/Matrix.html
require 'matrix'
class Matrix
  # Average all values in the matrix (as a float)
  def average
    parts = to_a.flatten
    parts.inject(:+) / parts.length.to_f
  end
end
# Hey look, a nice 2D grid of elevations!
terrain = Matrix[ *terrain ]
# Create an array of matrices, each one representing a possible pool
# (inclusive ranges, so pools touching the bottom and right edges count)
rows = 0..(terrain.row_size - pool_size)
cols = 0..(terrain.column_size - pool_size)
pools = rows.flat_map{ |r| cols.map{ |c| terrain.minor(r,pool_size,c,pool_size) } }
# Find the lowest pool by calling the above 'average' method on each
lowest = pools.min_by(&:average)
p lowest, lowest.average
#=> Matrix[[1, 1], [1, 2]]
#=> 1.25
On my computer the simple array-of-arrays method takes ~0.6s to find the lowest 3x3 pool in a random 400×400 terrain, while the matrix technique takes ~1.3s. So the matrix style is more than twice as slow, but still plenty fast for your assignment. :)

It's Ruby. You probably want to use iterators, not while loops.
But do your own homework. You'll learn more.


Can you check for duplicates by taking the sum of the array and then the product of the array?

Let's say we have an array of size N with values from 1 to N inside it. We want to check if this array has any duplicates. My friend suggested two ways that I showed him were wrong:
Take the sum of the array and check it against the sum 1+2+3+...+N. I gave the example 1,1,4,4 which proves that this way is wrong since 1+1+4+4 = 1+2+3+4 despite there being duplicates in the array.
Next he suggested the same thing but with multiplication. i.e. check if the product of the elements in the array is equal to N!, but again this fails with an array like 2,2,3,2, where 2x2x3x2 = 1x2x3x4.
Finally, he suggested doing both checks, and if one of them fails, then there is a duplicate in the array. I can't help but feel that this is still incorrect, but I can't prove it to him by giving him an example of an array with duplicates that passes both checks. I understand that the burden of proof lies with him, not me, but I can't help but want to find an example where this doesn't work.
P.S. I understand there are many more efficient ways to solve such a problem, but we are trying to discuss this particular approach.
Is there a way to prove that doing both checks doesn't necessarily mean there are no duplicates?
Here's a counterexample: 1,3,3,3,4,6,7,8,10,10
Found by looking for a pair of composite numbers with factorizations that change the sum & count by the same amount.
I.e., 9 -> 3, 3 reduces the sum by 3 and increases the count by 1, and 10 -> 2, 5 does the same. So by converting 2,5 to 10 and 9 to 3,3, I leave both the sum and count unchanged. Also of course the product, since I'm replacing numbers with their factors & vice versa.
Here's a much longer one.
24 -> 2*3*4 increases the count by 2 and decreases the sum by 15
2*11 -> 22 decreases the count by 1 and increases the sum by 9
2*8 -> 16 decreases the count by 1 and increases the sum by 6.
We have a second 2 available because of the factorization of 24.
This gives us:
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24
Has the same sum, product, and count of elements as
1,3,3,4,4,5,6,7,9,10,12,13,14,15,16,16,17,18,19,20,21,22,22,23
In general you can find these by finding all factorizations of composite numbers, seeing how they change the sum & count (as above), and choosing changes in both directions (composite <-> factors) that cancel out.
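As a quick sanity check, the two counterexamples above can be verified mechanically. This is a minimal Python sketch; the helper names `passes_both_checks` and `has_duplicates` are made up for illustration and do not come from the answer.

```python
from math import prod, factorial

def passes_both_checks(values):
    """True if values has the same sum and product as 1..N, where N = len(values)."""
    n = len(values)
    return sum(values) == n * (n + 1) // 2 and prod(values) == factorial(n)

def has_duplicates(values):
    """True if any value occurs more than once."""
    return len(set(values)) != len(values)

short = [1, 3, 3, 3, 4, 6, 7, 8, 10, 10]
long_ = [1, 3, 3, 4, 4, 5, 6, 7, 9, 10, 12, 13, 14, 15, 16, 16,
         17, 18, 19, 20, 21, 22, 22, 23]

for candidate in (short, long_):
    # Both lines print "True True": the checks pass although duplicates exist.
    print(passes_both_checks(candidate), has_duplicates(candidate))
```

Both arrays pass the combined sum-and-product test while containing duplicates, so the combined test cannot rule duplicates out.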
I've just written a simple, not very efficient brute-force function. It shows that, for example, the sequence
1 2 4 4 4 5 7 9 9
has the same sum and product as
1 2 3 4 5 6 7 8 9
For n = 10 there are more such sequences:
1 2 3 4 6 6 6 7 10 10
1 2 4 4 4 5 7 9 9 10
1 3 3 3 4 6 7 8 10 10
1 3 3 4 4 4 7 9 10 10
2 2 2 3 4 6 7 9 10 10
My write-only c++ code is here: https://ideone.com/2oRCbh
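A brute-force search along these lines might look like the following Python sketch. It is not the linked C++ code; the function name `false_positives` and the choice to enumerate sorted multisets are assumptions made for illustration.

```python
from itertools import combinations_with_replacement
from math import prod, factorial

def false_positives(n):
    """Sorted multisets of n values from 1..n, other than 1..n itself,
    whose sum and product both match those of 1..n."""
    target_sum = n * (n + 1) // 2
    target_prod = factorial(n)
    identity = tuple(range(1, n + 1))
    return [c for c in combinations_with_replacement(range(1, n + 1), n)
            if c != identity
            and sum(c) == target_sum
            and prod(c) == target_prod]

for seq in false_positives(9):
    print(*seq)
```

The search space grows quickly with n (24,310 multisets for n = 9 and 92,378 for n = 10), so this only works for small arrays.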

what does 'work through an algorithm by hand' mean?

I'm doing this assignment and I don't understand the wording. Do you think it means to write in pseudocode or write a paragraph? Does anyone have any ideas?
It means to describe the algorithm with words and draw the array values in each step. Here is an example: https://www.geeksforgeeks.org/bubble-sort/
Before executing the algorithm, the array is:
4 2 12 1 7 9 9
After executing the algorithm, the array is:
1 2 4 7 9 9 12
During the execution of the algorithm, the array slowly changes from what it was before to what it will be after. Your assignment requires you to show all the intermediate steps.
For instance, the very first step of execution will be "compare element at position 0 with position 1; if element at position 1 is lower, then swap the two elements". The first two elements are 4 and 2; 2 is lower; hence they should be swapped; the resulting array is:
2 4 12 1 7 9 9
Then the second step will be "compare elements at position 1 and 2", which are 4 and 12; etc.
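To check a hand-written trace, it can help to have a program print every intermediate step. The following Python sketch assumes the algorithm in question is a plain bubble sort, as in the linked example; the function name `bubble_sort_trace` is made up for illustration.

```python
def bubble_sort_trace(values):
    """Bubble sort that records the array after every comparison step."""
    arr = list(values)
    steps = []
    # After each pass the largest remaining element has settled at index `end`.
    for end in range(len(arr) - 1, 0, -1):
        for i in range(end):
            swapped = arr[i + 1] < arr[i]
            if swapped:
                arr[i], arr[i + 1] = arr[i + 1], arr[i]
            steps.append((i, i + 1, swapped, list(arr)))
    return arr, steps

result, steps = bubble_sort_trace([4, 2, 12, 1, 7, 9, 9])
for left, right, swapped, snapshot in steps:
    action = "swap" if swapped else "keep"
    print(f"compare positions {left} and {right}: {action} -> {snapshot}")
```

The first printed line corresponds to the first step described above (positions 0 and 1 are swapped), and the last snapshot is the sorted array.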

matlab for loop: fastest and most efficient method to reproduce large matrix

My data is a 2096x252 matrix of double values. I need a for loop or an equivalent which performs the following:
Each time the matrix is reproduced, the first row is deleted and the second row becomes the first. When the loop runs again, the remaining matrix is reproduced, its first row is deleted, the next row becomes the first, and so on.
I've tried using repmat but it is too slow and tedious when dealing with large matrices (2096x252).
Example input:
1 2 3 4
3 4 5 6
3 5 7 5
9 6 3 2
Desired output:
1 2 3 4
3 4 5 6
3 5 7 5
9 6 3 2
3 4 5 6
3 5 7 5
9 6 3 2
3 5 7 5
9 6 3 2
9 6 3 2
Generally with Matlab it is much faster to pre-allocate a large array than to build it incrementally. When you know in advance the final size of the large array there's no reason not to follow this general advice.
Something like the following should do what you want. Suppose you have an array in(nrows, ncols); then
indices = [0 nrows:-1:1];
out = zeros(sum(indices),ncols);
for ix = 1:nrows
    out(1+sum(indices(1:ix)):sum(indices(1:ix+1)),:) = in(ix:end,:);
end
This worked on your small test input. I expect you can figure out what is going on.
Whether it is the fastest of all possible approaches I don't know, but I expect it to be much faster than building a large matrix incrementally.
Disclaimer:
You'll probably have memory issues with large matrices, but that is not the question.
Now, to the business:
For a given matrix A, the straightforward approach with the for loop would be:
[N, M] = size(A);
B = zeros(sum(1:N), M);
offset = 1;
for i = 1:N
    % Copy rows i..N of A into the next N-i+1 rows of B
    B(offset:offset + N - i, :) = A(i:end, :);
    offset = offset + N - i + 1;
end
B is the desired output matrix.
However, this solution is expected to be slow as well, because of the for loop.
Edit: preallocated B instead of dynamically changing size (this optimization should achieve a slight speedup).
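For readers who want to check the expected output independently of MATLAB, the same stacking can be sketched in a few lines of Python. This uses plain lists of rows rather than a numeric array, and the name `stack_suffixes` is made up for illustration.

```python
def stack_suffixes(rows):
    """Stack rows[0:], rows[1:], rows[2:], ... on top of each other."""
    out = []
    for start in range(len(rows)):
        out.extend(rows[start:])
    return out

example = [
    [1, 2, 3, 4],
    [3, 4, 5, 6],
    [3, 5, 7, 5],
    [9, 6, 3, 2],
]
for row in stack_suffixes(example):
    print(*row)
```

An input with N rows produces N(N+1)/2 output rows. For the 2096x252 input that is 2,197,656 rows, roughly 4.4 GB when stored as doubles, which is the memory concern raised in the disclaimer above.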

Summation of difference between matrix elements

I am in the process of building a function in MATLAB. As a part of it I have to calculate differences between elements in two matrices and sum them up.
Let me explain considering two matrices,
1 2 3 4 5 6
13 14 15 16 17 18
and
7 8 9 10 11 12
19 20 21 22 23 24
The calculations in the first row - only four elements in both matrices are considered at once (zero indicates padding):
(1-8)+(2-9)+(3-10)+(4-11): This replaces 1 in initial matrix.
(2-9)+(3-10)+(4-11)+(5-12): This replaces 2 in initial matrix.
(3-10)+(4-11)+(5-12)+(6-0): This replaces 3 in initial matrix.
(4-11)+(5-12)+(6-0)+(0-0): This replaces 4 in initial matrix. And so on
I am unable to decide how to code this in MATLAB. How do I do it?
I use the following equation.
Here i ranges from 1 to n(h), where n(h) is the number of distant pairs. It depends on the lag distance chosen: with a lag distance of 1, n(h) is the number of elements minus 1.
When I use a 7 X 7 window, considering the central value, n(h) = 4 - 1 = 3 which is the case here.
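You may want to look at the circshift() function:
a = [1 2 3 4; 9 10 11 12];
b = [5 6 7 8; 12 14 15 16];
for k = 1:3
    b = circshift(b, [0 -1]); % shift b one column to the left
    b(:, end) = 0;            % zero-pad the column that wrapped around
    d = sum(a - b, 2)         % row sums of the differences (no semicolon, so they print)
end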
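The calculation spelled out in the question can also be written directly as a sliding window. The Python sketch below is one reading of that description, not a translation of the MATLAB snippet: it assumes a window of 4 elements, a lag of 1 (element j of the first row is paired with element j+1 of the second), and zero padding past the end of either row. The function name `lagged_window_sums` and its parameters are made up for illustration.

```python
def lagged_window_sums(row_a, row_b, window=4, lag=1):
    """For each position j, sum row_a[j+k] - row_b[j+k+lag] over k = 0..window-1,
    treating anything past the end of a row as zero."""
    def at(row, idx):
        return row[idx] if idx < len(row) else 0

    return [
        sum(at(row_a, j + k) - at(row_b, j + k + lag) for k in range(window))
        for j in range(len(row_a))
    ]

print(lagged_window_sums([1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12]))
# [-28, -28, -15, -8, -1, 6]
```

The first four values match the four sums written out in the question; the last two follow the same rule, which is an assumption about what "and so on" means.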

Using one probability set to generate another [duplicate]

This question already has answers here:
Expand a random range from 1–5 to 1–7
(78 answers)
Closed 8 years ago.
How can I generate a bigger probability set from a smaller probability set?
This is from Algorithm Design Manual -Steven Skiena
Q:
Use a random number generator (rng04) that generates numbers from {0,1,2,3,4} with equal probability to write a random number generator that generates numbers from 0 to 7 (rng07) with equal probability?
I tried for around 3 hours now, mostly based on summing two rng04 outputs. The problem is that in that case the probability of each value is different: 4 comes up with probability 5/25, while 0 comes up with probability only 1/25. I tried some ways to mask it, but cannot.
Can somebody solve this?
You have to find a way to combine the two sets of random numbers (the first and second random {0,1,2,3,4} ) and make n*n distinct possibilities. Basically the problem is that with addition you get something like this
         X
      0  1  2  3  4
   0  0  1  2  3  4
Y  1  1  2  3  4  5
   2  2  3  4  5  6
   3  3  4  5  6  7
   4  4  5  6  7  8
Which has duplicates, which is not what you want. One possible way to combine the two sets would be the Z = X + Y*5 where X and Y are the two random numbers. That would give you a set of results like this
         X
      0  1  2  3  4
   0  0  1  2  3  4
Y  1  5  6  7  8  9
   2 10 11 12 13 14
   3 15 16 17 18 19
   4 20 21 22 23 24
So now that you have a bigger set of random numbers, you need to do the reverse and make it smaller. This set has 25 distinct values (because you started with 5, and used two random numbers, so 5*5=25). The set you want has 8 distinct values. A naïve way to do this would be
x = rnd(5) // {0,1,2,3,4}
y = rnd(5) // {0,1,2,3,4}
z = x+y*5 // {0-24}
random07 = z mod 8
This would indeed produce values in the range 0 to 7. But each of the values 1 through 7 would appear with probability 3/25, while the value 0 would appear with probability 4/25. This is because 0 mod 8 = 0, 8 mod 8 = 0, 16 mod 8 = 0 and 24 mod 8 = 0.
To fix this, you can modify the code above to this.
do {
    x = rnd(5) // {0,1,2,3,4}
    y = rnd(5) // {0,1,2,3,4}
    z = x+y*5 // {0-24}
} while (z == 24) // draw again if we hit the one surplus value
random07 = z mod 8
This will take the one value (24) that is throwing off your probabilities and discard it. Generating a new random number if you get a 'bad' value like this will make your algorithm run very slightly longer (in this case 1/25 of the time it will take 2x as long to run, 1/625 it will take 3x as long, etc). But it will give you the right probabilities.
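The counting argument can be confirmed by enumerating all 25 input pairs. The Python sketch below also includes a sampler built on the same rejection idea; `rng04` is a stand-in based on `random.randrange`, and the name `rng07` and its `source` parameter are assumptions made for illustration.

```python
import random
from collections import Counter

def rng04():
    """Stand-in for the given generator: uniform over {0, 1, 2, 3, 4}."""
    return random.randrange(5)

def rng07(source=rng04):
    """Uniform over {0, ..., 7}: combine two draws, reject the surplus value 24."""
    while True:
        z = source() + 5 * source()  # uniform over 0..24
        if z != 24:
            return z % 8

pairs = [(x, y) for x in range(5) for y in range(5)]
naive = Counter((x + 5 * y) % 8 for x, y in pairs)
fixed = Counter((x + 5 * y) % 8 for x, y in pairs if x + 5 * y != 24)
print(sorted(naive.items()))  # 0 is hit 4 times, every other value 3 times
print(sorted(fixed.items()))  # every value is hit exactly 3 times
```

Without the rejection step the value 0 is over-represented; with it, each of the 8 outputs is backed by exactly 3 of the 24 accepted input pairs.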
The real problem, of course, is the fact that the numbers in the middle of the sum (4 in this case) occur in many combinations (0+4, 1+3, etc.) whereas 0 and 8 have exactly one way to be produced.
I don't know how to solve this problem, but I'm going to try to reduce it a bit for you. Some points to consider:
The 0-7 range has 8 possible values, so ultimately the total number of possible situations that you should aim for has to be a multiple of 8. That way you can have an integral number of distributions per value in that codomain.
When you take the sum of two density functions, the number of possible situations (not necessarily distinct when you evaluate the sum, just in terms of different permutations of inputs) is equal to the product of the size of each of the input sets.
Thus, given two {0,1,2,3,4} sets summed together, you have 5*5=25 possibilities.
It will not be possible to get a multiple of eight (see first point) from powers of 5 (see second point, but extrapolate it to any number of sets > 1), so you will need to have a surplus of possible situations in your function and ignore some of them if they occur.
The simplest way to do that, as far as I can see at this point, is to use the sum of two {0,1,2,3,4} sets (25 possibilities) and ignore 1 (to leave 24, a multiple of 8).
Thus the challenge now has been reduced to this: Find a way to distribute the remaining 24 possibilities among the 8 output values. For this, you'll probably NOT want to use the sum, but rather just the input values.
One way to do that is, imagine a number in base 5 constructed from your input. Ignore 44 (that's your 25th, superfluous value; if you get it, synthesize a new set of inputs) and take the others, modulo 8, and you'll get your 0-7 across 24 different input combinations (3 each), which is an equal distribution.
My logic would be this:
rn07 = 0;
do {
    num = rng04;
} while (num == 4);
rn07 = num * 2;
do {
    num = rng04;
} while (num == 4);
rn07 += num % 2;
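A short enumeration suggests that this two-stage logic is also uniform. The Python sketch below lists every pair of accepted draws (each restricted to 0..3 after rejecting 4) and counts the resulting outputs; the variable names are made up for illustration.

```python
from collections import Counter

# After rejecting 4, each stage is uniform over {0, 1, 2, 3}.
# Stage one contributes 0, 2, 4 or 6; stage two contributes its parity, 0 or 1.
counts = Counter(2 * first + second % 2
                 for first in range(4)
                 for second in range(4))
print(sorted(counts.items()))  # every value 0..7 appears exactly twice
```

Each stage needs 5/4 calls to rng04 on average, so about 2.5 calls in total, compared with 2 · 25/24 ≈ 2.08 for the variant that rejects only the combined value 24.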
