Check if all items within sub-array are identical Ruby - ruby

Trying to check if all items within sub-arrays are the same. For example, I have a 5x5 board and I want to know if one of the arrays contains all x's:
board = [[47, 44, 71, 8, 88],
['x', 'x', 'x', 'x', 'x'],
# [83, 85, 97, 'x', 57],
[83, 85, 97, 89, 57],
[25, 31, 96, 68, 51],
[75, 70, 54, 80, 83]]
I currently have:
def check_x
board.each do |x|
return true if x.include?('x')
end
return false
end
But this will merely check if one of the integers is x and not all. Any suggestions would be greatly appreciated.

A bit more idiomatic:
board.one? { |row| row.all? { |item| item == 'x' } }
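Note that one? is true only when exactly one row matches, so a board with two full rows would give false, whereas any? asks whether at least one row matches. A minimal sketch of the difference, assuming a hypothetical helper named full_row? and two made-up boards:

```ruby
# Hypothetical helper: true if at least one row consists solely of `marker`.
def full_row?(board, marker = 'x')
  board.any? { |row| row.all? { |item| item == marker } }
end

one_full = [[47, 44, 71], ['x', 'x', 'x'], [83, 'x', 57]]
two_full = [['x', 'x', 'x'], ['x', 'x', 'x'], [83, 85, 97]]

p full_row?(one_full)                                       # => true
p full_row?(two_full)                                       # => true
p two_full.one? { |row| row.all? { |item| item == 'x' } }   # => false
```

Passing the board as an argument also avoids relying on a board variable that is not visible inside the method.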

As simple as board.map { |row| row.uniq.count == 1 } will do
#=> [false, true, false, false, false]
uniq returns unique elements in an array. map here is iterating over your array and passing one row at a time to the block. It will return true for cases where all elements in an array are same (['x', 'x', 'x', 'x', 'x'].uniq #=> ['x'] whose length is 1)
If you just want to check whether any row in board consists of identical elements, Ruby has just the method for it: any?. Replace map with any? in the one-liner above:
board.any? { |row| row.uniq.count == 1 } #=> true
If you want to find out which row(s) has/have all the duplicates, and what duplicate it has:
board.each.with_index.select { |row, index| row.uniq.count == 1 }
#=> [[["x", "x", "x", "x", "x"], 1]], where 1 is index.
Pure Ruby awesomeness.
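If only the positions of the uniform rows are needed, each_index combined with select returns them directly. A small sketch, assuming a shortened board made up for illustration:

```ruby
board = [[47, 44, 71],
         ['x', 'x', 'x'],
         [83, 85, 'x'],
         [7, 7, 7]]

# Indices of rows whose elements are all equal (any value, not just 'x').
uniform = board.each_index.select { |i| board[i].uniq.size == 1 }
p uniform   # => [1, 3]

# Restrict the check to rows made up of 'x' only.
x_rows = board.each_index.select { |i| board[i].uniq == ['x'] }
p x_rows    # => [1]
```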

If all elements in an array are the same, its maximum and minimum are equal.
For your board, you can find the index of the desired sub-array with this one line:
board.each {|b| puts board.index(b) if b.max == b.min}
Or just replace x.include?("x") with x.min == x.max in your function for a true/false result.
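One caveat with min == max: it relies on the elements being mutually comparable. A row that mixes integers and strings, such as the commented-out row in the question, makes min and max raise an ArgumentError. A hedged sketch of a variant that treats such rows as non-uniform (the helper name uniform_row? is made up for illustration):

```ruby
# Hypothetical helper: true if every element of the row is the same value.
def uniform_row?(row)
  !row.empty? && row.min == row.max
rescue ArgumentError
  # Elements such as 83 and 'x' cannot be compared, so the row is mixed.
  false
end

p uniform_row?(['x', 'x', 'x'])     # => true
p uniform_row?([83, 85, 97])        # => false
p uniform_row?([83, 85, 'x', 57])   # => false (the comparison raises)
```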

Assuming all elements of board (rows of the board) are the same size, which seems a reasonable assumption, you could do it thus:
x_row = ['x']*board.first.size
#=> ["x", "x", "x", "x", "x"]
board.any? { |row| row == x_row }
#=> true

Assuming it's always a fixed length array, your method can just be:
def full_row
board.each do |row|
return true if (row.uniq.count == 1) && (row[0] == 'x')
end
return false
end
This could be boiled down to fewer lines, but I hate line wrapping in vim :p

Related

I have to write a program that outputs the largest X number, given X number and a huge filesize

So far, I have this code that reads a file and sorts it using Ruby. But this doesn't sort the numbers correctly and I think it will be inefficient, given that the file can be as big as 200GB and contains a number on each line. Can you suggest what else to do?
File.open("topN.txt", "w") do |file|
File.readlines("N.txt").sort.reverse.each do |line|
file.write(line.chomp<<"\n")
end
end
After everyone's help here, this is how my code looks so far:
begin
puts "What is the file name?"
file = gets.chomp
puts "What is the N number?"
myN = Integer(gets.chomp)
rescue ArgumentError
puts "That's not a number, try again"
retry
end
topN = File.open(file).each_line.max(myN){|a,b| a.to_i <=> b.to_i}
puts topN
Sorting 200GB of data in memory will not be very performant. I would write a little helper class which only remembers the N biggest elements added so far.
class SortedList
attr_reader :list
def initialize(size)
@list = []
@size = size
end
def add(element)
return if @min && @min > element
list.push(element)
reorganize_list
end
private
def reorganize_list
@list = list.sort.reverse.first(@size)
# Set the cutoff only once the list is full; otherwise small early values would be rejected.
@min = list.last if list.size == @size
end
end
Initialize an instance with the required N and then just add the values parsed from each line to this instance.
sorted_list = SortedList.new(n)
# File.foreach streams the file line by line instead of loading it all into memory.
File.foreach("N.txt") do |line|
sorted_list.add(line.to_i)
end
puts sorted_list.list
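The same idea can be sketched without re-sorting on every insertion. The version below is a variation, not the code above: it keeps the list in descending order and uses a binary search to find the insertion point. The method name top_n is hypothetical, and a StringIO stands in for the real file so the example is self-contained.

```ruby
require 'stringio'

# Hypothetical helper: returns the n largest integers read from io,
# highest first, holding at most n + 1 values in memory at a time.
def top_n(io, n)
  return [] if n <= 0
  best = []   # kept sorted in descending order
  io.each_line do |line|
    value = line.to_i
    # Skip values that cannot make it into a full list.
    next if best.size == n && value <= best.last
    # First position holding a smaller element, or the end of the list.
    idx = best.bsearch_index { |x| x < value } || best.size
    best.insert(idx, value)
    best.pop if best.size > n
  end
  best
end

data = StringIO.new("117\n106\n143\n147\n63\n118\n146\n93\n")
p top_n(data, 3)   # => [147, 146, 143]
```

Any object that responds to each_line works as io, so an opened File can be passed in the same way.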
Suppose
str = File.read(in_filename)
#=> "117\n106\n143\n147\n63\n118\n146\n93\n"
You could convert that string to an enumerator that enumerates lines, use Enumerable#sort_by to sort those lines in descending order, join the resulting lines (that end in newlines) to form a string that can be written to file:
str.each_line.sort_by { |line| -line.to_i }.join
#=> "147\n146\n143\n118\n117\n106\n93\n63\n"
Another way is to convert the string to array of integers, sort the array using Array#sort, reverse the resulting array and then join the elements of the array back into a string that can be written to file:
str.each_line.map(&:to_i).sort.reverse.join("\n") << "\n"
#=> "147\n146\n143\n118\n117\n106\n93\n63\n"
Let's do a quick benchmark.
require 'benchmark/ips'
(str = 1_000_000.times.map { rand(10_000) }.join("\n") << "\n").size
Benchmark.ips do |x|
x.report("sort_by") { str.each_line.sort_by { |line| -line.to_i }.join }
x.report("sort") { str.each_line.map(&:to_i).sort.reverse.join("\n") << "\n" }
x.compare!
end
Comparison:
sort: 0.4 i/s
sort_by: 0.3 i/s - 1.30x slower
The mighty sort wins again!
You left this comment on your question:
"Write a program, topN, that given a number N and an arbitrarily large file that contains individual numbers on each line (e.g. 200Gb file), will output the largest N numbers, highest first."
That problem seems to me as somewhat different than the one described in the question, and also constitutes a more interesting problem. I have addressed that problem in this answer.
Code
def topN(fname, n, m=n)
raise ArgumentError, "m cannot be smaller than n" if m < n
f = File.open(fname)
best = Array.new(n)
n.times do |i|
break best.replace(best[0,i]) if f.eof?
best[i] = f.readline.to_i
end
best.sort!.reverse!
return best if f.eof?
new_best = Array.new(n)
cand = Array.new(m)
until f.eof?
rd(f, cand)
merge_arrays(best, new_best, cand)
end
f.close
best
end
def rd(f, cand)
cand.size.times { |i| cand[i] = (f.eof? ? -Float::INFINITY : f.readline.to_i) }
cand.sort!.reverse!
end
def merge_arrays(best, new_best, cand)
cand_largest = cand.first
best_idx = best.bsearch_index { |n| cand_largest > n }
return if best_idx.nil?
bi = best_idx
cand_idx = 0
nbr_to_compare = best.size-best_idx
nbr_to_compare.times do |i|
if cand[cand_idx] > best[bi]
new_best[i] = cand[cand_idx]
cand_idx += 1
else
new_best[i] = best[bi]
bi += 1
end
end
best[best_idx..-1] = new_best[0, nbr_to_compare]
end
Examples
Let's create a file with 10 million representations of integers, one per line.
require 'time'
FName = 'test'
(s = 10_000_000.times.with_object('') { |_,s| s << rand(100_000_000).to_s << "\n" }).size
s[0,27]
#=> "86752031\n84524374\n29347072\n"
File.write(FName, s)
#=> 88_888_701
Next, create a simple method to invoke topN with different arguments and to also show execution times.
def try_one(n, m=n)
t = Time.now
a = topN(FName, n, m)
puts "#{(Time.new-t).round(2)} seconds"
puts "top 5: #{a.first(5)}"
puts "bot 5: #{a[n-5..n-1]}"
end
In testing I found that setting m less than n was never desirable in terms of computational time. Requiring that m >= n allowed a small simplification to the code and a small efficiency improvement. I therefore made m >= n a requirement.
try_one 100, 100
9.44 seconds
top 5: [99999993, 99999993, 99999991, 99999971, 99999964]
bot 5: [99999136, 99999127, 99999125, 99999109, 99999078]
try_one 100, 1000
9.53 seconds
top 5: [99999993, 99999993, 99999991, 99999971, 99999964]
bot 5: [99999136, 99999127, 99999125, 99999109, 99999078]
try_one 100, 10_000
9.95 seconds
top 5: [99999993, 99999993, 99999991, 99999971, 99999964]
bot 5: [99999136, 99999127, 99999125, 99999109, 99999078]
Here I've tested for the case of producing the 100 largest values with different number of lines of the file to read at a time m. As seen, the method is insensitive to this latter value. As expected, the largest 5 values and the smallest 5 values (of the 100 returned) are the same in all cases.
try_one 1_000
9.31 seconds
top 5: [99999993, 99999993, 99999991, 99999971, 99999964]
bot 5: [99990425, 99990423, 99990415, 99990406, 99990399]
try_one 1000, 10_000
9.24 seconds
The time required to return the 1,000 largest values is, in fact, slightly less than the times for returning the largest 100. I expect that's not reproducible. The top 5 are of course the same as when returning the largest 100 values. I therefore will not display that line below. The smallest 5 values of the 1000 returned are of course smaller than when the largest 100 values are returned.
try_one 10_000
12.15 seconds
bot 5: [99898951, 99898950, 99898946, 99898932, 99898922]
try_one 100_000
13.2 seconds
bot 5: [98995266, 98995259, 98995258, 98995254, 98995252]
try_one 1_000_000
14.34 seconds
bot 5: [89999305, 89999302, 89999301, 89999301, 89999287]
Explanation
Notice that I reuse three arrays, best, cand and new_best. Specifically, I replace the contents of these arrays many times rather than continually creating new (potentially very large) arrays, leaving orphaned arrays to be garbage-collected. A little testing showed this approach improved performance.
We can create a small example and then step through the calculations.
fname = 'temp'
File.write(fname, 20.times.map { rand(100) }.join("\n") << "\n")
#=> 58
This file contains representations of integers in the following array.
arr = File.read(fname).lines.map(&:to_i)
#=> [9, 66, 80, 64, 67, 67, 89, 10, 62, 94, 41, 16, 0, 22, 68, 72, 41, 64, 87, 24]
Sorted, this is:
arr.sort_by! { |n| -n }
#=> [94, 89, 87, 80, 72, 68, 67, 67, 66, 64, 64, 62, 41, 41, 24, 22, 16, 10, 9, 0]
Let's assume we want the 5 largest values.
arr[0,5]
#=> [94, 89, 87, 80, 72]
First, set the two parameters: n, the number of largest values to return, and m, the number of lines to read from the file at a time.
n = 5
m = 5
The calculations follow.
m < n
#=> false, so do not raise ArgumentError
f = File.open(fname)
#=> #<File:temp>
best = Array.new(n)
#=> [nil, nil, nil, nil, nil]
n.times { |i| f.eof? ? (return best.replace(best[0,i])) : best[i] = f.readline.to_i }
best
#=> [9, 66, 80, 64, 67]
best.sort!.reverse!
#=> [80, 67, 66, 64, 9]
f.eof?
#=> false, so do not return
new_best = Array.new(n)
#=> [nil, nil, nil, nil, nil]
cand = Array.new(m)
#=> [nil, nil, nil, nil, nil]
puts "best=#{best}".rjust(52)
until f.eof?
rd(f, cand)
merge_arrays(best, new_best, cand)
puts "cand=#{cand}, best=#{best}"
end
f.close
best
#=> [94, 89, 87, 80, 72]
The following is displayed.
best=[80, 67, 66, 64, 9]
cand=[94, 89, 67, 62, 10], best=[94, 89, 80, 67, 67]
cand=[68, 41, 22, 16, 0], best=[94, 89, 80, 68, 67]
cand=[87, 72, 64, 41, 24], best=[94, 89, 87, 80, 72]
Enumerable#max takes an argument which specifies how many elements will be returned, and a block which specifies how elements are compared:
N = 5
p File.open("test.txt").each_line.max(N){|a,b| a.to_i <=> b.to_i}
This does not read the entire file in memory; the file is read line by line.
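A small self-contained sketch of the same call, with a StringIO standing in for the file (an assumption made so the example runs without test.txt). It also shows why the comparison has to go through to_i: sorted as strings, "9" ranks above "117".

```ruby
require 'stringio'

# StringIO stands in for the real file so the example is self-contained.
lines = StringIO.new("9\n10\n2\n117\n")

# Compare as integers; the lines themselves are returned unchanged.
top = lines.each_line.max(2) { |a, b| a.to_i <=> b.to_i }
p top               # => ["117\n", "10\n"]
p top.map(&:to_i)   # => [117, 10]

# Plain string ordering, as in the original sort.reverse attempt.
p %w[9 10 2 117].sort.reverse   # => ["9", "2", "117", "10"]
```

With a real file, File.foreach(path).max(n) { ... } reads line by line in the same way.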

All indexes having the same character in nested array Ruby

I'm trying to see if an index, say index 3, all has the same character. For example:
nested_array = [[47, 44, 71, 'x', 88],
[22, 69, 75, 'x', 73],
[83, 85, 97, 'x', 57],
[25, 31, 96, 'x', 51],
[75, 70, 54, 'x', 83]]
As we can see, index 3 has all x's in it instead of numbers. I want to be able to verify whether every row has an x at that index; if so, the program returns true. If all but one have x's, it would return false.
I have now implemented this conditional statement, which checks each index in turn for a column whose unique values have a size of 1. This seemed to do the trick.
if array.map {|zero| zero[0]}.uniq.size == 1
return true
elsif array.map {|one| one[1]}.uniq.size == 1
return true
elsif array.map {|two| two[2]}.uniq.size == 1
return true
elsif array.map {|three| three[3]}.uniq.size == 1
return true
elsif array.map {|four| four[4]}.uniq.size == 1
return true
else
return false
end
nested_array.empty? || nested_array.map {|row| row[3]}.uniq.one?
The special condition for empty arrays is necessary because they trivially fulfill any condition that is specified for each element, but the uniqueness test would not pick it up. Feel free to drop it if you will not in fact have an empty array, or the test should fail if the array is empty.
EDIT: Seems I misunderstood the question - try this as an alternative to your multi-if:
nested_array.transpose.any? { |column| column.uniq.one? }
"When you flip rows and columns, does any of the rows (that used to be columns) have only one unique element?"
(if you want true for [], again, prefix with nested_array.empty? || ...).
EDIT: Thanks to spickermann, replaced .size == 1 with .one?. It really does read better.
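A short sketch of the transpose approach, narrowed to columns that consist of a given marker rather than any repeated value. The helper name full_column? and the sample data are assumptions for illustration. It also shows that transpose raises an IndexError when the rows differ in length.

```ruby
# Hypothetical helper: true if some column holds only `marker`.
def full_column?(rows, marker = 'x')
  rows.transpose.any? { |column| column.all? { |item| item == marker } }
end

grid = [[47, 'x', 88],
        [22, 'x', 73],
        [83, 'x', 57]]

p full_column?(grid)   # => true

# Rows of different lengths cannot be transposed.
ragged = [[1, 'x'], [2, 'x', 3]]
begin
  ragged.transpose
rescue IndexError => e
  puts "transpose failed: #{e.message}"
end
```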
You did not specify that all elements of nested_array are necessarily of the same size, so I will offer a solution that does not have that as a requirement:
nested_array = [[47, 44, 71, 'x', 88],
[75, 70, 54, 'x', 83, 85, 90],
[22, 69, 75, 'x', 73],
[83, 85, 97, 'x', 57],
[25, 31, 96, 'x', 51, 33]]
nested_array.first.zip(*nested_array[1..-1]).any? {|row| row.uniq.size==1}
#=> true
We have:
b = nested_array.first.zip(*nested_array[1..-1])
#=> [[47, 75, 22, 83, 25],
# [44, 70, 69, 85, 31],
# [71, 54, 75, 97, 96],
# ["x", "x", "x", "x", "x"],
# [88, 83, 73, 57, 51]]
b.any? { |row| row.uniq.size == 1 }
#=> true
Initially, I had:
nested_array.first.zip(*nested_array[1..-1]).any? {|row|
row[1..-1].all? { |e| e == row.first } }
rather than { |row| row.uniq.size==1 }. As @Amadan points out, the former should be a bit faster.
If the question is whether a specific element of b contains all the same values:
index = 3
a = nested_array.first.zip(*nested_array[1..-1])
(index >= a.size) ? false : a[index].uniq.size==1
#=> true

How do I compare two arrays to make sure that every object in the first array is present in the second in Ruby?

I've written the following code with the intent of comparing two arrays and making sure that every element in array1 is present in array2. If an element is not present, it should return false. If it is present, it should return true.
array1.each do |x|
if (array2.include?(x))
array2.delete_at(array2.index(x))
next
else return false
I have set it up to delete the element from the second array to account for duplicate objects, but I can't figure out where to return true.
I need it to iterate through the entire array and return true when it is confirmed that each element from the first array is present in the second array. Currently this code will return an array of the values that are present in both or false if an element that is not present in array2 is input.
To test if all elements of array1 are included in array2 (ignoring how many times each occurs):
(array2 | array1) == array2.uniq
Edit: Removed incorrect code pointed out by Rustam A. Gasanov.
Here is your method corrected to work:
def all_present_and_accounted_for?(array1,array2)
array1.each do |x|
if (array2.include?(x))
array2.delete_at(array2.index(x))
else
return false
end
end
true
end
array1 = [1,2,3,3,4]
array2 = [4,6,5,3,2,1,3,8]
all_present_and_accounted_for?(array1,array2) #=> true
array2 = [4,6,5,3,2,1,5,8]
all_present_and_accounted_for?(array1,array2) #=> false
You can improve that by making a copy of array2 (so array2 is not modified) and a couple of other small changes:
def all_present_and_accounted_for?(array1, array2)
a2 = array2.dup
array1.each do |n|
return false unless i = a2.index(n)
a2.delete_at(i)
end
true
end
array1 = [1,2,3,3,4]
all_present_and_accounted_for?(array1, [4,6,5,3,2,1,3,8])
#=> true
all_present_and_accounted_for?(array1, [4,6,5,3,2,1,5,8])
#=> false
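The same check can also be sketched by counting occurrences instead of deleting elements one at a time. This is an alternative technique, not the method above, and the helper names element_counts and covers_with_counts? are made up for illustration.

```ruby
# Hypothetical helper: maps each element to the number of times it occurs.
def element_counts(array)
  array.each_with_object(Hash.new(0)) { |element, counts| counts[element] += 1 }
end

# True if array2 contains every element of array1 at least as many
# times as array1 does. Neither array is modified.
def covers_with_counts?(array1, array2)
  available = element_counts(array2)
  element_counts(array1).all? { |element, count| available[element] >= count }
end

array1 = [1, 2, 3, 3, 4]
p covers_with_counts?(array1, [4, 6, 5, 3, 2, 1, 3, 8])   # => true
p covers_with_counts?(array1, [4, 6, 5, 3, 2, 1, 5, 8])   # => false
```

Each array is scanned once, which avoids the repeated index lookups of the delete-based version on large inputs.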
Here's another way. First, a helper I've often wished for:
class Array
def %(arr)
arr.each_with_object(dup) do |e,a|
i = a.index(e)
a.delete_at(i) if i
end
end
end
For example:
arr = [38, 38, 40, 40, 40, 41, 41, 41, 41, 60]
arr % [41, 60, 40, 38, 40, 41]
#=> [38, 40, 41, 41]
If a and b are two arrays, a%b is similar to a-b, except rather than removing all elements of a that are contained in b, it removes one element of a (the one with the smallest index) for each instance of that element in b. Now wouldn't this be a handy method to have as a built-in?
With this helper it's a simple matter to determine if every element in array1 is contained in array2:
(array1 % array2).empty?
For example,
array1 = [1,2,3,3,4]
(array1 % [4,6,5,3,2,1,3,8]).empty? #=> true
(array1 % [4,6,5,3,2,1,5,8]).empty? #=> false

Ruby finding element with max value in multidimensional array using max_by

I'm trying to implement max_by to find element with highest value in multidimensional array.
code as follows
ar = [[123,345,43,35,43,1],[456,123,43,35,43,1],[675,123,43,35,43,1],[123,123,43,35,43,321]]
x = ar.max_by { |a,b| a <=> b }
p "result #{x.inspect}"
And the output is " result [456, 123, 43, 35, 43, 1]"
Can you please explain to me what's wrong with my code ?
Update 1
using max_by
ar = [ {a:1},{a:2},{a:3}]
x = ar.max_by { |e| e[:a] }
p "result #{x.inspect}"
I've left this update as a reminder for myself or whoever may bump into a similar problem.
You need to do :
ar = [[123,345,43,35,43,1],[456,123,43,35,43,1],[675,123,43,35,43,1],[123,123,43,35,43,321]]
x = ar.max { |a,b| a.max <=> b.max }
With #max_by, you are passing each element array, and then |a, b|, actually doing parallel assignment on a and b. This is not what you want I trust. What I have given above is the way to do it.
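To make the parallel assignment visible, the sketch below prints the key that the original block computes for each sub-array. The variable name keys is only for illustration.

```ruby
ar = [[123, 345, 43, 35, 43, 1],
      [456, 123, 43, 35, 43, 1],
      [675, 123, 43, 35, 43, 1],
      [123, 123, 43, 35, 43, 321]]

# With two block parameters each sub-array is destructured:
# a is its first element, b its second, and the rest is ignored.
keys = ar.map { |a, b| a <=> b }
p keys   # => [-1, 1, 1, 0]

# max_by ranks the sub-arrays by these keys, not by their contents.
# Comparing the maxima explicitly gives the intended result.
p ar.max { |a, b| a.max <=> b.max }   # => [675, 123, 43, 35, 43, 1]
```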
max_by handles the comparison for you, just return the maximum value for one element:
ar.max_by { |a| a.max }
#=> [675, 123, 43, 35, 43, 1]
Or even shorter:
ar.max_by(&:max)
#=> [675, 123, 43, 35, 43, 1]
Well, if I'm not mistaken then you'll need to find the greatest values of each subarray and the results should be something like:
[345, 456, 675, 321]
If that is what you are looking for:
x = ar.map{|x| x.max}

Iterating over a multidimensional array?

class Lod
attr_accessor :lodnr
attr_accessor :lobnr
attr_accessor :stknr
def initialize(lodnr, lobnr, stknr)
@lodnr = lodnr
@lobnr = lobnr
@stknr = stknr.chomp
end
def to_s
"%8s, %5s, %3s" % [@lodnr, @lobnr, @stknr]
end
end
I have an array called sold which contains these four arrays:
[10000, 150, 5]
[500, 10, 1]
[8000, 171, 3]
[45, 92, 4]
The four arrays are objects of a class, imported from a .txt file.
input = File.open("lodsedler.txt", "r")
input.each do |line|
l = line.split(',')
if l[0].to_i.between?(0, 99999) && l[1].to_i.between?(1, 180) && l[2].to_i.between?(1, 10)
sold << Lod.new(l[0], l[1], l[2])
else
next
end
end
I want to count the first value in each array, looking for a randomly selected number which is stored in first.
The error I get is always something like this, whatever I try:
Undefined method xxx for #<Lod:0x0000000022e2d48> (NoMethodError)
The problem is that I can't seem to access the first value in all the arrays.
You could try
a = [[10000, 150, 5], [500, 10, 1],[8000, 171, 3],[45, 92, 4]]
You can access a[0][0] 10000 or a[2][1] 171 or iterate
a.each do |row|
row.each do |column|
puts column
end
end
Edit for comment regarding using braces instead of do:
Sure it's possible, but I believe do..end is preferred:
https://stackoverflow.com/a/5587403/514463
a.each { |row|
row.each { |column|
puts column
}
}
An easy way to get the first element of each sub array is to use transpose:
special_number = 45
array = [
[10000, 150, 5],
[500, 10, 1],
[8000, 171, 3],
[45, 92, 4]
]
p array.transpose.first.count(special_number) #=> 1
Edit: Actually simpler and more direct...
p array.map(&:first).count(special_number) #=> 1
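The examples above work on plain arrays, whereas sold in the question holds Lod objects, which do not respond to [] or first; that is a likely source of the NoMethodError. A hedged sketch that counts through the lodnr accessor instead, using a trimmed copy of the Lod class and made-up input lines in place of lodsedler.txt:

```ruby
# Minimal stand-in for the Lod class from the question.
class Lod
  attr_accessor :lodnr, :lobnr, :stknr

  def initialize(lodnr, lobnr, stknr)
    @lodnr = lodnr
    @lobnr = lobnr
    @stknr = stknr.chomp
  end
end

# Built the same way as in the question: the fields arrive as strings.
raw_lines = ["10000,150,5\n", "500,10,1\n", "8000,171,3\n", "45,92,4\n"]
sold = raw_lines.map { |line| Lod.new(*line.split(',')) }

first = 45   # stands in for the randomly selected number

# Go through the accessor and convert to an integer before comparing.
p sold.count { |lod| lod.lodnr.to_i == first }   # => 1
```

Converting the fields with to_i inside initialize would remove the need for the conversion at every comparison.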
