class Lod
  attr_accessor :lodnr
  attr_accessor :lobnr
  attr_accessor :stknr
  def initialize(lodnr, lobnr, stknr)
    @lodnr = lodnr
    @lobnr = lobnr
    @stknr = stknr.chomp
  end
  def to_s
    "%8s, %5s, %3s" % [@lodnr, @lobnr, @stknr]
  end
end
I have an array called sold which contains these four arrays:
[10000, 150, 5]
[500, 10, 1]
[8000, 171, 3]
[45, 92, 4]
The four arrays are objects of a class, imported from a .txt file.
input = File.open("lodsedler.txt", "r")
input.each do |line|
l = line.split(',')
if l[0].to_i.between?(0, 99999) && l[1].to_i.between?(1, 180) && l[2].to_i.between?(1, 10)
sold << Lod.new(l[0], l[1], l[2])
else
next
end
end
I want to count the first value in each array, looking for a randomly selected number which is stored in first.
The error I get is always something like this, whatever I try:
Undefined method xxx for #<Lod:0x0000000022e2d48> (NoMethodError)
The problem is that I can't seem to access the first value in all the arrays.
You could try
a = [[10000, 150, 5], [500, 10, 1],[8000, 171, 3],[45, 92, 4]]
You can access individual values (a[0][0] is 10000, a[2][1] is 171) or iterate:
a.each do |row|
row.each do |column|
puts column
end
end
Edit for comment regarding using braces instead of do:
Sure, it's possible, but I believe do..end is preferred for multi-line blocks:
https://stackoverflow.com/a/5587403/514463
a.each { |row|
row.each { |column|
puts column
}
}
An easy way to get the first element of each sub array is to use transpose:
special_number = 45
array = [
[10000, 150, 5],
[500, 10, 1],
[8000, 171, 3],
[45, 92, 4]
]
p array.transpose.first.count(special_number) #=> 1
Edit: Actually simpler and more direct...
p array.map(&:first).count(special_number) #=> 1
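Note that in the question, sold holds Lod objects rather than plain arrays, which is why array methods raise NoMethodError on them. A minimal sketch of counting through the accessor instead, assuming the Lod class from the question (with @ instance variables) and that lodnr is still a string as read from the file; the sample rows below are made up from the question's data:

```ruby
class Lod
  attr_accessor :lodnr, :lobnr, :stknr

  def initialize(lodnr, lobnr, stknr)
    @lodnr = lodnr
    @lobnr = lobnr
    @stknr = stknr.chomp
  end
end

# Sample data (assumption): the same four rows, as strings split from a line.
sold = [
  Lod.new("10000", "150", "5\n"),
  Lod.new("500", "10", "1\n"),
  Lod.new("8000", "171", "3\n"),
  Lod.new("45", "92", "4\n")
]

first = 45 # the randomly selected number from the question

# Compare through the accessor; to_i because lodnr was stored as a string.
hits = sold.count { |lod| lod.lodnr.to_i == first }
p hits #=> 1
```

If the values were converted with to_i when the objects are created, the to_i in the block would not be needed.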
I want to parse an array in such a way that it gives the following output:
arr1 = (1..5).to_a
arr2 = (4..10).to_a
arr3 = (10..20).to_a
(1..3).map do |i|
puts arr#{i} # It will throw an error; I am looking for a way to achieve this.
end
I need to achieve the above result in Ruby.
You can do almost anything in Ruby. To get the value of a local variable in the current binding, you'd use local_variable_get:
arr1 = (1..5).to_a
arr2 = (4..10).to_a
arr3 = (10..20).to_a
(1..3).each do |i|
puts binding.local_variable_get("arr#{i}")
end
But that's cumbersome and error prone.
If you want to iterate over objects, put them in a collection. If you want the objects to have a certain label (like your variable name), use a hash:
arrays = {
arr1: (1..5).to_a,
arr2: (4..10).to_a,
arr3: (10..20).to_a
}
arrays.each do |name, values|
puts "#{name} = #{values}"
end
Output:
arr1 = [1, 2, 3, 4, 5]
arr2 = [4, 5, 6, 7, 8, 9, 10]
arr3 = [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
If the names are not relevant, use an array as shown in max pleaner's answer.
The quick and dirty way is to use eval:
(1..3).map do |i|
puts eval("arr#{i}")
end
but you should not do this in your code: it's non-idiomatic, slower, unsafe, and does not make proper use of data structures. A better way is to move the arrays into a parent array:
arrays = [
(1..5).to_a,
(4..10).to_a,
(10..20).to_a
]
arrays.each { |arr| puts arr }
So far, I have this code that reads a file and sorts it using Ruby. But this doesn't sort the numbers correctly and I think it will be inefficient, given that the file can be as big as 200GB and contains a number on each line. Can you suggest what else to do?
File.open("topN.txt", "w") do |file|
File.readlines("N.txt").sort.reverse.each do |line|
file.write(line.chomp<<"\n")
end
end
After everyone's help here, this is how my code looks so far:
begin
puts "What is the file name?"
file = gets.chomp
puts "What is the N number?"
myN = Integer(gets.chomp)
rescue ArgumentError
puts "That's not a number, try again"
retry
end
topN = File.open(file).each_line.max(myN){|a,b| a.to_i <=> b.to_i}
puts topN
Sorting 200GB of data in memory will not be very performant. I would write a little helper class which only remembers the N biggest elements added so far.
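class SortedList
  attr_reader :list
  def initialize(size)
    @list = []
    @size = size
  end
  def add(element)
    # Skip values that cannot make it into the top @size elements.
    return if @min && @min > element
    list.push(element)
    reorganize_list
  end
  private
  def reorganize_list
    @list = list.sort.reverse.first(@size)
    # Only start filtering once the list is full; otherwise small
    # early values would be dropped while there is still room.
    @min = list.last if list.size == @size
  end
end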
Initialize an instance with the required N and then just add the values parsed from each line to this instance.
sorted_list = SortedList.new(n)
# File.foreach reads one line at a time; File.readlines would load the whole file.
File.foreach("N.txt") do |line|
  sorted_list.add(line.to_i)
end
puts sorted_list.list
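The helper above re-sorts its list on every insertion. As a variation on the same idea (not part of the original answer), the sketch below keeps the list in descending order and uses Array#bsearch_index to find the insertion point instead. The class name TopN and its methods are hypothetical, and n is assumed to be a positive integer:

```ruby
# Hypothetical helper: remembers the n largest values added so far,
# kept in descending order.
class TopN
  attr_reader :values

  def initialize(n)
    @n = n
    @values = []
  end

  def add(x)
    return if @n <= 0
    # Once full, anything not larger than the current minimum is ignored.
    return if @values.size == @n && x <= @values.last

    # Find-minimum mode: index of the first element smaller than x.
    # nil means no such element, so x goes at the end.
    idx = @values.bsearch_index { |v| v < x } || @values.size
    @values.insert(idx, x)
    @values.pop if @values.size > @n
  end
end

top = TopN.new(3)
[117, 106, 143, 147, 63, 118, 146, 93].each { |x| top.add(x) }
p top.values #=> [147, 146, 143]
```

For a real file, the values would come from something like File.foreach(path) { |line| top.add(line.to_i) } so that only one line is held at a time.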
Suppose
str = File.read(in_filename)
#=> "117\n106\n143\n147\n63\n118\n146\n93\n"
You could convert that string to an enumerator that enumerates lines, use Enumerable#sort_by to sort those lines in descending order, join the resulting lines (that end in newlines) to form a string that can be written to file:
str.each_line.sort_by { |line| -line.to_i }.join
#=> "147\n146\n143\n118\n117\n106\n93\n63\n"
Another way is to convert the string to an array of integers, sort the array using Array#sort, reverse the resulting array and then join the elements back into a string that can be written to file:
str.each_line.map(&:to_i).sort.reverse.join("\n") << "\n"
#=> "147\n146\n143\n118\n117\n106\n93\n63\n"
Let's do a quick benchmark.
require 'benchmark/ips'
(str = 1_000_000.times.map { rand(10_000) }.join("\n") << "\n").size
Benchmark.ips do |x|
x.report("sort_by") { str.each_line.sort_by { |line| -line.to_i }.join }
x.report("sort") { str.each_line.map(&:to_i).sort.reverse.join("\n") << "\n" }
x.compare!
end
Comparison:
sort: 0.4 i/s
sort_by: 0.3 i/s - 1.30x slower
The mighty sort wins again!
You left this comment on your question:
"Write a program, topN, that given a number N and an arbitrarily large file that contains individual numbers on each line (e.g. 200Gb file), will output the largest N numbers, highest first."
That problem seems somewhat different from the one described in the question, and is also more interesting. I have addressed that problem in this answer.
Code
def topN(fname, n, m=n)
raise ArgumentError, "m cannot be smaller than n" if m < n
f = File.open(fname)
best = Array.new(n)
n.times do |i|
break best.replace(best[0,i]) if f.eof?
best[i] = f.readline.to_i
end
best.sort!.reverse!
return best if f.eof?
new_best = Array.new(n)
cand = Array.new(m)
until f.eof?
rd(f, cand)
merge_arrays(best, new_best, cand)
end
f.close
best
end
def rd(f, cand)
cand.size.times { |i| cand[i] = (f.eof? ? -Float::INFINITY : f.readline.to_i) }
cand.sort!.reverse!
end
def merge_arrays(best, new_best, cand)
cand_largest = cand.first
best_idx = best.bsearch_index { |n| cand_largest > n }
return if best_idx.nil?
bi = best_idx
cand_idx = 0
nbr_to_compare = best.size-best_idx
nbr_to_compare.times do |i|
if cand[cand_idx] > best[bi]
new_best[i] = cand[cand_idx]
cand_idx += 1
else
new_best[i] = best[bi]
bi += 1
end
end
best[best_idx..-1] = new_best[0, nbr_to_compare]
end
Examples
Let's create a file with 10 million representations of integers, one per line.
require 'time'
FName = 'test'
(s = 10_000_000.times.with_object('') { |_,s| s << rand(100_000_000).to_s << "\n" }).size
s[0,27]
#=> "86752031\n84524374\n29347072\n"
File.write(FName, s)
#=> 88_888_701
Next, create a simple method to invoke topN with different arguments and to also show execution times.
def try_one(n, m=n)
t = Time.now
a = topN(FName, n, m)
puts "#{(Time.new-t).round(2)} seconds"
puts "top 5: #{a.first(5)}"
puts "bot 5: #{a[n-5..n-1]}"
end
In testing I found that setting m less than n was never desirable in terms of computational time. Requiring that m >= n allowed a small simplification to the code and a small efficiency improvement. I therefore made m >= n a requirement.
try_one 100, 100
9.44 seconds
top 5: [99999993, 99999993, 99999991, 99999971, 99999964]
bot 5: [99999136, 99999127, 99999125, 99999109, 99999078]
try_one 100, 1000
9.53 seconds
top 5: [99999993, 99999993, 99999991, 99999971, 99999964]
bot 5: [99999136, 99999127, 99999125, 99999109, 99999078]
try_one 100, 10_000
9.95 seconds
top 5: [99999993, 99999993, 99999991, 99999971, 99999964]
bot 5: [99999136, 99999127, 99999125, 99999109, 99999078]
Here I've tested producing the 100 largest values with different values of m, the number of lines read from the file at a time. As seen, the method is insensitive to m. As expected, the largest 5 values and the smallest 5 values (of the 100 returned) are the same in all cases.
try_one 1_000
9.31 seconds
top 5: [99999993, 99999993, 99999991, 99999971, 99999964]
bot 5: [99990425, 99990423, 99990415, 99990406, 99990399]
try_one 1000, 10_000
9.24 seconds
The time required to return the 1,000 largest values is, in fact, slightly less than the times for returning the largest 100. I expect that's not reproducible. The top 5 are of course the same as when returning the largest 100 values. I therefore will not display that line below. The smallest 5 values of the 1000 returned are of course smaller than when the largest 100 values are returned.
try_one 10_000
12.15 seconds
bot 5: [99898951, 99898950, 99898946, 99898932, 99898922]
try_one 100_000
13.2 seconds
bot 5: [98995266, 98995259, 98995258, 98995254, 98995252]
try_one 1_000_000
14.34 seconds
bot 5: [89999305, 89999302, 89999301, 89999301, 89999287]
Explanation
Notice that I reuse three arrays: best, cand and new_best. Specifically, I replace the contents of these arrays many times rather than continually creating new (potentially very large) arrays and leaving orphaned arrays to be garbage-collected. A little testing showed this approach improved performance.
We can create a small example and then step through the calculations.
fname = 'temp'
File.write(fname, 20.times.map { rand(100) }.join("\n") << "\n")
#=> 58
This file contains representations of integers in the following array.
arr = File.read(fname).lines.map(&:to_i)
#=> [9, 66, 80, 64, 67, 67, 89, 10, 62, 94, 41, 16, 0, 22, 68, 72, 41, 64, 87, 24]
Sorted, this is:
arr.sort_by! { |n| -n }
#=> [94, 89, 87, 80, 72, 68, 67, 67, 66, 64, 64, 62, 41, 41, 24, 22, 16, 10, 9, 0]
Let's assume we want the 5 largest values.
arr[0,5]
#=> [94, 89, 87, 80, 72]
First, set the two parameters: n, the number of largest values to return, and m, the number of lines to read from the file at a time.
n = 5
m = 5
The calculations follow.
m < n
#=> false, so do not raise ArgumentError
f = File.open(fname)
#=> #<File:temp>
best = Array.new(n)
#=> [nil, nil, nil, nil, nil]
n.times { |i| f.eof? ? (return best.replace(best[0,i])) : best[i] = f.readline.to_i }
best
#=> [9, 66, 80, 64, 67]
best.sort!.reverse!
#=> [80, 67, 66, 64, 9]
f.eof?
#=> false, so do not return
new_best = Array.new(n)
#=> [nil, nil, nil, nil, nil]
cand = Array.new(m)
#=> [nil, nil, nil, nil, nil]
puts "best=#{best}".rjust(52)
until f.eof?
rd(f, cand)
merge_arrays(best, new_best, cand)
puts "cand=#{cand}, best=#{best}"
end
f.close
best
#=> [94, 89, 87, 80, 72]
The following is displayed.
best=[80, 67, 66, 64, 9]
cand=[94, 89, 67, 62, 10], best=[94, 89, 80, 67, 67]
cand=[68, 41, 22, 16, 0], best=[94, 89, 80, 68, 67]
cand=[87, 72, 64, 41, 24], best=[94, 89, 87, 80, 72]
Enumerable#max takes an argument which specifies how many elements will be returned, and a block which specifies how elements are compared:
N = 5
p File.open("test.txt").each_line.max(N){|a,b| a.to_i <=> b.to_i}
This does not read the entire file into memory; the file is read line by line.
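A closely related form is Enumerable#max_by with a count, which takes a block that returns the sort key rather than a comparison. A small self-contained sketch, assuming a temporary file with one number per line (the sample numbers are borrowed from another answer in this thread):

```ruby
require 'tempfile'

top = nil
Tempfile.create('numbers') do |f|
  f.write("117\n106\n143\n147\n63\n118\n146\n93\n")
  f.flush

  # File.foreach without a block returns an enumerator over the lines.
  # max_by(3) returns the three lines with the largest keys, largest first.
  top = File.foreach(f.path).max_by(3) { |line| line.to_i }.map(&:to_i)
end

p top #=> [147, 146, 143]
```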
I have this input .txt file loaded into array so it comes out like this:
class Note
  def initialize(a, b)
    @a = a
    @b = b
  end
end
input = File.open("DATA", "r")
input.each do |line|
l = line.split(',')
arr << Note.new(l[0], l[1])
end
I want to count, and output to another .txt file, how many times each inner array is equal to another inner array, e.g. [500, 2, x], where x is the number of times the [500, 2] array occurs in arr.
input.txt example
10000, 150
00500, 10
08000, 171
00045, 92
00045, 93
00045, 92
00045, 93
expected output
10000, 150, 1
00500, 10, 1
08000, 171, 1
00045, 92, 2
00045, 93, 2
Thanks in advance.
What about this:
arr.uniq.map {|el| [*el, arr.count(el)] }
For each unique element of the array, count how many times it occurs and make a new array with the element and its count.
Example:
arr = [[1,2], [5,5], [5,5], [8,7], [1,2], [5,5]]
arr.uniq.map {|el| [*el, arr.count(el)] }
#=> [[1, 2, 2], [5, 5, 3], [8, 7, 1]]
Is this what you are looking for?
arr = [[5000, 52],[99422, 1],[5000, 52],[325, 63]]
arr.group_by{|a| a }.map{|k,v| [*k,v.size]}
# => [[5000, 52, 2], [99422, 1, 1], [325, 63, 1]]
Edit
ar = File.readlines("/home/kirti/Ruby/input.txt").map{|s| s.scan(/\w+/) }
ar.group_by{|a| a }.map{|k,v| [*k,v.size]}
# => [["10000", "150", 1],
# ["00500", "10", 1],
# ["08000", "171", 1],
# ["00045", "92", 2],
# ["00045", "93", 2]]
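On Ruby 2.7 or later, Enumerable#tally does the grouping and counting in one step. The sketch below is an alternative to the answers above, not taken from them; it assumes the lines have already been read (here they are hard-coded from the question's input.txt example) and formats the rows the way the expected output shows:

```ruby
# Sample lines (assumption): the question's input.txt example.
rows = [
  "10000, 150",
  "00500, 10",
  "08000, 171",
  "00045, 92",
  "00045, 93",
  "00045, 92",
  "00045, 93"
]

# Split each line into its two fields, then count identical pairs.
# tally returns a hash of pair => count, in order of first appearance.
counts = rows.map { |line| line.split(',').map(&:strip) }.tally

lines = counts.map { |pair, count| [*pair, count].join(', ') }
puts lines
```

Writing the result to a file could then be done with File.write and lines.join("\n"); the output file name is up to you.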
I want to build a custom method Array#drop_every(n) (I know it's monkey patching; I am doing this for homework), which returns a new array omitting every nth element:
[4, 8, 15, 16, 23, 42].drop_every(2) # [4, 15, 23]
I want to implement it with Array#delete_if, but by referring to the index and not to the element itself, (similar to each_index) something like this:
def drop_every(step)
self.delete_if { |index| index % step == 0 }
end
How do I do this? I don't insist on using delete_if, I also looked at drop_while and reject, other suggestions are welcome.
You can use the with_index method, which returns an enumerator, filter your collection and then get rid of the indexes.
class Array
def drop_every(step)
# Reject every step-th element (1-based), then strip the indexes again.
self.each.with_index.reject { |_, index| (index + 1) % step == 0 }.map(&:first)
end
end
[4, 8, 15, 16, 23, 42].drop_every(2) # => [4, 15, 23]
def drop_every(step)
reject.with_index { |x,i| (i+1) % step == 0 }
end
[4, 8, 15, 16, 23, 42].reject.with_index{|x,i| (i+1) % 2 == 0}
# => [4, 15, 23]
[4, 8, 15, 16, 23, 42].reject.with_index{|x,i| (i+1) % 3 == 0}
# => [4, 8, 16, 23]
You could use the values_at method to selectively filter out indices which you want.
class Array
def drop_every(step)
self.values_at(*(0...self.size).find_all{ |x| (x+1) % step != 0 })
end
end
The answer was accepted while I was typing it. I will post it anyways.
def drop_every step
delete_if.with_index(1){|_, i| i.%(step).zero?}
end
class Array
def drop_every(step)
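# first(step - 1) keeps a short final slice whole; slice[0..-2] would drop its last element.
self.each_slice(step).flat_map{|slice| slice.first(step - 1)}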
end
end
p [4, 8, 15, 16, 23, 42].drop_every(2) #=> [4, 15, 23]
I'd extend the Enumerable mixin instead:
module Enumerable
def drop_every(step)
return to_enum(:drop_every, step) unless block_given?
each.with_index(1) do |o, i|
yield o unless i % step == 0
end
end
end
(1..10).drop_every(3) { |a| p a }
# outputs below
1
2
4
5
7
8
10
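Because of the to_enum line, the same method can also be called without a block, in which case it returns an enumerator that can be chained or converted to an array. A short sketch repeating the definition above so that it runs on its own:

```ruby
module Enumerable
  def drop_every(step)
    # Without a block, hand back an enumerator for this same call.
    return to_enum(:drop_every, step) unless block_given?

    each.with_index(1) do |o, i|
      yield o unless i % step == 0
    end
  end
end

kept = (1..10).drop_every(3).to_a
p kept #=> [1, 2, 4, 5, 7, 8, 10]

from_array = [4, 8, 15, 16, 23, 42].drop_every(2).to_a
p from_array #=> [4, 15, 23]
```

Since Array includes Enumerable, the array from the question picks up the method as well.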
I need to locate all integer elements in an array, whose sum is equal to one of the integer elements within the array.
For example, let's assume I have an array like this as input:
[1, 2, 4, 10, 90, 302, 312, 500]
The output should contain the elements that were added together, followed by the element that is their sum. For example: [10, 302, 312], because 10 + 302 = 312.
This is what I tried in Ruby:
numbers = [1, 2, 4, 10, 90, 302, 312, 500]
numbers.each_with_index do |number, index|
target = []
return [] if numbers.size < 3
target << number
puts "target in numbers.each: #{target.inspect}"
0.upto(numbers.size).each do |i|
puts "target in (index+1).upto: #{target.inspect}"
target << numbers[i] unless index == i
next if target.size < 2
break if target.inject(&:+) > numbers.max
puts "== array starts =="
puts [target, target.inject(&:+)].flatten.inspect if numbers.include? target.inject(&:+)
puts "== array ends =="
end
end
But it's not producing the expected output. I'll update if I have any luck with this. Until then, can anyone point out what I am doing wrong here? Thanks.
An algorithm would be fine for me as well.
An implementation:
arr = [1, 2, 4, 10, 90, 302, 312, 500]
(2..arr.count).each do |len|
arr.combination(len).each do |comb|
sum = comb.inject(:+)
if arr.include? sum
puts (comb << sum).inspect
end
end
end
zwippie's answer with small changes..
arr = [1, 2, 4, 10, 90, 302, 312, 500]
result = []
(2..arr.count-1).to_a.each do |len|
arr.combination(len).to_a.each do |comb|
sum = comb.inject(:+)
if arr.include? sum
result << (comb << sum)
end
end
end
result
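Both versions call arr.include? for every combination, which scans the array each time. A possible variation (my own sketch, not from either answer) puts the elements in a Set for the membership test and builds a new array instead of mutating comb. It assumes all elements are positive integers, so a sum of two or more elements can never equal one of its own addends:

```ruby
require 'set'

arr = [1, 2, 4, 10, 90, 302, 312, 500]
members = arr.to_set # constant-time lookup instead of Array#include?
result = []

# The sum must be another element of arr, so at most arr.size - 1 addends.
(2..arr.size - 1).each do |len|
  arr.combination(len) do |comb|
    sum = comb.sum
    result << comb + [sum] if members.include?(sum)
  end
end

result.each { |r| p r }
```

The number of combinations still grows exponentially with the array size, so this only helps with the lookup, not with the search itself.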