Add to an array from within a loop using Ruby - ruby

I am having a few problems adding to an array from within a loop.
It only adds the last results to the array and loses the previous 9 sets.
I think I have to create a new array inside of the loop and then add the new one to the previous. I'm just not sure how I go about doing that.
array = Array.new
10.times do
array2 = Array.new
pagenum = 0
results = Nokogiri::HTML(open("#{url}#{pagenum}"))
results.css("div").each do |div|
array.push div.inner_text
end
pagenum + 10
array.concat(array2)
end

You are fetching the same page (0) 10 times.
10.times do
...
pagenum = 0 # <--------
results = Nokogiri::HTML(open("#{url}#{pagenum}"))
...
end
Try the following:
array = Array.new
10.times do |pagenum|
results = Nokogiri::HTML(open("#{url}#{pagenum}"))
array += results.css("div").map(&:inner_text)
end
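For reference, here is a network-free sketch of the same accumulation pattern. `fetch_divs` is a hypothetical stand-in for the Nokogiri call, and the sketch assumes the page offset advances in steps of 10, as `pagenum + 10` in the question suggests. Note that `pagenum + 10` on its own computes a value and discards it; `pagenum += 10` is what would actually update the variable.

```ruby
# Hypothetical stand-in for Nokogiri::HTML(open(...)).css("div").map(&:inner_text);
# it returns fake text for two divs so the sketch runs without a network.
def fetch_divs(pagenum)
  ["div A of page #{pagenum}", "div B of page #{pagenum}"]
end

array = []
# Assumption: offsets are 0, 10, 20, ..., 90 (ten pages in steps of 10).
0.step(90, 10) do |pagenum|
  array.concat(fetch_divs(pagenum)) # concat appends in place
end

puts array.length
puts array.first
puts array.last
```

If the site numbers its pages 0, 1, 2, ... instead, `10.times do |pagenum|` as in the answer above is enough.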

Related

Merging Ranges using Sets - Error - Stack level too deep (SystemStackError)

I have a number of ranges that I want to merge together if they overlap. The way I’m currently doing this is by using Sets.
This is working. However, when I attempt the same code with larger ranges as follows, I get a `stack level too deep (SystemStackError)`.
require 'set'
ranges = [Range.new(73, 856), Range.new(82, 1145), Range.new(116, 2914), Range.new(3203, 3241)]
set = Set.new
ranges.each { |r| set << r.to_set }
set.flatten!
sets_subsets = set.divide { |i, j| (i - j).abs == 1 } # this line causes the error
puts sets_subsets
The line that is failing is taken directly from the Ruby Set Documentation.
I would appreciate it if anyone could suggest a fix or an alternative that works for the above example
EDIT
I have put the full code I’m using here:
Basically it is used to add html tags to an amino acid sequence according to some features.
require 'set'
def calculate_formatting_classes(hsps, signalp)
merged_hsps = merge_ranges(hsps)
sp = format_signalp(merged_hsps, signalp)
hsp_class = (merged_hsps - sp[1]) - sp[0]
rank_format_positions(sp, hsp_class)
end
def merge_ranges(ranges)
set = Set.new
ranges.each { |r| set << r.to_set }
set.flatten
end
def format_signalp(merged_hsps, sp)
sp_class = sp - merged_hsps
sp_hsp_class = sp & merged_hsps # overlap regions between sp & merged_hsp
[sp_class, sp_hsp_class]
end
def rank_format_positions(sp, hsp_class)
results = []
results += sets_to_hash(sp[0], 'sp')
results += sets_to_hash(sp[1], 'sphsp')
results += sets_to_hash(hsp_class, 'hsp')
results.sort_by { |s| s[:pos] }
end
def sets_to_hash(set = nil, cl)
return nil if set.nil?
hashes = []
merged_set = set.divide { |i, j| (i - j).abs == 1 }
merged_set.each do |s|
hashes << { pos: s.min.to_i - 1, insert: "<span class=#{cl}>" }
hashes << { pos: s.max.to_i - 0.1, insert: '</span>' } # for ordering
end
hashes
end
working_hsp = [Range.new(7, 136), Range.new(143, 178)]
not_working_hsp = [Range.new(73, 856), Range.new(82, 1145),
Range.new(116, 2914), Range.new(3203, 3241)]
sp = Range.new(1, 20).to_set
# working
results = calculate_formatting_classes(working_hsp, sp)
# Not Working
# results = calculate_formatting_classes(not_working_hsp, sp)
puts results
Here is one way to do this:
ranges = [Range.new(73, 856), Range.new(82, 1145),
Range.new(116, 2914), Range.new(3203, 3241)]
ranges.size.times do
ranges = ranges.sort_by(&:begin)
t = ranges.each_cons(2).to_a
t.each do |r1, r2|
if (r2.cover? r1.begin) || (r2.cover? r1.end) ||
(r1.cover? r2.begin) || (r1.cover? r2.end)
ranges << Range.new([r1.begin, r2.begin].min, [r1.end, r2.end].max)
ranges.delete(r1)
ranges.delete(r2)
t.delete [r1,r2]
end
end
end
p ranges
#=> [73..2914, 3203..3241]
The other answers aren't bad, but I prefer a simple recursive approach:
def merge_ranges(*ranges)
range, *rest = ranges
return if range.nil?
# Find the index of the first range in `rest` that overlaps this one
other_idx = rest.find_index do |other|
range.cover?(other.begin) || other.cover?(range.begin)
end
if other_idx
# An overlapping range was found; remove it from `rest` and merge
# it with this one
other = rest.slice!(other_idx)
merged = ([range.begin, other.begin].min)..([range.end, other.end].max)
# Try again with the merged range and the remaining `rest`
merge_ranges(merged, *rest)
else
# No overlapping range was found; move on
[ range, *merge_ranges(*rest) ]
end
end
Note: This code assumes each range is ascending (e.g. 10..5 will break it).
Usage:
ranges = [ 73..856, 82..1145, 116..2914, 3203..3241 ]
p merge_ranges(*ranges)
# => [73..2914, 3203..3241]
ranges = [ 0..10, 5..20, 30..50, 45..80, 50..90, 100..101, 101..200 ]
p merge_ranges(*ranges)
# => [0..20, 30..90, 100..200]
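An iterative variant is also possible. The sketch below sorts the ranges by their start and then sweeps through them once, extending the previous merged range whenever the next one overlaps it. `merge_sorted` is a name made up for this example, and like the recursive version it assumes ascending, end-inclusive ranges.

```ruby
# Merge overlapping ranges in one pass after sorting by start.
# Assumes ascending, end-inclusive ranges with comparable endpoints.
def merge_sorted(ranges)
  ranges.sort_by(&:begin).each_with_object([]) do |range, merged|
    last = merged.last
    if last && range.begin <= last.end
      # Overlaps the previous range: extend it instead of adding a new one.
      merged[-1] = last.begin..[last.end, range.end].max
    else
      merged << range
    end
  end
end

p merge_sorted([73..856, 82..1145, 116..2914, 3203..3241])
p merge_sorted([0..10, 5..20, 30..50, 45..80, 50..90, 100..101, 101..200])
```

Because it never recurses, this version does not depend on the stack depth, whatever the number of ranges.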
I believe your resulting set has too many items (2881) to be used with divide. As far as I can tell, when the block takes two arguments, divide calls it for every pair of elements (2881^2, about 8.3 million calls) and then groups connected elements with a recursive traversal (it uses tsort internally). A run of almost 3,000 consecutive integers makes that recursion deep enough to raise the stack level too deep error.
Without using sets, you can use this code to merge the ranges:
module RangeMerger
def merge(range_b)
if cover?(range_b.first) && cover?(range_b.last)
self
elsif cover?(range_b.first)
self.class.new(first, range_b.last)
elsif cover?(range_b.last)
self.class.new(range_b.first, last)
else
nil # Unmergable
end
end
end
module ArrayRangePusher
def <<(item)
if item.kind_of?(Range)
item.extend RangeMerger
each_with_index do |own_item, idx|
own_item.extend RangeMerger
if new_range = own_item.merge(item)
self[idx] = new_range
return self
end
end
end
super
end
end
ranges = [Range.new(73, 856), Range.new(82, 1145), Range.new(116, 2914), Range.new(3203, 3241)]
new_ranges = Array.new
new_ranges.extend ArrayRangePusher
ranges.each do |range|
new_ranges << range
end
puts ranges.inspect
puts new_ranges.inspect
This will output:
[73..856, 82..1145, 116..2914, 3203..3241]
[73..2914, 3203..3241]
which I believe is the intended output for your original problem. It's a bit ugly, but I'm a bit rusty at the moment.
Edit: I don't think this has anything to do with your original problem before the edits which was about merging ranges.
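The question's sets_to_hash relies on divide to group consecutive positions. If a flattened set of integers is still needed, the same grouping can be sketched without divide by sorting the values and splitting wherever the gap is larger than 1. `consecutive_runs` is a hypothetical helper name; `Enumerable#slice_when` is available from Ruby 2.2.

```ruby
require 'set'

# Split a set of integers into runs of consecutive values,
# returned as ranges. No recursion, so large sets are fine.
def consecutive_runs(set)
  set.sort.slice_when { |a, b| b != a + 1 }.map { |run| run.first..run.last }
end

set = Set.new
[73..856, 82..1145, 116..2914, 3203..3241].each { |r| set.merge(r) }

p set.size
p consecutive_runs(set)
```

Each returned range gives the min and max that sets_to_hash needs for its opening and closing tags.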

Values in the while loop do not modify outside values

My code is long, so I have reduced the problem to as few lines as possible. I have a method that creates a 2D array of 0s and 1s:
array1 = newValue(2) # the argument is how many 1s the array has
array2 = newValue(3)
and this loop
(0..9).each do|i|
(0..9).each do|j|
while((array1[i][j] == array2[i][j]) && (array2[i][j] == 1)) do
array1 = newvalue(2)
array2 = newvalue(3)
end
end
end
I'm using the while loop so that I won't have a 1 in the same position in both arrays. But what is inside the while loop doesn't modify the values of the arrays. I also tried using map!/collect!, but I think I did something wrong because nothing happened. I hope you can understand what I was trying to do.
Edit:
def newValue(value)
value = value.to_i
array = Array.new(10) { Array.new(10 , 0) }
(a lot of conditions on how to position the items in the array)
return array
end
Here's my take... hopefully it'll help out. It seems that what you noticed was true: the arrays are not getting reset. Probably because inside the each blocks, the scope is lost. This is probably because they are arrays. I took a slightly different approach: put everything in a class so you have instance variables that you control, you know where they are, and they are always the same.
I pulled out the compare_arrays function, which returns the coordinates of the match if there is one and nil otherwise. Your while loop is then simplified in the reprocess method: if you found a match, reprocess until you don't have a match any more. I used a dummy newValue method that just returns another 2D array (as you suggested yours does). This seems to do the trick from what I can tell. Give it a whirl and see what you think. You can access the two arrays after all the processing with processor.array1, as I did at the bottom.
# generate a random 2d array with 0's and val's
def generateRandomArray(val=1)
array = []
(0..9).each do |i|
(0..9).each do |j|
array[i] ||= []
array[i][j] = (rand > 0.1) ? 0 : val
end
end
array
end
array1 = generateRandomArray
array2 = generateRandomArray
def newValue(val)
generateRandomArray(val)
end
class Processor
attr_reader :array1, :array2
def initialize(array1, array2)
@array1 = array1
@array2 = array2
end
def compare_arrays
found = false
for ii in 0..9
break unless for jj in 0..9
if ((@array2[ii][jj] == 1) && (@array1[ii][jj] == 1))
found = true
break
end
end
end
[ii,jj] if found
end
def reprocess
while compare_arrays
puts "Reprocessing"
@array1 = newValue(2)
@array2 = newValue(3)
reprocess
end
end
end
processor = Processor.new(array1, array2)
processor.reprocess
puts processor.array1.inspect
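A class is not strictly required here. Ruby blocks are closures, so assigning to array1 inside each does rebind the outer variable. One thing that can go wrong in the question's loop is that a regenerated pair is only checked from position (i, j) onward, so a collision at an earlier position can go unnoticed. A minimal sketch that re-checks the whole grid after every regeneration follows; `random_grid` and `overlap?` are hypothetical stand-ins, since the asker's placement rules in newValue are not shown.

```ruby
# Hypothetical stand-in for newValue: a 10x10 grid of 0s with `ones` cells set to 1.
def random_grid(ones, rng)
  grid = Array.new(10) { Array.new(10, 0) }
  (0...100).to_a.sample(ones, random: rng).each do |pos|
    grid[pos / 10][pos % 10] = 1
  end
  grid
end

# True if any cell holds a 1 in both grids.
def overlap?(a, b)
  a.flatten.zip(b.flatten).any? { |x, y| x == 1 && y == 1 }
end

rng = Random.new(42) # fixed seed only to make the sketch repeatable
array1 = random_grid(2, rng)
array2 = random_grid(3, rng)

# Re-check the whole grid after every regeneration.
while overlap?(array1, array2)
  array1 = random_grid(2, rng)
  array2 = random_grid(3, rng)
end

puts overlap?(array1, array2)
```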

Problems with Ruby Arrays

I am currently writing a small Ruby class that is intended to store the number of times a randomly generated number is seen within an array, along with the value itself.
I am trying to do the following to add the value to the array with a default times-seen count of 1. The script should check whether the value is already in the array and, if so, increment the number of times it has been seen by 1.
However, I am receiving duplicate values, which shouldn't happen, as the code should only allow a value to be stored once and, if the value is already in the memory array, increment its count by one.
The code is attached below. If anyone can suggest what I am doing wrong, it would be awesome.
Cheers
Martin
@memory = Array.new
def store(value)
foundflag = false
@memory.each do |array|
if value == array[0]
#Incrementing value timesseen
array[1] = array[1]+1
#Value found, changing found flag
foundflag = true
#Loop break
break
end
end
if foundflag != true then
@memory.push([value,1])
end
end
store(5)
Full script (untidy!)
class STAT
def initialize()
#STAT memory settings
@memory = Array.new
#Prediction settings
@predictions = 0
@sucessfulpredictions = 0
end
#STAT main functions
def store(value)
foundflag = false
@memory.each do |array|
if value == array[0]
#Incrementing value timesseen
array[1] = array[1]+1
#Value found, changing found flag
foundflag = true
#Loop break
break
end
end
if foundflag != true then
@memory.push([value,1])
end
end
def predict(nosimulations)
#Generate random value less than the total memory size
randomvalue = rand(total)+1
count = 0
@memory.each do |array|
value = array[0]
timesseen = array[1]
if randomvalue <= count + timesseen
puts "Predicted value #{value}"
end
count = count + array[1]
end
end
def simulate(nosimulations)
count = 1
while count <= nosimulations
#Generating a random number
randomnumber = rand(100)+1
#Storing the random number
store(randomnumber)
#Print simulation details#
puts "Running simulation #{count} of #{nosimulations}"
puts "Generated random number: #{randomnumber}"
#Incrementing count
count = count + 1
end
end
#STAT technical functions
def inspect()
#Print memory information message
puts "Memory information:"
@memory.each do |array|
value = array[0]
timesseen = array[1]
puts "Value #{value} - times seen: #{timesseen}/#{total}"
end
end
def total()
total = 0
@memory.each do |array|
total = total + array[1]
end
return total
end
#STAT load/save functions
def load(filename)
#Default engine to be loaded
enginename = "#{filename}.stat"
#Print checking for saved engine message
puts "Checking for saved memory file..."
#Checking for saved engine
if File.exist?(enginename)
#Print loading engine message
puts "Loading memory..."
@memory = Marshal.load File.read(enginename)
#Print engine loaded message
puts "Engine loaded"
else
#Print memory not found message
puts "Cannot load memory, no memory file found"
end
end
def save(filename)
#Default name for engine to be saved
enginename = "#{filename}.stat"
#Print saving engine message
puts "Saving memory..."
#Saving engine to specified file
serialized_array = Marshal.dump(@memory)
savefile = File.new(enginename,"w")
savefile.write(serialized_array)
savefile.close
#Print engine saved message
puts "Memory saved"
end
end
#STAT class test software
stat = STAT.new
filename = "test"
#Load
stat.load(filename)
#Simulate
stat.simulate(1000000)
#Testing
#stat.store(5)
#stat.store(5)
#Inspect
stat.inspect
#Predict
#stat.predict(1000000)
#Save
stat.save(filename)
Use:
occurrences = Hash[ my_array.group_by{ |o| o }.map{ |o,a| [o,a.length] } ]
to get a hash mapping each value to the number of times it occurs. To calculate it yourself:
occurrences = Hash.new(0)
my_array.each{ |o| occurrences[o]+=1 }
See the docs if you don't know what these methods do.
You are working too hard. You want to store the amount of times a randomly generated number is seen within an array, along with the value that is seen itself.
How can Ruby help you with that? Since you are working with an array, glance through the available methods and yes, there it is: count.
ar = [1,2,1,3,2,4,3,2,6,7,8,7]
rn = rand(10)
ar.count(rn)
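To tie the counting idea back to the class in the question, here is a minimal sketch of store built on `Hash.new(0)` instead of an array of pairs. The class name `Stat` and its methods are simplified stand-ins for this example, not the asker's full STAT class. `Array#tally` (Ruby 2.7 and later) gives the same counts for values that are already in an array.

```ruby
# Sketch of a counter keyed by value; Hash.new(0) makes missing keys read as 0.
class Stat
  attr_reader :memory

  def initialize
    @memory = Hash.new(0)
  end

  # Record one sighting of value.
  def store(value)
    @memory[value] += 1
  end

  # Total number of sightings across all values.
  def total
    @memory.values.sum
  end
end

stat = Stat.new
[5, 5, 7].each { |v| stat.store(v) }
p stat.memory
p stat.total

# Array#tally builds the same value => count mapping in one call.
p [5, 5, 7].tally == stat.memory
```

With a hash there is at most one entry per value, so duplicates cannot occur by construction.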

Ruby Script to Create an Array of Arrays

I have written a simple screen scraping script and at the end of the script I am attempting to create an array of arrays in preparation for an activerecord insert. The structure I am trying to achieve is as follows:
Array b holds a series of 10 element arrays
b = [[0,1,2,3,4,5,6,7,8,9],[0,1,2,3,4,5,6,7,8,9],[0,1,2,3,4,5,6,7,8,9]]
Currently when I try to print out Array b the array is empty. I'm still fairly new to ruby and programming for that matter and would appreciate any feedback on how to get values in array b and to improve the overall script. Script follows:
require "rubygems"
require "celerity"
t = 0
r = 0
c = 0
a = Array.new(10)
b = Array.new
#initialize Browser
browser = Celerity::IE.new
#goto Login Page
browser.goto('http://www1.drf.com/drfLogin.do?type=membership')
#input UserId and Password
browser.text_field(:name, 'p_full_name').value = 'username'
browser.text_field(:name, 'p_password').value = 'password'
browser.button(:index, 2).click
#goto DRF Frontpage
browser.goto('http://www.drf.com/frontpage')
#goto DRF Entries
browser.goto('http://www1.drf.com/static/indexMenus/eindex.html')
#click the link to access the entries
browser.link(:text, '09').click
browser.tables.each do |table|
t = t + 1
browser.table(:index, t).rows.each do |row|
r = r + 1
browser.table(:index, t).row(:index, r).cells.each do |cell|
a << cell.text
end
b << a
a.clear
end
r = 0
end
puts b
browser.close
This is a minor rewrite of your main loop in a more Ruby-like way.
b = Array.new
browser.tables.each_with_index do |table, t|
browser.table(:index, 1 + t).rows.each_with_index do |row, r|
a = Array.new # start empty; Array.new(10) would prefill the row with 10 nils
browser.table(:index, 1 + t).row(:index, 1 + r).cells.each do |cell|
a << cell.text
end
b << a
end
end
puts b
I moved the array initializations to immediately above where they'll be needed. That's a programmer-choice thing of course.
Rather than create two counter variables up above, I switched to using each_with_index which adds an index variable, starting at 0. To get your 1-offsets I add 1.
They're not big changes but they add up to a more cohesive app.
Back to the original code: one issue I see is that you create the a array outside the loops and then reuse it each time you append to b. Because b stores references to that one array rather than copies, every a.clear also empties what b already holds, so b ends up containing the same array repeated.
require 'pp'
a = []
b = []
puts a.object_id
a[0] = 1
b << a
a.clear
a[0] = 2
b << a
puts
pp b
b.each { |ary| puts ary.object_id }
# >> 2151839900
# >>
# >> [[2], [2]]
# >> 2151839900
# >> 2151839900
Notice that the a array gets reused repeatedly.
If I change a to a second array there are two values for b and a is two separate objects:
require 'pp'
a = []
b = []
puts a.object_id
a[0] = 1
b << a
a = []
a[0] = 2
b << a
puts
pp b
b.each { |ary| puts ary.object_id }
# >> 2151839920
# >>
# >> [[1], [2]]
# >> 2151839920
# >> 2151839780
Hopefully that'll help you avoid the problem in the future.
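If you would rather keep the clear-and-reuse pattern, another option is to push a copy of a instead of a itself. The sketch below uses small literal arrays in place of the scraped cells; note that dup makes a shallow copy, which is enough here because only the outer array is cleared.

```ruby
a = []
b = []

[[1, 2], [3, 4]].each do |cells|
  cells.each { |cell| a << cell }
  b << a.dup # store a copy, so clearing a does not affect b
  a.clear
end

p b
```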
Your problem is there at the end:
b << a # push a *reference to* a onto b
a.clear # clear a; the reference in b now points to an empty array!
If you remove the call to a.clear and start that loop with:
browser.tables.each do |table|
t = t + 1
a = []
...you'll be golden (at least as far as your array-building goes)
I can't tell from your question whether you have multiple tables or not. Maybe just one? In which case:
b = browser.tables.first.rows.map {|row| row.cells.map(&:text)}
If you have multiple tables, and really want an array (tables) of arrays (rows) of arrays (cells), that would be
b = browser.tables.map {|t| t.rows.map {|row| row.cells.map(&:text)}}
And if the tables all have the same structure and you just want all the rows as if they were in one big table, you can do:
b = browser.tables.map {|t| t.rows.map {|row| row.cells.map(&:text)}}.flatten(1)
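The last form can also be written with flat_map. The sketch below only illustrates the Enumerable calls: plain nested arrays stand in for browser.tables, and upcase stands in for reading a cell's text, so none of it touches Celerity.

```ruby
# Stand-in data: two "tables", each an array of rows, each row an array of cell texts.
tables = [
  [["a1", "a2"], ["b1", "b2"]],
  [["c1", "c2"]]
]

nested = tables.map { |t| t.map { |row| row.map(&:upcase) } }
rows   = tables.flat_map { |t| t.map { |row| row.map(&:upcase) } }

p nested.flatten(1) == rows
p rows
```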

How to iterate through an array starting from the last element? (Ruby)

I came up with the solution below, but I believe there must be a nicer one out there.
array = [ 'first','middle','last']
index = array.length
array.length.times { index -= 1; puts array[index]}
Ruby is smart
a = [ "a", "b", "c" ]
a.reverse_each {|x| print x, " " }
array.reverse.each { |x| puts x }
In case you want to iterate through a range in reverse then use:
(0..5).reverse_each do |i|
# do something
end
You can even use a for loop
array = [ 'first','middle','last']
for each in array.reverse do
puts each
end
will print
last
middle
first
If you want to achieve the same without using reverse (this question sometimes comes up in interviews), you need some basic logic:
the array can be accessed by index
set the index to the array length minus 1, then decrement it by 1 until it goes below 0
output each element to the screen or a new array, or use the loop to perform any other logic.
def reverseArray(input)
output = []
index = input.length - 1 # since 0 based index and iterating from last to first
loop do
output << input[index]
index -= 1
break if index < 0
end
output
end
array = ["first","middle","last"]
reverseArray array #outputs: ["last","middle","first"]
In a jade template you can use:
for item in array.reverse()
item
You can use the unshift method to iterate and add items to a new "reversed" array.
unshift adds a new item to the beginning of an array,
while << adds one to the end. That's why unshift works well here.
array = [ 'first','middle','last']
output = []
# or:
for item in array # array.each do |item|
output.unshift(item) # output.unshift(item)
end # end
puts "Reversed array: #{output}"
will print: Reversed array: ["last", "middle", "first"]
We can also use "until":
index = array.length - 1
until index == -1
p array[index]
index -= 1
end
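The same index-based walk can be sketched with Integer#downto, which avoids managing the counter by hand. The second half shows each_with_index.reverse_each, which is handy when the original index is needed next to each element; the variable `pairs` exists only for this example.

```ruby
array = ['first', 'middle', 'last']

# Walk the indices from the last one down to 0.
(array.length - 1).downto(0) do |index|
  puts "#{index}: #{array[index]}"
end

# each_with_index.reverse_each keeps the original index alongside each element.
pairs = []
array.each_with_index.reverse_each { |item, index| pairs << [index, item] }
p pairs
```

For an empty array, downto simply does not run the block, so no guard is needed.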
