Problems with Ruby Arrays - ruby

I am currently writing a small Ruby class that is intended to store the number of times a randomly generated number is seen, along with the value itself, in an array.
I am trying to add each value to the array with a default times-seen count of 1. The script should check whether the value is already in the array and, if so, increment its times-seen count by 1.
However, I am receiving duplicate values, which shouldn't happen: the code should store each value only once, and if the value is already in the memory array, increment its count by one.
The code is attached below. If anyone can suggest what I am doing wrong, it would be awesome.
Cheers
Martin
@memory = Array.new
def store(value)
foundflag = false
@memory.each do |array|
if value == array[0]
#Incrementing value timesseen
array[1] = array[1]+1
#Value found, changing found flag
foundflag = true
#Loop break
break
end
end
if foundflag != true then
@memory.push([value,1])
end
end
store(5)
Full script (untidy!)
class STAT
def initialize()
#STAT memory settings
@memory = Array.new
#Prediction settings
@predictions = 0
@sucessfulpredictions = 0
end
#STAT main functions
def store(value)
foundflag = false
@memory.each do |array|
if value == array[0]
#Incrementing value timesseen
array[1] = array[1]+1
#Value found, changing found flag
foundflag = true
#Loop break
break
end
end
if foundflag != true then
@memory.push([value,1])
end
end
def predict(nosimulations)
#Generate random value less than the total memory size
randomvalue = rand(total)+1
count = 0
@memory.each do |array|
value = array[0]
timesseen = array[1]
if randomvalue <= count + timesseen
puts "Predicted value #{value}"
end
count = count + array[1]
end
end
def simulate(nosimulations)
count = 1
while count <= nosimulations
#Generating a random number
randomnumber = rand(100)+1
#Storing the random number
store(randomnumber)
#Print simulation details#
puts "Running simulation #{count} of #{nosimulations}"
puts "Generated random number: #{randomnumber}"
#Incrementing count
count = count + 1
end
end
#STAT technical functions
def inspect()
#Print memory information message
puts "Memory information:"
@memory.each do |array|
value = array[0]
timesseen = array[1]
puts "Value #{value} - times seen: #{timesseen}/#{total}"
end
end
def total()
total = 0
@memory.each do |array|
total = total + array[1]
end
return total
end
#STAT load/save functions
def load(filename)
#Default engine to be loaded
enginename = "#{filename}.stat"
#Print checking for saved engine message
puts "Checking for saved memory file..."
#Checking for saved engine
if File.exist?(enginename)
#Print loading engine message
puts "Loading memory..."
@memory = Marshal.load File.binread(enginename)
#Print engine loaded message
puts "Engine loaded"
else
#Print memory not found message
puts "Cannot load memory, no memory file found"
end
end
def save(filename)
#Default name for engine to be saved
enginename = "#{filename}.stat"
#Print saving engine message
puts "Saving memory..."
#Saving engine to specified file
serialized_array = Marshal.dump(@memory)
savefile = File.new(enginename,"wb")
savefile.write(serialized_array)
savefile.close
#Print engine saved message
puts "Memory saved"
end
end
#STAT class test software
stat = STAT.new
filename = "test"
#Load
stat.load(filename)
#Simulate
stat.simulate(1000000)
#Testing
#stat.store(5)
#stat.store(5)
#Inspect
stat.inspect
#Predict
#stat.predict(1000000)
#Save
stat.save(filename)

Use:
occurrences = Hash[ my_array.group_by{ |o| o }.map{ |o,a| [o,a.length] } ]
to get a hash mapping each value to the number of times it occurs. To calculate it yourself:
occurrences = Hash.new(0)
my_array.each{ |o| occurrences[o]+=1 }
See the docs if you don't know what these methods do.
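As a rough sketch of how the hash-based tally could replace the array-of-pairs `store` method from the question: the `Counter` class and its method names are hypothetical, and `Array#tally` assumes Ruby 2.7 or later.

```ruby
# Hypothetical counter class: a Hash with a default of 0 replaces the
# array of [value, times_seen] pairs and the linear search through it.
class Counter
  def initialize
    @seen = Hash.new(0)
  end

  # Record one sighting of value.
  def store(value)
    @seen[value] += 1
  end

  # How often value has been stored (0 if never).
  def times_seen(value)
    @seen[value]
  end

  # Total number of sightings across all values.
  def total
    @seen.values.sum
  end
end

counter = Counter.new
[5, 3, 5, 5].each { |n| counter.store(n) }
puts counter.times_seen(5)
puts counter.total

# Array#tally (Ruby 2.7+) builds the same kind of hash in one call.
p [5, 3, 5, 5].tally
```

Because each value is a hash key, a duplicate entry cannot occur, and looking a value up no longer requires walking the whole collection.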

You are working too hard. You want to store the amount of times a randomly generated number is seen within an array, along with the value that is seen itself.
How can Ruby help you with that? Since you are working with an array, glance through the available methods and yes, there it is: count.
ar = [1,2,1,3,2,4,3,2,6,7,8,7]
rn = rand(10)
ar.count(rn)

Related

Is there a function in Ruby to increment an object's variable inside an array in this example?

The drop1.amount or drop2.amount of the Drop objects in this example will not increase after the first time they're run through.
class Drop
attr_accessor :item, :price, :amount
end
drop1 = Drop.new()
drop1.item = "item1"
drop1.price = 2247
drop1.amount = 1
drop2 = Drop.new()
drop2.item = "item2"
drop2.price = 4401
drop2.amount = 60
x = 0
array = []
while x < 10
rand1 = rand(2)
if rand1 == 0
if array.include? drop1.item
drop1.amount = drop1.amount + 1
else
array << drop1.item
array << drop1.amount
end
elsif rand1 == 1
if array.include? drop2.item
drop2.amount = drop2.amount + 60
else
array << drop2.item
array << drop2.amount
end
end
x += 1
end
puts array.to_s.gsub('"', '').gsub('[', '').gsub(']', '')
puts ""
puts drop1.amount
puts drop2.amount
Example of expected output:
item2, 240, item1, 6
6
240
Example of actual result:
item2, 60, item1, 1
6
240
I am looking for a change to the else statements in lines 24 and 32. The purpose of this program is to create an array of items that displays each "item" once, along with an incremented "amount" when a drop is randomly chosen multiple times.
array << drop1.amount does not make an alias of drop1.amount, it makes a one-time copy of the number value contained in drop1.amount. When you update drop1.amount, the copy in array is unchanged. Instead, put a reference to the object onto the result array or update the result array value directly (depending on whether you want to modify the original or not).
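The difference between storing the number and storing the object can be seen in isolation. This is only a minimal sketch; the two-field `Drop` struct and the names `by_value` and `by_reference` are made up for illustration.

```ruby
# Hypothetical, cut-down Drop with just the fields needed here.
Drop = Struct.new(:item, :amount)

drop = Drop.new("item1", 1)

by_value = [drop.amount]   # stores the Integer 1 itself
by_reference = [drop]      # stores a reference to the Drop object

# Reassigning the attribute makes drop.amount point at a new Integer;
# the 1 already sitting in by_value is untouched.
drop.amount += 1

p by_value.first            # still the old number
p by_reference.first.amount # sees the update through the object
```

Storing the object and reading its `amount` when printing keeps a single source of truth for the count.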
For example we can stick to the existing design with something like:
# ...
if array.include? drop1.item
array[array.index(drop1.item)+1] += 1
drop1.amount += 1 # optionally update the original (less ideal than an alias)
else
array << drop1.item
array << drop1.amount
end
# ...
if array.include? drop2.item
array[array.index(drop2.item)+1] += 60
drop2.amount += 60
else
array << drop2.item
array << drop2.amount
end
# ...
While this emits the expected output, this sort of awkward searching and repeated code suggests that there are fundamental design flaws.
I'd write the program something like:
Drop = Struct.new(:item, :price, :amount)
drops = [
Drop.new("item1", 2247, 1),
Drop.new("item2", 4401, 60)
]
increment_amounts = drops.map {|e| e.amount}
result = [nil] * drops.size
10.times do
choice = rand(drops.size)
if result[choice]
result[choice].amount += increment_amounts[choice]
else
result[choice] = drops[choice]
end
end
puts result.compact
.shuffle
.flat_map {|e| [e.item, e.amount]}
.to_s
.gsub(/["\[\]]/, "")
puts "\n" + drops.map {|e| e.amount}.join("\n")
Suggestions which the above version illustrates:
Use a struct instead of a class for such a simple type and set its properties using the constructor rather than accessors.
Use arrays instead of thing1, thing2, etc. This makes it a lot easier to make random choices (among other things). Note that the above version is expandable if you later decide to add more drops. After adding a third or 100 drops (along with corresponding increment amounts), everything just works.
Prefer a clear name like result instead of a generic name like array.
x = 0 ... while x < 10 ... x += 1 is clearer as 10.times.
Pass a regex to gsub instead of a string to avoid chaining multiple calls.
I don't know if I understand the logic properly, but consider using a Hash instead of an Array:
h = {}
10.times do |x|
rand1 = rand(2)
if rand1 == 0
if h.has_key? drop1.item
drop1.amount += 1
h[drop1.item] = drop1.amount
else
h[drop1.item] = drop1.amount
end
elsif rand1 == 1
if h.has_key? drop2.item
drop2.amount += 60
h[drop2.item] = drop2.amount
else
h[drop2.item] = drop2.amount
end
end
end
For checking the result:
p h
p drop1.amount
p drop2.amount
Another option, if it is viable for you, is to let the class do the job by defining it like this:
class Drop
attr_accessor :item, :price, :amount
def initialize(item:'no_name', price: 0, amount: 0)
@item = item
@price = price
@amount = amount
@counts = 0
@increment = amount
end
def count!
@counts += 1
@amount += @increment if @counts > 1
end
end
Then store the instances in an array:
drops = []
drops << Drop.new(item: 'item1', price: 2247, amount: 1)
drops << Drop.new(item: 'item2', price: 4401, amount: 60)
Run the random picking sampling the drops array:
10.times do |x|
drops.sample.count!
end
Check the result:
drops.each do |drop|
puts "#{drop.item} - #{drop.amount} - #{drop.price}"
end
You can also define a reset method which restores the amount and the counts to the original value:
def reset
@amount = @increment
@counts = 0
end

Merging Ranges using Sets - Error - Stack level too deep (SystemStackError)

I have a number of ranges that I want to merge together if they overlap. The way I'm currently doing this is by using Sets.
This is working. However, when I attempt the same code with larger ranges as follows, I get a stack level too deep (SystemStackError).
require 'set'
ranges = [Range.new(73, 856), Range.new(82, 1145), Range.new(116, 2914), Range.new(3203, 3241)]
set = Set.new
ranges.each { |r| set << r.to_set }
set.flatten!
sets_subsets = set.divide { |i, j| (i - j).abs == 1 } # this line causes the error
puts sets_subsets
The line that is failing is taken directly from the Ruby Set Documentation.
I would appreciate it if anyone could suggest a fix or an alternative that works for the above example
EDIT
I have put the full code I’m using here:
Basically it is used to add html tags to an amino acid sequence according to some features.
require 'set'
def calculate_formatting_classes(hsps, signalp)
merged_hsps = merge_ranges(hsps)
sp = format_signalp(merged_hsps, signalp)
hsp_class = (merged_hsps - sp[1]) - sp[0]
rank_format_positions(sp, hsp_class)
end
def merge_ranges(ranges)
set = Set.new
ranges.each { |r| set << r.to_set }
set.flatten
end
def format_signalp(merged_hsps, sp)
sp_class = sp - merged_hsps
sp_hsp_class = sp & merged_hsps # overlap regions between sp & merged_hsp
[sp_class, sp_hsp_class]
end
def rank_format_positions(sp, hsp_class)
results = []
results += sets_to_hash(sp[0], 'sp')
results += sets_to_hash(sp[1], 'sphsp')
results += sets_to_hash(hsp_class, 'hsp')
results.sort_by { |s| s[:pos] }
end
def sets_to_hash(set = nil, cl)
return nil if set.nil?
hashes = []
merged_set = set.divide { |i, j| (i - j).abs == 1 }
merged_set.each do |s|
hashes << { pos: s.min.to_i - 1, insert: "<span class=#{cl}>" }
hashes << { pos: s.max.to_i - 0.1, insert: '</span>' } # for ordering
end
hashes
end
working_hsp = [Range.new(7, 136), Range.new(143, 178)]
not_working_hsp = [Range.new(73, 856), Range.new(82, 1145),
Range.new(116, 2914), Range.new(3203, 3241)]
sp = Range.new(1, 20).to_set
# working
results = calculate_formatting_classes(working_hsp, sp)
# Not Working
# results = calculate_formatting_classes(not_working_hsp, sp)
puts results
Here is one way to do this:
ranges = [Range.new(73, 856), Range.new(82, 1145),
Range.new(116, 2914), Range.new(3203, 3241)]
ranges.size.times do
ranges = ranges.sort_by(&:begin)
t = ranges.each_cons(2).to_a
t.each do |r1, r2|
if (r2.cover? r1.begin) || (r2.cover? r1.end) ||
(r1.cover? r2.begin) || (r1.cover? r2.end)
ranges << Range.new([r1.begin, r2.begin].min, [r1.end, r2.end].max)
ranges.delete(r1)
ranges.delete(r2)
t.delete [r1,r2]
end
end
end
p ranges
#=> [73..2914, 3203..3241]
The other answers aren't bad, but I prefer a simple recursive approach:
def merge_ranges(*ranges)
range, *rest = ranges
return if range.nil?
# Find the index of the first range in `rest` that overlaps this one
other_idx = rest.find_index do |other|
range.cover?(other.begin) || other.cover?(range.begin)
end
if other_idx
# An overlapping range was found; remove it from `rest` and merge
# it with this one
other = rest.slice!(other_idx)
merged = ([range.begin, other.begin].min)..([range.end, other.end].max)
# Try again with the merged range and the remaining `rest`
merge_ranges(merged, *rest)
else
# No overlapping range was found; move on
[ range, *merge_ranges(*rest) ]
end
end
Note: This code assumes each range is ascending (e.g. 10..5 will break it).
Usage:
ranges = [ 73..856, 82..1145, 116..2914, 3203..3241 ]
p merge_ranges(*ranges)
# => [73..2914, 3203..3241]
ranges = [ 0..10, 5..20, 30..50, 45..80, 50..90, 100..101, 101..200 ]
p merge_ranges(*ranges)
# => [0..20, 30..90, 100..200]
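For long lists, where recursion depth could itself become a concern, a sort-then-sweep variant is another option. This is a different technique from the recursive one above, sketched under the assumption that every range is ascending and inclusive; `merge_sorted` is a hypothetical name.

```ruby
# Sort by start, then sweep once, extending the last merged range whenever
# the next range starts at or before its end.
def merge_sorted(ranges)
  ranges.sort_by(&:begin).each_with_object([]) do |range, merged|
    last = merged.last
    if last && range.begin <= last.end
      # Overlap: widen the last merged range in place.
      merged[-1] = last.begin..[last.end, range.end].max
    else
      # No overlap: start a new merged range.
      merged << range
    end
  end
end

p merge_sorted([73..856, 82..1145, 116..2914, 3203..3241])
# => [73..2914, 3203..3241]
p merge_sorted([0..10, 5..20, 30..50, 45..80, 50..90, 100..101, 101..200])
# => [0..20, 30..90, 100..200]
```

Like the recursive version, this leaves merely adjacent ranges such as 1..2 and 3..4 unmerged. The cost is dominated by the sort, and no Set of individual integers is ever built.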
I believe your resulting set has too many items (2881) to be used with divide, which if I understood correctly, would require 2881^2881 iterations, which is such a big number (8,7927981983090337174360463368808e+9966) that running it would take nearly forever even if you didn't get stack level too deep error.
Without using sets, you can use this code to merge the ranges:
module RangeMerger
def merge(range_b)
if cover?(range_b.first) && cover?(range_b.last)
self
elsif cover?(range_b.first)
self.class.new(first, range_b.last)
elsif cover?(range_b.last)
self.class.new(range_b.first, last)
else
nil # Unmergable
end
end
end
module ArrayRangePusher
def <<(item)
if item.kind_of?(Range)
item.extend RangeMerger
each_with_index do |own_item, idx|
own_item.extend RangeMerger
if new_range = own_item.merge(item)
self[idx] = new_range
return self
end
end
end
super
end
end
ranges = [Range.new(73, 856), Range.new(82, 1145), Range.new(116, 2914), Range.new(3203, 3241)]
new_ranges = Array.new
new_ranges.extend ArrayRangePusher
ranges.each do |range|
new_ranges << range
end
puts ranges.inspect
puts new_ranges.inspect
This will output:
[73..856, 82..1145, 116..2914, 3203..3241]
[73..2914, 3203..3241]
which I believe is the intended output for your original problem. It's a bit ugly, but I'm a bit rusty at the moment.
Edit: I don't think this has anything to do with your original problem before the edits which was about merging ranges.

puts statement is printing on two lines

I have a class called PolynomialElements, and inside that class I have a method called printElement with a puts statement that prints two variables. The puts statement is printing the variables on different lines. How do I get puts to print the two variables on one line? My code is below; the puts statement is on line #5.
class PolynomialElements
attr_accessor :element, :size
def printElement
puts "#{element}x^#{size}"
end
end
askAgain = true
polyArray = Array.new
while askAgain
puts "How many numbers do you want to enter? "
numString = gets
num = numString.to_i
while num > 0
puts "Enter a value for the Polynomial "
value = gets
polyArray.push(value)
num -= 1
end
sizeOfArray = polyArray.length
polyArray.each do |x|
var = PolynomialElements.new
var.element = x
sizeOfArray -= 1
var.size = sizeOfArray
var.printElement
end
puts "Enter y to enter new number or anything else to quit"
cont = gets
if cont.chomp != "y"
askAgain = false
else
polyArray.clear
end
end
In the while loop change:
value = gets
to:
value = gets.chomp
You will then get:
Enter a value for the Polynomial
3
1x^2
2x^1
3x^0
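The cause is that gets returns the typed text including the trailing newline. A small sketch with a hard-coded string standing in for user input (the variable names are illustrative only):

```ruby
# What gets returns when the user types 3 and presses Enter.
raw = "3\n"

with_newline = "#{raw}x^2"          # newline ends up in the middle
without_newline = "#{raw.chomp}x^2" # chomp strips the trailing newline

p with_newline
p without_newline
```

Using p rather than puts makes the embedded "\n" visible in the first string.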

How can you add to a hash value instead of having it overwrite with the new value?

Basically I have these files (medline from NCBI). Each is associated with a journal title. Each has 0, 1 or more genbank identification numbers (GBIDs). I can associate the number of GBIDs per file with each journal name. My problem is that I may have more than one file associated with the same journal, and I don't know how to add the number of GBIDs per file into a total number of GBIDs per journal.
My current code:
jt stands for journal title, pulled out properly from the file. GBIDs are added to the count as encountered.
Full code:
#!/usr/local/bin/ruby
require 'rubygems'
require 'bio'
Bio::NCBI.default_email = 'kepresto@uvm.edu'
ncbi_search = Bio::NCBI::REST::ESearch.new
ncbi_fetch = Bio::NCBI::REST::EFetch.new
print "\nQuery?\s"
query_phrase = gets.chomp
"\nYou said \"#{query_phrase}\". Searching, please wait..."
pmid_list = ncbi_search.search("pubmed", "#{query_phrase}", 0)
puts "\nYour search returned #{pmid_list.count} results."
if pmid_list.count > 200
puts "\nToo big."
exit
end
gbid_hash = Hash.new
jt_hash = Hash.new(0)
pmid_list.each do |pmid|
ncbi_fetch.pubmed(pmid, "medline").each do |pmid_line|
if pmid_line =~ /JT.+- (.+)\n/
jt = $1
jt_count = 0
jt_hash[jt] = jt_count
ncbi_fetch.pubmed(pmid, "medline").each do |pmid_line_2|
if pmid_line_2 =~ /SI.+- GENBANK\/(.+)\n/
gbid = $1
jt_count += 1
gbid_hash["#{gbid}\n"] = nil
end
end
if jt_count > 0
puts "#{jt} = #{jt_count}"
end
jt_hash[jt] += jt_count
end
end
end
jt_hash.each do |key,value|
# if value > 0
puts "Journal: #{key} has #{value} entries associated with it. "
# end
end
# gbid_file = File.open("temp_*.txt","r").each do |gbid_count|
# puts gbid_count
# end
My result:
Your search returned 192 results.
Virology journal = 8
Archives of virology = 9
Virus research = 1
Archives of virology = 6
Virology = 1
Basically, how do I get it to say Archives of virology = 15, but for any journal title? I tried a hash, but the second archives of virology just overwrote the first... is there a way to make two keys add their values in a hash?
I don't entirely follow what you are asking for here.
However, you are overwriting your value for a given hash key because you are doing this:
jt_count = 0
jt_hash[jt] = jt_count
You already initialized your hash earlier like this:
jt_hash = Hash.new(0)
That is, every key will have a default value of 0. Thus, there's no need to initialize jt_hash[jt] to 0.
If you remove this line:
jt_hash[jt] = jt_count
Then the values for jt_hash[jt] should accumulate for each pass through the loop
ncbi_fetch.pubmed(pmid, "medline").each do |pmid_line|
....
end
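To see the accumulation without the NCBI calls, here is a sketch that feeds the per-file counts from the question's output into a hash with a default of 0. The `per_file_counts` array is a stand-in for the fetch loop, not part of the original script.

```ruby
# Stand-in for the per-file results printed in the question.
per_file_counts = [
  ["Virology journal", 8],
  ["Archives of virology", 9],
  ["Virus research", 1],
  ["Archives of virology", 6],
  ["Virology", 1]
]

jt_hash = Hash.new(0)
per_file_counts.each do |jt, jt_count|
  # No explicit initialisation: a missing key reads as 0, so += accumulates.
  jt_hash[jt] += jt_count
end

jt_hash.each { |jt, total| puts "#{jt} = #{total}" }
```

With the assignment `jt_hash[jt] = jt_count` left out, the two "Archives of virology" files add up to 15 instead of the second overwriting the first.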
Change these two lines:
jt_count = 0
jt_hash[jt] = jt_count
to this:
if jt_hash[jt] == nil
jt_count = 0
jt_hash[jt] = jt_count
else
jt_count = jt_hash[jt]
end
This just checks the hash for a nil value at that key and, if it is nil, sticks an integer in it. If it is not nil, it returns the previous integer so you can add to it.

In a hash, how do you add two values for the same key instead of overwriting?

Basically I have these files (medline from NCBI). Each is associated with a journal title. Each has 0, 1 or more genbank identification numbers (GBIDs). I can associate the number of GBIDs per file with each journal name. My problem is that I may have more than one file associated with the same journal, and I don't know how to add the number of GBIDs per file into a total number of GBIDs per journal.
My current code:
jt stands for journal title, pulled out properly from the file. GBIDs are added to the count as encountered.
... up to this point, the first search is performed, each "pmid" you can think of
as a single file, so each "fetch" goes through all the files one at a time...
pmid_list.each do |pmid|
ncbi_fetch.pubmed(pmid, "medline").each do |pmid_line|
if pmid_line =~ /JT.+- (.+)\n/
jt = $1
jt_count = 0
jt_hash[jt] = jt_count
ncbi_fetch.pubmed(pmid, "medline").each do |pmid_line_2|
if pmid_line_2 =~ /SI.+- GENBANK\/(.+)\n/
gbid = $1
jt_count += 1
gbid_hash["#{gbid}\n"] = nil
end
end
if jt_count > 0
puts "#{jt} = #{jt_count}"
end
end
end
end
My result:
Your search returned 192 results.
Virology journal = 8
Archives of virology = 9
Virus research = 1
Archives of virology = 6
Virology = 1
Basically, how do I get it to say Archives of virology = 15, but for any journal title? I tried a hash, but the second archives of virology just overwrote the first... is there a way to make two keys add their values in a hash?
Full code:
#!/usr/local/bin/ruby
require 'rubygems'
require 'bio'
Bio::NCBI.default_email = 'kepresto@uvm.edu'
ncbi_search = Bio::NCBI::REST::ESearch.new
ncbi_fetch = Bio::NCBI::REST::EFetch.new
print "\nQuery?\s"
query_phrase = gets.chomp
"\nYou said \"#{query_phrase}\". Searching, please wait..."
pmid_list = ncbi_search.search("pubmed", "#{query_phrase}", 0)
puts "\nYour search returned #{pmid_list.count} results."
if pmid_list.count > 200
puts "\nToo big."
exit
end
gbid_hash = Hash.new
jt_hash = Hash.new(0)
pmid_list.each do |pmid|
ncbi_fetch.pubmed(pmid, "medline").each do |pmid_line|
if pmid_line =~ /JT.+- (.+)\n/
jt = $1
jt_count = 0
jt_hash[jt] = jt_count
ncbi_fetch.pubmed(pmid, "medline").each do |pmid_line_2|
if pmid_line_2 =~ /SI.+- GENBANK\/(.+)\n/
gbid = $1
jt_count += 1
gbid_hash["#{gbid}\n"] = nil
end
end
if jt_count > 0
puts "#{jt} = #{jt_count}"
end
jt_hash[jt] += jt_count
end
end
end
jt_hash.each do |key,value|
# if value > 0
puts "Journal: #{key} has #{value} entries associated with it. "
# end
end
# gbid_file = File.open("temp_*.txt","r").each do |gbid_count|
# puts gbid_count
# end
Somewhere at the top, declare jt_hash so that its values start at zero:
jt_hash = Hash.new(0)
Then, after:
puts "#{jt} = #{jt_count}"
Put:
jt_hash[jt] += jt_count
This makes it so that jt_count is incremented in the hash, rather than overwritten. This will print out everything as it happens, so you'll get something like:
Your search returned 192 results.
Virology journal = 8
Archives of virology = 9
Virus research = 1
Archives of virology = 15
Virology = 1
If you then want everything to just print once just put something right at the end that goes through jt_hash and prints everything:
jt_hash.each { |jt, count|
puts "#{jt} = #{count}"
}

Resources