How to change a value in an array via a hash? - ruby

I want to change the value of an array via a hash, for example:
arr = ['g','g','e','z']
positions = {1 => arr[0], 2 => arr[1]}
positions[1] = "ee"
The problem is that only the hash changes, not the array: when I do p arr, it still outputs ['g','g','e','z']. Is there a way around this?

You're going to need to add another line of code to do what you want:
arr = ['g','g','e','z']
positions = {1 => arr[0], 2 => arr[1]}
positions[1] = "ee"
arr[0] = positions[1]
Another option would be to write a method that automatically updates the array for you, something like this:
def update_hash_and_array(hash, array, val, index)
# Assumes index is 1-based (not zero-indexed), as in your hash
hash[index] = val
array[index - 1] = val
end
update_hash_and_array(positions, arr, "ee", 1) # Does what you want

You can build this into your hash using procs (lambdas).
arr = ['g','g','e','z']
positions = {1 => -> (val) { arr[0] = val } }
positions[1].('hello')
# arr => ['hello', 'g', 'e', 'z']
You can generalize this a bit if you want to generate a hash that can modify any array.
def remap_arr(arr, idx)
(idx...arr.length+idx).zip(arr.map.with_index{|_,i| -> (val) {arr[i] = val}}).to_h
end
arr = [1,2,3,4,5,6]
positions = remap_arr(arr, 1)
positions[2].('hello')
# arr => [1,'hello',3,4,5,6]
positions[6].('goodbye')
# arr => [1,'hello',3,4,5,'goodbye']
But I'm hoping this is just a thought experiment: there is no reason to change array indexing so that it starts from 1 rather than 0. In such cases, you would normally just offset the index you have to match the proper array indexing (starting at zero). If that is not sufficient, it's a sign you need a different data structure.
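A minimal sketch of the index-offset approach, assuming the 1-based positions come from somewhere else in your program. The helper name set_at_position is hypothetical and not from the question:

```ruby
# Hypothetical helper: translate a 1-based position into Ruby's 0-based index.
def set_at_position(array, position, value)
  array[position - 1] = value
end

arr = ['g', 'g', 'e', 'z']
set_at_position(arr, 1, 'ee')
p arr # prints ["ee", "g", "e", "z"]
```

With this there is no second structure to keep in sync; the array stays the only place the values live.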

#!/usr/bin/env ruby
a = %w(q w e)
h = {
1 => a[0]
}
puts a[0].object_id # 70114787518660
puts h[1].object_id # 70114787518660
puts a[0] === h[1] # true
# Assigning to h[1] below binds the key to a NEW String object; compare the object_ids.
# That is why you cannot change a value in the array via the hash.
h[1] = 'Z'
puts a[0].object_id # 70114787518660
puts h[1].object_id # 70114574058580
puts a[0] === h[1] # false
h[2] = a
puts a.object_id # 70308472111520
puts h[2].object_id # 70308472111520
puts h[2] === a # true
puts a[0] === h[2][0] # true
# Here we can change value in the array via the hash.
# Why?
# Because 'h[2]' and 'a' are associated with the same object '%w(q w e)'.
# We will change the VALUE without creating a new object.
h[2][0] = 'X'
puts a[0] # X
puts h[2][0] # X
puts a[0] === h[2][0] # true
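Following the same reasoning, the original hash from the question can change the array if the shared string is mutated in place instead of being reassigned. This is only a sketch: it assumes the elements are unfrozen strings (hence the dup), and it would not work for immutable values such as integers or symbols.

```ruby
# dup guarantees mutable strings even when string literals are frozen.
arr = ['g', 'g', 'e', 'z'].map(&:dup)
positions = { 1 => arr[0], 2 => arr[1] }

# String#replace changes the contents of the existing object instead of
# binding the hash key to a new one, so the array sees the change too.
positions[1].replace('ee')

p arr                          # prints ["ee", "g", "e", "z"]
p arr[0].equal?(positions[1])  # prints true (still the same object)
```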

Related

Project Euler 8 in Ruby

I know my code works to get the correct answer for 4 adjacent integers. But it's not working with 13.
The only thing I can think of is that it can be an issue with an unsigned int, but in Ruby I don't think I'd have that problem because it would change automatically into a Bignum class.
So that means that somewhere in my calculation I am wrong?
Please give me a hint.
# Euler 8
# http://projecteuler.net/index.php?section=problems&id=8
# Find the thirteen adjacent digits in the 1000-digit number
# that have the greatest product.
# What is the value of this product?
number = []
#split the integer as a string into an array
long_digit = "73167176531330624919225119674426574742355349194934
96983520312774506326239578318016984801869478851843
85861560789112949495459501737958331952853208805511
12540698747158523863050715693290963295227443043557
66896648950445244523161731856403098711121722383113
62229893423380308135336276614282806444486645238749
30358907296290491560440772390713810515859307960866
70172427121883998797908792274921901699720888093776
65727333001053367881220235421809751254540594752243
52584907711670556013604839586446706324415722155397
53697817977846174064955149290862569321978468622482
83972241375657056057490261407972968652414535100474
82166370484403199890008895243450658541227588666881
16427171479924442928230863465674813919123162824586
17866458359124566529476545682848912883142607690042
24219022671055626321111109370544217506941658960408
07198403850962455444362981230987879927244284909188
84580156166097919133875499200524063689912560717606
05886116467109405077541002256983155200055935729725
71636269561882670428252483600823257530420752963450"
long_digit.split("").map { |s| number << s.to_i }
#iterate through the array to find the 13 adjacent digits that have the largest product
largest_product = 0
a = 0
#stay within the bounds of the array
while number[a+12]
current_product = number[a] * number[a+1] * number[a+2] * number[a+3] * number[a+4] * number[a+5] * number[a+6] * number[a+7] * number[a+8] * number[a+9] * number[a+10] * number[a+11] * number[a+12]
if current_product > largest_product
largest_product = current_product
end
a = a + 1
end
puts largest_product
I think this solution is pretty clean and simple:
#!/usr/bin/env ruby
input = "
73167176531330624919225119674426574742355349194934
96983520312774506326239578318016984801869478851843
85861560789112949495459501737958331952853208805511
12540698747158523863050715693290963295227443043557
66896648950445244523161731856403098711121722383113
62229893423380308135336276614282806444486645238749
30358907296290491560440772390713810515859307960866
70172427121883998797908792274921901699720888093776
65727333001053367881220235421809751254540594752243
52584907711670556013604839586446706324415722155397
53697817977846174064955149290862569321978468622482
83972241375657056057490261407972968652414535100474
82166370484403199890008895243450658541227588666881
16427171479924442928230863465674813919123162824586
17866458359124566529476545682848912883142607690042
24219022671055626321111109370544217506941658960408
07198403850962455444362981230987879927244284909188
84580156166097919133875499200524063689912560717606
05886116467109405077541002256983155200055935729725
71636269561882670428252483600823257530420752963450"
.gsub(/\s+/, '')
puts input.chars
.map(&:to_i)
.each_cons(13)
.map { |seq| seq.reduce(:*) }
.max
gsub removes all whitespace, including the newlines.
chars gets the characters.
map(&:to_i) maps all the chars to ints.
each_cons(13) gets blocks of consecutive numbers (https://ruby-doc.org/core-2.4.1/Enumerable.html#method-i-each_cons)
map { |seq| seq.reduce(:*) } takes each of the consecutive blocks and reduces it by multiplying all of its numbers together.
max gets the maximum value.
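To see what each step in that chain produces, here is a small sketch on a made-up 10-digit string with a window of 3 instead of 13 (both values are assumptions chosen only to keep the output short):

```ruby
digits = "3675356291".chars.map(&:to_i)

# each_cons(3) yields every run of 3 neighbouring digits.
windows  = digits.each_cons(3).to_a
products = windows.map { |seq| seq.reduce(:*) }

p windows.first(3)  # prints [[3, 6, 7], [6, 7, 5], [7, 5, 3]]
p products.max      # prints 210 (from 6 * 7 * 5)
```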
The issue seems to be the whitespace characters (newlines) in the string long_digit: to_i turns each of them into 0 in the array number, which gives wrong results.
Here is a corrected and simplified version. After removing newlines and spaces using gsub, we have a 1000-digit number and get the correct answer.
number = long_digit.gsub(/\s/, '').split("").map { |s| s.to_i }
n = 13
p number.each_cons(n).map { |a| a.reduce(:*) }.max
#=> 23514624000
First, let's fix the string:
long_digit.gsub!(/\s/,'')
long_digit.size #=> 1000
We can speed this up by eliminating 13-character substrings that contain a zero:
shorter_digit_arr = long_digit.split('0').reject { |s| s.size < 13 }
#=> ["7316717653133",
# "6249192251196744265747423553491949349698352",
# "6326239578318",
# "18694788518438586156",
# "7891129494954595",
# "17379583319528532",
# "698747158523863",
# "435576689664895",
# "4452445231617318564",
# "987111217223831136222989342338",
# "81353362766142828",
# "64444866452387493",
# "1724271218839987979",
# "9377665727333",
# "594752243525849",
# "632441572215539753697817977846174",
# "86256932197846862248283972241375657",
# "79729686524145351",
# "6585412275886668811642717147992444292823",
# "863465674813919123162824586178664583591245665294765456828489128831426",
# "96245544436298123",
# "9878799272442849",
# "979191338754992",
# "559357297257163626956188267"]
Now, for each element of shorter_digit_arr, find the 13-character substring whose product of digits is greatest, then find the largest of those products (there are shorter_digit_arr.size #=> 24 of them). The main benefit of splitting the string into substrings in this way is that the absence of zeroes lets us compute the products more efficiently than simply grinding out 12 multiplications for each substring:
res = shorter_digit_arr.map do |s|
cand = s[0,13].each_char.reduce(1) { |prod,t| prod * t.to_i }
best = { val: cand, offset: 0 }
(13...s.size).each do |i|
cand = cand*(s[i].to_i)/(s[i-13].to_i)
best = { val: cand, offset: i-12 } if cand > best[:val]
end
[best[:val], s[best[:offset],13]]
end.max_by(&:first)
#=> [23514624000, "5576689664895"]
puts "max_product: %d for: '%s'" % res
#=> max_product: 23514624000 for: '5576689664895'
The solution is the last 13 characters of:
s = shorter_digit_arr[7]
#=> "435576689664895"
The key here is the line:
cand = cand*(s[i].to_i)/(s[i-13].to_i)
which computes a 13-digit product by multiplying the "previous" 13-digit product by the digit added and dividing it by the digit dropped off.
In finding the maximum product for this element, the calculations are as follows:
s = "435576689664895"
cand = s[0,13].each_char.reduce(1) { |prod,t| prod * t.to_i }
#=> = "4355766896648".each_char.reduce(1) { |prod,t| prod * t.to_i }
# = 6270566400
best = { val: 6270566400, offset: 0 }
enum = (13...s.size).each
#=> #<Enumerator: 13...15:each>
The elements of this enumerator will be passed to the block by Enumerator#each. We can see what they are by converting enum to an array:
enum.to_a
#=> [13, 14]
We can use Enumerator#next to simulate the passing of the elements of enum to the block and their assignment to the block variable i.
Pass the first element of the enumerator (13) to the block:
i = enum.next
#=> 13
cand = cand*(s[i].to_i)/(s[i-13].to_i)
# = 6270566400*(s[13].to_i)/(s[0].to_i)
# = 6270566400*(9)/(4)
# = 14108774400
cand > best[:val]
#=> 14108774400 > 6270566400 => true
best = { val: cand, offset: i-12 }
#=> { val: 14108774400, offset: 1 }
Pass the second element (14) to the block:
i = enum.next
#=> 14
cand = cand*(s[i].to_i)/(s[i-13].to_i)
#=> = 14108774400*(s[14].to_i)/(s[1].to_i)
# = 14108774400*(5)/(3)
# = 23514624000
cand > best[:val]
#=> 23514624000 > 14108774400 => true
best = { val: 23514624000, offset: 2 }
All elements of the enumerator have now been passed to the block. We can confirm that:
i = enum.next
#=> StopIteration: iteration reached an end
The result (for shorter_digit_arr[7]) is:
[best[:val], s[best[:offset],13]]
#=> [23514624000, "435576689664895"[2,13]]
# [23514624000, "5576689664895"]
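The rolling update walked through above can be wrapped in a small method to experiment with other window sizes. This is a sketch only: the name best_window and the width parameter are hypothetical, and it assumes the string contains no '0' (otherwise the division would fail).

```ruby
# Rolling product over windows of `width` digits; assumes `s` contains no '0'.
def best_window(s, width)
  cand = s[0, width].each_char.reduce(1) { |prod, t| prod * t.to_i }
  best = { val: cand, offset: 0 }
  (width...s.size).each do |i|
    # Multiply in the digit entering the window, divide out the digit leaving it.
    cand = cand * s[i].to_i / s[i - width].to_i
    best = { val: cand, offset: i - width + 1 } if cand > best[:val]
  end
  [best[:val], s[best[:offset], width]]
end

p best_window("435576689664895", 13) # prints [23514624000, "5576689664895"]
p best_window("2345", 2)             # prints [20, "45"]
```

The division is always exact here, because the digit being removed is one of the factors of the running product.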

Double index numbers in array (Ruby)

The array counts is as follows:
counts = ["a", 1]
What does this:
counts[0][0]
refer to?
I've only seen this before:
array[idx]
but never this:
array[idx][idx]
where idx is an integer.
This is the entire code where the snippet of code before was from:
def num_repeats(string) #abab
counts = [] #array
str_idx = 0
while str_idx < string.length #1 < 4
letter = string[str_idx] #b
counts_idx = 0
while counts_idx < counts.length #0 < 1
if counts[counts_idx][0] == letter #if counts[0][0] == b
counts[counts_idx][1] += 1
break
end
counts_idx += 1
end
if counts_idx == counts.length #0 = 0
# didn't find this letter in the counts array; count it for the
# first time
counts.push([letter, 1]) #counts = ["a", 1]
end
str_idx += 1
end
num_repeats = 0
counts_idx = 0
while counts_idx < counts.length
if counts[counts_idx][1] > 1
num_repeats += 1
end
counts_idx += 1
end
return counts
end
The statement
arr[0]
gets the first item of the array arr. In some cases that item is itself an array (or another indexable object), which means you can take that object and index into it in turn:
# if arr = [["item", "another"], "last"]
item = arr[0]
inner_item = item[0]
puts inner_item # => "item"
This can be shortened to
arr[0][0]
So any 2 dimensional array or array containing indexable objects can work like this, e.g. with an array of strings:
arr = ["String 1", "Geoff", "things"]
arr[0] # => "String 1"
arr[0][0] # => "S"
arr[1][0] # => "G"
It's for nested indexing:
a = [ "item 0", [1, 2, 3] ]
a[0] #=> "item 0"
a[1] #=> [1, 2, 3]
a[1][0] #=> 1
Since the value at index 1 is another array you can use index referencing on that value as well.
EDIT
Sorry I didn't thoroughly read the original question. The array in question is
counts = ["a", 1]
In this case counts[0] returns "a" and since we can use indexes to reference characters of a string, the 0th index in the string "a" is simply "a".
str = "hello"
str[2] #=> "l"
str[1] #=> "e"
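Note that in the full method from the question, counts.push([letter, 1]) pushes a two-element array, so counts appears to end up as an array of pairs rather than a flat ["a", 1]. A short sketch of what the double index does in that case:

```ruby
counts = []
counts.push(["a", 1])   # the same call the method makes for a first-seen letter

p counts        # prints [["a", 1]]
p counts[0]     # prints ["a", 1]  (the whole pair)
p counts[0][0]  # prints "a"       (the letter)

counts[0][1] += 1
p counts[0][1]  # prints 2         (the running count)
```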

How can I convert a human-readable number to a computer-readable number in Ruby?

I'm working in Ruby with an array that contains a series of numbers in human-readable format (e.g., 2.5B, 1.27M, 600,000, where "B" stands for billion, "M" stands for million). I'm trying to convert all elements of the array to the same format.
Here is the code I've written:
array.each do |elem|
if elem.include? 'B'
elem.slice! "B"
elem = elem.to_f
elem = (elem * 1000000000)
else if elem.include? 'M'
elem.slice! "M"
elem = elem.to_f
elem = (elem * 1000000)
end
end
When I inspect the elements of the array using puts(array), however, the numbers appear with the "B" and "M" sliced off but the multiplication conversion does not appear to have been applied (e.g., the numbers now read 2.5, 1.27, 600,000, instead of 2500000000, 1270000, 600,000).
What am I doing wrong?
The first thing to note is that else if in Ruby is elsif. See http://www.tutorialspoint.com/ruby/ruby_if_else.htm
Here is a working function for you to try out:
def convert_array_items_from_human_to_integers(array)
array.each_with_index do |elem,i|
if elem.include? 'B'
elem.slice! "B"
elem = elem.to_f
elem = (elem * 1000000000)
elsif elem.include? 'M'
elem.slice! "M"
elem = elem.to_f
elem = (elem * 1000000)
end
array[i] = elem
end
return array
end
Calling convert_array_items_from_human_to_integers(["2.5B", "1.2M"])
returns [2500000000.0, 1200000.0]
Another variation:
array = ['2.5B', '1.27M', '$600000']
p array.each_with_object([]) { |i, a|
i = i.gsub('$', '')
a << if i.include? 'B'
i.to_f * 1E9
elsif i.include? 'M'
i.to_f * 1E6
else
i.to_f
end
}
#=> [2500000000.0, 1270000.0, 600000.0]
Try this:
array.map do |elem|
elem = elem.gsub('$','')
if elem.include? 'B'
elem.to_f * 1000000000
elsif elem.include? 'M'
elem.to_f * 1000000
else
elem.to_f
end
end
This uses map instead of each to return a new array. Your attempt reassigns the block variable elem rather than the array elements, leaving the original array in place (except for the slice!, which modifies the strings in place). You can dispense with the slicing altogether, since to_f simply ignores any trailing non-numeric characters.
EDIT:
If you have leading characters such as $2.5B, as your question title indicates (but not your example), you'll need to strip those explicitly. But your sample code doesn't handle those either, so I assume that's not an issue.
Expanding a bit on pjs' answer:
array.each do |elem|
elem is a local variable pointing to each array element, one at a time. When you do this:
elem.slice! "B"
you are sending a message to that array element telling it to slice the B. And you're seeing that in the end result. But when you do this:
elem = elem.to_f
now you've reassigned your local variable elem to something completely new. You haven't reassigned what's in the array, just what elem is.
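A small sketch of that difference, using two of the sample strings (dup is used here only so the example also runs when string literals are frozen):

```ruby
array = ['2.5B', '1.27M'].map(&:dup)

array.each do |elem|
  elem.slice!('B')   # mutates the string object that the array also holds
  elem = elem.to_f   # rebinds only the block-local variable
end
p array # prints ["2.5", "1.27M"]

# map collects the block's return values into a new array instead.
converted = array.map { |elem| elem.to_f }
p converted # prints [2.5, 1.27]
```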
Here's how I'd go about it:
ARY = %w[2.5B 1.27M 600,000]
def clean_number(s)
s.gsub(/[^\d.]+/, '')
end
ARY.map{ |v|
case v
when /b$/i
clean_number(v).to_f * 1_000_000_000
when /m$/i
clean_number(v).to_f * 1_000_000
else
clean_number(v).to_f
end
}
# => [2500000000.0, 1270000.0, 600000.0]
The guts of the code are in the case statement. A simple check for the multiplier allows me to strip the undesired characters and multiply by the right value.
Normally we could use to_f to find the floating-point number to be multiplied for strings like '1.2', but it breaks down for things like '$1.2M' because of the "$". The same thing is true for embedded commas marking thousands:
'$1.2M'.to_f # => 0.0
'1.2M'.to_f # => 1.2
'6,000'.to_f # => 6.0
'6000'.to_f # => 6000.0
To fix the problem for simple strings containing just the value, it's not necessary to do anything fancier than stripping undesirable characters using gsub(/[^\d.]+/, ''):
'$1.2M'.gsub(/[^\d.]+/, '') # => "1.2"
'1.2M'.gsub(/[^\d.]+/, '') # => "1.2"
'6,000'.gsub(/[^\d.]+/, '') # => "6000"
'6000'.gsub(/[^\d.]+/, '') # => "6000"
[^\d.] means "anything that is NOT a digit or '.'".
Be careful how you convert your decimal values to integers. You could end up throwing away important precision:
'0.2M'.gsub(/[^\d.]+/, '').to_f * 1_000_000 # => 200000.0
'0.2M'.gsub(/[^\d.]+/, '').to_i * 1_000_000 # => 0
('0.2M'.gsub(/[^\d.]+/, '').to_f * 1_000_000).to_i # => 200000
Of course all this breaks down if your string is more complex than a simple number and multiplier. It's easy to break down a string and identify those sort of sub-strings, but that's a different question.
I would do it like this:
Code
T, M, B = 1_000, 1_000_000, 1_000_000_000
def convert(arr)
arr.map do |n|
m = n.gsub(/[^\d.TMB]/,'')
m.to_f * (m[-1][/[TMB]/] ? Object.const_get(m[-1]) : 1)
end
end
Example
arr = %w[$2.5B 1.27M 22.5T 600,000]
convert(arr)
# => [2500000000.0, 1270000.0, 22500.0, 600000.0]
Explanation
The line
m = n.gsub(/[^\d.TMB]/,'')
# => ["2.5B", "1.27M", "22.5T", "600000"]
merely eliminates unwanted characters.
m.to_f * (m[-1][/[TMB]/] ? Object.const_get(m[-1]) : 1)
returns the product of the string converted to a float and a constant given by the last character of the string, if that character is T, M or B, else 1.
An actual implementation might look like this:
class A
T, M, B = 1_000, 1_000_000, 1_000_000_000
def doit(arr)
c = self.class.constants.map(&:to_s).join
arr.map do |n|
m = n.gsub(/[^\d.#{c}]/,'')
m.to_f * (m[-1][/[#{c}]/] ? self.class.const_get(m[-1]) : 1)
end
end
end
If we wished to change the reference for 1,000 from T to K and add T for trillion, we would need only change
T, M, B = 1_000, 1_000_000, 1_000_000_000
to
K, M, B, T = 1_000, 1_000_000, 1_000_000_000, 1_000_000_000_000
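As an alternative sketch (a different technique from the constant lookup above, not the answer's code), the multipliers could live in a hash, which avoids const_get. The names MULTIPLIERS and to_number are hypothetical, and K for thousand is an assumed suffix:

```ruby
MULTIPLIERS = { 'K' => 1_000, 'M' => 1_000_000, 'B' => 1_000_000_000 }.freeze

def to_number(str)
  # Drop '$', ',' and anything else that is not a digit, '.' or a known suffix.
  cleaned = str.gsub(/[^\d.KMB]/, '')
  # fetch falls back to 1 when the last character is not a suffix.
  cleaned.to_f * MULTIPLIERS.fetch(cleaned[-1], 1)
end

p %w[$2.5B 1.27M 22.5K 600,000].map { |s| to_number(s) }
# prints [2500000000.0, 1270000.0, 22500.0, 600000.0]
```

Adding a new suffix then means adding one hash entry and one letter to the character class.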

Ruby Array Manipulation - Scanning and Counting

I'm trying to read through each string in the array and count the number of times each letter occurs in each position (i.e. 1, 2, 3, 4). How am I not using the multidimensional array and the += operator correctly?
def scan_str(arr)
position = [[]]
x = 0
arr.select do |word|
word.length.times do |i|
if word.index('G') == x
position[x+1,0] += 1
x += 1
elsif word.index('A') == x
position[x+1,1] += 1
x += 1
elsif word.index('T') == x
position[x+1,2] += 1
x += 1
elsif word.index('C') == x
position[x+1,3] += 1
x += 1
else
x += 1
end
end
end
p position
end
input = ["CTAGATA","CCCGAT","AAATT","TTCAAATGA"]
scan_str(input)
Thanks this is helpful. But now how do I manipulate the array without the error message "`[]': no implicit conversion from nil to integer (TypeError)"... There must be something I'm not getting about the index or position [][] syntax.
def scan_str(arr)
position = [[]]
z=arr.count
x = 0
arr.select do |word|
if word.index('G') == x
position[y][0] += (countG =+ 1)/z
x += 1
y += 1
elsif word.index('A') == x
position[y][1] += (countA =+ 1)/z
x += 1
y += 1
elsif word.index('T') == x
position[y][2] += (countT =+ 1)/z
x += 1
y += 1
elsif word.index('C') == x
position[y][3] += (countC =+ 1)/z
x += 1
y += 1
else
x += 1
y += 1
end
end
p position
end
input = ["CTAGATA","CCCGAT","AAATT","TTCAAATGA"]
scan_str(input)
As was almost answered in the comments:
position[1,3] returns up to 3 elements starting at index 1 (the 2nd position, counting from 0).
The correct syntax is position[1][3].
ps. Example:
arr=[[1,2,3], [4,5,6]]
arr[1][2]
#=> 6 (the element at index 2 of the array at index 1, counting from 0)
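The two forms can be compared side by side in a quick sketch (the grid values are made up for illustration):

```ruby
grid = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

p grid[1, 2]  # prints [[4, 5, 6], [7, 8, 9]]  (start at index 1, take 2 elements)
p grid[1][2]  # prints 6                       (row at index 1, then its element at index 2)
```

Because grid[1, 2] returns an array of rows, applying += 1 to it raises an error instead of incrementing a counter.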
It is not very beautiful and should probably be split into a few functions:
a = ["CTAGATA","CCCGAT","AAATT","TTCAAATGA"]
p Hash[
a.map{|sub| sub.chars.with_index(1).to_a}
.flatten(1).group_by(&:last)
.map{|pos, values|
[pos, Hash[values.group_by{|char,|char}.map{|char,s|[char, s.size.to_f/values.length]}]]
}
] #=> {1=>{"C"=>0.5, "A"=>0.25, "T"=>0.25}, 2=>{"T"=>0.5, "C"=>0.25, "A"=>0.25}, 3=>
As the problem with your code has been explained, I would like to suggest a more "Ruby-like" approach:
TEST = ['G', 'A', 'T', 'C']
def scan_str(arr)
TEST.each_with_object({}) {|c,h| h[c] = arr.each_with_object(Hash.new(0)) {|line, hh| \
line.chars.each_with_index {|s,i| hh[i] += 1 if s==c}}}
end
arr = ["CTAGATA","CCCGAT","AAATT","TTCAAATGA"]
scan_str(arr)
# => {"G"=>{3=>2, 7=>1}, \
# => "A"=>{2=>2, 4=>3, 6=>1, 0=>1, 1=>1, 3=>1, 5=>1, 8=>1}, \
# => "T"=>{1=>2, 5=>2, 3=>1, 4=>1, 0=>1, 6=>1}, \
# => "C"=>{0=>2, 1=>1, 2=>2}}
A few points:
It is probably most convenient to put the results in a hash. Here I have scan_str returning a hash whose keys are the elements of TEST. The value of each key is itself a hash, with each key being a line offset position and the associated value being the number of times the letter given by the outer key is located at that position.
I first iterate over the elements of TEST using Enumerable#each_with_object, with the object being an initially empty hash {}. Inside the block the hash is referenced by h. The alternative would be to define an empty hash (h = {}) in the line above and then use TEST.each {|c|... instead. Had I done that, it would also have been necessary to add the line h at the end of the method, so that the hash would be returned.
For each element c of TEST, I iterate over the lines of the array, again using each_with_object. This time, however, the object is Hash.new(0), which creates a hash with a default value of zero. By doing that, when hh[i] += 1 is executed in the inner loop, we don't have to check whether hh has a key i; if it does not, hh[i] returns the default value zero, so hh[i] += 1 sets hh[i] to 1.
For each line, line.chars converts the line to an array of characters. I then iterate with Enumerable#each_with_index. Inside the block, the character (string of length one) and line offset are referenced by s and i respectively.
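The effect of the zero default can be seen in isolation with a short sketch that counts where 'A' occurs in the first input line:

```ruby
counts = Hash.new(0)

"CTAGATA".each_char.with_index do |s, i|
  counts[i] += 1 if s == 'A'   # no need to initialise counts[i] first
end

p counts.keys      # prints [2, 4, 6]  ('A' occurs once at each of these offsets)
p counts[99]       # prints 0          (default value for a missing key)
p counts.key?(99)  # prints false      (reading the default does not store the key)
```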
There are a couple of ways to obtain the result you want. The first, and probably easiest, would be just to change the code I've already offered. I'll do that later today.
The second is to use the code above as a "helper method".
Use "helper" method
To use the code we already have, rename the scan_str method above to scan_str_helper and add this:
def scan_str(arr)
h = scan_str_helper(arr)
posh = Hash[h.values.map(&:keys).flatten.uniq.map {|e| \
[e,Hash[TEST.zip([0]*TEST.size)]]}]
h.each {|k,v| v.each {|kk,vv| posh[kk][k] += vv}}
posh.each_with_object({}) {|(k,v),hp| tot = 1.0 * v.values.reduce(&:+); \
hp[k] = Hash[v.keys.zip(v.values.map {|e| e/tot})]}
end
scan_str(arr)
# {3=>{"G"=>0.5, "A"=>0.25, "T"=>0.25, "C"=>0.0}, 7=>{"G"=>1.0, "A"=>0.0, "T"=>0.0, "C"=>0.0},
# 2=>{"G"=>0.0, "A"=>0.5, "T"=>0.0, "C"=>0.5}, 4=>{"G"=>0.0, "A"=>0.75, "T"=>0.25, "C"=>0.0},
# 6=>{"G"=>0.0, "A"=>0.5, "T"=>0.5, "C"=>0.0}, 0=>{"G"=>0.0, "A"=>0.25, "T"=>0.25, "C"=>0.5},
# 1=>{"G"=>0.0, "A"=>0.25, "T"=>0.5, "C"=>0.25},
# 5=>{"G"=>0.0, "A"=>0.3333333333333333, "T"=>0.6666666666666666, "C"=>0.0},
# 8=>{"G"=>0.0, "A"=>1.0, "T"=>0.0, "C"=>0.0}}
A few more notes:
h.values.map(&:keys).flatten.uniq => [3, 7, 2, 4, 6, 0, 1, 5, 8] merely constructs an array of the position offsets that contain one or more TEST elements.
TEST.zip([0]*TEST.size) => TEST.zip([0, 0, 0, 0]) => [["G",0], ["A",0], ["T",0], ["C",0]], and Hash[...] of that is {"G"=>0, "A"=>0, "T"=>0, "C"=>0}, so for e = 3 (say), the pair [3, {"G"=>0, "A"=>0, "T"=>0, "C"=>0}] becomes the entry 3=>{"G"=>0, "A"=>0, "T"=>0, "C"=>0} of posh.
Instead of building Hash[TEST.zip([0]*TEST.size)] inside the block, you may be tempted to build that hash once, before the map, and reuse it for every offset. That doesn't work. I'll leave it to you to figure out why.
h.each {|k,v| v.each {|kk,vv| posh[kk][k] += vv}} fills the hash posh =>
# {3=>{"G"=>2, "A"=>1, "T"=>1, "C"=>0}, 7=>{"G"=>1, "A"=>0, "T"=>0, "C"=>0},
# 2=>{"G"=>0, "A"=>2, "T"=>0, "C"=>2}, 4=>{"G"=>0, "A"=>3, "T"=>1, "C"=>0},
# 6=>{"G"=>0, "A"=>1, "T"=>1, "C"=>0}, 0=>{"G"=>0, "A"=>1, "T"=>1, "C"=>2},
# 1=>{"G"=>0, "A"=>1, "T"=>2, "C"=>1}, 5=>{"G"=>0, "A"=>1, "T"=>2, "C"=>0},
# 8=>{"G"=>0, "A"=>1, "T"=>0, "C"=>0}}.
The last line merely converts the numbers of occurrences to fractions. For example, 3=>{"G"=>2, "A"=>1, "T"=>1, "C"=>0} is converted to 3=>{"G"=>0.5, "A"=>0.25, "T"=>0.25, "C"=>0.0}
Modification of Initial Code
def scan_str(arr)
a = Array.new(arr.map(&:size).max).map {|e| \
Hash[TEST.zip(Array.new(TEST.size,0))]}
arr.each {|s| s.chars.each_with_index {|c,i| TEST.each \
{|ss| a[i][ss] += 1 if c == ss}}}
Hash[a.map.with_index {|h,i| tot = 1.0 * h.values.reduce(&:+); tot > 0.0 ? \
[i, Hash[h.keys.zip(h.values.map {|e| e/tot})]] : nil}.compact]
end
The first statement creates an array a whose ith element corresponds to character offset i in each line. Each element is a hash that maps every element of TEST to zero.
The second statement fills the array a:
# => [{"G"=>0, "A"=>1, "T"=>1, "C"=>2}, {"G"=>0, "A"=>1, "T"=>2, "C"=>1},
# => {"G"=>0, "A"=>2, "T"=>0, "C"=>2}, {"G"=>2, "A"=>1, "T"=>1, "C"=>0},
# => {"G"=>0, "A"=>3, "T"=>1, "C"=>0}, {"G"=>0, "A"=>1, "T"=>2, "C"=>0},
# => {"G"=>0, "A"=>1, "T"=>1, "C"=>0}, {"G"=>1, "A"=>0, "T"=>0, "C"=>0},
# => {"G"=>0, "A"=>1, "T"=>0, "C"=>0}].
The last statement converts each element of a to an [offset, hash] pair if the sum of the values is positive; else to nil. compact removes all elements that are nil. Putting Hash[ at the beginning and ] at the end converts the array to a hash, which is returned by scan_str.
Note this approach gives the same result as the method that used the "helper" method, though the order of the elements of the hash is different.
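For comparison, here is a more compact sketch that computes the same per-offset fractions column by column. It is not the code above: the names LETTERS and position_fractions are hypothetical, it assumes Ruby 2.6+ (for to_h with a block), and it assumes every character in the input is one of G, A, T, C, so that the column size equals the total count used above.

```ruby
LETTERS = %w[G A T C]

def position_fractions(words)
  longest = words.map(&:size).max
  (0...longest).each_with_object({}) do |i, result|
    # Letters present at offset i; shorter words contribute nil, which compact drops.
    column = words.map { |w| w[i] }.compact
    result[i] = LETTERS.to_h { |l| [l, column.count(l).to_f / column.size] }
  end
end

fractions = position_fractions(["CTAGATA", "CCCGAT", "AAATT", "TTCAAATGA"])
p fractions[0].values  # prints [0.0, 0.25, 0.25, 0.5]  (G, A, T, C at offset 0)
p fractions[8].values  # prints [0.0, 1.0, 0.0, 0.0]    (only "TTCAAATGA" reaches offset 8)
```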

Ruby array with an extra state

I'm trying to go through an array and add a second dimension for true and false values in Ruby.
For example, I will be pushing arrays onto another array, giving something like:
a = [[1,2,3,4],[5]]
I would like to go through each array inside of "a" and be able to mark a state of true or false for each individual value, similar to a Map in Java.
Any ideas? Thanks.
You're better off starting with this:
a = [{ 1 => false, 2 => false, 3 => false, 4 => false }, { 5 => false }]
Then you can just flip the booleans as needed. Otherwise you will have to pollute your code with a bunch of tests to see if you have a Fixnum (1, 2, ...) or a Hash ({1 => true}) before you can test the flag's value.
Hashes in Ruby 1.9 and later preserve insertion order, so you wouldn't lose your ordering by switching to hashes.
You can convert your array to this form with one of these:
a = a.map { |x| Hash[x.zip([false] * x.length)] }
# or
a = a.map { |x| x.each_with_object({}) { |i,h| h[i] = false } }
And if using nil to mean "unvisited" makes more sense than starting with false then:
a = a.map { |x| Hash[x.zip([nil] * x.length)] }
# or
a = a.map { |x| x.each_with_object({}) { |i,h| h[i] = nil } }
Some useful references:
Hash[]
each_with_object
zip
Array *
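A short sketch of how the converted structure might be used afterwards. The variable name flags is hypothetical, and treating true as "visited" is an assumption about the intended meaning:

```ruby
a = [[1, 2, 3, 4], [5]]
flags = a.map { |x| x.each_with_object({}) { |i, h| h[i] = false } }

flags[0][3] = true   # mark the value 3 in the first group

p flags[0][3]                              # prints true
p flags[0].keys                            # prints [1, 2, 3, 4]  (original order kept)
p flags[0].select { |_, seen| seen }.keys  # prints [3]
```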
If what you are trying to do is simply tag specific elements in the member arrays with boolean values, it is just a simple matter of doing the following:
current_value = a[i][j]
a[i][j] = [current_value, true_or_false]
For example if you have
a = [[1,2,3,4],[5]]
Then if you say
a[0][2] = [a[0][2],true]
then a becomes
a = [[1,2,[3,true],4],[5]]
You can roll this into a method
def tag_array_element(a, i, j, boolean_value)
a[i][j] = [a[i][j], boolean_value]
end
You might want to enhance this a little so you don't tag a specific element twice. :) To do so, just check if a[i][j] is already an array.
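One way that check could look, as a sketch. It assumes the untagged values are never arrays themselves, since an array is how an existing tag is recognised here:

```ruby
# Tags a[i][j] with a boolean; if it is already tagged, only the flag is updated.
# Assumes the untagged values are never arrays themselves.
def tag_array_element(a, i, j, boolean_value)
  current = a[i][j]
  if current.is_a?(Array)
    current[1] = boolean_value
  else
    a[i][j] = [current, boolean_value]
  end
end

a = [[1, 2, 3, 4], [5]]
tag_array_element(a, 0, 2, true)
tag_array_element(a, 0, 2, false)   # second call updates the flag, no nesting
p a # prints [[1, 2, [3, false], 4], [5]]
```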
Replace x % 2 == 0 with the actual operation you want for the mapping:
>> xss = [[1,2,3,4],[5]]
>> xss.map { |xs| xs.map { |x| {x => x % 2 == 0} } }
#=> [[{1=>false}, {2=>true}, {3=>false}, {4=>true}], [{5=>false}]]
