I am new to Ruby. Please help me with this.
I have a map containing 4 keys. Initially, the value of every key is zero, as shown below:
data_source_map = Hash.new
data_source_map.store("ab",0)
data_source_map.store("cde",0)
data_source_map.store("fgh",0)
data_source_map.store("jik",0)
I have a while loop that iterates over files from a specific location:
while (file = queue.deq)
begin
cat = 'cat'
if file.split('.').last=='gz' || file.split('.').last=='zip'
cat = 'zcat'
end
user_ids.each do |user|
res = run_command4("aws s3 cp #{file} - | #{cat} | grep #{user} | wc -l",true,'s3cmd stream failed')
output = "#{user},#{file.split('/')[-1]},#{file.split('/')[-2]},#{res[:output][0]}"
if "ab".eql?(file.split('/')[-2])
data_source ="ab"
elsif "cde".eql?(file.split('/')[-2])
data_source ="cde"
elsif "fgh".eql?(file.split('/')[-2])
data_source ="fgh"
elsif "jik".eql?(file.split('/')[-2])
data_source ="jik"
else
data_source ="NA"
end
end
end
end
res[:output][0] is a number associated with one of the keys. Each file has one of these keys and an integer count.
On each iteration I need to update the integer value of the given key. How do I do this in Ruby?
I am trying to create a consolidated report like the one below:
|ab |200
|cde |4000
|fgh |0
I suggest using a Hash with a default value so you don't need to initialize all possible keys before you start, but that is not required.
data_source_map = Hash.new(0)
To update a value in a hash, assign the new value to the key; it overwrites the old value.
data_source_map[data_source] = data_source_map[data_source] + 1
This takes the current value from the hash, adds 1 to it, and stores the result back into the Hash. This is a very common operation, so there is a shorthand for it as well.
data_source_map[data_source] += 1
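Since the question asks for the per-file count to be added rather than 1, the same shorthand works with any number. Here is a minimal sketch of that idea. The helper name add_count and the sample data are made up for illustration, and it assumes the count arrives as a string (as command output usually does), hence the to_i.

```ruby
# Hypothetical helper: adds one file's match count to the running total
# for its data source. The count is assumed to arrive as a string such as "150".
def add_count(totals, data_source, raw_count)
  totals[data_source] += raw_count.to_i
  totals
end

totals = Hash.new(0)

# Made-up sample results standing in for res[:output][0]
[["ab", "150"], ["cde", "4000"], ["ab", "50"]].each do |source, raw|
  add_count(totals, source, raw)
end

# Print the consolidated report in the format shown in the question
totals.each { |source, total| puts "|#{source} |#{total}" }
```

Inside the loop from the question, this would correspond to something like data_source_map[data_source] += res[:output][0].to_i, assuming res[:output][0] holds the wc -l output.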
I've sent my I18n files to be translated by a third party. Since my translator is not computer savvy, we made a spreadsheet with the keys; they were sent in dot notation along with the translated values.
For example:
es.models.parent: "Pariente"
es.models.teacher: "Profesor"
es.models.school: "Colegio"
How can I move that into a YAML file?
UPDATE: Just like @tadman said, this already is YAML. So if you are happy with the flat keys, you are just fine.
The rest of this answer covers the case where you would like the tree structure in your YAML.
The first thing to do is transform this into a Hash.
So the previous data becomes this:
tr = {}
tr["es.models.parent"] = "Pariente"
tr["es.models.teacher"] = "Profesor"
tr["es.models.school"] = "Colegio"
Then we walk through each dotted key and build the deeper, nested hash:
result = {} #The resulting hash
tr.each do |k, value|
h = result
keys = k.split(".") # This key is a concatenation of keys
keys.each_with_index do |key, index|
h[key] = {} unless h.has_key? key
if index == keys.length - 1 # If its the last element
h[key] = value # then we only need to set the value
else
h = h[key]
end
end
end
require 'yaml'
puts result.to_yaml #Here it is for your YAMLing pleasure
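For comparison, here is a more compact sketch of the same nesting idea. The method name nest_dotted_keys is made up for this example, and it assumes no key is both a leaf and a parent of another key (for instance "es.models" and "es.models.parent" together).

```ruby
require 'yaml'

# Hypothetical helper: turns {"a.b.c" => v} into {"a" => {"b" => {"c" => v}}}
def nest_dotted_keys(flat)
  flat.each_with_object({}) do |(dotted, value), result|
    # Everything before the last segment is a parent; the last segment is the leaf
    *parents, leaf = dotted.split(".")
    # Walk down the tree, creating intermediate hashes as needed
    node = parents.inject(result) { |hash, key| hash[key] ||= {} }
    node[leaf] = value
  end
end

flat = {
  "es.models.parent"  => "Pariente",
  "es.models.teacher" => "Profesor",
  "es.models.school"  => "Colegio"
}

puts nest_dotted_keys(flat).to_yaml
```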
This is my code, which is supposed to hash the 2 columns in fotoFd.csv and then save the hashed columns in a separate file, T4Friendship.csv:
require "csv"
arrayUser=[]
arrayUserUnique=[]
arrayFriends=[]
fileLink = "fotoFd.csv"
f = File.open(fileLink, "r")
f.each_line { |line|
row = line.split(",");
arrayUser<<row[0]
arrayFriends<<row[1]
}
arrayUserUnique = arrayUser.uniq
arrayHash = []
for i in 0..arrayUser.size-1
arrayHash<<arrayUser[i]
arrayHash<<i
end
hash = Hash[arrayHash.each_slice(2).to_a]
array1 =hash.values_at *arrayUser
array2 =hash.values_at *arrayFriends
fileLink = "T4Friendship.csv"
for i in 0..array1.size-1
logfile = File.new(fileLink,"a")
logfile.print("#{array1[i]},#{array2[i]}\n")
logfile.close
end
The first column contains users, and the second column contains their friends. So, I want it to produce something like this in T4Friendship.csv:
1 2
1 4
1 10
1 35
2 1
2 8
2 11
3 28
3 31
...
The problem is caused by the splat expansion of a large array. The splat * can be used to expand an array as a parameter list. The parameters are passed on the stack. If there are too many parameters, you'll exhaust stack space and get the mentioned error.
Here's a quick demo of the problem in irb that tries to splat an array of one million elements when calling puts:
irb
irb(main):001:0> a = [0] * 1000000; nil # Use nil to suppress statement output
=> nil
irb(main):002:0> puts *a
SystemStackError: stack level too deep
from /usr/lib/ruby/1.9.1/irb/workspace.rb:80
Maybe IRB bug!
irb(main):003:0>
You seem to be processing large CSV files, and so your arrayUser array is quite large. Expanding the large array with the splat causes the problem on the line:
array1 =hash.values_at *arrayUser
You can avoid the splat by calling map on arrayUser, and converting each value in a block:
array1 = arrayUser.map{ |user| hash[user] }
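To make the equivalence concrete, here is a small sketch. The helper name lookup_all and the sample data are only for illustration. For small inputs both forms return the same array; the map version never builds an argument list, so it does not depend on how many elements there are.

```ruby
# Hypothetical helper: looks up every key without splatting the array
def lookup_all(hash, keys)
  keys.map { |key| hash[key] }
end

ids   = { "alice" => 0, "bob" => 1 }
names = ["bob", "alice", "bob"]

via_splat = ids.values_at(*names)   # each name becomes a separate argument
via_map   = lookup_all(ids, names)  # one hash lookup per element

p via_splat == via_map  # => true

# The map form also copes with a very large array
big = lookup_all(ids, ["alice"] * 1_000_000)
p big.size  # => 1000000
```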
Suggested Code
Your code appears to map names to unique ID numbers. The output appears to be the same format as the input, except with the names translated to ID numbers. You can do this without keeping any arrays around eating up memory, and just use a single hash built up during read, and used to translate the names to numbers on the fly. The code would look like this:
def convertCsvNamesToNums(inputFileName, outputFileName)
# Create unique ID number hash
# When an unknown key is looked up, it is added with a new unique ID number
# Produces a 0 based index
nameHash = Hash.new { |hash, key| hash[key] = hash.size }
# Convert input CSV with names to output CSV with ID numbers
File.open(inputFileName, "r") do |inputFile|
File.open(outputFileName, 'w') do |outputFile|
inputFile.each_line do |line|
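# Parse names from input CSV (chomp drops the trailing newline,
# so a friend name matches the same name in the user column)
userName, friendName = line.chomp.split(",")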
# Map names to unique ID numbers
userNum = nameHash[userName]
friendNum = nameHash[friendName]
# Write unique ID numbers to output CSV
outputFile.puts "#{userNum}, #{friendNum}"
end
end
end
end
convertCsvNamesToNums("fotoFd.csv", "T4Friendship.csv")
Note: This code assigns ID numbers to users and friends as they are encountered. Your previous code assigned ID numbers to users only, and then looked up the friends afterwards. The code I suggested ensures friends are assigned ID numbers even if they never appear in the user list. The numerical ordering will differ slightly from what you supplied, but I assume that is not important.
You can also shorten the body of the inner loop to:
# Parse names from input, map to ID numbers, and write to output
outputFile.puts line.chomp.split(",").map{|name| nameHash[name]}.join(',')
I thought I'd include this change separately for readability.
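If the block form of Hash.new is unfamiliar, this standalone sketch shows how it hands out sequential ID numbers. The helper build_id_hash and the sample names are made up for the example.

```ruby
# Hypothetical helper: returns a hash that assigns the next free
# 0-based ID number to any key it has not seen before
def build_id_hash
  Hash.new { |hash, key| hash[key] = hash.size }
end

name_ids = build_id_hash

# "ann" is seen twice but keeps the ID it was given the first time
p %w[ann bob ann cy].map { |name| name_ids[name] }  # => [0, 1, 0, 2]

# Keys are compared exactly, so a name with a trailing newline
# is treated as a different name and gets its own ID
p name_ids["bob\n"]  # => 3
```

The last lookup is why it matters whether the line ending is stripped before splitting: the final field on each line would otherwise carry the newline with it.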
Updated Code
As per your request in the comments, here is code that gives priority to the user column for ID numbers. Only once the first column is completely processed will ID numbers be assigned to entries in the second column. It does this by first passing over the input once, adding the first column to the hash, and then passing over the input a second time to process it as before, using the pre-prepared hash from the first pass. New entries can still be added in the second pass in the case where the friend column contains a new entry that doesn't exist anywhere in the user column.
def convertCsvNamesToNums(inputFileName, outputFileName)
# Create unique ID number hash
# When an unknown key is looked up, it is added with a new unique ID number
# Produces a 0 based index
nameHash = Hash.new { |hash, key| hash[key] = hash.size }
# Pass over the data once to give priority to user column for ID numbers
File.open(inputFileName, "r") do |inputFile|
inputFile.each_line do |line|
name, = line.split(",") # Parse name from line, ignore the rest
nameHash[name] # Add name to unique ID number hash (if it doesn't already exist)
end
end
# Convert input CSV with names to output CSV with ID numbers
File.open(inputFileName, "r") do |inputFile|
File.open(outputFileName, 'w') do |outputFile|
inputFile.each_line do |line|
# Parse names from input, map to ID numbers, and write to output
outputFile.puts line.chomp.split(",").map{|name| nameHash[name]}.join(',')
end
end
end
end
convertCsvNamesToNums("fotoFd.csv", "T4Friendship.csv")
I'm trying to demonstrate a situation where it's necessary to pass a block to Hash.new in order to set up default values for a given key when creating a hash of hashes.
To show what can go wrong, I've created the following code, which passes a single value as an argument to Hash.new. I expected all outer hash keys to wind up holding a reference to the same inner hash, causing the counts for the "piles" to get mixed together. And indeed, that does seem to have happened. But part_counts.each doesn't seem to find any keys/values to iterate over, and part_counts.keys returns an empty array. Only part_counts[0] and part_counts[1] successfully retrieve a value for me.
piles = [
[:gear, :spring, :gear],
[:axle, :gear, :spring],
]
# I do realize this should be:
# Hash.new {|h, k| h[k] = Hash.new(0)}
part_counts = Hash.new(Hash.new(0))
piles.each_with_index do |pile, pile_index|
pile.each do |part|
part_counts[pile_index][part] += 1
end
end
p part_counts # => {}
p part_counts.keys # => []
# The next line prints no output
part_counts.each { |key, value| p key, value }
p part_counts[0] # => {:gear=>3, :spring=>2, :axle=>1}
For context, here is the corrected code that I intend to show after the "broken" code. The parts for each pile within part_counts are separated, as they should be. each and keys work as expected, as well.
# ...same pile initialization code as above...
part_counts = Hash.new {|h, k| h[k] = Hash.new(0)}
# ...same part counting code as above...
p part_counts # => {0=>{:gear=>2, :spring=>1}, 1=>{:axle=>1, :gear=>1, :spring=>1}}
p part_counts.keys # => [0, 1]
# The next line of code prints:
# 0
# {:gear=>2, :spring=>1}
# 1
# {:axle=>1, :gear=>1, :spring=>1}
part_counts.each { |key, value| p key, value }
p part_counts[0] # => {:gear=>2, :spring=>1}
But why don't each and keys work (at all) in the first sample?
We'll start by decomposing this a little bit:
part_counts = Hash.new(Hash.new(0))
That's the same as saying:
default_hash = { }
default_hash.default = 0
part_counts = { }
part_counts.default = default_hash
Later on, you're saying things like this:
part_counts[pile_index][part] += 1
That's the same as saying:
h = part_counts[pile_index]
h[part] += 1
You're not using the (correct) block form of the default value for your Hash so accessing the default value doesn't auto-vivify the key. That means that part_counts[pile_index] doesn't create a pile_index key in part_counts, it just gives you part_counts.default and you're really saying:
h = part_counts.default
h[part] += 1
You're not doing anything else to add keys to part_counts so it has no keys and:
part_counts.keys == [ ]
So why does part_counts[0] give us {:gear=>3, :spring=>2, :axle=>1}? part_counts doesn't have any keys and in particular doesn't have a 0 key so:
part_counts[0]
is the same as
part_counts.default
Up above where you're accessing part_counts[pile_index], you're really just getting a reference to the default. The Hash won't clone it, so you get the very same default object that the Hash will hand out next time. That means that:
part_counts[pile_index][part] += 1
is another way of saying:
part_counts.default[part] += 1
so you're actually just changing part_counts's default value in-place. Then when you evaluate part_counts[0], you're accessing this modified default value and there's the {:gear=>3, :spring=>2, :axle=>1} that you accidentally built in your loop.
The value given to Hash.new is used as the default value, but this value is not inserted into the hash. So part_counts remains empty. You can get the default value by using part_counts[...], but this has no effect on the hash; it doesn't really contain the key.
When you call part_counts[pile_index][part] += 1, then part_counts[pile_index] returns the default value, and it's this value that is modified with the assignment, not part_counts.
You have something like:
outer = Hash.new({})
outer[1][2] = 3
p outer, outer[1]
which can also be written like:
inner = {}
outer = Hash.new(inner)
inner2 = outer[1] # inner2 refers to the same object as inner, outer is not modified
inner2[2] = 3 # same as inner[2] = 3
p outer, inner
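A quick way to see the difference for yourself is to check object identity and key?. This is only a sketch; the two helper methods exist purely to label the two styles.

```ruby
# Illustrative helpers, not part of the original code
def shared_default
  Hash.new({})                             # one inner hash, returned for every missing key
end

def per_key_default
  Hash.new { |hash, key| hash[key] = {} }  # a fresh inner hash stored under each new key
end

shared = shared_default
shared[1][:x] = 1                          # mutates the default object, stores nothing
p shared.keys                              # => []
p shared.key?(1)                           # => false
p shared[1].equal?(shared[2])              # => true (same object both times)

per_key = per_key_default
per_key[1][:x] = 1                         # the block stores a new hash under key 1
p per_key.keys                             # => [1]
p per_key[1].equal?(per_key[2])            # => false (this lookup also creates key 2)
```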
I'm completely new to ruby and wanted to ask for some help with this ruby script.
It's supposed to take in a string and find out which character occurs the most frequently. It does this using a hash: it stores all the characters in a hash and then iterates through it to find the one with the greatest value. As of right now it doesn't seem to be working properly and I'm not sure why. It reads the characters in properly, as far as I can tell with print statements. Any help is appreciated.
Thanks!
puts "Enter the string you want to search "
input = gets.chomp
charHash = Hash.new
input.split("").each do |i|
if charHash.has_key?(i)
puts "incrementing"
charHash[i]+=1
else
puts"storing"
charHash.store(i, 1)
end
end
goc = ""
max = 0
charHash.each { |key,value| goc = key if value > max }
puts "The character #{goc} occurs the most frequently"
There are two major issues with your code:
As commented by Holger Just, you have to use += 1 instead of ++
charHash.store(:i, 1) stores the symbol :i, you want to store i
Fixing these results in working code (I'm using snake_case here):
char_hash = Hash.new
input.split("").each do |i|
if char_hash.has_key?(i)
char_hash[i] += 1
else
char_hash.store(i, 1)
end
end
You can omit the condition by using 0 as your default hash value and you can replace split("").each with each_char:
char_hash = Hash.new(0)
input.each_char do |i|
char_hash[i] += 1
end
Finally, you can pass the hash into the loop using Enumerator#with_object:
char_hash = input.each_char.with_object(Hash.new(0)) { |i, h| h[i] += 1 }
I might be missing something but it seems that instead of
charHash.each { |key,value| goc = key if value > max }
you need something like
charHash.each do |key,value|
if value > max then
max = value
goc = key
end
end
Notice the max = value statement. In your current implementation (i.e. without updating the max variable), every character that appears in the text at least once satisfies the condition and you end up getting the last one.
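If you would rather not track max by hand, Enumerable#max_by can pick the entry with the largest count. A minimal sketch follows; the method name most_frequent_char is made up, and it returns nil for an empty string.

```ruby
# Hypothetical helper: returns [character, count] for the most frequent
# character in text, or nil if text is empty
def most_frequent_char(text)
  counts = text.each_char.with_object(Hash.new(0)) { |ch, h| h[ch] += 1 }
  # max_by yields each [key, value] pair; compare on the count
  counts.max_by { |_ch, count| count }
end

char, count = most_frequent_char("hello")
puts "The character #{char} occurs the most frequently (#{count} times)"
```

For "hello" this reports l with a count of 2. When several characters tie for the highest count, only one of them is returned.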
My txt file contains a few lines, and I want to add each line to a hash with the first 2 words as the key and the 3rd word as the value. The following code raises no errors, but the logic may be wrong: the last line is supposed to print all the keys of the hash, but nothing happens. Please help.
def word_count(string)
count = string.count(' ')
return count
end
h = Hash.new
f = File.open('sheet.txt','r')
f.each_line do |line|
count = word_count(line)
if count == 3
a = line.split
h.merge(a[0]+a[1] => a[2])
end
end
puts h.keys
Hash#merge doesn't modify the hash you call it on, it returns the merged Hash:
merge(other_hash) → new_hash
Returns a new hash containing the contents of other_hash and the contents of hsh. [...]
Note the Returns a new hash... part. When you say this:
h.merge(a[0]+a[1] => a[2])
You're merging the new values you built into a copy of h and then throwing away the merged hash; the end result is that h never gets anything added to it and ends up being empty after all your work.
You want to use merge! to modify the Hash:
h.merge!(a[0]+a[1] => a[2])
or keep using merge but save the return value:
h = h.merge(a[0]+a[1] => a[2])
or, since you're only adding a single value, just assign it:
h[a[0] + a[1]] = a[2]
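Here is a tiny sketch you can paste into irb to see the three variants side by side. The keys and values are made up.

```ruby
h = { "a" => 1 }

merged = h.merge("b" => 2)  # returns a new hash; h itself is untouched
p h.size                    # => 1
p merged.size               # => 2

h.merge!("b" => 2)          # the bang version modifies h in place
p h.size                    # => 2

h["c"] = 3                  # plain assignment is enough for a single pair
p h.size                    # => 3
```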
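If you want to add the first three words of each line to the hash, regardless of how many words there are, then you can drop the if count == 3 line. Note that word_count counts spaces rather than words, so a line of three single-spaced words has a count of 2. If you want to make sure that there are at least three words, check if count >= 2, or test the size of line.split instead.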
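One way to sidestep counting spaces altogether is to split first and check how many words came back. This is only a sketch: the method name index_lines is made up, and it takes any collection of lines (an open File responds to each_line, or you can pass an array as below).

```ruby
# Hypothetical helper: builds a hash whose key is the first two words of a
# line joined together and whose value is the third word.
# Lines with fewer than three words are skipped.
def index_lines(lines)
  lines.each_with_object({}) do |line, h|
    words = line.split           # splits on any whitespace, drops the newline
    next if words.size < 3
    h[words[0] + words[1]] = words[2]
  end
end

sample = ["red apple fruit\n", "too short\n", "green  pea vegetable extra\n"]
puts index_lines(sample).keys
```

With the sample lines above this prints redapple and greenpea. For the file from the question, something like index_lines(File.foreach('sheet.txt')) should behave the same way, assuming the file is whitespace-separated.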
Also, mu is correct. You'll want h.merge!