Assign hash key and value from string using split - ruby

I have a few strings that I am retrieving from a file birthdays.txt. An example of a string is below:
Christopher Alexander, Oct 4, 1936
I would like to separate the strings and let variable name be a hash key and birthdate the hash value. Here is my code:
birthdays = {}
File.read('birthdays.txt').each_line do |line|
line = line.chomp
name, birthdate = line.split(/\s*,\s*/).first
birthdays = {"#{name}" => "#{birthdate}"}
puts birthdays
end
I managed to assign name to the key. However, birthdate returns "".

File.new('birthdays.txt').each.with_object({}) do
|line, birthdays|
birthdays.store(*line.chomp.split(/\s*,\s*/, 2))
puts birthdays
end

I feel like some of the other solutions are overthinking this a bit. All you need to do is split each line into two parts, the part before the first comma and the part after, which you can do with line.split(/,\s*/, 2), then call to_h on the resulting array of arrays:
data = <<END
Christopher Alexander, Oct 4, 1936
Winston Churchill, Nov 30, 1874
Max Headroom, Apr 4, 1985
END
data.each_line.map do |line|
line.chomp.split(/,\s*/, 2)
end.to_h
# => { "Christopher Alexander" => "Oct 4, 1936",
# "Winston Churchill" => "Nov 30, 1874",
# "Max Headroom" => "Apr 4, 1985" }
(You will, of course, want to replace data with your File object.)

birthdays = Hash.new
File.read('birthdays.txt').each_line do |line|
line = line.chomp
name, birthdate = line.split(/\s*,\s*/, 2)
birthdays[name]= birthdate
puts birthdays
end
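For what it's worth, the code in the question seems to go wrong in two places, and a short sketch on a literal string (rather than the file) may make that easier to see. `.first` keeps only the first element of the split, so `birthdate` is left as `nil`, which interpolates to `""`. Without a limit of 2, the split would also cut the date at its own comma. The question's loop also reassigns `birthdays` on every line instead of adding to it. The variable names `broken`, `parts_unlimited` and `fixed` are only for illustration.

```ruby
line = "Christopher Alexander, Oct 4, 1936"

# .first returns a single string, so only `name` receives a value.
name, birthdate = line.split(/\s*,\s*/).first
broken = [name, birthdate]            # birthdate is nil here

# Without a limit, the date is split at its own comma as well.
parts_unlimited = line.split(/\s*,\s*/)

# A limit of 2 splits at the first comma only.
name, birthdate = line.split(/\s*,\s*/, 2)
fixed = [name, birthdate]

p broken, parts_unlimited, fixed
```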

Using @Jordan's data:
data.each_line.with_object({}) do |line, h|
name, _, bdate = line.chomp.partition(/,\s*/)
h[name] = bdate
end
#=> {"Christopher Alexander"=>"Oct 4, 1936",
# "Winston Churchill"=>"Nov 30, 1874",
# "Max Headroom"=>"Apr 4, 1985"}

Related

Map strings of letters and numbers based on if there's a valid date

I have 5 strings:
MO170915C00075000
GILD1514H117
9ZZZFD898
AHMIQ
894990415
The first two have valid dates between the first set of numbers and then the next single letter (ex: C).
If I may ask, how would I pick out the first two strings based on the date in them (I need to identify that they contain a date after the first set of letters and before a single character) and then correctly format the dates?
For the first one, I can get the date using the GSub below:
("20" + @ticker.gsub(/(\w+?)(\d{6})([a-z])\d+/i,'\2')).to_date
You could use Date._parse to see which information could be found by Date.parse.
Without any specification, you'll be basically shooting in the dark. Since the logic is so fuzzy, it cannot work magically with any weird string as input:
require 'date'
weird_dates = %w(MO170915C00075000 MA20172115C00075000 GILD1514H117 9ZZZFD898 AHMIQ 894990415)
weird_dates.each do |date_str|
date_hash = Date._parse(date_str)
puts date_str
puts " #{date_hash}"
if date_hash[:year] && date_hash[:mon] && date_hash[:mday]
print " It looks like a date"
begin
date = Date.parse(date_str)
puts " : #{date}"
rescue ArgumentError
puts " but it's not a valid one!"
end
else
puts " Sorry, not enough information"
end
puts
end
It outputs:
MO170915C00075000
{:year=>2017, :mon=>9, :mday=>15}
It looks like a date : 2017-09-15
MA20172115C00075000
{:year=>2017, :mon=>21, :mday=>15}
It looks like a date but it's not a valid one!
GILD1514H117
{:hour=>1514, :min=>117}
Sorry, not enough information
9ZZZFD898
{:yday=>898}
Sorry, not enough information
AHMIQ
{}
Sorry, not enough information
894990415
{}
Sorry, not enough information
If you know the exact input format, you should use Date.strptime.
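As a rough sketch of that last suggestion, suppose the format is: letters, then six digits read as YYMMDD, then a single letter, then more digits. That layout is an assumption based on the first sample string, not something the question confirms, and it deliberately ignores shorter forms such as "GILD1514H117". `TICKER_DATE` and `ticker_date` are hypothetical names.

```ruby
require 'date'

# Assumed layout: letters, YYMMDD, one letter, digits (e.g. "MO170915C00075000").
TICKER_DATE = /\A[A-Z]+(\d{6})[A-Z]\d+\z/i

# Returns a Date, or nil when the string does not match the assumed
# layout or the six digits do not form a valid date.
def ticker_date(str)
  digits = str[TICKER_DATE, 1]
  return nil unless digits
  Date.strptime(digits, '%y%m%d')
rescue ArgumentError
  nil
end

%w(MO170915C00075000 MO172115C00075000 AHMIQ).each do |s|
  puts "#{s}: #{ticker_date(s).inspect}"
end
```

Returning nil instead of raising keeps the caller simple when most inputs are not expected to contain a date.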
Code
require 'date'
def extract_dates(arr)
arr.each_with_object([]) do |str,a|
s = str[/\d+/] || ''
a <<
case s.size
when 8
[convert_to_time(s, 4, 2, 2)]
when 7
[convert_to_time(s, 4, 2, 1), convert_to_time(s, 4, 1, 2)]
when 6
[convert_to_time(s, 4, 1, 1), convert_to_time(s, 2, 2, 2)]
when 5
[convert_to_time(s, 2, 2, 1), convert_to_time(s, 2, 1, 2)]
when 4
[convert_to_time(s, 2, 1, 1)]
else
[]
end.compact
end
end
def convert_to_time(s, y, m, d)
ss = s.dup
ss.insert(0, "20") if y == 2
ss.insert(4, "0") if m == 1
ss.insert(6, "0") if d == 1
DateTime.strptime(ss, "%Y%m%d") rescue nil
end
Examples
arr = <<-_.split
MA170915C00075000
MA20170915C00075000
MA20172115C00075000
GILD1514H117
GILD15111H117
9ZZZFD898
AHMIQ
894990415
_
#=> ["MA170915C00075000", "MA20170915C00075000", "MA20172115C00075000",
# "GILD1514H117", "GILD15111H117", "9ZZZFD898", "AHMIQ", "894990415"]
arr.zip(extract_dates arr)
#=> [["MA170915C00075000",
# [#<DateTime: 1709-01-05T00:00:00+00:00 ((2345264j,0s,0n),+0s,2299161j)>,
# #<DateTime: 2017-09-15T00:00:00+00:00 ((2458012j,0s,0n),+0s,2299161j)>]],
# ["MA20170915C00075000",
# [#<DateTime: 2017-09-15T00:00:00+00:00 ((2458012j,0s,0n),+0s,2299161j)>]],
# ["MA20172115C00075000", []],
# ["GILD1514H117",
# [#<DateTime: 2015-01-04T00:00:00+00:00 ((2457027j,0s,0n),+0s,2299161j)>]],
# ["GILD15111H117",
# [#<DateTime: 2015-11-01T00:00:00+00:00 ((2457328j,0s,0n),+0s,2299161j)>,
# #<DateTime: 2015-01-11T00:00:00+00:00 ((2457034j,0s,0n),+0s,2299161j)>]],
# ["9ZZZFD898", []],
# ["AHMIQ", []],
# ["894990415", []]]
This shows that:
the first string of digits in "MA170915C00075000", "170915", can be interpreted as one of two dates, "1709-01-05" and "2017-09-15".
the first digits in "MA20170915C00075000", "20170915", can be interpreted as one date only, "2017-09-15".
the first digits in "MA20172115C00075000", "20172115", do not represent a valid date, so an empty array is returned.
the first digits in "GILD15111H117", "15111", could represent the date "2015-11-01" or "2015-01-11".

How do you iterate over an array to create a hash?

I need to create a hash from the array below, and have it look like - peoples_ages = {"Joe"=> 25}. I can iterate over it using each_with_index, but I don't need the index as the value, I need the person's age. Instead I was thinking of using either Hash[people_array]... or .each_with_object. Is it best to use .map instead and just put .to_h after?
class Person
attr_reader :name, :age
def initialize(name, age)
@name = name
@age = age
end
end
nick = Person.new("Nick", 25)
jim = Person.new("Jim", 37)
bob = Person.new("Bob", 23)
rob = Person.new("Rob", 29)
sue = Person.new("Sue", 31)
peeps = [nick, jim, bob, rob, sue]
# iterate over peeps array to create a hash that looks like this:
# people_ages = {
# "Nick" => 25,
# "Jim" => 37,
# "Bob" => 23,
# etc...
# }
peeps.each_with_object({}){|e, h| h[e.name] = e.age}
Hash[peeps.map {|person| [person.name, person.age]} ]
Or, on Ruby 2.1 or later:
peeps.map {|person| [person.name, person.age]}.to_h
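On newer Rubies (2.6 and later) the map step can probably be dropped, since `to_h` accepts a block that returns a `[key, value]` pair. A minimal sketch follows, with a `Struct` standing in for the `Person` class from the question purely to keep the example short.

```ruby
# Struct used here only as a compact stand-in for the Person class.
Person = Struct.new(:name, :age)
peeps = [Person.new("Nick", 25), Person.new("Jim", 37), Person.new("Bob", 23)]

# The block returns a two-element array: [key, value].
people_ages = peeps.to_h { |person| [person.name, person.age] }

p people_ages
```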

Remove leading white space in Ruby hash value

I'm working on an example problem from Chris Pine's Learn to Program book and I'm having an issue removing white space in my hash values.
I start with a txt file that contains names and birthday information, like so:
Christopher Alexander,  Oct  4, 1936
Christopher Lambert,    Mar 29, 1957
Christopher Lee,        May 27, 1922
Christopher Lloyd,      Oct 22, 1938
Christopher Pine,       Aug  3, 1976
Then I go through each line, split at the first comma, and then try to go through each key,value to strip the white space.
birth_dates = Hash.new {}
File.open 'birthdays.txt', 'r' do |f|
f.read.each_line do |line|
name, date = line.split(/,/, 2)
birth_dates[name] = date
birth_dates.each_key { |a| birth_dates[a].strip! }
end
But nothing is getting stripped.
{"Christopher Alexander"=>"  Oct  4, 1936", "Christopher Lambert"=>"    Mar 29, 1957", "Christopher Lee"=>"        May 27, 1922", "Christopher Lloyd"=>"      Oct 22, 1938", "Christopher Pine"=>"       Aug  3, 1976", "Christopher Plummer"=>"    Dec 13, 1927", "Christopher Walken"=>"     Mar 31, 1943", "The King of Spain"=>"      Jan  5, 1938"}
I've seen a handful of solutions for Arrays using .map - but this was the only hash example I came across. Any idea why it may not be working for me?
UPDATE: removed the redundant chomp as per sawa's comment.
For parsing comma-delimited files I use CSV, like this:
require 'csv'
def parse_birthdays(file='birthdays.txt', hash={})
CSV.foreach(file, :converters=> lambda {|f| f ? f.strip : nil}){|name, date, year|hash[name] = "#{year}-#{date.gsub(/ +/,'-')}" }
hash
end
parse_birthdays
# {"Christopher Alexander"=>"1936-Oct-4", "Christopher Lambert"=>"1957-Mar-29", "Christopher Lee"=>"1922-May-27", "Christopher Lloyd"=>"1938-Oct-22", "Christopher Pine"=>"1976-Aug-3"}
Or, if you need real dates, you can drop the lambda:
require 'date'
def parse_birthdays(file='birthdays.txt', hash={})
CSV.foreach(file){|name, date, year|hash[name] = Date.parse("#{year}-#{date}")}
hash
end
parse_birthdays
# {"Christopher Alexander"=>#<Date: 2014-10-04 ((2456935j,0s,0n),+0s,2299161j)>, "Christopher Lambert"=>#<Date: 2014-03-29 ((2456746j,0s,0n),+0s,2299161j)>, "Christopher Lee"=>#<Date: 2014-05-27 ((2456805j,0s,0n),+0s,2299161j)>, "Christopher Lloyd"=>#<Date: 2014-10-22 ((2456953j,0s,0n),+0s,2299161j)>, "Christopher Pine"=>#<Date: 2014-08-03 ((2456873j,0s,0n),+0s,2299161j)>}
I would write this
File.open 'birthdays.txt', 'r' do |f|
f.read.each_line do |line|
name, date = line.split(/,/, 2)
birth_dates[name] = date.chomp
birth_dates.each_key { |a| birth_dates[a].strip! }
end
as below:
File.open 'birthdays.txt', 'r' do |f|
f.read.each_line do |line|
name, date = line.split(/,/, 2)
birth_dates[name] = date.chomp.strip
end
end
or
birth_dates = File.readlines('birthdays.txt').each_with_object({}) do |line,hsh|
name, date = line.split(/,/, 2)
hsh[name] = date.chomp.strip
end
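If real dates are wanted, note that the Date.parse output shown earlier has every year as 2014, which suggests the birth year was not picked up from the padded string. One possible alternative is `Date.strptime` with an explicit format. The sketch below works on an in-memory array of lines instead of the file so that it is self-contained; the format string `'%b %d, %Y'` is an assumption based on the sample data.

```ruby
require 'date'

lines = [
  "Christopher Alexander,  Oct  4, 1936\n",
  "Christopher Lambert,    Mar 29, 1957\n"
]

birth_dates = lines.each_with_object({}) do |line, hsh|
  name, date = line.split(',', 2)
  # strip removes the padding and newline; squeeze(' ') turns "Oct  4" into "Oct 4".
  hsh[name] = Date.strptime(date.strip.squeeze(' '), '%b %d, %Y')
end

birth_dates.each { |name, date| puts "#{name}: #{date}" }
```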

reset a count in ruby when putting to screen from parallel arrays

This is homework, so I would prefer not to put up my code. I have two parallel arrays: names and ages. The idea is to puts all ages less than 21, which I can do. The problem is that when I puts "#{count}. #{names[count]}, #{ages[count]}", the leading count prints the index (position) of the element in the array. Obviously what I want is for the numbering to start at 1. If there are three names...
name, age
name, age
name, age
NOT
5, name, age
6, name, age
I am using a while loop with an if statement. I don't need code, just would like some feedback to trigger more ideas. Thanks for your time, much appreciated.
names[name1, name2, name3]
ages[age1, age2, age3]
#view all younger than 21
count = 0
while count < names.length
if ages[count] < 21
puts "#{count}. #{names[count]}, #{ages[count]}" #works
end
count += 1
end
pause
You shouldn't have "parallel arrays" in the first place! Data that belongs together should be manipulated together, not separately.
Instead of something like
names = %w[john bob alice liz]
ages = [16, 22, 18, 23 ]
You could, for example, have a map (called Hash in Ruby):
people = { 'john' => 16, 'bob' => 22, 'alice' => 18, 'liz' => 23 }
Then you would have something like:
puts people.select {|_name, age| age > 21 }.map.
with_index(1) {|(name, age), i| "#{i}. #{name}, #{age}" }
# 1. bob, 22
# 2. liz, 23
If you have no control over the creation of those parallel arrays, then it is still better to convert them to a sensible data structure first, and avoid the pain of having to juggle them in your algorithm:
people = Hash[names.zip(ages)]
Even better yet: you should have Person objects. After all, Ruby is object-oriented, not array-oriented or hash-oriented:
class Person < Struct.new(:name, :age)
def to_s
"#{name}, #{age}"
end
def adult?
age > 21
end
end
people = [
Person.new('john', 16),
Person.new('bob', 22),
Person.new('alice', 18),
Person.new('liz', 23)]
puts people.select(&:adult?).map.with_index(1) {|p, i| "#{i}. #{p}" }
Again, if you don't have control of the creation of those two parallel arrays, you can still convert them first:
people = names.zip(ages).map {|name, age| Person.new(name, age) }
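If the assignment requires keeping the parallel arrays and the while loop (an assumption about the homework), the numbering problem can probably be handled with a second counter that only advances when a line is printed. A minimal sketch using the sample data from above; `index`, `number` and `lines` are illustrative names.

```ruby
names = %w[john bob alice liz]
ages  = [16, 22, 18, 23]

index  = 0    # position in the parallel arrays
number = 0    # display number, advanced only when a line is emitted
lines  = []

while index < names.length
  if ages[index] < 21
    number += 1
    lines << "#{number}. #{names[index]}, #{ages[index]}"
  end
  index += 1
end

puts lines
```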

Nicely formatting output to console, specifying number of tabs

I am generating a script that is outputting information to the console. The information is some kind of statistic with a value. So much like a hash.
So one value's name may be 8 characters long and another 3. When I loop through and output the information with two \t, some of the columns aren't aligned correctly.
So for example the output might be as such:
long value name 14
short 12
little 13
tiny 123421
long name again 912421
I want all the values lined up correctly. Right now I am doing this:
puts "#{value_name} - \t\t #{value}"
How could I say for long names, to only use one tab? Or is there another solution?
Provided you know the maximum length to be no more than 20 characters:
printf "%-20s %s\n", value_name, value
If you want to make it more dynamic, something like this should work nicely:
longest_key = data_hash.keys.max_by(&:length)
data_hash.each do |key, value|
printf "%-#{longest_key.length}s %s\n", key, value
end
There is usually a %10s kind of printf scheme that formats nicely.
However, I have not used ruby at all, so you need to check that.
Yes, there is printf with formatting.
The above example should right align in a space of 10 chars.
You can format based on your widest field in the column.
printf ([port, ]format, arg...)
Prints arguments formatted according to the format, like sprintf. If the first argument is an instance of IO or one of its subclasses, the output is redirected to that object. The default is the value of $stdout.
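To make the alignment behaviour concrete, here is a small sketch comparing `%10s` with `%-10s`. The trailing `|` is only there to make the padding visible.

```ruby
right = format("%10s|", "abc")     # pads on the left, so the text is right-aligned
left  = format("%-10s|", "abc")    # the - flag pads on the right instead

puts right
puts left

# printf accepts an IO object as an optional first argument.
printf($stderr, "%-10s %s\n", "short", 12)
```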
String has a built-in ljust for exactly this:
x = {"foo"=>37, "something long"=>42, "between"=>99}
x.each { |k, v| puts "#{k.ljust(20)} #{v}" }
# Outputs:
# foo                  37
# something long       42
# between              99
Or, if you want tabs, you can do a little math (assuming tab display width of 8) and write a short display function:
def tab_pad(label, tab_stop = 4)
label_tabs = label.length / 8
label.ljust(label.length + tab_stop - label_tabs, "\t")
end
x.each { |k, v| puts "#{tab_pad(k)}#{v}" }
# Outputs:
# foo 37
# something long 42
# between 99
There were a few bugs in it before, but now you can use most of the printf syntax with the % operator:
1.9.3-p194 :025 > " %-20s %05d" % ['hello', 12]
=> " hello                00012"
Of course you can use precalculated width too:
1.9.3-p194 :030 > "%-#{width}s %05x" % ['hello', 12]
=> "hello 0000c"
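As a sketch of how such a width might be precalculated, the longest key can be measured first and interpolated into the format string. The `stats` hash below is made-up sample data modelled on the question, and the `%6d` column width is an arbitrary choice.

```ruby
stats = { "long value name" => 14, "short" => 12, "tiny" => 123421 }

# Width of the widest label; 15 for this sample.
width = stats.keys.map(&:length).max

rows = stats.map { |name, value| "%-#{width}s %6d" % [name, value] }
puts rows
```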
I wrote a thing
Automatically detects column widths
Spaces with spaces
Array of arrays [[],[],...] or array of hashes [{},{},...]
Does not detect columns too wide for console window
lists = [
[ 123, "SDLKFJSLDKFJSLDKFJLSDKJF" ],
[ 123456, "ffff" ],
]
array_maxes
def array_maxes(lists)
lists.reduce([]) do |maxes, list|
list.each_with_index do |value, index|
maxes[index] = [(maxes[index] || 0), value.to_s.length].max
end
maxes
end
end
array_maxes(lists)
# => [6, 24]
puts_arrays_columns
def puts_arrays_columns(lists)
maxes = array_maxes(lists)
lists.each do |list|
list.each_with_index do |value, index|
print " #{value.to_s.rjust(maxes[index])},"
end
puts
end
end
puts_arrays_columns(lists)
# Output:
#     123, SDLKFJSLDKFJSLDKFJLSDKJF,
#  123456,                     ffff,
and another thing
hashes = [
{ "id" => 123, "name" => "SDLKFJSLDKFJSLDKFJLSDKJF" },
{ "id" => 123456, "name" => "ffff" },
]
hash_maxes
def hash_maxes(hashes)
hashes.reduce({}) do |maxes, hash|
hash.keys.each do |key|
maxes[key] = [(maxes[key] || 0), key.to_s.length].max
maxes[key] = [(maxes[key] || 0), hash[key].to_s.length].max
end
maxes
end
end
hash_maxes(hashes)
# => {"id"=>6, "name"=>24}
puts_hashes_columns
def puts_hashes_columns(hashes)
maxes = hash_maxes(hashes)
return if hashes.empty?
# Headers
hashes.first.each do |key, value|
print " #{key.to_s.rjust(maxes[key])},"
end
puts
hashes.each do |hash|
hash.each do |key, value|
print " #{value.to_s.rjust(maxes[key])},"
end
puts
end
end
puts_hashes_columns(hashes)
# Output:
#      id,                     name,
#     123, SDLKFJSLDKFJSLDKFJLSDKJF,
#  123456,                     ffff,
Edit: Fixes hash keys considered in the length.
hashes = [
{ id: 123, name: "DLKFJSDLKFJSLDKFJSDF", asdfasdf: :a },
{ id: 123456, name: "ffff", asdfasdf: :ab },
]
hash_maxes(hashes)
# => {:id=>6, :name=>20, :asdfasdf=>8}
Want to whitelist columns?
hashes.map{ |h| h.slice(:id, :name) }
# => [
# { id: 123, name: "DLKFJSDLKFJSLDKFJSDF" },
# { id: 123456, name: "ffff" },
#]
For future reference and people who look at this or find it... Use a gem. I suggest https://github.com/wbailey/command_line_reporter
You typically don't want to use tabs. You want to use spaces and essentially set up your "columns" yourself, or else you run into these types of problems.
