Ruby parsing CSV date to format: invalid date (ArgumentError) - ruby

I am parsing a CSV file and I need to change the dob column to match the format YYYY-MM-DD. I keep getting this error: parse': invalid date (ArgumentError)
It happens when it tries to parse the date "6/6/99". How can I fix this so I don't get errors for any of the dates I have?
Here is the list of all dates in the CSV. I am not sure whether any of the dates after the one above would error out as well.
"12/12/2010"
"1/1/1988"
"2/2/1966"
"6/6/99"
"1/4/88"
"4/4/1948"
"1/6/1988"
"1/7/1988"
"1/8/88"
"1/9/88"
"1988-02-12"
"1-11-88"
"1/12/88"
"1/13/88"
My code:
require 'csv'
require 'time'
require 'date'
def parse_csv
  table = CSV.parse(File.read("input.csv", encoding: 'bom|utf-8'), headers: true, col_sep: ",")
  formatted = table.map(&:to_h)
  formatted.each do |x|
    if x["dob"] =~ /^\d{4}-\d{2}-\d{2}$/
      p "correct"
    else
      parsed = Date.parse(x["dob"], "%Y-%m-%d")
      p parsed
    end
  end
end
parse_csv

From the documentation:
Date.parse
Parses the given representation of date and time, and creates a date
object.
This method does not function as a validator. If the input string
does not match valid formats strictly, you may get a cryptic result.
Should consider to use Date.strptime instead of this method as
possible.
Your current implementation doesn't really make sense; it's not doing what you think it is:
Date.parse('6/6/99', "%Y-%m-%d")
#=> Date::Error: invalid date
This isn't saying "convert '6/6/99' into '1999-06-06'"; it's saying "try to parse '6/6/99' into a Date object". The second argument is not a format string at all: Date.parse treats it as its comp flag, so "%Y-%m-%d" is essentially being ignored.
If you're confident what format that date is supposed to be, then (as per the documentation referenced above!) you can use Date.strptime to try explicitly parsing it as this format. For example:
Date.strptime('6/6/99', "%m/%d/%y")
#=> #<Date: 1999-06-06 ((2451336j,0s,0n),+0s,2299161j)>
Or if you're not confident whether the first two values are supposed to represent month-day or day-month, then you'd need to handle this explicitly in the code and treat the error appropriately.
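To make the month-day versus day-month question concrete, here is a minimal sketch that tries both readings and reports every date the input could mean. The helper name interpretations is made up for illustration, and the two formats are assumptions about what the file might contain.

```ruby
require 'date'

# Hypothetical helper: returns every distinct ISO date that `raw` could mean
# when read as month/day/year and as day/month/year (two-digit years).
def interpretations(raw)
  candidates = ['%m/%d/%y', '%d/%m/%y'].map do |format|
    begin
      Date.strptime(raw, format).iso8601
    rescue ArgumentError
      nil # this reading is not a real calendar date
    end
  end
  candidates.compact.uniq
end

p interpretations('1/13/88') # only the month/day reading is a valid date
p interpretations('1/4/88')  # two readings, so the input is ambiguous
```

A value with more than one interpretation cannot be resolved by code alone; you need to know how the file was produced.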
tl;dr:
Date.parse is unreliable for arbitrary inputs; it only makes a "best guess" for the date format. And unsurprisingly here, it fails for at least one of the ambiguous formats you're throwing at it.
Date.strptime is the correct way to parse each date, when you know which format you expect.
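As a sketch of how this could be applied to the dates listed in the question: check the shape of each value first, then hand it to Date.strptime with the matching format. The names DOB_FORMATS and normalize_dob are hypothetical, and month-before-day is an assumption (values such as "1/13/88" suggest it). The shape check comes first because a format like %m/%d/%y may otherwise accept the leading digits of a four-digit year.

```ruby
require 'date'

# Each entry pairs a shape check with the strptime format for that shape.
# Month-before-day is an assumption, suggested by values such as "1/13/88".
DOB_FORMATS = [
  [%r{\A\d{4}-\d{1,2}-\d{1,2}\z}, '%Y-%m-%d'],
  [%r{\A\d{1,2}/\d{1,2}/\d{4}\z}, '%m/%d/%Y'],
  [%r{\A\d{1,2}/\d{1,2}/\d{2}\z}, '%m/%d/%y'],
  [%r{\A\d{1,2}-\d{1,2}-\d{2}\z}, '%m-%d-%y']
].freeze

# Hypothetical helper: returns "YYYY-MM-DD", or nil when the value matches
# none of the known shapes or is not a real calendar date.
def normalize_dob(raw)
  text = raw.to_s.strip
  _pattern, format = DOB_FORMATS.find { |pattern, _| pattern.match?(text) }
  return nil unless format

  Date.strptime(text, format).strftime('%Y-%m-%d')
rescue ArgumentError
  nil
end

puts normalize_dob('6/6/99')
puts normalize_dob('1-11-88')
```

Returning nil instead of raising lets the caller decide what to do with rows that cannot be normalized, for example logging them for manual review.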

Related

Ruby date parsing

I'm sure I'm missing something obvious, but - I'm trying to parse a date like "8/5/2018" (in m/d/y format). Ruby seems perfectly capable of creating dates in this format using the format string %-m/%-d/%Y and strftime, but for some reason strptime can't read it?
[1] pry(main)> require 'date'
=> true
[2] pry(main)> format = '%-m/%-d/%Y'
=> "%-m/%-d/%Y"
[3] pry(main)> today = Date.today.strftime(format)
=> "8/5/2018"
[4] pry(main)> Date.strptime(today, format)
ArgumentError: invalid date
from (pry):4:in `strptime'
Does strptime use a different set of format keys than strftime (the ruby docs seem to suggest that they're the same)? Or am I missing something else here?
They use the same format keys, but the docs for strptime state:
strptime does not support specification of flags and width unlike strftime.
So the issue is those flags (in this case the hyphens, -). It's unfortunate that you can't use the same format specifier for both parsing and formatting dates, but it is what it is. You could do:
Date.strptime(today, format.delete('-'))
The problem is the format. If you change it to %m/%d/%Y, it works:
format = '%m/%d/%Y'
today = Date.today.strftime(format)
Date.strptime(today, format)
#=> Mon, 06 Aug 2018
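A small round-trip sketch of the idea in the answers above: keep the flagged format for output and derive a flag-free one for parsing. The helper name parse_format is made up for illustration, and it only handles the "-" flag used in the question. A fixed date is used instead of Date.today so the result is reproducible.

```ruby
require 'date'

# Hypothetical helper: drops the "-" padding flag (e.g. "%-m" becomes "%m")
# so a strftime-style format can be reused for parsing.
def parse_format(display_format)
  display_format.gsub('%-', '%')
end

display_format = '%-m/%-d/%Y'
text = Date.new(2018, 8, 5).strftime(display_format)
date = Date.strptime(text, parse_format(display_format))

puts text
puts date.iso8601
```

Replacing only "%-" leaves literal hyphens in the format alone, which format.delete('-') would not do for a format such as %Y-%m-%d.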

Changing the format from a MySQL datetime format to a different type in Ruby?

I am attempting to change an SQL datetime variable (2016-06-09 14:29:34) into a format that looks like this (00:00_20160601). I have tried to follow a couple of SO questions that will allow me to format a Time object.
This is what I have done so far:
start_datetime = "2016-06-09 14:29:34"
t = Time.new(start_datetime)
t.strftime("%H:%M_%Y%d%m")
This results in the time being formatted to 2016-01-01 00:00:00 +0000, which is obviously not what I want. I was wondering if someone could help me format the datetime object the way I specified?
You can do this with DateTime:
require 'date'
DateTime.parse("2016-06-09 14:29:34").strftime("%H:%M_%Y%d%m")
#=> "14:29_20160906"
The format you're feeding in is basically ISO-8601 so it's parsed by default.
Feeding that value into Time.new is incorrect here. The first argument is the year and the rest have to be supplied separately, which is why you get 2016-01-01: everything else falls back to its default.
Time.new converts its first argument automatically, and "2016-06-09 14:29:34".to_i is 2016. (Ruby 3.2 and later do parse a full timestamp string passed to Time.new; older versions behave as described here.)
It's not entirely clear why your day value changes from 09 in the input to 01 in the desired output, so I'll use the normal thing and output the same as was input:
require 'time'
start_datetime = "2016-06-09 14:29:34"
t = Time.strptime(start_datetime, '%Y-%m-%d %H:%M:%S')
t.strftime('00:00_%Y%m%d') # => "00:00_20160609"
Since the hours and minutes are being thrown away there are a couple of other ways to go about this.
Ignore the hours and minutes when parsing:
t = Time.strptime(start_datetime, '%Y-%m-%d')
Or use a Date object instead of a Time object:
require 'date'
start_datetime = "2016-06-09 14:29:34"
t = Date.strptime(start_datetime, '%Y-%m-%d')
t.strftime('00:00_%Y%m%d') # => "00:00_20160609"
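Putting the pieces above together, here is a minimal sketch that produces both plausible outputs. Reading "00:00_20160601" as midnight on the first day of the same month is only an assumption; the question does not say why the day changes.

```ruby
require 'time'
require 'date'

start_datetime = '2016-06-09 14:29:34'
t = Time.strptime(start_datetime, '%Y-%m-%d %H:%M:%S')

# Keep the parsed time of day and day of month.
with_time = t.strftime('%H:%M_%Y%m%d')

# Assumption: "00:00_20160601" in the question means midnight on the first
# day of the same month.
first_of_month = Date.new(t.year, t.month, 1).strftime('00:00_%Y%m%d')

puts with_time
puts first_of_month
```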

What kind of date format is this and how do I transform it?

Making a GET request to a private (no public documentation) API returns data in JSON format.
The value for date looks as follows:
AanmeldDatum: "/Date(1262300400000+0100)/"
There's another variable called AangebodenSindsTekst, which means OfferedSinceText, and its value is "8 augustus 2014". So the unknown Date format should get parsed into that specific value.
I'm wondering what kind of date format it is and how can I transform this to something like this 2014-08-08 with Ruby?
I've tried this:
require 'time'
t = '1262300400000+0100'
t2 = Time.parse(t)
# => ArgumentError: no time information in "1262300400000+0100"
Ruby's Time class is your friend, especially the strptime method:
require 'time'
foo = Time.strptime('1262300400000+0100', '%N') # => 2014-08-08 16:57:25 -0700
foo = Time.strptime('1262300400000+0100', '%N%z') # => 2014-08-08 08:57:25 -0700
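%N tells Ruby to read the digits as fractional seconds (nanoseconds by default), not as a count of seconds since the epoch. Because the string then carries no date information, Time.strptime fills in the rest from the current time, so the results above reflect when the code was run rather than the timestamp itself.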
%z tells Ruby to find the timezone offset, which it then applies to the returned value.
While parse can often figure out how to tear apart an incoming string, it's not bullet-proof, nor is it all-knowing. For speed, I'd recommend learning and relying on strptime if your strings are consistent.
As the Tin Man pointed out in this answer, use the following instead:
Time.strptime('1262300400000+0100', '%Q%z')
It could be milliseconds since the epoch. Take off the last three zeros and plug it into a Unix timestamp converter; it comes out as Dec 31st, 2009:
TIME STAMP: 1262300400
DATE (M/D/Y # h:m:s): 12 / 31 / 09 # 11:00:00pm UTC
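Following the milliseconds-since-the-epoch reading, here is a sketch that pulls the number and the offset out of the wrapped "/Date(...)/" string by hand. The shape resembles the date encoding produced by older Microsoft ASP.NET JSON serializers, though that is a guess about this private API. MS_DATE and parse_ms_date are hypothetical names.

```ruby
require 'time'

# Matches "/Date(<milliseconds><+hhmm or -hhmm>)/"; the offset is optional.
MS_DATE = %r{\A/Date\((-?\d+)([+-]\d{4})?\)/\z}

# Hypothetical helper: returns a Time, or nil when the string has another shape.
def parse_ms_date(raw)
  match = MS_DATE.match(raw)
  return nil unless match

  time = Time.at(Rational(Integer(match[1]), 1000)).utc
  offset = match[2]
  return time unless offset

  # Time#localtime accepts a "+HH:MM" string, so insert the colon.
  time.localtime("#{offset[0, 3]}:#{offset[3, 2]}")
end

t = parse_ms_date('/Date(1262300400000+0100)/')
puts t.strftime('%Y-%m-%d')
```

For the value in the question this gives 1 January 2010 at +01:00, which is the same instant as the converter result above (Dec 31st, 2009, 11pm UTC).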

Ruby local_to_utc returns invalid year

I have the following date string ('US/Eastern'), which I need to convert to UTC:
date_src = '2014-07-07T23:10:00+0'
First I convert it to a "valid" format so I can operate on it in later processes. I use the following to get an ISO version of the date:
date = DateTime.parse(date_src).iso8601
At this point date is a nice '2014-07-07T23:10:00+00:00'. The last step in my process is to translate this date to UTC. I'm using the following:
TZInfo::Timezone.get('US/Eastern').local_to_utc(date)
The problem is this is giving me 20014 as output, instead of the UTC version of the original date. If I try:
TZInfo::Timezone.get('UTC').local_to_utc(date)
I get 2014, which is the correct year but still unexpected output.
Any ideas about what I'm doing wrong, and what I could use to solve the problem?
local_to_utc actually expects a Time or a DateTime instance:
TZInfo::Timezone.get('US/Eastern').local_to_utc(DateTime.parse(date_src))
# => #<DateTime: 2014-07-08T03:10:00+00:00 ((2456847j,11400s,0n),+0s,2299161j)>
From the documentation, you can have a hint on what actually happened:
All methods in TZInfo that operate on a time can be used with either Time or DateTime instances or with Integer timestamps (i.e. as returned by Time#to_i). The type of the values returned will match the type passed in.
What actually happens is the local_to_utc calls to_i on the input parameter, which on a string returns the parsed integer from the beginning of the string (2014 in your case since date is the string 2014-07-07T23:10:00+00:00), and adds the time difference to it - 18000 for "US/Eastern" (5 hour difference), and 0 for UTC:
date.to_i
# => 2014
TZInfo::Timezone.get('US/Eastern').local_to_utc(date) - date.to_i
# => 18000
TZInfo::Timezone.get('UTC').local_to_utc(date) - date.to_i
# => 0
So the bottom line is: you happened to see this weird behavior, which stems from a combination of some surprising quirks of the APIs you used.
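The arithmetic behind the surprising 20014 can be reproduced without TZInfo. This sketch only uses String#to_i and the 5-hour difference quoted above; it mimics the explanation and does not call the library.

```ruby
date = '2014-07-07T23:10:00+00:00'

# String#to_i reads leading digits and stops at the first non-digit.
as_integer = date.to_i

# 18000 seconds is the 5-hour difference quoted above for "US/Eastern".
eastern_offset = 5 * 60 * 60

puts as_integer
puts as_integer + eastern_offset
```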

How to find dates in a csv file using a regular expression and store in an array [using Ruby]?

I have data stored in a csv file that looks like this:
Date,BLOCK,,Wood,Miscellaneous,,Totals,MO
Saturday,4055-RU,4055-AR,4091,1139,1158,,100
11/13/15,C Sort,B,C,iGPS,PECO,,
Starting,758,"3,936",840,0,0,"5,534",
Sorted,656,736,540,162,64,"2,158",
Subtotal 1,"1,414","4,672","1,380",162,64,"7,692",
Shipped,0,"1,152",620,162,64,"1,898",
,"1,414","3,520",860,0,0,"5,794",
Physical,"1,414","3,520",860,0,0,"5,794",
Variance,0,0,0,0,0,0,
Date,BLOCK,,Wood,Miscellaneous,,Totals,MO
Saturday,4055-RU,4055-AR,4091,1139,1158,,100
11/14/15,C Sort,B,C,iGPS,PECO,,
Starting,758,"3,936",840,0,0,"5,534",
Sorted,656,736,540,162,64,"2,158",
Subtotal 1,"1,414","4,672","1,380",162,64,"7,692"
Shipped,0,"1,152",620,162,64,"1,898"
,"1,414","3,520",860,0,0,"5,794"
Physical,"1,414","3,520",860,0,0,"5,794"
Variance,0,0,0,0,0,0
and I need to make an array of all the dates mentioned (in this case, dates = ['11/13/15', '11/14/15']).
I believe it is possible to pull this info out using a regular expression, but I don't really understand how they work or how to go about this. So, how can I extract the dates?
EDIT: I can sort through the data by row using CSV.foreach, but the trouble I am having is telling the program to pull out anything that matches a date format (i.e. 11/13/15). Does that make more sense of my question?
Thank you!
- Sean
The correct one-liner is:
File.read('yourfile.csv').scan(/\d{2}\/\d{2}\/\d{2}/)
By the way, \d{2} is nicer than \d\d, and here's why:
You can see the 2. \d{2} reads like "2-digit number" (once you're used to it).
If you want to change it to 1 or 2 digits, you can write \d{1,2}.
dates = []
# File.foreach reads line by line and closes the file when done.
File.foreach('yourfile.csv') do |line|
  # Keep the matched text (m[0]), not the MatchData object.
  if (m = line.match(/^\d\d\/\d\d\/\d\d/))
    dates.push m[0]
  end
end
puts dates
BTW, I am sure someone could write this as a one-liner, but this might be a little easier to understand for someone new to Ruby.
I am making these assumptions:
All dates are of the format mm/dd/yy.
All dates that you want in the array are at the start of each line.
You don't need to verify that it is a valid date.
You could get a first approximation with this:
dates = CSV.open('x.csv').map{|r| r.select { |x| x =~ /\d\d\/\d\d\/\d\d/ } }.flatten
and then, if needed, scan through the elements of dates to make sure numbers are in the proper ranges (so that you don't accidentally include a date that claims to be Feb 31 2001). If you want to check the format, you could use DateTime.strptime and catch ArgumentErrors:
clean = dates.select do |d|
  begin
    # I'm guessing on the date format.
    DateTime.strptime(d, '%m/%d/%y')
  rescue ArgumentError
    nil
  end
end
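The approaches above can be combined into one self-contained sketch: scan for anything shaped like a date, then keep only the values that Date.strptime accepts. The sample text reuses a few rows from the question. The last row, with the impossible date 02/31/01, is made up to show the validation step, and the mm/dd/yy format is the same guess as above.

```ruby
require 'date'

# A few rows from the question, plus one invented row with an impossible date.
sample = <<~DATA
  Date,BLOCK,,Wood,Miscellaneous,,Totals,MO
  11/13/15,C Sort,B,C,iGPS,PECO,,
  Starting,758,"3,936",840,0,0,"5,534",
  11/14/15,C Sort,B,C,iGPS,PECO,,
  02/31/01,C Sort,B,C,iGPS,PECO,,
DATA

# Hypothetical helper: true when the text is a real mm/dd/yy calendar date.
def valid_date?(text)
  Date.strptime(text, '%m/%d/%y')
  true
rescue ArgumentError
  false
end

candidates = sample.scan(%r{\b\d{1,2}/\d{1,2}/\d{2}\b})
dates = candidates.select { |text| valid_date?(text) }

p dates
```

For a real file, sample could be replaced with File.read('yourfile.csv').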

Resources