Parsing date from text using Ruby - ruby

I'm trying to figure out how to extract dates from unstructured text using Ruby.
For example, I'd like to parse the date out of this string "Applications started after 12:00 A.M. Midnight (EST) February 1, 2010 will not be considered."
Any suggestions?

Try Chronic (http://chronic.rubyforge.org/); it might be able to parse that. Otherwise you're going to have to use Date.strptime.

Assuming you just want dates and not datetimes:
require 'date'
string = "Applications started after 12:00 A.M. Midnight (EST) February 1, 2010 will not be considered."
r = /(January|February|March|April|May|June|July|August|September|October|November|December) (\d{1,2}), (\d{4})/
if string[r]
  date = Date.parse(string[r])
  puts date
end
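As a variation on the regex approach above, here is a minimal sketch that collects every "Month D, YYYY" date in a string instead of only the first one. The method name extract_dates is made up for illustration, and the sketch assumes English month names in exactly that layout. It builds the alternation from Date::MONTHNAMES rather than typing the months by hand.

```ruby
require 'date'

# Hypothetical helper: returns a Date for every "Month D, YYYY" match in text.
def extract_dates(text)
  # Date::MONTHNAMES starts with nil, so compact it before joining.
  month_names = Date::MONTHNAMES.compact.join('|')
  pattern = /\b(?:#{month_names}) \d{1,2}, \d{4}\b/
  # With no capture groups, scan returns the full matched strings.
  text.scan(pattern).map { |match| Date.strptime(match, '%B %d, %Y') }
end

string = "Applications started after 12:00 A.M. Midnight (EST) February 1, 2010 will not be considered."
puts extract_dates(string)
```

Note that an impossible date such as "February 31, 2010" would still match the pattern and make Date.strptime raise, so you may want to rescue ArgumentError around the parse.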

You can also try a gem that helps find dates in a string.
Example:
input = 'circa 1960 and full date 07 Jun 1941'
dates_from_string = DatesFromString.new
dates_from_string.get_structure(input)
#=> return
# [{:type=>:year, :value=>"1960", :distance=>4, :key_words=>[]},
# {:type=>:day, :value=>"07", :distance=>1, :key_words=>[]},
# {:type=>:month, :value=>"06", :distance=>1, :key_words=>[]},
# {:type=>:year, :value=>"1941", :distance=>0, :key_words=>[]}]

Related

Rails DateTime gives invalid date sometimes and not others

I've got a bunch of user-inputted dates and times like so:
date = "01:00pm 06/03/2015"
I'm trying to submit them to a datetime column in a database, and I'm trying to systemize them like this:
DateTime.strptime(date, '%m/%d/%Y %H:%M')
But I consistently get an invalid date error. What am I doing wrong? If I submit the string without strptime the record will save but it sometimes gets the date wrong.
Also, how can I append a timezone to a DateTime object?
Edit:
So .to_datetime and DateTime.parse(date) work for the date string and fail for date2. What's going on?
date2 = "03:30pm 05/28/2015"
Try using to_datetime:
date.to_datetime
# => Fri, 06 Mar 2015 13:00:00 +0000
Also, the documentation for DateTime.strptime states:
Parses the given representation of date and time with the given
template, and creates a date object.
It's important to note that the order of the directives in the template must match the order of the fields in the input string. In your case it doesn't, which leads to the error.
Update
Calling to_datetime on the second example raises
ArgumentError: invalid date
This is because the parser reads the date as day/month/year, and 28 is not a valid month. DateTime.parse raises the same error, since to_datetime is just a wrapper around it. For non-standard custom date formats, use strptime:
date2 = "03:30pm 05/28/2015"
DateTime.strptime(date2, "%I:%M%p %m/%d/%Y")
# => Thu, 28 May 2015 15:30:00 +0000
date = "01:00pm 06/03/2015"
DateTime.parse(date)
=> Fri, 06 Mar 2015 13:00:00 +0000
The directives in your format string are not in the same order as the fields in the input. You also need %p for the am/pm suffix, and %I (12-hour clock) rather than %H:
DateTime.strptime(date, '%I:%M%p %m/%d/%Y')
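The timezone part of the question is not covered by the answers above. One possible approach, sketched here with the Ruby standard library only, is to append a UTC offset to the string and parse it with %z. The helper name parse_us_timestamp and the default offset are assumptions made for this example.

```ruby
require 'date'

# Hypothetical helper: parses "hh:mmam/pm mm/dd/yyyy" and attaches a UTC offset.
def parse_us_timestamp(text, utc_offset = '+0000')
  # %I is the 12-hour clock, %p the am/pm marker, %z the numeric offset.
  DateTime.strptime("#{text} #{utc_offset}", '%I:%M%p %m/%d/%Y %z')
end

first  = parse_us_timestamp('01:00pm 06/03/2015', '-0500')
second = parse_us_timestamp('03:30pm 05/28/2015')

puts first.iso8601   # 2015-06-03T13:00:00-05:00
puts second.iso8601  # 2015-05-28T15:30:00+00:00
```

Because the format is explicit, both strings are read as month/day/year, so the first one becomes June 3 rather than March 6.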

Rails 4 parse a date in a different language

I have a text_field :birthday_line in my user form, that I need to parse into the user's birthday attribute.
So I'm doing something like this in my User class.
attr_accessor :birthday_line
before_save :set_birthday
def set_birthday
self.birthday = Date.strptime(birthday_line, I18n.translate("date.formats.default"))
end
But for some reason it raises an Invalid date error when I pass in the string 27 января 1987 г., which should be parsed to 1987-01-27.
The format and month names in my config/locales/ru.yml
ru:
date:
formats:
default: "%d %B %Y г."
month_names: [~, января, февраля, марта, апреля, мая, июня, июля, августа, сентября, октября, ноября, декабря]
seem to be correct.
Date.parse doesn't help either: it only parses the day number (27) and takes the month and year from today's date (so it'll be September 27 2013 instead of January 27 1987).
I had the same problem. Here is what I can suggest:
string_with_cyrillic_date = '27 января 1987'
1) Create an array of pairs like this:
months = [["января", "Jan"], ["февраля", "Feb"], ["марта", "Mar"], ["апреля", "Apr"], ["мая", "May"], ["июня", "Jun"], ["июля", "Jul"], ["августа", "Aug"], ["сентября", "Sep"], ["октября", "Oct"], ["ноября", "Nov"], ["декабря", "Dec"]]
2) Now iterate over it, find your Cyrillic month and replace it with the Latin one before parsing:
date = nil
months.each do |cyrillic_month, latin_month|
  if string_with_cyrillic_date.include?(cyrillic_month)
    # The month names in the array are lowercase, so the input must be lowercase too
    date = DateTime.parse(string_with_cyrillic_date.sub(cyrillic_month, latin_month))
  end
end
Now date holds the date that you expect:
27 Jan 1987
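The same idea can be written without going through English month abbreviations. This is a rough sketch using only the standard library; the names russian_month_numbers and parse_russian_date are hypothetical. It assumes lowercase genitive month names, as in the locale file from the question, and a "day month year" layout.

```ruby
require 'date'

# Hypothetical lookup table: genitive Russian month names => month numbers.
def russian_month_numbers
  %w[января февраля марта апреля мая июня
     июля августа сентября октября ноября декабря]
    .each_with_index.to_h { |name, index| [name, index + 1] }
end

# Hypothetical helper: returns a Date, or nil when no date is found.
def parse_russian_date(text)
  months = russian_month_numbers
  pattern = /(\d{1,2})\s+(#{months.keys.join('|')})\s+(\d{4})/
  match = pattern.match(text)
  return nil unless match

  Date.new(match[3].to_i, months.fetch(match[2]), match[1].to_i)
end

puts parse_russian_date('27 января 1987 г.')  # 1987-01-27
```

Building the Date from numbers avoids a second parsing step, and trailing text such as "г." is ignored because the pattern does not need to match the whole string.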

How do I output to the following format: month - year?

Today's month is November (11). With 1.years.ago.to_date..Date.today how can I output:
11 - 2010, 12 - 2010, 01 - 2011, 02 - 2011, 03 - 2011, etc
Use strftime for all date formatting in Ruby. Refer to its documentation for the available directives.
There's probably a more efficient way to do this, but this will give you the output you want:
require "active_support/core_ext/integer/time"
((1.year.ago.to_date)..(Date.today)).map { |d| d.strftime("%m - %Y") }.uniq
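If you would rather not map over every day of the year, a possible alternative is to step month by month with Date#>>. This sketch uses only the standard library (no ActiveSupport), and the method name month_year_labels is made up for this example.

```ruby
require 'date'

# Hypothetical helper: one "mm - yyyy" label per month between two dates.
def month_year_labels(from, to)
  labels = []
  # Start on the first of the month so that >> never lands on a clipped day.
  current = Date.new(from.year, from.month, 1)
  while current <= to
    labels << current.strftime('%m - %Y')
    current = current >> 1 # advance one month
  end
  labels
end

today = Date.today
puts month_year_labels(today << 12, today).join(', ')
```

For a one-year range this builds 13 labels directly instead of formatting roughly 366 dates and removing duplicates afterwards.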
In PHP, the strtotime() function can be used to print dates.
// Print today's date
echo date('m/d/Y',strtotime("today"));
// Print the date one year ago
echo date('m.d.Y',strtotime("-1 years"));
// Print the date one year from today
echo date('m.d.Y',strtotime("1 years"));
You can add a new format to your locales.
#/config/locales/en.yml
en:
date:
formats:
month_year: "%m - %Y"
and use it with I18n.l(your_date, :format => :month_year)
This helps if you want to change the format later, because you only have to change it in one place.

ROR + Ruby Date From XML API

Using an XML API, I get a date-time such as "2008-02-05T12:50:00Z". Now I want to convert this text into a different format, like "2008-02-05 12:50:00", but I can't find the proper way.
I have tried this: #a = "2008-02-05T12:50:00Z"
Steps
1. #a.to_date
=> Tue, 05 Feb 2008
2. #a.to_date.strftime('%Y')
=> "2008"
3. #a.to_date.strftime('%Y-%m-%d %H:%M:%S')
=> "2008-02-05 00:00:00"
Any suggestions?
The to_date method converts your string to a date but dates don't have hours, minutes, or seconds. You want to use DateTime:
require 'date'
d = DateTime.parse('2008-02-05T12:50:00Z')
d.strftime('%Y-%m-%d %H:%M:%S')
# 2008-02-05 12:50:00
Use Ruby's DateTime:
DateTime.parse("2008-02-05T12:50:00Z") #=> #<DateTime: 2008-02-05T12:50:00+00:00 (353448293/144,0/1,2299161)>
From there you can output the value in any format you want using strftime. See Time#strftime for more info.
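Since the input here is strict ISO 8601, another option is Time.iso8601 from the standard 'time' library, which rejects strings that are not in that format instead of guessing. The following is only a sketch, and the method name reformat_iso8601 is hypothetical.

```ruby
require 'time'

# Hypothetical helper: converts an ISO 8601 timestamp to "YYYY-MM-DD HH:MM:SS" in UTC.
def reformat_iso8601(text)
  # Time.iso8601 raises ArgumentError when the input is not valid ISO 8601.
  Time.iso8601(text).utc.strftime('%Y-%m-%d %H:%M:%S')
end

puts reformat_iso8601('2008-02-05T12:50:00Z')  # 2008-02-05 12:50:00
```

Calling utc before formatting means inputs that carry a different offset are normalized to the same wall-clock representation.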

convert String to DateTime

I need to parse following String into a DateTime Object:
30/Nov/2009:16:29:30 +0100
Is there an easy way to do this?
PS: I want to convert the string above as is. The colon after the year is not a typo. I also want to solve the problem with Ruby and not RoR.
Shouldn't this also work for Rails?
"30/Nov/2009 16:29:30 +0100".to_datetime
DateTime.strptime allows you to specify the format and convert a String to a DateTime.
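For the exact string in the question, including the colon after the year, a format string along these lines should work. This is a small sketch in plain Ruby, and the method name parse_log_timestamp is made up for illustration.

```ruby
require 'date'

# Hypothetical helper for timestamps like "30/Nov/2009:16:29:30 +0100".
def parse_log_timestamp(text)
  # %b is the abbreviated month name and %z the numeric UTC offset;
  # the colon after the year is matched literally.
  DateTime.strptime(text, '%d/%b/%Y:%H:%M:%S %z')
end

puts parse_log_timestamp('30/Nov/2009:16:29:30 +0100').iso8601
# 2009-11-30T16:29:30+01:00
```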
I have had success with:
require 'time'
t = Time.parse(some_string)
If you are using Rails, this converts a date string to a Time:
"05/05/2012".to_time
Doc Reference: https://apidock.com/rails/String/to_time
I used Time.parse("02/07/1988"), like some of the other posters.
An interesting gotcha was that Time was loaded by default when I opened up IRB, but Time.parse was not defined. I had to require 'time' to get it to work.
That's with Ruby 2.2.
Convert a string to a DateTime:
# without timezone
DateTime.strptime('2012-12-09 00:01:36', '%Y-%m-%d %H:%M:%S')
=> Sun, 09 Dec 2012 00:01:36 +0000
# with specified timezone
DateTime.strptime('2012-12-09 00:01:36 +8', '%Y-%m-%d %H:%M:%S %z')
=> Sun, 09 Dec 2012 00:01:36 +0800
refer to:
https://ruby-doc.org/stdlib-3.1.1/libdoc/date/rdoc/Date.html
In Ruby 1.8, the ParseDate module will convert this and many other date/time formats. However, it does not deal gracefully with the colon between the year and the hour. Assuming that colon is a typo and is actually a space:
#!/usr/bin/ruby1.8
require 'parsedate'
s = "30/Nov/2009 16:29:30 +0100"
p Time.mktime(*ParseDate.parsedate(s)) # => Mon Nov 30 16:29:30 -0700 2009
You can parse a date time string with a given timezone as well:
zone = "Pacific Time (US & Canada)"
ActiveSupport::TimeZone[zone].parse("2020-05-24 18:45:00")
=> Sun, 24 May 2020 18:45:00 PDT -07:00
