I am trying to use a regex to verify a date format, and I would like to check that the day is less than 32 and, similarly, that the month is less than 13. I have no idea how to go about it. Currently, this is what I have:
^[0-1]?[0-9]{1}\-[0-3]?[0-9]{1}\-[0-9]{2,4}$
This regex achieves the format (m)m-(d)d-(yy)yy
TL;DR
Don't use regular expressions for comparison operations. Use a regex to split off values to compare, or use an actual parser.
Use Regular Expressions to Extract Comparables
Date comparison is a poor fit for regular expressions. At most, you should use a regular expression (or a simple split) to extract the day of the month for a numeric comparison. For example:
date = '01-01-1970'
date.split('-')[1].to_i < 32
#=> true
However, the code above won't really tell you if a given date is valid. For example, what about February 30th or November 31st? Instead, you should attempt to parse the date to determine its validity.
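A minimal sketch of the extract-then-check idea, assuming four-digit years and a hypothetical helper name `valid_mdy?` (not from the question): the regex only pulls out the fields, and `Date.valid_date?` decides whether the calendar date exists.

```ruby
require 'date'

# Hypothetical helper: extract month, day, and year with a regex,
# then let Date.valid_date? check the calendar.
# Assumes the m-d-yyyy layout from the question with a four-digit year.
def valid_mdy?(string)
  match = /\A(\d{1,2})-(\d{1,2})-(\d{4})\z/.match(string)
  return false unless match

  month, day, year = match.captures.map(&:to_i)
  Date.valid_date?(year, month, day)
end

puts valid_mdy?('01-01-1970')
puts valid_mdy?('02-30-1970')
puts valid_mdy?('11-31-1970')
```

With this split of responsibilities, February 30th and November 31st are rejected without encoding month lengths in the pattern.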
Use a Date Parser
The best way to tell if a given date is valid is to parse it with a date parser, and then report a Boolean result or handle the exception. For example, you could attempt to parse the date with Date.parse.
Boolean Results
If you just want a Boolean result, you can coerce a valid/invalid parse to true or false. For example:
require 'date'
date = '01-33-1970'
!!(Date.parse date rescue nil)
#=> false
Rescuing and Reporting the Exception
Less magically, you would need to rescue ArgumentError from Date.parse. For example:
require 'date'
def valid_date? date_string
  true if Date.parse date_string
rescue ArgumentError => e
  STDERR.puts "#{e.class}: #{e}: '#{date_string}'"
  false
end
valid_date? '11-31-1970'
This will do what you expect, albeit more verbosely. The example above prints the exception to standard error and then returns false as the result:
ArgumentError: invalid date: '11-31-1970'
#=> false
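One caveat worth checking in your own environment: as far as I can tell, Date.parse reads a hyphenated string such as '01-12-1970' as day-month-year, not month-day-year. If your input is always month first, a sketch using Date.strptime with an explicit format (the format string here is an assumption based on the question's layout) avoids the guesswork:

```ruby
require 'date'

# Date.parse guesses the field order; for nn-nn-yyyy it takes the day first.
guessed = Date.parse('01-12-1970')

# Date.strptime follows the format it is given: month, then day, then year.
explicit = Date.strptime('01-12-1970', '%m-%d-%Y')

puts guessed.iso8601
puts explicit.iso8601
```

Both calls raise for impossible dates, so the rescue-based approaches above work unchanged with Date.strptime.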
^(?:0[1-9]|1[0-2]|[1-9])\-(?:0[1-9]|[12][0-9]|3[01]|[1-9])\-[0-9]{2}(?:[0-9]{2})?$
should do what you're looking for. It only allows months from 1-12 (either 1-9 or 01-12), days from 1-31 (either 1-9 or 01-31), and years of either two or four digits. Tested on regex101.
Basic:
Here is a regex that should do what you want:
^(0[1-9]|1[0-2]|[1-9])-(0[1-9]|[1-2][0-9]|3[0-1]|[1-9])-\d{2}(\d{2})?$
It matches months greater than 0 and less than 13, then -, then days greater than 0 and less than 32, then -, then years (2 digits or 4 digits).
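A quick way to try the pattern in Ruby is sketched below. The constant and method names are hypothetical, and the anchors are switched to \A and \z (an assumption about intent) because ^ and $ match at line boundaries in Ruby.

```ruby
# Same pattern as above, anchored to the whole string with \A and \z.
MDY_PATTERN = /\A(0[1-9]|1[0-2]|[1-9])-(0[1-9]|[1-2][0-9]|3[0-1]|[1-9])-\d{2}(\d{2})?\z/

# Returns true when the string has the (m)m-(d)d-(yy)yy shape.
# It does not know month lengths, so 02-30-1999 still passes.
def mdy_shaped?(string)
  MDY_PATTERN.match?(string)
end

samples = ['1-1-70', '12-31-1999', '13-01-1999', '12-32-1999', '02-30-1999']
samples.each { |sample| puts "#{sample}: #{mdy_shaped?(sample)}" }
```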
Bonus:
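Full regex for matching dates in that format with validation of the day against the month (31-day months accept up to 31, all other months up to 30, so February 30 still passes; four-digit years are limited to 1920-2019):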
^((0?[13578]|10|12)-(([1-9])|(0[1-9])|([12])([0-9]?)|(3[01]?))-((19)([2-9])(\d{1})|(20)([01])(\d{1})|([8901])(\d{1}))|(0?[2469]|11)-(([1-9])|(0[1-9])|([12])([0-9]?)|(3[0]?))-((19)([2-9])(\d{1})|(20)([01])(\d{1})|([8901])(\d{1})))$
If you want to determine whether the string is a valid date, you'd be better off attempting to convert it. If it won't convert, it's not valid.
require 'date'
def date_valid?(date_string)
  # Use %Y for four-digit years and %y for two-digit years
  format = '%m-%d-' + (date_string.split('-').last.size == 4 ? '%Y' : '%y')
  return true if Date.strptime(date_string, format)
rescue ArgumentError
  return false
end
Related
I created a hash from a file that contains dates as strings in different formats (for example, one line has September 1988, another has July 11th 1960, and sometimes there is only a year).
require 'date'
def create_book_hash(book_array)
  {
    link: book_array[0],
    title: book_array[1],
    author: book_array[2],
    pages: book_array[3].to_i,
    date: book_array[4],
    rating: book_array[5].to_f,
    genre: book_array[6]
  }
end
def books_sorted_by_date(books_array)
  books_array.sort_by { |key| Date.strptime(key[:date], '%Y, %m') }
end
book_file = File.read("books.txt")
  .split("\n")
  .map { |line| line.split("|") }
  .map { |book_array| create_book_hash(book_array) }
puts books_sorted_by_date(book_file)
I'm trying to sort books by date, in ascending order by year. Since I have different string formats, I passed the hash value as the first argument to strptime to access all the values in :date. That gives me `strptime': invalid date (Date::Error). I don't understand why. What can I do to convert these strings into date objects? (Just Ruby, no Rails.)
Handle Both Standard and Custom Date Strings
Date.parse doesn't handle arbitrary strings in all cases. Even when it does, it may not handle them the way you expect. For example:
Date.parse "1/1/18"
#=> #<Date: 2001-01-18 ((2451928j,0s,0n),+0s,2299161j)>
While Date.parse handles many date formats automagically, it only successfully parses strings that match its internal expectations. When you have multiple or arbitrary date formats, you have to define your own date specifications using Date.strptime to handle those formats that Date.parse doesn't understand, or that it handles incorrectly. For example:
require 'date'
def parse_date str
  Date.parse str
rescue Date::Error
  case str
  when /\A\d{4}\z/
    Date.strptime str, '%Y'
  when /\A\d{2}\z/
    Date.strptime str, '%y'
  else
    raise "unexpected date format: #{str}"
  end
end
date_samples = ["July 11th 1960", "September 1988", "1776"]
date_samples.map { |date| parse_date(date) }
#=> [#<Date: 1960-07-11 ((2437127j,0s,0n),+0s,2299161j)>, #<Date: 1988-09-01 ((2447406j,0s,0n),+0s,2299161j)>, #<Date: 1776-01-01 ((2369731j,0s,0n),+0s,2299161j)>]
This obviously is not an exhaustive list of potential formats, but you can add more examples to date_samples and update the case statement to include any unambiguous date formats you expect from your data set.
Date.strptime needs two parameters: the date string and the format of the date. To use strptime, you need to know the format of the string beforehand.
See some examples here: https://apidock.com/ruby/Date/strptime/class
In your program, you don't know the exact format of the date on each line when it is parsed, so you need to try something like this:
def books_sorted_by_date(books_array)
  books_array.sort_by { |key| Date.parse(key[:date]) }
end
Date.parse needs one argument, the date string, and then tries to guess the date.
See details here: https://apidock.com/ruby/v2_6_3/Date/parse/class
You will still have problems with year-only values with this approach.
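One way around the year-only problem is to check for that shape before handing the string to Date.parse. The sketch below assumes year-only entries are exactly four digits; the helper name `parse_book_date` and the sample data are made up for illustration.

```ruby
require 'date'

# Hypothetical helper: handle year-only strings explicitly,
# and let Date.parse guess everything else.
def parse_book_date(str)
  return Date.strptime(str, '%Y') if str.match?(/\A\d{4}\z/)

  Date.parse(str)
end

# Made-up sample rows using the date styles from the question.
books = [
  { title: 'C', date: 'September 1988' },
  { title: 'A', date: '1776' },
  { title: 'B', date: 'July 11th 1960' }
]

sorted = books.sort_by { |book| parse_book_date(book[:date]) }
puts sorted.map { |book| book[:title] }.join(', ')
```

Checking the four-digit case first matters because Date.parse may read four bare digits as a month and day instead of a year.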
My ruby command is,
"980,323,344.00".to_i
Why does it return 980 instead of 980323344?
You can achieve it by doing this:
"980,323,344.00".delete(',').to_i
The reason your call to to_i does not return what you expect is explained here. To quote, the method:
Returns the result of interpreting leading characters in str as an integer base base (between 2 and 36). Extraneous characters past the end of a valid number are ignored.
The extraneous characters in your case start at the first comma, right after 980, which is why you see 980 being returned.
In Ruby, calling to_i on a string converts as many leading characters as form a valid integer and ignores the rest.
number_string = '980,323,344.00'
number_string.delete(',').to_i
#=> 980323344
"123abc".to_i
#=> 123
If you want to make longer numbers more readable, underscores can be used where the conventional commas would be in written numbers.
"980_323_344.00".to_i
#=> 980323344
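If silently ignoring trailing characters is not what you want, Kernel#Integer is the strict counterpart to to_i: it raises ArgumentError when the whole string is not a valid integer. A small sketch, with a hypothetical wrapper name `strict_integer`:

```ruby
# Hypothetical wrapper: return the integer, or nil when the whole
# string is not a valid base-10 integer.
def strict_integer(text)
  Integer(text, 10)
rescue ArgumentError
  nil
end

p '123abc'.to_i                  # lenient: stops at the first invalid character
p strict_integer('123abc')       # strict: the whole string must be valid
p strict_integer('980_323_344')  # underscores between digits are accepted
```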
The documentation for to_i might be a bit misleading:
Returns the result of interpreting leading characters in str as an integer base base (between 2 and 36)
"interpreting" doesn't mean that it tries to parse various number formats (like Date.parse does for date formats). It means that it looks for what's a valid integer literal in Ruby (in the given base). For example:
1234 #=> 1234
'1234'.to_i #=> 1234
1_234 #=> 1234
'1_234'.to_i #=> 1234
0d1234 #=> 1234
'0d1234'.to_i #=> 1234
0x04D2 #=> 1234
'0x04D2'.to_i(16) #=> 1234
Your input as a whole however is not a valid integer literal: (Ruby doesn't like the ,)
980,323,344.00
# SyntaxError (syntax error, unexpected ',', expecting end-of-input)
# 980,323,344.00
# ^
But it starts with a valid integer literal. And that's where the second sentence comes into play:
Extraneous characters past the end of a valid number are ignored.
So the result is 980 – the leading characters which form a valid integer converted to an integer.
If your strings always have that format, you can just delete the offending commas and run the result through to_i which will ignore the trailing .00:
'980,323,344.00'.delete(',') #=> "980323344.00"
'980,323,344.00'.delete(',').to_i #=> 980323344
Otherwise you could use a regular expression to check its format before converting it:
input = '980,323,344.00'
number = case input
         when /\A\d{1,3}(,\d{3})*\.00\z/
           input.delete(',').to_i
         when /other format/
           # other conversion
         end
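The same check can be wrapped in a small method that returns nil for anything unexpected. This is only a sketch: the name `parse_grouped_integer` is made up, and the pattern assumes comma-grouped digits with an optional decimal part that is simply dropped.

```ruby
# Hypothetical helper: accept only comma-grouped digits with an optional
# decimal part, then strip the commas and let to_i drop the decimals.
def parse_grouped_integer(text)
  return nil unless text.match?(/\A\d{1,3}(,\d{3})*(\.\d+)?\z/)

  text.delete(',').to_i
end

p parse_grouped_integer('980,323,344.00')
p parse_grouped_integer('98,0323')
p parse_grouped_integer('123abc')
```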
And if you are dealing with monetary values, you should consider using the money gem and its monetize addition for parsing formatted values:
amount = Monetize.parse('980,323,344.00')
#=> #<Money fractional:98032334400 currency:USD>
amount.format
#=> "$980.323.344,00"
Note that format requires i18n so the above example might require some setup.
I have a Date object in Ruby.
When I do myobj.month I get 8. How do I get the date's month with a leading zero, such as 08?
Same idea with day.
What I am trying to get at the end is 2015/08/05.
There is the possibility of using formatted string output.
Examples:
puts sprintf('%02i', 8)
puts '%02i' % 8
%02i is the format for a two-digit-wide integer with leading zeros.
Details can be found in the documentation for sprintf.
In your specific case with a date, you can just use the Time#strftime or Date#strftime method:
require 'time'
puts Time.new(2015,8,1).strftime("%m")
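For the full 2015/08/05 result from the question, the whole layout can go into one strftime call. A short sketch, where the variable name `my_date` stands in for the asker's Date object:

```ruby
require 'date'

# Stand-in for the Date object from the question.
my_date = Date.new(2015, 8, 5)

# %Y is the four-digit year; %m and %d are zero-padded month and day.
formatted = my_date.strftime('%Y/%m/%d')
puts formatted
```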
Is there a way to see if a string is a valid month name in ruby?
You can do:
require 'date'
Date::MONTHNAMES.include? string
Note that this will return true if string is nil, because the first element of Date::MONTHNAMES is nil. All month names are capitalized, so if you don't care about case:
Date::MONTHNAMES.include?(string && string.capitalize)
If you want nil to return false:
!!string && Date::MONTHNAMES.include?(string.capitalize)
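I would use the #grep method. It performs a case-insensitive match against every month name. Note that the pattern is unanchored, so a partial string such as "ember" also matches.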
require 'date'
Date::MONTHNAMES.grep(Regexp.new(string, true)).empty?
If the expression above returns true, the string is not a valid month name; otherwise it is.
I passed true as the second argument to Regexp::new to make the pattern case-insensitive.
Try using the Date.parse method instead. This has a benefit over Date::MONTHNAMES.include? string in that it also accepts abbreviated month names (e.g. jun, aug, dec).
require 'date'
if (Date.parse(string) rescue false)
# code for valid month string
else
# code for invalid month string
end
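Since Date.parse also accepts strings that are full dates rather than month names, a stricter sketch can check the full and abbreviated name lists directly. The method name `month_name?` is hypothetical; Date::MONTHNAMES and Date::ABBR_MONTHNAMES are the standard constants, each starting with a nil placeholder.

```ruby
require 'date'

# Hypothetical helper: true only for a full or abbreviated English
# month name, ignoring case and surrounding whitespace.
def month_name?(string)
  return false unless string.is_a?(String)

  name = string.strip.capitalize
  # compact drops the nil placeholder at index 0 of each list.
  Date::MONTHNAMES.compact.include?(name) ||
    Date::ABBR_MONTHNAMES.compact.include?(name)
end

puts month_name?('june')
puts month_name?('JUN')
puts month_name?('Juno')
```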
You could also use a regex:
months_regex = /(Jan|Febr)uary|March|April|May|June|July|August|September|(Octo|Novem|Decem)ber/
string =~ months_regex
This returns the position of the match (0 if the string starts with a month name), or nil if there is no match.
I have a method which parses a string into a date, but I want to make sure that I don't try to parse a non-numeric string or a string which doesn't represent a date or time format.
How can I do this?
At the moment I have:
if string =~ /^\D*$/
  return false
else
  do_something_else
end
This was fine for a non-numeric string like "UNKNOWN" but wouldn't work for "UNKNOWN1".
Any idea what I can use to make sure that only date or time formats are parsed?
DateTime.strptime v ParseDate.parsedate
No pun intended, but the information herein is now out of date (2015), and some methods and modules have been removed from Ruby 2.x. I'm leaving it here just in case someone, somewhere, is still using 1.8.7.
Ok, maybe there was a small pun intended there ;-)
You would think that you could use either Date.parse or DateTime.parse to check for bad dates (see more on Date.parse here):
d = Date.parse(string) rescue nil
if d
  do_something
else
  return false
end
because bad values throw an exception which you can catch. However, the suggested test strings actually return a Date with Date.parse.
For example ..
~\> irb
>> Date.parse '12-UNKN/34/OWN1'
=> #<Date: 4910841/2,0,2299161>
>>
Date.parse just isn't clever enough to do the job :-(
ParseDate.parsedate does a better job. You can see that it attempts to parse the date but in the test examples, doesn't find a valid year or month. More information here
>> require 'parsedate'
=> true
>> ParseDate.parsedate '2010-09-09'
=> [2010, 9, 9, nil, nil, nil, nil, nil]
>> ParseDate.parsedate 'dsadasd'
=> [nil, nil, nil, nil, nil, nil, nil, nil]
>> ParseDate.parsedate '12-UNKN/34/OWN1'
=> [nil, nil, 12, nil, nil, nil, nil, nil]
Regardless of which method you use to parse a date, you can validate strict conformance by reformatting the resulting date and comparing it with the original input. For example:
require 'time'
def strict_parse(input, format)
  # expect and eq are RSpec matchers, so this only works inside a spec
  Time.strptime(input, format).tap { |output| expect(output.strftime(format)).to eq input }
end
This is strict, however: "1/9/2014" won't pass with format "%d/%m/%Y". It would have to be "01/09/2014" to be acceptable.
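Outside of a test suite, the same round-trip idea can be sketched as a plain predicate. The method name `strictly_formatted?` is hypothetical, and the sketch uses Date rather than Time on the assumption that only calendar dates are being checked.

```ruby
require 'date'

# Hypothetical predicate: parse with the given format, format the result
# back, and accept only input that survives the round trip unchanged.
def strictly_formatted?(input, format)
  Date.strptime(input, format).strftime(format) == input
rescue ArgumentError
  # Raised for unparseable input and for impossible dates.
  false
end

puts strictly_formatted?('01/09/2014', '%d/%m/%Y')
puts strictly_formatted?('1/9/2014', '%d/%m/%Y')
puts strictly_formatted?('31/02/2014', '%d/%m/%Y')
```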
Ruby's parsers are optimistic: if they can throw out a bunch of garbage and still get a result from the input string, Date.parse and DateTime.strptime will try to do it.
You want a pessimistic and strict check, which means instead of assuming acceptance after trying to hunt for garbage with a regex, you should assume rejection and hunt for treasure with your regex.
Your first check, "Is a string numeric", uses a regex to find a string which is composed entirely of non-numeric characters, and rejects the string if it matches. \D (with a capital D) matches non-numeric characters, and an input string will only match your regex if it is composed entirely of zero or more non-numeric characters.
You'll likely have better luck with the following logic for numerics:
if string =~ /^\d*$/
  something_else
else
  return false
end
This matches a string comprised entirely of 0 or more numeric characters, does something_else if it finds it, and returns false otherwise.
For times, you want to explicitly search for times and reject all other values. For an HH:MM:SSAM format with 12-hour times which tolerates omitting leading zeros in each field, you could use the following:
if string =~ /^[01]?\d:[0-5]?\d:[0-5]?\d[AP]M$/
  something_else
else
  return false
end
Likewise, for dates you want to explicitly search for values that look like dates and reject all others. For MM/DD/YYYY which tolerates omitting leading zeros for everything but the year field, you could go with:
if string =~ /^[0-1]?\d\/[0-3]?\d\/\d{4}$/
  something_else
else
  return false
end
Ruby's utility functions try to be liberal in what they accept, but for validation that is not a useful trait. Be strict, assume that everything is invalid until it proves otherwise, then accept it.
I'd advise you to establish a list of date and datetime formats that you expect and intend to support. You can define them using strftime-compatible strings, and then use the same strings when parsing dates with DateTime.strptime. Try to parse your input string with each supported pattern; the first one which doesn't throw an exception returns the parsed date. If every pattern throws an exception, the string is not a valid date.
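A minimal sketch of that approach follows. The format list and the names `SUPPORTED_FORMATS` and `parse_supported` are placeholders; substitute the formats you actually intend to accept.

```ruby
require 'date'

# Placeholder list: replace with the formats your application supports.
SUPPORTED_FORMATS = ['%Y-%m-%d', '%m/%d/%Y', '%Y-%m-%d %H:%M:%S'].freeze

# Try each format in order and return the first successful parse,
# or nil when no supported format fits the input.
def parse_supported(string)
  SUPPORTED_FORMATS.each do |format|
    begin
      return DateTime.strptime(string, format)
    rescue ArgumentError
      next
    end
  end
  nil
end

p parse_supported('2010-09-09')
p parse_supported('09/30/2010')
p parse_supported('UNKNOWN1')
```

Returning nil keeps the caller's check simple; raising a custom error instead would work just as well if you prefer exceptions.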
Check this out:
Returns true if string is a valid time, false otherwise:
require 'time'
def is_a_time?(string)
  !!(Time.parse(string) rescue false)
end
Returns true if string is a valid date, false otherwise:
require 'date'
def is_a_date?(string)
  !!(Date.parse(string) rescue false)
end