Figure out the date format from a string in Ruby

I am working on a simple data loader for text files and would like to add a feature for correctly loading dates into the tables. The problem is that I do not know the date format beforehand, and it will not be my script doing the inserts: it has to generate insert statements for later use.
Date.parse is almost what I need. If there were a way to grab the format it identified in the string, in a form I could use to generate a to_date(...) call (Oracle standard), that would be perfect.
An example:
My input file:
user_name;birth_date
Sue;20130427
Amy;31/4/1984
Should generate:
insert into my_table values ('Sue', to_date('20130427','yyyymmdd'));
insert into my_table values ('Amy', to_date('31/4/1984','dd/mm/yyyy'));
Note that it is important the original string remains unchanged - so I cannot parse it to a standard format used in the inserts (it is a requirement).
At the moment I am just testing a bunch of regexes and doing some validation, but I was wondering if there was a more robust way.

Suppose (using for example String#scan), you extracted an array of the date strings from a single file. It may be like:
strings = ["20130427", "20130102", ...]
Prepare in advance an array of all formats you can think of. It may be like:
Formats = ["%Y%m%d", "%y%m%d", "%y/%m/%d", "%m/%d/%y", "%d/%m/%y", ...]
Then check all formats that can parse all of the strings:
require "date"

# Keep only the formats that parse every string without raising.
formats = Formats.select do |format|
  strings.all? { |s| Date.strptime(s, format) rescue nil }
end
If this array formats includes exactly one element, then that means the strings were unambiguously parsed with that format. Using that format, you can go back to the strings and parse them with that format.
Otherwise, either you failed to provide the appropriate format within Formats, or the strings remained ambiguous.
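To connect this to the to_date(...) output the question asks for, here is a minimal sketch. It assumes a small hand-written table that pairs each strptime format with an Oracle format model. The names ORACLE_MODELS, parses?, candidate_formats and to_date_expressions are hypothetical, the table covers only four formats, and the original strings are copied into the output unchanged (quotes inside them are not escaped).

```ruby
require "date"

# Hypothetical table pairing strptime formats with Oracle format models.
# Only these four are covered here; extend it to match your data.
ORACLE_MODELS = {
  "%Y%m%d"   => "yyyymmdd",
  "%Y-%m-%d" => "yyyy-mm-dd",
  "%d/%m/%Y" => "dd/mm/yyyy",
  "%m/%d/%Y" => "mm/dd/yyyy"
}.freeze

# True when the string parses as a valid date under the given format.
def parses?(string, format)
  Date.strptime(string, format)
  true
rescue ArgumentError
  false
end

# Formats from the table that parse every string in the column.
def candidate_formats(strings)
  ORACLE_MODELS.keys.select { |format| strings.all? { |s| parses?(s, format) } }
end

# Builds to_date(...) expressions that keep each original string untouched.
# Returns nil when the column matched zero or several formats.
def to_date_expressions(strings)
  formats = candidate_formats(strings)
  return nil unless formats.size == 1

  model = ORACLE_MODELS.fetch(formats.first)
  strings.map { |s| "to_date('#{s}','#{model}')" }
end

puts to_date_expressions(["20130427", "20130102"])
puts to_date_expressions(["30/4/1984", "12/11/1990"])
p to_date_expressions(["01/02/2013"]) # ambiguous between dd/mm and mm/dd
```

Returning nil for the ambiguous case leaves the decision to the caller, who can ask for a preferred format or report the column as undetermined.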

I would use the Chronic gem. It will extract dates in most formats.
It has options to resolve the ambiguity in the xx/xx/xxxx format, but you'd have to specify which to prefer when either match.

Related

Format currency with dot instead of comma using i18n

We are using the java.text.NumberFormat class to format currency values with the method getInstance(Locale paramLocale). Our issue is that when we pass the es_CO (Colombia) locale, it formats the value as 123,00 instead of 123.00. Is there a way to format with a dot instead of a comma?
I am using Spring platform(hybris)
Please note due to business reasons it is not possible for me to change the locale.
You can use DecimalFormat to have your own format.
See the question "How can I format a String number to have commas and round?"

Customize OpenCsv CsvToBeanBuilder

How can I intervene in the process of parsing the CSV file?
What I intend to do is that if certain values have bad format, I want to correct them. In particular, if there is a value of "1.000-" it results in a NumberFormatException because the minus sign is suffixed instead of being prefixed. Hence, in such a case, I want to switch "1.000-" to "-1.000" and let the parser or CsvToBeanBuilder respectively, continue its work.

How to use Nifi expression language to change a date into a folder path?

In nifi, I need to transfer a bunch of json files to HDFS. The json files have a field called "creationDate" which has the date in UNIX format. I need to use the date in there to funnel the file to HDFS directories that are named after dates, like "2019-01-19" "2019-01-20" "2019-01-21" etc.
At first I used an "EvaluateJsonPath" processor going to a "PutHDFS" processor. The "Evaluate..." processor had "creationDate" as the property and "${creationDate}" as the value. In the PutHDFS processor, for the directory I put "/${creationDate}".
But then I realized that the date in the json file has the full timestamp, like "2019-01-19T04:34:28.527722+00:00".
Obviously I don't need all that, just the first eight digits. So how can I turn this big string into a neat 8-digit directory name? Will I need to use a regex, and if so, how can this be implemented? Thanks in advance for any help.
You can use UpdateAttribute and use the date expression language functions to format it.
https://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html
Example (not specific to your format):
${creationDate:toDate('MM-dd-yyyy'):format('yyyy/MM/dd')}
In UpdateAttribute you would add a new property named creationDate and set its value to an expression like the one above.

Convert scalar from sif to hrf

I am looping over a set of scalars which contain quarterly sif values. I would like to convert them to hrf format and keep them stored in scalars.
However, I found that format %tq only accepts variables. Hence, the only workaround seems to be to i) convert the scalar to a variable, ii) apply format %tq, and iii) convert the variable back to a scalar.
Is there a more elegant and faster way to do this? (I am using Stata MP 15.1.)
You can have string scalars, so you can do this. I can't see why it would be useful, but that could be a failure of imagination; you could enlighten us on why you want this.
. scalar foo = yq(2018, 4)
. scalar foo = string(scalar(foo), "%tq")
. scalar list
foo = 2018q4
What is quite different for scalars is that there is no sense whatsoever in which a display format is attached to or associated with a scalar. You can hold a numeric date or a string date in a scalar, but those are the only choices. You can't have a numeric value with a format on the side that Stata will use for display when suitable. You found that out when you attempted to format a scalar.
Goodness knows whether this is faster (than what?) or more elegant (who decides?). The major difference is that a variable manifestly can contain many dates and a changed format made just once with format can apply to them consistently, whereas changing how you show a bunch of scalars requires a loop every time you do it so far as I can see. Further, it follows from above that you might need to keep two sets of scalars, one numeric for calculation and one string for display.
I've used date constants and typically found that either I use them directly (subtracting 2000 as a base doesn't require putting it into anything) or I use local macros to hold them. But I can't see anything wrong with using scalars, except possibly indirection.

Get the current date/time in Ruby as string without delimiters?

Is there any good way to get the current date/time in Ruby as a string without separators, not having to use Time.now.strftime('%Y%m%d'), etc.?
The output I'm looking for is something like "20151002112001" or similar, with all digits and no separators, human-readable form, not Unix time.
I found out that by using Active Support, this can be easily done:
require 'active_support/core_ext/time/conversions'
Time.now.to_formatted_s(:number) # => "20151002112419"
Since I already depend on Active Support, this turned out to be the quickest and easiest way to get this done by far.
You can use .to_i on a Time object to get a number representing Unix-time.
You can convert Date or DateTime objects to Time with .to_time. For example:
string = "#{Date.today.to_time.to_i}"
# "1443769200"
If you need to get certain attributes but don't like strftime, you can use methods of Date. For example, you might do something like
date = Date.today
string = "#{date.year}-#{date.month}-#{date.day}"
# "2015-10-2"
This all works without Active Support, but you still need to require 'date'.
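If Active Support is not available, the same all-digits string can be assembled from the Time accessors with core Ruby alone. This is a minimal sketch: compact_timestamp is a hypothetical helper name, and the fixed time is used only so the result is reproducible.

```ruby
# Hypothetical helper: formats a Time as YYYYMMDDHHMMSS using only core Ruby.
def compact_timestamp(time = Time.now)
  format("%04d%02d%02d%02d%02d%02d",
         time.year, time.month, time.day, time.hour, time.min, time.sec)
end

fixed = Time.new(2015, 10, 2, 11, 24, 19)
puts compact_timestamp(fixed)       # 20151002112419
puts fixed.strftime("%Y%m%d%H%M%S") # same string via strftime
```

Zero-padding each field is what keeps the string at a fixed width, unlike the "2015-10-2" result of plain interpolation shown above.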
