Extract date and time from text using SAS - full-text-search

I have something like this, which is in .txt format.
'random title'
random things , 00:00 AM, 1 January
2005, 555 words, (English)
'random long title'
random things , 00:00 AM, 1 January 2005, 111 words,
(English)
The time and date need to be extracted in the format yyyymmdd and hhmm.
I tried to use comma as the delimiter.
DATA News;
INFILE 'C:xxxx/xxxx/xxxx' DLM',';
INPUT Title $75. Time $10. Date $20. Words $15. Lang $10.;
PROC PRINT DATA=News;
TITLE 'Time and Date';
VAR Time Date;
RUN;
But it failed: the entries span multiple lines and are not consistently formatted.
Are there any solutions?

If your dates are always formatted like so:
00:00 AM, 1 January 2005
Then you can use a perl regular expression to find them.
data test;
input;  /* null INPUT: loads the record into _INFILE_ without creating variables */
_prx = prxparse('/\d\d:\d\d (?:AM|PM), \d{1,2} (?:January|February|March) \d{4}/');
start = 1;
stop = length(_infile_);
call prxnext(_prx, start, stop, _infile_, position, length);
do while (position > 0);
found = substr(_infile_, position, length);
put found= position= length=;
call prxnext(_prx, start, stop, _infile_, position, length);
end;
datalines;
'random title'
random things , 00:00 AM, 1 January
2005, 555 words, (English)
'random long title'
random things , 00:00 AM, 1 January 2005, 111 words,
(English)
;;;;
run;
Then use the FOUND value as you would any SAS character variable to obtain the date and time, or datetime, information. Extend the short list of months in the pattern to contain all twelve months.
This finds the second example but not the first, where the date is split across two lines (that case is not reasonably findable using datalines in an example). If you are reading a text file rather than datalines, you could change the record format so that the line feed and carriage return no longer split the entry, and thus see both lines as a single record (and thus match). Look into RECFM=N for more details on that.
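For comparison, the same idea (treat the text as one stream, match the time and date with a regular expression, then reformat) can be sketched outside SAS. This is a minimal Python sketch, not the SAS solution itself; the names `extract_stamps`, `MONTH_NAMES` and `PATTERN` are made up for illustration, and it assumes the entries always follow the `hh:mm AM|PM, d Month yyyy` layout from the question.

```python
import re

MONTH_NAMES = ["January", "February", "March", "April", "May", "June",
               "July", "August", "September", "October", "November", "December"]

# \s+ also matches a line break, so a date split across two lines still matches.
PATTERN = re.compile(
    r"(\d{1,2}):(\d{2})\s+(AM|PM),\s+(\d{1,2})\s+("
    + "|".join(MONTH_NAMES)
    + r")\s+(\d{4})"
)

def extract_stamps(text):
    """Return a list of (yyyymmdd, hhmm) string pairs found in text."""
    results = []
    for hh, mm, ampm, day, month, year in PATTERN.findall(text):
        # Convert to a 24-hour clock by hand; the sample uses "00:00 AM".
        hour = int(hh) % 12 + (12 if ampm == "PM" else 0)
        month_number = MONTH_NAMES.index(month) + 1
        results.append((f"{year}{month_number:02d}{int(day):02d}", f"{hour:02d}{mm}"))
    return results

sample = """'random title'
random things , 00:00 AM, 1 January
2005, 555 words, (English)
'random long title'
random things , 00:00 AM, 1 January 2005, 111 words,
(English)
"""
print(extract_stamps(sample))
```

Because the whole file is matched as a single string, both sample entries are found, including the one whose year wraps onto the next line.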

Related

Add custom letter to timestamp instead of month in shell or perl

So, I am not a coder, but I have to write a shell script that can produce a timestamp in the format [A][21][16][30][4], where A is the month (A for January, B for February, C for March, and so on), 21 is the day, 16 is the hour, 30 is the minutes, and 4 is the tenth of a millisecond (0-5). The brackets are only for visualization, so the timestamp should be A2116304.
This needs to be either a shell script or Perl code that is part of a shell script (I need to put this in an existing shell script).
I tried searching for a solution but couldn't find anything useful.
The idea is that I need to append this custom timestamp to a file name, like
FILENAME.TIMESTAMP
Thanks!
Assuming that:
time is now,
time zone is GMT,
you want a fixed length timestamp,
the last digit should be tens of seconds.
I suggest something like:
my ($s, $m, $h, $D, $M) = gmtime;
my $prefix = "snapshot";
my $filename = sprintf "%s.%s%02d%02d%02d%d",
$prefix, chr($M+ord "A"), $D, $h, $m, $s/10;
print $filename, "\n";
Output:
snapshot.J2612182
You can use localtime instead of gmtime if you don't want to use GMT.
Both *time functions take a UNIX timestamp as argument, in case you need something other than now.
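The same construction can be sketched in Python, in case that is easier to read than the Perl one-liner. It makes the same assumptions as above (GMT, fixed length, last digit is tens of seconds); `custom_stamp` and the `snapshot` prefix are illustrative names only.

```python
from datetime import datetime, timezone

def custom_stamp(moment):
    """Build month letter + day + hour + minute + tens-of-seconds digit."""
    month_letter = chr(ord("A") + moment.month - 1)  # A = January ... L = December
    tens_of_seconds = moment.second // 10            # 0 through 5
    return (f"{month_letter}{moment.day:02d}{moment.hour:02d}"
            f"{moment.minute:02d}{tens_of_seconds}")

now = datetime.now(timezone.utc)  # use datetime.now() for local time instead
print(f"snapshot.{custom_stamp(now)}")
```

Passing the moment in as an argument, rather than reading the clock inside the function, makes it easy to stamp a time other than now.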

DATES with awk in UNIX [duplicate]

I want to take two dates as arguments from the user with
$./tool.sh --born-since <dateA> --born-until <dateB>
and print the lines of a file that fall between those two dates. For example:
933|Mahinda|Perera|male|1989-12-03|2010-03-17T13:32:10.447+0000|192.248.2.123|Firefox
1129|Carmen|Lepland|female|1984-02-18|2010-02-28T04:39:58.781+0000|81.25.252.111|Internet Explorer
4194|Hồ Chí|Do|male|1988-10-14|2010-03-17T22:46:17.657+0000|103.10.89.118|Internet Explorer
So I use an awk command like this:
awk -F'|' '{print $4}' [ file ... ]
to get the dates. How can I use awk to convert the dates from the text file into seconds?
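If the dates are all in the same format, you can strip the dashes to turn everything into numbers and compare them directly.
awk -F'|' -v from="$dateA" -v to="$dateB" '
BEGIN { gsub("-", "", from); gsub("-", "", to) }  # strip the dashes from the bounds once
{ d = $5; gsub("-", "", d) }                      # work on a copy so the printed line is unchanged
from <= d && d <= to' file
Note that the date is the fifth field in your file, not the fourth.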
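You can either call /bin/date +"%s" --date="DATESTRING" from awk, if the DATESTRING matches a format "/bin/date" understands (read its output through a pipe with getline, because system() only returns the exit status), or you can use the internal mktime() function, which is available in gawk but is not part of POSIX awk. For mktime() you need to split your date according to awk(1):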
mktime(datespec)
Turn datespec into a time stamp of the same form as returned by systime(), and return the result. The datespec is a string of
the form YYYY MM DD HH MM SS[ DST]. The contents of the string are six or seven numbers representing respectively the full year
including century, the month from 1 to 12, the day of the month from 1 to 31, the hour of the day from 0 to 23, the minute from 0
to 59, the second from 0 to 60, and an optional daylight saving flag. The values of these numbers need not be within the ranges
specified; for example, an hour of -1 means 1 hour before midnight. The origin-zero Gregorian calendar is assumed, with year 0
preceding year 1 and year -1 preceding year 0. The time is assumed to be in the local timezone. If the daylight saving flag is
positive, the time is assumed to be daylight saving time; if zero, the time is assumed to be standard time; and if negative (the
default), mktime() attempts to determine whether daylight saving time is in effect for the specified time. If datespec does not
contain enough elements or if the resulting time is out of range, mktime() returns -1.
So you need to prepare your date fields to use the form given in the documentation.
split($5, D, "-");
DS = sprintf("%4d %2d %2d 00 00 00", D[1], D[2], D[3]);
T = mktime(DS);
should do the job.
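If awk is not a hard requirement, the range filter can also be sketched in Python. This is only an illustration under the question's assumptions (pipe-separated records with an ISO `YYYY-MM-DD` birth date in the fifth field); `born_between` is a hypothetical helper name, and it compares date objects rather than epoch seconds.

```python
from datetime import date

def born_between(lines, since, until):
    """Yield the lines whose fifth '|' field is a date within [since, until]."""
    since_d = date.fromisoformat(since)
    until_d = date.fromisoformat(until)
    for line in lines:
        fields = line.rstrip("\n").split("|")
        if len(fields) < 5:
            continue  # not a complete record
        try:
            born = date.fromisoformat(fields[4])
        except ValueError:
            continue  # fifth field is not a valid date
        if since_d <= born <= until_d:
            yield line

records = [
    "933|Mahinda|Perera|male|1989-12-03|2010-03-17T13:32:10.447+0000|192.248.2.123|Firefox",
    "1129|Carmen|Lepland|female|1984-02-18|2010-02-28T04:39:58.781+0000|81.25.252.111|Internet Explorer",
    "4194|Hồ Chí|Do|male|1988-10-14|2010-03-17T22:46:17.657+0000|103.10.89.118|Internet Explorer",
]
for match in born_between(records, "1985-01-01", "1989-01-01"):
    print(match)
```

Both bounds are inclusive here, matching the `from <= ... <= to` comparison used in the awk answer.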

Oracle date calculation issue

There is a requirement like the one below:
The string format is dd hh:mm:ss, meaning days hours:minutes:seconds, where the day part is optional.
The string is added to the value "1/1/4000", so if the incoming value is "00:15:00" the resulting value is 1/1/4000 00:15:00 (15 minutes added to 1/1/4000). If the incoming value is 2 00:15:00, the resulting value is 1/3/4000 00:15:00 (2 days and 15 minutes added to 1/1/4000). If the incoming value is 32 00:15:00, the resulting value is 2/2/4000 00:15:00.
Is there a simple way to implement this requirement?
You can convert your input string to the INTERVAL DAY TO SECOND datatype using TO_DSINTERVAL and then add it to your default date. The result will be a date.
SELECT DATE '4000-01-01' + TO_DSINTERVAL('2 23:23:12') FROM dual;
But this requires your input string to be in DD HH:MI:SS format. Since the day is optional in your input, you should prepend 0 days to the string when it isn't present.
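The normalization step (supply 0 days when the day part is missing, then add the interval to the base date) can be illustrated outside the database. This is a rough Python sketch rather than Oracle code; `add_offset` and `BASE` are names invented for the example.

```python
from datetime import datetime, timedelta

BASE = datetime(4000, 1, 1)

def add_offset(text, base=BASE):
    """Add a '[dd ]hh:mm:ss' offset string to base and return the result."""
    parts = text.split()
    if len(parts) == 1:
        parts.insert(0, "0")  # day part is optional: default to 0 days
    days_text, clock = parts
    hours, minutes, seconds = (int(piece) for piece in clock.split(":"))
    return base + timedelta(days=int(days_text), hours=hours,
                            minutes=minutes, seconds=seconds)

print(add_offset("00:15:00"))
print(add_offset("2 00:15:00"))
```

In Oracle the equivalent check would be done on the string before it is passed to TO_DSINTERVAL.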

Convert 12-hour time to 24-hour time with a regex

I have a bunch of strings with opening hours in this format:
Mon-Fri: AM7:00-PM8:00\nSat-Sun: AM8:00-PM6:00
I can deal with the "AM" part by just removing it, but I'd like to convert the PM by
Removing "PM"
Adding 12 to the number before the ":"
Taking care of the fact that PM is sometimes double-digits (e.g. PM11:00)
There can be zero or more PM times in the string.
I'm not sure how to manipulate the time as a number. I've gotten this far:
opening_hours.sub! /PM([\d]?[\d]):/, "***\1***"
Which outputs things like this:
AM7:15-***\u0001***00
The '\u0001' may be due to Japanese characters in the string.
You can take advantage of the fact that String#gsub accepts a block. Would something like this work for you?
s = "Mon-Fri: AM7:00-PM8:00\nSat-Sun: AM8:00-PM11:00"
s2 = s.gsub('AM', '').gsub(/PM(\d+)/) do |match|
  # $1 is the captured hour; % 12 keeps PM12 (noon) at 12 instead of 24
  ($1.to_i % 12 + 12).to_s
end
s2 # => "Mon-Fri: 7:00-20:00\nSat-Sun: 8:00-23:00"
Also have a look at this question; Ruby has a class called DateTime:
Convert 12 hr time to 24 hr format in Ruby
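The block-based substitution has a direct counterpart in other languages that accept a callback as the replacement. Below is a hedged Python sketch of the same technique using `re.sub` with a function; `to_24_hour` is an illustrative name, and treating AM12 as hour 0 is an assumption that goes beyond what the question asked for.

```python
import re

def to_24_hour(text):
    """Rewrite 'AM7:00' / 'PM8:00' style times into 24-hour form."""
    def convert(match):
        hour = int(match.group(2)) % 12  # 12 wraps to 0 before the PM offset
        if match.group(1) == "PM":
            hour += 12
        return str(hour)
    # Only the marker and the hour digits before the colon are replaced.
    return re.sub(r"(AM|PM)(\d{1,2})(?=:)", convert, text)

hours = "Mon-Fri: AM7:00-PM8:00\nSat-Sun: AM8:00-PM11:00"
print(to_24_hour(hours))
```

Handling both markers in one pass keeps the AM and PM rules next to each other, which makes the noon and midnight cases easier to check.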

Need direction on writing custom datetime formatting in VB6

I have a string:
A -DDD HH:MM:SS
and I am currently trying to write a function that takes this string along with the format to convert it to. For example, say I wanted to display just HH:MM ss (hours with leading zeros, a colon, minutes with leading zeros, no colon, and seconds without leading zeros).
I understand for VB6 you'd probably use a string function like Mid(str, int, int) to get the time portion. But if I create a custom format of
HH:MM ss
How would you approach formatting this?
Chop off the strictly formatted time part and use the Format function:
s = "A -??? 12:34:56"
t = right$(s,8)
?format$(t, "HH:NN ss")
12:34 56
?format$(t, "HH:NN ss AM/PM")
12:34 56 PM
?format$(t, "H, N, S AM/PM")
12, 34, 56 PM
(N is minute here)
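The same two steps (slice off the fixed-width time, then apply a format string) can be sketched in Python for readers who do not have VB6 at hand. `reformat_time` is a hypothetical helper, and it assumes the time always occupies the last eight characters as HH:MM:SS.

```python
from datetime import datetime

def reformat_time(stamp, out_format):
    """Take the trailing HH:MM:SS of stamp and render it with out_format."""
    clock = stamp[-8:]                             # same idea as Right$(s, 8)
    parsed = datetime.strptime(clock, "%H:%M:%S")  # raises ValueError if malformed
    return parsed.strftime(out_format)

print(reformat_time("A -??? 12:34:56", "%H:%M %S"))
```

Note that `%S` zero-pads the seconds; to drop the leading zero you would format the parsed value's `second` attribute yourself.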
