Date conversion in SAS (String to Date) - sorting

I import an Excel-spreadsheet using the following SAS-procedure.
%let datafile = 'Excel.xlsx';
%let tablename = myDB;
proc import datafile = &datafile
out = &tablename
dbms = xlsx
replace
;
run;
One of the variables (date_variable) holds dates in the form DD.MM.YYYY, so I tried to assign it a date format like this:
data &tablename;
set &tablename;
format date_variable mmddyy10.;
run;
Now, I would like to sort the table by that variable:
proc sort data = &tablename;
by date_variable;
run;
However, because date_variable is stored as a string, the sort order is wrong. How can I redefine date_variable as a date?

Use the input function to convert the string containing a date representation into a SAS date value. A date value can then be given a date format, which controls how it is rendered in output and viewers.
The input function requires an informat as an argument. The informat documentation includes an entry for:
DDMMYYw. Informat
Reads date values in the form ddmmyy or dd-mm-yy, where a special character, such as a hyphen (-), period (.), or slash (/), separates the day, month, and year; the year can be either 2 or 4 digits.
Example:
* string values contain date representations;
data have;
input date_from_excel $10.;
cards;
31.01.2019
01.02.2019
15.01.2019
run;
data want;
set have;
date = input(date_from_excel, ddmmyy10.); * convert to SAS date values;
format date ddmmyyd10.; * format to apply when rendering those values;
run;
* sort by SAS date values;
proc sort data=want;
by date;
run;

format date_variable mmddyy10.; does not convert a string to a date. It only sets the format used to display that variable.
If I understand correctly, date_variable is a string that looks like "31.01.2019". If so, you first have to convert it into a date value:
date_variable_converted = input(date_variable, DDMMYY10.);
You should now be able to sort by date_variable_converted, which is a SAS date value.

Related

How to convert string to date with Tibbo AggreGate expression language

I have a table of string data. It is essentially a csv file.
Example string: "01.10.2021 5:41:15";"255,1759949"
I need to convert the first column's data type from string to date, replacing the original string column with a date column.
The simplest way is to add a new column and drop the old one:
removeColumns(
addColumns(
table("<<str1><S>><<str2><S>>","01.10.2021 5:41:15","255,1759949","01.10.2021 5:41:15","255,1759949")
,"<d><D>"
,"parseDate({str1}, 'dd.MM.yy HH:mm:ss', 'GMT+3' )"
)
,"str1"
)

How to get only first part of the date ('DD') from an entire date?

I've converted a date into string. Now, I want to parse a string to get the DD part from DD-MM-YYYY.
For example:
If the date is 03-05-2017 (DD-MM-YYYY), the goal is to get only the first part of the string, i.e. 03 (DD).
You've tagged this question as a ServiceNow question, so I assume you're using the ServiceNow GlideDateTime class to derive the date as a string. If that is correct, did you know that you can derive the day of the month directly from the GlideDateTime object? You can use getDayOfMonth(), getDayOfMonthLocalTime(), or getDayOfMonthUTC().
You could, of course, also use String.prototype.indexOf() to get the first hyphen's location, and then return everything up to that location using String.prototype.slice().
Or, if you're certain that the day of the month in the string will always have a leading zero, you can simply .slice() out a new string from index 0 up to, but not including, index 2.
var date = '03-05-2017';
var newDate = date.slice(0, 2);
console.log(newDate); //==>Prints "03".
var alternateNewDate = date.slice(0, date.indexOf('-'));
console.log(alternateNewDate); //==>Prints "03".
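If the input string is not guaranteed to be well formed, the slicing above can be wrapped in a small check. The following is only a sketch in plain JavaScript: the function name dayOfMonthFromString is made up for illustration and is not part of ServiceNow or GlideDateTime, and it assumes the string is strictly DD-MM-YYYY with two-digit day and month.

```javascript
// Hypothetical helper (not a ServiceNow API): returns the day part of a
// DD-MM-YYYY string as a number, or null if the string does not match.
function dayOfMonthFromString(dateString) {
  // Two digits, hyphen, two digits, hyphen, four digits.
  var match = /^(\d{2})-(\d{2})-(\d{4})$/.exec(dateString);
  if (match === null) {
    return null;
  }
  var day = parseInt(match[1], 10);
  // Reject impossible day numbers; month lengths are not checked here.
  return day >= 1 && day <= 31 ? day : null;
}

console.log(dayOfMonthFromString('03-05-2017')); // 3
console.log(dayOfMonthFromString('3-5-2017'));   // null
```

Returning a number rather than the substring "03" is a design choice for this sketch; if the leading zero matters, return match[1] instead.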

Date formatting in a grid

I'm trying to display a date column in grid like this: "dd-mm-yyyy". In dbf table, the date is stored in this format: "YYYY-MM-DDThh:mm:ss" in a character field.
The grid is created from this cursor:
select id,beginningDate,endDate,cnp from doc ORDER BY id desc INTO CURSOR myCursor
I would like something like this:
select id,convert(beginningDate, Datetime,"dd-mm-yyyy"),endDate,cnp from doc ORDER BY id desc INTO CURSOR myCursor
Fox doesn't have a builtin function called convert(), nor can it handle your non-standard date/time string format directly.
A quick and dirty way to convert a string foo in the given format ("YYYY-MM-DDThh:mm:ss") to a date/time value is
ctot("^" + chrtran(foo, "T", " "))
The caret marks the input as the locale-independent standard format, which differs from the input format only by having a space instead of a 'T'.
You can extract the date portion from this via the ttod() function, or simply extract only the date portion from the string and convert that:
ctod("^" + left(foo, 10))
Fox's controls - including those in a grid - normally use the configured Windows system format (assuming that set("SYSFORMATS") == "ON"); you can override this by playing with the SET DATE command.
There seems to be no mask-based date formatting option as in most other languages. dtoc() and ttoc() don't take format strings, transform() takes a format string but blithely ignores it for date values.
I am with Tamar on this subject, you should have used a datetime field instead.
Since you are storing it like this anyway, you can 'convert' to datetime using the built-in cast function (or ttod(ctot()) in versions older than VFP9 - in either case you don't need to remove T character):
select id, ;
Cast(Cast("^"+beginningDate as datetime) as date) as beginningDate, ;
endDate,cnp ;
from doc ;
ORDER BY id desc ;
INTO CURSOR myCursor ;
nofilter
In a grid or any other textbox control, you can control the display style using the DateFormat property, for example:
* assuming it is Columns(2). 11 is DMY
thisform.myGrid.Columns(2).SetAll('DateFormat', 11)

Pig Help: Splitting a Field into Multiple Fields

Hi I am playing around with Pig for the first time and am curious how to deal with splitting up a field into multiple other fields.
I have a bag, A, like the one below:
grunt> Dump A;
(text, text, Mon Mar 07 12:00:00 CDT 2016)
What I'd like to do is split the Date-Time field into multiple fields so that I can explore the distribution of the data set and do group bys on the Day of Week, Month, Year, etc.
I have been looking at tokenize but am unsure this meets my needs as I need/want to have field names added to the bag or create a nested bag.
Any ideas?
Assuming that the value is already of datatype datetime, you can use the following functions to extract the individual elements (see the DateTime functions in Pig's built-in function reference):
B = FOREACH A GENERATE f1,f2,
GetDay(f3) as f3_Day,
GetMonth(f3) as f3_Month,
GetYear(f3) as f3_Year,
GetHour(f3) as f3_Hour,
GetMinute(f3) as f3_Minute,
GetSecond(f3) as f3_Second;
If the datatype is chararray then use the ToDate() function to convert it to datetime and extract the date parts.
B = FOREACH A GENERATE f1,f2,ToDate(f3,'choose your datetime format') as f3_Date;
C = FOREACH B GENERATE f1,f2,
GetDay(f3_Date) as f3_Day,
GetMonth(f3_Date) as f3_Month,
GetYear(f3_Date) as f3_Year,
GetHour(f3_Date) as f3_Hour,
GetMinute(f3_Date) as f3_Minute,
GetSecond(f3_Date) as f3_Second;

Error while formatting the date variables in sas

I am getting an error when I try to apply a format to a date variable.
My date values look like this: "26-Dec-58"
The error is: "The format $DATE was not found or could not be loaded"
The reason for the error is that my date value is stored as a character variable in the data set, so SAS does not accept a numeric format when I try to format it.
So I want to convert my date variable (which is character) into a numeric variable without introducing a new variable.
I tried datepart and substring options but I am still getting the error.
I am still learning SAS, so any code that clears the error is appreciated. I understand the concept, but everything I have tried in code has failed.
Current code:
data Practice.Sales;
set Practice.Sales;
Birthdate = '26-Dec-58';
Purchase_Dt = '15-Sep-04';
t_num_date = input(Birthdate, ddmmyy8.);
t_num_date1 = input(Purchase_Dt, ddmmyy8.);
drop Birthdate Purchase_Dt;
format Birth_date ddmmyy8. PurchaseDt ddmmyy8. Price DOLLAR10.2;
rename t_num_date = Birthdate;
rename t_num_date1 = Purchase_Dt;
run;
It isn't possible to change an existing variable in a SAS dataset directly from character to numeric. However, you can create a new numeric variable, drop the original character variable, then rename the new one:
data example;
mydate = '26-Dec-58';
t_num_date = input(mydate, date9.);
drop mydate;
format t_num_date date9.;
rename t_num_date = mydate;
output;
run;
N.B. you have to apply the format to the temporary variable before it gets renamed. Attempting to apply a numeric format after renaming it results in an error, as the format statement is processed before the rest of the data step runs, and at that point the original character variable hasn't been dropped yet.
This code works fine for me, but the newly created variables still end up last in the data set. How can I change that?
data Practice.Sales;
set Practice.Sales;
Birth_date=datepart(input(Birthdate,anydtdtm19.));
PurchaseDt = datepart(input(Purchase_Dt,anydtdtm19.));
format Birth_date PurchaseDt ddmmyy8. Price DOLLAR10.2;
drop Birthdate Purchase_Dt;
run;
You need to rename the old variables on the set statement (using the rename= data set option) and then drop them at the end.
data Practice.Sales;
* rename old variables as they are read;
set Practice.Sales
(rename=(birthdate=birth_date purchase_dt=purchasedt));
*use renamed variables in code;
Birthdate=datepart(input(Birth_date,anydtdtm19.));
Purchase_Dt = datepart(input(PurchaseDt,anydtdtm19.));
format Birthdate Purchase_Dt date9. Price DOLLAR10.2;
drop Birth_date PurchaseDt;
run;
The variables in a SAS dataset appear in the order they were created.
If you really want to determine the order of the variables yourself, then you should create them before your set statement. Further, I suppose you prefer NOT changing the name of the variables, while still changing their type. Therefore we have to rename the existing variables while reading them.
data Practice.Sales;
format Birthdate Purchase_Dt ddmmyy8. Price DOLLAR10.2;
set Practice.Sales (rename =(Birthdate=oldBirth Purchase_Dt = oldPurchase));
Birthdate=datepart(input(oldBirth,anydtdtm19.));
Purchase_Dt = datepart(input(oldPurchase,anydtdtm19.));
drop oldBirth oldPurchase; * drop the renamed character originals, not the new date variables;
run;
