I have panel data with a quarterly date variable stored as a string (imported from a .csv file).
The variable is named datacqtr and has the format YYYY"Q"Q, for example 1998Q3. It runs from 1966Q1 to 2014Q4 for each of around 200 categories.
I am following the Stata guide and creating a new variable like this:
generate time = date(datacqtr,"YQ")
but it creates only missing values.
How do I make Stata understand that the variable datacqtr holds quarterly dates?
The function date() is for creating daily dates in Stata terms, i.e. days with an origin 0 at 1 January 1960. This is documented at length in (e.g.) help dates and times but is even clearer with the synonym daily().
You need the function quarterly(), for example:
. di %tq quarterly("2015q3", "YQ")
2015q3
. di %3.0f quarterly("2015q3", "YQ")
222
In your case you want something like
gen qdate = quarterly(datacqtr, "YQ")
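To make the arithmetic behind the 222 above concrete, here is a small Python sketch of the quarter count: quarters elapsed since 1960q1. It is only an illustration, not Stata code. The function name `stata_quarterly` and the parsing regex are assumptions made for this example.

```python
import re

def stata_quarterly(text):
    """Return quarters elapsed since 1960q1 for a string such as '1998Q3'."""
    match = re.fullmatch(r"\s*(\d{4})\s*[Qq]\s*([1-4])\s*", text)
    if match is None:
        return None  # stands in for a missing value
    year, quarter = int(match.group(1)), int(match.group(2))
    # 1960q1 is quarter 0, so each year adds 4 and each quarter adds 1
    return (year - 1960) * 4 + (quarter - 1)

print(stata_quarterly("2015q3"))  # 222, matching the display above
```

A string that does not look like a year followed by a quarter returns None here, which is analogous to the missing values produced when the wrong conversion function is applied.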
I am struggling to compile my large dataset and am assuming syntax commands are the answer, however, I am not skilled at all with syntax. My questions are specific to what syntax commands (or other methods) I should use to create hundreds/thousands of new variable names so I do not need to do it manually.
I am working with a dataset involving intimate partner homicides and domestic violence services utilization among victims from 2012-2021 (10 years), examined monthly (120 months). Across that timeframe, I have a three variable name set (REC [number of clients who received services], CALL [number of calls for services], HOUR [number of hours advocates/employees spent providing services]) that needs to be repeated monthly Jan-Dec across 10 years 2012-2021 and 39 separate services. See below:
MonthYear_REC_ServiceName
MonthYear_CALL_ServiceName
MonthYear_HOUR_ServiceName
"Month" in the above is Jan-Dec (01-12), "Year" is 2012-2021 (12-21), and "ServiceName" would be replaced with 39 different services. As an example for the year 2017 and "Shelter" services:
0117_REC_Shelter
0117_CALL_Shelter
0117_HOUR_Shelter
0217_REC_Shelter
0217_CALL_Shelter
0217_HOUR_Shelter
0317_REC_Shelter
0317_CALL_Shelter
0317_HOUR_Shelter
.....so on and so forth until December of 2017.
To further explain: This sequential monthly order would need to be repeated for each year in the 2012-2021 timeframe for each of 39 services for which I have data. "Shelter" services are shown as an example above, but I also need the same set of variable names across 38 other service types such as group counseling, legal advocacy, economic assistance, etc.
My overall question is (sorry for the repetition): what syntax commands would I need to input to create this MASSIVE number of variable names/variables? I hope this makes sense to everyone in the same way it makes sense to me! Sorry for the length and thank you in advance.
Best,
Shannon H.
Assuming what you want is to create an empty dataset with all the variable names you described, this will do it:
INPUT PROGRAM.
LOOP ind = 1 to (12*10*3*39).
END CASE.
END LOOP.
END FILE.
END INPUT PROGRAM.
EXECUTE.
do repeat vr=month year set service/vl=12 10 3 39.
compute vr=mod(ind,vl).
recode vr (0=vl).
compute ind=trunc((ind-1)/vl)+1.
end repeat.
compute year=year+11.
formats all (f2).
alter type month year (a2) set (a4).
compute month = char.lpad(ltrim(month), 2, "0").
recode set (" 1"="REC")(" 2"="CALL")(" 3"="HOUR").
* I suggest at this point you use "match files" to match the service numbers here with a list of service names.
* The following code creates fictitious service names instead to demonstrate how to use them.
string serviceName (a20) vrnm (a50).
compute serviceName=concat("service", char.lpad(ltrim(string(service, f2)), 2, "0") ).
* now to create the final variable names.
compute vrnm=concat(month, year, "_", set, "_", serviceName).
flip NEWNAMES = vrnm.
select if CASE_LBL="".
exe.
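As a cross-check on the naming scheme, the same list of names can be sketched in Python. This is only an illustration of the pattern MonthYear_SET_ServiceName, not a replacement for the syntax above. The function name `build_variable_names` and the placeholder service names are assumptions.

```python
from itertools import product

def build_variable_names(services, years=range(2012, 2022)):
    """Build MMYY_SET_Service names, ordered by service, year, month, then set."""
    measures = ("REC", "CALL", "HOUR")
    months = range(1, 13)
    return [
        f"{month:02d}{year % 100:02d}_{measure}_{service}"
        for service, year, month, measure in product(services, years, months, measures)
    ]

names = build_variable_names(["Shelter"])
print(names[:3])   # first three names for January 2012
print(len(names))  # 12 months * 10 years * 3 sets for one service
```

With 39 services the list has 12*10*3*39 = 14040 entries. SPSS variable names cannot begin with a digit, so names of this form may need a letter prefix before they can be used as actual variable names.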
Problem:
In a field called $Detailed Decription$, a date is sometimes entered in a format like 08/09/2021, and it needs to be converted to the Swedish format, like 2022-02-11.
I'm going to use BMC Developer Studio and make a filter, but I can't find a fitting solution. Replacing it won't work (I think) because it needs a value to replace it with.
Maybe there is a function that reads a regex such as (\d{2})/(\d{1,2})/(\d{4}), but how can I convert it?
If it happens only sometimes, look at the AR System User Preference form. Check that user's locale and date/time configuration.
It also matters where the data comes from. It could be a browser setting or a JavaScript modification.
1- Using a Set Fields action, copy the date value from Detailed Description to a Date/Time field (e.g. z1D_DateTime01).
2- Using a Set Fields action and functions (MONTH, YEAR, DAY, HOUR, MINUTE, SECOND), you can parse the date/time and convert it to the format you like.
Something like this:
SwedishDate = YEAR($z1D_DateTime01$) + "-" + MONTH($z1D_DateTime01$) + "-" + DAY($z1D_DateTime01$)
This will capture the parts of date and combine them with "-" in the middle and in the order you want.
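The regex from the question can also be used to reorder the captured parts directly. The following Python sketch only illustrates that idea outside of Developer Studio; it is not AR System code. The function name `to_swedish_dates` and the `day_first` switch are assumptions, and the sketch assumes month/day/year input unless told otherwise, since 08/09/2021 is ambiguous.

```python
import re

# Pattern taken from the question: two digits, one or two digits, four-digit year
SLASH_DATE = re.compile(r"(\d{2})/(\d{1,2})/(\d{4})")

def to_swedish_dates(text, day_first=False):
    """Rewrite every slash-separated date in text as YYYY-MM-DD."""
    def rewrite(match):
        first, second, year = match.groups()
        # Assumption: input is month/day/year unless day_first is set
        day, month = (first, second) if day_first else (second, first)
        return f"{year}-{int(month):02d}-{int(day):02d}"
    return SLASH_DATE.sub(rewrite, text)

print(to_swedish_dates("Reported 08/09/2021 by phone"))
```

Text without a matching date passes through unchanged, which fits the case where the other format is only entered sometimes.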
Can someone please guide me on how to convert a string of more than 6 characters into an integer? I need to do a sum after the conversion. I tried many ways, such as CInt and CLng, but the result is still shown in exponential notation.
Stroutput = 2018050302216556
Sum = Stroutput + 1
I tried splitting it into several chunks using the Right function, but it doesn't look good. It can be managed, but I need another option. Thanks
You seem to be working with a date structure, which, as VBScript shows, is hard to represent as a number only. Use CDate to get a date value from the string (if needed, change the representation of that string to YYYY-mm-dd ...). With the DateAdd function you can add days, years, etc., and finally FormatDateTime will produce the output you want.
I'm working on a project that requires me to find the temporal average (e.g. hour, day, month) for multiple datasets and then do calculations on those averages. The issue I am running into is that Apache Pig will not group by the time nor dump the DateTime values. I've tried several solutions posted here on Stack Overflow and elsewhere to no avail. I've also read over the documentation and am unable to find a solution.
Here is my code so far:
data = LOAD 'TestData' USING PigStorage(',');
t_data = foreach data generate (chararray)$0 as date, (double)$305 as w_top, (double)$306 as t_top, (double)$310 as w_mid, (double)$311 as t_mid, (double)$315 as w_bot, (double)$316 as t_bot, (double)$319 as pressure;
times = FOREACH t_data GENERATE ToDate(date,'YYYY-MM-ddThh:mm:ss.s') as (date), w_top, t_top, w_mid, t_mid, w_bot, t_bot, pressure;
grp_hourly = GROUP times by GetHour(date);
average = foreach grp_hourly generate flatten(group), times.date, AVG(times.w_top), AVG(times.t_top), AVG(times.w_mid), AVG(times.t_mid), AVG(times.w_bot), AVG(times.t_bot);
And some sample lines from the data:
2011-01-06 15:00:00.0 ,0.07225,-11.36384,-0.045,-11.24599,0.036,-12.44104,1021.707
2011-01-06 15:00:00.1 ,0.09975,-11.34448,-0.0325,-11.26053,0.041,-12.45392,1021.694
2011-01-06 15:00:00.2 ,0.15375,-11.35576,-0.02975,-11.26536,0.01025,-12.44748,1021.407
2011-01-06 15:00:00.3 ,-0.00225,-11.42034,-0.03775,-11.28477,-0.013,-12.44429,1021.764
2011-01-06 15:00:00.4 ,0.01625,-11.33965,-0.0395,-11.27989,-0.0395,-12.42172,1021.484
What I Currently Get as Output:
I get a file with one average of every variable I feed into Apache Pig, without a date and time (most likely the average of each variable over the entire data set). I need the averages for each hour, with the hour printed in the output. Any tips would be appreciated. Sorry if my post is messy, I don't post to Stack Overflow often.
The date and time pattern string in ToDate doesn't exactly match your data. You have YYYY-MM-ddThh:mm:ss.s but your data looks like 2011-01-06 15:00:00.0. You need to match the spaces in your data (including the trailing one), and since your hours are on a 24-hour clock, you need to use HH instead of hh. Also, lowercase s means seconds; the fractional part is uppercase S. Check out the documentation for the Java SimpleDateFormat class. Try this pattern string instead:
times = FOREACH t_data GENERATE ToDate(date,'yyyy-MM-dd HH:mm:ss.S ') as date;
To debug your code, try dumping right after creating the relation times instead of at the end since it seems like the problem is with ToDate().
Savage's answer was correct. The issue I had in my code was a quotation mark that was too close to the date and time string. So instead of writing mine like this:
(date,'YYYY-MM-ddThh:mm:ss.s')
It should be written like this:
(date,'YYYY-MM-ddThh:mm:ss.s ')
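The same two steps, parsing a timestamp with a pattern that matches the data exactly and then averaging per hour, can be sketched in Python. This is only an illustration of the logic, not Pig code. The function name `hourly_means`, the two-column layout, and the third sample row are assumptions made for the example. The trailing space is handled by stripping the field before parsing.

```python
from collections import defaultdict
from datetime import datetime

# 24-hour clock and a fractional-second field, matching "2011-01-06 15:00:00.0"
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S.%f"

def hourly_means(lines):
    """Average the second column per hour of day taken from the first column."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for line in lines:
        raw_time, raw_value = line.split(",")[:2]
        # strip() removes the trailing space before the comma; without it
        # strptime raises ValueError because of the unmatched character
        stamp = datetime.strptime(raw_time.strip(), TIMESTAMP_FORMAT)
        sums[stamp.hour] += float(raw_value)
        counts[stamp.hour] += 1
    return {hour: sums[hour] / counts[hour] for hour in sums}

sample = [
    "2011-01-06 15:00:00.0 ,0.07225",
    "2011-01-06 15:00:00.1 ,0.09975",
    "2011-01-06 16:00:00.0 ,0.15375",  # hypothetical row, added to get a second hour
]
print(hourly_means(sample))
```

Like GetHour, grouping on `stamp.hour` pools the same hour of day across all dates, so a per-date hourly average would need the date in the grouping key as well.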
CodeIgniter stores timezones for its date class in
system/language/english/date_lang.php
I would like to change the strings in this file so that
$lang['UM12'] = '(UTC -12:00) Baker/Howland Island';
$lang['UM11'] = '(UTC -11:00) Samoa Time Zone, Niue';
would instead be
$lang['-12:00'] = '(UTC -12:00) Baker/Howland Island';
$lang['-11:00'] = '(UTC -11:00) Samoa Time Zone, Niue';
Is this possible at all?
Any change I make to the UM__ portion of one line makes it show as a blank on the dropdown. The remaining (unchanged) lines appear OK.
I have also tried to clone this file to application/language/english/ with similar bad results.
Any insights on this?
It looks like this would require hacks to the date_helper.php file which I am not willing to do.
Instead, the date helper in CI has the timezones() function, which allows you to convert from, for example, UM5 to -5. In that case one can wrap this function around the U__ value coming from the view/dropdown, and then save it to the DB as -5 or some other INT.
Since I want to show the user their selected timezone on that same dropdown, I am forced to have DB fields for the U__ and INT timezone formats. As far as I know, there is no CI function to convert from -5 to UM5.
So, for the user, I pull the U__ format into the view to autopopulate the dropdown.
For timezone conversions and such, I use the INT format.
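If storing both formats feels redundant, a reverse lookup can be built by inverting the code-to-offset table. The Python sketch below only shows the idea and is not CodeIgniter code. The table is a small illustrative subset, and the names `ZONE_OFFSETS`, `OFFSET_ZONES` and `offset_to_code` are assumptions. It also assumes each offset maps to exactly one code.

```python
# Illustrative subset of code -> UTC offset pairs; not the full CodeIgniter table
ZONE_OFFSETS = {"UM12": -12, "UM11": -11, "UM5": -5, "UTC": 0, "UP1": 1}

# Invert the mapping once so an offset can be turned back into its code
OFFSET_ZONES = {offset: code for code, offset in ZONE_OFFSETS.items()}

def offset_to_code(offset):
    """Return the U__ style code for an offset, or None if it is unknown."""
    return OFFSET_ZONES.get(offset)

print(offset_to_code(-5))
```

With such a lookup, only the INT format would need to be stored, and the dropdown value could be derived from it when the view is rendered.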