Syntax for Sequential Variable Names - syntax

I am struggling to compile my large dataset and assume syntax commands are the answer; however, I am not skilled at all with syntax. My question is which syntax commands (or other methods) I should use to create hundreds or thousands of new variable names so that I do not need to do it manually.
I am working with a dataset involving intimate partner homicides and domestic violence services utilization among victims from 2012-2021 (10 years), examined monthly (120 months). Across that timeframe, I have a three variable name set (REC [number of clients who received services], CALL [number of calls for services], HOUR [number of hours advocates/employees spent providing services]) that needs to be repeated monthly Jan-Dec across 10 years 2012-2021 and 39 separate services. See below:
MonthYear_REC_ServiceName
MonthYear_CALL_ServiceName
MonthYear_HOUR_ServiceName
"Month" in the above is Jan-Dec (01-12), "Year" is 2012-2021 (12-21), and "ServiceName" would be replaced with 39 different services. As an example for the year 2017 and "Shelter" services:
0117_REC_Shelter
0117_CALL_Shelter
0117_HOUR_Shelter
0217_REC_Shelter
0217_CALL_Shelter
0217_HOUR_Shelter
0317_REC_Shelter
0317_CALL_Shelter
0317_HOUR_Shelter
.....so on and so forth until December of 2017.
To further explain: This sequential monthly order would need to be repeated for each year in the 2012-2021 timeframe for each of 39 services for which I have data. "Shelter" services are shown as an example above, but I also need the same set of variable names across 38 other service types such as group counseling, legal advocacy, economic assistance, etc.
My overall question is (sorry for the repetition)- What syntax commands would I need to input to create this MASSIVE amount of variable names/variables? I hope this makes sense to everyone in the same way it makes sense to me! Sorry for the length and thank you in advance.
Best,
Shannon H.

Assuming what you want is to create an empty dataset with all the variable names you described, this will do it:
INPUT PROGRAM.
LOOP ind = 1 to (12*10*3*39).
END CASE.
END LOOP.
END FILE.
END INPUT PROGRAM.
EXECUTE.
do repeat vr=month year set service/vl=12 10 3 39.
compute vr=mod(ind,vl).
recode vr (0=vl).
compute ind=trunc((ind-1)/vl)+1.
end repeat.
compute year=year+11.
formats all (f2).
alter type month year (a2) set (a4).
compute month = char.lpad(ltrim(month), 2, "0").
recode set (" 1"="REC")(" 2"="CALL")(" 3"="HOUR").
* I suggest at this point you use "match files" to match the service numbers here with a list of service names.
* The following code creates fictitious service names instead to demonstrate how to use them.
string serviceName (a20) vrnm (a50).
compute serviceName=concat("service", char.lpad(ltrim(string(service, f2)), 2, "0") ).
* now to create the final variable names.
compute vrnm=concat(month, year, "_", set, "_", serviceName).
flip NEWNAMES = vrnm.
select if CASE_LBL="".
exe.
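For anyone who would rather generate the list of names outside SPSS and paste it in, here is a minimal sketch of the same idea in Python. It is only an illustration of the naming scheme described in the question, not a replacement for the syntax above. The function name build_names, the optional prefix argument and the three service names are assumptions made for the example; the real data has 39 services. The prefix is there because SPSS variable names normally have to begin with a letter, so names starting with digits may need one.

```python
from itertools import product


def build_names(services, years=range(12, 22), months=range(1, 13),
                measures=("REC", "CALL", "HOUR"), prefix=""):
    """Return names like 0117_REC_Shelter.

    Ordered by service, then year, then month, then measure.
    `prefix` is an optional (hypothetical) leading string, e.g. "v".
    """
    return [
        f"{prefix}{month:02d}{year:02d}_{measure}_{service}"
        for service, year, month, measure
        in product(services, years, months, measures)
    ]


# Hypothetical service list for illustration; substitute the 39 real names.
services = ["Shelter", "GroupCounseling", "LegalAdvocacy"]
names = build_names(services)

print(len(names))   # 3 services * 10 years * 12 months * 3 measures
print(names[:3])    # the first month of 2012 for the first service
```

With all 39 services the list has 12 * 10 * 3 * 39 = 14040 names, which matches the loop bound in the INPUT PROGRAM above.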

Related

Declaring time series in Stata

I have some panel data with quarterly data in string format (imported from .csv file).
The variable is named datacqrt and holds values in the format YYYYQq, for example 1998Q3. It runs from 1966Q1 to 2014Q4 for each of around 200 categories.
I am following the Stata guide and creating a new variable like this
generate time = date(datacqtr,"YQ")
but then it creates only missing values.
How do I make Stata understand the variable datacqrt is a time?
The function date() is for creating daily dates in Stata terms, i.e. days with an origin 0 at 1 January 1960. This is documented at length in (e.g.) help dates and times but is even clearer with the synonym daily().
You need the function quarterly(), for example:
. di %tq quarterly("2015q3", "YQ")
2015q3
. di %3.0f quarterly("2015q3", "YQ")
222
In your case you want something like
gen qdate = quarterly(datacqrt, "YQ")
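To make the number 222 above less mysterious: a Stata quarterly date is a count of quarters since 1960q1, which is quarter 0. The following is a small Python sketch of that arithmetic, written only to illustrate the encoding. The function name stata_quarterly is made up for the example and is not part of Stata.

```python
import re


def stata_quarterly(text):
    """Convert a 'YYYYQq' string to quarters elapsed since 1960q1.

    Returns None when the string cannot be parsed, loosely mirroring
    the missing value Stata would produce.
    """
    match = re.fullmatch(r"\s*(\d{4})\s*[Qq]\s*([1-4])\s*", text)
    if match is None:
        return None
    year, quarter = int(match.group(1)), int(match.group(2))
    return (year - 1960) * 4 + (quarter - 1)


print(stata_quarterly("2015q3"))  # 222, the value displayed above
print(stata_quarterly("1960q1"))  # 0, the origin
```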

Hide Labels with No Data in SPSS

I just started using SPSS. I was trying the Select Cases option and then running frequencies based on that filter.
For example:
Suppose Q1 has 12 parts: Q1_1 Q1_2 Q1_3 Q1_4 Q1_5 Q1_6 Q1_7 Q1_8 Q1_9 Q1_10 Q1_11 Q1_12
I want to see the data in these variables based on the condition I used in Select Cases. When I run frequencies on these variables with the filter applied, only 4 of the 12 have data.
My question is: can I hide the other 8 and show only the 4 with data in my output window?
It's not entirely clear what you are describing, but reading between the lines, I'm guessing you want to delete tables generated by FREQUENCIES that happen to be empty (likely because a filter was applied, though not necessarily).
You could do this with SPSS Scripting, but to avoid that you may want to explore CTABLES. Its output is not in exactly the same format as a FREQUENCIES table, but it will nonetheless retrieve the same information.
Solution below. It assumes Python Integration with the SPSSINC SELECT VARIABLES extension installed, and of course the CTABLES (Custom Tables) add-on module.
/****** Simulate example data ******/.
input program.
loop #j = 1 to 100.
compute ID=#j.
vector Q(12).
loop #i = 1 to 12.
do if #j<51 and #i<9.
compute Q(#i) = $sysmis.
else.
compute Q(#i) = trunc(rv.uniform(1,5)).
end if.
end loop.
end case.
end loop.
end file.
end input program.
execute.
/************************************/.
/* frequencies without filtering applied */.
freq q1 to q12.
/* frequencies WITH filtering applied */.
/* Empty table here should be removed */.
temp.
select if (ID<51).
freq q1 to q12.
spssinc select variables macroname="!Qp" /properties pattern = "^Q\d+$"/options separator="+" order=file.
spssinc select variables macroname="!Qs" /properties pattern = "^Q\d+$"/options separator=" " order=file.
temp.
select if (ID<51).
ctables /table (!Qp)[c][count colpct]
/categories variables=!Qs empty=exclude.
Note: if you had to assess empty variables at a total level, there is a function in spssaux2 (spssaux2.FindEmptyVars) that could help you find the empty variables. You could then build the syntax to exclude these, keep only the variables with valid responses, and run FREQUENCIES. But I don't think spssaux2.FindEmptyVars will honor any filtering.
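The underlying idea (apply the filter first, then tabulate only the variables that still have at least one valid value) can be sketched in plain Python. This is just an illustration of the logic under made-up data, not SPSS code and not what CTABLES does internally. The frequencies function and the toy rows are assumptions for the example; the rows mimic the simulated data above, where Q1 to Q8 are missing for ID below 51.

```python
from collections import Counter


def frequencies(rows, variables, keep=lambda row: True):
    """Frequency tables for variables with at least one non-missing value.

    `keep` is the case filter; variables that are entirely missing
    among the kept rows are left out of the result.
    """
    selected = [row for row in rows if keep(row)]
    tables = {}
    for var in variables:
        counts = Counter(row[var] for row in selected
                         if row.get(var) is not None)
        if counts:  # skip variables that are empty under the filter
            tables[var] = dict(sorted(counts.items()))
    return tables


# Toy data: Q1 to Q8 are missing (None) whenever ID < 51.
rows = [
    {"ID": i, **{f"Q{q}": (None if i < 51 and q < 9 else (i + q) % 4 + 1)
                 for q in range(1, 13)}}
    for i in range(1, 101)
]
variables = [f"Q{q}" for q in range(1, 13)]

tables = frequencies(rows, variables, keep=lambda row: row["ID"] < 51)
print(list(tables))  # only the variables that have data under the filter
```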

How to get user to enter in 24 hour format in BBC Basic

I am making a program that will let me work out the average speed of something over a set distance.
For this to work, the user needs to input the start time and the end time. I am not sure how to input time in a 24-hour format.
Furthermore, I need to find the difference between the two times and then work out the speed, which is distance / time taken.
Let's say the distance is 1000 meters.
I lack a BBC BASIC compiler, but you should create something like this:
print str$(secondsinday("22:50:01")-secondsinday("17:09:17"))
sub secondsinday(t$)
return val(left$(t$,2))*3600+val(mid$(t$,4,2))*60+val(right$(t$,2))
end sub
I saw some BBC BASIC examples and the formula should be the same; only the function syntax is different (I'll try to convert it after some research).
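Since the formula is what matters here, below is a minimal sketch of the whole calculation in Python rather than BBC BASIC, so the syntax will need translating. It assumes the user types times as HH:MM:SS in 24-hour format and that both times fall on the same day. The function names seconds_in_day and average_speed are only illustrative.

```python
def seconds_in_day(text):
    """Parse 'HH:MM:SS' (24-hour clock) into seconds since midnight."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    if not (0 <= hours < 24 and 0 <= minutes < 60 and 0 <= seconds < 60):
        raise ValueError(f"not a valid 24-hour time: {text!r}")
    return hours * 3600 + minutes * 60 + seconds


def average_speed(distance_m, start, end):
    """Average speed in meters per second between two same-day times."""
    elapsed = seconds_in_day(end) - seconds_in_day(start)
    if elapsed <= 0:
        raise ValueError("end time must be later than start time")
    return distance_m / elapsed


# The two times from the example above, over the 1000 meter distance.
print(seconds_in_day("22:50:01") - seconds_in_day("17:09:17"))  # 20444
print(average_speed(1000, "17:09:17", "22:50:01"))
```

Rejecting hours above 23 is what enforces the 24-hour format; the speed is then simply distance divided by the elapsed seconds.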

Excel VBA - Any performance benefits between const string range or define names for ranges?

G'Day,
I have a question about how Excel VBA can best manage ranges that are declared in one place. I want to work out which of the two options I know of so far is the preferred best practice before doing more work on this project.
The problem I'm solving is to make a small table containing the number of failures across a set of fictional suppliers. The table looks like this (sorry, it is in raw form):
"Company Name" "No. of Failures"
"Be Cool Machine" 7
"Coolant Quarters" 5
"Little Water Coolants" 3
"Air Movers Systems" 7
"Generals Coolant" 5
"Admire Coolants" 4
My first option (Const String) is this module/formula as follows.
Option Explicit
Public Const CountofFailures As String = "J7:J12"
Sub btnRandom()
' Declaration of variables
Dim c As Range
' Provide a random number for failures across Suppliers
For Each c In ActiveSheet.Range(CountofFailures)
c.Value = Random1to10
Next c
End Sub
Function Random1to10() As Integer
'Ensure we have a different value each time we run this macro
Randomize
' Provide a random number from 1 to 10 (Maximum number of Failures)
Random1to10 = Int(Rnd() * 10 + 1)
End Function
Second option (Defined Name) is this module/formula as follows.
Option Explicit
Sub btnRandom()
' Declaration of variables
Dim c As Range
Dim iLoop As Long
' Provide a random number for Suppliers with Defined Range
For Each c In ActiveWorkbook.Names("CountofFailures").RefersToRange
c.Value = Random1to10
Next c
End Sub
Function Random1to10() As Integer
'Ensure we have a different value each time we run this macro
Randomize
' Provide a random number from 1 to 10 (Maximum number of Failures)
Random1to10 = Int(Rnd() * 10 + 1)
End Function
Any suggestions? I could run a macro timer test later if that helps.
Would there be a third option where I fetch the range address stored in a cell as a value? I haven't seen code that does this in practice.
I don't know the performance difference, though I suspect Const is faster. My general advice is "don't worry about performance until you have a performance problem". Otherwise you end up guessing what to spend your optimization time on, and you may guess wrong.
As for named ranges, the benefit is that they move when you insert rows and columns. If you insert a new column at column I, your first example needs to be edited and your second example will continue to work.
Both of your code samples loop through the range cell by cell, which will be the bottleneck. I suggest you:
1. Use a range name to automatically "locate" your data, i.e. if you insert or delete rows and columns, your reference remains intact. My experience, though, is that many range names in a file can end up obfuscating what the workbook is doing.
2. Do a single write to this range.
Sub QuickFill()
Randomize
Range("CountofFailures").Formula = "=Randbetween(1,10)"
Range("CountofFailures").Value = Range("CountofFailures").Value
End Sub
I have found that Named Ranges are slower (presumably because Excel has to do an internal lookup on the Name to find what it refers to), but you are very unlikely to be able to find a significant difference except in very extreme cases (tens of thousands of names being referenced tens of thousands or hundreds of thousands of times).
And as Dick says: the benefits far outweigh the insignificant speed loss.

Yearless Ruby dates?

Is there a way to represent dates like 12/25 without year information? I'm thinking of just using an array of [month, day] unless there is a better way.
You could use the Date class and hard set the year to a leap year (so that you could represent 2/29 if you wanted). This would be convenient if you needed to perform 'distance' calculations between two dates (assuming that you didn't need to wrap across year boundaries and that you didn't care about the off-by-one day answers you'd get when crossing 2/29 incorrectly for some years).
It might also be convenient because you could use #strftime to display the date as (for example) "Mar-3" if you wanted.
Depending on the usage, though, I think I would probably represent them explicitly, either in a paired array or something like YearlessDate = Struct.new(:month,:day). That way you're not tempted to make mistakes like those mentioned above.
However, I've never had a date that wasn't actually associated with a year. Assuming this is the case for you, then @SeanHill's answer is best: keep the year info but don't display it to the user when it's not appropriate.
You would use the strftime function from the Time class.
time = Time.now
time.strftime("%m/%d")
While @Phrogz's answer makes perfect sense, it has a downside:
YearlessDate = Struct.new(:month,:day)
yearless_date = YearlessDate.new(5, 8)
This interface is prone to MM, DD versus DD, MM confusion.
You might want to use Date instead and consider the year 0 as "yearless date" (provided you're not a historian dealing with real dates around bc/ad of course).
The year 0 is a leap year and therefore accommodates every possible day/month duple:
Date.parse("0000-02-29").leap? #=> true
If you want to make this convention air tight, just define your own class around it, here's a minimalistic example:
class YearlessDate < Date
private :year
end
The most "correct" way to represent a date without a year is as a Fixnum between 001 and 365. You can do comparisons on them without having to turn it into a date, and can easily create a date for a given year as needed using Date.ordinal
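To pull the ideas in this thread together (named month and day fields to avoid MM/DD confusion, validation against a leap year so that 2/29 is allowed, and attaching a real year only when needed), here is a rough sketch in Python rather than Ruby. The class name, the in_year method and the choice of 2000 as the placeholder leap year are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date

LEAP_YEAR = 2000  # placeholder year used only to validate month/day pairs


@dataclass(frozen=True, order=True)
class YearlessDate:
    month: int
    day: int

    def __post_init__(self):
        # Raises ValueError for impossible pairs such as month=2, day=30.
        date(LEAP_YEAR, self.month, self.day)

    def in_year(self, year):
        """Attach a real year; raises ValueError for 2/29 in non-leap years."""
        return date(year, self.month, self.day)

    def __str__(self):
        return f"{self.month:02d}/{self.day:02d}"


christmas = YearlessDate(month=12, day=25)
print(christmas)                # 12/25
print(christmas.in_year(2024))  # 2024-12-25
print(YearlessDate(month=2, day=29) < christmas)  # compares (month, day)
```

Passing the fields by keyword, as in YearlessDate(month=12, day=25), is what guards against the MM, DD versus DD, MM mix-up mentioned above.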
