Where do the values of the `timestamp` argument of quantstrat::ruleSignal come from? - quantstrat

While reading the source code of ruleSignal, I noticed that the argument timestamp at line 66 is a very important input, but I could not figure out where the timestamp data come from.
It seems that the functions add.indicator, add.signal, add.rule, applyIndicators and applySignals, which run before ruleSignal, neither use nor generate timestamp values.
Which function generates the timestamp values that ruleSignal uses, or where do these data come from?
Thanks a lot!

Thanks to Brian Peterson's reply, which answers my question:
Indicators and signals are always presumed to be vectorized, per the documentation, so no timestamp is required.
Rules, by default, are path dependent, and thus need a timestamp to operate from.
The path-dependent loop is outside of the individual rule function; see line 555 of rules.R.
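To make the distinction in the reply concrete, here is a minimal Python sketch. It is not quantstrat code: the data and the names `apply_signal`, `rule_signal` and `apply_rules` are all hypothetical. The signal is computed over the whole series at once and never sees a timestamp, while the rule is called from an outer loop that owns the timestamps and passes them in one at a time.

```python
from datetime import date

# Hypothetical price series keyed by timestamp (illustrative data only).
prices = {
    date(2021, 3, 15): 10.0,
    date(2021, 3, 16): 11.0,
    date(2021, 3, 17): 9.0,
    date(2021, 3, 18): 12.0,
}

def apply_signal(series, threshold):
    """Vectorized step: evaluates every observation at once, no timestamp argument."""
    return {ts: price > threshold for ts, price in series.items()}

def rule_signal(signals, timestamp, position):
    """Path-dependent step: inspects one timestamp and the current position."""
    if signals[timestamp] and position == 0:
        return 1    # enter
    if not signals[timestamp] and position == 1:
        return -1   # exit
    return 0        # no order

def apply_rules(signals):
    """The outer loop supplies the timestamps to the rule, one per iteration."""
    position, orders = 0, []
    for timestamp in sorted(signals):
        order = rule_signal(signals, timestamp, position)
        position += order
        if order:
            orders.append((timestamp, order))
    return orders

signals = apply_signal(prices, threshold=10.5)
orders = apply_rules(signals)
print(orders)
```

The rule's result depends on the position built up by earlier iterations, which is why it cannot be evaluated for all rows at once in this sketch.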

Related

How to read a timestamp

I have joined in a Data competition for students. They gave me timestamps of users' interaction. Is it useful?
Some of them are:
1615983880510
1615767552552
1615767577100
1616036788631
What you are looking at is a Unix timestamp. It's actually a pretty interesting time representation. In short, it's just a number: the total number of seconds that have passed since 00:00:00 UTC on January 1st, 1970, a moment known as the "Unix epoch".
Now, if you want to convert that into a readable date, there are many ways to do it in basically every programming language. For example, in Python you might do something like this:
from datetime import datetime, timezone
def printdate(unix):
    # unix: seconds since the Unix epoch; the result is printed in UTC
    print(datetime.fromtimestamp(unix, tz=timezone.utc).strftime('%Y-%m-%d %H:%M:%S'))
BUT! It seems like your timestamps might actually be in milliseconds, meaning that you should divide them by 1000 before passing them through the function. So...
def printdatems(unix):
    return printdate(unix / 1000)
And there you go!
printdatems(1615983880510) #2021-03-17 12:24:40
printdatems(1615767552552) #2021-03-15 00:19:12
printdatems(1615767577100) #2021-03-15 00:19:37
printdatems(1616036788631) #2021-03-18 03:06:28
That's the output for the example dates you provided.
Of course you can find much more information on Wikipedia:
https://en.wikipedia.org/wiki/Unix_time
It's an interesting read!
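If it is unclear whether a value is in seconds or milliseconds, a rough magnitude check can help. The sketch below is a heuristic, not a standard: the helper name `to_utc` and the `1e11` cutoff are hypothetical choices. A value that large would lie beyond the year 5000 if read as seconds, but millisecond values from before 1973 would be misread as seconds. It returns timezone-aware datetimes, so the result is explicitly UTC and keeps the millisecond part.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
MS_CUTOFF = 1e11  # hypothetical threshold: larger values are treated as milliseconds

def to_utc(stamp):
    """Convert a Unix timestamp in seconds or milliseconds to an aware UTC datetime."""
    if abs(stamp) >= MS_CUTOFF:
        return EPOCH + timedelta(milliseconds=stamp)
    return EPOCH + timedelta(seconds=stamp)

# The four values from the question, printed with millisecond precision.
for raw in (1615983880510, 1615767552552, 1615767577100, 1616036788631):
    print(to_utc(raw).isoformat(timespec="milliseconds"))
```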

Is it faster to sort dates or sort strings in SPSS? If so, by how much?

I have a dataset of around 5 million records. The dates are read in as strings. They are in the form MM/DD/YYYY HH:MM:SS. I am only interested in the date part of it so I read them in as (A10) format which effectively trims the time.
I then do ALTER TYPE DateVar (SDATE10). I do this as I thought sorting dates would be quicker but I can't find confirmation of this.
Is there a way to time SPSS commands to work out questions like this?
The quickest way I can think of is to use Python for the timestamps and normal SPSS syntax for the sorting, just to replicate real-life conditions:
***Start timer, in python.
begin program.
import time
start = time.time()
end program.
***go out of python, into normal SPSS syntax, and do your stuff.
/*Put the syntax you want to test here
***get back to python, stop timer, and calculate time difference.
begin program.
end = time.time()
print("It took ",end - start, " seconds")
end program.
Check the output log, and it will show you the time.
Not very scientific, but quick and easy.
I recommend re-starting SPSS between tests - just to be sure one test is not affecting the other.
In my experience, ALTER TYPE does something that affects code execution times. I'm not sure what, but everything seems slower after an ALTER TYPE. So you might also consider saving and re-opening the file after using ALTER TYPE.
You should keep the date format, because:
Dates in SPSS are actually numbers (displayed as dates, but numbers all the same). Sorting numbers is faster than sorting strings.
In any case, sorting dates stored as strings will not put the file in chronological order (e.g. "12-OCT-2017" > "11-NOV-2017").
See another good reason in @horace_vr's comment below.
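The ordering problem with string dates is easy to reproduce outside SPSS. The following Python sketch is only an illustration of the point above, not SPSS behavior: it sorts the same labels once as plain strings and once as parsed dates. The third label is an extra made-up value, and `%b` parsing assumes English month names in the current locale.

```python
from datetime import datetime

labels = ["12-OCT-2017", "11-NOV-2017", "05-JAN-2018"]

# A plain string sort compares character by character, so the day-of-month decides.
as_strings = sorted(labels)

# Parsing first lets the sort compare the actual dates.
as_dates = sorted(labels, key=lambda s: datetime.strptime(s, "%d-%b-%Y"))

print(as_strings)
print(as_dates)
```

Only the second result is chronological; the first puts January 2018 ahead of both 2017 dates.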

Correlating multiple dynamic values

How can I get the values of important id and ValueType?
I have tried using web_reg_save_param_regexp (but unfortunately I don't fully understand how the function works).
I have also tried using web_reg_save_param (with the help of offset and length).
Unfortunately, once again I cannot get accurate values, because some values change in length, especially when the total amount values change dynamically from run to run.
<important id=\"insertsomevalueshere\" record=\"1\" nucTotal=\"NUC609.40\"><total amount=\"68.75\" currency=\"USD\"/><total amount=\"609.40\" currency=\"USD\"/><out avgsomecost=\"540.65\" ValueType=\"insertsomevalueshere\" containsawesomeness=\"1\" Score=\"-97961\" somedatatype=\"1\" typeofData=\"VAL\" web=\"1\">
Put these lines of code before the line of code which does your web request:
web_reg_save_param_regexp("ParamName=importantid","Regexp=<important id=\\\"(.*?)\\\"",LAST);
web_reg_save_param_regexp("ParamName=ValueType","Regexp= ValueType=\\\"(.*?)\\\"",LAST);
You will then have two stored parameters 'importantid' and 'ValueType'
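To see why a non-greedy group copes with values of varying length, the same idea can be tried outside the tool. This is a Python sketch using the standard `re` module, not LoadRunner code. The sample values `ID-001` and `TYPE-A` are made up, and the patterns assume the response may contain backslash-escaped quotes as in the excerpt above, so the backslash is treated as optional.

```python
import re

# Shortened sample response with made-up values in place of the placeholders.
response = (
    r'<important id=\"ID-001\" record=\"1\" nucTotal=\"NUC609.40\">'
    r'<total amount=\"68.75\" currency=\"USD\"/>'
    r'<total amount=\"609.40\" currency=\"USD\"/>'
    r'<out avgsomecost=\"540.65\" ValueType=\"TYPE-A\" web=\"1\">'
)

# (.*?) is non-greedy: it stops at the first closing quote, whatever the value's length.
ID_PATTERN = re.compile(r'<important id=\\?"(.*?)\\?"')
TYPE_PATTERN = re.compile(r' ValueType=\\?"(.*?)\\?"')

important_id = ID_PATTERN.search(response).group(1)
value_type = TYPE_PATTERN.search(response).group(1)
print(important_id, value_type)
```

Because the match is bounded by the surrounding text rather than by an offset and length, the number or size of the `total amount` elements in between does not matter.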
Dynamic number of elements to correlate? Your path for resubmission is through web_custom_request(). You will need to build the string you need dynamically with the name:value pairs for all of the data which needs to be included.
This path will place a premium on your string manipulation skills in the language of the tool. The default path is through C, but you have other language options if your skills are more refined in another language.

Convert to E164 only if possible?

Can I determine if the user entered a phone number that can be safely formatted into E164?
For Germany, this requires that the user started his entry with a local area code. For example, 123456 may be a subscriber number in his city, but it cannot be formatted into E164, because we don't know his local area code. Then I would like to keep the entry as it is. In contrast, the input 089123456 is independent of the area code and could be formatted into E164, because we know he's from Germany and we could convert this into +4989123456.
You can simply convert your number into E164 using libphonenumber and then check whether the two strings are the same. If they are the same, the number could not be formatted; otherwise, the string you get from the library is the number formatted in E164.
Here's how you can convert
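PhoneNumberUtil phoneUtil = PhoneNumberUtil.getInstance();
// format() expects a parsed PhoneNumber, not a String; "DE" is the region assumed from the question, and parse() can throw NumberParseException
PhoneNumber parsedNumber = phoneUtil.parse(inputNumber, "DE");
String formattedNumber = phoneUtil.format(parsedNumber, PhoneNumberFormat.E164);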
Finally compare formattedNumber with inputNumber
It looks as though you'll need to play with isValidNumber and isPossibleNumber for your case. format is certainly not guaranteed to give you something actually dialable, see the javadocs. This is suggested by the demo as well, where formatting is not displayed when isValidNumber is false.
I also am dealing with this FWIW. In the context of US numbers: The issue is I'd like to parse using isPossibleNumber in order to be as lenient as possible, and store the number in E164. However then we accept, e.g. +15551212. This string itself even passes isPossibleNumber despite clearly (I think) not being dialable anywhere.
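For the German case described in the question, the decision hinges on the prefix of what the user typed. The Python sketch below is a deliberately simplified stand-in and does not use libphonenumber: the function name `to_e164_de` is hypothetical, and it assumes only that `+` or `00` marks an international number and a single leading `0` marks an area code. It performs no length or numbering-plan validation.

```python
import re

def to_e164_de(raw):
    """Return an E.164 string for a German entry, or None if no area code was given.

    Assumption: only the prefix is inspected; the number itself is not validated.
    """
    cleaned = re.sub(r"[\s\-/().]", "", raw)   # drop common separators
    if re.fullmatch(r"\+\d+", cleaned):
        return cleaned                          # already in international form
    if re.fullmatch(r"00\d+", cleaned):
        return "+" + cleaned[2:]                # international prefix 00
    if re.fullmatch(r"0[1-9]\d+", cleaned):
        return "+49" + cleaned[1:]              # trunk prefix 0 followed by area code
    return None                                 # subscriber number only

for entry in ("089123456", "123456"):
    converted = to_e164_de(entry)
    # Keep the entry exactly as typed when it cannot be converted.
    print(converted if converted is not None else entry)
```

A real implementation would still need the validity checks discussed above, since a string with the right prefix is not necessarily a dialable number.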

confirm conditional statement applies to >0 observations in Stata

This is something that has puzzled me for some time and I have yet to find an answer.
I am in a situation where I am applying a standardized data cleaning process to (supposedly) similarly structured files, one file for each year. I have a statement such as the following:
replace field="Plant" if field=="Plant & Machinery"
This line came from the original code, which was written against the data file for year 1. I then generalized the code to loop through the years of data. The problem arises if, in year 3, the analogous value in that variable is coded as "Plant and MachInery ": the line above would not make the intended change because the text string differs, but it would also not raise an error to alert me that the change was not made.
What I am after is some sort of confirmation that >0 observations actually satisfied the condition each time the code is executed in the loop, and an error otherwise. Any combination of trimming, removing spaces, and standardizing the text case is not an acceptable workaround. At the same time, I don't want to add a count if and then an assert statement before every conditional replace, as that becomes quite bulky.
Aside from going to the raw files to ensure the variable values are standardized, is there any way to do this validation "on the fly" as I have tried to describe? Maybe just write a custom program that combines a count if, assert and replace?
The idea has surfaced occasionally that replace should return the number of observations changed, but there are good reasons why it does not, notably that it is not an r-class or e-class command anyway, and it's quite important not to change the way it works because that could break innumerable programs and do-files.
So, I think the essence of any answer is that you have to set up your own monitoring process, counting how many values have been (or would be) changed.
One pattern is -- when working on a current variable:
gen was = .
foreach ... {
    ...
    replace was = current
    replace current = ...
    qui count if was != current
    <use the result>
}
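The logic of combining count, assert and replace into one step, as the question suggests, can be sketched in a few lines. The example is written in Python purely to illustrate the control flow; it is not Stata code. The helper name `replace_checked` and the two tiny datasets are hypothetical.

```python
def replace_checked(rows, field, old, new):
    """Replace old with new in rows[field]; raise if no row matched."""
    matches = [row for row in rows if row[field] == old]
    if not matches:
        # Equivalent in spirit to a failed assert after count if.
        raise ValueError(f"no observations with {field} == {old!r}")
    for row in matches:
        row[field] = new
    return len(matches)

year1 = [{"field": "Plant & Machinery"}, {"field": "Land"}]
year3 = [{"field": "Plant and MachInery "}, {"field": "Land"}]

print(replace_checked(year1, "field", "Plant & Machinery", "Plant"))

try:
    replace_checked(year3, "field", "Plant & Machinery", "Plant")
except ValueError as err:
    print("stopped:", err)
```

Wrapping the check and the replacement in one call keeps each cleaning step to a single line in the loop while still failing loudly when a year's file uses a different spelling.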
