Performance for adfuller and SARIMAX

This is something of a continuation of a previous post: I am trying to forecast weekly revenues. My program seems to hang on the adfuller test. It has completed before, and the series appeared stationary according to the p-value, but it does not complete consistently. I have also added SARIMAX, and that code just hangs as well. If I cancel the run, I periodically get a message towards the bottom of the output saying that the problem is unconstrained.
Data:
Week | Week_Start_Date | Amount | year
Week 1 | 2018-01-01 | 42920 | 2018
Week 2 | 2018-01-08 | 37772 | 2018
Week 3 | 2018-01-15 | 41076 | 2018
Week 4 | 2018-01-22 | 38431 | 2018
Week 5 | 2018-01-29 | 101676 | 2018
Code:
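from statsmodels.tsa.stattools import adfuller
from statsmodels.tsa.statespace.sarimax import SARIMAX
import matplotlib.pyplot as plt
# organic_search is the DataFrame holding the weekly data shown above
x = organic_search.groupby('Week_Start_Date').Amount.sum()
# Augmented Dickey-Fuller test
ad_fuller_result = adfuller(x)
print(f'ADF Statistic: {ad_fuller_result[0]}')
print(f'p-value: {ad_fuller_result[1]}')
# SARIMA Model
plt.figure(2)
# disp (not dis) controls how much optimizer output fit() prints
best_model = SARIMAX(x, order=(2, 1, 1), seasonal_order=(2, 1, 1, 52)).fit(disp=1)
print(best_model.summary())
best_model.plot_diagnostics(figsize=(15,12))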
I am only working with 185 or so rows, so I don't understand why the code just hangs. Any optimization suggestions are welcome (for adfuller and SARIMAX).

Fixed by passing organic_search['Amount'] instead of organic_search.groupby('Week_Start_Date').Amount.sum()


Create monthly trigger for Scheduled Task in Powershell (With additional criteria)

I'm currently working on a script that, when run, creates some scheduled tasks that make the host machine do several things and then restart within a specified time span.
This script needs to be run on multiple domain controllers, and therefore I would like to "load balance" by using something like New-ScheduledTaskTrigger -RandomDelay, so that they do not all reboot at once but spread the reboots out.
The goal is to be able to change some variables of when to restart, things like:
First Monday of the month between 18:00 and 23:59
Every Thursday between 01:00 and 06:00
Every day between 04:00 and ..... you see where I'm going
However, there is no such thing as a "-Monthly" parameter in New-ScheduledTaskTrigger.
That's the first problem. This one I can probably solve with the help of other posts, but if I do it, for example, like this, I'm not able to use -RandomDelay, which I think is a major feature for this to work.
Here is how I imagine it should look if the -Monthly did work (for a monthly trigger):
$rebootFrequency = MONTHLY # DAILY, WEEKLY, MONTHLY
$rebootWeek = FIRST # FIRST, SECOND, THIRD, FOURTH, LAST
$rebootDayOfWeek = MON # MON, TUE, WED, THU, FRI, SAT, SUN
$rebootTimeFrom = 10:00 # HH:MM[:SS]
$rebootTimeTo = 16:00 # HH:MM[:SS]
New-ScheduledTaskTrigger -"$rebootFrequency" -WeekOfMonth $rebootWeek;
-DayOfWeek $rebootDayOfWeek -At $rebootTimeFrom -RandomDelay $rebootTimeTo
Do you have any suggestions as to how I should solve this problem?
I could do the same thing with schtasks.exe; however, I would end up having to write some kind of script to do the "RandomDelay" part myself.
Feel free to ask further if you have any questions.
Thanks in advance.
Challenge 1
I've now got it to work, and I'm trying to make the script a bit more intuitive, but I can't figure out how I would do it...
What I want to do is to "convert" from using the numbers for days (for example: 16 for Thursday) to being able to write "THU" instead.
Right now it looks something like this:
$rebootDaysOfWeek = "16" # SUN=1, MON=2, TUE=4, WED=8, THU=16 etc.
$trigger.DaysOfWeek = $rebootDaysOfWeek
But I would find it a lot cooler if it was something like this:
$rebootDaysOfWeek = "THU" # SUN, MON, TUE, WED, THU, FRI, SAT
$trigger.DaysOfWeek = $rebootDaysOfWeek
But I can't seem to find a way to "convert" $rebootDaysOfWeek to work with the bit mask.
Check out the Microsoft Docs:
https://learn.microsoft.com/en-us/windows/win32/taskschd/time-trigger-example--scripting-
The sample is in VBScript, but it looks like it's just a COM object. I haven't had enough time to play around, but you can start like this:
$service = New-Object -ComObject Schedule.Service
$service.Connect()
$taskDefinition = $service.NewTask(0)
There's lots of task definition stuff, but it gets down to the triggers and you'll do this:
$triggers = $taskDefinition.Triggers
$trigger = $triggers.Create(5) # 5 = TASK_TRIGGER_MONTHLYDOW (monthly day-of-week trigger)
$trigger.DaysOfWeek = 16 # Thursday
$trigger.WeeksOfMonth = 1 # First week; 2 for second, 4 for third, 8 for fourth
$trigger.MonthsOfYear = 4095 # All months
$trigger.RandomDelay = 'PT1H' # 1 hour random delay
I'll let you take it from here. Links to some of the items above:
https://learn.microsoft.com/en-us/windows/win32/taskschd/monthlydowtrigger-daysofweek
https://learn.microsoft.com/en-us/windows/win32/taskschd/monthlydowtrigger-monthsofyear
https://learn.microsoft.com/en-us/windows/win32/taskschd/monthlydowtrigger-weeksofmonth
https://learn.microsoft.com/en-us/windows/win32/taskschd/monthlydowtrigger-randomdelay
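RandomDelay takes a duration string such as 'PT1H' rather than an end time, so a window like "between 18:00 and 23:59" has to be expressed as a start time plus a duration. A minimal sketch of that arithmetic follows, written in Python purely as an illustration. The function name window_to_delay is made up for this example, and it assumes both times are HH:MM strings on the same day.

```python
from datetime import datetime

def window_to_delay(start, end):
    """Turn a same-day HH:MM window into a duration string like 'PT5H59M'.

    Hypothetical helper for illustration; assumes end is later than start.
    """
    fmt = "%H:%M"
    span = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    seconds = int(span.total_seconds())
    if seconds <= 0:
        raise ValueError("end must be later than start on the same day")
    hours, remainder = divmod(seconds, 3600)
    minutes = remainder // 60
    # Only emit the components that are non-zero.
    return "PT" + (f"{hours}H" if hours else "") + (f"{minutes}M" if minutes else "")

print(window_to_delay("18:00", "23:59"))
```

The start of the window would then go into the trigger's start time and the returned string into RandomDelay.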
UPDATE FOR CHALLENGE 1
In order to use "friendly" references to the bitmask values, you can either create a constants section or use a hashtable. Either way, you are going to have to do the conversion yourself:
# Constants
$SUN = 1
$MON = 2
$TUE = 4
$WED = 8
$THU = 16
$FRI = 32
$SAT = 64
# Hashtable - because why not!
$DaysOfWeek = @{
SUN = 1
MON = 2
TUE = 4
WED = 8
THU = 16
FRI = 32
SAT = 64
}
Then you can use:
$trigger.DaysOfWeek = $THU
or
$trigger.DaysOfWeek = $DaysOfWeek["THU"]
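Because the values are bit flags, several days can be combined with a bitwise OR, which a plain lookup of one name does not show. Here is a small sketch of that idea in Python, just to illustrate the arithmetic. The names DAY_BITS and days_to_mask are made up for this example; the numeric values are the ones listed above.

```python
# Same name-to-bit mapping as the hashtable above.
DAY_BITS = {"SUN": 1, "MON": 2, "TUE": 4, "WED": 8, "THU": 16, "FRI": 32, "SAT": 64}

def days_to_mask(names):
    """Convert 'THU' or 'MON, THU' into the combined bitmask value."""
    mask = 0
    for name in names.replace(",", " ").split():
        mask |= DAY_BITS[name.upper()]  # OR the bit for each named day into the mask
    return mask

print(days_to_mask("THU"))
print(days_to_mask("mon, thu"))
```

An unknown abbreviation raises a KeyError here, which is probably what you want for a configuration value.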

Panel Data with time gap, How to create lag variable

I am dealing with panel data whose observations are separated by time gaps, and the gaps are not all the same length.
The Year variable takes the values 1980, 1990, 2000, 2010, 2015, and 2020.
As you can see, the gap is 10 years up to 2010, but five years from 2010 to 2020.
After setting up the panel data structure in Stata (using the xtset command), I wanted to use the time (lag) operator for my main variable of interest and my outcome variable. However, when I use L. in front of the variable name, Stata tells me no observations.
Doesn't it automatically take the previous time period?
Or do I have to create the lag variables manually?
What we need to know, but can't see, is exactly what code you used, specifically xtset. But it's possible to guess. Here I fabricate one panel; a structure with more panels doesn't show different problems.
clear
input Y Year
1 1980
2 1990
3 2000
4 2010
5 2015
6 2020
end
gen ID = 42
If you just specify panel and year variables, Stata expects unit spacing, so lag 1 with yearly data means "the previous year". Asking for a lag 1 variable is legal, but all values are missing.
xtset ID Year
gen lag1 = L1.Y
If you specify delta(5) then a lag 1 variable is missing in all but two observations.
xtset ID Year, delta(5)
gen lag5 = L1.Y
If you try delta(10) that won't work (unless you drop 2015).
xtset ID Year, delta(10)
You can also do this:
bysort ID (Year) : gen prev = Y[_n-1]
Bringing your results together
list , sep(0)
+------------------------------------+
| Y Year ID lag1 lag5 prev |
|------------------------------------|
1. | 1 1980 42 . . . |
2. | 2 1990 42 . . 1 |
3. | 3 2000 42 . . 2 |
4. | 4 2010 42 . . 3 |
5. | 5 2015 42 . 4 4 |
6. | 6 2020 42 . 5 5 |
+------------------------------------+
The no observations error message presumably comes from some other command.
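The difference between a lag in time units and "the previous observation" can be reproduced outside Stata. The following is a rough Python sketch using the same fabricated panel. The names time_lag and prev are made up for this illustration, and this is not how Stata implements its operators.

```python
years = [1980, 1990, 2000, 2010, 2015, 2020]
y = [1, 2, 3, 4, 5, 6]
by_year = dict(zip(years, y))

def time_lag(delta):
    """Value observed exactly `delta` years earlier, or None when no such year exists."""
    return [by_year.get(t - delta) for t in years]

# Previous observation in sort order, regardless of how far back it is.
prev = [None] + y[:-1]

print(time_lag(1))  # lag of one year: nothing was observed one year earlier
print(time_lag(5))  # lag of five years: only 2015 and 2020 have a match
print(prev)
```

The three lists correspond to the lag1, lag5 and prev columns in the listing above, with None standing in for Stata's missing value.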

Time ago in words convert into system date-time

Trying to convert strings like "9 weeks ago", "1 year, 6 months ago", and "20 hours ago" into a Ruby time object like Tue, 10 Mar 2015 12:06:15 PDT -07:00.
I've been doing this:
eval("10 days ago".gsub(' ', '.'))
This works fine, but it blows up for strings like "1 year, 6 months ago".
I just need to do comparisons like:
eval("10 days ago".gsub(' ', '.')) < (Time.now - 7.days)
I'm using Sinatra, so no fancy Rails helpers.
Please never use eval in production code.
Converting from timeago notation would be quite complex and resource intensive.
However, this way seems the least error prone: it converts a string like "5 seconds ago" to "5S" and uses a mapping to find what that means in time, after which it subtracts that time from the current time.
The parse string is built dynamically so it can accommodate almost any timeago notation.
require('date')
mapping = {"D"=> "%d","W"=>"%U","H"=>"%H","Y"=>"%Y","M"=>"%m","S"=>"%S"}
timerel = "1 year, 6 months ago".split(",").map { |n| n.gsub(/\s+/, "").upcase()[0,2].split('')}
Date.strptime(
timerel.map {|n| n[0]}.join(" "),
timerel.map {|n| mapping[n[1]]}.join(" ")
)
date = Date.new(0) + (Date.today - Date.strptime(timerel.map {|n| n[0]}.join(" "), timerel.map {|n| mapping[n[1]]}.join(" ")))
=> #<Date: 2014-10-10 ((2456941j,0s,0n),+0s,2299161j)>
It goes without saying that this is very error prone. Use at your own risk:
def parse(date:)
eval(date.gsub(/ ?(,|and) ?/, '+').tr(' ', '.').gsub(/^(.*)(\.ago)$/, '(\1)\2'))
end
parse(date: '1 year, 6 months ago') # => Wed, 10 Sep 2014 21:29:11 BST +01:00
parse(date: '1 year, 6 months, 3 weeks, 6 days, 9 hours and 12 seconds ago')
# => Thu, 14 Aug 2014 12:33:07 BST +01:00
The idea is to convert the original string to:
'(1.year+6.months).ago'
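For comparison, the same strings can be handled without eval by pulling out each "number unit" pair and subtracting it from a reference time. A minimal sketch of that approach in Python follows; the function name parse_time_ago is made up for this example. It makes its own assumptions about calendar arithmetic: months and years are stepped back on the calendar with the day clamped to the length of the target month, and everything else is a fixed-length timedelta.

```python
import calendar
import re
from datetime import datetime, timedelta

UNIT_PATTERN = re.compile(r"(\d+)\s*(second|minute|hour|day|week|month|year)s?")

def parse_time_ago(text, now):
    """Return `now` minus the offsets named in a string like '1 year, 6 months ago'."""
    months = 0
    delta = timedelta()
    for amount, unit in UNIT_PATTERN.findall(text.lower()):
        n = int(amount)
        if unit == "year":
            months += 12 * n
        elif unit == "month":
            months += n
        else:
            delta += timedelta(**{unit + "s": n})  # e.g. timedelta(weeks=9)
    # Step back whole months first, clamping the day to the target month's length.
    total = now.year * 12 + (now.month - 1) - months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    day = min(now.day, calendar.monthrange(year, month)[1])
    return now.replace(year=year, month=month, day=day) - delta

now = datetime(2015, 3, 10, 12, 6, 15)
print(parse_time_ago("1 year, 6 months ago", now))
print(parse_time_ago("9 weeks ago", now) < now - timedelta(days=7))
```

Passing the reference time in explicitly keeps the function easy to test; in real use it would be the current time.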

NEXT_DAY in Crystal Reports

Is there anything like the Oracle "NEXT_DAY" function available in the syntax that Crystal Reports uses?
I'm trying to write a formula to output the following Monday @ 9:00am if the datetime tested falls between Friday @ 9:00pm and Monday @ 9:00am.
So far I have
IF DAYOFWEEK ({DATETIMEFROMMYDB}) IN [7,1]
OR (DAYOFWEEK({DATETIMEFROMMYDB}) = 6 AND TIME({DATETIMEFROMMYDB}) in time(21,00,00) to time(23,59,59))
OR (DAYOFWEEK({DATETIMEFROMMYDB}) = 2 AND TIME({DATETIMEFROMMYDB}) in time(00,00,00) to time(08,59,59))
THEN ...
I know I can write separate IF statements to do a different amount of DateAdd for each of Fri, Sat, Sun, Mon, but if I can keep it concise by lumping all of these into one I would much prefer it. I'm already going to be adding additional rules for when the datetime falls outside of business hours on the other weekdays, so I want to do as much as possible to prevent this from becoming a very overgrown and ugly formula.
Since there is no CR equivalent that I know of, you can just cheat and borrow the NEXT_DAY() function from the Oracle database. You can do this by creating a SQL Expression and then entering something like:
-- SQL Expression {%NextDay}
(SELECT NEXT_DAY("MYTABLE"."MYDATETIME", 'MONDAY')
FROM DUAL)
then you could either use that directly in your formula:
IF DAYOFWEEK ({MYTABLE.MYDATETIME}) IN [7,1]
OR (DAYOFWEEK({MYTABLE.MYDATETIME}) = 6 AND TIME({MYTABLE.MYDATETIME}) in time(21,00,00) to time(23,59,59))
OR (DAYOFWEEK({MYTABLE.MYDATETIME}) = 2 AND TIME({MYTABLE.MYDATETIME}) in time(00,00,00) to time(08,59,59))
THEN DateTime(date({%NextDay}),time(09,00,00))
Or, the even better way would be to just stuff ALL of the logic into the SQL Expression and do away with the formula altogether.
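One detail worth checking with this approach: as far as I know, Oracle's NEXT_DAY returns the first matching weekday strictly after the given date, so a Monday 08:00 input would map to the Monday of the following week rather than the same day. The date arithmetic itself is small. Here is a sketch of it in Python, only to show the logic, not Crystal syntax. The function name next_monday_9am is made up for this example, and it uses Python's weekday numbering (Monday = 0), not Crystal's.

```python
from datetime import datetime, time, timedelta

def next_monday_9am(dt):
    """Map anything from Friday 21:00 up to Monday 09:00 onto Monday 09:00.

    Datetimes outside that window are returned unchanged.
    """
    wd = dt.weekday()  # Monday = 0 ... Sunday = 6
    in_window = (
        wd in (5, 6)                                   # Saturday or Sunday
        or (wd == 4 and dt.time() >= time(21, 0))      # Friday from 21:00
        or (wd == 0 and dt.time() < time(9, 0))        # Monday before 09:00
    )
    if not in_window:
        return dt
    days_ahead = (7 - wd) % 7  # 0 on Monday, 3 on Friday, 2 on Saturday, 1 on Sunday
    return datetime.combine(dt.date() + timedelta(days=days_ahead), time(9, 0))

print(next_monday_9am(datetime(2015, 3, 13, 22, 0)))  # a Friday evening
```

A single modulo expression replaces the four separate DateAdd cases, which is the kind of lumping together the question asks for.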
Considering that Sunday is 1:
The first number (7) is how far back we want to go:
7 = 1 week
14 = 2 weeks
The last number is the target day: 1 for Sunday, 2 for Monday, 3 for Tuesday
Last Sunday, 1 week ago:
Today - 7 + ( 7 - WEEKDAY(TODAY) )+1
Last Monday, 2 weeks ago:
Today - 14 + ( 7 - WEEKDAY(TODAY) )+2
So these two formulas give me MONDAY LAST WEEK and SUNDAY LAST WEEK.
EvaluateAfter({DATETIMEFROMMYDB}) ;
If DayOfWeek ({DATETIMEFROMMYDB}) In [crFriday,crSaturday,crSunday,crMonday]
then
IF DayOfWeek ({DATETIMEFROMMYDB}) In [crFriday]
AND TIME({DATETIMEFROMMYDB}) >= time(21,00,00)
then //your code here
Else if DayOfWeek ({DATETIMEFROMMYDB}) In [crSaturday,crSunday]
then //your code here
Else if DayOfWeek ({DATETIMEFROMMYDB})In [crMonday]
AND TIME({DATETIMEFROMMYDB}) < time(09,00,00)
then //your code here

algorithm for calculating a week # from a date with custom start of week? (for iCal)

I can only find algorithms for getting the ISO 8601 week number (where the week starts on a Monday).
However, the iCal spec says
A week is defined as a seven day period, starting on the day of the
week defined to be the week start (see WKST). Week number one of the
calendar year is the first week that contains at least four (4) days
in that calendar year.
Therefore, it is more complex than ISO 8601 since the start of week can be any day of the week.
Is there an algorithm to determine what is the week number of a date, with a custom start day of week?
or... is there a function in iCal4j that does this? Determine a weekno from a date?
Thanks!
p.s. Limitation: I'm using a JVM language that cannot extend a Java class, but I can invoke Java methods or instantiate Java classes.
if (input_date < firstDateOfTheYear(WKST, year))
{
return ((isLeapYear(year-1))?53:52);
}
else
{
return ((dayOfYear(input_date) - firstDateOfTheYear(WKST, year).day)/7 + 1);
}
firstDateOfTheYear returns the first calendar date of the year that falls on the given start of week (WKST), e.g. if WKST = Thursday and year = 2012, it returns Jan 5th.
dayOfYear returns the sequential day number within the year, e.g. Feb 1st = 32
Example #1: Jan 18th, 2012, start of week is Monday
dayOfYear(Jan 18th, 2012) = 18
firstDateOfTheYear(Monday, 2012) = Jan 2nd, 2012
(18 - 2)/7 + 1 = 3
Answer Week no. 3
Example #2: Jan 18th, 2012, start of week is Thursday
dayOfYear(Jan 18th, 2012) = 18
firstDateOfTheYear(Thursday, 2012) = Jan 5th, 2012
(18 - 5)/7 + 1 = 2
Answer Week no. 2
Example #3: Jan 1st, 2012, start of week is Monday
firstDateOfTheYear(Monday, 2012) = Jan 2nd, 2012
IsLeapYear(2012-1) = false
Jan 1st, 2012 < Jan 2nd, 2012
Answer Week no. 52
1. Let daysInFirstWeek be the number of days in the first week of the year that fall in January, where weeks start on the WKST day (e.g. if Jan 1st is a WKST day, the value is 7).
2. Set dayOfYear to the ordinal day of the input date within its year (e.g. Feb 1st = 32).
3. If dayOfYear is less than or equal to daysInFirstWeek:
3.1. If daysInFirstWeek is greater than or equal to 4, weekNo is 1; skip to step 5.
3.2. Let daysInFirstWeekOfLastYear be the number of days in the first week of the previous year that fall in January, where weeks start on the WKST day.
3.3. If daysInFirstWeekOfLastYear is 4, or last year is a leap year and daysInFirstWeekOfLastYear is 5, weekNo is 53; otherwise weekNo is 52. Skip to step 5.
4. Set weekNo to ceiling((dayOfYear - daysInFirstWeek) / 7).
4.1. If daysInFirstWeek is greater than or equal to 4, increment weekNo by 1.
4.2. If weekNo equals 53 and the number of days in the first week of January of the following year (weeks starting on WKST) is greater than or equal to 4, set weekNo to 1.
5. Return weekNo.
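The steps above can be sketched in Python roughly as follows. This is an illustration under a few assumptions, not a reference implementation: the function names are made up, wkst uses Python's weekday numbering (Monday = 0 ... Sunday = 6), and the condition in step 4.2 is read as "weekNo equals 53", since a week cannot contain 53 days.

```python
import calendar
import math
from datetime import date

def days_in_first_week(year, wkst):
    """Days of January before the first WKST day after Jan 1st (7 if Jan 1st is a WKST day)."""
    offset = (date(year, 1, 1).weekday() - wkst) % 7  # days since the week began
    return 7 - offset

def week_number(d, wkst=0):
    """Week number of date d when weeks start on weekday wkst."""
    first = days_in_first_week(d.year, wkst)            # step 1
    day_of_year = d.timetuple().tm_yday                 # step 2
    if day_of_year <= first:                            # step 3
        if first >= 4:
            return 1
        prev_first = days_in_first_week(d.year - 1, wkst)
        if prev_first == 4 or (calendar.isleap(d.year - 1) and prev_first == 5):
            return 53
        return 52
    week_no = math.ceil((day_of_year - first) / 7)      # step 4
    if first >= 4:
        week_no += 1
    if week_no == 53 and days_in_first_week(d.year + 1, wkst) >= 4:
        week_no = 1                                     # week belongs to next year
    return week_no

print(week_number(date(2012, 1, 18), wkst=0))  # weeks starting on Monday
```

With wkst=0 the result can be checked against date.isocalendar(), since ISO 8601 is the Monday case of the same four-day rule.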
