How to aggregate 1-minute data to 30-minute or 1-hour average data in DolphinDB - window

I have stock data ordered by minute, and I want to aggregate the 1-minute data into 30-minute or 1-hour average data. I apply context by on the stock code first, then call the rolling function to aggregate, but it doesn't work. The code is:
mins_data = select rolling(first, code, 15, 15) as Code,
    rolling(first, time_all, 15, 15) as time,
    rolling(first, open, 15, 15) as Open,
    rolling(max, high, 15, 15) as High,
    rolling(min, low, 15, 15) as Low,
    rolling(last, close, 15, 15) as Close,
    rolling(sum, volume, 15, 15) as Volume,
    rolling(sum, turnover, 15, 15) as Turnover
from mins_data context by code;
What should I do?

K-line (candlestick) data can be computed with 'group by code, date, bar(time,...)'. For specific examples, refer to the k-line tutorial.
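To illustrate the idea behind grouping by a time bar, here is a minimal sketch in plain Python rather than DolphinDB. It floors each timestamp to the start of its bucket and then takes first/max/min/last/sum per (code, bucket). The names bar_start and aggregate_bars and the row layout are assumptions made for this example, and the input is assumed to be sorted by time within each code.

```python
from datetime import datetime, timedelta

def bar_start(ts, width_minutes):
    """Floor a timestamp to the start of its width_minutes bucket (counted from midnight)."""
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    elapsed = int((ts - midnight).total_seconds() // 60)
    return midnight + timedelta(minutes=elapsed - elapsed % width_minutes)

def aggregate_bars(rows, width_minutes=30):
    """Aggregate 1-minute rows into OHLCV bars keyed by (code, bucket start).

    rows: iterable of dicts with keys code, time, open, high, low, close, volume,
    assumed sorted by time within each code (hypothetical layout).
    """
    bars = {}
    for r in rows:
        key = (r["code"], bar_start(r["time"], width_minutes))
        b = bars.get(key)
        if b is None:
            # First row of the bucket sets the open and seeds the other fields.
            bars[key] = {"open": r["open"], "high": r["high"], "low": r["low"],
                         "close": r["close"], "volume": r["volume"]}
        else:
            b["high"] = max(b["high"], r["high"])
            b["low"] = min(b["low"], r["low"])
            b["close"] = r["close"]      # last row seen so far in the bucket
            b["volume"] += r["volume"]
    return bars
```

Passing width_minutes=60 gives hourly bars. Unlike a fixed-size rolling window, the buckets here are aligned to clock time, so a missing minute does not shift later bars.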

Related

SSAS Tabular - how to aggregate differently at month grain?

In my cube, I have several measures at the day grain that I'd like to sum at the day grain but average (or take latest) at the month grain or year grain.
Example:
We have a Fact table with Date and number of active subscribers in that day (aka PMC). This is snapshotted per day.
dt        SubscriberCnt
1/1/22    50
1/2/22    55
This works great at the day level. At the month level, we don't want to sum these two values (count = 105) because that doesn't make sense and isn't accurate.
When someone is looking at the month grain, it should look like this: take the latest value for the month. (We may change this to an average instead; management is still deciding.)
Option 1 - Take latest
Month-Dt    Subscribers
Jan-2022    55
Feb-2022    -
Option 2 - Take average
Month-Dt    Subscribers
Jan-2022    52
Feb-2022    -
I've not been able to find the right search terms for this but this seems like a common problem.
I added some sample data at the end of a month for testing:
dt          SubscriberCnt
12/30/21    46
12/31/21    48
This formula uses LASTNONBLANKVALUE, which sorts by the first column and provides the latest value that is not blank:
Monthly Subscriber Count = LASTNONBLANKVALUE( 'Table'[dt], SUM('Table'[SubscriberCnt]) )
If you want an average instead, a simple AVERAGE formula will work. If you want the end-of-month value where it is available and an average only for a month that is still in progress, then try this:
Current Subscriber Count =
VAR _EOM = CLOSINGBALANCEMONTH( SUM('Table'[SubscriberCnt]), DateDim[Date] )
RETURN IF(_EOM <> 0, _EOM, AVERAGE('Table'[SubscriberCnt]) )
But the total row will be misleading, so I would add this so the total row is the latest number:
Current Subscriber Count =
VAR _EOM = CLOSINGBALANCEMONTH( SUM('Table'[SubscriberCnt]), DateDim[Date] ) //Get the number on the last day of the month
VAR _TOT = NOT HASONEVALUE(DateDim[MonthNo]) // Check if this is a total row (more than one month value)
RETURN IF(_TOT, [Monthly Subscriber Count], // For total rows, use the latest nonblank value
IF(_EOM <> 0, _EOM, AVERAGE('Table'[SubscriberCnt]) ) // For month rows, use final day if available, else use the average
)
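As a rough illustration of the two options from the question, outside of DAX, the following Python sketch computes both the latest non-blank value and the average per month from daily snapshots. The function name monthly_latest_and_average and the (date, count) input layout are assumptions for this example only.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

def monthly_latest_and_average(snapshots):
    """Return {(year, month): {"latest": ..., "average": ...}} from daily snapshots.

    snapshots: iterable of (date, count) pairs; a count of None is treated as blank
    and skipped (hypothetical layout for this sketch).
    """
    by_month = defaultdict(list)
    for day, count in snapshots:
        if count is not None:
            by_month[(day.year, day.month)].append((day, count))

    result = {}
    for month, rows in sorted(by_month.items()):
        rows.sort()  # order by date so the last entry is the latest non-blank day
        result[month] = {
            "latest": rows[-1][1],
            "average": mean(count for _, count in rows),
        }
    return result
```

With the sample rows from the question, January 2022 yields a latest value of 55 and an average of 52.5, which the question rounds to 52.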

Subtotal plus limiting data set

I'm brand-spankin' new to SQL and was asked to help write a query for a report. I need to limit the data to the last 10 services done by each clinician, and then subtotal the difference between the two times (time in and time out) for each clinician.
I'm guessing I need to do a "LIMIT" clause to limit the data, but I'm not sure how or where to put that information. I am also thinking I need to use "GROUP BY", but not positive on that either. Any help would be appreciated.
I tried simplifying the existing query that my boss started but I'm getting error messages about the GROUP BY clause because I don't have an aggregate.
Select CV.emp_name,
CV.Visittype,
CVt.clientvisit_id,
CV.client_id,
CV.rev_timein,
CV.rev_timeout,
Convert(varchar(25),Cast(CV.rev_timein As Time),8) As Start_Time,
CV.program_id,
CV.cptcode
From ClientVisit CV
Where CV.visittype = 'Mobile Therapy' And CV.program_id = 31
And CV.cptcode <> 'NB' And CV.rev_timein <=
Convert(datetime,IsNull(#param2, GetDate())) And CV.rev_timein >=
Convert(datetime,IsNull(#param1, GetDate())) And
Cast(CV.rev_timein As time) > '15:59'
Group By CV.emp_name,
CV.rev_timein

Power Query (M language) 50 day moving Average

I have a list of products and would like to get a 50-day simple moving average of their volume using Power Query (M).
The table is sorted by product name and date. I added a custom column and applied the code below.
if [date] >= #date(2018,1,29)
then List.Average(List.Range(Source[Volume],[Volume]-1,-50))
else ""
Since the table is already sorted by date and name, I applied an if statement with a date as the filter criterion. However, an error occurs that says:
'Volume' column not found in the table.
I expect an added column in Power Query holding the 50-day moving average of volume per product, with the calculation done only if the date is greater than or equal to Jan 29, 2018.
We don't know what your columns are, but assuming you have [product], [date] and [volume] in Source, this would average the last 50 days of [volume] for the identical [product] based on each [date], and place the result in a new column:
AvgAmountAdded = Table.AddColumn(Source, "AverageAmount", (i) => List.Average(Table.SelectRows(Source, each ([product] = i[product] and [date]<=i[date] and [date]>=Date.AddDays(i[date],-50)))[volume]), type number)
I finally found a solution.
First, add an index by product (see this post for further details).
Then add a second index without any criteria (index all rows).
Then apply the code below:
= Table.AddColumn(#"Previous Step", "Volume SMA(50)", each if [Index_byProduct] >= 50 then List.Average(List.Range(#"Previous Step"[Volume], ([Index_All]-50),50)) else 0),
For a large dataset, applying Table.Buffer after the index-expand step is recommended to improve Power Query calculation speed.
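The row-based approach above can be sketched in Python to make the windowing logic explicit. This is only an illustration under assumed names (trailing_sma, and rows given as (product, date, volume) tuples already sorted by product and date); it returns None rather than 0 until a product has a full window of rows.

```python
from collections import defaultdict, deque

def trailing_sma(rows, window=50):
    """Simple moving average of volume over the last `window` rows of each product.

    rows: list of (product, date, volume) tuples, assumed sorted by product and date.
    Returns a list aligned with rows; entries are None until the window is full.
    """
    recent = defaultdict(lambda: deque(maxlen=window))  # per-product sliding buffer
    averages = []
    for product, _day, volume in rows:
        buf = recent[product]
        buf.append(volume)  # the deque drops the oldest value once it is full
        averages.append(sum(buf) / window if len(buf) == window else None)
    return averages
```

Keeping one buffer per product avoids the window of one product leaking into the next, which is what the per-product index guards against in the Power Query version.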

DAX Average Issue

I have this table, and these are the measures I use to calculate the average:
Traded Contract(MTD) := TOTALMTD(SUM([Traded Contract]), 'TestTable'[Trading Date])
Average := [Traded Contract(MTD)]/SUM([Trading Days])
Currently the result of Average is correct at the daily level.
When I want to see the monthly average and don't filter by date, I get 9000/14 = 642, which is incorrect. I want to see 4425, which is the sum of the daily averages. How do I amend my Average measure to get the expected result?
I'm not entirely sure why you would want to do this since 4425 isn't really an average, but you can write your formula as follows:
Average = SUMX(VALUES(TestTable[Trading Date]),
[Traded Contract(MTD)] /
LOOKUPVALUE(TestTable[Trading Days],
TestTable[Trading Date],[Trading Date]))
For more information on how this sort of measure works, I suggest reading the following article:
Subtotals and Grand Totals That Add Up “Correctly”
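The difference between the two totals can be shown with a small Python sketch. The numbers in the usage note are hypothetical, since the original table is not available, and the function names are made up for this example. The first function mimics evaluating the measure once at the total level; the second mimics iterating over each trading date and summing the per-date results, which is what the SUMX version does.

```python
def naive_grand_total(rows):
    """Evaluate the ratio once at the total level.

    rows: list of (mtd_contracts, trading_days) per trading date, in date order.
    At the total level the month-to-date value is the one on the last date,
    divided by the sum of all trading-day values.
    """
    last_mtd = rows[-1][0]
    return last_mtd / sum(days for _, days in rows)

def sum_of_daily_averages(rows):
    """Evaluate the ratio per trading date, then add the results up."""
    return sum(mtd / days for mtd, days in rows)
```

For hypothetical rows [(100, 1), (300, 2), (600, 3)], the first function gives 600 / 6 = 100.0 and the second gives 100 + 150 + 200 = 450.0.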

Two (seemingly) identical queries, one is faster, why?

These two queries look identical (as far as a newbie like me can tell), but the first is faster overall in partial template rendering time; nothing changed except the ids statement. Also, when testing in the Rails console, the latter visibly runs a query and the former does not. I don't understand why, or why the first statement is a few ms faster than the second, though I guess it is due to the shorter method chain producing the same result.
UPDATE: My bad. They are not running the same query, but it still is interesting how a select on all columns is faster than a select on one column. Maybe it is a negligible difference compared to the method chaining though.
ids = current_user.activities.map(&:person_id).reverse
SELECT "activities".* FROM "activities" WHERE "activities"."user_id" = 1
SELECT "people".* FROM "people" WHERE "people"."id" IN (1, 4, 12, 15, 3, 14, 17, 10, 5, 6)
Rendered activities/_activities.html.haml (7.4ms)
ids = current_user.activities.order('id DESC').select{person_id}.map(&:person_id)
SELECT "activities"."person_id" FROM "activities" WHERE "activities"."user_id" = 1 ORDER BY id DESC
SELECT "people".* FROM "people" WHERE "people"."id" IN (1, 4, 12, 15, 3, 14, 17, 10, 5, 6)
Rendered activities/_activities.html.haml (10.3ms)
The purpose of the statement is to retrieve the foreign key reference to people in the order in which they appeared in the activities table, (on its PK).
Note: I use Squeel for SQL.
In the first query you chained .map and .reverse, while in the second you used .order('id DESC') and .select{person_id}, which are unnecessary if you simply add .reverse.
