Could anyone please explain how the following code executes, and what the PRECEDING keyword means in Oracle?
SUM(WIN_30_DUR) OVER(PARTITION BY AGENT_MASTER_ID
ORDER BY ROW_DT ROWS BETWEEN 30 PRECEDING AND 1 PRECEDING)
Hey, thanks for your clarification. I have a small doubt.
Let's say we have 59 days of data, from 1st Jan to 28th Feb. What data does this function get?
You are evidently querying a table T with columns WIN_30_DUR, AGENT_MASTER_ID and ROW_DT (among others). Keywords like OVER and PARTITION BY show that you're using an analytic function: such functions let you compute information for the current row from other rows, which would be complex and long to write with GROUP BY or other "standard" clauses.
Here, on a given row, you:
group (PARTITION) by AGENT_MASTER_ID: this gets all the rows of T with current AGENT_MASTER_ID
in the partition formed you ORDER rows by ROW_DT
this ordering allows you to select the 30 rows before the current row: this is the meaning of the PRECEDING keyword (0 PRECEDING is the current row itself; the opposite direction is the FOLLOWING keyword)
then you do a sum on the WIN_30_DUR field
In plain language, this means something like: for each agent, take the sum of the durations of the 30 preceding rows. That equals the preceding 30 days only if there is exactly one row per agent per day.
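To make the frame concrete, here is a minimal Python sketch that mimics ROWS BETWEEN 30 PRECEDING AND 1 PRECEDING over the 59 days of data asked about above. The function name `sum_preceding_rows` and the sample values (one row per day, duration 1) are assumptions for illustration, not Oracle's implementation.

```python
from datetime import date, timedelta

def sum_preceding_rows(rows, start=30, end=1):
    """rows: list of (agent_master_id, row_dt, win_30_dur) tuples.

    Returns {(agent_master_id, row_dt): sum or None}, mimicking
    SUM(...) OVER (PARTITION BY agent ORDER BY row_dt
                   ROWS BETWEEN start PRECEDING AND end PRECEDING).
    """
    partitions = {}
    for agent, row_dt, dur in rows:                 # PARTITION BY
        partitions.setdefault(agent, []).append((row_dt, dur))

    result = {}
    for agent, part in partitions.items():
        part.sort()                                 # ORDER BY row_dt
        for i, (row_dt, _) in enumerate(part):
            # Frame: rows i-start .. i-end, clipped at the partition start.
            window = part[max(0, i - start): max(0, i - end + 1)]
            # An empty frame yields NULL in SQL; None stands in for it here.
            result[(agent, row_dt)] = sum(d for _, d in window) if window else None
    return result

# Hypothetical data: agent 1, one row per day from 1 Jan to 28 Feb 2013 (59 rows).
rows = [(1, date(2013, 1, 1) + timedelta(days=k), 1) for k in range(59)]
sums = sum_preceding_rows(rows)
print(sums[(1, date(2013, 1, 1))])    # first row: no preceding rows
print(sums[(1, date(2013, 1, 2))])    # one preceding row
print(sums[(1, date(2013, 2, 28))])   # 30 preceding rows: 29 Jan to 27 Feb
```

With one row per day, the last row sums the 30 rows from 29 Jan to 27 Feb; the first row has an empty frame.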
select row_dt, win_30_dur,
agent_master_id,
SUM(WIN_30_DUR) OVER(PARTITION BY AGENT_MASTER_ID
ORDER BY ROW_DT ROWS BETWEEN 30 PRECEDING AND 1 PRECEDING) running_sum
from test;
ROWS BETWEEN 0 PRECEDING AND 0 PRECEDING would restrict the window to the current row only. The window is evaluated within the partition defined by the column AGENT_MASTER_ID in your table, with the rows ordered by ROW_DT.
So your query returns, for each row, the sum of the WIN_30_DUR values of the rows that are between 30 and 1 rows above the current row for the same AGENT_MASTER_ID.
For a better understanding, see here: http://sqlfiddle.com/#!4/ce6b4/4/0
ROWS BETWEEN is the windowing clause. It specifies which rows are considered when evaluating the analytic function.
Breaking down the clauses,
PARTITION BY AGENT_MASTER_ID : The rows are partitioned by agent_master_id. That means, while evaluating the function for a particular row, only those rows are considered which have the same agent_master_id as the current row.
ORDER BY ROW_DT : The column by which the rows are ordered within each partition.
ROWS BETWEEN 30 PRECEDING AND 1 PRECEDING : This specifies within each partition, consider only those rows starting from the row which precedes the current row by 30, till the row which precedes the current row by 1. Essentially, 30 previous rows.
For explanation purposes, let's assume this is what your table looks like. Under sum_as_analytical I have noted which rows are included when calculating the SUM.
agent_master_id  win_30_dur  row_dt      sum_as_analytical
---------------------------------------------------------------------
1                12          01-01-2013  no preceding rows. Sum is null
1                10          02-01-2013  only 1 preceding row. sum = 12
1                14          03-01-2013  only 2 preceding rows. sum = 12 + 10
1                10          04-01-2013  3 preceding rows. sum = 12 + 10 + 14
.                .
.                .
.                .
1                10          30-01-2013  29 preceding rows. sum = 12 + 10 + 14 .... until value for 29-01-2013
1                10          31-01-2013  30 preceding rows. sum = 12 + 10 + 14 .... until value for 30-01-2013
1                20          01-02-2013  30 preceding rows. sum = 10 + 14 + 10 .... until value for 31-01-2013
.                .
.                .
.                .
1                10          28-02-2013  30 preceding rows. sum = sum of values from 29th Jan to 27th Feb
2                10          01-01-2013  no preceding rows. Sum is null
2                15          02-01-2013  only 1 preceding row. sum = 10
2                14          03-01-2013  only 2 preceding rows. sum = 10 + 15
2                12          04-01-2013  3 preceding rows. sum = 10 + 15 + 14
.                .
.                .
.                .
2                23          31-01-2013  30 preceding rows. sum = 10 + 15 + 14 .... until value for 30-01-2013
2                12          01-02-2013  30 preceding rows. sum = 15 + 14 + 12 .... until value for 31-01-2013
.                .
.                .
.                .
2                25          28-02-2013  30 preceding rows. sum = sum of values from 29th Jan to 27th Feb
A few other examples of the windowing clause:
UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING : All preceding rows, current row, all following rows.
2 PRECEDING and 5 FOLLOWING : 2 preceding rows, current row and 5 following rows.
5 PRECEDING and CURRENT ROW : 5 preceding rows and current row.
CURRENT ROW and 1 FOLLOWING : Current row, 1 following row.
The windowing clause is optional. If you omit it but keep the ORDER BY, the default in Oracle is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, which essentially gives the cumulative total.
Here's a simple demo.
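I found a solution by materializing the result into a list:
var BOS = Orders1.ToList();
decimal running_total = 0;
var result_set =
(from x in BOS
select new
{
DESKTOPS = x.NOTEBOOKS,
// the assignment expression updates running_total and yields the new value
running_total = (running_total = (decimal)(running_total + x.NOTEBOOKS))
}).ToList(); // enumerate once, so the total is not accumulated again on re-enumeration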
I currently have three columns in report builder that look like this.
PU PI LO Total SUM
0 13 31 44
The Total Sum column is an expression that sums the first three columns with =Fields!Put_Away.Value+Fields!Picked.Value+Fields!Loaded.Value. I now want to create one more column that takes the sum of those three fields and divides it by 5. How do I do this? I tried =Fields!PU.Value+Fields!PI.Value+Fields!LO.Value/5 but it gives me 19.2 as the result for the example above.
You need to use brackets.
Currently you are doing =Fields!Put_Away.Value+Fields!Picked.Value+Fields!Loaded.Value/5, which converts to 0 + 13 + 31 / 5, or if we include the inferred brackets, 0 + 13 + (31/5).
You want =(Fields!Put_Away.Value+Fields!Picked.Value+Fields!Loaded.Value)/5, which becomes (0 + 13 + 31)/5
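The same precedence rule can be checked quickly outside the report. This is a small Python sketch using the example values from the question; the variable names are placeholders for the report fields.

```python
# Example values from the question: PU = 0, PI = 13, LO = 31.
put_away, picked, loaded = 0, 13, 31

# Division binds tighter than addition, so only `loaded` is divided by 5.
without_parens = put_away + picked + loaded / 5

# Parentheses force the sum to be evaluated first, then divided.
with_parens = (put_away + picked + loaded) / 5

print(round(without_parens, 2), round(with_parens, 2))
```

The first expression reproduces the unexpected 19.2; the second gives 44 / 5.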
I have a column containing many price rows. I want to compute the sum of the first 25 rows, then the sum of the next 25 rows, and so on. I've done this, but it only works for the first list:
select sum(substr(UNICOLON,-14,10)) over (order by to_number(:P22_UNI) rows between current row and 24 following) as R
from CNAM_CONCAT;
Report is the sum of the 25 rows in the page 1/2 and TOTAL A PAYER is the sum of all rows
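One thing to note about the query above: a frame of ROWS BETWEEN CURRENT ROW AND 24 FOLLOWING slides forward one row at a time, so it produces overlapping sums rather than one sum per block of 25. Non-overlapping blocks need a block number per row, for example floor((row_number - 1) / 25). The following Python sketch only illustrates that idea; the function name and the sample prices are made up.

```python
def block_sums(prices, block_size=25):
    """Sum consecutive, non-overlapping blocks of `block_size` values.

    The last block may be shorter if len(prices) is not a multiple
    of block_size.
    """
    return [
        sum(prices[start:start + block_size])
        for start in range(0, len(prices), block_size)
    ]

# Hypothetical prices 1..60: two full blocks of 25 and a final block of 10.
prices = list(range(1, 61))
per_block = block_sums(prices)
grand_total = sum(prices)
print(per_block, grand_total)
```

Here `per_block` plays the role of the per-page sums and `grand_total` the role of the overall total.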
I have a matrix and I want to find the maximum value in each column, then find the index of the row of that maximum value.
A = magic(5)
A =
17 24 1 8 15
23 5 7 14 16
4 6 13 20 22
10 12 19 21 3
11 18 25 2 9
[~,colind] = max(max(A))
colind =
3
returns colind as the index of the column that contains the maximum value. If you want the row, take the per-column row indices and pick the entry for that column:
[~,rowind] = max(A);
rowind(colind)
ans =
5
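You can also do this with a fairly simple loop.
MaximumVal = -Inf;           % start below any possible value
for i = 1:numel(array)       % i is a linear index into array
if array(i) > MaximumVal
MaximumVal = array(i);
Indices = i;
end
end
MaximumVal
Indices                      % use ind2sub(size(array), Indices) to get row and column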
Another way to do this would be to use find. You can output the row and column of the maximum element immediately without invoking max twice as per your question. As such, do this:
%// Define your matrix
A = ...;
% Find row and column location of where the maximum value is
[maxrow,maxcol] = find(A == max(A(:)));
Also, note that if several elements share the same maximum value, this outputs the rows and columns of all of them, whereas max returns only one location.
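For comparison, here is the same "find every location of the maximum" idea as a plain Python sketch on the magic(5) matrix from the question. The helper name `argmax_2d` is made up, and positions are reported 1-based to match MATLAB.

```python
# The magic(5) matrix from the question, as a list of rows.
A = [
    [17, 24,  1,  8, 15],
    [23,  5,  7, 14, 16],
    [ 4,  6, 13, 20, 22],
    [10, 12, 19, 21,  3],
    [11, 18, 25,  2,  9],
]

def argmax_2d(matrix):
    """Return (max_value, [(row, col), ...]) with 1-based positions.

    Every position holding the maximum is returned, so ties are kept.
    """
    best = max(max(row) for row in matrix)
    positions = [
        (r + 1, c + 1)
        for r, row in enumerate(matrix)
        for c, value in enumerate(row)
        if value == best
    ]
    return best, positions

max_value, positions = argmax_2d(A)
print(max_value, positions)
```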
I have something like the following:
a = [1 11; 2 16; 3 9; 4 13; 5 8; 6 14];
b = a;
n = length(a);
Sum = [];
for i=1:1:n,
Sum = b(i,2)+b(i+1:1:n,2)
end
b =
1 11
2 16
3 9
4 13
5 8
6 14
For the first iteration I am looking to find the first combination of values in the second column which are between 19 and 25.
Sum =
27
20
24
19
25
Since 20 is the first such combination (rows 1 and 3), I would like to remove that data and start a new matrix, or flag it as the first combination (i.e. place a 1 next to it by creating a third column).
The next step would be to sum the values which are still in the matrix with row 2 value:
Sum =
29
24
30
Then 2&5 would be combined.
However, I would like to allow not only pairs to be combined but also several rows if possible.
Is there something I am overlooking that may simplify this problem?
I don't think you're going to simplify this very much. It's a variation on the knapsack problem, which is NP-hard. The best algorithm to use might depend on the size of your inputs.
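For small inputs, a greedy depth-first search is one workable option. The sketch below is written in Python and is not an optimal solver: it takes the first remaining row, looks for the first combination (in row order) whose sum falls in the target range, removes those rows and repeats. It assumes all values are positive, and the function names are invented for illustration.

```python
def first_group(values, available, start, lo, hi):
    """First subset (in index order) containing `start` with lo <= sum <= hi.

    Assumes positive values, so a partial sum above `hi` can be pruned.
    Returns a list of 0-based indices, or None if no subset exists.
    """
    def extend(chosen, total, candidates):
        if lo <= total <= hi:
            return chosen
        for pos, idx in enumerate(candidates):
            new_total = total + values[idx]
            if new_total > hi:          # adding more would only overshoot further
                continue
            found = extend(chosen + [idx], new_total, candidates[pos + 1:])
            if found is not None:
                return found
        return None

    rest = [i for i in available if i != start]
    return extend([start], values[start], rest)

def greedy_groups(values, lo, hi):
    """Repeatedly group the first remaining row; returns 1-based row numbers."""
    available = list(range(len(values)))
    groups = []
    while available:
        start = available[0]
        group = first_group(values, available, start, lo, hi)
        if group is None:
            available.remove(start)     # this row cannot be grouped; skip it
        else:
            groups.append([i + 1 for i in group])
            available = [i for i in available if i not in group]
    return groups

# Second column of the matrix from the question, target range 19..25.
values = [11, 16, 9, 13, 8, 14]
groups = greedy_groups(values, 19, 25)
print(groups)
```

On the question's data this reproduces the pairing described there (rows 1 and 3, then rows 2 and 5), and it will also return groups of three or more rows when a pair does not fit. Being greedy, it does not guarantee the largest possible number of groups.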
This is my code:
data INDAT8; set INDAT6;
Array myarray{24,27};
goodgroups=0;
do i=2 to 24 by 2;
do j=2 to 27;
if myarray[i,j] gt 1 then myarray[i+1,j] = 'bad';
else if myarray[i,j] eq 1 and myarray[i+1,j] = 1 then myarray[i+1,j]= 'good';
end;
end;
run;
proc print data=INDAT8;
run;
Problem:
I have the data in this format- it is just an example: n=2
X Y info
2 1 good
2 4 bad
3 2 good
4 1 bad
4 4 good
6 2 good
6 3 good
The above data is sorted (7 rows in total). I need to make groups of 2, 3 or 4 rows separately and generate a graph. In the above data, I made groups of 2 rows. The third record (X = 3) is left alone, as there is no other record with the same X to form a group with. A group can be formed only among records with the same X value, NOT across different X values.
Next, I check whether both rows have “good” in the info column. If both rows have “good”, the group is good, otherwise it is bad. In the above example, the third (last) group is a “good” group; the rest are bad groups. Once I’m done with all the rows, I calculate the number of good groups divided by the total number of groups.
In the above example, the output will be: Total no. of good groups/Total no. of groups => 1/3.
This is the case of n=2(size of group)
Now, for n=3, we make groups of 3 rows, and for n=4, groups of 4 rows, and find the good/bad groups in a similar way. If all the rows in a group have “good”, the result is a good block, otherwise bad.
Example: n= 3
2 1 good
2 4 bad
2 6 good
3 2 good
4 1 good
4 4 good
4 6 good
6 2 good
6 3 good
In the above case, I left the 4th row and last 2 rows as I can’t make group of 3 rows with them. The first group result is “bad” and last group result is “good”.
Output: 1/ 2
For n= 4:
2 1 good
2 4 good
2 6 good
2 7 good
3 2 good
4 1 good
4 4 good
4 6 good
6 2 good
6 3 good
6 4 good
6 5 good
In this case, I make groups of 4 and find the result. The 5th, 6th, 7th and 8th rows are left behind (ignored). I made 2 groups of 4 rows and both are “good” blocks.
Output: 2/2
So, after getting the 3 output values from n=2, n=3 and n=4, I will plot a graph of these values.
If you can help in any language using arrays, if statements and do loops, that would be great.
I can change my code accordingly.
Update:
The answer for this doesn't have to be in sas. Since it is more algorithm-related than anything, I will accept suggestions in any language as long as they show how to accomplish this using arrays and do.
I am having trouble understanding your problem statement, but from what I can gather, here is what I can suggest:
Place the data into bins and then process the summary data.
Implementation 1
Assumption: You don't know what the range of the first column will be, or the distribution will be sparse
Create a hash table. The key will be the item you are grouping on. The value will be the count seen so far.
Process each record. If the key already exists, increment the count (the value for that key in the hash). Otherwise add the key and set the value to 1.
Continue until you have processed all records
Count the number of keys in the hash table and the number of values that are greater than your threshold.
Implementation 2
Assumption: You know the range of the first column and the distribution is reasonably dense
Create an array of integers with enough elements so the index can match the column value. Initialize all elements to zero. This array will hold your count for each item you are grouping on
Process each record. Examine value of first column. Increment corresponding index in array. (So if you have "2 1 good", do groupCount[2]++)
Continue until you have processed all records
Walk each element in the array. Count how many items are non zero (meaning they appeared at least once) and how many items meet your threshold.
You can use the same approach for gathering the good and bad counts.
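A minimal Python sketch of the hash-table variant, extended to track good and bad, might look like this. It rests on one assumption that the question does not settle: a group is formed only when an X value has exactly n records. The function name `good_group_ratio` is made up.

```python
from collections import defaultdict

def good_group_ratio(records, n):
    """records: iterable of (x, y, info) tuples; n: required group size.

    Bins the info flags by x (the hash table), then counts the bins that
    contain exactly n records (assumption) and how many of those are
    all "good". Returns (good_groups, total_groups).
    """
    by_x = defaultdict(list)
    for x, _y, info in records:
        by_x[x].append(info)

    total = good = 0
    for infos in by_x.values():
        if len(infos) != n:             # too few (or too many) records: no group
            continue
        total += 1
        if all(info == "good" for info in infos):
            good += 1
    return good, total

# The n=2 and n=3 examples from the question.
records_n2 = [
    (2, 1, "good"), (2, 4, "bad"), (3, 2, "good"), (4, 1, "bad"),
    (4, 4, "good"), (6, 2, "good"), (6, 3, "good"),
]
records_n3 = [
    (2, 1, "good"), (2, 4, "bad"), (2, 6, "good"), (3, 2, "good"),
    (4, 1, "good"), (4, 4, "good"), (4, 6, "good"),
    (6, 2, "good"), (6, 3, "good"),
]
ratio_n2 = good_group_ratio(records_n2, 2)
ratio_n3 = good_group_ratio(records_n3, 3)
print(ratio_n2, ratio_n3)
```

The returned pairs correspond to the 1/3 and 1/2 outputs in the question, and can be divided to get the values to plot.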