How to create a line chart of percentages in AWS QuickSight - amazon-quicksight

I'm trying to create a line chart in QuickSight, with "date" on the x-axis and the cumulative percentage of treatments on the y-axis. I created a "running sum" field that sums the number of treatments by date. How do I calculate the percentage out of all participants, including those without treatment? Note that the no-treatment participants do not have a date.
id   date      treatment  running sum
abc  2/2/2000  1          1
def            0          1
hij  4/4/2000  1          2
klm  5/5/2000  1          3
nop            0          3
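One way to think about the calculation: the denominator is the count of all participants, including those with no date, while the running sum only advances on dated rows. The Python sketch below only illustrates that arithmetic on the sample rows above. It is not QuickSight syntax, and the names `rows` and `cumulative_percentage` are hypothetical.

```python
from datetime import date

# Sample rows from the table above: (id, date or None, treatment).
rows = [
    ("abc", date(2000, 2, 2), 1),
    ("def", None, 0),
    ("hij", date(2000, 4, 4), 1),
    ("klm", date(2000, 5, 5), 1),
    ("nop", None, 0),
]


def cumulative_percentage(rows):
    """Return [(date, percent)]: running sum of treatments over ALL participants."""
    total_participants = len(rows)  # includes participants without a date
    dated = sorted((r for r in rows if r[1] is not None), key=lambda r: r[1])
    running = 0
    points = []
    for _id, day, treatment in dated:
        running += treatment
        points.append((day, 100.0 * running / total_participants))
    return points


for day, pct in cumulative_percentage(rows):
    print(day.isoformat(), f"{pct:.0f}%")
```

With five participants and three treatments, the line climbs to 60% rather than 100%, which is the behaviour the question asks for.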

Related

Trouble specifying datasummary() formula

I am getting weird output from my datasummary code. The idea is to create a table that shows the mean and SD for numeric variables and the number of observations for the full sample. I also want to display the shares for the two levels of a binary factor variable. Currently, I get the SD and mean for the only numeric variable (which makes sense), but N is also shown only for the numeric variable. The N shown is also not the number of observations, but the first number in the numeric variable vector. This is my current code:
age is the numeric variable.
v2 - v4 are factor variables.
obama is a factor variable for which I want the table to show the share of each of its 2 levels.
datasummary(
  formula = age + (`educated parent` = education) + religion + sex ~
    Heading("Entire sample") * 1 * (Mean + SD + N) + obama * Percent(),
  fmt = 3,
  data = data,
  title = 'Table 1: Votes for Obama in 2012 - Summary statistics',
  notes = c('1 = voted for Obama',
            'educated parent: 1 = at least one parent has a degree',
            'Source: General social survey')
)
I am getting these warnings:
Warning messages:
1: Summary statistic is length 1693
2: Summary statistic is length 1261
3: Summary statistic is length 432
4: Summary statistic is length 335
5: Summary statistic is length 379
6: Summary statistic is length 123
7: Summary statistic is length 856
8: Summary statistic is length 728
9: Summary statistic is length 965
These are the values I want displayed under the "N" column.
The table I get as output looks like this:
Table 1:
                             0        1
age                          37.507   62.493
educated parent  0           27.998   46.486
                 1            9.510   16.007
religion         None         3.662   16.125
                 Catholic     8.919   13.467
                 Other        1.713    5.552
                 Protestant  23.213   27.348
sex              Male        18.252   24.749
                 Female      19.256   37.744
1 = voted for Obama
educated parent: 1 = at least one parent has a degree
Source: General social survey
The data is taken from gss_sm in the socviz package. I have created a new religion variable and a new education variable. Religion is a 4-level factor, and education is a 2-level factor.
I have tried writing my own n function,
`n <- function() {
  if (class(x) != "numeric") {
    n <- length(x)
  }
  else {
    n <- sum(!is.na(x))
  }
  formatC(n, digits = 0)
}
`
and plugging that in in place of "N".
It seems as if it is the N function that isn't working.

DAX measure - sum of percent of total by group with condition

For simplicity's sake, I have the following dummy data:
id val
1 5
1 30
1 50
1 15
2 120
2 60
2 10
2 10
My desired output is the following:
id SUM_GT_10%
1 95%
2 90%
SUM_GT_10% can be obtained by the following steps:
1. Calculate the sum of val for each id
2. Divide val by the result of step 1
3. Sum the results of step 2 where the step 2 result is > 10%
Using the example data, the sum of val is 100 for id 1 and 200 for id 2, so we would obtain the following additional columns (named 1 and 2 after the steps that produce them):
id val 1 2
1 5 100 5%
1 30 100 30%
1 50 100 50%
1 15 100 15%
2 120 200 60%
2 60 200 30%
2 10 200 5%
2 10 200 5%
And our final output (step 3) would be the sum of the step 2 values where the step 2 value is > 10%:
id SUM_GT_10%
1 95%
2 90%
I don't care about the intermediate columns, just the final output, of course.
James, you might want to create a temporary table in your measure and then sum its results:
tbl_SumVAL =
var ThisId = MAX(tbl_VAL[id])
var temp =
    FILTER(
        SELECTCOLUMNS(
            tbl_VAL,
            "id", tbl_VAL[id],
            "GT%", tbl_VAL[val] / SUMX(FILTER(tbl_VAL, tbl_VAL[id] = ThisId), tbl_VAL[val])
        ),
        [GT%] > 0.1
    )
return
    SUMX(temp, [GT%])
The temp table is basically recreating two steps that you have described (divide "original" value by the sum of all values for each ID), and then leaving only those values that are greater than 0.1. Note that if your id is not a number, then you'd need to replace MAX(tbl_VAL[id]) with SELECTEDVALUE(tbl_VAL[id]).
The final result looks like this: [screenshot: final result]
Also, if id is a number, make sure to set your id field to "Don't summarize".
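The three steps from the question can also be checked outside Power BI. The Python sketch below only verifies the expected numbers on the dummy data. It is not DAX, and the function name `sum_shares_above` and its `threshold` parameter are hypothetical.

```python
from collections import defaultdict

# Dummy data from the question: (id, val).
rows = [(1, 5), (1, 30), (1, 50), (1, 15), (2, 120), (2, 60), (2, 10), (2, 10)]


def sum_shares_above(rows, threshold=0.10):
    """For each id, sum the shares of val that are strictly above threshold."""
    # Step 1: total of val per id.
    totals = defaultdict(int)
    for id_, val in rows:
        totals[id_] += val
    # Steps 2 and 3: compute each share and keep only those above the threshold.
    result = {id_: 0.0 for id_ in totals}
    for id_, val in rows:
        share = val / totals[id_]
        if share > threshold:
            result[id_] += share
    return result


result = sum_shares_above(rows)
for id_, share in sorted(result.items()):
    print(id_, f"{share:.0%}")
```

The per-id totals play the role of the `SUMX(FILTER(...))` denominator in the measure above, and the `share > threshold` test corresponds to `[GT%] > 0.1`.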

repeating set row values for unique column values

The data format I have is shown below.
When I use s2 <- fill_(s1, c("Time")), it fills each blank with the last seen value.
However, I would like all of the Time values listed below to repeat for each value of Animal:
Group  Animal  Sex  Time
1      1001    M    0
                    4
                    8
                    24
                    48
1      1002    M
1      1003    M
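The desired result is every animal paired with every listed Time value. The Python sketch below illustrates that reshaping under two assumptions: blank cells are represented as None, and each animal should receive the full set of distinct Time values. The function name `repeat_times_per_animal` is hypothetical, and this is not the R solution the question asks about.

```python
# Rows as in the table above; None marks a blank cell.
rows = [
    (1, 1001, "M", 0),
    (None, None, None, 4),
    (None, None, None, 8),
    (None, None, None, 24),
    (None, None, None, 48),
    (1, 1002, "M", None),
    (1, 1003, "M", None),
]


def repeat_times_per_animal(rows):
    """Pair every distinct animal with every distinct Time value, in order."""
    times = []
    animals = []
    for group, animal, sex, time in rows:
        if time is not None and time not in times:
            times.append(time)
        if animal is not None and (group, animal, sex) not in animals:
            animals.append((group, animal, sex))
    return [(g, a, s, t) for (g, a, s) in animals for t in times]


expanded = repeat_times_per_animal(rows)
for row in expanded:
    print(*row)
```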

TIBCO Spotfire - How to Calculate only the last 3 columns in a Data - see descr

Week Sales
1 100
2 250
3 350
4 145
5 987
6 26
7 32
8 156
I want to calculate the sales for only the last 3 weeks, so the total will be 156+32+26.
If new weeks are added, it should automatically use only the data from the last 3 rows.
I tried this formula, but it returns an incorrect sum:
sum(sales) over (lastperiod(3(week))
https://i.stack.imgur.com/6Y7h7.jpg
If you want only the sum of the last 3 weeks in a calculated column, you can use a simple If expression:
If([week]>(Max([week]) - 3),Sum([sales]),0)
If you need the 3-week calculation for every row of the table, use this one instead:
sum([sales]) OVER (LastPeriods(3,[week]))
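As a quick check of the intended result, the Python sketch below sums the sales of the three highest week numbers from the question's data. It is not a Spotfire expression, and the names `sales_by_week` and `last_n_weeks_total` are hypothetical.

```python
# Week -> sales, taken from the table in the question.
sales_by_week = {1: 100, 2: 250, 3: 350, 4: 145, 5: 987, 6: 26, 7: 32, 8: 156}


def last_n_weeks_total(sales_by_week, n=3):
    """Sum the sales of the n highest week numbers present in the data."""
    last_weeks = sorted(sales_by_week)[-n:]
    return sum(sales_by_week[week] for week in last_weeks)


print(last_n_weeks_total(sales_by_week))  # 26 + 32 + 156 = 214
```

Because the weeks are selected relative to the highest week present, adding a new week shifts the window automatically.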

Sum a measure of LastNonEmpty in GrandTotal in Microsoft SSAS

I am new to OLAP and am not sure how to sum the LastNonEmpty measure values in the Grand Total while keeping the LastNonEmpty values in the other cells, as in the following:
Column 1     Column 2  Measure
A            a         1
             b         2
             c
             Total     2
B            a         4
             b
             c
             Total     4
Grand Total            6
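The aggregation being asked for can be stated as: each group total is the last non-empty value in that group, and the grand total is the sum of those group totals. The Python sketch below only illustrates that rule on the table above. It is not MDX or an SSAS configuration, and the names `cells` and `last_non_empty_totals` are hypothetical.

```python
# (Column 1, Column 2, Measure) in display order; None marks an empty cell.
cells = [
    ("A", "a", 1),
    ("A", "b", 2),
    ("A", "c", None),
    ("B", "a", 4),
    ("B", "b", None),
    ("B", "c", None),
]


def last_non_empty_totals(cells):
    """Return {group: last non-empty measure value within that group}."""
    totals = {}
    for group, _member, value in cells:
        if value is not None:
            totals[group] = value  # a later non-empty value replaces an earlier one
    return totals


group_totals = last_non_empty_totals(cells)
grand_total = sum(group_totals.values())
print(group_totals, grand_total)
```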
