SPSS: automatic counting of follow-up moments in a longitudinal long format database

I would like to structure my long format SPSS file so I can clean it and get a better overview. However, I run into some problems.
How can I create a new variable that counts the completion moments (waves/follow-up moments)? I only have a completion date available in my dataset. Please open my image for more explanation.
Preferably a numbering that continues counting if a year is missing.

If I understand correctly, you want the new variable to be an index of the year of involvement for each patient, as opposed to an index of data rows per patient. To do this, we can calculate for each entry the difference in years between that entry and the patient's first entry:
(this assumes your dates are in date format)
compute year=xdate.year(OpenInvulMomenten).
aggregate /outfile=* mode=addvariables /break=PatientIdPseudo /firstYear=min(year).
compute newvar=1+year-firstYear.
exe.
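
For readers who want to check the logic outside SPSS, here is a minimal Python sketch of the same calculation. The function name, the sample patient IDs and the dates are made up for illustration; only the formula (1 + year - first year per patient) comes from the syntax above.

```python
from datetime import date


def add_wave_index(rows):
    """rows: list of (patient_id, completion_date) tuples.

    Returns a list of (patient_id, completion_date, wave), where
    wave = 1 + year - first year seen for that patient.
    """
    # First pass: earliest year per patient (the AGGREGATE step).
    first_year = {}
    for patient_id, completed in rows:
        year = completed.year
        if patient_id not in first_year or year < first_year[patient_id]:
            first_year[patient_id] = year
    # Second pass: year offset per row (the COMPUTE step).
    return [(pid, d, 1 + d.year - first_year[pid]) for pid, d in rows]


# Hypothetical sample data: patient P1 has no entry for 2017.
rows = [
    ("P1", date(2015, 3, 1)),
    ("P1", date(2016, 4, 2)),
    ("P1", date(2018, 2, 10)),
    ("P2", date(2017, 6, 5)),
]
result = add_wave_index(rows)
waves = [wave for _, _, wave in result]
print(waves)
```

Because the index is based on the year rather than on the row count, a missing year leaves a gap in the numbering (P1 goes 1, 2, 4), which is what the question asks for.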

Related

How to create a DAX cross-sectional measure?

I don't know if I even worded the question correctly, but I'm trying to create a measure that depends on what is showing in the pivot table (using PowerPivot). In the image I posted, "DealMonth" is an expression in the PowerQuery table itself that simply takes the start date of the employee and subtracts it from the month a deal was closed in. That will show how long it took for that salesperson to close the deal. "TenureMonths" is also an expression in the PowerQuery table that calculates the tenure of the person. The values populating this screenshot are coming from a total headcount measure created. What I'm trying to do is create a separate measure that will show when the "TenureMonths" is less than the "DealMonth." So if the TenureMonths is 5, then after DealMonth of 5, the value would be 0. Is this possible?
Screenshot
I should add the following information.
"DealMonth" - Comes from the FactData table
"TenureMonths" - Comes from the DimSalesStart table
These two tables are joined by name. I feel like I'm so close because I can see what I want. The second image below is a copy/paste of the pivot table result but with my edits to show what I'd want to have shown. Basically, if(TenureMonths >= DealMonth,1,0). The trouble seems to be that since they're in two different tables, I can't make it work. The rows in the fact table are transactions, but the rows in the dim table are just the people with their start and end dates.
Desired Result
This is possible with something like IF([measure1]<[measure2],blank(),[measure1]); however, without seeing more of the data it will be hard to guide you specifically.
You will need to create two separate measures, one for TenureMonths and one for DealMonth. Depending on the data, this can be done with an aggregator formula such as sum, min, max, etc. (which one depends on whether there can be more than one value).
Then reference those two measures in the formula pattern I mentioned above, and that should give you what you want.
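
To make the comparison concrete, here is a small Python sketch of the two patterns discussed (this is not DAX; the function names are invented for illustration, and None stands in for BLANK()):

```python
def blank_when_less(measure1, measure2):
    """Mirror of IF([measure1] < [measure2], BLANK(), [measure1])."""
    return None if measure1 < measure2 else measure1


def tenure_flag(tenure_months, deal_month):
    """Mirror of if(TenureMonths >= DealMonth, 1, 0) from the question."""
    return 1 if tenure_months >= deal_month else 0


# A person with 5 months of tenure, evaluated for deal months 1 to 8.
flags = [tenure_flag(5, month) for month in range(1, 9)]
print(flags)
```

With a tenure of 5, the flag is 1 up to and including deal month 5 and 0 afterwards, matching the behaviour described in the question.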
I figured out a solution. I added a dimension table for DealMonth itself and joined to my fact table. That allowed me to do the formulas that I needed.

Generate new format from a non-system generated report using Power Query

I have an Excel file in a non-system-generated report format.
I wish to calculate and generate a new output from it.
The report format is as below:
1) When this Excel file is loaded in the query, how can I create a new column that copies the first value found (1#51) into the following records while they are empty, and then, once a new value (1#261) is detected, copies that value into the subsequent null records, and so on to the end?
2) The final aim is to generate a new output that automatically matches/calculates the money to be assigned to the different references, as shown below:
References A ~ E share the 3 bank refs (28269, 28542 & RMP). I was thinking of reading the same data source twice: the first time to read columns A ~ O (QueryRef), and the second time to read columns A and Q ~ V (QueryBank).
After this, I have no idea how to allocate the $$ from QueryBank to QueryRef based on the Sum of Total AR.
Eg,
Total Amt of BankRef 28269, $57,044.67 is sufficient to cover Ref#A $10,947.12
BankRef 28269 still sufficient to cover Ref#B $27,647.60
BankRef 28269 then has only $18,449.95 left, so the balance of 28269 is allocated to Ref#C.
The remaining balance of Ref#C, i.e. $1,812.29, will need to be covered by BankRef 28542.
Ref#D will then be allocated the remaining balance of BankRef 28542, i.e. $4,595.32.
Ref#D still left $13,350.03 unallocated, hence this will use BankRef#RMP
Ref#E only need $597.66, and BankRef#RMP is sufficient to cover this.
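
The walk-through above amounts to a greedy, in-order allocation: use up each bank ref in turn, and move to the next one when it runs out. Here is a minimal Python sketch of that logic (not Power Query). The totals for Ref#C, Ref#D and BankRef 28542 are derived by adding up the figures above; the total for BankRef RMP is not given, so the 20000.00 used here is purely an assumption.

```python
from decimal import Decimal


def allocate(banks, refs):
    """Greedy in-order allocation.

    banks, refs: lists of (name, amount-as-string).
    Returns a list of (ref_name, bank_name, amount_used).
    """
    balances = [(name, Decimal(amount)) for name, amount in banks]
    allocations = []
    i = 0  # index of the bank ref currently being drawn from
    for ref_name, amount in refs:
        needed = Decimal(amount)
        while needed > 0 and i < len(balances):
            bank_name, available = balances[i]
            used = min(needed, available)
            if used > 0:
                allocations.append((ref_name, bank_name, used))
            needed -= used
            balances[i] = (bank_name, available - used)
            if balances[i][1] == 0:
                i += 1  # this bank ref is exhausted, move to the next
    return allocations


banks = [
    ("28269", "57044.67"),
    ("28542", "6407.61"),   # derived: 1,812.29 + 4,595.32
    ("RMP", "20000.00"),    # ASSUMED total, not stated in the question
]
refs = [
    ("A", "10947.12"),
    ("B", "27647.60"),
    ("C", "20262.24"),      # derived: 18,449.95 + 1,812.29
    ("D", "17945.35"),      # derived: 4,595.32 + 13,350.03
    ("E", "597.66"),
]
allocations = allocate(banks, refs)
for ref_name, bank_name, used in allocations:
    print(ref_name, bank_name, used)
```

Decimal is used instead of float so that the cents add up exactly.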
I am not sure whether my case study above can be solved using Power Query, as I am still a newbie at Power Query. Or is this too complicated to handle, so that we need to write a program to automatically match this kind of scenario?
Attached is the sample source file and output :
https://www.dropbox.com/sh/dyecwcdz2qg549y/AACzezsXBwAf8eHUNxxLD1eWa?dl=0
Any advice/opinion/guidance is very much appreciated.
Answering question one:
Power Query has a feature called Fill, which can go either Down or Up.
For a selected column, Fill Down copies the first non-empty value into all the rows below it until a new non-empty value is found, then continues with that value, and so on.
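
As an illustration of what Fill Down does, here is a minimal Python sketch of the same behaviour (in the M language, the corresponding function is Table.FillDown). The function name and the sample column are made up, with None standing in for null cells.

```python
def fill_down(values):
    """Replace each None with the most recent non-None value above it."""
    filled = []
    last = None
    for value in values:
        if value is not None:
            last = value  # a new non-empty value starts a new run
        filled.append(last)
    return filled


# Hypothetical column modelled on the values mentioned in the question.
column = ["1#51", None, None, "1#261", None, None]
filled = fill_down(column)
print(filled)
```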

Separate dated lines into beginning and end of month (LibreOffice)

Given a list of items which have a date as one field, how can I separate one set which have a date in the first few days of the month from those which have a date in the last few days?
The items are gas bills, generally one per month, in a bank statement which relate to each of two separate buildings and need to go into two separate accounts. They were imported from a CSV file.
In practice, the number of lines involved is small, so I've just done it by hand, but the question of how to do it by formula and sort occurred to me, and I neither have nor found an answer.
I hope it is a slightly interesting question.
The function is simply called DAY. You can find it by clicking on the Function Wizard toolbar icon and looking under the Date&Time category.
For example, in cell B1 enter a formula like =DAY(A1) and fill down. Then go to Data -> Sort.
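
The same idea can be sketched in Python, for anyone who prefers to check the split outside the spreadsheet. The cutoff of day 15 and the sample bills are assumptions made for illustration only.

```python
from datetime import date


def split_by_day(bills, cutoff=15):
    """Split (date, amount) rows into early-month and late-month lists.

    The day of the month plays the role of =DAY(A1) in the sheet.
    """
    early = [bill for bill in bills if bill[0].day <= cutoff]
    late = [bill for bill in bills if bill[0].day > cutoff]
    return early, late


# Hypothetical gas bills: one building billed early, the other late.
bills = [
    (date(2023, 1, 3), 41.20),
    (date(2023, 1, 28), 63.75),
    (date(2023, 2, 2), 39.90),
    (date(2023, 2, 27), 60.10),
]
early, late = split_by_day(bills)
print(len(early), len(late))
```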

Sorting time stamp values that get constantly updated via google forms

I have a google sheet that gets filled via a google form.
Time stamps are created every time a bar code (work order number) is scanned.
The work order number is in the first column.
The 4 unique time stamp fields below are populated in the 2nd column from the google form.
Setup start
Setup finish
Production start
Production finish
The time stamp is created in the 3rd column.
I am trying to do conditional formatting where the total setup time and production time are calculated, with each tied to its respective work order number.
time stamp functionality
The difficulty is that the timestamp values all fall into one vertical column.
I don't want a mix up of timestamp values with different work order numbers.
The work order numbers along with the 4 unique time stamp values may be input at various times so the formula can't be order specific.
Is there a way to do this? Thanks!
Below is an example link of the spreadsheet I have:
https://drive.google.com/open?id=1YA86jGq_jMsx-wKe19TnZZyf9F4aW6_kUIbrz8hkLJI
Make a pivot table of the data from the form, then use simple formulas adjacent to the new pivot to get the results you are trying to get. Example Image
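
To show what the pivot step achieves, here is a minimal Python sketch: it groups the rows by work order, turns the four event labels into columns, and then subtracts. The work order numbers and timestamps are invented, and the rows are deliberately out of order to show that input order does not matter.

```python
from datetime import datetime


def _elapsed(events, start_label, finish_label):
    """Return finish - start, or None if either timestamp is missing."""
    if start_label in events and finish_label in events:
        return events[finish_label] - events[start_label]
    return None


def durations_by_work_order(rows):
    """rows: iterable of (work_order, event_label, timestamp)."""
    # Pivot: one dict of event -> timestamp per work order.
    pivot = {}
    for work_order, event, stamp in rows:
        pivot.setdefault(work_order, {})[event] = stamp
    # Simple formulas next to the pivot: one difference per phase.
    return {
        work_order: {
            "setup": _elapsed(events, "Setup start", "Setup finish"),
            "production": _elapsed(
                events, "Production start", "Production finish"
            ),
        }
        for work_order, events in pivot.items()
    }


# Hypothetical form responses, interleaved across two work orders.
rows = [
    ("WO-1", "Setup start", datetime(2024, 1, 8, 8, 0)),
    ("WO-2", "Setup start", datetime(2024, 1, 8, 9, 0)),
    ("WO-1", "Setup finish", datetime(2024, 1, 8, 8, 30)),
    ("WO-1", "Production start", datetime(2024, 1, 8, 8, 35)),
    ("WO-2", "Setup finish", datetime(2024, 1, 8, 9, 20)),
    ("WO-1", "Production finish", datetime(2024, 1, 8, 10, 5)),
]
durations = durations_by_work_order(rows)
print(durations["WO-1"]["setup"], durations["WO-1"]["production"])
```

A phase that has not been fully scanned yet (WO-2 production here) comes back as None instead of mixing in another work order's timestamps.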

Calculate Value For Dates Between

A few references:
Microsoft's documentation on DATESBETWEEN.
Somewhat similar question, though the answer and derivatives of the formula don't return the correct results.
Microsoft's documentation on TODAY
Per the above Microsoft documentation, I'm trying to get a calculation for the last three months based on today's current date in SSAS Tabular model. First, I have no idea how to use SSAS and my company doesn't provide any learning material, so I've been reading through the MSDN documentation, which may not be the place to start, so if this is wrong, I'd appreciate being told so. For instance, with C# or Ruby, I can test code in a console to see if I get the result that I want, and I don't see how I can do that in SSAS Data Tools' DAX language - this is a GUI which gives users very little power over what they can do (it took me four hours to figure out how to access a dimension's properties). I am definitely a code monkey.
I tried using the below formula (and derivatives of it) because this is what it looks like Microsoft is doing in their example:
3MonthValue:=CALCULATE(SUM([MeasureOne])/SUM([MeasureTwo]),DATESBETWEEN(DateDimension[Date],DATEADD(DateDimension[Date],-3,MONTH),TODAY()))
The result, nothing. Of course, if I run similar SQL logic, I get the right results. I also used the provided SO example, though I suspect that's not exactly what I'm trying to achieve, and only obtained blanks as answers. Given that I need to calculate a formula between a certain time frame, which in SQL would be the WHERE clause, how do I translate this into DAX? In other words, what is DAX's WHERE and if CALCULATE isn't right, what's the correct approach?
When you say it isn't working, how do you mean? The formula you are using refers to your date dimension's key as the starting date for your DATESBETWEEN function - this means if you are expecting the measure to populate a value, you'll need to be using a particular date in your pivot to establish context.
If you are trying to view the measure at design time, in the editor, there is no context, so the measure won't populate.
Moreover, if in a pivot you're looking at a time context that includes more than one date, that also will not work. Say you are looking at a month, or a quarter. Both of these encompass multiple DateDimension[Date] values, so again context cannot be established.
So to recap: measures that look at date ranges, like DATESBETWEEN, using a starting time context that is set to your dimension's time key will only show up in a pivot when the pivot's data is filtered to a single date.
You can test this using the same function, but hard set the starting date by replacing DateDimension[Date] with a static date (or possibly TODAY()). The measure should show up in design time because the formula has all the information it needs to complete the calculation.
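
As a rough illustration of the intended calculation with a fixed window (the equivalent of the SQL WHERE clause), here is a minimal Python sketch. It is not DAX; the helper names, the sample rows and the fixed "today" are assumptions made for the example. Both ends of the window are inclusive, as with DATESBETWEEN.

```python
import calendar
from datetime import date


def months_back(d, months):
    """Date `months` months before d, clamping the day to month end."""
    month_index = d.year * 12 + (d.month - 1) - months
    year, month0 = divmod(month_index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def three_month_value(rows, today):
    """Sum(measure_one) / Sum(measure_two) over the last three months.

    rows: iterable of (date, measure_one, measure_two).
    Returns None when the denominator is zero (no rows in the window).
    """
    start = months_back(today, 3)
    in_window = [r for r in rows if start <= r[0] <= today]
    denominator = sum(r[2] for r in in_window)
    if denominator == 0:
        return None
    return sum(r[1] for r in in_window) / denominator


# Hypothetical data; "today" is pinned so the result is reproducible.
today = date(2024, 5, 31)
rows = [
    (date(2024, 1, 15), 10, 2),   # before the window
    (date(2024, 3, 1), 30, 5),
    (date(2024, 5, 31), 20, 5),
    (date(2024, 6, 1), 100, 1),   # after the window
]
value = three_month_value(rows, today)
print(months_back(today, 3), value)
```

The point mirrored from the answer is that both ends of the window are fixed values rather than something taken from the current row or filter context.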
