Power BI Only Showing Average of Whole Column in Line Graph

So I am trying to create a report for (among other things) measurements made by a device.
I have normalised my schema such that these measurements are in a measurement dimension.
I have created a calendar (date) table and a time table to plot these measurements as a line graph.
However, rather than a line graph showing the changes over time, I get a constant line at the average (or whichever aggregate function I'm using) of the whole column.
My line graph at a date level
My line graph at a date and time level
Since entries are unique for each datetime stamp, I don't understand why the line graph doesn't show variation here.
I've tried all sorts of changes to my schema, but it seems it only wants to work when it's a flat file.
For a sense of my schema:
My schema
All the data types for the columns seem to be correct.
It's happy filtering by date if I do it in Transform Data, so I don't think there is a problem with my relationships.
Am I missing something obvious?

Related

Interpret Google AutoML Online Prediction Results

We are using Google AutoML Tables with CSV files as input. We imported the data, linked the schema (with nullable columns), trained the model, then deployed it and used online prediction to predict the value of one column.
The column we targeted has values in the range 44-263.
When we deployed and ran online prediction, it returned values like this:
Prediction result
0.49457597732543945
95% prediction interval
[-8.209495544433594, 0.9892584085464478]
Most of the result set is in the above format. How can we convert it to values in the range 44-263? We didn't find much documentation online on this.
We're looking for a documentation reference and an interpretation of the results, along with an explanation of the 95% prediction interval.
Actually, to clarify (I'm the PM of AutoML Tables):
AutoML Tables does not do any normalization of the predicted values for your label data, so if you expect your label data to have a distribution of min/max 44-263, then the output predictions should also be in that range. Two possibilities would make it significantly different:
1) You selected the wrong label column
2) Your input features for this prediction are dramatically different from what was seen in the training data.
Please feel free to reach out to cloud-automl-tables-discuss@googlegroups.com if you'd like us to help debug further.

Converting year weeks to continuous without gaps in Power BI graph

I have weekly sales data from 201601-201835 and I have to create a graph to see the trend. Since the data is categorical, a scroll bar appears on the graph. I want to view the complete graph at once.
I tried converting the data to continuous, but there is a gap in the graph because there is no data from 201653 to 201699 (treated as numbers, the weeks jump from 201652 straight to 201701), and hence the graph is wrong.
Is there a way of viewing the complete data without a scroll bar?
I would add summary fields to the Axis above the Week, e.g. Year, Month, Quarter. You will probably need a reference table for your weeks or some fancy calculations (a sketch follows below).
With those fields in place, you can expand or drill into the levels before getting down to the Week level. As the higher levels are Categorical and have fewer categories, you are less likely to see scroll bars.
A side benefit is that the user has more control; for example, they can drill down on just a section of the weeks, such as one year.
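For the calculated-column route, a minimal DAX sketch, assuming the weeks live in a numeric YearWeek column (e.g. 201835) on a table called Sales (both names are hypothetical):
Year = INT ( Sales[YearWeek] / 100 )
Week = MOD ( Sales[YearWeek], 100 )
Put Year above Week on the Axis. Quarter or Month would need a proper date/reference table, since a week number alone doesn't map cleanly onto months.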

Gap in time series not appearing

I am working with time series data that omit data for the weekend. When graphing these time series in D3 v4 the graph interpolates over the weekend. See the following URL for an illustration (including code, data, and graph output):
No records for weekend
Instead, I want a gap at the weekend, with the graph stopping on Friday and resuming on Monday.
I could fix the problem by creating dummy records for the weekend, with values 'NA', and using D3's line.defined method, as shown in the following:
Data has NA records
However, generating dummy records feels to me like excessively heavy lifting. Is there a simple, natural way to get D3 to leave a gap when time series records are missing?
Is there a simple, natural way to get D3 to leave a gap when time series records are missing?
Unfortunately no, that's the normal behaviour of a time scale. According to Mike Bostock, D3 creator,
A d3 time scale should be used when you want to display time as a continuous, quantitative variable, such as when you want to take into account the fact that days can range from 23-25 hours due to daylight savings changes, and years can vary from 365-366 days due to leap years.
So, the time scale was created having in mind a continuous time.
Your current approach in the line generator...
.defined(function(d) { return !isNaN(d.value); })
... doesn't work because all the dates in your CSV have values, and d3 will connect the dots.
That having been said, if you want to keep the gap, just use dummy records (with null or any non-numeric value) for the weekends together with line.defined, as in your second link.
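For illustration, a minimal D3 v4 sketch of that approach (the data values, xScale, and yScale are hypothetical):
// Dummy records for the missing weekend days carry a null value.
var data = [
  { date: new Date(2018, 0, 4), value: 10 },   // Thursday
  { date: new Date(2018, 0, 5), value: 12 },   // Friday
  { date: new Date(2018, 0, 6), value: null }, // Saturday (dummy)
  { date: new Date(2018, 0, 7), value: null }, // Sunday (dummy)
  { date: new Date(2018, 0, 8), value: 11 }    // Monday
];
// line.defined skips the null records, so the path breaks over the weekend
// instead of interpolating across it.
var line = d3.line()
    .defined(function(d) { return d.value !== null; })
    .x(function(d) { return xScale(d.date); })
    .y(function(d) { return yScale(d.value); });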

Tableau - Calculated fields / grouping / Custom Dim

Tableau:
This may seem simple, but I ran out of the usual tricks I've used in other systems.
I want a variance column: essentially adding a member 'Variance' to the Act/Plan dimension, which currently contains only the members 'Actual' and 'Plan'.
I've come in where the data structure and reporting is set up like so:
Actual | Plan
Profit measure
measure 2
measure 3
etc
The goal is to have a Variance column (calculated and not part of the Actual/Plan dimension)
Actual | Plan | Variance
Profit measure
measure 2
measure 3
etc
There are solutions where this works for one measure only, and I've looked into that.
I.e., create calculated fields like this:
Profit_Actual | Profit_Plan | Variance
You put this on the columns, and you get a grid that I want... except a grid with only 1 measure.
This does not work if I want to run several measures on rows. Essentially the solution above will only display the Profit measure, not Measure 1_Actual, Measure 2_Plan, etc.
So I tried a trick where I grouped the 3 calculated measures, i.e. Profit_Actual | Profit_Plan | Profit_Variance, as 'Profit_Measure'.
Created a parameter list - 'Actual', 'Plan', 'Variance'
Now I can half achieve my goal by having the parameter on Columns and the 'Profit Measure' on Rows (so I can have Measure 123_group etc. down on rows too). Trouble is, I found that parameters are single-select only. If only it could display all the options in the custom parameter at once, my problem would be solved.
Any ideas on how I can achieve the Variance column I want?
Virtually adding a member to a dimension / calculated fields / tricks / workarounds
Thank you
Any leads are appreciated.
Gemmo
Okay. First thing, I had a really hard time trying to understand how your data is organized; try to be clearer (say what each entry in your database looks like, not what a specific view in Tableau looks like).
But I think I got it. I guess you have a collection of entries, and each entry has a number of measure fields (profits etc.) and an Act/Plan field to identify whether that entry is an actual value or a planned value. Is that correct?
Well, if that's the case, I'm sorry to say you have to calculate a variance field for each measure. Think about how your original dataset is structured. Do you think you can add a single field 'Variance' to represent the variance of every measure? Well, you could store the values in a string and then parse them back out with string functions, but that's not very practical. The problem is that each entry has many measures; if it had only 1 measure, then 1 single variance field would suffice.
So, if you can re-organize your data, an easier set to work with (but with many more entries) is one with the fields Measure, Value, and Actual/Plan. The Measure field would hold a string identifying what you're measuring in that entry, Value would be a number representing the actual measurement, and Actual/Plan is the same as before. For instance:
Measure Value Actual/Plan
Profit 100 Actual
So, each line in your current model would become n entries, where n is the number of measures you have right now. A larger dataset in a way, but easier to work with. Think about it: now you can have a calculated field and use some table calculations to calculate the variance only for that measure and/or Actual/Plan. Just use WINDOW_VAR, and put Measure and/or Actual/Plan in the partition.
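As a quick sketch of that table calc, using the hypothetical field names from the example row above: WINDOW_VAR(SUM([Value])), with Measure and/or Actual/Plan set as the partition through the field's 'Compute Using' settings.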
Table calculations are awesome; take a look at this to understand them better: http://onlinehelp.tableausoftware.com/current/pro/online/en-us/help.htm#calculations_tablecalculations_understanding_addressing.html
I generally like to have my data staged such that Actual is its own column and Plan is its own column in the data being fed to Tableau. It makes calculations so much easier.
If your data is such that there is a column called "Actual/Plan" and every row is populated with either "Actual" or "Plan" and there is another column called "Value" or "Measure" that is populated with the values, you can force Tableau to make them columns assuming you can't or won't rearrange your data.
Create a calculated field called "Actual" with the following calc:
IF [Actual/Plan] = 'Actual' THEN [Value] END
Similarly, create a calculated field called "Plan" with the following calc:
IF [Actual/Plan] = 'Plan' THEN [Value] END
Now, you can finally create your "Variance" and "Variance %" calculations (respectively):
SUM([Actual]) - SUM([Plan])
[Variance] / SUM([Plan])
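With those fields in place, one way to lay out the grid (a sketch, assuming the 'Actual/Plan' + 'Value' structure described above): put Measure Names on Columns and Measure Values on Text, put your row headers on Rows, and filter Measure Names down to Actual, Plan, Variance, and Variance %.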

MS Access - matching a small data set with a very large data set

I have a huge Excel file with more than a million rows and a bunch of columns (300), which I've imported into an Access database. I'm trying to run an inner join query on it which matches on a numeric field in a relatively small dataset. I would like to capture all the columns of data from the huge dataset if possible. I was able to get the query to run in about half an hour when I selected just one column from the huge dataset. However, when I select all the columns from the larger dataset and have the query write to a table, it just never finishes.
One consideration is that the smaller dataset's join field is a number, while the larger one's is text. To get around this, I created a query on the larger dataset which converts the text field to a number using the Val() function. The text field in question is indexed, but I'm thinking I should convert the field on the table itself to numeric, to match the smaller dataset's type. Maybe that would make the lookup more efficient.
Other than that, I would greatly appreciate some suggestions for a good strategy to get this query to run in a reasonable amount of time.
Access is a relational database. It is designed to work efficiently if your structure respects the relational model. Volume is not the issue.
Step 1: normalize your data. If you don't have a clue what that means, there is a wizard in Access that can help you with this (Database Tools > Analyze Table), or search for 'database normalization'.
Step 2: index the join fields
Step 3: enjoy fast results
Your idea of having both sides of the join be the same type IS a must. If you don't do that, indexes and optimisation won't be able to operate.
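As a concrete sketch of step 2 plus that type alignment (all table and column names here are hypothetical):
Add a numeric join column to the large table and populate it from the text field:
ALTER TABLE BigTable ADD COLUMN JoinKeyNum LONG;
UPDATE BigTable SET JoinKeyNum = Val([JoinKeyText]);
Index both sides of the join:
CREATE INDEX idxBigJoinKey ON BigTable (JoinKeyNum);
CREATE INDEX idxSmallJoinKey ON SmallTable (JoinKey);
Then a make-table query can join on matching numeric types:
SELECT BigTable.*
INTO MatchedRows
FROM BigTable INNER JOIN SmallTable
ON BigTable.JoinKeyNum = SmallTable.JoinKey;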
