Tableau tooltip incorrect when toggling through quick filter

Link to workbook on public tableau
I created a calculated value to determine the grade for a business, and this is the colormap (in the tab Grade per Location).
When I hover over data points on the map (tab Map), it displays the correct grade, e.g. D for Shish Boom Bah Car Wash.
But as soon as I select any location from the drop-down, all grades become A.
Tot_Avg is calculated like this:
{ EXCLUDE [Location (Loc)] : AVG([Rating]) }
Avg_Rating like this:
AVG([Rating])
And here are the conditions for receiving an A:
IF [Avg_Rating] > ATTR([Tot_Avg]) - (.10 * ATTR([Tot_Avg]))
THEN "A"
How to troubleshoot?

I think your confusion is in what that EXCLUDE is doing. It is NOT ignoring filters. It's just saying not to group by Location when aggregating AVG([Rating]). When you filter out all but one location, AVG([Rating]) and { EXCLUDE [Location (Loc)] : AVG([Rating]) } become equivalent, because with either calculation, you're averaging for all points in your filtered partition.
As a result, your condition for receiving an A will always be true if there's only one location. (Check the math: X > X - .1X → X > .9X)
Here's a different way to get what you're after. Make a calculated field (I'll call it Location Filter):
LOOKUP(ATTR([Location (Loc)]),0)
Then trash your Location filter and replace it with that field. We're doing something sneaky here - we're making the exact same filter as we had before, but we're disguising it as a table calculation (by using LOOKUP()). Tableau doesn't execute table calculations until after it's created the filtered partition, so we've tricked it into letting us use every location while still just examining one.

Related

Total less available = Remaining Formula in Power Query

I have a bunch of statistical data that I have chopped up and pieced together in Power Query. It looks like there is a category missing from the database. To fill in the gaps, I am trying to take the grand totals, which are correct (red), and subtract from them what I know to be correct, with the remaining numbers giving me my answer (orange).
The correct data starts with I1, I2, I3, I4, so possibly a grand total of these, by state.
At the moment this is filled by the following formula in excel;
=E53-SUMIFS($E$5:$E$44,$B$5:$B$44,B45,$C$5:$C$44,C45,$D$5:$D$44,D45)
Any help with how the heck I can do this in Power Query would be much appreciated; I realise I can't use the same formula, but any ideas are welcome. I can change the text in red to "Total" if that helps in some way?
Thanks
Here's one potential way. If you start with your spreadsheet set up similar to this:
I only used a subset of the StateIDs from your example and generated my own values. The figures in the Available column would be from your red section.
Then add the table from the spreadsheet to Power Query (in Excel, you would click on the table and then Data > From Table/Range > Select My table has headers and click OK).
In Power Query:
You'll probably have to change the TimeID type to Date if you want to use the dates for anything, because it will likely come in as a Date/Time type. I won't use the dates here, though, so you could skip changing the type (otherwise, right-click the TimeID column > Change Type > Date).
Then use Group By to aggregate the values and set the stage for the calculation you want (select the StateID column > Group By > set up the groupings like below and click OK)
You should see something like this:
Then add a new column with your calculation (Add Column > Custom Column > Set it up like below and click OK)
You should see something like this:
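For reference, the whole sequence could also be written directly as M in the Advanced Editor. This is only a sketch: the table name (Table1) and the column names (StateID, Value, Available) are assumptions standing in for whatever your actual layout uses.

let
    // Load the Excel table added via Data > From Table/Range
    Source = Excel.CurrentWorkbook(){[Name = "Table1"]}[Content],
    // Group By: one row per StateID, summing the known category values
    // and carrying the grand total from the Available column
    Grouped = Table.Group(Source, {"StateID"}, {
        {"KnownTotal", each List.Sum([Value]), type number},
        {"Available", each List.Max([Available]), type number}
    }),
    // Custom Column: the remaining (missing category) amount per state
    AddedRemaining = Table.AddColumn(Grouped, "Remaining", each [Available] - [KnownTotal], type number)
in
    AddedRemaining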

Power Query, avg value based on the values appearing within a specified date range

Context:
I have a data set for the weights of truck and trailer combinations coming into my site over the span of a few years. I have organized my data by seasons, as I am trying to prove that the truck:trailers in winter are noticeably heavier due to ice, snow, and mud. The theory is that if the tare weight (the weight of the truck after it empties its load) is higher in this season than its average tare weight (which I need to calculate from the data), it can be deduced that the truck:trailer combinations are coming in with extra weight that we pay for in part, as some snow/ice/mud falls off in the trailer-emptying process.
What I've done so far:
I've defined a custom date range for my seasons
I've grouped Truck:Trailer by Count (to get a duplicates column) and All Rows (to keep all my details)
I've filtered out every combination I've seen fewer than 50 times, as I want good representation for each truck:trailer combo so that I can better emphasize repeated patterns
I've added an index column to better keep track of the individuals before expanding the details
What I need to do:
I only want to work with truck:trailer combinations which have weighed in for all four seasons at least once
I need to find the average tare weight of the truck:trailer combinations over the extended range for both summer and autumn (the dry time of the year), while preserving the raw tare data for all seasons, as I need to eventually compare the winter tare values to this average.
example of my data
When I'm finished I'd like the data to look something like this
Pivot Chart
query data
For your first question (all seasons), you can add a column that holds the distinct count of the values in [Season] for each [Driver:Trailer]. Then filter your table on that column, keeping only the 4's. To achieve this, add the following M code to your script in the Advanced Editor, and change the part after "in" to #"DistinctCount Season":
#"DistinctCount Season" = Table.Join(#"insert name previous step","Driver:Trailer",
Table.Group(#"insert name previous step", {"Driver:Trailer"},
{{"DistinctCountSeasons", each Table.RowCount(Table.Distinct(_,"Season")),
type number}}),"Driver:Trailer")
Insert the name of your previous step where indicated.
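The follow-up filter (keeping only the combinations that appear in all four seasons) could then be a step like the one below; the step name is just a placeholder:
#"KeepAllSeasons" = Table.SelectRows(#"DistinctCount Season", each [DistinctCountSeasons] = 4)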
For the second question:
You can use a matrix visual for that in your report. First create a measure:
[AverageTare] = AVERAGE('table'[Tare])
Then put [Season] on Rows and [AverageTare] on Values. You can create a group (right-click [Season] in the FIELDS pane) called [DrySeason] to combine the values for Summer and Autumn.
If that doesn't work for you, explore the AVERAGEX function.
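If you'd rather have a dedicated measure for the dry period instead of a group, a sketch could look like the line below; the table name 'table' just matches the measure above, and the season labels come from your description of summer and autumn as the dry time of year:
[AverageTareDry] = CALCULATE(AVERAGE('table'[Tare]), 'table'[Season] = "Summer" || 'table'[Season] = "Autumn")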
EDIT
In Excel you can use a PivotTable. Put [Season] on Rows and [Tare] on Values. Right-click a value in the PivotTable, select Value Field Settings and choose Average. Then select the Seasons you want to group, right-click and select Group.
EDIT 2
To add a column in the Power Query Editor that holds the average [Tare] for the [Season] in each row, add the following steps to your script in the Advanced Editor:
#"GroupedSeasonAvg" = Table.Group(#"Insert name previous step", {"Season"}, {{"AVG", each List.Average([Tare]), type number}}),
#"JoinOnSeason" = Table.NestedJoin(#"Insert name previous step",{"Season"},GroupedSeasonAvg,{"Season"},"AVGGrouped"),
#"ExtractSeasonAVG" = Table.ExpandTableColumn(JoinOnSeason, "AVGGrouped", {"AVG"}, {"SeasonAVG"})
It works something like this:
"GroupedSeasonAvg" : Creates a table with the avereges for each [Season]
"JoinOnSeason": Creates a new column with tables joining the [Season] value for each row to [Season] in the grouped table.
#"ExtractSeasonAVG": Expand each table and keep only [AVG].

DAX COUNT/COUNTA functions

I've looked at many threads regarding COUNT and COUNTA, but I can't seem to figure out how to use it correctly.
I am new to DAX and am learning my way around. I have attempted to look this up and have gotten a little ways to where I need to be but not exactly. I think I am confused about how to apply a filter.
Here's the situation:
Four separate queries are used to generate the data in the report, but I only need two of them for the DAX function (Products and Display).
I have three columns I need to filter by, as follows:
Customer (Display or Products query; can do either)
Brand (Products query)
Location (Display query)
I want a count based on whether the combination of these columns is unique.
Here's an example:
Customer                  | Item            | Brand     | Location
Big Box Buy               | Lego Big Blocks | Lego      | Toys
Big Box Buy               | Lego Star Wars  | Lego      | Toys
Big Box Buy               | Surface Pro     | Microsoft | Electronics
Little Shop on the Corner | Red Bicycle     | Trek      | Racks
In this example, regardless of the items being different, we want to look at just the customer, the brand, and the location. We see in the first two records that the customer is "Big Box Buy", the brand is "Lego" and the location is "Toys". This combination appears twice, but I want to count it distinctly as "1". The next "Big Box Buy" record has the brand "Microsoft" and the location "Electronics". It appears once and only once, and thus the distinct count is "1" anyway. This means that there are two separate entries for "Big Box Buy", both with a count of 1. And lastly there is "Little Shop on the Corner", which appears just once and is counted just once.
The "skeleton" of the code I have is basically just to see if I can get a count to work at all, which I can. It's the FILTER that I think is the problem (not used in the below example) judging by other threads I've read.
TotalDisplays = CALCULATE(COUNTA(products[Brand]))
Obviously I can't just count the amount of times a brand appears as that would give me duplicates. I need it unique based on if the following conditions are met:
Customer must be the same
Brand must be the same
Location must be the same
If so, we distinctly count it as one.
I know I ranted a bit and may seem to have gone in circles, but I was trying to figure out how to explain it. Please let me know if I need to edit this post or post clarification.
Many thanks in advance as I go through my journey with DAX!
I believe I have the answer. I used NATURALINNERJOIN in DAX to create a new, merged table, since I needed to reference all the values in the same query (I couldn't figure out how to do it otherwise). I also created a "unique identity" calculated column that combined data from multiple columns but was hidden behind the scenes (not actually displayed on the report), so I could then take a measure of the unique values that way.
TotalDisplays = COUNTROWS(DISTINCT('GD-DP-Merge'[DisplayCountCalcCol]))
My calculated column is as follows:
DisplayCountCalcCol = 'GD-DP-Merge'[CustID] & 'GD-DP-Merge'[Brand] & 'GD-DP-Merge'[Location] & 'GD-DP-Merge'[Order#]
So the measure TotalDisplays now reports back the distinct count of rows based on the unique value of the customer ID, the brand, and the location of the item. I also threw in an order number just in case.
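For reference, the same distinct count can also be taken without the helper column by counting the distinct customer/brand/location combinations directly; this sketch assumes the same merged 'GD-DP-Merge' table and column names:
TotalDisplays = COUNTROWS(SUMMARIZE('GD-DP-Merge', 'GD-DP-Merge'[CustID], 'GD-DP-Merge'[Brand], 'GD-DP-Merge'[Location]))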
Thanks!
I am semi-new to DAX and was struggling with the COUNT and COUNTA functions; your post helped me find answers. I would like to add the solution I got for my query: I wanted a count of "Right Time Start Achieved" (RTSA), so if anyone is looking for this kind of answer, use the formula below. The filter selects the table column and the string you want to match:
RTSA := CALCULATE(COUNTA(VEO_Daily_Services[RTS]), VEO_Daily_Services[RTS] = "RTSA")

Tableau calculated-field filter on pie-chart doesn't work

Based on a previous question, I had to create a calculated value for Location and use that as a quick filter, i.e.
Location Filter:
LOOKUP(ATTR([Location (Loc)]),0)
The workbook is on Tableau Public.
When hovering over points on the map, the calculated field works, but when I create a pie chart, it doesn't.
For instance, if I select All, this is the result
And if I select a business from Location Filter, this is what I get
How to troubleshoot?
Additional Info
However, if I use the regular Location filter, then it works, i.e.
There are two separate issues to address here:
LOOKUP(ATTR([Location (Loc)]),0) is a sneaky way of filtering the data in the view while still maintaining all of the locations in the partition (by disguising the field as a table calculation, the filtered partition is created before this table calculation is ever executed). Because you've used it here, you still have every location in the partition, even when you filter them out with the quick filter. Because they're still in the partition, when you calculate the percent of total, those other locations will be included in that total, even if they're not displayed in the view.
I don't see a reason for you to keep all of the locations in your partition in this case, so I'd just replace that filter with [Location].
It looks like you've dragged [Location] into your mark as a dimension. As a result, it's broken up the pie slices into smaller chunks, one per location. If you add a dimension to your data, then Tableau will have to group by that dimension when calculating the aggregations.
If you want the Location to appear in the tooltip of your pie chart, you'll have to either add it as an attribute (in which case you'll have to deal with the "*" when you have more than one location in the partition), or you'll just have to deal with the slices being broken up further.

Having to call .ToList() in entity framework 4.1 before .Skip() and .Take() on large table

I'm trying to do something a little clever with my app. I have a table, Adverts - which contains info on cars: model, mileage etc. The table is related to a few other tables via foreign keys e.g. model name is retrieved through a foreign key linking to a "VehicleModels" table etc.
Within the app's "Entities" dir (classes which map to tables in the database) I have one for the Adverts table, Advert.cs. This has a couple of properties which EF has been told to ignore (in fluent api) as they don't map to actual fields in the Adverts table.
The idea behind these fields is to store the calculated distance from a postcode (zip code) the user enters in a search form, which filters the Adverts table if they only want to see cars available within a certain radius, e.g.:
IQueryable<Advert> FilteredAdverts = repository.Adverts
    .Where(am => mfr == "" || am.Manufacturer == mfr);
    // ... further .Where() clauses for model etc.
Later on, to calculate the distance the code resembles:
if (userPostcode != null) {
    foreach (var ap in FilteredAdverts.ToList()) {
        distmiles = // calculate distance in miles
        distkm = // calculate distance in km
        ap.DistanceMiles = Convert.ToInt32(distmiles);
        ap.DistanceKm = Convert.ToInt32(distkm);
    }
}
The problem I'm having is that in order to assign values to these two fields, I'm having to use .ToList(), which pulls all rows from the table. It works OK if there are only a few rows, but when there are ~1,000 it takes approx. 2.2 seconds, and when I increased it to about 12,000 rows it took 32 seconds for the page to load with no filters applied, i.e. all active adverts returned.
The reason I'm pulling all adverts before calling .Skip and .Take to display them is that the filters available in the search form are based on possible options of all current adverts that are active i.e. have time remaining, rather than just selecting a list of manufacturers from the manufacturers table (where a user could choose a manufacturer for which there are no search results). e.g.
VehicleManufacturers = (from vm in FilteredAdverts.Select(x => x.VehicleManufacturer).Distinct().OrderBy(x => x)
                        select new SearchOptionsModel
                        {
                            Value = vm,
                            Text = vm,
                            Count = FilteredAdvertsVM.Where(x => x.VehicleManufacturer == vm).Count(),
                        })
// ... filters for model, mileage etc.
To get an idea of what I'm trying to achieve - take a look at the search form on the Autotrader website.
Once all the filters are applied, just before the model is passed to the view, .Skip and .Take are applied, but of course by this time all rows have been pulled.
My question is, how do I go about redoing this? Is there a better method to make use of these non-mapped properties in my Advert entity class? I'm working on my home PC (C2D @ 3.4GHz, 2GB RAM) - would the slow queries run OK on a proper web host?
You cannot use server-side paging on a client side function. That's the short answer. Assuming I understand your need correctly (to filter a list based on proximity to a given zip code), a solution I've used in the past is storing each 'Advert' record with a lat/long for that record's zip code. This data is persisted.
Then, when it comes time to query, calculate a bounding box (lat1, lng1, lat2, lng2) based on X distance from the center (user provided zip code) and filter the query results based on records whose lat/lng fits within this box. You can then apply client side calculations to further and more accurately filter the list, but using this method, you can establish a base filter to minimize the number of records pulled.
Edit: You can order the results of the query based on the absolute distance from the center point in terms of abs(latU-latR) and abs(lngU-lngR) where latU/lngU is the lat/lng of the user provided zip code and latR/lngR is the lat/lng of the record in the db.
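As a rough illustration only: assuming the Advert entity gained persisted Latitude and Longitude columns, and that userLat, userLng, pageIndex and pageSize are already known (all of these names are hypothetical), the bounding-box filter, ordering and paging could be composed so that everything runs server-side:

// Sketch only: Latitude/Longitude are assumed persisted columns on Advert.
double radiusMiles = 30;                       // user-selected search radius
double latDelta = radiusMiles / 69.0;          // ~69 miles per degree of latitude
double lngDelta = radiusMiles / (69.0 * Math.Cos(userLat * Math.PI / 180.0)); // longitude degrees shrink with latitude

var nearby = repository.Adverts
    .Where(a => a.Latitude >= userLat - latDelta && a.Latitude <= userLat + latDelta)
    .Where(a => a.Longitude >= userLng - lngDelta && a.Longitude <= userLng + lngDelta)
    // order by rough distance from the centre, as described in the edit above
    .OrderBy(a => Math.Abs(a.Latitude - userLat) + Math.Abs(a.Longitude - userLng))
    .Skip(pageIndex * pageSize)
    .Take(pageSize)
    .ToList();                                 // paging now happens in SQL, not in memory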
