I have a pivot table that looks like this:
My hope is to turn the anomaly columns (A, B, C, D, M) into the frequency of each anomaly, so that each column is basically
Anomaly/# of Inspections
How can I change the format of these cells to show this frequency so that it can then be plotted over time?
From your question, and with a little help from the comments on it, it seems you want to display the volume of anomalies as a percentage of the number of inspections. For example, in week 11 you had one Anomaly C, which would be 20% of the 5 inspections.
To display 20% instead of 1, you have to change the column formula in the criteria to pretty much what you wrote in your question:
100*(Anomaly/# of Inspections)
You can't do this through formatting: formatting cannot turn one number into a different number, so you have to change the calculation.
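To make the arithmetic concrete, here is a minimal Python sketch of the same calculation outside the pivot table. The function name `anomaly_rate` and the weekly rows are made up for illustration; only the week 11 figures (1 anomaly, 5 inspections) come from the example above.

```python
def anomaly_rate(anomaly_count, inspections):
    """Return anomalies as a percentage of inspections, or None if there were none."""
    if inspections == 0:
        return None  # a week without inspections has no meaningful rate
    return 100 * anomaly_count / inspections


# Hypothetical weekly rows: (week, inspections, count of Anomaly C)
rows = [(10, 4, 0), (11, 5, 1), (12, 0, 0)]

# One rate per week, ready to be plotted over time
rates = {week: anomaly_rate(count, inspections) for week, inspections, count in rows}
print(rates)
```

The zero-inspection guard is my own addition; decide for yourself whether such weeks should show as blank or as 0.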
I have a table in Excel, utilizing Power Pivot that I then display/filter using a Pivot Table. Within my dataset, calculating a ratio within Power Pivot that "sums" correctly in the pivot table based on slicers is fine - this utilizes a SUMX(Cost)/SUMX(Total) and everything works fine. By sums correctly, I mean if I further break down the data set based on Region/State/Product/Employee, all those Rows sum up correctly for the ratio percentage.
The dataset is filtered based on a single month or range of months. The result of this works fine for either the single month or range of Months. What I'm trying to do is within my Pivot Table, show a current month ratio AND a year to date ratio. I've tried messing around with equations I've found online, but nothing seems to work. This includes the following attempts:
=CALCULATE([Cost],[ProductID]="224594")/CALCULATE([Total],[ProductID]="224594")
=SUMX (FILTER(ALL('TableName'),PATHCONTAINS ('TableName'[ProductID], EARLIER('TableName'[ProductID]))),'TableName'[Cost]) / SUMX(FILTER(ALL('TableName'),PATHCONTAINS('TableName'[ProductID], EARLIER ('TableName'[ProductID]))),'TableName'[Total])
I need the "sumifs" to sum the cost for Product A for all months, divided by the sum of total for Product A for all months. I do not want to hard-code the Product ID into the equation; I simply want to sum all previous records for that product, but I can't seem to get this to work.
Any suggestions?
Sample Data Set
I used the calculate and filter functions in a column instead of trying to use them in a measure, which fixed the problem.
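For anyone checking their numbers, this is roughly the arithmetic the all-months ratio should reproduce, sketched in Python rather than DAX. The rows and the function name `all_months_ratio` are hypothetical; the point is that it is a ratio of sums per product, not a sum of monthly ratios, and no Product ID is hard-coded.

```python
from collections import defaultdict

# Hypothetical rows: (product_id, month, cost, total)
rows = [
    ("224594", "2016-01", 10.0, 100.0),
    ("224594", "2016-02", 30.0, 100.0),
    ("998877", "2016-01", 5.0, 50.0),
]


def all_months_ratio(rows):
    """Sum cost and total per product over every month, then divide once."""
    cost = defaultdict(float)
    total = defaultdict(float)
    for product_id, _month, row_cost, row_total in rows:
        cost[product_id] += row_cost
        total[product_id] += row_total
    # Skip products whose total is zero to avoid dividing by zero
    return {p: cost[p] / total[p] for p in cost if total[p] != 0}


ratios = all_months_ratio(rows)
print(ratios)
```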
I have an incredibly large spreadsheet that lists details for the computers in my company's inventory. We need to know how many systems we have that are x years old. I was able to sort it by model, but because the model names are wildly different it didn't help much. For example, one model name is
13-inch MacBook Pro (2011)
And another is
13-inch Retina MacBook Pro (Mid 2017)
The only constant value in the parentheses is the year at the end. I'm trying to write a formula that will spit out how many of each system there are. We need to know how many are 2011 computers, how many are 2017, etc. We are fine with grouping "Early, Mid, Late" together since we just need a year separation, but those terms don't show up in every cell, which throws my math off. The rows don't have to be sorted; I just need a count.
My plan of attack would be to first convert the spreadsheet into a table using Insert > Table. This enables Excel to manage calculated columns for you.
The following assumes that the cell at the top of your list contains the word "Detail".
Second, I would make a new column at the far right with an equation like this:
=MID([@Detail], FIND(")", [@Detail]) - 4, 4)
...and I would tune the FIND and MID arguments until it gives me just the year.
Third, sort the entire table by this new column. Tada!
Transfer the data to column A. Cells A1 to A1000 in my Example.
Enter the years in column C. Cells C2 to C20 in my example.
In cell D2, enter the following Array Formula, and drag it down.
=SUM(IFERROR(IF(VALUE(LEFT(RIGHT($A$1:$A$1000,5),4))=C2,1,0),"-"))
Array Formulas are entered using Control + Shift + Enter, instead of Enter.
The formula takes the last 5 characters of each entry in column A. Then it takes the first 4 characters of this new text (to eliminate the closing parenthesis) and converts the text to a numerical value. It compares each value with the year in column C, and totals the matches.
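If the list can be exported, the same idea (take the four digits just before the closing parenthesis, then count) can be cross-checked outside Excel. This is only a sketch in Python: the helper name `count_by_year` and the third and fourth model names are invented for illustration.

```python
import re
from collections import Counter

# Four digits immediately before a closing parenthesis at the end of the text
YEAR_AT_END = re.compile(r"(\d{4})\)\s*$")


def count_by_year(model_names):
    """Count models per year, ignoring words such as Early, Mid or Late."""
    counts = Counter()
    for name in model_names:
        match = YEAR_AT_END.search(name)
        if match:  # entries without a trailing "(... YYYY)" are skipped
            counts[int(match.group(1))] += 1
    return counts


models = [
    "13-inch MacBook Pro (2011)",
    "13-inch Retina MacBook Pro (Mid 2017)",
    "15-inch MacBook Pro (Late 2011)",  # hypothetical entry
    "Unknown model",                    # hypothetical entry with no year
]
counts = count_by_year(models)
print(counts)
```

Unlike the array formula, this skips rows that do not end in a year instead of relying on an error handler.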
I hope this solves your problem.
Regards,
Vijaykumar Shetye,
Spreadsheet Excellence,
Panaji, Goa India
I'm facing the following problem. In Kibana 4 I've created a line chart based on my input from Elasticsearch, but I can only display average, min, or max instead of the actual value of the field over time, e.g. sent bytes.
Most answers to that question on Stack Overflow are about Kibana 3 (How to create value over time chart with Kibana 3?) and seem to involve a Histogram on the X axis, yet I can't find a way to apply them to Kibana 4. I was unable to find the histogram panel, and once I click on the Discover tab it shows "Searching" indefinitely.
If I have the following fields in my _source:
{"timestamp":"2015-06-02T10:16:44.0855","time":587,"threadName":"Thread Group 1-957","byte":1372,"status":"false","latence":306,"registerCall":"404"}
and I would like to have the number of bytes on the Y-axis and on the X-axis my timestamp.
Any help in the right direction will be appreciated :)
To create a value over time line chart in Kibana, follow these steps:
Go to the Visualize tab and select Line chart.
For the X-axis, select X-Axis as the bucket type, Date Histogram as the aggregation, and then your timestamp field as the date field.
Next, for the Y-axis, select Sum as the aggregation and then your bytes field (byte in the sample document) as the field.
For the X axis, what Alcanzar said is good, but as you notice, the Y axis is problematic.
Sum (suggested by "Limit") works, but since it's aggregated, it shows the total for each bucket, which may be meaningless depending on what you are trying to show. Your question isn't clear on what you want, so I'm just guessing here. One hour of requests, each of which ran for one minute and sent 1 megabyte, is indeed 60 megabyte-minutes, if you are trying to show total capacity used over that hour (maybe you are paying a bill based on usage over time). On the other hand, if you are trying to show peak usage in each interval, it would be wrong.
You said you already looked at Max and Min and they don't meet your needs. I don't suppose Standard Deviation would be any better?
I have the same concern. The best I've been able to do so far is display Min and Max simultaneously on the Y axis. When they diverge, I know I'm zoomed out too far, so I zoom in until they align. This is how I know I'm seeing individual events.
In any case, I share your frustration. I too would like to be able to show time series as easily as I can in, say, Excel.
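To see why the choice of aggregation and zoom level matters, here is a small Python sketch of what a date histogram does conceptually: it groups events into fixed time buckets and reports Sum, Min and Max for each one. It does not call Kibana or Elasticsearch. The function `bucket_stats` is hypothetical, and only the first event resembles the question's document (fractional seconds dropped); the other two are invented.

```python
from collections import defaultdict
from datetime import datetime, timezone

FORMAT = "%Y-%m-%dT%H:%M:%S"


def bucket_stats(events, bucket_seconds):
    """Group (timestamp, value) pairs into fixed-width buckets; return sum/min/max per bucket."""
    buckets = defaultdict(list)
    for timestamp, value in events:
        moment = datetime.strptime(timestamp, FORMAT).replace(tzinfo=timezone.utc)
        # Round the epoch time down to the start of its bucket
        start = int(moment.timestamp()) // bucket_seconds * bucket_seconds
        buckets[start].append(value)
    return [
        {
            "start": datetime.fromtimestamp(start, tz=timezone.utc).strftime(FORMAT),
            "sum": sum(values),
            "min": min(values),
            "max": max(values),
        }
        for start, values in sorted(buckets.items())
    ]


events = [
    ("2015-06-02T10:16:44", 1372),
    ("2015-06-02T10:16:50", 800),   # hypothetical event
    ("2015-06-02T10:17:05", 500),   # hypothetical event
]

per_minute = bucket_stats(events, 60)  # two events share the 10:16 bucket
per_second = bucket_stats(events, 1)   # every bucket holds a single event
print(per_minute)
print(per_second)
```

With one-minute buckets the first bucket's Min and Max differ, so the plotted value is an aggregate. With one-second buckets they coincide, which is the "zoom in until they align" check in code form.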
Tableau:
This may seem simple, but I ran out of the usual tricks I've used in other systems.
I want a variance column. Essentially adding a member 'Variance' to the Act/Plan dimension which only contains the members 'Actual' and 'Plan'
I've come in where the data structure and reporting is set up like so:
Actual | Plan
Profit measure
measure 2
measure 3
etc
The goal is to have a Variance column (calculated and not part of the Actual/Plan dimension)
Actual | Plan | Variance
Profit measure
measure 2
measure 3
etc
There are solutions where it works for one measure only, and I've looked into that.
i.e., create calculated fields as such
Profit_Actual | Profit_Plan | Variance
You put this on the columns, and you get a grid that I want... except a grid with only 1 measure.
This does not work if I want to run several measures on rows. Essentially the solution above will only display the Profit measure, not Measure 1_Actual , Measure 2_Plan etc.
So I tried a trick where I grouped the 3 calculated measures, i.e. Profit_Actual | Profit_Plan | Profit_Variance, as 'Profit_Measure'
Created a parameter list - 'Actual', 'Plan', 'Variance'
Now I can half achieve my goal by having the parameter on columns and the 'Profit Measure' on rows (so I can have Measure 123_group etc. down on rows too). Trouble is, I found that parameters are single select only. If only it could display all options in the custom parameter at once, I would've solved my problem.
Any ideas on how I can achieve the Variance column I want?
Virtually adding a member to a dimension/Calculated fields/tricks/workaround
Thank you
Any leads are appreciated
Gemmo
Okay. First thing: I had a really hard time trying to understand how your data is organized. Try to be clearer (describe what each entry in your database looks like, not what a specific view in Tableau looks like).
But I think I got it. I guess you have a collection of entries, and each entry has a number of measure fields (profits and etc.) and an Act/Plan field, to identify whether that entry is an actual value or a planned value. Is that correct?
Well, if that's the case, I'm sorry to say you have to calculate a variance field for each measure. Think about how your original dataset is structured. Do you think you can add a single field "Variance" to represent the variance of each measure? Well, you can: store the values in a string, and then collect them back using some string functions, but it's not very practical. The problem is that each entry has many measures; if it had only 1 measure, then a single variance field would suffice.
So, if you can reorganize your data, an easier set to work with (but with many more entries) would have the fields Measure, Value, Actual/Plan. The Measure field would hold a string identifying what you're measuring in that entry. Value would be a number representing the measured value. And Actual/Plan is the same as before. For instance:
Measure | Value | Actual/Plan
Profit | 100 | Actual
So, each line in your current model would become n entries, where n is the number of measures you have right now. That gives a larger dataset, but one that is easier to work with. Think about it: now you can have a calculated field, and use table calculations to calculate the variance only for that measure and/or Actual/Plan. Just use WINDOW_VAR, and put Measure and/or Actual/Plan in the partition.
Table calculations are awesome, take a look at this to understand it better. http://onlinehelp.tableausoftware.com/current/pro/online/en-us/help.htm#calculations_tablecalculations_understanding_addressing.html
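If the reshaping has to happen before the data reaches Tableau, the step from one row with many measure columns to n rows of Measure, Value, Actual/Plan can be sketched like this in Python. The function name `unpivot`, the column name "Measure 2" and all the numbers are placeholders, since I don't know your real fields.

```python
def unpivot(rows, measure_names, id_field="Actual/Plan"):
    """Turn each wide row into one long row per measure."""
    long_rows = []
    for row in rows:
        for name in measure_names:
            long_rows.append(
                {"Measure": name, "Value": row[name], id_field: row[id_field]}
            )
    return long_rows


# Hypothetical wide data: one row per Actual/Plan entry, one column per measure
wide = [
    {"Actual/Plan": "Actual", "Profit": 100, "Measure 2": 40},
    {"Actual/Plan": "Plan", "Profit": 90, "Measure 2": 50},
]

long_rows = unpivot(wide, ["Profit", "Measure 2"])
for row in long_rows:
    print(row)
```

Two wide rows with two measures each become four long rows, which is the "n entries per line" growth mentioned above.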
I generally like to have my data staged such that Actual is its own column and Plan is its own column in the data being fed to Tableau. It makes calculations so much easier.
If your data is such that there is a column called "Actual/Plan" and every row is populated with either "Actual" or "Plan" and there is another column called "Value" or "Measure" that is populated with the values, you can force Tableau to make them columns assuming you can't or won't rearrange your data.
Create a calculated field called "Actual" with the following calc:
IF [Actual/Plan] = 'Actual' THEN [Value] END
Similarly, create a calculated field called "Plan" with the following calc:
IF [Actual/Plan] = 'Plan' THEN [Value] END
Now, you can finally create your "Variance" and "Variance %" calculations (respectively):
SUM([Actual]) - SUM([Plan])
[Variance] / SUM([Plan])
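To sanity-check the numbers these calculated fields should produce, here is the same logic as a small Python sketch: conditionally sum Value into Actual and Plan per measure, then take the difference and the ratio. The function name `variance_table` and the sample rows are made up, and the `None` for a zero plan is my own choice rather than anything Tableau does.

```python
from collections import defaultdict


def variance_table(long_rows):
    """Per measure: SUM(Actual), SUM(Plan), Variance and Variance %."""
    actual = defaultdict(float)
    plan = defaultdict(float)
    for row in long_rows:
        # Mirrors IF [Actual/Plan] = 'Actual' THEN [Value] END (and the Plan twin)
        if row["Actual/Plan"] == "Actual":
            actual[row["Measure"]] += row["Value"]
        elif row["Actual/Plan"] == "Plan":
            plan[row["Measure"]] += row["Value"]

    table = {}
    for measure in sorted(set(actual) | set(plan)):
        a, p = actual[measure], plan[measure]
        variance = a - p
        table[measure] = {
            "Actual": a,
            "Plan": p,
            "Variance": variance,
            "Variance %": variance / p if p else None,  # no ratio without a plan
        }
    return table


# Hypothetical long-format rows
long_rows = [
    {"Measure": "Profit", "Value": 100, "Actual/Plan": "Actual"},
    {"Measure": "Profit", "Value": 80, "Actual/Plan": "Plan"},
    {"Measure": "Measure 2", "Value": 40, "Actual/Plan": "Actual"},
    {"Measure": "Measure 2", "Value": 50, "Actual/Plan": "Plan"},
]

table = variance_table(long_rows)
for measure, columns in table.items():
    print(measure, columns)
```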
I am quite new to R and am trying to do a correspondence analysis (corresp from the MASS package) on summarized data. While the output shows row and column scores, the resulting biplot shows the column scores as zero, making the plot unreadable (all values are arranged by row scores in the expected manner, but flat along the column scores).
The code is:
corresp(some_data)
biplot(corresp(some_data, nf = 2))
I would be grateful for any suggestions as to what I'm doing wrong and how to amend this, thanks in advance!
Martin
link to the image
the plot
corresp results
As suggested here:
http://www.statsoft.com/textbook/correspondence-analysis
the biplot actually depicts the distributions of the row/column variables over the 2 extracted dimensions where the variables' dependency is "the sharpest".
It looks like in your case a good deal of the dependency is concentrated along just one dimension, while the second dimension is already much less significant.
It does not seem, however, that your relationships are weak. On the contrary, looking at your graph, one can observe the red (column) variable's intersection with 2 distinct regions of the other variable's values.
Makes sense?
Regards,
Igor