Reporting Multiple Values & Sorting - sorting

Having a bit of an issue and unsure if it's actually possible to do.
I'm working on a file that I will enter target progression vs actual target reporting the % outcome.
PAGE 1
¦NAME ¦TAR 1 %¦TAR 2 %¦TAR 3 %¦TAR 4 %¦OVERALL¦SUB 1¦SUB 2¦SUB 3¦
¦NAME1¦ 114%¦ 121%¦ 100%¦ 250%¦ 146%¦ 2¦ 0¦ 0%¦
¦NAME2¦ 88%¦ 100%¦ 90%¦ 50%¦ 82%¦ 0¦ 1¦ 0%¦
¦NAME3¦ 82%¦ 54%¦ 64%¦ 100%¦ 75%¦ 6¦ 6¦ 15%¦
¦NAME4¦ 103%¦ 64%¦ 56%¦ 43%¦ 67%¦ 4¦ 4¦ 24%¦
¦NAME5¦ 87%¦ 63%¦ 89%¦ 0%¦ 60%¦ 3¦ 2¦ 16%¦
Now I already have it sorting all rows by the Overall % column so I can quickly see at a glance but I am creating a second page that I need to reference points.
So on the second page I would like to somehow sort and reference different columns for example
PAGE 2
TOP TAR 1¦Name of top %¦Top %¦
TOP TAR 2¦Name of top %¦Top %¦
Is something like this possible to do?
Essentially I'm creating an Employee of the Month form that automatically works out who has topped what.
I'm willing to drop a paypal donation for whoever can figure this out for me as I've been doing it manually every month and would appreciate the time saved

I don't think a complicated array formula is necessary for this - I am suggesting a fairly standard Index/Match approach.
First set up the row titles - you can just copy and transpose them from Page 1, or use a formula in A2 of Page 2 like
=transpose('Page 1'!B1:E1)
The use them in an index/match to get the data in the corresponding column of the main sheet and find its maximum (in C2)
=max(index('Page 1'!A:E,0,match(A2,'Page 1'!A$1:E$1,0)))
Finally look up the maximum in the main sheet to find the corresponding name:
=index('Page 1'!A:A,match(C2,index('Page 1'!A:E,0,match(A2,'Page 1'!A$1:E$1,0)),0))
If you think there could be a tie for first place with two or more people getting the same score, you could use a filter to get the different names:
So if the max score is in B8 this time (same formula)
=max(index('Page 1'!A:E,0,match(A8,'Page 1'!A$1:E$1,0)))
the different names could be spread across the corresponding row using transpose (in C8)
=ArrayFormula(TRANSPOSE(filter('Page 1'!A:A,index('Page 1'!A:E,0,match(A8,'Page 1'!A$1:E$1,0))=B8)))
I have changed the test data slightly to show these different scenarios
Results

Related

Extracting data from text file in AMPL without adding indexes

I'm new to AMPL and I have data in a text file in matrix form from which I need to use certain values. However, I don't know how to use the matrices directly without having to manually add column and row indexes to them. Is there a way around this?
So the data I need to use looks something like this, with hundreds of rows and columns (and several more matrices like this), and I would like to use it as a parameter with index i for rows and j for columns.
t=1
0.0 40.95 40.36 38.14 44.87 29.7 26.85 28.61 29.73 39.15 41.49 32.37 33.13 59.63 38.72 42.34 40.59 33.77 44.69 38.14 33.45 47.27 38.93 56.43 44.74 35.38 58.27 31.57 55.76 35.83 51.01 59.29 39.11 30.91 58.24 52.83 42.65 32.25 41.13 41.88 46.94 30.72 46.69 55.5 45.15 42.28 47.86 54.6 42.25 48.57 32.83 37.52 58.18 46.27 43.98 33.43 39.41 34.0 57.23 32.98 33.4 47.8 40.36 53.84 51.66 47.76 30.95 50.34 ...
I'm not aware of an easy way to do this. The closest thing is probably the table format given in section 9.3 of the AMPL Book. This avoids needing to give indices for every term individually, but it still requires explicitly stating row and column indices.
AMPL doesn't seem to do a lot with position-based input formats, probably because it defaults to treating index sets as unordered so the concept of "first row" etc. isn't meaningful.
If you really wanted to do it within AMPL, you could probably put together a work-around along these lines:
declare a single-index param with length equal to the total size of your matrix (e.g. if your matrix is 10 x 100, this param has length 1000)
edit the beginning and end of your "matrix" data file to turn it into appropriate format for a single-index parameter indexed from 1 to n
then define your matrix something like this:
param m{i in 1..nrows,j in 1..ncols} := x[j+i*(ncols-1)];
(not tested, I won't promise that I have rows and columns the right way around there!)
But you're probably better off editing the input file into one of the standard AMPL matrix formats. AMPL isn't really designed for data wrangling - you can do it in a pinch but if you're doing this kind of thing repeatedly it may be less trouble to code it in a general-purpose language e.g. Python.

rowscap and filter applied in wrong order in DC.JS rowChart

Still using DC.JS to get some analysis tools written for our tool performance. Thanks so much for having this library available.
I am trying to show which recipe setup times are the worst for a given set of data. Everything works great as long as you show the whole group. When you only display the specified topN using .rowscap on the rowChart the following happens:
The chart will show the right number of bars and they are even sorted properly but the chart has picked the topN unfiltered bars first and then ordered them. I want it to pick the topN from the ordered list, not the other way around. See jsfiddle for demo. (http://jsfiddle.net/za8ksj45/24/)
in the fiddle, the longest setup time belongs to recipeD.
But if you have more than two recipes selected before recipeD
it is dropped of the right (top2) chart.
line 099-110: reductio definition
line 120-140: removal of empty bins (works okay)
(This is very similar to a problem Gordon helped resolved earlier (dc.js rowChart topN without zeros) and I reused the code from that solution. Something went 'wrong' when I combined it with the reductio.js library.)
I think I am not returning the value portion of the reductio group somewhere but have been unable to figure it out. Any help would be appreciated.
The issue is that at the time you .slice(0,n) the group in your function to remove empty bins, the group is not ordered, so you effectively get a random 2 groups, not the top 2 groups. This is actually clear from the unfiltered view, as the "top2" view shows the 2nd and 3rd group from the "all" view, not the actual top 2 (at least for me).
The previous example worked because Crossfilter's standard groups are ordered by default, but in the case of a complex group like the one you are generating with Reductio, what should it order by? There's no way it can know, so Reductio doesn't mess with the ordering at all, which I suppose means it is ordering by the value property, which is an object.
You need to add one line to order your FactsByRecipe group by average and I think it should fix your problem:
FactsByRecipe.order(function(d) { return d.avg; });
Note that there can only be one ordering on a Crossfilter group, so if you want to show "top X" for more than one property of that group you'll need to create another wrapper (like the remove empty bins wrapper) but have the "top" function re-sort the group by the ordering you want.
Good luck!

SSRS 2008: Using StDevP from multiple fields / Combining multiple fields in general

I'd like to calculate the standard deviation over two fields from the same dataset.
example:
MyFields1 = 10, 10
MyFields2 = 20
What I want now, is the standard deviation for (10,10,20), the expected result is 4.7
In SSRS I'd like to have something like this:
=StDevP(Fields!MyField1.Value + Fields!MyField2.Value)
Unfortunately this isn't possible, since (Fields!MyField1.Value + Fields!MyField2.Value) returns a single value and not a list of values. Is there no way to combine two fields from the same dataset into some kind of temporary dataset?
The only solutions I have are:
To create a new Dataset that contains all values from both fields. But this is very annoying because I need about twenty of those and I have six report parameters that need to filter every query. => It's probably getting very slow and annoying to maintain.
Write the formula by hand. But I don't really know how yet. StDevP is not that trivial to me. This is how I did it with Avg which is mathematically simpler:
=(SUM(Fields!MyField1.Value)+SUM(Fields!MyField2.Value))/2
found here: http://social.msdn.microsoft.com/Forums/is/sqlreportingservices/thread/7ff43716-2529-4240-a84d-42ada929020e
Btw. I know that it's odd to make such a calculation, but this is what my customer wants and I have to deliver somehow.
Thanks for any help.
CTDevP is standard deviation.
Such expression works fine for me
=StDevP(Fields!MyField1.Value + Fields!MyField2.Value) but it's deviation from one value (Fields!MyField1.Value + Fields!MyField2.Value) which is always 0.
you can look here for formula:
standard deviation (wiki)
I believe that you need to calculate this for some group (or full dataset), to do this you need set in the CTDevP your scope:
=StDevP(Fields!MyField1.Value + Fields!MyField2.Value, "MyDataSet1")

Count nodes based on two or more attribute values

I need to count how many times a particular node occurs in a document based on the values of two if its attributes. So, given the following small sample of XML:
<p:entry timestamp="2012-11-15T17:53:34.642-05:00" ticks="89709622449012" system="OSD" component="OSD5" marker=".\Launcher.cpp:1741" severity="Info" type="Driver" subtype="Start" tags="" sensitivity="false">
This can occur one or more times in the document with different attribute sets. I need to count how many show up with type="Driver" AND subtype="Start". I am able to count how many just have type="Driver" using:
count(//p:entry[#type="Driver"])
but haven't been able to combine them. This didn't work:
count(//p:entry[#type="Driver" and #subtype="Start"])
This works for the OP. Specify 2 predicates in succession instead of using operator and result in the same effect:
count(//p:entry[#type="Driver"][#subtype="Start"])
By right, the original code count(//p:entry[#type="Driver" and #subtype="Start"]) should work, as far as my knowledge goes.

R: Which heatmap/image to get row-sorted plot without any dendrogram?

Which package is best for a heatmap/image with sorting on rows only, but don't show any dendrogram or other visual clutter (just a 2D colored grid with automatic named labels on both axes). I don't need fancy clustering beyond basic numeric sorting. The data is a 39x10 table of numerics in the range (0,0.21) which I want to visualize.
I searched SO (see this) and the R sites, and tried a few out. Check out R Graphical Manual to see an excellent searchable list of screenshots and corresponding packages.
The range of packages is confusing - which one is the preferred heatmap (like ggplot2 is for most other plotting)? Here is what I found out so far:
base::image - bad, no name labels on axes, no sorting/clustering
base::heatmap - options are far less intelligible than the following:
pheatmap::pheatmap - fantastic but can't seem to turn off the
dendrograms? (any hacks?)
ggplot2 people use geom_tile, as Andrie points out
gplots::heatmap.2 , ref - seems
to be favored by biotech people, but way overkill for my purposes. (no
relation to ggplot* or Prof Wickham)
plotrix::color2D.matplot also exists
base::heatmap is annoying, even with args heatmap(..., Colv=NA, keep.dendro=FALSE) it still plots the unwanted dendrogram on rows.
For now I'm going with pheatmap(..., cluster_cols=FALSE, cluster_rows=FALSE) and manually presorting my table, like this guy: Order of rows in heatmap?
Addendum: to display the value inside each cell, see: display a matrix, including the values, as a heatmap . I didn't need that but it's nice-to-have.
With pheatmap you can use options treeheight_row and treeheight_col and set these to 0.
just another option you have not mentioned...package bipartite as it is as simple as you say
library(bipartite)
mat<-matrix(c(1,2,3,1,2,3,1,2,3),byrow=TRUE,nrow=3)
rownames(mat)<-c("a","b","c")
colnames(mat)<-c("a","b","c")
visweb(mat,type="nested")

Resources