BIRT: Adding multiple Category (X) Series - reporting

I have a single dataset containing 4 columns, each showing the number of rejections for a quarter-year. A 5th column shows the Team to which those values belong.
Is it possible to add 4 fixed points on the x-Axis, each belonging to one of these columns? Then I could add the Team as the Y-Series. I'd like to see the evolution of each team in time.

Take a look at this example:
http://www.birt-exchange.org/org/devshare/designing-birt-reports/1553-use-column-names-as-chart-xaxis/

I solved this by writing a (really short, really simple) loop which takes the values out of the four mentioned columns, one at a time, and creates four rows instead.
So the rows basically contain the Team from the original row (duplicated 4 times) and the Number of Rejections. So now instead of a Row with one team and four numbers, I have four rows, each with one team and one number.
I did this all in the report scripts (under "fetch"). Try it, it's really easy.
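The reshaping described above (one wide row per team turned into four narrow rows) can be sketched outside BIRT as follows. This is only an illustration in Python, not the actual report script; the column names `team`, `q1` to `q4`, `quarter` and `rejections` are assumptions made up for the example.

```python
# Hypothetical wide rows: one row per team, one column per quarter.
wide_rows = [
    {"team": "A", "q1": 5, "q2": 3, "q3": 4, "q4": 1},
    {"team": "B", "q1": 2, "q2": 6, "q3": 0, "q4": 2},
]
QUARTERS = ["q1", "q2", "q3", "q4"]


def unpivot(rows, value_columns):
    """Turn each wide row into one long row per value column."""
    long_rows = []
    for row in rows:
        for col in value_columns:
            # The team is duplicated; the quarter name becomes the category (X) value.
            long_rows.append(
                {"team": row["team"], "quarter": col, "rejections": row[col]}
            )
    return long_rows


long_rows = unpivot(wide_rows, QUARTERS)
for r in long_rows:
    print(r)
```

With the data in this shape, the quarter can serve as the category axis and the team as the series grouping.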


Create a Dynamic Array formula (Excel) to combine multiple results columns into one column that is filtered & sorted using multiple criteria?

The sample data in the image below is collected from a round robin tournament.
There is a Round column, plus Home team & Away team columns listing who is playing whom. A team could be either Home or Away.
For each match in a round (including any "Bye" match) the number of games won for the Home and Away team are recorded in separate columns respectively.
"Ff" = forfeit and has a value of 0. "Bye" result is left blank (at this stage).
Output columns are "Won, Lost, Round".
Required output (shown in the image) is, for any selected team, the top n most-games-won matches (from both Home & Away) sorted in descending order and then the corresponding games lost but sorted in ascending order where the games won are equal. Finally show the rounds where those scores occurred.
These are the challenges I've faced in going from data to output in one step using dynamic array formula:
Collating/Combining the Win results into 1 column. Likewise the Losses.
Getting the array to ignore blanks or convert "Ff" to 0 without getting #NUM or #VALUE errors.
Ensuring that if I used separate single column arrays the corresponding Loss and Round matched the Win result
Although "Round, Won, Lost" would be acceptable. But I wasn't able to get the Dynamic Array capability to give the required output with this order.
SUMPRODUCT, INDEX(MATCH), SORT(FILTER) functions all hint at a possible one step formula solution.
The solutions are numerous for sorting & filtering where the existing values are already in one column. There was one solution that dealt with 2 columns of values, which was somewhat useful: "How to get the highest values from 2 columns in excel" (Stackoverflow, 2013).
Many other responses are around the use of concatenation, combining/merging array sets, aggregation etc.
My work around solution is to use a Helper Sheet to combine the Wins from the separate results columns and convert blanks & "Ff" to -1. Likewise for Losses. Using the formula for each line
=IF($C5=L$2,IF($F5="",-1,IF($F5="Ff",0,$F5)),IF($D5=L$2,IF($G5="",-1,IF($G5="Ff",0,$G5)),-1))
Example Helper Sheet
To get the final output the Dynamic Array formula was used on the Helper Sheet data
=SORT(FILTER(L$26:N$40,L$26:L$40>=LARGE(L$26:L$40,$J$3),""),{1,2},{-1,1},FALSE)
I'm trying to avoid using pivottable, VBA solutions. Powerquery possible but not preferred.
Apologies for the screenshots but I couldn't work out how to attach the sample spreadsheet file. (Unfortunately Stackoverflow Help didn't help me to/not to do this.)
Based on the comments I changed my answer with a different approach:
=LET(data,A5:F19,
round,INDEX(data,,1),
ha,CHOOSECOLS(data,3,4),
HAwonR,CHOOSECOLS(data,5,6,1),
w,BYROW(ha,LAMBDA(h,IFERROR(XMATCH(L2,h),0))),
clm,CHOOSE(w,{1,2},{2,1}),
srtwon,DROP(REDUCE(0,SEQUENCE(ROWS(data)),LAMBDA(y,z,VSTACK(y,INDEX(HAwonR,z,HSTACK(INDEX(clm,z,),3))))),1),
res,FILTER(srtwon,w),
TAKE(SORT(res,{1,2},{-1,1}),J3))
Old answer:
=LET(data,A5:F19,
round,INDEX(data,,1),
home,INDEX(data,,3),
away,INDEX(data,,4),
HAwonR,CHOOSECOLS(data,5,6,1),
w,MAP(home,away,LAMBDA(h,a,OR(h=L2,a=L2))),
won,FILTER(HAwonR,w),
TAKE(SORT(won,{1,2},{-1,1}),J3))
In your example you selected round 3 for the third result, but that wasn't won, so I guess that was by mistake.
As you can see, using LET avoids helpers. LET allows you to create names (helpers) that are stored, and because you can name them, complex formulas become more readable.
Basically, it filters the Home won, Away won and Round columns (in that order) for rows where either Home or Away equals the team in cell L2. The result is sorted by column 1 descending and column 2 ascending. Then the number of rows given in cell J3 is displayed from that sorted array.
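For readers who find the logic easier to follow outside a spreadsheet, here is a rough Python sketch of the same steps: filter the matches for one team, sort by games won descending then games lost ascending, and take the top n. It swaps won/lost when the team played away (as the revised formula does), treats "Ff" as 0 and skips blank bye results. The sample matches and the names `games` and `top_results` are invented for illustration.

```python
def games(value):
    """Convert a result cell to a number; a forfeit ("Ff") counts as 0."""
    return 0 if value == "Ff" else int(value)


def top_results(matches, team, n):
    """Return the top n (won, lost, round) rows for the given team."""
    rows = []
    for rnd, home, away, home_won, away_won in matches:
        if team == home:
            won, lost = home_won, away_won
        elif team == away:
            won, lost = away_won, home_won
        else:
            continue  # the team did not play in this match
        if won == "" or lost == "":
            continue  # bye: result left blank
        rows.append((games(won), games(lost), rnd))
    # Won descending, then lost ascending where the games won are equal.
    rows.sort(key=lambda r: (-r[0], r[1]))
    return rows[:n]


# Hypothetical data: (round, home team, away team, home games won, away games won)
matches = [
    (1, "Red", "Blue", 6, 2),
    (2, "Green", "Red", 4, 6),
    (3, "Red", "Bye", "", ""),
    (4, "Blue", "Red", "Ff", 6),
    (5, "Red", "Green", 3, 5),
]

print(top_results(matches, "Red", 3))  # [(6, 0, 4), (6, 2, 1), (6, 4, 2)]
```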
Here is my solution based on the excellent contribution by #P.b. Thank you, much appreciated.
The wins (likewise the losses) required mapping the presence of the team in question as hT (home team) to the games it won (hG), and adding to that a second mapping of the games it won (aG) when it was the away team (aT). Essentially, this is what was being done on the Helper Sheet. The result was a 1-column array for game wins and a 1-column array for game losses.
In the process I was able to convert the "Ff" text to 0. I attempted it without the conversion and it threw an error.
Instead of CHOOSECOLS, I used HSTACK to create the new array (wins, losses & round) for FILTER, SORT and TAKE to work on.
Whether it could be made more concise is the next challenge. Overall (not just my solution), this exercise has provided greater flexibility and solved the problems stated. I'm happy!
=LET(data,A5:G19,
round,INDEX(data,,1),
hT,INDEX(data,,3),
aT,INDEX(data,,4),
hG,INDEX(data,,6),
aG,INDEX(data,,7),
wins,MAP(hG,
MAP(hT,LAMBDA(h,h=L2)),
LAMBDA(w,t,IF(w="Ff",0,w)*IF(t=TRUE,1,0))) +
MAP(aG,
MAP(aT,LAMBDA(a,a=L2)),
LAMBDA(w,t,IF(w="Ff",0,w)*IF(t=TRUE,1,0))),
losses,MAP(aG,
MAP(hT,LAMBDA(h,h=L2)),
LAMBDA(w,t,IF(w="Ff",0,w)*IF(t=TRUE,1,0))) +
MAP(hG,
MAP(aT,LAMBDA(a,a=L2)),
LAMBDA(w,t,IF(w="Ff",0,w)*IF(t=TRUE,1,0))),
HAwonR,HSTACK(wins,losses,round),
w,MAP(hT,aT,LAMBDA(h,a,OR(h=L2,a=L2))),
won,FILTER(HAwonR,w),
TAKE(SORT(won,{1,2},{-1,1}),J3))

The column of the CSV file in Google AutoML Tables is recognised as text or categorical instead of numeric, as I would like

I tried to train a model using Google AutoML Tables, but I have the following problem.
The CSV file is correctly imported; it has 2 columns and about 1870 rows, all numeric.
The system recognises only 1 column as numeric, but not the other.
The problematic column has 5 digits in each row, separated by spaces.
Is there anything I should do so that the system properly recognises the data as numeric?
Thanks in advance for your help.
The issue lies in how the Numeric data type is defined: a numeric value must be comparable (greater than, smaller than, equal).
Two different lists of numbers are not comparable; for example, 2 4 7 is not comparable to 1 5 7. To solve this without resorting to strings (which would lose the "information" in those numbers), you have several options.
For example:
1. Create an array of numbers by wrapping each entry of the second column in [ ]. Keep in mind the relative weighted approach AutoML Tables applies to the Array data type, as it may affect the "information" extracted from the sequence.
2. Create an additional column for every number in the second column's entries, so that each column holds a single number and is therefore truly numeric.
I would personally go for the second option.
If you are afraid of losing "information" by splitting the numbers, take into consideration that after training, the model should deduce by itself the importance of the position and other "information" those number sequences might contain (mean, norm/modulus, relative increase, ...), provided the training data is representative.
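Splitting the space-separated column into one column per number can be done before import. Below is a minimal Python sketch using only the standard library; the header names `target` and `sequence` and the sample values are assumptions, and it presumes every row holds the same count of numbers.

```python
import csv
import io

# Hypothetical input: the second column holds space-separated numbers.
raw = "target,sequence\n10,2 4 7 1 9\n12,1 5 7 3 8\n"


def split_sequence_column(text, column="sequence"):
    """Replace one space-separated column with one column per number."""
    reader = csv.DictReader(io.StringIO(text))
    out_rows = []
    for row in reader:
        parts = row.pop(column).split()
        for i, part in enumerate(parts, start=1):
            row[f"{column}_{i}"] = part
        out_rows.append(row)

    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=list(out_rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(out_rows)
    return out.getvalue()


result = split_sequence_column(raw)
print(result)
```

Each of the new columns then contains a single number per row, which is the shape the second option describes.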

Google Sheets calculate characters only once

Is there a formula in Google Sheets to count a character only once? For example, a row has 5 columns (Monday-Friday) and 2 or 3 of them are marked with an X. How can I count how many rows have an X? I don't need to know how many Xs there are, just how many rows have one.
Reina, I have one answer, though there may be better ones.
This formula, pasted into B34, should do what you want. It merges the cells in columns B to F of each row into one value, strips out any spaces, then checks whether the result contains at least one "y" (as used in your example).
=COUNTIF(ARRAYFORMULA(
SUBSTITUTE(B4:B29&C4:C29&D4:D29&E4:E29&F4:F29," ","")),
"*y*")
It is coded to search all student rows, ie. between 4 and 29 - change these row numbers if necessary.
If the attendance might be marked with something other than a "y", you could change the "y" part of the formula to "?*". I just didn't know if other values might be used, e.g. an "S" for a sick day or something, and you wanted to ignore those.
Then, you can drag the new formula from B34, sideways on row 34, to G34 and beyond, and it should calculate the results for the subsequent weeks. It will shift the columns being checked by the formula automatically.
Let me know if this works for you, or if you need something else.
To possibly ease data entry, here is a sample sheet with the formula, but with check boxes replacing the cells where attendance is marked.
https://docs.google.com/spreadsheets/d/1ON5Rc55aLVq_LHtFOfpgmf876bYg2ITfwpbifklr3lU/edit?usp=sharing
Here the formula is slightly modified to look for "TRUE" values, instead of "y"s.
UPDATE: To look for ANY non-blank cell in that range, and count "1" for every student that week that attended at least one day, the formula is:
=COUNTIF(
ARRAYFORMULA( B4:B29&C4:C29&D4:D29&E4:E29&F4:F29), ">""")
or
=COUNTIF(
ARRAYFORMULA( B4:B29&C4:C29&D4:D29&E4:E29&F4:F29), "?*")
See sample here:
https://docs.google.com/spreadsheets/d/1ON5Rc55aLVq_LHtFOfpgmf876bYg2ITfwpbifklr3lU/edit#gid=461771088&range=B34:F34
Let me know if this answers your question, or do you need to do something specific with the "y,x, and o"s?
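The idea behind these formulas (join each row's cells, then count the rows whose joined value is non-empty) can be sketched in Python as follows. The attendance grid is invented sample data, and cells containing only spaces are treated as blank, as in the first formula.

```python
# Hypothetical attendance grid: one row per student, Monday to Friday.
attendance = [
    ["x", "", "x", "", ""],
    ["", "", "", "", ""],
    ["", "y", "", "", " "],
    ["x", "x", "x", "x", "x"],
]


def rows_with_any_mark(rows):
    """Count each row once if at least one cell holds a non-blank mark."""
    return sum(1 for row in rows if any(cell.strip() for cell in row))


print(rows_with_any_mark(attendance))  # 3
```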

Advanced Excel Search and Sorting

I have an incredibly large spreadsheet that lists details for the computers in my company's inventory. We need to know how many systems we have that are x years old. I was able to sort it by model but because the model names are wildly different it didn't help much. For example, one model name is
13-inch MacBook Pro (2011)
And another is
13-inch Retina MacBook Pro (Mid 2017)
The only constant value in the parentheses is the year at the end. I'm trying to write a formula that will spit out how many of each system there are. We need to know how many are 2011 computers, how many are 2017, etc. We are fine with grouping up "Early, Mid, Late" since we just need a year separation but those terms don't show up in every cell throwing my math off. The rows don't have to be sorted, I just need a count.
My plan of attack would be to first, convert the spreadsheet into a table using Insert > Table... this enables Excel to manage calculating columns for you.
The following assumes that the cell at the top of your list contains the word "Detail".
Second, I would make a new column at the far right with an equation like this:
=MID([@Detail], FIND(")",[@Detail])-4, 4)
...and I would tune the "Find" function and the "mid" function until it gives me just the year.
Third, sort the entire table by this new column. Tada!
Transfer the data to column A. Cells A1 to A1000 in my Example.
Enter the years in column C. Cells C2 to C20 in my example.
In cell D2, enter the following Array Formula, and drag it down.
=SUM(IFERROR(IF(VALUE(LEFT(RIGHT($A$1:$A$1000,5),4))=C2,1,0),"-"))
Array Formulas are entered using Control + Shift + Enter, instead of Enter.
The Formula takes the last 5 characters of all entries in the column A. Then it takes the first 4 characters of this new text (to eliminate the closing bracket) and converts the text entries to numerical values. It matches each entry with the year in column C, and totals the matches.
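The same extract-and-count idea can be sketched in Python for anyone working with an exported copy of the list. The pattern assumes, as the question states, that the year is the last four digits before the closing parenthesis; the sample model names beyond the two from the question are made up.

```python
import re
from collections import Counter

# Four digits immediately before a closing parenthesis at the end of the name.
YEAR_AT_END = re.compile(r"(\d{4})\)\s*$")


def count_by_year(models):
    """Count how many model names end in each year; skip names without one."""
    counts = Counter()
    for name in models:
        match = YEAR_AT_END.search(name)
        if match:
            counts[int(match.group(1))] += 1
    return counts


models = [
    "13-inch MacBook Pro (2011)",
    "13-inch Retina MacBook Pro (Mid 2017)",
    "15-inch MacBook Pro (Late 2011)",
    "Unknown model",
]

print(count_by_year(models))
```

Because only the trailing four digits are matched, "Early", "Mid" and "Late" are grouped under the same year.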
I hope this solves your problem.
Regards,
Vijaykumar Shetye,
Spreadsheet Excellence,
Panaji, Goa India

Resorting items with a single database update

Let's say you're reordering items in your queue on Netflix. For every example I've seen of this sort of thing, when you move the last item to the top, it updates every record in the database, one at a time.
1. One Fine Day ==> change sort order from 1 to 2
2. Two and a Half Men ==> change sort order from 2 to 3
3. Three Kings (move to top) ==> change sort order from 3 to 1
Is there a better way to do this? Maybe one that only requires one database update each time you reorder an item? Consider this:
1. One Fine Day ==> do nothing (sort order stays at 1)
2. Two and a Half Men ==> do nothing (sort order stays at 2)
3. Three Kings (move to top) ==> change sort order from 3 to 0
Moving an item between two other items would split the difference between the sort orders:
1. One Fine Day ==> do nothing (sort order stays at 1)
2. Two and a Half Men ==> do nothing (sort order stays at 2)
3. Three Kings (move to mid) ==> change sort order from 3 to 1.5
To go one step further, we can use a larger character set than just digits, maybe going to base64 and sorting alphabetically, which would give you near unlimited resorting before having to reorder all the items to keep working space between items.
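A minimal sketch of the "split the difference" idea, assuming sort orders are stored as floating-point numbers. The function name `position_between` and the step of 1.0 at either end of the list are choices made for this illustration only.

```python
def position_between(before, after):
    """Return a sort order that falls between two neighbours.

    `before` / `after` are the neighbours' sort orders, or None when the
    item is being moved to the very top / very bottom of the list.
    """
    if before is None and after is None:
        return 1.0  # first item in an empty list
    if before is None:
        return after - 1.0  # move to top
    if after is None:
        return before + 1.0  # move to bottom
    return (before + after) / 2.0  # move between two items


queue = {"One Fine Day": 1.0, "Two and a Half Men": 2.0, "Three Kings": 3.0}

# Move "Three Kings" between the first two items: only one record changes.
queue["Three Kings"] = position_between(
    queue["One Fine Day"], queue["Two and a Half Men"]
)
print(sorted(queue, key=queue.get))
```

Repeatedly halving the same gap eventually exhausts floating-point precision, which is the point at which all items would need to be renumbered.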
Anyways, what is the smartest way to hit your DB when resorting?
Your scenario, as I understood it, is the following:
For the sake of simplicity, let's imagine that we have 3 tables: Movies, Users and UserPreferences (the last one with the columns UserId, MovieId and Ordinal).
Each time a user changes the order of their favorite movies, we should update the UserPreferences table.
But that usually requires updating the Ordinal column for at least 2 records (with some exceptions, but I'm not narrowing the overall logic to those cases).
So, the question would be: how can we avoid multiple updates and update only one record instead?
If the above is correct, a workaround would be denormalization. There is no general solution, and each direction you might choose comes with its own caveats, but there are a couple of alternatives that I suggest you consider:
To have not three, but only two columns in the UserPreferences, by keeping the UserId column and storing a sequence of the user's favorite movies ids in another column, OrderedMovieIds.
To convert the Ordinal column into a finite number of columns which would indicate the preferences of the user: Ordinal1, Ordinal2, ... OrdinalN (of course this is limited by the table maximum number of columns).
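The first alternative can be sketched with an in-memory SQLite database. The table and column names follow the ones used above; storing the ids as a comma-separated string and the helper name `move_movie` are assumptions made for this example, not a recommendation of a particular storage format.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE UserPreferences ("
    "UserId INTEGER PRIMARY KEY, OrderedMovieIds TEXT NOT NULL)"
)
# Hypothetical user 1 with three favorite movies, stored in display order.
conn.execute("INSERT INTO UserPreferences VALUES (?, ?)", (1, "10,20,30"))


def move_movie(conn, user_id, movie_id, new_index):
    """Reorder one movie and persist the change with a single UPDATE."""
    row = conn.execute(
        "SELECT OrderedMovieIds FROM UserPreferences WHERE UserId = ?",
        (user_id,),
    ).fetchone()
    ids = [int(x) for x in row[0].split(",")]
    ids.remove(movie_id)
    ids.insert(new_index, movie_id)
    conn.execute(
        "UPDATE UserPreferences SET OrderedMovieIds = ? WHERE UserId = ?",
        (",".join(str(i) for i in ids), user_id),
    )
    conn.commit()
    return ids


new_order = move_movie(conn, 1, 30, 0)  # move the last movie to the top
print(new_order)  # [30, 10, 20]
```

The trade-off is that the order can no longer be joined or filtered in SQL without parsing the stored sequence.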
