First in first out inventory (FIFO) formula very slow calculation - performance

I have a sheet that records product transactions from one inventory to another and from suppliers to inventories. I imported a formula for FIFO valuation from an Excel sheet into Google Sheets, but when I fill that formula down to all rows the sheet becomes very slow.
The link to my sheet is below.
This is one of the formulas:
=ARRAY_CONSTRAIN(ARRAYFORMULA(SUM(--IF(MMULT(--(ROW(F$17:F18)>=TRANSPOSE(ROW(F$17:F18))),--IF(E$17:E18=N19,F$17:F18,0))<SUMIF(N$18:N19,N19,O$18:O19),1,0))), 1, 1)
and
=ARRAY_CONSTRAIN(ARRAYFORMULA(SUMPRODUCT(--IF(OFFSET(E$17,,,V19+1)=N19,1,0),OFFSET(F$17,,,V19+1),OFFSET(H$17,,,V19+1))-SUMIF(N$18:N18,N19,Z$18:Z18)), 1, 1)
and
=ARRAY_CONSTRAIN(ARRAYFORMULA((SUMIF(N$18:N19,N19,O$18:O19)-SUMPRODUCT(--IF(OFFSET(E$17,,,V19+1)=N19,1,0),OFFSET(F$17,,,V19+1)))*OFFSET(H$17,V19,,,)), 1, 1)
and finally
=IF(SUMIF(N$18:N19,N19,O$18:O19)>SUM(E$18:E19,N19,F$18:F19),MAX(SUMIF(E$18:E19,N19,G$18:G19)-SUMIF(N$18:N18,N19,P$18:P18),0),Y19)
https://docs.google.com/spreadsheets/d/1xJxCipSh-Q5ltSaGo-kpEPomrZdAI1T8PDH57rc-sOw/edit?usp=sharing
Update....
Formula in column H
=IF(F19=0,0,G19/F19)
Replaced With
=ARRAYFORMULA(IF(LEN(F19:F), IF(F19:F=0, 0, G19:G/F19:F), ))
Formula In Column P
=Z19
Replaced with
=ARRAYFORMULA(IF(LEN(O19:O), IF(O19:O=0, 0, Z19:Z), ))
Formula In column O
=P19/O19
Replaced with
=ARRAYFORMULA(IF(LEN(O19:O), IF(O19:O=0, 0, P19:P/O19:O), ))
But I still need help with these formulas:
=ARRAY_CONSTRAIN(ARRAYFORMULA(SUM(--IF(MMULT(--(ROW(F$17:F18)>=TRANSPOSE(ROW(F$17:F18))),--IF(E$17:E18=N19,F$17:F18,0))
=ARRAY_CONSTRAIN(ARRAYFORMULA(SUMPRODUCT(--IF(OFFSET(E$17,,,V19+1)=N19,1,0),OFFSET(F$17,,,V19+1),OFFSET(H$17,,,V19+1))-SUMIF(N$18:N18,N19,Z$18:Z18)), 1, 1)
=ARRAY_CONSTRAIN(ARRAYFORMULA((SUMIF(N$18:N19,N19,O$18:O19)-SUMPRODUCT(--IF(OFFSET(E$17,,,V19+1)=N19,1,0),OFFSET(F$17,,,V19+1)))*OFFSET(H$17,V19,,,)), 1, 1)
=IF(SUMIF(N$18:N19,N19,O$18:O19)>SUM(E$18:E19,N19,F$18:F19),MAX(SUMIF(E$18:E19,N19,G$18:G19)-SUMIF(N$18:N18,N19,P$18:P18),0),Y19)
Regards

The sheet FIFO has a very long "formula chain" in column H (and many other columns too) from row 19 to row 5395. By formula chain I mean a formula with relative references that was filled down/right, so that in A1 notation each formula differs from the next only in its relative references, while in R1C1 notation the formulas look identical.
To improve the performance of your spreadsheet you should reduce the number of formulas. If you don't need so many rows, try deleting the unnecessary rows. If that isn't enough, or you are looking for optimal performance, replace the formula chain with an array formula where this is possible, or use Google Apps Script.
NOTES
To make optimal use of your web browser / device / network resources:
Avoid open-ended references, or enclose them in the ARRAY_CONSTRAIN function so that only the required values are returned.
In Google Apps Script, keep calls to the Spreadsheet Service at execution time to a minimum; in particular, avoid calling the Spreadsheet Service inside loops, such as using a for loop to edit one cell at a time.
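To illustrate why a single pass is cheaper than per-row formulas that rescan every earlier row, here is a minimal Python sketch of FIFO costing. It is not a translation of the sheet's formulas: the transaction layout (`kind`, `product`, `qty`, `unit_cost`) and the name `fifo_cogs` are assumptions made for the example. The same loop shape could be used in a script that reads the whole range once and writes all results back once.

```python
from collections import defaultdict, deque


def fifo_cogs(transactions):
    """Return the FIFO cost of each outgoing transaction, in order.

    transactions: iterable of (kind, product, qty, unit_cost) tuples where
    kind is "in" or "out". This layout is hypothetical, not the sheet's.
    """
    layers = defaultdict(deque)  # product -> queue of [remaining_qty, unit_cost]
    costs = []
    for kind, product, qty, unit_cost in transactions:
        if kind == "in":
            layers[product].append([qty, unit_cost])
            continue
        remaining, cost = qty, 0.0
        while remaining > 0:
            if not layers[product]:
                raise ValueError(f"not enough stock for {product}")
            layer = layers[product][0]          # oldest purchase layer first
            used = min(remaining, layer[0])
            cost += used * layer[1]
            layer[0] -= used
            remaining -= used
            if layer[0] == 0:
                layers[product].popleft()       # layer fully consumed
        costs.append(cost)
    return costs
```

Each purchase layer is appended once and removed once, so the work grows roughly linearly with the number of rows instead of quadratically.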

Related

Create a Dynamic Array formula (Excel) to combine multiple results columns into one column that is filtered & sorted using multiple criteria?

The sample data in the image below is collected from a round robin tournament.
There is a Round column,Home team & Away team columns listing who is playing who. A team could be either Home or Away.
For each match in a round (including any "Bye" match) the number of games won for the Home and Away team are recorded in separate columns respectively.
"Ff" = forfeit and has a value of 0. "Bye" result is left blank (at this stage).
Output columns are "Won, Lost, Round".
Required output (shown in the image) is, for any selected team, the top n most-games-won matches (from both Home & Away) sorted in descending order and then the corresponding games lost but sorted in ascending order where the games won are equal. Finally show the rounds where those scores occurred.
These are the challenges I've faced in going from data to output in one step using dynamic array formula:
Collating/combining the Win results into 1 column. Likewise the Losses.
Getting the array to ignore blanks or convert "Ff" to 0 without getting #NUM or #VALUE errors.
Ensuring that if I used separate single column arrays the corresponding Loss and Round matched the Win result
Although "Round, Won, Lost" would be acceptable. But I wasn't able to get the Dynamic Array capability to give the required output with this order.
SUMPRODUCT, INDEX(MATCH), SORT(FILTER) functions all hint at a possible one step formula solution.
The solutions are numerous for sorting & filtering where the existing values are already in one column. There was one solution that dealt with 2 columns of values which was somewhat useful How to get the highest values from 2 columns in excel - Stackoverflow 2013
Many other responses are around the use of concatenation, combining/merging array sets, aggregation etc.
My work around solution is to use a Helper Sheet to combine the Wins from the separate results columns and convert blanks & "Ff" to -1. Likewise for Losses. Using the formula for each line
=IF($C5=L$2,IF($F5="",-1,IF($F5="Ff",0,$F5)),IF($D5=L$2,IF($G5="",-1,IF($G5="Ff",0,$G5)),-1))
Example Helper Sheet
To get the final output the Dynamic Array formula was used on the Helper Sheet data
=SORT(FILTER(L$26:N$40,L$26:L$40>=LARGE(L$26:L$40,$J$3),""),{1,2},{-1,1},FALSE)
I'm trying to avoid using pivottable, VBA solutions. Powerquery possible but not preferred.
Apologies for the screenshots but I couldn't work out how to attach the sample spreadsheet file. (Unfortunately Stackoverflow Help didn't help me to/not to do this.)
Based on the comments I changed my answer with a different approach:
=LET(data,A5:F19,
round,INDEX(data,,1),
ha,CHOOSECOLS(data,3,4),
HAwonR,CHOOSECOLS(data,5,6,1),
w,BYROW(ha,LAMBDA(h,IFERROR(XMATCH(L2,h),0))),
clm,CHOOSE(w,{1,2},{2,1}),
srtwon,DROP(REDUCE(0,SEQUENCE(ROWS(data)),LAMBDA(y,z,VSTACK(y,INDEX(HAwonR,z,HSTACK(INDEX(clm,z,),3))))),1),
res,FILTER(srtwon,w),
TAKE(SORT(res,{1,2},{-1,1}),J3))
Old answer:
=LET(data,A5:F19,
round,INDEX(data,,1),
home,INDEX(data,,3),
away,INDEX(data,,4),
HAwonR,CHOOSECOLS(data,5,6,1),
w,MAP(home,away,LAMBDA(h,a,OR(h=L2,a=L2))),
won,FILTER(HAwonR,w),
TAKE(SORT(won,{1,2},{-1,1}),J3))
In your example you selected round 3 for the third result, but that wasn't won, so I guess that was by mistake.
As you can see, making use of LET avoids helpers. LET allows you to create names (helpers) that are stored, and because you can name them, you can make complex formulas more readable.
Basically, it filters the columns Home, Away and Round (in that order) for rows where either Home or Away equals the team in cell L2. The result is sorted by column 1 descending and column 2 ascending. Then the number of rows given in cell J3 is displayed from that sorted array.
Here is my solution, based on the excellent contribution by #P.b. Thank you, much appreciated.
The wins (likewise losses) required mapping the presence, of the team in question, as hT (home team) to the games it won (hG) and adding to that a 2nd mapping of the games it won (aG) when it was the away team (aT). Essentially what was being done on the Helper Sheet. Result was a 1 column array for game wins and a 1 column array for game losses.
In the process I was able to convert the "Ff" text to 0. I attempted without the conversion and it threw an error.
Instead of CHOOSECOLS used HSTACK to create the new array (wins, losses & round) for the FILTER, SORT, TAKE to work on.
If it could be made conciser(?) that is the next challenge. Overall (not just my solution), this exercise has provided greater flexibility and solved the problems stated. I'm happy!
=LET(data,A5:G19,
round,INDEX(data,,1),
hT,INDEX(data,,3),
aT,INDEX(data,,4),
hG,INDEX(data,,6),
aG,INDEX(data,,7),
wins,MAP(hG,
MAP(hT,LAMBDA(h,h=L2)),
LAMBDA(w,t,IF(w="Ff",0,w)*IF(t=TRUE,1,0))) +
MAP(aG,
MAP(aT,LAMBDA(a,a=L2)),
LAMBDA(w,t,IF(w="Ff",0,w)*IF(t=TRUE,1,0))),
losses,MAP(aG,
MAP(hT,LAMBDA(h,h=L2)),
LAMBDA(w,t,IF(w="Ff",0,w)*IF(t=TRUE,1,0))) +
MAP(hG,
MAP(aT,LAMBDA(a,a=L2)),
LAMBDA(w,t,IF(w="Ff",0,w)*IF(t=TRUE,1,0))),
HAwonR,HSTACK(wins,losses,round),
w,MAP(hT,aT,LAMBDA(h,a,OR(h=L2,a=L2))),
won,FILTER(HAwonR,w),
TAKE(SORT(won,{1,2},{-1,1}),J3))
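For readers who want to check the logic outside Excel, the mapping, filter, sort and take steps can be sketched in Python. The tuple layout and the name `top_results` are assumptions for illustration. Skipping Bye rows (blank results) is also an assumption and is not something the formula above does explicitly.

```python
def top_results(matches, team, n):
    """Return up to n (won, lost, round) rows for `team`.

    matches: list of (round, home, away, home_games, away_games) tuples,
    a hypothetical layout mirroring the columns used in the formula.
    """
    def games(value):
        # "Ff" (forfeit) counts as 0 games
        return 0 if value == "Ff" else value

    rows = []
    for rnd, home, away, home_games, away_games in matches:
        if team not in (home, away):
            continue
        if home_games in ("", None) or away_games in ("", None):
            continue  # assumption: a Bye has no recorded result, so skip it
        if team == home:
            won, lost = home_games, away_games
        else:
            won, lost = away_games, home_games
        rows.append((games(won), games(lost), rnd))

    # games won descending, then games lost ascending
    rows.sort(key=lambda r: (-r[0], r[1]))
    return rows[:n]
```

Building the (won, lost, round) rows together keeps each loss and round attached to its win, which was the alignment problem with separate single-column arrays.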

Is there any option to do FOR loop in excel?

I have an excel that I'm calculating my Scrum Task's completed average. I have Story point item also in the excel. My calculation is:
Result= SP * percentage of completion --> This calculation is for each row and after that I sum up all result and taking the summary.
But sometimes I am adding new task and for each task I am adding the calculation to the average result.
Is there any way to use for loop in the excel?
for(int i=0;i<50;i++){ if(SP!=null && task!=null)(B+i)*(L+i)}
My calculation is like below:
AVERAGE((B4*L4+B5*L5+B6*L6+B7*L7+B8*L8+B9*L9+B10*L10)/SUM(B4:B10))
First of all, AVERAGE is not doing anything in your formula, since the argument you pass to it is just one single value. You already do an average calculation by dividing by the sum. That average is in fact a weighted average, and so you could not even achieve that with a plain AVERAGE function.
I see several ways to make this formula more generic, so it keeps working when you add rows:
1. Use SUMPRODUCT
=SUMPRODUCT(B4:B100,L4:L100)/SUM(B4:B100)
The row number 100 is chosen arbitrarily, but should evidently encompass all data rows. If you have no data occurring below your table, then it is safe to add a large margin. You'll want to avoid the situation where you think you add a line to the table, but actually get outside of the range of the formula. Using proper Excel tables can help to avoid this situation.
2. Use an array formula
This would be a second resort for when the formula becomes more complicated and cannot be executed with a "simple" SUMPRODUCT. But the above would translate to this array formula:
=SUM(B4:B100*L4:L100)/SUM(B4:B100)
Once you have typed this in the formula bar, make sure to press Ctrl+Shift+Enter to enter it. Only then will it act as an array formula.
Again, the same remark about row number 100.
3. Use an extra column
Things get easy when you use an extra column for storing the product of B & L values for each row. So you would put in cell N4 the following formula:
=B4*L4
...and then copy that relative formula to the other rows. You can hide that column if you want.
Then the overall formula can be:
=SUM(N4:N100)/SUM(B4:B100)
With this solution you must take care to always copy a row when inserting a new row, as you need the N column to have the intermediate product formula also for any new row.
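For comparison, the weighted average that all three variants compute can be sketched in Python. The function name and the choice to skip rows where either value is missing (mirroring the null check in the question's pseudo-loop) are assumptions for this example.

```python
def weighted_average(story_points, completion):
    """Sum of SP * completion divided by the sum of SP.

    Rows where either value is None are skipped (an assumption that mirrors
    the null check in the question's loop).
    """
    pairs = [(sp, pct) for sp, pct in zip(story_points, completion)
             if sp is not None and pct is not None]
    total = sum(sp for sp, _ in pairs)
    if total == 0:
        raise ValueError("no story points to average over")
    return sum(sp * pct for sp, pct in pairs) / total
```

As with SUMPRODUCT, no explicit loop counter is needed: adding a task just means adding one more pair of values.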

Advanced Excel Search and Sorting

I have an incredibly large spreadsheet that lists details for the computers in my company's inventory. We need to know how many systems we have that are x years old. I was able to sort it by model but because the model names are wildly different it didn't help much. For example, one model name is
13-inch MacBook Pro (2011)
And another is
13-inch Retina MacBook Pro (Mid 2017)
The only constant value in the parentheses is the year at the end. I'm trying to write a formula that will spit out how many of each system there are. We need to know how many are 2011 computers, how many are 2017, etc. We are fine with grouping up "Early, Mid, Late" since we just need a year separation but those terms don't show up in every cell throwing my math off. The rows don't have to be sorted, I just need a count.
My plan of attack would be to first, convert the spreadsheet into a table using Insert > Table... this enables Excel to manage calculating columns for you.
The following assumes that the cell at the top of your list contains the word "Detail".
Second, I would make a new column at the far right with an equation like this:
=MID([@Detail], FIND(")",[@Detail])-4, 4)
...and I would tune the "Find" function and the "mid" function until it gives me just the year.
Third, sort the entire table by this new column. Tada!
Transfer the data to column A. Cells A1 to A1000 in my Example.
Enter the years in column C. Cells C2 to C20 in my example.
In cell D2, enter the following Array Formula, and drag it down.
=SUM(IFERROR(IF(VALUE(LEFT(RIGHT($A$1:$A$1000,5),4))=C2,1,0),"-"))
Array Formulas are entered using Control + Shift + Enter, instead of Enter.
The Formula takes the last 5 characters of all entries in the column A. Then it takes the first 4 characters of this new text (to eliminate the closing bracket) and converts the text entries to numerical values. It matches each entry with the year in column C, and totals the matches.
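The same extract-then-count idea can be sketched in Python, which may help when checking the spreadsheet result. This assumes, as in the question, that the year is the last four digits before the closing parenthesis at the end of the model name; the name `count_by_year` is made up for the example.

```python
import re
from collections import Counter

# Four digits immediately before a closing parenthesis at the end of the text
YEAR_AT_END = re.compile(r"(\d{4})\)\s*$")


def count_by_year(models):
    """Count model names per year; names without a trailing year are ignored."""
    counts = Counter()
    for name in models:
        match = YEAR_AT_END.search(name)
        if match:
            counts[int(match.group(1))] += 1
    return counts
```

Because only the digits before the final parenthesis are read, "Early", "Mid" and "Late" are grouped under the same year.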
I hope this solves your problem.
Regards,
Vijaykumar Shetye,
Spreadsheet Excellence,
Panaji, Goa India

Is it possible to create a table made up of multiple what-if scenario results?

I'm going to describe my goal in steps because I think that might be the easiest way to explain it. This is what I'm trying to do:
1) Create a template that has various calculations on it. On this template, 1 specific cell is left blank. The calculations will change depending on what's in this cell (I'll refer to this as the special cell).
2) There's one final figure behind these calculations that's important. What I want to do is create a list with every possible final figure and in an adjacent cell, list the value of the special cell that gives this final figure.
The problem is Excel for Mac 2008 doesn't use macros or VBA. In my Windows version of Excel, this is just a simple function. But on Excel for Mac 2008, I'm not sure at all how to tackle this. The only solution I can think of is to create one sheet for every possible value of the special cell, with all the calculations done specifically for that value of the special cell. Then I could just link each final figure/special cell to a main page so all the information is together. However, there are roughly 400 values the special cell can take, and I really don't want to create 400 different sheets. Does anybody know how I can do this?
Also, just as a note in case this is easier to visualize what I mean, I'm basically trying to run multiple what-if scenarios and collect one specific number from each of these scenarios.
Here's an example of the processes involved. I should mention here that there are actually 2 different special cells; I wrote 1 in the original description because I'm assuming the idea would be the same for 2:
1) The main template sheet is located on Sheet A
2) There are 10 slots for store names
3) Each store has a rate, the rate is found by applying a vlookup which looks up the special cell 1 and where the array table is located on Sheet B
4) Each store also has an index number (referred to as index)
5) Each store has a calculation which is index * special cell 2 (referred to as calc1)
6) Each store has another calculation which is rate * num1 (referred to as calc2)
7) Each store has another index number (referred to as index2)
8) Some of the index2 values have to be multiplied by calc2, the rest will stay the same (referred to as calc3)
9) A summation has to be done, summing all the calc2 values to result in sum1
10) A summation has to be done, summing all the calc3 values to result in sum2
11) The final figure is sum1 + sum2
It sounds like you could create 400 rows where each row is a what if scenario. Then next to each row you could take an input and an output, and graph accordingly.
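The "one row per scenario" idea can be sketched in Python to show the shape of the resulting table. The model below is a hypothetical stand-in and not the asker's calculation: the rate table, the store indexes and the names `scenario_table` and `example_model` are all invented for illustration.

```python
def scenario_table(model, values_1, values_2):
    """Evaluate `model` for every pair of inputs; one row per scenario."""
    return [(v1, v2, model(v1, v2)) for v1 in values_1 for v2 in values_2]


def example_model(special_1, special_2):
    """Hypothetical stand-in for the template's calculations."""
    rates = {1: 0.10, 2: 0.20}  # plays the role of the lookup table on Sheet B
    indexes = [1, 2, 3]         # one index per store
    sum1 = sum(rates[special_1] * i for i in indexes)
    sum2 = sum(i * special_2 for i in indexes)
    return sum1 + sum2


rows = scenario_table(example_model, [1, 2], [10, 20])
```

In a spreadsheet, each tuple corresponds to one row whose formulas refer to that row's own input cells instead of a single shared special cell.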
Update
Per your description so far I've created the attached workbook with some formulas to put you in the right direction:
https://dl.dropbox.com/u/19599049/120813_2c.xlsx
It calculates the sum1 and sum2 For 10 stores based on the 2 inputs.
Note that I colored which cells were ending up in which final output.
yellow = original sum1/sum2
blue = array formula version of sum1/sum2
green = data used in both.
I did this to point out that, while this example workbook seems to follow all 11 of your rules, input 2 doesn't appear to be included in the final outputs of my mock-up version for some reason.
Either way this should serve as a good basis to get you started. And I can modify it if you continue to include more details.

Algorithm to calculate a page importance based on its views / comments

I need an algorithm that allows me to determine an appropriate <priority> field for my website's sitemap based on the page's views and comments count.
For those of you unfamiliar with sitemaps, the priority field is used to signal the importance of a page relative to the others on the same website. It must be a decimal number between 0 and 1.
The algorithm will accept two parameters, viewCount and commentCount, and will return the priority value. For example:
GetPriority(100000, 100000); // Damn, a lot of views/comments! The returned value will be very close to 1, for example 0.995
GetPriority(3, 2); // Ok not many users are interested in this page, so for example it will return 0.082
You mentioned doing this in an SQL query, so I'll give samples in that.
If you have a table/view Pages, something like this
Pages
-----
page_id:int
views:int - indexed
comments:int - indexed
Then you can order them by writing
SELECT * FROM Pages
ORDER BY
(0.3*LOG10(10+views)/LOG10(10+(SELECT MAX(views) FROM Pages))) +
(0.7*LOG10(10+comments)/LOG10(10+(SELECT MAX(comments) FROM Pages)))
I've deliberately chosen unequal weighting between views and comments. A problem that can arise with keeping an equal weighting with views/comments is that the ranking becomes a self-fulfilling prophecy - a page is returned at the top of the list, so it's visited more often, and thus gets more points, so it's shown at the top of the list, and it's visited more often, and it gets more points.... Putting more weight on the comments reflects that these take real effort and show real interest.
The above formula will give you ranking based on all-time statistics. So an article that amassed the same number of views/comments in the last week as another article amassed in the last year will be given the same priority. It may make sense to repeat the formula, each time specifying a range of dates, and favoring pages with higher activity, e.g.
0.3*(score for views/comments today) - live data
0.3*(score for views/comments in the last week)
0.25*(score for views/comments in the last month)
0.15*(score for all views/comments, all time)
This will ensure that "hot" pages are given higher priority than similarly scored pages that haven't seen much action lately. All values apart from today's scores can be persisted in tables by scheduled stored procedures so that the database isn't having to aggregate many many comments/view stats. Only today's stats are computed "live". Taking it one step further, the ranking formula itself can be computed and stored for historical data by a stored procedure run daily.
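The blending of time windows described above can be sketched in Python. The weights are the ones listed in this answer; the dictionary keys and the name `blended_score` are assumptions, and each window's score is taken as already computed (for example by the persisted stored-procedure results).

```python
# Weights from the answer: today, last week, last month, all time
WINDOW_WEIGHTS = {"today": 0.30, "week": 0.30, "month": 0.25, "all_time": 0.15}


def blended_score(window_scores):
    """Combine per-window scores into one value.

    window_scores maps each window name to its already-computed score.
    """
    return sum(weight * window_scores[name]
               for name, weight in WINDOW_WEIGHTS.items())
```

Since the weights add up to 1, a page that scores 1.0 in every window still gets 1.0 overall.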
EDIT: To get a strict range from 0.1 to 1.0, you would modify the formula like this. But I stress - this will only add overhead and is unnecessary - the absolute values of priority are not important - only their relative values to other URLs. The search engine uses these to answer the question, is URL A more important/relevant than URL B? It does this by comparing their priorities - which one is greatest - not their absolute values.
// unnormalized - x is some page id
un(x) = 0.3*log(views(x)+10)/log(10+maxViews()) +
0.7*log(comments(x)+10)/log(10+maxComments())
// the original formula (now in pseudo code)
The maximum will be 1.0; the minimum starts at 1.0 (when no page has any views or comments yet) and moves downwards as more views/comments are made.
We define un(0) as the minimum value, i.e. the value of the above formula when views(x) and comments(x) are both 0.
To get a normalized formula from 0.1 to 1.0, you then compute n(x), the normalized priority for page x:
n(x) = un(x) - (1.0-un(x)) * (un(0)-0.1) / (1.0-un(0))    when un(0) != 1.0
n(x) = 0.1                                                 otherwise
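As a check on the normalization, here is a small Python sketch of un(x) and n(x) as defined above. Passing the maxima in as arguments, the function names and the tolerance-based test for un(0) = 1.0 are choices made for this example.

```python
import math


def unnormalized(views, comments, max_views, max_comments):
    """un(x): 0.3 weight on views, 0.7 on comments, both log-scaled."""
    return (0.3 * math.log(views + 10) / math.log(max_views + 10)
            + 0.7 * math.log(comments + 10) / math.log(max_comments + 10))


def normalized(views, comments, max_views, max_comments, floor=0.1):
    """n(x): rescale un(x) so that un(0) maps to `floor` and 1.0 stays 1.0."""
    un_x = unnormalized(views, comments, max_views, max_comments)
    un_0 = unnormalized(0, 0, max_views, max_comments)
    if math.isclose(un_0, 1.0):  # no page has any views or comments
        return floor
    return un_x - (1.0 - un_x) * (un_0 - floor) / (1.0 - un_0)
```

A page with no views or comments maps to 0.1 and the page holding both maxima maps to 1.0, which matches the intended range.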
Priority = W1 * views / maxViewsOfAllArticles + W2 * comments / maxCommentsOfAllArticles
with W1+W2=1
Although IMHO, just use 0.5*log_10(10+views)/log_10(10+maxViews) + 0.5*log_10(10+comments)/log_10(10+maxComments)
What you're looking for here is not an algorithm, but a formula.
Unfortunately, you haven't really specified the details of what you want, so there's no way we can provide the formula to you.
Instead, let's try to walk through the problem together.
You've got two incoming parameters, the viewCount and the commentCount. You want to return a single number, Priority. So far, so good.
You say that Priority should range between 0 and 1, but this isn't really important. If we were to come up with a formula we liked, but resulted in values between 0 and N, we could just divide the results by N-- so this constraint isn't really relevant.
Now, the first thing we need to decide is the relative weight of Comments vs Views.
If page A has 100 comments and 10 views, and page B has 10 comments and 100 views, which should have a higher priority? Or, should it be the same priority? You need to decide what's right for your definition of Priority.
If you decide, for example, that comments are 5 times more valuable than views, then we can begin with a formula like
Priority = 5 * Comments + Views
Obviously, this can be generalized to
Priority = A * Comments + B * Views
Where A and B are relative weights.
But, sometimes we want our weights to be exponential instead of linear, like
Priority = Comment ^ A + Views ^ B
which will give a very different curve than the earlier formula.
Similarly,
Priority = Comment ^ A * Views ^ B
will give higher value to a page with 20 comments and 20 views than one with 1 comment and 40 views, if the weights are equal.
So, to summarize:
You really ought to make a spreadsheet with sample values for Views and Comments, and then play around with various formulas until you get one that has the distribution that you are hoping for.
We can't do it for you, because we don't know how you want to value things.
I know it has been a while since this was asked, but I encountered a similar problem and had a different solution.
When you want to have a way to rank something, and there are multiple factors that you're using to perform that ranking, you're doing something called multi-criteria decision analysis. (MCDA). See: http://en.wikipedia.org/wiki/Multi-criteria_decision_analysis
There are several ways to handle this. In your case, your criteria have different "units". One is in units of comments, the other is in units of views. Furthermore, you may want to give different weight to these criteria based on whatever business rules you come up with.
In that case, the best solution is something called a weighted product model. See: http://en.wikipedia.org/wiki/Weighted_product_model
The gist is that you take each of your criteria and turn it into a percentage (as was previously suggested), then you take that percentage and raise it to the power of X, where X is a number between 0 and 1. This number represents your weight. Your total weights should add up to one.
Lastly, you multiply the results together to come up with a rank. If the rank is greater than 1, then the numerator page has a higher rank than the denominator page.
Each page would be compared against every other page by doing something like:
p1C = page 1 comments
p1V = page 1 views
p2C = page 2 comments
p2V = page 2 views
wC = comment weight
wV = view weight
rank = (p1C/p2C)^(wC) * (p1V/p2V)^(wV)
The end result is a sorted list of pages according to their rank.
I've implemented this in C# by performing a sort on a collection of objects implementing IComparable.
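A comparable sketch in Python (not the C# implementation mentioned above) uses a comparison function for the sort. The weights 0.7 and 0.3 are placeholder values, and the sketch assumes every page has at least one comment and one view, since the ratio divides by the other page's counts.

```python
from functools import cmp_to_key


def wpm_ratio(p1, p2, w_comments=0.7, w_views=0.3):
    """Weighted product ratio of page p1 over page p2.

    Pages are (comments, views) tuples with counts > 0 (assumption).
    """
    return (p1[0] / p2[0]) ** w_comments * (p1[1] / p2[1]) ** w_views


def compare(p1, p2):
    """Sort pages with a ratio above 1 ahead of the page they are compared to."""
    ratio = wpm_ratio(p1, p2)
    if ratio > 1:
        return -1
    if ratio < 1:
        return 1
    return 0


pages = [(10, 100), (100, 10), (30, 30)]
ranked = sorted(pages, key=cmp_to_key(compare))
```

With comments weighted more heavily, the page with 100 comments and 10 views ends up first in `ranked`.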
What several posters have essentially advocated without conceptual clarification is that you use linear regression to determine a weighting function of webpage view and comment counts to establish priority.
This technique is pretty easy to implement for your problem, and the basic concept is described well in this Wikipedia article on linear regression models.
A quick summary of how to apply it to your problem is:
Determine the parameters of the line which best fits the view and comment count data for all your site's webpages, i.e., use linear regression.
Use the line parameters to derive your priority function for the view/count parameters.
Code examples for basic linear regression should not be hard to track down if you don't want to implement it from scratch from basic math formulas (use the web, Numerical Recipes, etc.). Also, any general math software package like Matlab, R, etc., comes with linear regression functions.
The most naive approach would be the following:
Let v[i] be the views of page i and c[i] the number of comments for page i; then define the relative view weight for page i to be
r_v(i) = v[i]/(sum_j v[j])
where sum_j v[j] is the total of the v[.] over all pages. Similarly define the relative comment weight for page i to be
r_c(i) = c[i]/(sum_j c[j]).
Now you want some constant parameter p: 0 <= p <= 1 which indicates the importance of views over comments: p = 0 means only comments are significant, p = 1 means only views are significant, and p = 0.5 gives equal weight.
Then set the priority to be
p*r_v(i) + (1-p)*r_c(i)
This might be over-simplistic but it's probably the best starting point.
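A minimal Python sketch of this naive approach follows. The function name and the guard against an all-zero total are assumptions added for the example.

```python
def priorities(views, comments, p=0.5):
    """Return p*r_v(i) + (1-p)*r_c(i) for every page i.

    views and comments are equal-length lists of counts per page.
    """
    total_views = sum(views) or 1        # guard: avoid dividing by zero
    total_comments = sum(comments) or 1
    return [p * v / total_views + (1 - p) * c / total_comments
            for v, c in zip(views, comments)]
```

Note that these values sum to 1 across all pages, so on a site with many pages each individual priority is small; only their relative order is meaningful.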

Resources