Get Capped Maximum Value From List - powerquery

I have a list of values that range anywhere from 500 to 1000. I have a second list of values that denote relevant breakpoints in the 500-1000 range (500, 520, 540, 600, etc.). For a given number from the first list, I need to return the highest value in the second list that is less than that number. I noticed the "N" functions let you set a condition on them, so for example if I do:
List.Max(List.FirstN(SomeTable[Breakpoints], each _ < 530))
It correctly returns 520 to me. However if I put this inside an AddColumn function and change the 530 to a local field reference:
Table.AddColumn(MyTable, "MinValue", each List.Max(List.FirstN(SomeTable[Breakpoints], each _ < [SomeNumbers])))
Then I get a "We cannot apply field access to the type Number" error. Is what I'm trying to do possible and I'm just formatting it wrong? I always get confused with scope and references in PQ, so it may just be that.

After each, [SomeNumbers] by itself is short for _[SomeNumbers] (which is what you see when filtering a column). In the List.FirstN call, _ refers to a number in the list instead of a row in a table: the value of _ is tied to the closest enclosing each, where closeness is measured by the number of layers of nesting between _ and the each. Therefore, in your code [SomeNumbers] tries to look up the field SomeNumbers on a number, which has no fields.
There are a couple of ways to fix this:
You can use a let...in expression to store the current value of the SomeNumbers column for later use, like so:
each
    let
        currentNumber = [SomeNumbers],
        result = List.Max(List.FirstN(SomeTable[Breakpoints], each _ < currentNumber))
    in
        result
You can explicitly define a function with the (x) => ... syntax instead of using each twice, like so:
each List.Max(List.FirstN(SomeTable[Breakpoints], (point) => point < [SomeNumbers]))
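For readers who want to check the lookup logic outside Power Query, here is a minimal Python sketch of the same idea. It is an illustration, not M code: the names capped_max, capped_max_bisect and some_numbers are hypothetical, and the breakpoint list is assumed to be sorted ascending, which the List.FirstN approach also relies on because it stops at the first element that fails the condition.

```python
from bisect import bisect_left
from itertools import takewhile


def capped_max(breakpoints, value):
    """Largest breakpoint strictly below value, or None if there is none.

    Mirrors List.Max(List.FirstN(list, each _ < value)): takewhile stops at
    the first element that fails the test, so the list must be ascending.
    """
    # The lambda names its parameter explicitly, so `value` still refers to
    # the outer argument (the analogue of [SomeNumbers] on the current row).
    return max(takewhile(lambda point: point < value, breakpoints), default=None)


def capped_max_bisect(sorted_breakpoints, value):
    """Same result via binary search on an ascending list."""
    i = bisect_left(sorted_breakpoints, value)
    return sorted_breakpoints[i - 1] if i > 0 else None


breakpoints = [500, 520, 540, 600]    # sample breakpoints from the question
some_numbers = [530, 600, 999, 500]   # hypothetical column values
min_values = [capped_max(breakpoints, n) for n in some_numbers]
print(min_values)
```

The last input shows the edge case where no breakpoint lies below the value; the sketch returns None there, much as List.Max returns null for an empty list.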

Understanding the behaviour of Function.ScalarVector

I have a use case where I would like to pass two arguments to a function generated by Function.ScalarVector (see https://learn.microsoft.com/en-us/powerquery-m/function-scalarvector).
I would like the second argument to, optionally, capture two or more columns of a table, e.g. ScalarFun([Col1], {[Col2], [Col3]}). I would expect this to pass a list of lists to the function itself. Alas, it does not.
Consider this example. If I define a function that simply captures the generated input list (which we'd normally pass on to a function doing something useful with it), we can see that passing a list of two values and passing a concatenation of two values produce very different behaviour:
let
    ScalarFun =
        Function.ScalarVector(
            type function (col as any) as any,
            (t) =>
                let
                    buf = Table.Buffer(t)
                in
                    List.Transform(buf[col], each List.Distinct(buf[col]))
        ),
    TestTable = Table.FromColumns(
        {{"a","b","c"}, {"x","y","z"}},
        {"Col2", "Col3"}
    ),
    #"List syntax" = Table.AddColumn(
        TestTable,
        "List1",
        each Text.Combine(List.First(ScalarFun({[Col2],[Col3]})), ";")
    ),
    #"Concatenation syntax" = Table.AddColumn(
        #"List syntax",
        "List2",
        each Text.Combine(ScalarFun([Col2] & [Col3]), ";")
    )
in
    #"Concatenation syntax"
Given this output, I can see that the syntax where we combine two values into a list, ScalarFun([Col1], {[Col2], [Col3]}), actually operates row by row. Concatenating two text values, however, works just fine (the function gets passed a list containing the entire column).
A list is just a value like any other, so I don't understand why my preferred syntax doesn't work.
Update:
It's clear to me that the function does not behave in the way I anticipated due to lazy evaluation. I see the same issue if I try to use a record. Passing a list in this rather convoluted way 'works' however:
Text.Split(Text.Combine({[Col2],[Col3]}, ";"),";")
I'm not going to post this as an answer because I don't fully understand why this (silently, without an error) breaks the intended behaviour of Function.ScalarVector.
A list of lists is not the same as concatenated lists so I don't see any reason to expect them to behave the same way.
{TestTable[Col2], TestTable[Col3]} = {{a,b,c},{x,y,z}}
TestTable[Col2] & TestTable[Col3] = {a,b,c,x,y,z}
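The difference between the two shapes can be illustrated with a small Python sketch. This is an analogy only, using Python lists in place of M lists; the variable names are hypothetical.

```python
col2 = ["a", "b", "c"]
col3 = ["x", "y", "z"]

# Analogue of {TestTable[Col2], TestTable[Col3]}: a list holding two lists
list_of_lists = [col2, col3]

# Analogue of TestTable[Col2] & TestTable[Col3]: one flat list
concatenated = col2 + col3

print(len(list_of_lists))  # two elements, each itself a list
print(len(concatenated))   # six elements, each a text value
```

Anything that iterates over the first value sees two items, while the same code sees six items in the second, so the two cannot be expected to behave alike.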

How to call Lua table value explicitly when using integer counter (i,j,k) in a for loop to make the table name/address?

I have to be honest that I don't quite understand Lua that well yet. I am trying to overwrite a local numeric value assigned to a set table address (is this the right term?).
The addresses are of the type:
project.models.stor1.inputs.T_in.default, project.models.stor2.inputs.T_in.default and so on with the stor number increasing.
I would like to do this in a for loop but cannot find the right expression to make the entire string be accepted by Lua as a table address (again, I hope this is the right term).
So far, I tried the following to concatenate the strings but without success in calling and then overwriting the value:
for k = 1,10,1 do
    project.models.["stor"..k].inputs.T_in.default = 25
end
for k = 1,10,1 do
    "project.models.stor"..j..".T_in.default" = 25
end
EDIT:
I think I found the solution as per https://www.lua.org/pil/2.5.html:
A common mistake for beginners is to confuse a.x with a[x]. The first form represents a["x"], that is, a table indexed by the string "x". The second form is a table indexed by the value of the variable x. See the difference:
for k = 1,10,1 do
    project["models"]["stor"..k]["inputs"]["T_in"]["default"] = 25
end
You were close.
Lua supports this representation by providing a.name as syntactic sugar for a["name"].
Read more: https://www.lua.org/pil/2.5.html
You can use only one syntax at a time:
either tbl.key or tbl["key"].
The limitation of . is that it only accepts constant string keys that are also valid identifiers.
In square brackets [] you can evaluate runtime expressions.
Correct way to do it:
project.models["stor"..k].inputs.T_in.default = 25
The . in models.["stor"..k] is unnecessary and causes an error. The correct syntax is just models["stor"..k].
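The same pattern, a key computed at run time and placed in square brackets, can be sketched in Python with nested dictionaries. This is an analogy rather than Lua: Python dicts have no dot shorthand, and the structure below is a hypothetical stand-in for the project table.

```python
# Hypothetical stand-in for the Lua `project` table, built as nested dicts
project = {
    "models": {
        f"stor{k}": {"inputs": {"T_in": {"default": 0}}}
        for k in range(1, 11)
    }
}

for k in range(1, 11):
    # The key "stor1", "stor2", ... is built at run time,
    # so it has to go inside square brackets
    project["models"][f"stor{k}"]["inputs"]["T_in"]["default"] = 25

print(project["models"]["stor3"]["inputs"]["T_in"]["default"])
```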

find endpoints for range given a value within the range

I am trying to solve a simple problem, but at the moment I cannot think of a better solution. I am testing an API that is not documented.
There is an ID used to fetch objects and it has a min and max value with random values missing in-between. I'm trying to test the responses I receive for random objects, but to find objects, I need to have valid IDs.
It would be very inefficient to test random numbers and hope that I get an object back. The best I can do is find a range, get a random number between that range and check if it exists before conducting tests.
A sample list of all of the IDs in the database might look like this:
[1005, 25984, 25986, 29587, 30000, ...]
Assume the difference between one ID and the next never exceeds a pre-defined constant C. Given only one value in the range, how would you calculate the min/max of the range?
Starting from a given value and looping until the last value is found is horrible but that is how it was implemented by previous devs. Below is pseudocode that more or less covers what they do.
// This can be any valid object ID from the database.
// Assuming the IDs in the database are [1005, 25984, 25986, 29587, 30000],
// "i" could be any one of these values.
var i = givenPredefinedObjectId;
var deviation = 100;
var endPoint = i; // last ID known to exist
var misses = 0;   // consecutive missing IDs since endPoint
// objectWithIdExists() looks up an object with the ID "i" in the database.
// If there is no object with the ID "i", it returns false;
// otherwise the object gets tested and it returns true.
while (misses < deviation) {
    if (objectWithIdExists(i)) {
        endPoint = i;
        misses = 0;
    } else {
        misses++;
    }
    i++;
}
Assuming there is no knowledge about the possible values except you can check if they exist and you are given one valid value (there is no array with all possible IDs, that was just an example), how would you find the min/max values?
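Unbounded binary search is feasible, with a factor of C slowdown. Take any unbounded binary search algorithm that, given access to an oracle less_equal(k) answering whether k <= n for some unknown natural number n, finds n in time O(log n). Implement the oracle on input k by querying all of the IDs C*k, C*k+1, ..., C*k+C-1 and reporting that k is less than or equal to n if and only if at least one of them exists. Because consecutive IDs are never more than C apart, every such block between the starting ID and the maximum contains an ID, so the oracle's answers are monotone. The running time is O(C*log((max-min)/C)). The minimum is found the same way, searching downward.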
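A minimal Python sketch of this approach for the upper endpoint follows. It assumes non-negative integer IDs and that consecutive IDs differ by at most c; find_max_id and exists are hypothetical names, with exists standing in for objectWithIdExists. The search first doubles its step over blocks of c IDs until it hits an empty block, then bisects between the last non-empty and first empty block.

```python
def find_max_id(exists, start, c):
    """Largest existing ID, given one valid ID `start` and a maximum gap `c`."""

    def block_has_id(k):
        # Oracle: does any ID in block k, i.e. [c*k, c*k + c - 1], exist?
        return any(exists(i) for i in range(c * k, c * k + c))

    lo = start // c  # block containing `start`, known to be non-empty
    step = 1
    # Gallop: double the step until an empty block is found
    while block_has_id(lo + step):
        lo += step
        step *= 2
    hi = lo + step  # known empty block

    # Bisect: block lo is non-empty, block hi is empty
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if block_has_id(mid):
            lo = mid
        else:
            hi = mid

    # The maximum lies in block lo; scan it from the top
    for i in range(c * lo + c - 1, c * lo - 1, -1):
        if exists(i):
            return i


# Hypothetical sample data in which every gap is at most 100
ids = {1005, 1050, 1100, 1199, 1290}
print(find_max_id(lambda i: i in ids, 1050, 100))
```

Each oracle call costs up to c existence checks, which is where the factor of C in the running time comes from.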

how can I get the location for the maximum value in fortran?

I have a 250*2001 matrix. I want to find the location of the maximum value of a(:,i), where i takes 5 different values in steps of 256 (i = i + 256):
a(:,256)
a(:,512)
a(:,768)
a(:,1024)
a(:,1280)
I tried using MAXLOC, but since I'm new to Fortran, I couldn't get it right.
Try this
maxloc(a(:,256:1280:256))
but be warned, this call will return a value in the range 1..5 for the second dimension. The call returns the location of the maximum within the 250*5 array section that you pass to it, so to get the column index in the original array you'll have to multiply the second element by 256. And note that since the argument in the call to maxloc is a rank-2 array section, the call will return a 2-element vector.
Your question is a little unclear: it could be either of two things you want.
One value for the maximum over the entire 250-by-5 subarray;
One value for the maximum in each of the 5 250-by-1 subarrays.
Your comments suggest you want the latter, and there is already an answer for the former.
So, in case it is the latter:
b(1:5) = MAXLOC(a(:,256:1280:256), DIM=1)
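The index bookkeeping in both answers can be checked with a small Python sketch. It is not Fortran: it uses plain lists filled with arbitrary random data, reports 1-based indices to match Fortran conventions, and maxloc_column is a hypothetical helper.

```python
import random

ROWS, COLS = 250, 2001
random.seed(0)  # arbitrary sample data, only for illustration
a = [[random.random() for _ in range(COLS)] for _ in range(ROWS)]

# 1-based column numbers 256, 512, ..., 1280, as in a(:,256:1280:256)
columns = list(range(256, 1281, 256))


def maxloc_column(matrix, col):
    """1-based row index of the first maximum in the 1-based column `col`."""
    values = [row[col - 1] for row in matrix]
    return values.index(max(values)) + 1


# Analogue of MAXLOC(a(:,256:1280:256), DIM=1): one row index per column
b = [maxloc_column(a, c) for c in columns]

# Analogue of MAXLOC without DIM: (row, s) within the 250-by-5 section,
# where s runs from 1 to 5 and maps back to original column 256 * s
s = max(range(1, 6), key=lambda j: a[b[j - 1] - 1][256 * j - 1])
row, original_col = b[s - 1], 256 * s
print(b, (row, s), original_col)
```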

Sorting a list of objects by property in Matlab, and then deleting the smallest one

I'm trying to use Matlab to implement the MDO algorithm, which requires me to sort an array of objects of a custom-defined mdoVertex class by their degree, and then delete the one with the smallest degree value. My first attempt was this:
for i = 1:m
    if graph(i).degree < minDegree
        minDegree = graph(i).degree;
        elimObject = graph(i);
    end
end
Matlab is complaining that elimObject, or the object to be eliminated after the loop executes, is an undefined function or variable. How, then, can I keep track of not only the current smallest degree the loop has encountered, but also which object it corresponded to? 'graph' is the name of the array holding all of my vertex objects.
I suspect that you're somehow trying to call clear on the object returned from your function. Or is it just a few lines of code in a script? I'm guessing here. In any event, calling clear won't work. As you've noticed, clear expects to be given a variable name.
But in this case, you're not trying to delete a variable, you're trying to remove an element from an array. For that, you do arrayname(indextodelete) = [];
So I think that you want...
minDegree = inf; % See what I did there? I defined the variable, and I did it in such a way that I KNOW that the first vertex will satisfy the condition.
for i = 1:length(graph) % Properly loop over the entire graph
    if graph(i).degree < minDegree % The first vertex will definitely satisfy this. Maybe another one (or more) will later!
        minDegree = graph(i).degree;
        minDegreeIndex = i; % Don't record the value, just remember WHERE it is in the array.
    end
end
graph(minDegreeIndex) = []; % Now, remove the element that you identified from the array!
(By the way, you never showed us how you tried to eliminate elimObject. I assume that you called clear (the object that you identified)? You shouldn't make us guess; show us.)
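The same idea, tracking the position of the minimum rather than the object and then deleting by position, can be sketched in Python. This is an illustration only; Vertex is a hypothetical stand-in for the mdoVertex class and the sample degrees are made up.

```python
from dataclasses import dataclass


@dataclass
class Vertex:  # hypothetical stand-in for the mdoVertex class
    name: str
    degree: int


graph = [Vertex("a", 3), Vertex("b", 1), Vertex("c", 2)]

# Remember WHERE the smallest degree is, not the object itself
min_index = min(range(len(graph)), key=lambda i: graph[i].degree)

# Remove that position from the list and keep the removed vertex
removed = graph.pop(min_index)
print(removed, [v.name for v in graph])
```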
