Is there a hold value until condition function in numpy? - algorithm

For example something like:
pre_hold_list = [-2,0,0,-1,0,0,0,3,0,0]
hold_condition = lambda x:x != 0
output = np.hold(pre_hold_list, hold_condition)
[-2,-2,-2,-1,-1,-1,-1,3,3,3] #result of output
Here the condition is that the current value is nonzero. The function holds each value that meets the condition until the next value that meets it (i.e. it holds -2, then -1, then 3).
Searching for np.hold() or np.step() does not give me anything on google.

Nevermind I coded a function that does this using the cumulative nature of cumsum and diff. If there's a way to improve this please let me know.
def holdtil(x, condition):
    condition_index = np.where(condition)[0]
    condition_value = np.take(x, condition_index)
    condition_value_diff = np.diff(condition_value)
    holdtil_diff = np.zeros(len(x))
    holdtil_diff[condition_index[0]] = condition_value[0]
    holdtil_diff[condition_index[1:]] = condition_value_diff
    return np.cumsum(holdtil_diff)
EDIT:
I did a performance check between my solution and @Willem Van Onsem's, and mine has a very slight edge in time.
import time

def hold_func():
    start = time.time()
    for i in range(1000):
        x = np.random.randint(-5, 5, 1000)
        hold(x, x != 0)
    print(time.time() - start)

def holdtil_func():
    start = time.time()
    for i in range(1000):
        x = np.random.randint(-5, 5, 1000)
        holdtil(x, x != 0)
    print(time.time() - start)

hold_func()
holdtil_func()
#0.055173397064208984
#0.045740604400634766

You could use a trick here, combining cumsum() and diff():
import numpy as np

def hold(iterable, condition):
    cond = np.array(condition)
    vals = np.array(iterable)
    a = vals * cond
    a[cond] = np.diff(np.hstack(((0,), a[cond])))
    return a.cumsum()
The first parameter is an iterable that contains the elements, the second parameter condition is an iterable of the same length with booleans.
For example:
>>> a
array([-2, 0, 0, -1, 0, 0, 0, 3, 0, 0])
>>> hold(a, a != 0)
array([-2, -2, -2, -1, -1, -1, -1, 3, 3, 3])
The function works as follows. First we make a copy of the two iterables (converting them to numpy arrays if they are not already). You can omit that step if the inputs are numpy arrays.
Next we perform an elementwise multiplication, which zeroes the values where the condition does not hold.
Next, for each item where the condition holds, we calculate the difference between it and the previous such item (using 0 before the first one), and we write these differences back into a. Finally we take the cumulative sum of a: because of the diff() step, this results in the correct repetitions.
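To make the steps above concrete, here is a small trace of the intermediate arrays for the example input. It is only an illustrative sketch; the names held and steps are made up for this walkthrough and do not appear in the answer's function.

```python
import numpy as np

vals = np.array([-2, 0, 0, -1, 0, 0, 0, 3, 0, 0])
cond = vals != 0

# Step 1: zero out every value that does not meet the condition.
a = vals * cond

# Step 2: take the values that meet the condition, prepend 0, and diff them,
# so the first "step" is the first held value itself.
held = np.hstack(((0,), a[cond]))   # 0, -2, -1, 3
steps = np.diff(held)               # -2, 1, 4

# Step 3: write each step back at the position where the held value changes.
a[cond] = steps                     # -2, 0, 0, 1, 0, 0, 0, 4, 0, 0

# Step 4: the running sum of the steps reproduces the held values.
result = a.cumsum()
print(result)
```

Each nonzero entry of a after step 3 is the jump from one held value to the next, which is why the cumulative sum stays constant between jumps.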

Related

Algorithm to find some rows from a matrix, whose sum is equal to a given row

For example, here is a matrix:
[1, 0, 0, 0],
[1, 1, 0, 0],
[1, 0, 1, 0],
[1, 1, 1, 0],
[1, 1, 1, 1],
I want to find some rows, whose sum is equal to [4, 3, 2, 1].
The expected answer is rows: {0,1,3,4}.
Because:
[1, 0, 0, 0] + [1, 1, 0, 0] + [1, 1, 1, 0] + [1, 1, 1, 1] = [4, 3, 2, 1]
Are there any well-known or related algorithms for solving this problem?
Thanks to @sascha and @N. Wouda for the comments.
To clarify, here are some more details.
In my problem, the matrix will have about 50 rows and 25 columns, but each row will have fewer than 4 nonzero elements (the rest are zero). Every solution has 8 rows.
If I try all combinations, C(50, 8) is about 0.54 billion attempts. That is too expensive, so I want to find a more efficient algorithm.
If you want to make the jump to using a solver, I'd recommend it. This is a pretty straightforward Integer Program. The solutions below use Python, Python's pyomo math programming package to formulate the problem, and COIN-OR's cbc solver for Integer Programs and Mixed Integer Programs, which is free but needs to be installed separately: https://www.coin-or.org/downloading/
Here is an example with your data, followed by an example with 100,000 rows. The first example solves instantly; the 100,000-row example takes about 2 seconds on my machine.
# row selection Integer Program
import pyomo.environ as pyo
data1 = [ [1, 0, 0, 0],
[1, 1, 0, 0],
[1, 0, 1, 0],
[1, 1, 1, 0],
[1, 1, 1, 1],]
data_dict = {(i, j): data1[i][j] for i in range(len(data1)) for j in range(len(data1[0]))}
model = pyo.ConcreteModel()
# sets
model.I = pyo.Set(initialize=range(len(data1))) # a simple row index
model.J = pyo.Set(initialize=range(len(data1[0]))) # a simple column index
# parameters
model.matrix = pyo.Param(model.I , model.J, initialize=data_dict) # hold the sparse matrix of values
magic_sum = [4, 3, 2, 1 ]
# variables
model.row_select = pyo.Var(model.I, domain=pyo.Boolean) # row selection variable
# constraints
# ensure the columnar sum is at least the magic sum for all j
def min_sum(model, j):
    return sum(model.row_select[i] * model.matrix[(i, j)] for i in model.I) >= magic_sum[j]
model.c1 = pyo.Constraint(model.J, rule=min_sum)
# objective function
# minimize the overage
def objective(model):
    delta = 0
    for j in model.J:
        delta += sum(model.row_select[i] * model.matrix[i, j] for i in model.I) - magic_sum[j]
    return delta
model.OBJ = pyo.Objective(rule=objective)
model.pprint() # verify everything
solver = pyo.SolverFactory('cbc') # need to have cbc solver installed
result = solver.solve(model)
result.write() # solver details
model.row_select.display() # output
Output:
# ----------------------------------------------------------
# Solver Information
# ----------------------------------------------------------
Solver:
- Status: ok
User time: -1.0
System time: 0.0
Wallclock time: 0.0
Termination condition: optimal
Termination message: Model was solved to optimality (subject to tolerances), and an optimal solution is available.
Statistics:
Branch and bound:
Number of bounded subproblems: 0
Number of created subproblems: 0
Black box:
Number of iterations: 0
Error rc: 0
Time: 0.01792597770690918
# ----------------------------------------------------------
# Solution Information
# ----------------------------------------------------------
Solution:
- number of solutions: 0
number of solutions displayed: 0
row_select : Size=5, Index=I
Key : Lower : Value : Upper : Fixed : Stale : Domain
0 : 0 : 1.0 : 1 : False : False : Boolean
1 : 0 : 1.0 : 1 : False : False : Boolean
2 : 0 : 0.0 : 1 : False : False : Boolean
3 : 0 : 1.0 : 1 : False : False : Boolean
4 : 0 : 1.0 : 1 : False : False : Boolean
A more stressful rendition with 100,000 rows:
# row selection Integer Program stress test
import pyomo.environ as pyo
import numpy as np
# make a large matrix 100,000 x 8
data1 = np.random.randint(0, 1000, size=(100_000, 8))
# inject "the right answer into 3 rows"
data1[42602] = [8, 0, 0, 0, 0, 0, 0, 0 ]
data1[3] = [0, 0, 0, 0, 4, 3, 2, 1 ]
data1[10986] = [0, 7, 6, 5, 0, 0, 0, 0 ]
data_dict = {(i, j): data1[i][j] for i in range(len(data1)) for j in range(len(data1[0]))}
model = pyo.ConcreteModel()
# sets
model.I = pyo.Set(initialize=range(len(data1))) # a simple row index
model.J = pyo.Set(initialize=range(len(data1[0]))) # a simple column index
# parameters
model.matrix = pyo.Param(model.I , model.J, initialize=data_dict) # hold the sparse matrix of values
magic_sum = [8, 7, 6, 5, 4, 3, 2, 1 ]
# variables
model.row_select = pyo.Var(model.I, domain=pyo.Boolean) # row selection variable
# constraints
# ensure the columnar sum is at least the magic sum for all j
def min_sum(model, j):
    return sum(model.row_select[i] * model.matrix[(i, j)] for i in model.I) >= magic_sum[j]
model.c1 = pyo.Constraint(model.J, rule=min_sum)
# objective function
# minimize the overage
def objective(model):
    delta = 0
    for j in model.J:
        delta += sum(model.row_select[i] * model.matrix[i, j] for i in model.I) - magic_sum[j]
    return delta
model.OBJ = pyo.Objective(rule=objective)
solver = pyo.SolverFactory('cbc')
result = solver.solve(model)
result.write()
print('\n\n======== row selections =======')
for i in model.I:
    if model.row_select[i].value > 0:
        print(f'row {i} selected')
Output:
# ----------------------------------------------------------
# Solver Information
# ----------------------------------------------------------
Solver:
- Status: ok
User time: -1.0
System time: 2.18
Wallclock time: 2.61
Termination condition: optimal
Termination message: Model was solved to optimality (subject to tolerances), and an optimal solution is available.
Statistics:
Branch and bound:
Number of bounded subproblems: 0
Number of created subproblems: 0
Black box:
Number of iterations: 0
Error rc: 0
Time: 2.800779104232788
# ----------------------------------------------------------
# Solution Information
# ----------------------------------------------------------
Solution:
- number of solutions: 0
number of solutions displayed: 0
======== row selections =======
row 3 selected
row 10986 selected
row 42602 selected
This one recursively either picks or skips each element. As soon as a branch becomes impossible to solve (no elements left, or any target value negative) it returns false. If the sum of the target is 0, a solution has been found and is returned as the list of picked elements.
Feel free to add time and memory complexity in the comments. Worst case should be 2^(n+1)
Please let me know how it performs on your 8/50 data.
const elements = [
  [1, 0, 0, 0],
  [1, 1, 0, 0],
  [1, 0, 1, 0],
  [1, 1, 1, 0],
  [1, 1, 1, 1]
];
const target = [4, 3, 2, 1];
let iterations = 0;
console.log(iter(elements, target, [], 0));
console.log(`Iterations: ${iterations}`);

function iter(elements, target, picked, index) {
  iterations++;
  const sum = target.reduce(function(element, sum) {
    return sum + element;
  });
  if (sum === 0) return picked;
  if (elements.length === 0) return false;
  const result = iter(
    removeElement(elements, 0),
    target,
    picked,
    index + 1
  );
  if (result !== false) return result;
  const newTarget = matrixSubtract(target, elements[0]);
  const hasNegatives = newTarget.some(function(element) {
    return element < 0;
  });
  if (hasNegatives) return false;
  return iter(
    removeElement(elements, 0),
    newTarget,
    picked.concat(index),
    index + 1
  );
}

function removeElement(target, i) {
  return target.slice(0, i).concat(target.slice(i + 1));
}

function matrixSubtract(minuend, subtrahend) {
  let i = 0;
  return minuend.map(function(element) {
    return minuend[i] - subtrahend[i++]
  });
}
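For readers following the Python examples elsewhere on this page, the same pick-or-skip recursion can be sketched in Python. This is a loose adaptation rather than a line-by-line port: the function name find_rows and its parameters are made up for this sketch, it tries the "pick" branch before the "skip" branch, and it assumes all matrix entries are non-negative so that a negative remainder can safely prune a branch.

```python
def find_rows(rows, target, start=0, picked=()):
    """Return indices of rows summing to target, or None if impossible."""
    # Nothing left to cover: the rows picked so far are a solution.
    if all(t == 0 for t in target):
        return list(picked)
    # No rows left to try on this branch.
    if start == len(rows):
        return None
    # Branch 1: pick rows[start], but only if no component goes negative.
    remaining = [t - r for t, r in zip(target, rows[start])]
    if all(v >= 0 for v in remaining):
        found = find_rows(rows, remaining, start + 1, picked + (start,))
        if found is not None:
            return found
    # Branch 2: skip rows[start].
    return find_rows(rows, target, start + 1, picked)


rows = [
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 1, 1, 0],
    [1, 1, 1, 1],
]
print(find_rows(rows, [4, 3, 2, 1]))  # [0, 1, 3, 4]
```

Passing an index instead of slicing the list avoids copying the remaining rows on every call; the worst case is still exponential in the number of rows.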

Is there a common name for a function that maps by an index?

Names such as map, filter or sum are generally understood by every reasonably good programmer.
I wonder whether the following function f also has such a standard name:
def f(data, idx): return [data[i] for i in idx]
Example usages:
r = f(['world', '!', 'hello'], [2, 0, 1, 1, 1])
piecePrice = [100, 50, 20, 180]
pieceIdx = [0, 2, 3, 0, 0]
totalPrice = sum(f(piecePrice, pieceIdx))
I started with map, but map is generally understood as a function that applies a function on each element of a list.
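For comparison, here is a sketch of the same lookup written with a function that already exists in Python's standard library, operator.itemgetter. NumPy exposes the equivalent array operation as take (np.take(data, idx)). Whether either counts as a "standard name" is a judgment call; the variable names below are adapted from the example above.

```python
from operator import itemgetter

piece_price = [100, 50, 20, 180]
piece_idx = [0, 2, 3, 0, 0]

# itemgetter(*idx) builds a callable that looks up every index in idx.
# With two or more indices it returns a tuple of the selected items.
picked = itemgetter(*piece_idx)(piece_price)
total_price = sum(picked)
print(picked, total_price)  # (100, 20, 180, 100, 100) 500
```

One caveat: with exactly one index, itemgetter returns the bare item rather than a one-element tuple, so the list comprehension in f is more uniform for arbitrary index lists.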

Calculate the amount of water a tool described by an array can contain

There is a tool for collecting rainwater. The transect chart of the tool is described by an array in the length of n.
For example:
for this array {2,1,1,4,1,1,2,3} the transect chart is:
I am required to calculate the amount of water the tool can hold, in time and space complexity of O(n).
For the array above it is 7 (the grey area).
My thought:
Since it's a graphical problem, my initial thought was to first calculate the maximum of the array and multiply it by n. This is the starting volume I need to subtract from.
For example in the array above I need to subtract the green area and the heights themselves:
This is where I'm stuck and need help in order to do so in the required complexity.
Note: Maybe I'm overthinking and there are better ways to handle this problem. But as I said, since it's a graphical problem, my first thought was to go for a geometric solution.
Any tips or hints would be appreciated.
The water level at position i is the smaller of:
The maximum container height at positions <= i; and
The maximum container height at positions >= i
Calculate these two maximum values for every position using two passes through the array, and then sum up the differences between the water levels and the container heights.
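The two-pass idea described above can be sketched as follows. This is a minimal illustration, not code from the answer; the function name water_capacity and the variable names are made up for this example.

```python
def water_capacity(heights):
    """Total water held, using prefix and suffix maxima (O(n) time and space)."""
    n = len(heights)
    if n == 0:
        return 0

    # Pass 1: highest wall at positions <= i.
    left_max = [0] * n
    left_max[0] = heights[0]
    for i in range(1, n):
        left_max[i] = max(left_max[i - 1], heights[i])

    # Pass 2: highest wall at positions >= i.
    right_max = [0] * n
    right_max[-1] = heights[-1]
    for i in range(n - 2, -1, -1):
        right_max[i] = max(right_max[i + 1], heights[i])

    # The water level at i is the smaller of the two maxima;
    # the water depth is that level minus the container height.
    return sum(min(l, r) - h for l, r, h in zip(left_max, right_max, heights))


print(water_capacity([2, 1, 1, 4, 1, 1, 2, 3]))  # 7
```

For the sample array, the per-position depths are 0, 1, 1, 0, 2, 2, 1, 0, which sum to 7.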
Here is a Python implementation of an algorithm similar to the one described by @MattTimmermans. The code reads like pseudocode, so I don't think extra explanations are needed:
def _find_water_capacity(container):
    """returns the max water capacity as calculated from the left bank
    of the given container
    """
    water_levels = [0]
    current_left_bank = 0
    idx = 0
    while idx < len(container) - 1:
        current_left_bank = max(current_left_bank, container[idx])
        current_location_height = container[idx + 1]
        possible_water_level = current_left_bank - current_location_height
        if possible_water_level <= 0:
            water_levels.append(0)
        else:
            water_levels.append(possible_water_level)
        idx += 1
    return water_levels

def find_water_capacity(container):
    """returns the actual water capacity as the sum of the minimum between the
    left and right capacity for each position """
    to_left = _find_water_capacity(container[::-1])[::-1] #reverse the result from _find_water_capacity of the reversed container.
    to_right = _find_water_capacity(container)
    return sum(min(left, right) for left, right in zip(to_left, to_right))
def test_find_water_capacity():
    container = []
    expected = 0
    assert find_water_capacity(container) == expected
    assert find_water_capacity(container[::-1]) == expected
    container = [1, 1, 1, 1, 1]
    expected = 0
    assert find_water_capacity(container) == expected
    assert find_water_capacity(container[::-1]) == expected
    container = [5, 4, 3, 2, 1]
    expected = 0
    assert find_water_capacity(container) == expected
    assert find_water_capacity(container[::-1]) == expected
    container = [2, 1, 1, 4, 1, 1, 2, 3] # <--- the sample provided
    expected = 7
    assert find_water_capacity(container) == expected
    assert find_water_capacity(container[::-1]) == expected
    container = [4, 1, 1, 2, 1, 1, 3, 3, 3, 1, 2]
    expected = 10
    assert find_water_capacity(container) == expected
    assert find_water_capacity(container[::-1]) == expected
    container = [4, 5, 6, 7, 8, -10, 12, 11, 10, 9, 9]
    expected = 18
    assert find_water_capacity(container) == expected
    assert find_water_capacity(container[::-1]) == expected
    container = [2, 1, 5, 4, 3, 2, 1, 5, 1, 2]
    expected = 12
    assert find_water_capacity(container) == expected
    assert find_water_capacity(container[::-1]) == expected
    print("***all tests find_water_capacity passed***")

test_find_water_capacity()

Python Coin Change: Incrementing list in return statement?

Edit: Still working on this, making progress though.
def recursion_change(available_coins, tender):
    """
    Returns a tuple containing:
    :an array counting which coins are used to make change, mirroring the input array
    :the number of coins to make tender.
    :coins: List[int]
    :money: int
    :rtype: (List[int], int)
    """
    change_list = [0] * len(available_coins)
    def _helper_recursion_change(change_index, remaining_balance, change_list):
        if remaining_balance == 0:
            return (change_list, sum(change_list))
        elif change_index == -1 or remaining_balance < 0:
            return float('inf')
        else:
            test_a = _helper_recursion_change(change_index-1, remaining_balance, change_list)
            test_b = _helper_recursion_change(_helper_recursion_change(len(available_coins)-1, tender, change_list))
            test_min = min(test_a or test_b)
            if :
                _helper_recursion_change()
            else:
                _helper_recursion_change()
            return 1 + _helper_recursion_change(change_index, remaining_balance-available_coins[change_index], change_list))
print str(recursion_change([1, 5, 10, 25, 50, 100], 72)) # Current Output: 5
# Desired Output: ([2, 0, 2, 0, 1, 0], 5)
Quick overview: this coin-change algorithm is supposed to receive a list of possible change options and tender. It's supposed to recursively output a mirror array and the number of coins needed to make tender, and I think the best way to do that is with a tuple.
For example:
> recursion_change([1, 2, 5, 10, 25], 49)
>> ([0, 2, 0, 2, 1], 5)
Working code sample:
http://ideone.com/mmtuMr
def recursion_change(coins, money):
    """
    Returns a tuple containing:
    :an array counting which coins are used to make change, mirroring the input array
    :the number of coins to make tender.
    :coins: List[int]
    :money: int
    :rtype: (List[int], int)
    """
    change_list = [0] * len(coins)
    def _helper_recursion_change(i, k, change_list):
        if k == 0: # Base case: money in this (sub)problem matches change precisely
            return 0
        elif i == -1 or k < 0: # Base case: change cannot be made for this subproblem
            return float('inf')
        else: # Otherwise, simplify by recursing:
            # Take the minimum of:
            # the number of coins to make i cents
            # the number of coins to make k-i cents
            return min(_helper_recursion_change(i-1, k, change_list), 1 + _helper_recursion_change(i, k-coins[i], change_list))
    return (_helper_recursion_change(len(coins)-1, money, change_list))
print str(recursion_change([1, 5, 10, 25, 50, 100], 6)) # Current Output: 2
# Desired Output: ([1, 1, 0, 0, 0, 0], 2)
Particularly, this line:
1 + _helper_recursion_change(i, k-coins[i], change_list))
It's easy enough to catch the number of coins we need, as the program does now. Do I have to change the return value to include change_list, so I can increment it? What's the best way to do that without messing with the recursion, as it currently returns just a simple integer.
Replacing change_list in the line above with change_list[i] + 1 gives me a
TypeError: 'int' object is unsubscriptable, and change_list[i] += 1 fails to run because it's 'invalid syntax'.
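One possible way to get both results, sketched here for Python 3 and not taken from the original post, is to have the helper return a (counts, total) pair instead of a bare integer, and to copy the counts list only on the branch that actually uses a coin. The names helper, skip_counts and use_counts are made up for this sketch, and like the code above it is plain exponential recursion without memoization.

```python
def recursion_change(coins, money):
    """Return (counts mirroring coins, number of coins), minimizing the coin count."""
    def helper(i, k):
        if k == 0:
            # Exact change reached: no coins used in this subproblem.
            return [0] * len(coins), 0
        if i == -1 or k < 0:
            # Change cannot be made on this branch.
            return None, float('inf')
        skip_counts, skip_total = helper(i - 1, k)        # do not use coins[i]
        use_counts, use_total = helper(i, k - coins[i])   # use one coins[i]
        if use_total + 1 < skip_total:
            counts = list(use_counts)  # copy so sibling branches are unaffected
            counts[i] += 1
            return counts, use_total + 1
        return skip_counts, skip_total

    return helper(len(coins) - 1, money)


print(recursion_change([1, 5, 10, 25, 50, 100], 6))  # ([1, 1, 0, 0, 0, 0], 2)
```

Returning a fresh list from each call avoids mutating a shared change_list, which would otherwise accumulate increments from branches that min() later discards.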

Give random integers and a transform function, after some times of transformation, it will run into a cycle

This is a derived question; you can refer to the original question.
My question is: given 10 random integers (from 0 to 9, repetition allowed) and a transform function f, defined as follows (in Python 3.3 code):
def f(a):
    l = []
    for i in range(10):
        l.append(a.count(i))
    return l
Suppose a is the list of ten random integers. Execute f, assign the result back to a, and repeat this process; after a few iterations you will run into a cycle.
That is to say: in the sequence a, a1=f(a), a2=f(a1), ..., there is a cycle.
The test code is as follows (code from @user1125600):
import random
# tortoise and hare algorithm to detect cycle
a = []
for i in range(10):
    a.append(random.randint(0,9))
print('random:', a)
fast = a
slow = a
i = 0
while True:
    fast = f(f(fast))
    slow = f(slow)
    print('slow:', slow, 'fast:', fast)
    i +=1
    # in case of running into an infinite loop, we are limited to run no more than 10 times
    if(i > 10):
        print('more than 10 times, quit')
        break
    if fast == slow:
        print('you are running in a cycle:', fast, 'loop times:', i)
        break
How can one prove that a cycle must exist? Another interesting thing: looking at the test results, you will find that fast and slow only ever meet at three points: [7, 1, 0, 1, 0, 0, 1, 0, 0, 0], [6, 3, 0, 0, 0, 0, 0, 1, 0, 0] and [6, 2, 1, 0, 0, 0, 1, 0, 0, 0].
There has to be a cycle because f is a function (it always produces the same output for a given input), and because the range of the function (the set of possible outputs) is finite. Since the range is finite, if you repeatedly map the range onto itself, you must eventually get some value you've already seen.
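The finiteness argument can be illustrated with a short sketch that records every state it has visited and stops at the first repeat. Since each state is a list of ten digits from 0 to 9, there are at most 10**10 distinct states, so the loop must terminate. The helper name find_cycle is made up for this example.

```python
def f(a):
    # Same transform as in the question: count how often each digit occurs.
    return [a.count(i) for i in range(10)]


def find_cycle(start):
    """Return (steps before the cycle starts, cycle length) for the sequence from start."""
    seen = {}                 # state -> step at which it was first visited
    state = tuple(start)
    step = 0
    while state not in seen:  # must end: only finitely many states exist
        seen[state] = step
        state = tuple(f(list(state)))
        step += 1
    return seen[state], step - seen[state]


print(find_cycle([6, 2, 1, 0, 0, 0, 1, 0, 0, 0]))  # (0, 1)
print(find_cycle([7, 1, 0, 1, 0, 0, 1, 0, 0, 0]))  # (0, 2)
```

The first list is a fixed point of f, and the second alternates with [6, 3, 0, 0, 0, 0, 0, 1, 0, 0], which matches the three meeting points observed in the question.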
