numpy - calculate value from matrix indexes - matrix

I need to create a new (n*m) x 4 matrix (e.g. named b) from a n x m matrix (e.g. named a), but I don't want to use nested loops for speed reasons. Here how I would do it with a nested loop:
for j in xrange(1,m+1):
for i in xrange(1,n+1):
index = (j-1)*n+i
b[index,1] = a[i,j]
b[index,2] = index
b[index,3] = s1*i+s2*j+s3
b[index,4] = s4*i+s5*j+s6
The question is, therefore, how to create a new matrix with values derived from original matrix indexes? Thanks

If you can use numpy, you could try
import numpy as np
# Create an empty array
b = np.empty((np.multiply(*a.shape), 4), dtype=a.dtype)
# Get two (nxm) of indices
(irows, icols) = np.indices(a)
# Fill the b array
b[...,0] = a.flat
b[...,1] = np.arange(a.size)
b[...,2] = (s1*irows + s2*icols + s3).flat
b[...,3] = (s4*irows + s5*icols + s6).flat

some minor corrections (could not post as comment :/ ) for those who will have a similar question:
import numpy as np
# Create an empty array
b = np.empty((a.size, 4), dtype=a.dtype)
# Get two (nxm) of indices (starting from 1)
(irows, icols) = np.indices(a.shape) + 1
# Fill the b array
b[...,0] = a.flat
b[...,1] = np.arange(a.size) + 1
b[...,2] = (s1*irows + s2*icols + s3).flat
b[...,3] = (s4*irows + s5*icols + s6).flat

Related

How to extract optimization problem matrices A,b,c using JuMP in Julia

I create an optimization model in Julia-JuMP using the symbolic variables and constraints e.g. below
using JuMP
using CPLEX
# model
Mod = Model(CPLEX.Optimizer)
# sets
I = 1:2;
# Variables
x = #variable( Mod , [I] , base_name = "x" )
y = #variable( Mod , [I] , base_name = "y" )
# constraints
Con1 = #constraint( Mod , [i in I] , 2 * x[i] + 3 * y[i] <= 100 )
# objective
ObjFun = #objective( Mod , Max , sum( x[i] + 2 * y[i] for i in I) ) ;
# solve
optimize!(Mod)
I guess JuMP creates the problem in the form minimize c'*x subj to Ax < b before it is passes to the solver CPLEX. I want to extract the matrices A,b,c. In the above example I would expect something like:
A
2×4 Array{Int64,2}:
2 0 3 0
0 2 0 3
b
2-element Array{Int64,1}:
100
100
c
4-element Array{Int64,1}:
1
1
2
2
In MATLAB the function prob2struct can do this https://www.mathworks.com/help/optim/ug/optim.problemdef.optimizationproblem.prob2struct.html
In there a JuMP function that can do this?
This is not easily possible as far as I am aware.
The problem is stored in the underlying MathOptInterface (MOI) specific data structures. For example, constraints are always stored as MOI.AbstractFunction - in - MOI.AbstractSet. The same is true for the MOI.ObjectiveFunction. (see MOI documentation: https://jump.dev/MathOptInterface.jl/dev/apimanual/#Functions-1)
You can however, try to recompute the objective function terms and the constraints in matrix-vector-form.
For example, assuming you still have your JuMP.Model Mod, you can examine the objective function closer by typing:
using MathOptInterface
const MOI = MathOptInterface
# this only works if you have a linear objective function (the model has a ScalarAffineFunction as its objective)
obj = MOI.get(Mod, MOI.ObjectiveFunction{MOI.ScalarAffineFunction{Float64}}())
# take a look at the terms
obj.terms
# from this you could extract your vector c
c = zeros(4)
for term in obj.terms
c[term.variable_index.value] = term.coefficient
end
#show(c)
This gives indeed: c = [1.;1.;2.;2.].
You can do something similar for the underlying MOI.constraints.
# list all the constraints present in the model
cons = MOI.get(Mod, MOI.ListOfConstraints())
#show(cons)
in this case we only have one type of constraint, i.e. (MOI.ScalarAffineFunction{Float64} in MOI.LessThan{Float64})
# get the constraint indices for this combination of F(unction) in S(et)
F = cons[1][1]
S = cons[1][2]
ci = MOI.get(Mod, MOI.ListOfConstraintIndices{F,S}())
You get two constraint indices (stored in the array ci), because there are two constraints for this combination F - in - S.
Let's examine the first one of them closer:
ci1 = ci[1]
# to get the function and set corresponding to this constraint (index):
moi_backend = backend(Mod)
f = MOI.get(moi_backend, MOI.ConstraintFunction(), ci1)
f is again of type MOI.ScalarAffineFunction which corresponds to one row a1 in your A = [a1; ...; am] matrix. The row is given by:
a1 = zeros(4)
for term in f.terms
a1[term.variable_index.value] = term.coefficient
end
#show(a1) # gives [2.0 0 3.0 0] (the first row of your A matrix)
To get the corresponding first entry b1 of your b = [b1; ...; bm] vector, you have to look at the constraint set of that same constraint index ci1:
s = MOI.get(moi_backend, MOI.ConstraintSet(), ci1)
#show(s) # MathOptInterface.LessThan{Float64}(100.0)
b1 = s.upper
I hope this gives you some intuition on how the data is stored in MathOptInterface format.
You would have to do this for all constraints and all constraint types and stack them as rows in your constraint matrix A and vector b.
Use the following lines:
Pkg.add("NLPModelsJuMP")
using NLPModelsJuMP
nlp = MathOptNLPModel(model) # the input "< model >" is the name of the model you created by JuMP before with variables and constraints (and optionally the objective function) attached to it.
x = zeros(nlp.meta.nvar)
b = NLPModelsJuMP.grad(nlp, x)
A = Matrix(NLPModelsJuMP.jac(nlp, x))
I didn't try it myself. But the MathProgBase package seems to be able to provide A, b, and c in matrix form.

Algorithm for finding all points within distance of another point

I had this problem for an entry test for a job. I did not pass the test. I am disguising the question in deference to the company.
Imagine you have N number of people in a park of A X B space. If a person has no other person within 50 feet, he enjoys his privacy. Otherwise, his personal space is violated. Given a set of (x, y), how many people will have their space violated?
For example, give this list in Python:
people = [(0,0), (1,1), (1000, 1000)]
We would find 2 people who are having their space violated: 1, 2.
We don't need to find all sets of people; just the total number of unique people.
You can't use a brute method to solve the problem. In other words, you can't use a simple array within an array.
I have been working on this problem off and on for a few weeks, and although I have gotten a solution faster than n^2, have not come up with a problem that scales.
I think the only correct way to solve this problem is by using Fortune's algorithm?
Here's what I have in Python (not using Fortune's algorithm):
import math
import random
random.seed(1) # Setting random number generator seed for repeatability
TEST = True
NUM_PEOPLE = 10000
PARK_SIZE = 128000 # Meters.
CONFLICT_RADIUS = 500 # Meters.
def _get_distance(x1, y1, x2, y2):
"""
require: x1, y1, x2, y2: all integers
return: a distance as a float
"""
distance = math.sqrt(math.pow((x1 - x2), 2) + math.pow((y1 - y2),2))
return distance
def check_real_distance(people1, people2, conflict_radius):
"""
determine if two people are too close
"""
if people2[1] - people1[1] > conflict_radius:
return False
d = _get_distance(people1[0], people1[1], people2[0], people2[1])
if d >= conflict_radius:
return False
return True
def check_for_conflicts(peoples, conflict_radius):
# sort people
def sort_func1(the_tuple):
return the_tuple[0]
_peoples = []
index = 0
for people in peoples:
_peoples.append((people[0], people[1], index))
index += 1
peoples = _peoples
peoples = sorted(peoples, key = sort_func1)
conflicts_dict = {}
i = 0
# use a type of sweep strategy
while i < len(peoples) - 1:
x_len = peoples[i + 1][0] - peoples[i][0]
conflict = False
conflicts_list =[peoples[i]]
j = i + 1
while x_len <= conflict_radius and j < len(peoples):
x_len = peoples[j][0] - peoples[i][0]
conflict = check_real_distance(peoples[i], peoples[j], conflict_radius)
if conflict:
people1 = peoples[i][2]
people2 = peoples[j][2]
conflicts_dict[people1] = True
conflicts_dict[people2] = True
j += 1
i += 1
return len(conflicts_dict.keys())
def gen_coord():
return int(random.random() * PARK_SIZE)
if __name__ == '__main__':
people_positions = [[gen_coord(), gen_coord()] for i in range(NUM_PEOPLE)]
conflicts = check_for_conflicts(people_positions, CONFLICT_RADIUS)
print("people in conflict: {}".format(conflicts))
As you can see from the comments, there's lots of approaches to this problem. In an interview situation you'd probably want to list as many as you can and say what the strengths and weaknesses of each one are.
For the problem as stated, where you have a fixed radius, the simplest approach is probably rounding and hashing. k-d trees and the like are powerful data structures, but they're also quite complex and if you don't need to repeatedly query them or add and remove objects they might be overkill for this. Hashing can achieve linear time, versus spatial trees which are n log n, although it might depend on the distribution of points.
To understand hashing and rounding, just think of it as partitioning your space up into a grid of squares with sides of length equal to the radius you want to check against. Each square is given it's own "zip code" which you can use as a hash key to store values in that square. You can compute the zip code of a point by dividing the x and y co-ordinates by the radius, and rounding down, like this:
def get_zip_code(x, y, radius):
return str(int(math.floor(x/radius))) + "_" + str(int(math.floor(y/radius)))
I'm using strings because it's simple, but you can use anything as long as you generate a unique zip code for each square.
Create a dictionary, where the keys are the zip codes, and the values are lists of all the people in that zip code. To check for conflicts, add the people one at a time, and before adding each one, test for conflicts with all the people in the same zip code, and the zip code's 8 neighbours. I've reused your method for keeping track of conflicts:
def check_for_conflicts(peoples, conflict_radius):
index = 0
d = {}
conflicts_dict = {}
for person in peoples:
# check for conflicts with people in this person's zip code
# and neighbouring zip codes:
for offset_x in range(-1, 2):
for offset_y in range(-1, 2):
offset_zip_code = get_zip_code(person[0] + (offset_x * conflict_radius), person[1] + (offset_y * conflict_radius), conflict_radius)
if offset_zip_code in d:
# get a list of people in this zip:
other_people = d[offset_zip_code]
# check for conflicts with each of them:
for other_person in other_people:
conflict = check_real_distance(person, other_person, conflict_radius)
if conflict:
people1 = index
people2 = other_person[2]
conflicts_dict[people1] = True
conflicts_dict[people2] = True
# add the new person to their zip code
zip_code = get_zip_code(person[0], person[1], conflict_radius)
if not zip_code in d:
d[zip_code] = []
d[zip_code].append([person[0], person[1], index])
index += 1
return len(conflicts_dict.keys())
The time complexity of this depends on a couple of things. If you increase the number of people, but don't increase the size of the space you are distributing them in, then it will be O(N2) because the number of conflicts is going to increase quadratically and you have to count them all. However if you increase the space along with the number of people, so that the density is the same, it will be closer to O(N).
If you're just counting unique people, you can keep a count if how many people in each zip code have at least 1 conflict. If its equal to everyone in the zip code, you can early out of the loop that checks for conflicts in a given zip after the first conflict with the new person, since no more uniques will be found. You could also loop through twice, adding all people on the first loop, and testing on the second, breaking out of the loop when you find the first conflict for each person.
You can see this topcoder link and section 'Closest pair'. You can modify the closest pair algorithm so that the distance h is always 50.
So , what you basically do is ,
Sort the people by X coordinate
Sweep from left to right.
Keep a balanced binary tree and keep all the points within 50 radii in the binary tree. The key of the binary tree would be the Y coordinates of the point
Select the points with Y-50 and Y+50 , this can be done with the binary tree in lg(n) time.
So the overall complexity becomes nlg(n)
Be sure to mark the points you find to skip those points in the future.
You can use set in C++ as the binary tree.But I couldn't find if python set supports range query or upper_bound and lower_bound.If someone knows , please point that out in the comments.
Here's my solution to this interesting problem:
from math import sqrt
import math
import random
class Person():
def __init__(self, x, y, conflict_radius=500):
self.position = [x, y]
self.valid = True
self.radius = conflict_radius**2
def validate_people(self, people):
P0 = self.position
for p in reversed(people):
P1 = p.position
dx = P1[0] - P0[0]
dy = P1[1] - P0[1]
dx2 = (dx * dx)
if dx2 > self.radius:
break
dy2 = (dy * dy)
d = dx2 + dy2
if d <= self.radius:
self.valid = False
p.valid = False
def __str__(self):
p = self.position
return "{0}:{1} - {2}".format(p[0], p[1], self.valid)
class Park():
def __init__(self, num_people=10000, park_size=128000):
random.seed(1)
self.num_people = num_people
self.park_size = park_size
def gen_coord(self):
return int(random.random() * self.park_size)
def generate(self):
return [[self.gen_coord(), self.gen_coord()] for i in range(self.num_people)]
def naive_solution(data):
sorted_data = sorted(data, key=lambda x: x[0])
len_sorted_data = len(sorted_data)
result = []
for index, pos in enumerate(sorted_data):
print "{0}/{1} - {2}".format(index, len_sorted_data, len(result))
p = Person(pos[0], pos[1])
p.validate_people(result)
result.append(p)
return result
if __name__ == '__main__':
people_positions = Park().generate()
with_conflicts = len(filter(lambda x: x.valid, naive_solution(people_positions)))
without_conflicts = len(filter(lambda x: not x.valid, naive_solution(people_positions)))
print("people with conflicts: {}".format(with_conflicts))
print("people without conflicts: {}".format(without_conflicts))
I'm sure the code can be still optimized further
I found a relatively solution to the problem. Sort the list of coordinates by the X value. Then look at each X value, one at a time. Sweep right, checking the position with the next position, until the end of the sweep area is reached (500 meters), or a conflict is found.
If no conflict is found, sweep left in the same manner. This method avoids unnecessary checks. For example, if there are 1,000,000 people in the park, then all of them will be in conflict. The algorithm will only check each person one time: once a conflict is found the search stops.
My time seems to be O(N).
Here is the code:
import math
import random
random.seed(1) # Setting random number generator seed for repeatability
NUM_PEOPLE = 10000
PARK_SIZE = 128000 # Meters.
CONFLICT_RADIUS = 500 # Meters.
check_real_distance = lambda conflict_radius, people1, people2: people2[1] - people1[1] <= conflict_radius \
and math.pow(people1[0] - people2[0], 2) + math.pow(people1[1] - people2[1], 2) <= math.pow(conflict_radius, 2)
def check_for_conflicts(peoples, conflict_radius):
peoples.sort(key = lambda x: x[0])
conflicts_dict = {}
i = 0
num_checks = 0
# use a type of sweep strategy
while i < len(peoples) :
conflict = False
j = i + 1
#sweep right
while j < len(peoples) and peoples[j][0] - peoples[i][0] <= conflict_radius \
and not conflict and not conflicts_dict.get(i):
num_checks += 1
conflict = check_real_distance(conflict_radius, peoples[i], peoples[j])
if conflict:
conflicts_dict[i] = True
conflicts_dict[j] = True
j += 1
j = i - 1
#sweep left
while j >= 0 and peoples[i][0] - peoples[j][0] <= conflict_radius \
and not conflict and not conflicts_dict.get(i):
num_checks += 1
conflict = check_real_distance(conflict_radius, peoples[j], peoples[i])
if conflict:
conflicts_dict[i] = True
conflicts_dict[j] = True
j -= 1
i += 1
print("num checks is {0}".format(num_checks))
print("num checks per size is is {0}".format(num_checks/ NUM_PEOPLE))
return len(conflicts_dict.keys())
def gen_coord():
return int(random.random() * PARK_SIZE)
if __name__ == '__main__':
people_positions = [[gen_coord(), gen_coord()] for i in range(NUM_PEOPLE)]
conflicts = check_for_conflicts(people_positions, CONFLICT_RADIUS)
print("people in conflict: {}".format(conflicts))
I cam up with an answer that seems to take O(N) time. The strategy is to sort the array by X values. For each X value, sweep left until a conflict is found, or the distance exceeds the conflict distance (500 M). If no conflict is found, sweep left in the same manner. With this technique, you limit the amount of searching.
Here is the code:
import math
import random
random.seed(1) # Setting random number generator seed for repeatability
NUM_PEOPLE = 10000
PARK_SIZE = 128000 # Meters.
CONFLICT_RADIUS = 500 # Meters.
check_real_distance = lambda conflict_radius, people1, people2: people2[1] - people1[1] <= conflict_radius \
and math.pow(people1[0] - people2[0], 2) + math.pow(people1[1] - people2[1], 2) <= math.pow(conflict_radius, 2)
def check_for_conflicts(peoples, conflict_radius):
peoples.sort(key = lambda x: x[0])
conflicts_dict = {}
i = 0
num_checks = 0
# use a type of sweep strategy
while i < len(peoples) :
conflict = False
j = i + 1
#sweep right
while j < len(peoples) and peoples[j][0] - peoples[i][0] <= conflict_radius \
and not conflict and not conflicts_dict.get(i):
num_checks += 1
conflict = check_real_distance(conflict_radius, peoples[i], peoples[j])
if conflict:
conflicts_dict[i] = True
conflicts_dict[j] = True
j += 1
j = i - 1
#sweep left
while j >= 0 and peoples[i][0] - peoples[j][0] <= conflict_radius \
and not conflict and not conflicts_dict.get(i):
num_checks += 1
conflict = check_real_distance(conflict_radius, peoples[j], peoples[i])
if conflict:
conflicts_dict[i] = True
conflicts_dict[j] = True
j -= 1
i += 1
print("num checks is {0}".format(num_checks))
print("num checks per size is is {0}".format(num_checks/ NUM_PEOPLE))
return len(conflicts_dict.keys())
def gen_coord():
return int(random.random() * PARK_SIZE)
if __name__ == '__main__':
people_positions = [[gen_coord(), gen_coord()] for i in range(NUM_PEOPLE)]
conflicts = check_for_conflicts(people_positions, CONFLICT_RADIUS)
print("people in conflict: {}".format(conflicts))

Quickly calculating centroid of binary numpy array

I have a numpy array of 0's and 1's (512 x 512). I'd like to calculate the centroid of the shape of 1's (they're all connected in one circular blob in the middle of the array).
for i in xrange(len(array[:,0])):
for j in xrange(len(array[0,:])):
if array[i,j] == 1:
x_center += i
y_center += j
count = (aorta == 1).sum()
x_center /= count
y_center /= count
Is there a way to speed up my calculation above? Could I use numpy.where() or something? Are there any python functions for doing this in parallel?
You can replace the two nested loops to get the coordinates of the center point, like so -
x_center, y_center = np.argwhere(array==1).sum(0)/count
You could also use scipy.ndimage.measurements.center_of_mass.
import numpy as np
x_c = 0
y_c = 0
area = array.sum()
it = np.nditer(array, flags=['multi_index'])
for i in it:
x_c = i * it.multi_index[1] + x_c
y_c = i * it.multi_index[0] + y_c
centroid = int(x_c/area), int(y_c/area)
Thats should be the centroid of a binary image using numpy.nditer()

Boolean expression in SOP

I'm new to boolean expressions.
I've been given the task to simplify
F(w,x,y,z) = xy’ + x’z’ + wxz + wx’y by using K map.
I've done it and the result is wx’+w’y’+xyz.
Now I have to "Write it in a standard SOP form. You need to provide the steps through which you get the standard SOP".
And i have no idea how to do it. I thought result after k map is sop.
Yes, you already have it in SOP form. But the second question is about Standard (aka canonical) SOP form. That's much simpler to find than having to use K-maps (but it's often long), it's just the sum of minterms.
I think your solution does not cover all ones. These Karnaugh maps show the original expression, the simplified version (minimal SOP) and the canonical SOP, where every product contains all literals (all given variables or their negation).
The original expression is
F(w,x,y,z) = x·¬y + ¬x·¬z + w·x·z + w·¬x·y
– there are two fours and two pairs circled in the corresponding (first one) K-map.
The original expression simplified using K-map (shown in the second one):
F(w,x,y,z) = x·¬y + ¬x·¬z + w·y·z
is different than yours, but you can check for example with wolframalpha online tool, that it is the simplified original expression.
It is also the minimal DNF, but not a sum of minterms (where the output is equal to 1), because there are not all variables in every product of the sum.
The third K-map shows ten minterms circled. They form the canonical DNF:
F(w,x,y,z) = m0 + m2 + m4 + m5 + m8 + m10 + m11 + m12 + m13 + m15 =
= ¬w·¬x·¬y·¬z + ¬w·¬x·y·¬z + ¬w·x·¬y·¬z + ¬w·x·¬y·z + w·¬x·¬y·¬z
+ w·¬x·y·¬z + w·¬x·y·z + w·x·¬y·¬z + w·x·¬y·z + w·x·y·z
I checked your simplified expression, but there are not all ones covered (even if there were some useful do not care states (marked X)). Maybe you made a typo. Or could there be a typo in the original expression?
We can implement the K-Map algorithm in python for 4 variables, as shown below. The function accepts the Boolean function in SOP (sum of products) form and the names of the variables and returns a simplified reduced representation. Basically you need to create rectangular groups containing total terms in power of two like 8, 4, 2 and try to cover as many elements as you can in one group (we need to cover all the ones).
For example, the function can be represented F(w,x,y,z) = xy’ + x’z’ + wxz + wx’y in SOP form as f(w,x,y,z)=∑(0,2,4,5,8,10,11,12,13,15), as can be seen from the below table:
As can be seen from the output of the next code snippet, the program outputs the simplified form x¬y + ¬x¬z + wyz, where negation of a boolean variable x is represented as ¬x in the code.
from collections import defaultdict
from itertools import permutations, product
def kv_map(sop, vars):
sop = set(sop)
not_covered = sop.copy()
sop_covered = set([])
mts = [] # minterms
# check for minterms with 1 variable
all_3 = [''.join(x) for x in product('01', repeat=3)]
for i in range(4):
for v_i in [0,1]:
if len(not_covered) == 0: continue
mt = ('' if v_i else '¬') + vars[i]
s = [x[:i]+str(v_i)+x[i:] for x in all_3]
sop1 = set(map(lambda x: int(x,2), s))
if len(sop1 & sop) == 8 and len(sop_covered & sop1) < 8: # if not already covered
mts.append(mt)
sop_covered |= sop1
not_covered = not_covered - sop1
if len(not_covered) == 0:
return mts
# check for minterms with 2 variables
all_2 = [''.join(x) for x in product('01', repeat=2)]
for i in range(4):
for j in range(i+1, 4):
for v_i in [0,1]:
for v_j in [0,1]:
if len(not_covered) == 0: continue
mt = ('' if v_i else '¬') + vars[i] + ('' if v_j else '¬') + vars[j]
s = [x[:i]+str(v_i)+x[i:] for x in all_2]
s = [x[:j]+str(v_j)+x[j:] for x in s]
sop1 = set(map(lambda x: int(x,2), s))
if len(sop1 & sop) == 4 and len(sop_covered & sop1) < 4: # if not already covered
mts.append(mt)
sop_covered |= sop1
not_covered = not_covered - sop1
if len(not_covered) == 0:
return mts
# check for minterms with 3 variables similarly (code omitted)
# ... ... ...
return mts
mts = kv_map([0,2,4,5,8,10,11,12,13,15], ['w', 'x', 'y', 'z'])
mts
# ['x¬y', '¬x¬z', 'wyz']
The following animation shows how the above code (greedily) simplifies the Boolean function given in SOP form (the basic goal is to cover all the 1s with minimum number of power-2 blocks). Since the algorithm is greedy it may get stuck to some local minimum, that we need to be careful about.

Code to calculate 1D Median Filter

I wonder if anyone knows some python or java code to calculate 1D median filter.
I have a file comma delimited with two fields: Date and Signal.
Something like that:
2014-06-01 11:22:12, 23.8
2014-06-01 11:23:12, 25.9
2014-06-01 11:24:12, 45.7
I would like to read this file and apply the 1D Median Filter with size 23
for the field Signal and save it in another file to remove the noise.
Thanks in advance.
Alexandre.
In case someone stumbled on this later.
To extract the data you can use regex, while for the custom median filter you can have a look here.
I will leave a copy down here in case it is removed:
def medfilt (x, k):
"""Apply a length-k median filter to a 1D array x.
Boundaries are extended by repeating endpoints.
"""
assert k % 2 == 1, "Median filter length must be odd."
assert x.ndim == 1, "Input must be one-dimensional."
k2 = (k - 1) // 2
y = np.zeros ((len (x), k), dtype=x.dtype)
y[:,k2] = x
for i in range (k2):
j = k2 - i
y[j:,i] = x[:-j]
y[:j,i] = x[0]
y[:-j,-(i+1)] = x[j:]
y[-j:,-(i+1)] = x[-1]
return np.median (y, axis=1)
scipy.signal.medfilt accepts 1D kernels:
import pandas as pd
import scipy.signal
def median_filter(file_name, new_file_name, kernel_size):
with open(file_name, 'r') as f:
df = pd.read_csv(f, header=None)
signal = df.iloc[:, 1].values
median = scipy.signal.medfilt(signal, kernel_size)
df = df.drop(df.columns[1], 1)
df[1] = median
df.to_csv(new_file_name, sep=',', index=None, header=None)
if __name__=='__main__':
median_filter('old_signal.csv', 'new_signal.csv', 23)

Resources