How to make argsort return 2D indexes - sorting

torch.argsort returns indexes within a row or a column, depending on dim=...
How can I get 2D indexes instead, e.g.
[[r1, r2, r3, ...], [c1, c2, c3, ...]]
Thanks... this is what I did:
# https://github.com/pytorch/pytorch/issues/35674
import torch

def unravel_indices(indices, shape):
    """Convert flat indices into coordinates for a tensor of the given shape."""
    coord = []
    for dim in reversed(shape):
        coord.append(torch.fmod(indices, dim))
        indices = torch.div(indices, dim, rounding_mode='floor')
    coord = torch.stack(coord[::-1], dim=-1)
    return coord

torch.unravel_indices = unravel_indices

Although NumPy has unravel_index for this, there was no built-in for Torch at the time of writing, but you can do it yourself. For two dimensions it is easy enough:
yy, xx = indices // width, indices % width
FWIW, PyTorch has had a feature request and PR(s) floating around for several years now.
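The two-dimensional case above can be sketched in plain Python to show the arithmetic without any dependencies. The function name argsort_2d is hypothetical. The same floor-division and modulo step should apply to the flat indices obtained by sorting a flattened tensor, but that is an assumption to verify against your own data.

```python
def argsort_2d(matrix):
    """Return (rows, cols) that visit the entries of a 2D list in ascending order."""
    width = len(matrix[0])
    # Row-major flattening, so flat index = row * width + col.
    flat = [value for row in matrix for value in row]
    # Flat argsort: positions in `flat`, ordered by the value stored there.
    order = sorted(range(len(flat)), key=flat.__getitem__)
    rows = [i // width for i in order]
    cols = [i % width for i in order]
    return rows, cols


matrix = [[9, 1, 5],
          [4, 8, 0]]
rows, cols = argsort_2d(matrix)
print(rows)
print(cols)
```

Reading matrix[r][c] for each pair in zip(rows, cols) then walks the values in sorted order.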

Related

Function using multiprocessing returns None values

I have been stuck on this for quite a while now. I am using a multiprocessing pool to speed up a function that previously looped over data points. Before I used multiprocessing the function worked fine, but now it returns some None values (the first few, fewer than 10) before returning actual values. I have tried many things and different ways of using the multiprocessing pool.
The pool is created inside a function, and I am not sure whether that might be the problem.
def SkyViewFactor(point, coords, max_radius):
    betas_lin = np.linspace(0,2*np.pi,steps_beta)
    """this is the analytical dome area but should make same assumption as for d_area"""
    dome_area = max_radius**2*2*np.pi
    """ we throw away all point outside the dome
    # dome is now a 5 column array of points:
    # the 5 columns: x,y,z,radius,angle theta"""
    dome_p = dome(point, coords, max_radius)
    betas = np.zeros(steps_beta)
    """we loop over all points in the dome"""
    d = 0
    while (d < dome_p.shape[0]):
        psi = np.arctan((dome_p[d,2]-point[2])/dome_p[d,3])
        """The angles of the min and max angle of the building"""
        beta_min = - np.arcsin(np.sqrt(2*gridboxsize**2)/2/dome_p[d,3]) + dome_p[d,4]
        beta_max = np.arcsin(np.sqrt(2*gridboxsize**2)/2/dome_p[d,3]) + dome_p[d,4]
        """Where the index of betas fall within the min and max beta, and there is not already a larger psi blocking"""
        betas[np.nonzero(np.logical_and((betas < psi), np.logical_and((beta_min <= betas_lin), (betas_lin < beta_max))))] = psi
        d +=1
    areas = d_area(betas, steps_beta, max_radius)
    """The SVF is the fraction of area of the dome that is not blocked"""
    SVF = np.around((dome_area - np.sum(areas))/dome_area, 3)
    #print(SVF)
    return SVF

def calc_SVF(coords, max_radius, blocklength):
    """
    Function to calculate the sky view factor.
    We create a dome around a point with a certain radius,
    and for each point in the dome we evaluate if the height of this point blocks the view
    :param coords: all coordinates of our dataset
    :param max_radius: maximum radius we think influences the svf
    :param blocklength: the first amount of points in our data set we want to evaluate
    :return: SVF for all points
    """
    def parallel_runs_SVF():
        points = [coords[i,:] for i in range(blocklength)]
        pool = Pool()
        SVF_list = []
        SVF_par = partial(SkyViewFactor, coords=coords,max_radius=max_radius) # prod_x has only one argument x (y is fixed to 10)
        SVF = pool.map(SVF_par, points)
        pool.close()
        pool.join()
        # if SVF != None:
        #     SVF_list.append(SVF)
        print(SVF)
        return SVF
    if __name__ == '__SVF__':
        return parallel_runs_SVF()
This function is later called in:
def reshape_SVF(data,coords,julianday,lat,long,LMT,reshape,save_CSV,save_Im):
    [x_len, y_len] = [int(data.shape[0]/2),int(data.shape[1]/2)]
    blocklength = int(x_len*y_len)
    "Compute SVF and SF and Reshape the shadow factors and SVF back to nd array"
    SVFs = calc_SVF(coords,max_radius,blocklength)
    SFs = calc_SF(coords,julianday,lat,long,LMT,blocklength)
    #SVFs = filter(None, SVFs)
    "If reshape is true we reshape the arrays to the original data matrix"
    if reshape == True:
        SVF_matrix = np.ndarray([x_len,y_len])
        SF_matrix = np.ndarray([x_len,y_len])
        for i in range(blocklength):
            SVF_matrix[coords[int(i-x_len/2),0],coords[int(i-y_len/2),1]] = SVFs[i]
            SF_matrix[coords[int(i-x_len/2),0],coords[int(i-y_len/2),1]] = SFs[i]
        if save_CSV == True:
            np.savetxt("SVFmatrix.csv", SVF_matrix, delimiter=",")
            np.savetxt("SFmatrix.csv", SF_matrix, delimiter=",")
        if save_Im == True:
            tf.imwrite('SVF_matrix.tif', SVF_matrix, photometric='minisblack')
            tf.imwrite('SF_matrix.tif', SF_matrix, photometric='minisblack')
        return SF_matrix,SF_matrix
    elif reshape == False:
        np.savetxt("SVFs.csv", SVFs, delimiter=",")
        np.savetxt("SFs.csv", SFs, delimiter=",")
        return SVFs, SFs
calc_SF is a similar function with the same structure (it also uses multiprocessing). The goal is to return a list of all sky view factors or, if reshape is true, a matrix with the same shape as the original input data (DSM data) holding the sky view factor for each location.
I get the error:
SVF_matrix[coords[int(i-x_len/2),0],coords[int(i-y_len/2),1]] = SVFs[i]
TypeError: 'NoneType' object is not subscriptable
I tried to filter out the None values with the filter() function using
SVFs = filter(None, SVFs)
This returns the error
TypeError: 'NoneType' object is not iterable
I also do not know whether the None values replace actual values or are extra. That is, if I have 1000 data points I should have 1000 sky view factors, but do I get an array with 3 Nones followed by numbers, 3 Nones and 1000 sky view factors, or a list of 1000 values of which the first 3 are None?
I also tried making an empty list and appending the SVF only if it is not None, but this does not work either:
SVF_list = []
if SVF != None:
    SVF_list.append(SVF)
return SVF_list
To include all used functions: These are the functions used in SkyViewFactor to calculate distances and the area elements
def dist(point, coord):
    """
    :param point: evaluation point (x,y,z)
    :param coord: array of coordinates with heights
    :return: the distance from each coordinate to the point and the angle
    """
    # Columns is dx
    dx = (coord[:,1]-point[1])*gridboxsize
    # Rows is dy
    dy = (coord[:,0]-point[0])*gridboxsize
    dist = np.sqrt(abs(dx)**2 + abs(dy)**2)
    """angle is 0 north direction"""
    angle = np.arctan2(dy,dx)+np.pi/2
    return dist,angle

def dome(point, coords, maxR):
    """
    :param point: point we are evaluating
    :param coords: array of coordinates with heights
    :param maxR: maximum radius in which we think the coordinates can influence the SVF
    :return: a dome of points that we take into account to evaluate the SVF
    """
    radii, angles = dist(point,coords)
    coords = np.column_stack([coords, radii])
    coords = np.column_stack([coords, angles])
    """the dome consist of points higher than the view height and within the radius we want"""
    dome = coords[(np.logical_and(coords[:,3]<maxR,coords[:,3]>0.1)),:]
    dome = dome[(dome[:,2]>point[2]),:]
    return dome

def d_area(psi,steps_beta,maxR):
    """Radius at ground surface and at the height of the projection of the building"""
    d_area = 2*np.pi/steps_beta*maxR**2*np.sin(psi)
    return d_area
Help with this, or any other suggestion to speed up my code or use multiprocessing the right way, is much appreciated!
I am trying to speed up a for loop using multiprocessing. It works without multiprocessing, but since I have to iterate over 1,250,000 data points it is way too slow. With multiprocessing it returns None for the first few values.
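Judging only from the code shown, one likely cause of the "'NoneType' object is not subscriptable" error is the guard inside calc_SVF. Unless the module itself is named __SVF__, the condition __name__ == '__SVF__' is never true, so the function reaches its end without a return statement and Python returns None for the whole call. The sketch below reproduces that behaviour with hypothetical stand-in functions and no multiprocessing at all.

```python
def calc_with_misplaced_guard(values):
    """Stand-in for calc_SVF: the inner worker is only called behind a guard."""
    def inner():
        return [v * 2 for v in values]

    if __name__ == '__SVF__':  # never true unless the module is named __SVF__
        return inner()
    # No other return statement, so Python implicitly returns None here.


def calc_fixed(values):
    """Same function without the guard: the result is always returned."""
    def inner():
        return [v * 2 for v in values]

    return inner()


print(calc_with_misplaced_guard([1, 2, 3]))
print(calc_fixed([1, 2, 3]))
```

With multiprocessing, the guard conventionally sits at module level as if __name__ == '__main__': around the code that starts the pool, not inside the function whose result is needed.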

Subtracting a best fit line with numpy.polyfit()?

So I'm working on a project and I have a set of data that I loaded in as a csv. The data has a spot that I need to flatten out. I used the numpy.polyfit() function to find a line of best fit, but what I can't seem to figure out is how to subtract the best-fit line. Any advice?
Here is the code I'm using so far:
μ = pd.read_csv("C:\\Users\\ander\\Documents\\Data\\plots and code\\dataframe2.csv")
yvalue = "average"
xvalue = "xvalue"
X = μ[xvalue][173:852]
Y = μ[yvalue][173:852]
fit = np.polyfit(X, Y, 1)
μ = μ.subtract(fit, μ)
The polyfit function returns the coefficients of the best-fit polynomial (for degree 1, the slope and the intercept). To subtract the line from your data, you first need to build the linear function itself, for example with numpy.poly1d.
I'll show you an example. Since we don't have access to the .csv file, I made up X and Y:
import matplotlib.pyplot as plt
import numpy as np
DATA_SIZE = 500
μ_X = np.sort(np.random.uniform(0,10,DATA_SIZE))
μ_Y = 3*np.exp(-(μ_X-7)**2) + np.random.normal(0,0.08,DATA_SIZE) + 0.5*μ_X
X = μ_X[50:200]
Y = μ_Y[50:200]
plt.scatter(μ_X, μ_Y, label='Full data')
plt.scatter(X, Y, label='Selected region')
plt.legend()
plt.show()
Now we can fit the baseline from the orange data and subtract the linear function from all the data (blue).
fit = np.polyfit(X, Y, 1)
linear_baseline = np.poly1d(fit) # create the linear baseline function
μ_Y = μ_Y - linear_baseline(μ_X) # subtract the baseline from μ_Y
plt.scatter(μ_X, μ_Y, label='Linear baseline removed')
plt.legend()
plt.show()

seaborn kdeplot x axis scaling?

I have a histogram of my data:
h is a 1-d array of counts
x is a 1-d array of bin values
Now if I do:
sns.kdeplot(h, shade=True);
I get a plot where x-axis goes from -20 to 100, which has nothing to do with
my original x data. How do I get the x-axis scaled to match my data?
I see I misunderstood the input to kde. It wants the original values. I had already created a histogram and wanted to feed that to kde.
In my histogram I have h.buckets, and h.results. I did
def hist_to_values(hist):
    ret = []
    # repeat each bucket value as many times as its count
    for x, y in zip(hist.buckets, hist.results):
        ret.extend([x] * y)
    return np.array(ret)
Then feed this to kde, and I got the results I expect.
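The expansion step can be sketched in plain Python to show why it fixes the axis: the rebuilt sample contains bucket values, not counts, so it lies in the same range as the original x data. The function name expand_histogram and the example numbers are made up for illustration, and the sketch assumes the counts are non-negative integers.

```python
from itertools import chain, repeat


def expand_histogram(buckets, counts):
    """Repeat each bucket value `count` times to rebuild a list of samples."""
    return list(chain.from_iterable(repeat(b, c) for b, c in zip(buckets, counts)))


buckets = [10, 20, 30]   # bin values (the x data)
counts = [2, 0, 3]       # how many observations fell into each bin
values = expand_histogram(buckets, counts)
print(values)
```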

Sparse matrix plot matlab

I have a 5000 x 5000 sparse matrix with 4 different values. I want to visualise the nonzero elements with 4 different colors so that I can recognise the ratio of these values and the relationships between them. I use imagesc, but I cannot distinguish the different values very well, especially the values with a smaller ratio. I think it would work if I used a different symbol for each value, but I don't know how to do that in Matlab. Any suggestions? The result of Dan's code is in the figure below.
You could reform the matrix to be a set of [X, Y, F] coordinates (re-using my answer from Resampling Matrix and restoring in one single Matrix):
Assuming your matrix is M
[X, Y] = meshgrid(1:size(M,2), 1:size(M,1)); % X = column index, Y = row index, both sized like M (also for non-square M)
Mf = M(:); %used again later, hence stored
V = [X(:), Y(:), Mf];
get rid of the zero elements
V(Mf == 0, :) = [];
At this point, if you have access to the Statistics Toolbox, you can just call gscatter(V(:,1), V(:,2), V(:,3)) to get the correct plot. If you don't have the toolbox, continue with the following:
Find a list of the unique values in M
Vu = unique(V(:,3));
For each such value, plot the points as an xy scatter plot. Note that hold all makes sure the colour changes each time a new plot is added, i.e. on each new iteration of the loop:
hold all;
for g = 1:length(Vu)
    Vg = V(V(:,3)==Vu(g),:); % rows belonging to the g-th unique value
    plot(Vg(:,1), Vg(:,2), '*');
    a{g}=num2str(Vu(g));
end
legend(a);
Example M:
M = zeros(1000);
M(200,30) = 7;
M(100, 900) = 10;
M(150, 901) = 13;
M(600, 600) = 13;
Result:
Now I can answer the first part of the question. I suppose you need to do something like
sum(histc(A, unique(A)),2)
to count how often each unique value occurs in the matrix.
temp = histc(A, unique(A)) "is a matrix of column histogram counts", so you get the counts of all values of unique(A) as they appear in the columns of A.
I then compute stat = sum(temp,2) to get the counts of all values of unique(A) over the whole matrix.
Then you can use the code proposed by @Dan to visualize the result.
hold all;
u=unique(A);
for i = 1:length(stat)
    plot(u(i), stat(i)/max(stat), '*');
end
Please clarify what kind of relationship between the values you mean.

Male/Female Classification with Matlab- About Finding Mean Image

I am working on a project about pattern (male/female) classification with Matlab. I have a problem and need your help, please.
My program should find the mean images of datasets. The first dataset is women and the second dataset is men, so the first mean image has to look like a woman and the second like a man. I have different datasets, all in JPEG format. I am trying different datasets to check whether my program is working, but with some datasets I do not get meaningful mean images. For example:
These are the mean images from one dataset:
But when I use another dataset my mean images look like this; they are meaningless, i.e. they don't look like faces:
What can be the reason for this? I need to work with different datasets. Please help.
filenamesA = dir(fullfile(pathfora, '*.jpg'));
Train_NumberA = numel(filenamesA);
%%%%%%%%%%%%%%%%%%%% Finding Image Vectors for A
imagesA= [];
for k = 1 : Train_NumberA
    str = int2str(k);
    str= strcat(str);
    str = strcat('\',str,'b','.jpg');
    str = strcat(pathfora,str);
    imgA = imread(str);
    imgA = rgb2gray(imgA);
    [irowA icolA] = size(imgA);
    tempA = reshape(imgA',irowA*icolA,1); % Reshaping 2D images into 1D image vectors
    imagesA = [imagesA tempA]; % 'imagesA' grows after each turn
    imagesA=double(imagesA);
end
%%%%%%%%%%%%%%%%%%%%%%%% Calculate the MEAN IMAGE VECTOR for A
mean_vectorA= mean(imagesA,2); % Computing the average vector m = (1/P)*sum(Tj's) (j = 1 : P)
mean_imageA= reshape(mean_vectorA,irowA,icolA); % Average matrix of training set A
meanimgA=mat2gray(mean_imageA);
figure(1);
imshow(rot90(meanimgA,3));
The same is done for dataset B (male).
You could use a 3D matrix to store the images. I also cleaned up the code a bit. Not tested.
filenamesA = dir(fullfile(pathfora, '*.jpg'));
Train_NumberA = numel(filenamesA);
imagesA = [];
for k = 1:Train_NumberA
    imgA = imread(strcat(pathfora, '\', int2str(k), 'b', '.jpg'));
    imgA = rgb2gray(imgA);
    imagesA = cat(3, imagesA, imgA);
end
The double conversion is moved out of the loop.
imagesA = double(imagesA);
Calculate the mean over the 3rd dimension of the imagesA matrix to get the mean 2D image.
meanimage_A = mean(imagesA, 3);
Convert to grayscale image.
meanimgA = mat2gray(meanimage_A);
I think rot90 is not needed here, so its argument is dropped as well:
figure(1);
imshow(meanimgA);
Use a 3D array or cell array of images instead of reshaping 2D images into single rows of a matrix. The reshaping is unnecessary and can only add bugs.
If all your images are the same size, you can use a multidimensional array: Matlab documentation on multidimensional arrays
Otherwise, use a cell array: Matlab documentation on cell arrays
