Access TensorFloat data - windows-machine-learning

I am using Windows-Machine-Learning to convert my VideoFrame to a TensorFloat _input (shape: 1,3,256,192; RGB channels + image), load that into my onnx model and receive as _output another TensorFloat object (shape: 1,17,64,48; 17 detected objects + image).
Now my question: If I want to access that TensorFloat _output, currently the only way I know is to use _output.data.GetAsVectorView, which gives me a long 1d Vector and try to reorder that and figure out how the dimensions are ordered in there? Is there a clear rule that I can follow to understand how the 4D tensor is encoded in the 1D Vector? Alternatively, can I somehow access the different dimensions directly from the _output TensorFloat object, since using "Shape" shows me that it is a multidimensional array?

Please refer to the layout of Windows ML tensors here:
https://learn.microsoft.com/en-us/uwp/api/windows.ai.machinelearning.tensorfloat?view=winrt-20348
A tensor is a multi-dimensional array of values. A float tensor is a tensor of 32-bit floating point values.
The layout of tensors is row-major, with tightly packed contiguous data representing each dimension. The total size of a tensor is the product of the size of each dimension.
Consider:
Shape: [D1][D2][D3]...[DN]
Strides: [S1][S2][S3]...[SN]
Location: [A1][A2][A3]...[AN],
and you wish to compute the index at Location.
Then, you can assume that:
Sn = D(n+1) * D(n+2) * ... * DN, (for n = 1...N-1)
SN = 1
So:
index = A1*S1 + A2*S2 + A3*S3 + ... + AN*SN
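
As a rough illustration of the rule above, here is a minimal Python sketch that computes the strides and the flat index for the output shape from the question (1,17,64,48). The function names are made up for this example and are not part of the Windows ML API; the flat vector is assumed to be whatever GetAsVectorView returns.

```python
def row_major_strides(shape):
    """Strides (in elements) of a tightly packed row-major tensor."""
    strides = [1] * len(shape)
    # S_N = 1, and each earlier stride is the product of all later dimensions.
    for n in range(len(shape) - 2, -1, -1):
        strides[n] = strides[n + 1] * shape[n + 1]
    return strides


def flat_index(shape, location):
    """Position of a multi-dimensional location in the flat 1D vector."""
    strides = row_major_strides(shape)
    return sum(a * s for a, s in zip(location, strides))


shape = [1, 17, 64, 48]
strides = row_major_strides(shape)        # [52224, 3072, 48, 1]
idx = flat_index(shape, [0, 5, 10, 20])   # 5*3072 + 10*48 + 20 = 15860
```

So the value for object 5 at row 10, column 20 would sit at position 15860 of the 1D vector, and each of the 17 maps occupies a contiguous block of 64*48 = 3072 values.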

How would I extract a region of some NEMO ocean model output in Iris?

Is there a straightforward way to extract a region from an Iris cube which is described by 2D latitude and longitude variables, for example using NEMO ocean model data?
I found this workaround but was wondering if there was a way to do this in 'pure' Iris, without having to resort to defining a new function?
For example, if I have this cube...
In [30]: print(cube)
mole_concentration_of_dimethyl_sulfide_in_sea_water / (mol m-3) (time: 780; cell index along second dimension: 330; cell index along first dimension: 360)
     Dimension coordinates:
          time                                      x  -  -
          cell index along second dimension         -  x  -
          cell index along first dimension          -  -  x
     Auxiliary coordinates:
          latitude                                  -  x  x
          longitude                                 -  x  x
... and then try to extract a region using intersection, I get this...
>>> subset = cube.intersection(longitude=(-10, 10))
CoordinateMultiDimError: Multi-dimensional coordinate not supported: 'longitude'
Thanks!
As you can see from the error message, iris does not currently support subsetting by multi-dimensional coordinates, so you have to write a function similar to bbox_extract_2Dcoords() in that blog post. All it does is create a boolean mask with values set to True within your region of interest and False outside. Then the boundaries of this region are used as indices to subset the cube.
An alternative would be to regrid the data to a regular grid defined by 1D longitude and latitude and then subset the data using the standard Constraint() method.
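
To make the mask idea concrete, here is a minimal sketch in plain Python. It is not the function from the blog post; bbox_slices and the tiny lat2d/lon2d grids are hypothetical stand-ins for the cube's 2D auxiliary coordinate arrays.

```python
def bbox_slices(lat2d, lon2d, lat_range, lon_range):
    """Return (row_slice, col_slice) enclosing every cell inside the box.

    lat2d and lon2d are 2D lists of the same shape (hypothetical data).
    Returns None if no cell falls inside the requested region.
    """
    rows, cols = [], []
    for i, (lat_row, lon_row) in enumerate(zip(lat2d, lon2d)):
        for j, (lat, lon) in enumerate(zip(lat_row, lon_row)):
            inside = (lat_range[0] <= lat <= lat_range[1]
                      and lon_range[0] <= lon <= lon_range[1])
            if inside:  # this is the "True" part of the boolean mask
                rows.append(i)
                cols.append(j)
    if not rows:
        return None
    # The extremes of the masked region become the index bounds.
    return slice(min(rows), max(rows) + 1), slice(min(cols), max(cols) + 1)


lat2d = [[0, 0, 0, 0], [10, 10, 10, 10], [20, 20, 20, 20]]
lon2d = [[-20, -5, 5, 20], [-18, -3, 7, 22], [-16, -1, 9, 24]]
row_slice, col_slice = bbox_slices(lat2d, lon2d, (5, 25), (-10, 10))
```

With a real cube, and assuming its dimensions are ordered (time, second index, first index) as in the printout, the slices could then be applied as cube[:, row_slice, col_slice]. Note that on a curvilinear grid the resulting rectangle of indices can still contain a few cells outside the requested region.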

Resize an image with bilinear interpolation without imresize

I've found some methods to enlarge an image but there is no solution to shrink an image. I'm currently using the nearest neighbor method. How could I do this with bilinear interpolation without using the imresize function in MATLAB?
In your comments, you mentioned you wanted to resize an image using bilinear interpolation. Bear in mind that the bilinear interpolation algorithm is size independent. You can very well use the same algorithm for enlarging an image as well as shrinking an image. The right scale factors to sample the pixel locations are dependent on the output dimensions you specify. This doesn't change the core algorithm by the way.
Before I start with any code, I'm going to refer you to Richard Alan Peters II's digital image processing slides on interpolation, specifically slide #59. It has a great illustration as well as pseudocode on how to do bilinear interpolation that is MATLAB friendly. To be self-contained, I'm going to include his slide here so we can follow along and code it:
Please be advised that this only resamples the image. If you actually want to match MATLAB's output, you need to disable anti-aliasing.
MATLAB by default will perform anti-aliasing on the images to ensure the output looks visually pleasing. If you'd like to compare apples with apples, make sure you disable anti-aliasing when comparing between this implementation and MATLAB's imresize function.
Let's write a function that will do this for us. This function will take in an image (that is read in through imread), which can be either colour or grayscale, as well as the output dimensions of the final resized image you want, given as a two-element array. The first element of this array will be the rows and the second element of this array will be the columns. We will simply go through this algorithm and calculate the output pixel colours / grayscale values using this pseudocode:
function [out] = bilinearInterpolation(im, out_dims)
%// Get some necessary variables first
in_rows = size(im,1);
in_cols = size(im,2);
out_rows = out_dims(1);
out_cols = out_dims(2);
%// Let S_R = R / R'
S_R = in_rows / out_rows;
%// Let S_C = C / C'
S_C = in_cols / out_cols;
%// Define grid of co-ordinates in our image
%// Generate (x,y) pairs for each point in our image
[cf, rf] = meshgrid(1 : out_cols, 1 : out_rows);
%// Let r_f = r'*S_R for r = 1,...,R'
%// Let c_f = c'*S_C for c = 1,...,C'
rf = rf * S_R;
cf = cf * S_C;
%// Let r = floor(rf) and c = floor(cf)
r = floor(rf);
c = floor(cf);
%// Any values out of range, cap
r(r < 1) = 1;
c(c < 1) = 1;
r(r > in_rows - 1) = in_rows - 1;
c(c > in_cols - 1) = in_cols - 1;
%// Let delta_R = rf - r and delta_C = cf - c
delta_R = rf - r;
delta_C = cf - c;
%// Final line of algorithm
%// Get column major indices for each point we wish
%// to access
in1_ind = sub2ind([in_rows, in_cols], r, c);
in2_ind = sub2ind([in_rows, in_cols], r+1,c);
in3_ind = sub2ind([in_rows, in_cols], r, c+1);
in4_ind = sub2ind([in_rows, in_cols], r+1, c+1);
%// Now interpolate
%// Go through each channel for the case of colour
%// Create output image that is the same class as input
out = zeros(out_rows, out_cols, size(im, 3));
out = cast(out, class(im));
for idx = 1 : size(im, 3)
    chan = double(im(:,:,idx)); %// Get i'th channel
    %// Interpolate the channel
    tmp = chan(in1_ind).*(1 - delta_R).*(1 - delta_C) + ...
          chan(in2_ind).*(delta_R).*(1 - delta_C) + ...
          chan(in3_ind).*(1 - delta_R).*(delta_C) + ...
          chan(in4_ind).*(delta_R).*(delta_C);
    out(:,:,idx) = cast(tmp, class(im));
end
Take the above code, copy and paste it into a file called bilinearInterpolation.m and save it. Make sure you change your working directory to where you've saved this file.
Except for sub2ind and perhaps meshgrid, everything seems to be in accordance with the algorithm. meshgrid is very easy to explain. All you're doing is specifying a 2D grid of (x,y) co-ordinates, where each location in your image has a pair of (x,y) or column and row co-ordinates. Creating a grid through meshgrid avoids any for loops as we will have generated all of the right pixel locations from the algorithm that we need before we continue.
How sub2ind works is that it takes in a row and column location in a 2D matrix (well... it can really be any number of dimensions you want), and it outputs a single linear index. If you're not aware of how MATLAB indexes into matrices, there are two ways you can access an element in a matrix. You can use the row and column to get what you want, or you can use a column-major index. Take a look at this matrix example I have below:
A =
1 2 3 4 5
6 7 8 9 10
11 12 13 14 15
If we want to access the number 9, we can do A(2,4) which is what most people tend to default to. There is another way to access the number 9 using a single number, which is A(11)... now how is that the case? MATLAB lays out the memory of its matrices in column-major format. This means that if you were to take this matrix and stack all of its columns together in a single array, it would look like this:
A =
1
6
11
2
7
12
3
8
13
4
9
14
5
10
15
Now, if you want to access element number 9, you would need to access the 11th element of this array. Going back to the interpolation bit, sub2ind is crucial if you want to vectorize accessing the elements in your image to do the interpolation without doing any for loops. As such, if you look at the last line of the pseudocode, we want to access elements at r, c, r+1 and c+1. Note that all of these are 2D arrays, where each element in each of the matching locations in all of these arrays tell us the four pixels we need to sample from in order to produce the final output pixel. The output of sub2ind will also be 2D arrays of the same size as the output image. The key here is that each element of the 2D arrays of r, c, r+1, and c+1 will give us the column-major indices into the image that we want to access, and by throwing this as input into the image for indexing, we will exactly get the pixel locations that we want.
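
If it helps, the same column-major arithmetic can be checked outside MATLAB. This is only a small Python sketch of the 2D case; sub2ind_2d is a hypothetical helper, not MATLAB's sub2ind, and it keeps MATLAB's 1-based convention.

```python
def sub2ind_2d(n_rows, row, col):
    """1-based column-major linear index for a matrix with n_rows rows."""
    return (col - 1) * n_rows + row


A = [[1, 2, 3, 4, 5],
     [6, 7, 8, 9, 10],
     [11, 12, 13, 14, 15]]
n_rows = len(A)

# Stack the columns on top of each other, as MATLAB does in memory.
column_major = [A[r][c] for c in range(len(A[0])) for r in range(n_rows)]

idx = sub2ind_2d(n_rows, 2, 4)      # (4 - 1) * 3 + 2 = 11
value = column_major[idx - 1]       # 9 (Python lists are 0-based)
```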
There are some important subtleties I'd like to add when implementing the algorithm:
You need to make sure that any indices to access the image when interpolating outside of the image are either set to 1 or the number of rows or columns to ensure you don't go out of bounds. Actually, if you extend to the right or below the image, you need to set this to one below the maximum as the interpolation requires that you are accessing pixels to one over to the right or below. This will make sure that you're still within bounds.
You also need to make sure that the output image is cast to the same class as the input image.
I ran through a for loop to interpolate each channel on its own. You could do this intelligently using bsxfun, but I decided to use a for loop for simplicity, and so that you are able to follow along with the algorithm.
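
For anyone who wants to trace the arithmetic without MATLAB, here is a loop-based Python sketch of the same steps for a single-channel image stored as a list of rows. It is an illustration under assumptions (at least 2 rows and 2 columns, float output, no casting back to the input class), not a replacement for the vectorized function above.

```python
import math


def bilinear_resize(im, out_rows, out_cols):
    """Resize a grayscale image (list of rows) with bilinear interpolation."""
    in_rows, in_cols = len(im), len(im[0])
    s_r = in_rows / out_rows            # S_R = R / R'
    s_c = in_cols / out_cols            # S_C = C / C'
    out = []
    for rp in range(1, out_rows + 1):   # 1-based, mirroring the pseudocode
        rf = rp * s_r
        # Cap r so that both r and r + 1 stay inside the image.
        r = min(max(math.floor(rf), 1), in_rows - 1)
        d_r = rf - r
        row = []
        for cp in range(1, out_cols + 1):
            cf = cp * s_c
            c = min(max(math.floor(cf), 1), in_cols - 1)
            d_c = cf - c
            # Convert the 1-based (r, c) to 0-based list indices.
            p11, p21 = im[r - 1][c - 1], im[r][c - 1]
            p12, p22 = im[r - 1][c], im[r][c]
            row.append(p11 * (1 - d_r) * (1 - d_c) + p21 * d_r * (1 - d_c)
                       + p12 * (1 - d_r) * d_c + p22 * d_r * d_c)
        out.append(row)
    return out


im = [[r * 4 + c for c in range(4)] for r in range(4)]
small = bilinear_resize(im, 2, 2)
```

The four weights always sum to 1, so a constant image stays constant regardless of the output size.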
As an example to show this works, let's use the onion.png image that is part of MATLAB's system path. The original dimensions of this image are 135 x 198. Let's interpolate this image by making it larger, going to 270 x 396 which is twice the size of the original image:
im = imread('onion.png');
out = bilinearInterpolation(im, [270 396]);
figure;
imshow(im);
figure;
imshow(out);
The above code will interpolate the image by increasing each dimension by twice as much, then show a figure with the original image and another figure with the scaled up image. This is what I get for both:
Similarly, let's shrink the image down by half as much:
im = imread('onion.png');
out = bilinearInterpolation(im, [68 99]);
figure;
imshow(im);
figure;
imshow(out);
Note that half of 135 is 67.5 for the rows, but I rounded up to 68. This is what I get:
One thing I've noticed in practice is that upsampling with bilinear has decent performance in comparison to other schemes like bicubic... or even Lanczos. However, when you're shrinking an image, because you're removing detail, nearest neighbour is very much sufficient. I find bilinear or bicubic to be overkill. I'm not sure about what your application is, but play around with the different interpolation algorithms and see what you like out of the results. Bicubic is another story, and I'll leave that to you as an exercise. Those slides I referred you to do have material on bicubic interpolation if you're interested.
Good luck!

Storing CWT of each row of image in a Cell

I want to compute the Morlet wavelet transform of each row of a 480x480 image. I have to save the output of the transform of each row, which is a 2D array (matrix).
Then I will take the average of all 480 2D matrices to get one final plot of the average.
clc;
close all;
clear all;
I=imread('lena.jpg');
J=rgb2gray(I);
%K=J(1:480)
%coefs = cwt(K,1:128,'morl','plot');
coefs = cell(480,1);
for i = 1:480
K=J(i,:);
coefs(i) = cwt(K,1:128,'morl');
end
Here I want to take the average of the 480 coefficient matrices, but I am getting the error
Conversion to cell from double is not possible.
Error in soilwave (line 12) coefs(i) = cwt(K,1:128,'morl');
Could anyone suggest a better method or tweaks to this.
Cell arrays are practical if you need to store elements that have inconsistent format or dimensions, but for what you are trying to do, a 3D array is easier to work with. Here is what I would do:
Preassign a 3D array:
coefs = zeros(128, size(J, 2), size(J,1));
then compute and populate the stack:
for ii = 1:size(J, 1)
K=J(ii,:);
coefs(:,:,ii) = cwt(K,1:128,'morl');
end
Finally, compute the mean along the third dimension:
MeanCoeff=mean(coefs, 3);
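
What mean(coefs, 3) computes is simply an element-wise average over the stacked matrices. A small Python sketch of that operation, using nested lists and a hypothetical helper name, in case the idea of averaging "along the third dimension" is unclear:

```python
def mean_of_matrices(matrices):
    """Element-wise mean of equally sized 2D matrices (lists of rows)."""
    count = len(matrices)
    n_rows, n_cols = len(matrices[0]), len(matrices[0][0])
    return [[sum(m[i][j] for m in matrices) / count for j in range(n_cols)]
            for i in range(n_rows)]


stack = [[[1, 2], [3, 4]],
         [[3, 4], [5, 6]]]
mean_coeff = mean_of_matrices(stack)   # [[2.0, 3.0], [4.0, 5.0]]
```

Each output entry is the average of the entries at the same position in every matrix, so the result has the same size as one coefficient matrix.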

Matlab mode filter for dependent RGB channels

I've been performing a 2D mode filter on an RGB image by running medfilt2 independently on the R,G and B channels. However, splitting the RGB channels like this gives artifacts in the colouring. Is there a way to perform the 2D median filter while keeping RGB values 'together'?
Or, I could explain this more abstractly: Imagine I had a 2D matrix, where each value contained a pair of index coordinates (i.e. a cell matrix of 2X1 vectors). How would I go about performing a median filter on this?
Here's how I can do an independent mode filter (giving the artifacts):
r = colfilt(r0,[5 5],'sliding',@mode);
g = colfilt(g0,[5 5],'sliding',@mode);
b = colfilt(b0,[5 5],'sliding',@mode);
However colfilt won't work on a cell matrix.
Another approach could be to somehow combine my RGB channels into a single number and thus create a standard 2D matrix. Not sure how to implement this, though...
Any ideas?
Thanks for your help.
Cheers,
Hugh
EDIT:
OK, so problem solved. Here's how I did it.
I adapted my question so that I'm no longer dealing with (RGB) vectors, but (UV) vectors. Still essentially the same problem, except that my vectors are 2D not 3D.
So firstly I load the individual U and V channels, arrange them each into a 1D list, then combine them, so I essentially have a list of vectors. Then I reduce it to just those which are unique. Then, I assign each pixel in my matrix the value of the index of that unique vector. After this I can do the mode filter. Then I basically do the reverse, in that I go through the filtered image pixelwise, and read the value at each pixel (i.e. an index in my list), and find the unique vector associated with that index and insert it at that pixel.
% Create index list
img_u = img_iuv(:,:,2);
img_v = img_iuv(:,:,3);
coordlist = unique(cat(2,img_u(:),img_v(:)),'rows');
% Create a 2D matrix of indices
img_idx = zeros(size(img_iuv,1),size(img_iuv,2),2);
for y = 1:length(Y)
for x = 1:length(X)
coords = squeeze(img_iuv(x,y,2:3))';
[~,idx] = ismember(coords,coordlist,'rows');
img_idx(x,y) = idx;
end
end
% Apply the mode filter
img_idx = colfilt(img_idx,[n,n],'sliding',@mode);
% Re-construct the original image using the filtered data
for y = 1:length(Y)
for x = 1:length(X)
idx = img_idx(x,y);
try
coords = coordlist(idx,:);
end
img_iuv(x,y,2:3) = coords(:);
end
end
Not pretty but it gets the job done. I suppose this approach would also work for RGB images, or other similar situations.
Cheers,
Hugh
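
The same idea (replace each vector by the index of its unique value, mode-filter the indices, then map back) can be sketched compactly in Python. This is an illustration only, with a hypothetical function name; unlike colfilt, which zero-pads the borders, this version just truncates the window at the image edges.

```python
from collections import Counter


def mode_filter_vectors(img, n):
    """Apply an n x n sliding mode filter to a 2D grid of tuples."""
    # Step 1: unique vectors and a per-pixel index into that list.
    unique = sorted(set(v for row in img for v in row))
    index_of = {v: k for k, v in enumerate(unique)}
    idx_img = [[index_of[v] for v in row] for row in img]

    rows, cols, half = len(img), len(img[0]), n // 2
    out = []
    for y in range(rows):
        out_row = []
        for x in range(cols):
            # Step 2: collect the indices in the (edge-truncated) window.
            window = [idx_img[yy][xx]
                      for yy in range(max(0, y - half), min(rows, y + half + 1))
                      for xx in range(max(0, x - half), min(cols, x + half + 1))]
            counts = Counter(window)
            best = max(counts.values())
            # On ties, take the smallest index, as MATLAB's mode does.
            mode_idx = min(k for k, c in counts.items() if c == best)
            # Step 3: map the winning index back to its vector.
            out_row.append(unique[mode_idx])
        out.append(out_row)
    return out


img = [[(1, 2)] * 3, [(1, 2), (9, 9), (1, 2)], [(1, 2)] * 3]
filtered = mode_filter_vectors(img, 3)
```

Because whole vectors are voted on, the output can only contain colour (or UV) combinations that already exist in the input, which is what avoids the artifacts from filtering the channels separately.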
I don't see how you can define the median of a vector variable. You probably need to reduce the R,G,B components to a single value and then compute the median on that value. Why not use the intensity level as that single value? You could do it easily with rgb2gray.

create 3D image from coordinates and intensity values

I am trying to create a 3D array of size 1000x1000x1000 with all the elements (corresponding to voxels) being zero and then assign a random value in the 2000 to 2001 range instead of 0 to some specific elements in the array and finally store it as a binary file.
The array named "coord" is the Nx3 matrix of coordinates (x,y,z) of the points that need to be assigned the random value in the 3D array.
I should mention that all the x,y,z values of the coordinate matrix are floating point numbers with: 0<=x<=1000 0<=y<=1000 0<=z<=1000
My aim is to export the 3D matrix in a binary format (other than MATLAB's default binary format) so that I can use it with other programs.
Here is what I've been up to so far:
load coord;
a=coord(:,1);
b=coord(:,2);
c=coord(:,3);
d=rand(1000,1)*2000;
dd = 0:2:1000;
[xq,yq,zq] = meshgrid(dd,dd,dd);
vq = griddata3(a,b,c,d,xq,yq,zq,'nearest');
h=figure;
plot3(a,b,c,'ro')
%=========================================%
fid=fopen('data.bin','w');
fwrite(fid,vq,'single');
fclose(fid);
In the above code a, b and c are the coordinates of each point and d is the corresponding intensity values for the desired range. While it is possible to create a 3D mesh (using meshgrid) and then interpolate the intensity values for mesh points (using griddata3), the final result (vq) would not be the actual points (ai,bi,ci) and corresponding intensities , but rather an interpolated set of points which is pretty useful for visualization purposes (for instance if you like to fit a 3D surface which fits through actual data).
I am simply trying to find a way to store the actual data-points and their intensities into a file and export it.
Any help is highly appreciated.
If you want to save to files that will allow importing into a visualization software, a series of Tiff files will most likely be convenient, i.e.
maxValue = 2000; % this is the maximum signal that can possibly occur
% according to your code
for z = 1:size(vq,3)
%# convert slice z to 16 bit
currentSlice = vq(:,:,z);
currentSlice = uint16(round(currentSlice / maxValue * 65535)); %# scale to the full 16-bit range
%# save to file
imwrite(currentSlice, sprintf('testImg_z%04i.tif',z),'tif');
end
Note that if you create a double array of dimensions 1000x1000x1000, you'll need 8GB of contiguous RAM.
How about something like:
%# 3D array
voxels = zeros([1000 1000 1000]);
%# round points coordinates, and clamp to valid range [1,1000]
load coords
coords = round(coords);
coords = min(max(coords,1),1000);
%# convert to linear indices
idx = sub2ind(size(voxels), coords(:,1), coords(:,2), coords(:,3));
%# random values in the 2000 to 2001 range
v = rand(size(idx)) + 2000;
%# assign those values to the chosen points
voxels(idx) = v;
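
If the goal is only to export the actual points and their intensities, one option is to skip the dense array and write (linear index, value) pairs instead. The following Python sketch shows the clamping, the column-major index arithmetic that sub2ind performs, and one possible binary layout. The record format (little-endian uint32 index followed by a float32 value) is purely an assumption for illustration, as are the function names.

```python
import struct


def clamp_round(value, lo, hi):
    """Round to the nearest integer and clamp to [lo, hi].

    Note: Python rounds exact halves to even, MATLAB rounds them away from zero.
    """
    return min(max(round(value), lo), hi)


def sub2ind_3d(size, x, y, z):
    """1-based column-major linear index into a 3D array of the given size."""
    return x + (y - 1) * size[0] + (z - 1) * size[0] * size[1]


def pack_points(coords, values, size=(1000, 1000, 1000)):
    """Pack points as consecutive (uint32 index, float32 value) records."""
    records = bytearray()
    for (x, y, z), v in zip(coords, values):
        xi, yi, zi = (clamp_round(c, 1, n) for c, n in zip((x, y, z), size))
        records += struct.pack("<If", sub2ind_3d(size, xi, yi, zi), v)
    return bytes(records)


data = pack_points([(0.2, 1.0, 1.0), (999.7, 1000.0, 1000.0)], [2000.5, 2000.25])
```

The resulting bytes could be written with open('data.bin', 'wb'), giving 8 bytes per point rather than a value for each of the 10^9 voxels. Whether the receiving program can read such a sparse format is something you would need to check.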
