How to fix the limits of plt.colorbar() for every separate spectrogram plot for comparison

At every epoch of training a CNN, I produce a spectrogram to monitor intermediate results. The problem is that the colorbar limits change slightly for each spectrogram, so the plots are not directly comparable. How can I fix the colorbar once, so that it is used every time a new spectrogram is plotted, but without actually manipulating the values of the spectrograms?
Right now, I generate the spectrograms like this:
for i in range(len(output_stfts)):
    S_db = librosa.amplitude_to_db(output_stfts[i], ref=np.max)
    plt.figure()
    librosa.display.specshow(S_db)
    plt.colorbar()
    plt.savefig(file_name[i])
    plt.close('all')
The output, in separate files, looks like the two linked example spectrograms, File 1 and File 2, each with a slightly different colorbar range (images omitted).
I would like the spectrograms to have the same colorbar without manipulating (normalizing) the spectrograms. I don't have the spectrograms beforehand; they are generated on the fly. Setting the limits to a fixed range, e.g. [-50, 0] dB, would do. Everything below -50 dB may be clipped to that value.
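One way to do this (not shown in the original post) is to pass fixed vmin/vmax limits to specshow, which forwards extra keyword arguments to matplotlib. A minimal sketch, reusing the variables from the loop above:

for i in range(len(output_stfts)):
    S_db = librosa.amplitude_to_db(output_stfts[i], ref=np.max)
    plt.figure()
    # fix the colour scale to [-50, 0] dB so every epoch shares one colorbar;
    # values below -50 dB are clipped in the display only, not in the data
    librosa.display.specshow(S_db, vmin=-50, vmax=0)
    plt.colorbar()
    plt.savefig(file_name[i])
    plt.close('all')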

Related

Make images overlap, despite being translated

I will have two images.
They will be either the same or almost the same.
But sometimes either of the images may have been moved by a few pixels on either axis.
What would be the best way to detect if there is such a move going on?
Or better still, what would be the best way to manipulate the images so as to correct for this unwanted movement?
If the images are really nearly identical, and are simply translated (i.e. not skewed, rotated, scaled, etc), you could try using cross-correlation.
When you cross-correlate an image with itself (this is the auto-correlation), the maximum value will be at the center of the resulting matrix. If you shift the image vertically or horizontally and then cross-correlate with the original image the position of the maximum value will shift accordingly. By measuring the shift in the position of the maximum value, relative to the expected position, you can determine how far an image has been translated vertically and horizontally.
Here's a toy example in python. Start by importing some stuff, generating a test image, and examining the auto-correlation:
import numpy as np
from scipy.signal import correlate2d
# generate a test image
num_rows, num_cols = 40, 60
image = np.random.random((num_rows, num_cols))
# get the auto-correlation
correlated = correlate2d(image, image, mode='full')
# get the coordinates of the maximum value
max_coords = np.unravel_index(correlated.argmax(), correlated.shape)
This produces coordinates max_coords = (39, 59). Now to test the approach, shift the image to the right one column, add some random values on the left, and find the max value in the cross-correlation again:
image_translated = np.concatenate(
    (np.random.random((image.shape[0], 1)), image[:, :-1]),
    axis=1)
correlated = correlate2d(image_translated, image, mode='full')
new_max_coords = np.unravel_index(correlated.argmax(), correlated.shape)
This gives new_max_coords = (39, 60), correctly indicating the image is offset horizontally by 1 (because np.array(new_max_coords) - np.array(max_coords) is [0, 1]). Using this information you can shift images to compensate for translation.
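As a minimal sketch of that compensation step (assuming a pure translation, and accepting that the pixels wrapped around the border are garbage), np.roll can undo the measured offset:

# offset measured above: [row_shift, col_shift] = [0, 1]
offset = np.array(new_max_coords) - np.array(max_coords)
# shift back by the negative offset to re-align with the original image
recovered = np.roll(image_translated, shift=tuple(-offset), axis=(0, 1))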
Note that, should you decide to go this way, you may have a lot of kinks to work out. Off-by-one errors abound when determining, given the dimensions of an image, where the max coordinate 'should' be following correlation (i.e. to avoid computing the auto-correlation and determining these coordinates empirically), especially if the images have an even number of rows/columns. In the example above, the center is just [num_rows-1, num_cols-1] but I'm not sure if that's a safe assumption more generally.
But for many cases -- especially those with images that are almost exactly the same and only translated -- this approach should work quite well.

matlab find peak images

I have a binary image below:
It's an image of a random abstract picture. Using MATLAB, what I want to do is detect how many peaks it has, so I'll know that there are roughly 5 objects in it.
As you can see, there are 5 peaks in it, which means there are 5 objects.
I've tried using imregionalmax(), but I didn't find it useful, since my image is already binary. I also tried regionprops('Area'), but it returns the wrong number, since there is no clean whitespace between the objects. Thanks in advance.
An easy way to do this would be to simply sum across the rows for each column and find the peaks of the result using findpeaks. In the example below, I have opted to use the inverse of the image, which results in positive peaks where the objects are.
rowSum = sum(1 - image, 1);
If we plot this, we get a profile with a bump at each object (plot omitted).
We can then use findpeaks to identify the peaks in this plot. We will apply a 5-point moving average to it to help eliminate false peaks.
[peaks, locations, widths, prominences] = findpeaks(smooth(rowSum));
You can then select the "true" peaks by thresholding based on any of these outputs. For this example we can use prominences and find the more prominent peaks.
isPeak = prominences > 50;
nPeaks = sum(isPeak)
5
Then we can plot the peak locations to confirm:
plot(locations(isPeak), peaks(isPeak), 'r*');
If you have some prior knowledge about the expected widths of the peaks, you could adjust the smooth span to match this expected width and obtain some cleaner peaks when using findpeaks.
Using an expected width of 40 for your image, findpeaks was able to detect all 5 peaks with no false positives.
findpeaks(smooth(rowSum, 40));
Since your peaks are vertical structures, in this particular case you can use projection histograms (also known as the histogram projection function): you make all the black pixels fall as if they were affected by gravity. You then get a curve of black pixels along the bottom of your image, and you can count the number of peaks in it.
Here is the algorithm:
Invert the image (black is normally the absence of information)
Histogram projection
Closing and opening in order to clean the signal and get the final result.
You can add a maxima detection to get the top of the peaks.
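In case it helps, here is a rough Python/NumPy transcription of these steps (scipy's grey_closing/grey_opening and find_peaks stand in for MATLAB's morphology and findpeaks; the toy image and the size/prominence values are placeholders):

import numpy as np
from scipy.ndimage import grey_closing, grey_opening
from scipy.signal import find_peaks

# toy binary image: white (1) background with two black (0) vertical bars
binary = np.ones((50, 100), dtype=np.uint8)
binary[:, 20:30] = 0
binary[:, 60:75] = 0

profile = (1 - binary).sum(axis=0)             # invert, then project columns ("gravity")
profile = grey_opening(grey_closing(profile, size=5), size=5)  # closing + opening to clean
peaks, _ = find_peaks(profile, prominence=10)  # maxima detection on top of the peaks
print(len(peaks))                              # -> 2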

Filtering out correlation values in seaborn corrplot

I'm using the corrplot function in seaborn and everything works flawlessly. However, I want to do a little filtering on the data. Is there a way to hide correlations below or above a certain value? I have a large data frame and I only want to see correlations greater than an arbitrary number, say .4.
I'd like all the 'squares' in the image that are not greater than .4 to be set to white, grey or some other color. I'm not sure how to do this because the corrplot takes a full data frame and calculates the correlations internally. I don't want to filter on the data frame values, just the resulting correlation values.
Maybe there's some way to get the resulting image from the underlying matshow call back to my own code and then replot it by filtering the image itself?
As per @mwaskom's comments, you can use sns.heatmap(). You'll have to compute the correlation matrix yourself, but it's otherwise more flexible in its presentation, and it lets you pass, e.g., a mask to do exactly what you want.
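A minimal sketch, assuming the data is in a pandas DataFrame (the random frame and the .4 threshold are placeholders); cells where the mask is True are left blank:

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

df = pd.DataFrame(np.random.randn(100, 5), columns=list('abcde'))  # stand-in data
corr = df.corr()              # compute the correlation matrix yourself
mask = corr.abs() <= 0.4      # hide everything with |r| <= .4
sns.heatmap(corr, mask=mask, annot=True, vmin=-1, vmax=1)
plt.show()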

Why is subplot much faster than figure?

I'm building a data analysis platform in MATLAB. One of the system's features needs to create many plots. At any given time only one plot is visible, and the user can move to the next/previous one on request (the emphasis here is that there is no need for multiple windows to be open).
Initially I used the figure command each time a new plot was shown, but I noticed that, as the user traversed to the next plot, this command took a bit longer than I wanted, degrading usability. So I tried using subplot instead, and it worked much faster.
Seeing this behavior, I ran a little experiment, timing both. The first time figure runs it takes about 0.3 seconds and subplot takes 0.1 seconds. The mean run time for figure is 0.06 seconds with a standard deviation of 0.05, while subplot takes only 0.002 with a standard deviation of 0.001. subplot seems to be an order of magnitude faster.
The question is: In situation when only one window will be available at any given time, is there any reason to use figure?
Is there any value lost in using subplot in general?
(A similar consideration can be made even if you call either of them only once.)
A call to subplot does nothing more than create a new axes object, with some convenient positioning options wrapped around it.
Axes objects are always children of figure objects, so if there is no figure window open, subplot will open one. This action takes a little time. So instead of opening a new figure window for every new plot, it's faster to just create a new axes object by using subplot, as you determined correctly. To save some memory you can clear the previous plot by clf as suggested by Daniel.
As I understand it, you don't want to "Create axes in tiled positions" (which is what subplot is documented to do); you just want to create one axes object. So it would be even faster to use the axes command directly; subplot is actually overkill.
If all your plots have the same axes limits and labels, even clf is not necessary. Use cla (clear axes) to delete the previous plot, but keep labels, limits and grid.
Example:
%// plot #1
plot( x1, y1 );
xlim( [0,100] ); ylim( [0,100] );
xlabel( 'x' );
ylabel( 'y' );
%// clear plot #1, keep all settings of the axes
cla;
%// plot #2
plot( x2, y2 );
...
Use figure once to create a figure and clf to clear its content before repainting.

How to average multiple images using Octave and matrix manipulation to reduce noise?

UPDATE
Here is my code, which is meant to add up the two matrices element by element and then divide by two.
function [ finish ] = stackAndMeanImage (initFrame, finalFrame)
  cd 'C:\Users\Disc-1119\Desktop\Internships\Tracking\Octave\highway\highway (6-13-2014 11-13-41 AM)';
  pkg load image;
  i = initFrame;
  f = finalFrame;
  astr = num2str(i);
  tmp = rgb2gray(imread(astr, 'jpg'));  % first frame, also converted to grayscale
  d = f - i                             % no semicolon: prints the frame count
  for a = 1:d
    a                                   % no semicolon: prints progress
    astr = num2str(i + a);              % read the next frame, not always frame i+1
    read_tmp = imread(astr, 'jpg');
    read_tmp = rgb2gray(read_tmp);
    tmp = tmp + read_tmp;               % element-by-element addition
    tmp = tmp / 2;
  end
  imwrite(tmp, 'meanimage.JPG');
  finish = 'done';
end
Here are two example input images
http://imgur.com/5DR1ccS,AWBEI0d#1
And here is one output image
http://imgur.com/aX6b0kj
I am really confused as to what is happening. I have not implemented what the other answers have said yet though.
OLD
I am working on an image processing project where I manually choose images that are 'empty' (background only), so that my algorithm can compute differences and then do some further analysis. I have a simple piece of code that computes the mean of two images, which I have converted to grayscale matrices. But this only works for two images: when I take the mean of two, then take the mean of that result and the next image, and repeat, I end up with a washed-out white image that is absolutely useless. You can't even see anything.
I found that there is a function in MATLAB called imfuse that is able to combine images. I was wondering if anyone knew the process that imfuse uses to combine images (I am happy to implement it in Octave), or if anyone knew of, or has already written, a piece of code that achieves something similar. Again, I am not asking anyone to write code for me, just wondering what the process is and whether there are pre-existing functions out there, which I have not found in my research.
Thanks,
AeroVTP
You should not end up with a washed-out image. Instead, you should end up with an image which is, technically speaking, temporally low-pass filtered. What this means is that half of the information content is from the last image, one quarter from the second-to-last image, one eighth from the third-to-last image, etc.
Actually, the effect in a moving image is similar to a display with slow response time.
If you are ending up with a white image, you are doing something wrong. nkjt's guess of type challenges is a good one. Another possibility is that you have forgotten to divide by two after summing the two images.
One more thing... If you are doing linear operations (such as averaging) on images, your image intensity scale should be linear. If you just use the RGB values or some grayscale values simply calculated from them, you may get bitten by the nonlinearity of the image. This property is called the gamma correction. (Admittedly, most image processing programs just ignore the problem, as it is not always a big challenge.)
As your project calculates differences of images, you should take this into account. I suggest using linearised floating point values. Unfortunately, the linearisation depends on the source of your image data.
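For example, here is a minimal sketch of linearisation assuming the common sRGB transfer function; if your frames come from a different pipeline, the curve will differ:

import numpy as np

def srgb_to_linear(img_uint8):
    # map 8-bit sRGB values to linear floating point in [0, 1]
    c = img_uint8.astype(np.float64) / 255.0
    return np.where(c <= 0.04045, c / 12.92, ((c + 0.055) / 1.055) ** 2.4)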
On the other hand, averaging is often the most efficient way of reducing noise. So there you are on the right track, assuming the images are similar enough.
However, after having a look at your images, it seems that you may actually want to do something else than to average the image. If I understand your intention correctly, you would like to get rid of the cars in your road cam to give you just the carless background which you could then subtract from the image to get the cars.
If that is what you want to do, you should consider using a median filter instead of averaging. What this means is that you take for example 11 consecutive frames. Then for each pixel you have 11 different values. Now you order (sort) these values and take the middle (6th) one as the background pixel value.
If your road is empty most of the time (at least 6 frames of 11), then the 6th sample will represent the road regardless of the colour of the cars passing your camera.
If you have an empty road, the result from the median filtering is close to averaging. (Averaging is better with Gaussian white noise, but the difference is not very big.) But your averaging will be affected by white or black cars, whereas median filtering is not.
The problem with median filtering is that it is computationally intensive. I am very sorry I speak very broken and ancient Octave, so I cannot give you any useful code. In MatLab or PyLab you would stack, say, 11 images to a M x N x 11 array, and then use a single median command along the depth axis. (When I say intensive, I do not mean it couldn't be done in real time with your data. It can, but it is much more complicated than averaging.)
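A minimal NumPy sketch of the stacking idea (the 11 random frames are placeholders for your real images):

import numpy as np

# stack 11 grayscale M x N frames into an M x N x 11 array
frames = [np.random.randint(0, 256, (120, 160), dtype=np.uint8) for _ in range(11)]
stack = np.dstack(frames)

# per-pixel median along the depth axis: the estimated carless background
background = np.median(stack, axis=2).astype(np.uint8)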
If you have really a lot of traffic, the road is visible behind the cars less than half of the time. Then the median trick will fail. You will need to take more samples and then find the most typical value, because it is likely to be the road (unless all cars have similar colours). There it will help a lot to use the colour image, as cars look more different from each other in RGB or HSV than in grayscale.
Unfortunately, if you need to resort to this type of processing, the path is slightly slippery and rocky. Average is very easy and fast, median is easy (but not that fast), but then things tend to get rather complicated.
Another aside came to mind. If you want a rolling average, there is a very simple and effective way to calculate it with an arbitrary length (an arbitrary number of frames to average):
# N is the number of images to average
# P[i] are the input frames
# S is a sum accumulator (sum of N frames)

# calculate the sum of the first N frames
S <- 0
I <- 0
while I < N
    S <- S + P[I]
    I <- I + 1

# save_img() saves an averaged image
while there are images to process
    save_img(S / N)
    S <- -P[I-N] + S + P[I]
    I <- I + 1
Of course, you'll probably want to use for-loops, and += and -= operators, but still the idea is there. For each frame you only need one subtraction, one addition, and one division by a constant (which can be modified into a multiplication or even a bitwise shift in some cases if you are in a hurry).
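As a rough Python version of the same idea (the frames list is a placeholder; each frame is assumed to be an equally sized grayscale array):

import numpy as np

def rolling_mean(frames, N):
    # yield the mean of every window of N consecutive frames
    S = np.zeros_like(frames[0], dtype=np.float64)
    for i, frame in enumerate(frames):
        S += frame
        if i >= N:
            S -= frames[i - N]   # drop the frame that left the window
        if i >= N - 1:
            yield S / N          # one addition, one subtraction, one division per frame

frames = [np.random.randint(0, 256, (4, 4)) for _ in range(10)]
for avg in rolling_mean(frames, N=3):
    pass  # save or display avg here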
I may have misunderstood your problem, but I think what you're trying to do is the following: read all images into a matrix and then use mean(). This assumes you are able to fit them all in memory.
function [finish] = stackAndMeanImage (ini_frame, final_frame)
  pkg load image;
  dir_path = 'C:\Users\Disc-1119\Desktop\Internships\Tracking\Octave\highway\highway (6-13-2014 11-13-41 AM)';
  ## read all images into a cell array
  n_frames = final_frame - ini_frame + 1;
  imgs = cell (1, 1, n_frames);
  for n = 1:n_frames
    fname = fullfile (dir_path, num2str (ini_frame + n - 1));
    imgs{n} = rgb2gray (imread (fname, "jpg"));
  endfor
  ## create a 3D matrix out of all frames and take the mean across the 3rd dimension
  imgs = cell2mat (imgs);
  avg = mean (imgs, 3);
  ## mean returns double precision, so we rescale to the range [0 1] and
  ## cast back to uint8. This assumes the images were all originally
  ## uint8; since they are jpgs, that's a safe assumption.
  avg = im2uint8 (avg ./ 255);
  imwrite (avg, fullfile (dir_path, "meanimage.jpg"));
  finish = "done";
endfunction
