SLIC can perform segmentation within a binarized mask, as shown in the figure below.
But if I need to compute superpixels separately for several adjacent regions, what should I do?
Each color represents a region, and each region requires its own independent superpixel segmentation.

There is not currently any way to handle a mask with multiple regions in a single call. For your use case you will have to split each region into a separate mask and then call slic once per mask. You can combine the multiple segmentations into one by incrementing the labels appropriately.
Pasted below is a concrete example of this for two separate masked regions (adapted from the existing example you referenced):
import matplotlib.pyplot as plt
import numpy as np
from skimage import data
from skimage import color
from skimage import morphology
from skimage import segmentation
# Input data
img = data.immunohistochemistry()
# Compute a mask
lum = color.rgb2gray(img)
mask = morphology.remove_small_holes(
    morphology.remove_small_objects(
        lum < 0.7, 500),
    500)
mask1 = morphology.opening(mask, morphology.disk(3))
# create a second mask as the inverse of the first
mask2 = ~mask1
segmented = np.zeros(img.shape[:-1], dtype=np.int64)
max_label = 0
# replace [mask2, mask1] with a list of any number of binary masks
for mask in [mask2, mask1]:
    # maskSLIC result
    m_slic = segmentation.slic(img, n_segments=100, mask=mask, start_label=1)
    if max_label > 0:
        # offset only the labels inside the mask by the current maximum label,
        # so that pixels outside the mask stay 0
        m_slic[mask] += max_label
    # add the labels into the current combined segmentation
    segmented += m_slic
    # update the maximum label used so far
    max_label = segmented.max()
# Display result
fig, ax_arr = plt.subplots(2, 2, sharex=True, sharey=True, figsize=(10, 10))
ax1, ax2, ax3, ax4 = ax_arr.ravel()
ax1.imshow(img)
ax1.set_title('Original image')
ax2.imshow(mask, cmap='gray')
ax2.set_title('Mask')
ax3.imshow(segmentation.mark_boundaries(img, m_slic))
ax3.contour(mask, colors='red', linewidths=1)
ax3.set_title('maskSLIC (mask1 only)')
ax4.imshow(segmentation.mark_boundaries(img, segmented))
ax4.contour(mask, colors='red', linewidths=1)
ax4.set_title('maskSLIC (both masks)')
for ax in ax_arr.ravel():
    ax.set_axis_off()
plt.tight_layout()
plt.show()
The basic approach I am suggesting is in the for loop above. Most of the other code is just generating the data and plots.
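To isolate the label bookkeeping from the scikit-image calls, here is a minimal NumPy-only sketch of the merging step. The helper name `combine_label_maps` and the tiny input arrays are made up for illustration; it assumes each label map is 0 outside its own region and that the regions do not overlap.

```python
import numpy as np

def combine_label_maps(label_maps):
    """Merge label maps with disjoint non-zero regions into one map with unique labels."""
    combined = np.zeros_like(label_maps[0], dtype=np.int64)
    max_label = 0
    for labels in label_maps:
        inside = labels > 0
        # shift this map's labels past every label used so far
        combined[inside] = labels[inside] + max_label
        max_label = combined.max()
    return combined

# two toy "segmentations", each covering a different half of a 2x2 image
a = np.array([[1, 2], [0, 0]])
b = np.array([[0, 0], [1, 2]])
combined = combine_label_maps([a, b])
print(combined)
```

In the real use case each entry of `label_maps` would be the output of one `segmentation.slic(..., mask=...)` call.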


How to do data augmentation on an 8 by 8 greyscale image?

I want to do data augmentation on an 8 by 8 pixel greyscale image using the Keras code below (the pixel values are only 0 and 1):
from ctypes import sizeof
from re import X
from turtle import shape
from keras.preprocessing.image import ImageDataGenerator
from skimage import io
import numpy as np
from PIL import Image
datagen = ImageDataGenerator(
    rotation_range=45,      # random rotation of up to 45 degrees in either direction
    width_shift_range=0.2,  # horizontal shift of up to 20% of the width
    fill_mode='nearest')    # also try constant, reflect, wrap
# forming a binary 8*8 array
array = np.array([[0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0],[0,0,1,1,1,0,0,0],
# scale values to uint8 maximum 255, and convert it to greyscale image
array = ((array) * 255).astype(np.uint8)
x = Image.fromarray(array)
i = 0
for batch in datagen.flow(x, batch_size=16,
    i += 1
    if i > 20:
        break  # otherwise the generator would loop indefinitely
But I get this error in the output (when I call the .flow function):
ValueError: ('Input data in `NumpyArrayIterator` should have rank 4. You passed an array with shape', (8, 8))
Could anyone give me a hand, please?
ImageDataGenerator expects its input as a 4-dimensional tensor, where the first dimension is the sample index and the last dimension holds the color channels. In your code you should convert the (8, 8) array to a (1, 8, 8, 1) array. This can be done with
array = np.expand_dims(array, (0, -1))
Also, you should not convert the array to an image before passing it to the generator, as you did here:
x = Image.fromarray(array)
Simply pass the array itself to the generator.
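As a quick check of the shape change, here is a small NumPy-only sketch. The 8 by 8 pattern is a made-up stand-in (the array in the question is cut off), and passing a tuple of axes to `np.expand_dims` assumes NumPy 1.18 or newer.

```python
import numpy as np

# hypothetical binary 8x8 pattern, scaled to uint8 as in the question
array = np.zeros((8, 8), dtype=np.uint8)
array[2, 2:5] = 1
array = (array * 255).astype(np.uint8)

# add a leading sample axis and a trailing channel axis
batch = np.expand_dims(array, (0, -1))
print(array.shape, '->', batch.shape)

# batch (not a PIL image) is what would be passed to datagen.flow(batch, batch_size=1)
```

The pixel values are untouched; only two axes of length 1 are added.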

Measuring an object by reference on an image considering perspective or angle

I made an algorithm to measure an object using a reference, like this:
The reference is the frame and the other (AOL) is the desired object. My code obtained this result:
But the real AOL is 78.6. This is because of the perspective/angle of the photograph. In my code I used deep learning to obtain the reference and AOL masks, and I made a simple calculation based on the number of pixels in each mask to obtain the AOL area (cm²), since I know the actual size of the reference. I tried to correct the angle/perspective based on the reference, using the reference mask:
I tried to calculate the quadrangle vertices from the reference mask to correct the perspective. I wrote this code based on this reference, Perspective correction in OpenCV using python:
# import the necessary packages
from scipy.spatial import distance as dist
from imutils import perspective
from imutils import contours
import numpy as np
import imutils
import cv2
import math
import matplotlib.pyplot as plt
# get the single external contours
# load the image, convert it to grayscale, and blur it slightly
image = cv2.imread("./ref/20210702_114527.png") ## Mask Image
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gray = cv2.GaussianBlur(gray, (7, 7), 0)
# perform edge detection, then perform a dilation + erosion to
# close gaps in between object edges
edged = cv2.Canny(gray, 50, 100)
edged = cv2.dilate(edged, None, iterations=1)
edged = cv2.erode(edged, None, iterations=1)
# find contours in the edge map
cnts = cv2.findContours(edged.copy(), cv2.RETR_EXTERNAL,
cnts = imutils.grab_contours(cnts)
# sort the contours from left-to-right and initialize the
# 'pixels per metric' calibration variable
(cnts, _) = contours.sort_contours(cnts)
pixelsPerMetric = None
orig = image.copy()
box = cv2.minAreaRect(min(cnts, key=cv2.contourArea))
box = cv2.cv.BoxPoints(box) if imutils.is_cv2() else cv2.boxPoints(box)
box = np.array(box, dtype="int")
# order the points in the contour such that they appear
# in top-left, top-right, bottom-right, and bottom-left
# order, then draw the outline of the rotated bounding
# box
box = perspective.order_points(box)
cv2.drawContours(orig, [box.astype("int")], -1, (0, 255, 0), 2)
# loop over the original points and draw them
for (x, y) in box:
    cv2.circle(orig, (int(x), int(y)), 5, (0, 0, 255), -1)
print('Box: ',box)
cv2.imshow('Orig', orig)
img = cv2.imread("./meat/sair/20210702_114527.jpg") #original image
rows,cols,ch = img.shape
#pts1 = np.float32([[185,9],[304,80],[290, 134],[163,64]]) # looked good for 6e.jpg
### Collecting the points
pts1 = np.float32(box)
### Draw the vertices on the original image
for (x, y) in pts1:
    cv2.circle(img, (int(x), int(y)), 5, (0, 0, 255), -1)
ratio= 1.6
pts2 = np.float32([[pts1[0][0],pts1[0][1]], [pts1[0][0]+moldW, pts1[0][1]], [pts1[0][0]+moldW, pts1[0][1]+moldH], [pts1[0][0], pts1[0][1]+moldH]])
#print('cardH: ',cardH,cardW)
M = cv2.getPerspectiveTransform(pts1,pts2)
print('M:', M)
print('pts1:', pts1)
print('pts2:', pts2)
offsetSize= 320
transformed = np.zeros((int(moldW+offsetSize), int(moldH+offsetSize)), dtype=np.uint8)
dst = cv2.warpPerspective(img, M, transformed.shape)
And I got this:
No perspective correction. I have a lot of information like vertices, the correct size of the reference. Is it possible to do a mathematical correction based on quadrangle vertices, like a regression? Not necessarily correcting the image directly, unless there is a good method to correct the perspective image. Or maybe a different approach based on math? Thanks for your patience.
For Christoph:
This is the result position too:
pts1: [[ 9. 51.]
[392. 56.]
[388. 336.]
[ 5. 331.]]
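One possible way to do a purely mathematical correction from the four vertices, without warping the image, is to estimate the homography that maps the reference quadrangle to a rectangle of known physical size and then transform only the object's contour points. The sketch below is NumPy-only and rests on assumptions: the reference size of 16 x 10 cm (chosen only to match the ratio of 1.6 above) and the object contour `aol_px` are made-up values, and the helper names are hypothetical. The source points are the pts1 values printed above.

```python
import numpy as np

def homography_from_points(src, dst):
    """Estimate the 3x3 homography mapping four src points to four dst points (DLT)."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    # the solution is the null vector of the 8x9 system (last right-singular vector)
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    h = vt[-1].reshape(3, 3)
    return h / h[2, 2]

def apply_homography(h, pts):
    """Map 2D points through h, including the perspective division."""
    pts = np.asarray(pts, dtype=float)
    mapped = np.hstack([pts, np.ones((len(pts), 1))]) @ h.T
    return mapped[:, :2] / mapped[:, 2:3]

def polygon_area(pts):
    """Shoelace formula for the area of a simple polygon."""
    x, y = np.asarray(pts, dtype=float).T
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

# reference vertices in pixels: top-left, top-right, bottom-right, bottom-left
src = np.array([[9, 51], [392, 56], [388, 336], [5, 331]], dtype=float)
ref_w_cm, ref_h_cm = 16.0, 10.0  # assumed physical size of the reference
dst = np.array([[0, 0], [ref_w_cm, 0], [ref_w_cm, ref_h_cm], [0, ref_h_cm]])

H = homography_from_points(src, dst)

# hypothetical object contour in pixels; in practice this comes from the AOL mask
aol_px = np.array([[100, 100], [200, 100], [200, 200], [100, 200]], dtype=float)
aol_cm = apply_homography(H, aol_px)
print('corrected area (cm^2):', polygon_area(aol_cm))
```

This only corrects for the plane of the reference, so it assumes the object lies in (or very close to) the same plane as the frame.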

How a heatmap is overlaid by another one, by calling matplotlib imshow twice for the same ax?

An exciting animation was posted on Twitter recently:
One of the authors explained in this Jupyter Notebook
how a frame is created.
Related to the simple code displayed by this notebook, my question is: when we call imshow twice for the same ax:
ax.imshow(np.flipud(sst.sst.values), cmap=cm.RdBu_r, vmin=12, vmax=24)
ax.imshow(np.flipud(u.u_surf.values), alpha=0.3, cmap=cm.gray, vmin=-.3, vmax=0.3)
what operations does matplotlib perform behind the scenes to get a layered image?
I have worked with alpha blending in OpenCV (Python), but here we start with two arrays of the same shape (1000, 1000), and ax.imshow, called twice for the two arrays, displays the resulting image. I'd like to know how this is possible. What arithmetic operations between the images are involved?
I searched the matplotlib GitHub repository to understand what's going on, but I couldn't find anything relevant.
I managed to show that the two imshow calls amount to alpha blending of the two images.
import numpy as np
import xarray as xr
import matplotlib.pyplot as plt
import matplotlib.cm as cm
sst = xr.open_dataset('GF_FESOM2_testdata/')
u = xr.open_dataset('GF_FESOM2_testdata/')
v = xr.open_dataset('GF_FESOM2_testdata/')
#Define the heatmap from SST-data and extract the array representing it as an image:
fig1, ax1 = plt.subplots(1, 1,
figsize=(10, 10))
f1 = ax1.imshow(np.flipud(sst.sst.values), cmap=cm.RdBu_r, vmin=12, vmax=24)
arr1 = f1.make_image('notebook')[0] #array representing the above image
#Repeat the same procedure for u-data set:
fig2, ax2 = plt.subplots(1, 1,
figsize=(10, 10))
f2 = ax2.imshow(np.flipud(u.u_surf.values), cmap=cm.gray, vmin=-0.3, vmax=0.3)
arr2 = f2.make_image("notebook")[0]
#alpha blending of the two images amounts to a convex combination of the associated arrays
alpha1= 1 # background image alpha
alpha2 = 0.3 #foreground image alpha
arr = np.asarray((alpha2*arr2 + alpha1*(1-alpha2)*arr1)/(alpha2+alpha1*(1-alpha2)), dtype=np.uint8)
fig, ax = plt.subplots(1, 1,
                       figsize=(10, 10))
ax.imshow(arr)  # display the blended image
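The convex combination used in the script above is the standard "over" compositing rule. Here is a minimal sketch of it on small synthetic arrays, so it can be checked by hand. The function name `blend_over` and the test images are made up for illustration, and it works on float RGB values in [0, 1] rather than on the uint8 RGBA arrays returned by make_image.

```python
import numpy as np

def blend_over(fg, bg, alpha_fg, alpha_bg=1.0):
    """Composite fg over bg for float RGB arrays in [0, 1]; returns (rgb, alpha)."""
    alpha_out = alpha_fg + alpha_bg * (1.0 - alpha_fg)
    rgb_out = (alpha_fg * fg + alpha_bg * (1.0 - alpha_fg) * bg) / alpha_out
    return rgb_out, alpha_out

bg = np.ones((2, 2, 3))   # opaque white background
fg = np.zeros((2, 2, 3))  # black foreground drawn with alpha=0.3
rgb, alpha = blend_over(fg, bg, alpha_fg=0.3)
print(rgb[0, 0], alpha)
```

With an opaque background (alpha_bg = 1) the denominator is 1, so the result reduces to alpha_fg * fg + (1 - alpha_fg) * bg.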

Is there a way to plot confusion matrix from H2O?

I know H2O can use
model_perf = model.model_performance(input)
to output the confusion matrix. But is there a way to get the confusion matrix as a table so that I can plot it?
The function you need is indicated here. You just need to convert the output of your H2OFrames to a pandas DataFrame. An example is shown below:
import h2o
from h2o.estimators.gbm import H2OGradientBoostingEstimator
import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm, datasets
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix
from sklearn.utils.multiclass import unique_labels
%matplotlib inline
# import the cars dataset:
# this dataset is used to classify whether or not a car is economical based on
# the car's displacement, power, weight, and acceleration, and the year it was made
cars = h2o.import_file("")
# print(cars["economy_20mpg"].isna().sum())
cars = cars[~cars["economy_20mpg"].isna()]
# convert response column to a factor
cars["economy_20mpg"] = cars["economy_20mpg"].asfactor()
# set the predictor names and the response column name
predictors = ["displacement","power","weight","acceleration","year"]
response = "economy_20mpg"
# split into train and validation sets
train, valid = cars.split_frame(ratios = [.8], seed = 1234)
# try using the `y` parameter:
# first initialize your estimator
cars_gbm = H2OGradientBoostingEstimator(seed = 1234, sample_rate=.5)
# then train your model, where you specify your 'x' predictors, your 'y' the response column
# training_frame and validation_frame
cars_gbm.train(x = predictors, y = response, training_frame = train, validation_frame = valid)
Function from sklearn:
def plot_confusion_matrix(y_true, y_pred, classes,
                          normalize=False,
                          title=None,
                          cmap=plt.cm.Blues):
    """
    This function prints and plots the confusion matrix.
    Normalization can be applied by setting `normalize=True`.
    """
    if not title:
        if normalize:
            title = 'Normalized confusion matrix'
        else:
            title = 'Confusion matrix, without normalization'
    # Compute confusion matrix
    cm = confusion_matrix(y_true, y_pred)
    # Only use the labels that appear in the data
    classes = classes[unique_labels(y_true, y_pred)]
    if normalize:
        cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
        print("Normalized confusion matrix")
    else:
        print('Confusion matrix, without normalization')
    print(cm)
    fig, ax = plt.subplots()
    im = ax.imshow(cm, interpolation='nearest', cmap=cmap)
    ax.figure.colorbar(im, ax=ax)
    # We want to show all ticks...
    ax.set(xticks=np.arange(cm.shape[1]),
           yticks=np.arange(cm.shape[0]),
           # ... and label them with the respective list entries
           xticklabels=classes, yticklabels=classes,
           title=title,
           ylabel='True label',
           xlabel='Predicted label')
    # Rotate the tick labels and set their alignment.
    plt.setp(ax.get_xticklabels(), rotation=45, ha="right",
             rotation_mode="anchor")
    # Loop over data dimensions and create text annotations.
    fmt = '.2f' if normalize else 'd'
    thresh = cm.max() / 2.
    for i in range(cm.shape[0]):
        for j in range(cm.shape[1]):
            ax.text(j, i, format(cm[i, j], fmt),
                    ha="center", va="center",
                    color="white" if cm[i, j] > thresh else "black")
    fig.tight_layout()
    return ax
Extract the values:
# specify the threshold you want to use to create integer labels
maxf1_threshold = cars_gbm.find_threshold_by_max_metric('f1')
# specify your true and predicted labels
y_true = cars["economy_20mpg"].as_data_frame()
y_pred = cars_gbm.predict(cars)
# convert prediction labels (original uncalibrated probabilities into integer labels)
y_pred = (y_pred['p1'] >= maxf1_threshold).ifelse(1,0)
y_pred = y_pred.as_data_frame()
y_pred.columns = ['p1']
y_true1 = y_true.economy_20mpg
y_pred1 = y_pred.p1
class_names = np.array(cars["economy_20mpg"].levels()[0])
# Plot non-normalized confusion matrix
plot_confusion_matrix(y_true1, y_pred1, classes=class_names,
                      title='Confusion matrix')
image result:
Please note that there is a bug in the H2O-3 confusion matrix that has been noted here
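To make the thresholding step concrete without an H2O cluster, here is a small NumPy-only sketch that turns class-1 probabilities into integer labels and counts the four cells of a binary confusion matrix. The function name `confusion_from_probs`, the threshold of 0.5 and the sample values are made up for illustration; with H2O the threshold would come from find_threshold_by_max_metric('f1') as above.

```python
import numpy as np

def confusion_from_probs(y_true, p1, threshold):
    """Binary confusion matrix: rows are true labels, columns are predicted labels."""
    y_true = np.asarray(y_true).astype(int)
    y_pred = (np.asarray(p1) >= threshold).astype(int)
    cm = np.zeros((2, 2), dtype=int)
    # count each (true, predicted) pair
    np.add.at(cm, (y_true, y_pred), 1)
    return cm

y_true = [0, 0, 1, 1, 1]
p1 = [0.1, 0.6, 0.4, 0.8, 0.9]  # hypothetical predicted probabilities of class 1
cm = confusion_from_probs(y_true, p1, threshold=0.5)
print(cm)
```

The resulting array has the same row/column layout as the output of sklearn's confusion_matrix, so it can be passed to imshow in the same way.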

Find median of list of images

If I have a list of images represented by 3D ndarray such as [[x,y,color],...], what operations can I use to output an image with values that are median of all values? I am using a for loop and find it too slow.
This is my vectorized implementation using NumPy:
For my test I used these five images:
The relevant parts:
import numpy as np
from PIL import Image
# Load five images as greyscale float arrays
# (scipy.ndimage.imread was removed in SciPy 1.2; Pillow is used here instead):
ims = [np.asarray(Image.open(str(i + 1) + '.png').convert('F')) for i in range(5)]
# Stack the reshaped images (rows) vertically:
ims = np.vstack([im.reshape(1, im.shape[0] * im.shape[1]) for im in ims])
# Compute the median column by column and reshape to the original shape:
median = np.median(ims, axis=0).reshape(100, 100)
The complete script:
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt
# scipy.ndimage.imread was removed in SciPy 1.2; Pillow is used here instead
ims = [np.asarray(Image.open(str(i + 1) + '.png').convert('F')) for i in range(5)]
print(ims[0].shape)  # (100, 100)
ims = np.vstack([im.reshape(1, im.shape[0] * im.shape[1]) for im in ims])
print(ims.shape)  # (5, 10000)
median = np.median(ims, axis=0).reshape(100, 100)
fig = plt.figure(figsize=(100./109., 100./109.), dpi=109, frameon=False)
ax = fig.add_axes([0, 0, 1, 1])
plt.imshow(median, cmap='Greys_r')
The median (numpy.median) result of the five images looks like this:
Fun part: The mean (numpy.mean) result looks like this:
Okay, science meets art. :-)
You said your images are in color, formatted as a list of 3d ndarrays. Let's say there are n images:
imgs = [img_1, ..., img_n]
where imgs is a list and each img_i is an ndarray with shape (nrows, ncols, 3).
Convert the list to a 4d ndarray, then take the median over the dimension that corresponds to the images:
import numpy as np
# Convert images to a 4d ndarray with shape (n, nrows, ncols, 3)
imgs = np.asarray(imgs)
# Take the median over the first dim
med = np.median(imgs, axis=0)
This gives the pixel-wise median. The value of each color channel of each pixel is the median of the corresponding pixel/channel in all images.
The documentation for asarray() says "no copy is performed if the input is already an ndarray". This suggests that the operation would be faster if you stored the original list of images as a 4d ndarray instead of a list. In that case, it wouldn't be necessary to copy the information in memory (or to run asarray())
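Here is a self-contained sketch of the same idea on random data, so that the shapes and one pixel can be checked by hand. The image size, the random inputs and the final cast back to uint8 are assumptions made for the example, not part of the original data.

```python
import numpy as np

rng = np.random.default_rng(0)
# five hypothetical color images, each of shape (nrows, ncols, 3)
imgs = [rng.integers(0, 256, size=(4, 5, 3), dtype=np.uint8) for _ in range(5)]

# stack into one array of shape (n, nrows, ncols, 3)
stack = np.stack(imgs, axis=0)

# pixel-wise, channel-wise median over the image axis (returns float64)
med = np.median(stack, axis=0)

# cast back to uint8 if an image dtype is needed
med_img = med.astype(np.uint8)
print(stack.shape, med.shape, med_img.dtype)
```

With an odd number of images the median is always one of the input values, so the cast to uint8 loses nothing; with an even number it can be a half-integer that gets truncated.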
Can you post an example of your data?
Otherwise, I think you could use NumPy, with numpy.mean.
The documentation is here ;)
