Trying the pretrained Facebook DETR model for object detection using the HuggingFace implementation.
The sample code listed below from https://huggingface.co/facebook/detr-resnet-50 is straightforward.
from transformers import DetrFeatureExtractor, DetrForObjectDetection
from PIL import Image
import requests
import numpy as np
url = 'http://images.cocodataset.org/val2017/000000039769.jpg'
image = Image.open(requests.get(url, stream=True).raw)
feature_extractor = DetrFeatureExtractor.from_pretrained('facebook/detr-resnet-50')
model = DetrForObjectDetection.from_pretrained('facebook/detr-resnet-50')
inputs = feature_extractor(images=image, return_tensors="pt")
outputs = model(**inputs)
# model predicts bounding boxes and corresponding COCO classes
logits = outputs.logits
bboxes = outputs.pred_boxes
I can use
threshold = 0.7
labels =['background', 'person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus',
'train', 'truck', 'boat', 'traffic light', 'fire hydrant',
'street sign', 'stop sign', 'parking meter', 'bench', 'bird',
'cat', 'dog', 'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra',
'giraffe', 'hat', 'backpack', 'umbrella', 'shoe', 'eye glasses',
'handbag', 'tie', 'suitcase', 'frisbee', 'skis', 'snowboard',
'sports ball', 'kite', 'baseball bat', 'baseball glove',
'skateboard', 'surfboard', 'tennis racket', 'bottle', 'plate',
'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana',
'apple', 'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog',
'pizza', 'donut', 'cake', 'chair', 'couch', 'potted plant', 'bed',
'mirror', 'dining table', 'window', 'desk', 'toilet', 'door', 'tv',
'laptop', 'mouse', 'remote', 'keyboard', 'cell phone', 'microwave',
'oven', 'toaster', 'sink', 'refrigerator', 'blender', 'book',
'clock', 'vase', 'scissors', 'teddy bear', 'hair drier',
'toothbrush']
np_softmax = (logits.softmax(-1)[0, :, :-1]).detach().numpy()
classes = []
probability = []
idx = []
for i, j in enumerate(np_softmax):
    if np.max(j) > threshold:
        classes.append(labels[np.argmax(j)])
        probability.append(np.max(j))
        idx.append(i)
to retrieve the detected classes. But I did not fully understand the coordinates in the bboxes.
This is a torch tensor with 100 bounding box coordinates of 4 dimensions each. With idx I can get the indices of the detected classes, so I can get their corresponding boxes. The coordinates seem to be normalized, because they are all between 0 and 1. I am having difficulty remapping the coordinates to pixels so I can draw the bounding boxes on the original image. I could not find documentation on this; any suggestions? Thanks
Okay, figured it out: the four coordinates are the normalized (x center, y center, width, height).
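As a minimal sketch of the conversion (assuming image is the PIL image loaded above, and box is one kept row of outputs.pred_boxes[0], e.g. bboxes[0, i] for an index i from idx), you can rescale to pixel corner coordinates like this:
# Sketch: normalized (x_center, y_center, width, height) -> pixel (xmin, ymin, xmax, ymax)
img_w, img_h = image.size  # PIL returns (width, height)
cx, cy, w, h = box.detach().numpy()
xmin = (cx - 0.5 * w) * img_w
ymin = (cy - 0.5 * h) * img_h
xmax = (cx + 0.5 * w) * img_w
ymax = (cy + 0.5 * h) * img_h
bbox = (xmin, ymin, xmax, ymax)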
If you want to draw a rectangle for each bbox, you can use this code:
import matplotlib.pyplot as plt

plt.figure(figsize=(16, 10))
plt.imshow(pil_img)  # pil_img: the original PIL image
ax = plt.gca()
(xmin, ymin, xmax, ymax) = bbox
ax.add_patch(plt.Rectangle(
    (xmin, ymin),
    xmax - xmin,
    ymax - ymin,
    fill=False,
    color=c,  # c: any matplotlib color, e.g. 'r'
    linewidth=3
))
Related
I am creating a plotly figure, overlaying rectangles on an image, and I want to change the xticks.
Example Code:
import numpy as np
from PIL import Image
import plotly.graph_objects as go

a = 255 * np.random.random((28, 28))
pil_img = Image.fromarray(a).convert('RGB')
fig2 = go.Figure(data=[go.Scatter(x=[0, 10, 10, 0], y=[0, 0, 10, 10], fill="toself"), go.Image(z=pil_img)])
fig2.show()
Instead of the ticks being the number of pixels (0-28), I want them to be, say, from 0.2 to 3 in increments of 0.1 ([0.2, 0.3, ..., 3]), so that the length is still 28 but the ticks aren't [0, 1, 2, ...] but rather [0.2, 0.3, ..., 3].
Thanks!
Following this documentation page, one way to achieve it is:
import numpy as np
from PIL import Image
import plotly.graph_objects as go
N = 28
a = 255 * np.random.random((N, N))
pil_img = Image.fromarray(a).convert('RGB')
fig = go.Figure(data = [go.Scatter(x=[0,10,10,0], y=[0,0,10,10], fill="toself"),go.Image(z=pil_img)])
fig.update_layout(
    xaxis = dict(
        tickmode = 'array',
        tickvals = np.arange(N),
        ticktext = ["{:.1f}".format(t) for t in np.linspace(0.3, 3, N)]
    ),
    yaxis = dict(
        tickmode = 'array',
        tickvals = np.arange(N),
        ticktext = ["{:.1f}".format(t) for t in np.linspace(0.3, 3, N)]
    )
)
fig.show()
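If labeling all 28 pixels is too dense, a small variation on the same idea (same N and tick mapping as above; the step of 4 is chosen arbitrarily for illustration) subsamples the tick positions:
step = 4  # label every 4th pixel
fig.update_layout(
    xaxis = dict(
        tickmode = 'array',
        tickvals = np.arange(0, N, step),
        ticktext = ["{:.1f}".format(t) for t in np.linspace(0.3, 3, N)[::step]]
    )
)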
I am creating a 3D sphere from a 2D grid of selected latitudes and longitudes, which holds the angular coordinates. This grid provides the key points for drawing the 3D sphere. Then I compute the X, Y, Z 3D coordinate values from these angular coordinates with the well-known formula. The resulting sphere is shown in the attached picture. I am using a numpy named datatype as
np3d = np.dtype([('X', np.float64), ('Y', np.float64), ('Z', np.float64)])
for 3D coordinates and
np2d = np.dtype([('L', np.float64), ('B', np.float64)])
for the latitude/longitude grid. My Python code is
import numpy as np, math
import pandas as pd
from matplotlib import pyplot as plt, ticker, patches, font_manager as fmng
from matplotlib.widgets import Cursor, MultiCursor
from pathlib import Path
from datetime import datetime
fig3d = plt.figure('3D Sphere', figsize=(9.5,9.5))
fig3d.subplots_adjust(left=0.04, bottom=0.07, top=0.97, right=0.97, wspace=0, hspace=0)
prmgraf = dict(axis="both", direction='in',top=True, right=True)
ax3 = fig3d.add_subplot(111, projection='3d')
ax3.grid(False)
ax3.minorticks_on()
ax3.yaxis.set_minor_locator(ticker.AutoMinorLocator(5))
ax3.xaxis.set_minor_locator(ticker.AutoMinorLocator(5))
ax3.zaxis.set_minor_locator(ticker.AutoMinorLocator(5))
ax3.tick_params(which='major', length=4, **prmgraf)
ax3.tick_params(which='minor', length=3, **prmgraf)
ax3.set_xlabel('Axis X')
ax3.set_ylabel('Axis Y')
ax3.set_zlabel('Axis Z')
ax3.set_xlim(-15.0, 15.0)
ax3.set_ylim(-15.0, 15.0)
ax3.set_zlim(-15.0, 15.0)
# azim elev
ax3.view_init(0., 180.)
# ---------------------------- settings ------------------------------
nLat = np.vstack(np.radians(np.arange(-65., 70., 5.)))
nLon = np.radians(np.arange(-180., 185., 5.))
np3d = np.dtype([('X', np.float64), ('Y', np.float64), ('Z', np.float64)])
np2d = np.dtype([('L', np.float64), ('B', np.float64)])
LatN, LonN = (len(nLat), len(nLon))
SphrRadius = 14.5
#2D cartesian coordinates
pSphr = np.zeros(shape=(LatN, LonN), dtype=np2d)
pSphr['L'] = nLat
pSphr['B'] = nLon
#3D sphere coordinates
Spher = np.zeros(shape=(LatN, LonN), dtype=np3d)
Spher['X'] = SphrRadius*np.cos(pSphr['L'])*np.sin(pSphr['B'])
Spher['Y'] = SphrRadius*np.sin(pSphr['L'])
Spher['Z'] = SphrRadius*np.cos(pSphr['L'])*np.cos(pSphr['B'])
# draw sphere latitudes
for i in range(LatN):
    kx = Spher[i, :]['X']
    ky = Spher[i, :]['Y']
    kz = Spher[i, :]['Z']
    ax3.plot3D(kx, ky, kz, c='k', lw=0.5)
plt.show()
The Z values of the sphere range between -15.0 and 15.0. I want to select ONLY POSITIVE Z VALUES in the Spher variable. In other words, I want to draw half of the sphere in the Z direction. How can I do that with a named datatype? Thanks in advance to anyone who answers.
I found a solution:
Spher[(Spher['Z'] < 0.)] = np.nan
But this is not the solution I expected. I wanted to select all the points into a new variable, as in
ZPositive = Spher[(Spher['Z'] < 0.)]
But this way, the 2D data structure collapses into 1D data.
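For what it's worth, a minimal sketch of the difference (assuming Spher as built above): boolean indexing on a structured array always returns a flat 1-D array of the selected records, so to keep the (LatN, LonN) grid you can mask a copy instead, which is the NaN approach applied to a new variable:
# Boolean indexing flattens: ZPositive_flat is 1-D, holding only the selected records
ZPositive_flat = Spher[Spher['Z'] >= 0.]

# Masking a copy keeps the 2-D grid shape; rejected points become NaN in all fields
ZPositive = Spher.copy()
ZPositive[ZPositive['Z'] < 0.] = np.nan
print(ZPositive.shape == Spher.shape)  # True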
I am trying to implement the FRST in Python to detect centroids of elliptical objects (e.g. cells in microscopy images), but my implementation does not find the seed points (more or less the center points) of the elliptical objects. This effort comes from duplicating the FRST from Segmentation of Overlapping Elliptical Objects in Silhouette Images (https://ieeexplore.ieee.org/document/7300433). I don't know why I get these artifacts. An interesting thing is that I see these patterns (crosses) all in the same direction per object. Any pointer in the right direction to reproduce the result in the paper (just finding the seed points) will be most welcome.
Original Paper: A Fast Radial Symmetry Transform for Detecting Points of Interest by Loy and Zelinsky (ECCV 2002)
I have also tried the pre-existing python package for FRST: https://pypi.org/project/frst/. This somehow results in the same artifacts. Weird.
First image: Original Image
Second image: Sobel-operated Image
Third image: Magnitude Projection Image
Fourth image: Magnitude Projection Image with positively affected pixels only
Fifth image: FRST'd image: end-product with original image overlaid (shadowed)
Sixth image: FRST'd image by the pre-existing python package with original image overlaid (shadowed).
from scipy.ndimage import gaussian_filter
import numpy as np
from scipy.signal import convolve
# Get orientation projection image
def get_proj_img(image, radius):
    workingDims = tuple((e + 2 * radius) for e in image.shape)
    h, w = image.shape
    ori_img = np.zeros(workingDims)  # Orientation Projection Image
    mag_img = np.zeros(workingDims)  # Magnitude Projection Image

    # Kernels for the Sobel operator
    a1 = np.matrix([1, 2, 1])
    a2 = np.matrix([-1, 0, 1])
    Kx = a1.T * a2
    Ky = a2.T * a1

    # Apply the Sobel operator
    sobel_x = convolve(image, Kx)
    sobel_y = convolve(image, Ky)
    sobel_norms = np.hypot(sobel_x, sobel_y)

    # Distances to afpx, afpy (affected pixels)
    dist_afpx = np.multiply(np.divide(sobel_x, sobel_norms, out=np.zeros(sobel_x.shape), where=sobel_norms != 0), radius)
    dist_afpx = np.round(dist_afpx).astype(int)
    dist_afpy = np.multiply(np.divide(sobel_y, sobel_norms, out=np.zeros(sobel_y.shape), where=sobel_norms != 0), radius)
    dist_afpy = np.round(dist_afpy).astype(int)

    for cords, sobel_norm in np.ndenumerate(sobel_norms):
        i, j = cords
        pos_aff_pix = (i + dist_afpx[i, j], j + dist_afpy[i, j])
        neg_aff_pix = (i - dist_afpx[i, j], j - dist_afpy[i, j])
        ori_img[pos_aff_pix] += 1
        ori_img[neg_aff_pix] -= 1
        mag_img[pos_aff_pix] += sobel_norm
        mag_img[neg_aff_pix] -= sobel_norm

    ori_img = ori_img[:h, :w]
    mag_img = mag_img[:h, :w]
    print("Did it go back to the original image size?")
    print(ori_img.shape == image.shape)
    # try normalizing ori and mag img
    return ori_img, mag_img

def get_sn(ori_img, mag_img, radius, kn, alpha):
    ori_img_limited = np.minimum(ori_img, kn)
    fn = np.multiply(np.divide(mag_img, kn), np.power((np.absolute(ori_img_limited) / kn), alpha))
    # convolve fn with a Gaussian filter
    sn = gaussian_filter(fn, 0.25 * radius)
    return sn

def do_frst(image, radius, kn, alpha, ksize=3):
    ori_img, mag_img = get_proj_img(image, radius)
    sn = get_sn(ori_img, mag_img, radius, kn, alpha)
    return sn
Parameters:
radius = 50
kn = 10
alpha = 2
beta = 0
stdfactor = 0.25
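For completeness, a hypothetical usage sketch of the functions above (the file name is made up; do_frst expects a 2D grayscale array, and note that beta and stdfactor from the parameter list are not used by this version of the code, since the Gaussian sigma of 0.25*radius is hard-coded in get_sn):
from skimage import io  # any loader that yields a 2D float array will do

img = io.imread('cells.png', as_gray=True)  # 'cells.png' is a hypothetical file
sn = do_frst(img, radius=50, kn=10, alpha=2)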
I have created an image which is an output of the imagesc function in MATLAB.
My problem: How do I display a larger image of a transposed matrix and a much smaller regular plot as a subplot?
I have tried the following code but have not gotten far. I'm looking for comments on what I have done and suggestions to arrive at my final goal.
hFig = figure(1);
set(hFig, 'Position', [100 100 500 500]);
subplot(2,1,1)
imagesc(normalized_matrix'),colormap(jet)
ax = gca;
ax.XLimMode = 'manual';
ax.XLim = [0e4 30e4];
ax.XTickLabelMode = 'manual';
ax.XTickLabel = {'0', '5', '10', '15', '20', '25', '30'};
xlabel('Time in Seconds')
subplot(2,1,2)
plot(force_resample)
ax1 = gca;
ax1.XLimMode = 'manual';
ax1.XLim = [0e4 30e4];
ax1.XTickLabelMode = 'manual';
ax1.XTickLabel = {''};
ax1.YLimMode = 'manual';
ax1.YLim = [0 2];
ylabel('Force in Newtons')
ax1.XLimMode = 'manual';
ax1.XLim = [0e4 30e4];
ax1.XTickLabelMode = 'manual';
ax1.XTickLabel = {'0', '5', '10', '15', '20', '25', '30'};
xlabel('Time in Seconds')
Assign more parts of the subplot grid to the image, e.g.:
%Sample Data
normalized_matrix = [1:10; 11:20; 21:30; 31:40];
force_resample = rand(4,40);
subplot(3,1,1:2); %using 2/3 part of the grid for the image
imagesc(normalized_matrix);
subplot(3,1,3); %using remaining 1/3 part of the grid for plot
plot(force_resample);
which gives the desired layout (screenshot omitted): the image occupies the top two thirds of the figure and the plot the bottom third.
What I would like to do is plot an image of a graph (from, say, a PDF file or a scanned image). Next, I would like to overlay an axis on the graph in the image, and then plot data on that axis (over the image).
Using imtool, I know the coordinates of the graph in the image (x range = ~52-355 pixels, and y range = 23(top) - 262(bottom) pixels in this case).
This is what I have tried:
I = imread('C:\MATLAB\R2014a\help\images\ref\ftrans2_fig.png');
I = squeeze(uint8(mean(I,3)));
figure, imshow(I)
[rows, cols] = size(I);
x_data = (-1 : .01 : +1)';
y_data = 1 - x_data.^2;
h1 = axes('Position',([52, 23, 355-52, 262-23] ./ [cols, rows, cols, rows] ));
set(h1, 'Color', 'none')
hold on
plot(x_data, y_data, '-rx')
Question: Knowing the pixel coordinates of the graph in the image, how do I determine the proper position of the axis in the figure? (My code fails to account for the actual size of the figure box, i.e. the gray border around the image.) I have to do this for several images and sets of data, so I would like an automated method, assuming I find the coordinates of the graphs in the image ahead of time.
Thanks for your reply! (1st time posting, please be kind)
You may be able to solve your problem by forcing the image onto the same axis as the plot. Try this:
I = imread('C:\MATLAB\R2014a\help\images\ref\ftrans2_fig.png');
I = squeeze(uint8(mean(I,3)));
[rows, cols] = size(I);
x_data = (-1 : .01 : +1)';
y_data = 1 - x_data.^2;
h1 = axes('Position',([52, 23, 355-52, 262-23] ./ [cols, rows, cols, rows] ));
set(h1, 'Color', 'none')
hold on
image(I, 'Parent', h1);
plot(h1, x_data, y_data, '-rx')
That should at least ensure that the plot axis and the image axis have the same origin, as they will be one and the same. You may need to adjust your sizing code. Let me know if that doesn't do it for you.
Good Luck!
I think I have it figured out.
It would have been easier if I could use:
figure, h1=imshow(I)
get(h1,'Position')
but that results in "The name 'Position' is not an accessible property for an instance of class 'image'."
Instead, this appears to work:
I = imread('C:\MATLAB\R2014a\help\images\ref\ftrans2_fig.png');
I = squeeze(uint8(mean(I,3)));
in_mag = 300;
figure, imshow(I, 'Border', 'tight', 'InitialMagnification', in_mag)
[rows, cols] = size(I);
x_data = (-1 : .01 : +1)';
y_data = 1 - x_data.^2;
% Coord of graph in image pixels
x_0 = 50; x_max = 354; y_0 = 262; y_max = 23;
h1 = axes('Position',([x_0, rows-y_0, x_max-x_0, y_0-y_max] ...
./ [cols, rows, cols, rows] ));
set(h1,'Color','none')
hold on
plot(x_data, y_data, '-rx')
ylim([0,1.4])
set(gca,'YColor', [0 0 1], 'XColor', [0 0 1])
However, if anybody has a better idea, I would be very happy to explore it!
Thanks