i'm trying to train train_shape_predictor_ex for detecting following image in indian bill. i'm using 34 different images both clicked and scanned.
model is trained succesfully with
taining error = 0
testing error = 0.35468-6
i have tried changing oversampling parameter from 300 to 12000
but still same results.
what am i doing wrong?
drawing code - from image loading to drawing step:
image_window win;
frontal_face_detector detector = get_frontal_face_detector();
shape_predictor pose_model;
deserialize("sp.dat") >> pose_model;
while (!win.is_closed())
{
cv::Mat temp;
cap >> temp;
cv_image<bgr_pixel> cimg(temp);
std::vector<rectangle> faces = detector(cimg);
std::vector<full_object_detection> shapes;
for (unsigned long i = 0; i < faces.size(); ++i)
{
full_object_detection shape = pose_model(cimg, faces[i]);
std::vector<rectangle> dets = detector(cimg);
shapes.push_back(pose_model(cimg, faces[i]));
win.clear_overlay();
win.set_image(cimg);
win.add_overlay(dets, rgb_pixel(255, 0, 0));
win.add_overlay(render_face_detections(shapes));
}
}
As I see now - you are trying to train custom shape_predictor that will only have 14 points while using Dlib's render_face_detections function that require Dlib's face shape that has 68 points. render_face_detections will not draw your shape correctly and should throw an exception.
To make your shape prediction work, please ensure that you follow this conditions:
The meaning of each point should be the same on each image. Do not put point #0 on ear on first image and #0 on the tip of nose on the second image
Object bounding box should not be hand-drawn if you have small dataset. If you use face detector for testing shape predictor - ensure that your training/testing images bounding boxes are made by detecting face with the same face detector. Yes, you can hand-draw whis bounding boxes, but please make sure that they have the same size and position as if they will be detected. The way how you put box on the training set should be identical the way how you will get it in a future.
There is no requirement to fit all points inside face bounding box. Really it can have any size and position, even static 10x10 box for each face on the tip of nose (or full bill rect) will work - but you should have enough training samples. And follow previous condition
Use as many images as possible. 34 images is not enough for training shape predictor - if will simply remember them in its internal memory and will not work in a future. Dlib's shape predictor is trained with about 2-3k images.
You can generate new images by distorting original ones - scale, resize, add noise...
Do not use Dlib's testing and drawing functions if your shape predictor does not have 68 points with the same meaning as dlib's face shape predictor. You can use its source code and make your functions as you need
Related
I have two cameras setted horizontally (close to each other). I have left camera cam1 and right camera cam2.
First I calibrate cameras (I want to calibrate 50 pairs of images):
I calibrate both cameras separetely using cv::calibrateCamera()
I calibrate stereo using cv::stereoCalibrate()
My questions:
In stereoCalibrate - I assumed that the order of cameras data is important. If data from left camera should be the imagePoints1 and from right camera it should be imagePoints2 or vice versa or it doesn't matters as long as order of cameras is the same in every point of program?
In stereoCalibrate - I get RMS error around 15,9319 and average reprojection error around 8,4536. I get that values if I use all images from cameras. In other case: first I save images, I select pairs where whole chessboard is visible (all of chessborad's squares is in camera view and every square is visible in its entirety) I get RMS around 0,7. If that means that only offline calibration is good and if I want to calibrate camera I should select good images manually? Or there is some way to do calibration online? By online I mean that I start capture view from camera and on every view I found chessboard corners and after stop capture view from camera I calibrate camera.
I need only four values of distortion but I get five of them (with k3). In old api version cvStereoCalibrate2 I got only four values but in cv::stereoCalibrate I don't know how to do this? Is it even possible or the only way is to get 5 values and use only four of them later?
My code:
Mat cameraMatrix[2], distCoeffs[2];
distCoeffs[0] = Mat(4, 1, CV_64F);
distCoeffs[1] = Mat(4, 1, CV_64F);
vector<Mat> rvec1, rvec2, tvec1, tvec2;
double rms1 = cv::calibrateCamera(objectPoints, imagePoints[0], imageSize, cameraMatrix[0], distCoeffs[0],rvec1, tvec1, CALIB_FIX_K3, TermCriteria(
TermCriteria::COUNT+TermCriteria::EPS, 30, DBL_EPSILON));
double rms2 = cv::calibrateCamera(objectPoints, imagePoints[1], imageSize, cameraMatrix[1], distCoeffs[1],rvec2, tvec2, CALIB_FIX_K3, TermCriteria(
TermCriteria::COUNT+TermCriteria::EPS, 30, DBL_EPSILON));
qDebug()<<"Rms1: "<<rms1;
qDebug()<<"Rms2: "<<rms2;
Mat R, T, E, F;
double rms = cv::stereoCalibrate(objectPoints, imagePoints[0], imagePoints[1],
cameraMatrix[0], distCoeffs[0],
cameraMatrix[1], distCoeffs[1],
imageSize, R, T, E, F,
TermCriteria(CV_TERMCRIT_ITER+CV_TERMCRIT_EPS, 100, 1e-5),
CV_CALIB_FIX_INTRINSIC+
CV_CALIB_SAME_FOCAL_LENGTH);
I had a similar problem. My problem was that I was reading the left images and the right images by assuming that both were sorted. Here a part of the code in Python
I fixed by using "sorted" in the second line.
images = glob.glob(path_left)
for fname in sorted(images):
img = cv2.imread(fname)
gray1 = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Find the chess board corners
ret, corners1 = cv2.findChessboardCorners(gray1, (n, m), None)
# If found, add object points, image points (after refining them)
if ret == True:
i = i + 1
print("Cam1. Chess pattern was detected")
objpoints1.append(objp)
cv2.cornerSubPix(gray1, corners1, (5, 5), (-1, -1), criteria)
imgpoints1.append(corners1)
cv2.drawChessboardCorners(img, (n, m), corners1, ret)
cv2.imshow('img', img)
cv2.waitKey(100)
The only thing why is the order of cameras/image sets important is the rotation and translation you get from stereoCalibrate function. The image set you put into the function as first is taken as the base. So the rotation and translation you get is how is the second camera translated and rotated from the first camera. Of course you can just reverse the result, which is the same as switching image sets. This of course holds only if the images in both sets are corresponding to each other (their order).
This is a bit tricky, but there are few reasons why you are getting this big RMS error.
First, I'm not sure how you detect your chessboard corners, but if the whole chessboard is not visible and you provide valid chessboard model, findChessboardCorners should return false as it does not detect the chessboard. So you're able to automatically (=online) omit these "chessless" images. Of course you have to throw away also the image from second camera, even if that one is valid, to preserve correct order in both sets.
Second option is to back-project all corners for each image and calculate reprojection error for all images separately (not only for whole calibration). Then you can select, for example, best 3/4 images by this error and recalculate calibration without outliers.
Other reason could be the time sync between snapping images from 2 cameras. If the delay is big and you move with the chessboard continuously, you're actually trying to match projections of slightly translated chessboard.
If you want robust online version I'm afraid you will end up with the second option, as it helps you also get rid of blurred images, wrong detections due to light conditions and so. You just need to set the threshold (how many images you will cut of as outliers) carefully to not throw away valid data.
I'm not that sure in this field, but I would say you can calculate 5 of them and use only four coz it looks like you just cut off higher order of Taylor series. But I cannot guarantee it's true.
I am trying to figure out an algorithm for detecting the free surface from a PIV image (see attached). The major problem is that in the flow under consideration gas bubbles are injected into the fluid, these rise up due to buoyancy and tend to sit on top of the surface. I don't want these to be mistaken for the free surface (actually want the '2nd' edge underneath them) - I'm struggling to figure out how to include that in the algorithm.
Ideally, I want an array of x and y values representing coordinates of the free surface (like a continuous, smooth curve).
My initial approach was to scan the picture left to right, one column at a time, find an edge, move to the next column etc... That works somewhat ok, but fails as soon as the bubbles appear and my 'edge' splits in two. So I am wondering if there is some more sophisticated way of going about it.
If anybody have any expertise in the area of image processing/edge detection, any advice would be greatly appreciated.
Typical PIV image
Desired outcome
I think you can actually solve the problem by using morphologic methods.
A = imread('./MATLAB/ZBhAM.jpg');
figure;
subplot 131;
imshow(A)
subplot 132;
B = double(A(:,:,1));
B = B/255;
B = im2bw(B, 0.1);
imshow(B);
subplot 133;
st = strel('diamond', 5);
B = imerode(B, st);
B = imdilate(B, st);
B = imshow(B);
This gives the following result:
As you can see this approach is not perfect mostly because I picked a random value for the threshold in im2bw, if you use an adaptive threshold for the different column of your images you should have something better.
Try to work on your lighting otherwise.
I found that there are some paper said can analysis the gradient histogram
(blur image has gradient follows a heavy-tailed distribution)
or using fft (blur image has lower frequency)
Is there a way to detect if an image is blurry?
to detect blur in image.
But I am not quite sure how to implement it in matlab. How to define the threshold value and so on.
[Gx, Gy] = imgradientxy(a);
G = sqrt(Gx.^2+Gy.^2)
What should I do after running the command and find the G?
What should I do if I wanna plot a graph of number of pixel verse G
I am new to matlab and image processing. Could anyone kindly provide more details of how to implement it
Preparation: we read the cameraman image, which is often used for visualizing image processing algorithms, and add some motion blur.
origIm = imread('cameraman.tif');
littleBlurredIm = imfilter(origIm,fspecial('motion',5,45),'replicate');
muchBlurredIm = imfilter(origIm,fspecial('motion',20,45),'replicate');
which gives us the following images to start with:
To calculate the Laplacian, you can use the imgradient function, which returns magnitude and angle, so we'll simply discard the angle:
[lpOrigIm,~] = imgradient(origIm);
[lpLittleBlurredIm,~] = imgradient(littleBlurredIm);
[lpMuchBlurredIm,~] = imgradient(muchBlurredIm);
which gives:
You can visually see that the original image has very sharp and clear edges. The image with a little blur still has some features, and the image with much blur only contains a few non-zero values.
As proposed in the answer by nikie to this question, we can now create some measure for the blurriness. A (more or less) robust measure would for example be the median of the top 0.1% of the values:
% Number of pixels to look at: 0.1%
nPx = round(0.001*numel(origIm));
% Sort values to pick top values
sortedOrigIm = sort(lpOrigIm(:));
sortedLittleBlurredIm = sort(lpLittleBlurredIm(:));
sortedMuchBlurredIm = sort(lpMuchBlurredIm(:));
% Calculate measure
measureOrigIm = median(sortedOrigIm(end-nPx+1:end));
measureLittleBlurredIm = median(sortedLittleBlurredIm(end-nPx+1:end));
measureMuchBlurredIm = median(sortedMuchBlurredIm(end-nPx+1:end));
Which gives the following results:
Original image: 823.7
Little Blurred image: 593.1
Much Blurred image: 490.3
Here is a comparison of this blurriness measure for different motion blur angles and blur amplitudes.
Finally, I tried it on the test images from the answer linked above:
which gives
Interpretation: As you see it is possible to detect, if an image is blurred. It however appears difficult to detect how strongly blurred the image is, as this also depends on the angle of the blur with relation to the scene, and due to the imperfect gradient calculation. Further the absolute value is very much scene-dependent, so you might have to put some prior knowledge about the scene into the interpretation of this value.
This is a very interesting topic.
Although gradient magnitude can be used as good feature for blur detection but this feature will fail when dealing with uniform regions in images. In other words, this feature will not be able to distinguish between blur and flat regions. There are many other solutions. Some of them detect flat regions to avoid classifying flat regions as blur. if you want more information you can check these links:
You can find many good recent papers in cvpr conference.
Many of them they have websites where they discuss the details and provide the code.
This one http://www.cse.cuhk.edu.hk/leojia/projects/dblurdetect/
is one of the papers that I worked on
you can find the code available.
You can check also other papers in cvpr. most of them they have the code
this is another one
http://shijianping.me/jnb/index.html
I'm working on a PIV-Workflow and I'm currently pre-processing the images. I need to get rid of the perspective distortion in the images. I do have the "image processing toolbox" and the "camera calibrator". I already got rid of the lens distortion with "undistortImage();" and the cameraParams object, which is inferred through a chessboard pattern.
First Question: Is it possible to use the cameraParams object to distort the image perspectively, so that my chessboard is rectified in the image?
Second Question: Since I were not able to use the cameraParams object, I tried to use the transformation functions manually. I tried to use pairs of control-points (with cpselection tool, the original image and a generated chessboard-image) and the fitgeotrans(movingPoints, fixedPoints, 'projective'); function to get my tform-object. However I always get the error message:
Error using fitgeotrans>findProjectiveTransform (line 189)
At least 4 non-collinear points needed to infer projective transform.
Error in fitgeotrans (line 102)
tform = findProjectiveTransform(movingPoints,fixedPoints);
I tried a lot of different pairs of control-points (4 pairs or more). But I'm still getting this error. I believe I must overlook something here.
Any help is appreciated, thank you.
Stephan
If you are using one of the calibration images, then all the information you need is in the cameraParams object.
Let's say you are using calibration image 1, and let's call it I.
First, undistort the image:
I = undistortImage(I, cameraParams);
Get the extrinsics (rotation and translation) for your image:
R = cameraParams.RotationMatrices(:,:,1);
t = cameraParams.TranslationVectors(1, :);
Then combine rotation and translation into one matrix:
R(3, :) = t;
Now compute the homography between the checkerboard and the image plane:
H = R * cameraParams.IntrinsicMatrix;
Transform the image using the inverse of the homography:
J = imwarp(I, projective2d(inv(H)));
imshow(J);
You should see a "bird's eye" view of the checkerboard. If you are not using one of the calibration images, then you can compute R and t using the extrinsics function.
Another way to do this is to use detectCheckerboardPoints and generateCheckerboardPoints, and then compute the homography using fitgeotform.
I want to process an image in matlab
The image consists out of a solid back ground and two specimens (top and bottom side). I already have a code that separate the top and bottom and make it two images. But the part what I don't get working is to crop the image to the glued area only (red box in the image, I've only marked the top one). However, the cropped image should be a rectangle just like the red box (the yellow background, can be discarded afterwards).
I know this can be done with imcrop, but this requires manual input from the user. The code needs to be automated such that it is possible to process more images without user input. All image will have the same colors (red for glue, black for material).
Can someone help me with this?
edit: Thanks for the help. I used the following code to solve the problem. However, I couldn't get rid of the black part right of the red box. This can be fix by taping that part off before making pictures. The code which I used looks a bit weird, but it succeeds in counting the black region in the picture and getting a percentage.
a=imread('testim0.png');
level = graythresh(a);
bw2=im2bw(a, level);
rgb2=bw2rgb(bw2);
IM2 = imclearborder(rgb2,4);
pic_negative = ait_imgneg(IM2);
%% figures
% figure()
% image(rgb2)
%
% figure()
% imshow(pic_negative)
%% Counting percentage
g=0;
for j=1:size(rgb2,2)
for i=1:size(rgb2,1)
if rgb2(i,j,1) <= 0 ...
& rgb2(i,j,2) <= 0 ...
& rgb2(i,j,3) <= 0
g=g+1;
end
end
end
h=0;
for j=1:size(pic_negative,2)
for i=1:size(pic_negative,1)
if pic_negative(i,j)== 0
h=h+1;
end
end
end
per=g/(g+h)
If anyone has some suggestions to improve the code, I'm happy to hear it.
For a straight-forward image segmentation into 2 regions (background, foreground) based on color (yellow, black are prominent in your case), an option can be clustering image color values using kmeans algorithm. For additional robustness you can transform the image from RGB to Lab* colorspace.
The example for your case follows the MATLAB Imape Processing example here.
% read and transform to L*a*b space
im_rgb = double(imread('testim0.png'))./256;
im_lab = applycform(im_rgb, makecform('srgb2lab'));
% keep only a,b-channels and form feature vector
ab = double(lab_I(:,:,2:3));
[nRows, nCols, ~] = size(ab);
ab = reshape(ab,nRows * nCols,2);
% apply k-means for 2 regions, repeat c times, e.g. c = 5
nRegions = 2;
[cluster_idx cluster_center] = kmeans(ab,nRegions, 'Replicates', 5);
% get foreground-background mask
im_regions = reshape(cluster_idx, nRows, nCols);
You can use the resulting binary image to index the regions of interest (or find the boundary) in the original reference image.
Images are saved as matrices. If you know the bounds in pixels of the crop box want to crop you can execute the crop using indexing.
M = rand(100); % create a 100x100 matrix or load it from an image
left = 50;
right = 75;
top = 80;
bottom = 10;
croppedM = M(bottom:top, left:right);
%save croppedm
You can easily get the unknown bounded crop by
1) Contour plotting the image,
2) find() on the result for the max/min X/ys,
3) use #slayton's method to perform the actual crop.
EDIT: Just looked at your actual image - it won't be so easy. But color enhance/threshold your image first, and the contours should work with reasonable accuracy. Needless to say, this requires tweaking to your specific situation.
since you've already been able to seperate the top and bottom, and also able to segment the region you want (including the small part of the right side that you don't want), I propose you just add a fix at the end of the code via the following.
after segmentation, sum each column Blue intensity value, so that you are compressing the image from 2d to 1d. so if original region is width=683 height=59, the new matrix/image will be simply width=683 height=1.
now, you can apply a small threshold to determine where the edge should lie, and apply a crop to the image at that location. Now you get your stats.