Finding the length of an object in an image

As part of our project we have to find the dimensions of a given object in a particular image, e.g. the dimensions of a sunken ship underwater. This is totally new to me; a friend told me it is possible in MATLAB. Kindly help me out.

I think you should look into image blobs and edge detection. That's where I would start.
Are you already set on MATLAB? If you can use C#, I would look into the AForge.NET image processing library:
http://www.aforgenet.com/projects/iplab/
I have used AForge before to identify "blobs" in images and perform other image processing operations.

If you have not yet settled on MATLAB, then AForge.NET or Magick.NET (from ImageMagick) can be tried.
To identify the dimensions of the object, we have to think through the manual process of doing the same. How are we able to identify a ship in water in an image? How is the object different from the surrounding area in the image?
From that, you may try to identify the ship as a blob and work on the blob. Sometimes you may not be able to identify the ship as a blob, probably due to noise in the surroundings. Find a way to remove that noise, or differentiate the object further from its surroundings with erosion, dilation, or a combination of the two.
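As an illustration of that blob pipeline, here is a minimal sketch with OpenCV in Python (the MATLAB Image Processing Toolbox has direct equivalents such as graythresh, imopen, and regionprops). "ship.png" is a hypothetical input, and the result is in pixels; converting to real-world units still needs a known scale reference in the scene.

import cv2

img = cv2.imread("ship.png", cv2.IMREAD_GRAYSCALE)

# Separate the object from the surrounding water; Otsu picks the threshold
# (this assumes a dark object on a lighter background).
_, mask = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

# Erode-then-dilate (an "opening") to remove small noise blobs.
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 5))
mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)

# Treat the largest remaining contour as the ship blob and measure it.
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
ship = max(contours, key=cv2.contourArea)
x, y, w, h = cv2.boundingRect(ship)
print(f"object is {w} x {h} pixels")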

Related

Caffe - training autoencoder with image data/image label pairs

I am very unfamiliar with Caffe. My task is to train an autoencoder net on image pairs, given in .tif format, where one is a grayscale image of nerves and the other is the corresponding binary mask showing whether a certain structure is present in the image. I have these in the same "train" folder. What I would like to accomplish is a meaningful experiment with these images (segmentation, classification; it is not specified). My first problem is that I do not know how to feed the images into the net without an existing train.txt. Can I use the images directly, or is another format like LMDB or HDF5 needed? Any suggestion is appreciated.
You can accomplish it with simple classification using an existing network (AlexNet, GoogLeNet, LeNet). You can use just the binary mask or the grayscale image together with the class name to do this. NVIDIA DIGITS is a good graphical tool for building the paired dataset and running the training.
Please see this link:
https://developer.nvidia.com/digits
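If you end up with plain Caffe rather than DIGITS, its ImageData layer reads training samples from exactly such a listing of "path label" lines, so you can generate the missing train.txt yourself. A minimal sketch, assuming a hypothetical <name>.tif / <name>_mask.tif naming convention and taking "structure present" to mean the mask has any nonzero pixel:

import os
import numpy as np
from PIL import Image

train_dir = "train"
with open("train.txt", "w") as listing:
    for name in sorted(os.listdir(train_dir)):
        if not name.endswith(".tif") or name.endswith("_mask.tif"):
            continue
        mask_path = os.path.join(train_dir, name[:-4] + "_mask.tif")
        # Class 1 if the binary mask marks the structure anywhere, else 0.
        label = int(np.array(Image.open(mask_path)).any())
        listing.write(f"{os.path.join(train_dir, name)} {label}\n")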

Image matching algorithm: Get the sizes of the same object from two images using SIFT or SURF?

I am working on an image matching project. First we use SURF to find matching pairs of pictures; there will be at least one object that appears in both images. Now, for that shared object, I need to find its size in each of the two pictures. Relative size is enough.
Both SIFT and SURF are only local feature point descriptors. I get a bunch of descriptors and associated feature point locations, but how do I use this information to determine an object's size? I am thinking of using contours: if I could correctly associate a contour across the two images, then I could find the object sizes easily by calculating the contour point locations. But how do I associate contours?
I assume there must be some way to apply SIFT or SURF to get object information, since people do object tracking using SIFT... but after searching for a long time I still couldn't find anything useful.
Any help would be appreciated! Thanks in advance!
The SIFT/SURF detector gives each feature a canonical scale. By simply comparing the ratios of the scales of the matching features in each image, you should be able to determine their relative size.
You should already be comparing the scales of the potential matches in order to discard spurious matches when there is transformation inconsistency.
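For illustration, a minimal OpenCV/Python sketch of that scale-ratio idea ("a.png" and "b.png" are hypothetical inputs); taking the median over all surviving matches keeps single bad matches from skewing the estimate:

import cv2
import numpy as np

img1 = cv2.imread("a.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("b.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

matcher = cv2.BFMatcher(cv2.NORM_L2)
matches = [m for m, n in matcher.knnMatch(des1, des2, k=2)
           if m.distance < 0.75 * n.distance]  # Lowe's ratio test

# Each keypoint's .size is its canonical scale; the median ratio over the
# matches estimates the relative object size (image2 / image1).
ratios = [kp2[m.trainIdx].size / kp1[m.queryIdx].size for m in matches]
print("relative size:", np.median(ratios))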

What is the fastest way to rotate a jpg image file?

I am working on some batch routines to manage large libraries of JPG files. I have a nice routine that quickly downsizes 4 MB+ files to 40 KB+. Using CCR.Exif, I can determine whether an image needs to be rotated. My problem is that I can't find any code to rotate the image before I save it. I really need to be able to do this without incurring the overhead of bringing the image to screen.
I'm using the built-in jpeg.pas; I found another library by Gabriel Corneanu at CodeCentral, but it hasn't been updated for DXE2. All I need to do is a 90° rotation.
Any help will be greatly appreciated!
JPGs are compressed and must be rendered before you can work with the image data. Even if it is to a non-visible canvas, they still need to be loaded into a component that renders them. Then you can use Windows API calls to rotate the image by directly accessing the canvas. I haven't rotated an image this way before, but I have manipulated images in other ways by accessing the canvas.
GR32 and EFG are both good sites with several components and algorithms. Here is one example on EFG's site that rotates an image. The code is Delphi 3, but it should still work fine for image manipulation.
EFG Example with Source
TImage32 has a method to rotate the image 90 degrees as well. See TImage32.Bitmap.Rotate90. TImage32 is part of the GR32 library and has been updated for Delphi-XE2.
svn co https://graphics32.svn.sourceforge.net/svnroot/graphics32/trunk graphics32
Also see: GR32 Homepage
If you need to rotate a JPEG in 90-degree steps, then look for lossless transformations.
For example, irfanview.com has a special plugin DLL for it. It does not have a public API, but maybe you can ask the author for it, or reverse-engineer it with a debugger and CFF Explorer.
A lot of discussion can be found just by googling, including discussion of how it is implemented.
https://www.google.ru/search?client=opera&q=lossless+jpeg+rotation
Component catalogues list such components, e.g.
http://www.torry.net/quicksearchd.php?String=jpeg+lossless&Title=No
This will not work for rotations finer than 90-degree steps, but for orthogonal turns keep searching for lossless JPEG transformations.
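For example, jpegtran from the libjpeg tools does exactly this kind of lossless orthogonal transform on the DCT coefficients, without ever decoding to pixels. A minimal sketch of the invocation (shown in Python; from Delphi the same command line can be run via CreateProcess):

import subprocess

def rotate_jpeg_lossless(src: str, dst: str) -> None:
    # -perfect fails rather than trims when edge blocks prevent an exact turn;
    # -copy all preserves EXIF markers such as the orientation tag.
    subprocess.run(
        ["jpegtran", "-rotate", "90", "-perfect", "-copy", "all",
         "-outfile", dst, src],
        check=True,
    )

rotate_jpeg_lossless("in.jpg", "rotated.jpg")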
The fastest way to rotate a JPEG image would be to write a new / alternate pixel pump for the JPEG decoder that reads and decodes the JPEG pixels left to right (x,y), and writes them to bitmap memory as (y,x) - that is, writing one pixel per scanline at the same offset, instead of the normal mode of writing one pixel per column on the same scanline.
Anything else will make multiple passes over the bitmap data.
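As an illustration of that transposed write (over already-decoded pixels; the real win comes from doing the same write inside the decoder's scanline loop), a minimal numpy sketch:

import numpy as np

def rotate90_cw(pixels: np.ndarray) -> np.ndarray:
    # pixels is H x W (x channels); the output is W x H.
    h, w = pixels.shape[:2]
    out = np.empty((w, h) + pixels.shape[2:], dtype=pixels.dtype)
    for y in range(h):                 # read one decoded scanline at a time...
        out[:, h - 1 - y] = pixels[y]  # ...and write it as a column
    return out                         # equivalent to np.rot90(pixels, k=-1)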

I want to learn how images are composed

I really want to learn how an image is composed (i.e. is it an array of bits, or what? How is the color composed for each pixel? etc.). Can you point me in the right direction? I'm not really sure what to search for.
Thanks a lot in advance.
So what I want to do is to be able to modify the picture programmatically, i.e. convert it to black and white, scale it, crop it, etc., and for this I would really like to learn how the image is composed instead of just finding these algorithms online.
You don't always need to know the low-level mathematical details (matrices, quantisation, Fourier transforms, etc.) of graphics formats to manipulate images.
For all the things you want to do, you can use the appropriate libraries.
For example, in PHP the libraries frequently used to manipulate images are:
GD - http://php.net/manual/en/book.image.php
ImageMagick - http://php.net/manual/en/book.imagick.php
It depends on the image format that you're interested in manipulating. Each format (more or less) is composed in a different manner, and based on that has a different set of capabilities for manipulating the image.
Different sets of actions on an image favor different image formats, as does the type of image you want to manipulate.
Provide more details about what you want to do with the image and I'm sure someone else will come along and tell you which formats are best and how they are handled.
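To make the pixel-level view concrete, here is a minimal sketch in Python with Pillow and numpy (rather than PHP), where "photo.jpg" is a hypothetical input. Once decoded, the picture is just an H x W x 3 array of 0-255 red/green/blue values per pixel, and the operations you list are plain array arithmetic:

import numpy as np
from PIL import Image

pixels = np.asarray(Image.open("photo.jpg").convert("RGB"), dtype=np.float32)

# Black and white: weight the channels by perceived luminance (ITU-R 601).
gray = pixels @ np.array([0.299, 0.587, 0.114], dtype=np.float32)

# Crop: plain array slicing, rows then columns.
cropped = pixels[100:300, 50:250]

# Scale to half size (nearest neighbour): keep every second row and column.
half = pixels[::2, ::2]

Image.fromarray(gray.astype(np.uint8)).save("gray.png")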

Detecting if two images are visually identical

Sometimes two image files may be different at the file level even though a human would consider them perceptually identical. Given that, suppose you have a huge database of images and you wish to know whether a human would think some image X is present in the database. If all images had a perceptual hash / fingerprint, one could hash image X and it would be a simple matter to see whether it is in the database.
I know there is research around this issue, and some algorithms exist, but is there any tool, like a UNIX command-line tool or a library, that I could use to compute such a hash without implementing some algorithm from scratch?
edit: relevant code from findimagedupes, using ImageMagick
try $image->Sample("160x160!");          # shrink; "!" ignores aspect ratio
try $image->Modulate(saturation=>-100);  # remove colour saturation
try $image->Blur(radius=>3,sigma=>99);   # smooth away fine detail
try $image->Normalize();                 # stretch the contrast
try $image->Equalize();                  # flatten the histogram
try $image->Sample("16x16");             # reduce to a 16x16 thumbnail
try $image->Threshold();                 # binarise to black and white
try $image->Set(magick=>'mono');         # 1 bit per pixel
($blob) = $image->ImageToBlob();         # 16x16x1 bit = 256-bit fingerprint
edit: Warning! The ImageMagick $image object seems to contain information about the creation time of the image file that was read in. This means the blob you get will differ even for the same image if it was retrieved at a different time. To make sure the fingerprint stays the same, use $image->getImageSignature() as the last step.
findimagedupes is pretty good. You can run "findimagedupes -v fingerprint images" to have it print the perceptual hash, for example.
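If a library is more convenient than a command-line tool, the third-party Python ImageHash package implements several perceptual hashes built on the same shrink-and-compare idea. A minimal sketch:

import imagehash
from PIL import Image

# 64-bit perceptual (DCT-based) hashes of two hypothetical files.
h1 = imagehash.phash(Image.open("a.jpg"))
h2 = imagehash.phash(Image.open("b.jpg"))

# Subtraction gives the Hamming distance between the hashes; a small
# distance means the images are perceptually near-identical.
print(h1 - h2)  # 0 = identical fingerprints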
Cross-correlation or phase correlation will tell you if the images are the same, even with noise, degradation, and horizontal or vertical offsets. Using the FFT-based methods will make it much faster than the algorithm described in the question.
The usual algorithm doesn't work for images that differ in scale or rotation, though. You could pre-rotate or pre-scale them, but that's really processor-intensive. Apparently you can also do the correlation in a log-polar space, which makes it invariant to rotation, translation, and scale, but I don't know the details well enough to explain that.
MATLAB example: Registering an Image Using Normalized Cross-Correlation
Wikipedia calls this "phase correlation" and also describes making it scale- and rotation-invariant:
The method can be extended to determine rotation and scaling differences between two images by first converting the images to log-polar coordinates. Due to properties of the Fourier transform, the rotation and scaling parameters can be determined in a manner invariant to translation.
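For the translation-only case, phase correlation is only a few lines with numpy; a minimal sketch, assuming two equally sized grayscale arrays:

import numpy as np

def phase_correlation(a: np.ndarray, b: np.ndarray):
    # Cross-power spectrum, normalised so only phase differences remain.
    cross = np.fft.fft2(a) * np.conj(np.fft.fft2(b))
    cross /= np.abs(cross) + 1e-12
    corr = np.abs(np.fft.ifft2(cross))
    # The peak location is the (dy, dx) shift between the images; its height
    # approaches 1.0 when b is a pure translation of a.
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    return (dy, dx), corr[dy, dx]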
A colour histogram is good for the same image that has been resized, resampled, etc.
If you want to match different people's photos of the same landmark, it's trickier: look at Haar classifiers. OpenCV is a great free library for image processing.
I don't know the algorithm behind it, but Microsoft Live Image Search just added this capability. Picasa also has the ability to identify faces in images, and groups faces that look similar. Most of the time, it's the same person.
Some machine learning technology like a support vector machine, neural network, naive Bayes classifier or Bayesian network would be best at this type of problem. I've written one each of the first three to classify handwritten digits, which is essentially image pattern recognition.
Resize the image to a 1x1 pixel... if they are exact, there is a small probability they are the same picture...
Now resize it to a 2x2 pixel image; if all 4 pixels are exact, there is a larger probability they are the same...
Then 3x3; if all 9 pixels are exact... good chance, etc.
Then 4x4; if all 16 pixels are exact... better chance.
etc...
Doing it this way, you can make efficiency improvements: if the 1x1 pixel grid is off by a lot, why bother checking the 2x2 grid?
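A minimal Pillow sketch of that coarse-to-fine check (taking "exact" literally, as above; real-world photos would need a small tolerance instead):

from PIL import Image, ImageChops

def probably_same(path1: str, path2: str, max_side: int = 8) -> bool:
    a = Image.open(path1).convert("RGB")
    b = Image.open(path2).convert("RGB")
    side = 1
    while side <= max_side:
        # Compare thumbnails; getbbox() is None only when they match exactly.
        ta, tb = a.resize((side, side)), b.resize((side, side))
        if ImageChops.difference(ta, tb).getbbox() is not None:
            return False  # a pixel differs at this level, so stop early
        side *= 2
    return True  # identical at every level up to max_side x max_side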
If you have lots of images, a color histogram could be used to get a rough measure of closeness before doing a full image comparison of each image against every other one (i.e. O(n^2)).
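A minimal OpenCV sketch of such a prefilter; only pairs whose histogram correlation clears some threshold (say 0.9) would go on to the full comparison:

import cv2

def histogram_similarity(path1: str, path2: str) -> float:
    # 8x8x8-bin colour histograms, normalised, compared by correlation;
    # 1.0 means identical histograms.
    hists = []
    for path in (path1, path2):
        hist = cv2.calcHist([cv2.imread(path)], [0, 1, 2], None,
                            [8, 8, 8], [0, 256] * 3)
        hists.append(cv2.normalize(hist, hist).flatten())
    return cv2.compareHist(hists[0], hists[1], cv2.HISTCMP_CORREL)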
There is DPEG, "The" Duplicate Media Manager, but its code is not open. It's a very old tool - I remember using it in 2003.
You could use diff to see if they are REALLY different... I guess that will remove lots of useless comparisons. Then, for the algorithm, I would use a probabilistic approach: what are the chances that they look the same? I'd base that on the RGB values of each pixel. You could also bring in other metrics, such as luminosity.
