Looking for an Image Comparison/Pattern Recognition Library - image

The end goal would be to see if [image] contains [image].
The comparison needs to tolerate minor distortion, scaling, color differences, rotation, and brightness differences.
It can be in any language, really. I will be running this algorithm as a web service, so it's no problem if I have to write this portion in C, C++, Python, etc.

You should probably take a look at OpenCV and VLfeat.

Object detection can be performed, for example, using:
RapidMiner IMMI (an image mining extension for one of the leading open-source data-mining platforms)
BoofCV (using SURF feature detection)

How about ImageMagick? It's not a library per se, but if you can provide shell access to your environment it's pretty easy to use.
You would most probably be interested in the compare command.
EDIT: ImageMagick does contain tools for sub-image search, such as the -subimage-search option of compare.
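To give a rough idea of what a sub-image search does, here is a minimal brute-force sketch in Python. It is not ImageMagick's implementation: the function name find_subimage and the sum-of-squared-differences score are assumptions made for illustration, and images are plain lists of grayscale rows.

```python
def find_subimage(image, template):
    """Slide template over image and return (score, x, y) of the best match.

    The score is the sum of squared pixel differences, so 0 means an
    exact match. Both arguments are lists of rows of grayscale values.
    """
    image_h, image_w = len(image), len(image[0])
    templ_h, templ_w = len(template), len(template[0])
    best = None
    for y in range(image_h - templ_h + 1):
        for x in range(image_w - templ_w + 1):
            score = sum(
                (image[y + j][x + i] - template[j][i]) ** 2
                for j in range(templ_h)
                for i in range(templ_w)
            )
            if best is None or score < best[0]:
                best = (score, x, y)
    return best


image = [
    [0, 0, 0, 0, 0],
    [0, 0, 9, 8, 0],
    [0, 0, 7, 6, 0],
    [0, 0, 0, 0, 0],
]
template = [
    [9, 8],
    [7, 6],
]
match = find_subimage(image, template)
```

A search like this copes with small brightness or color differences (the score just gets worse), but not with scaling or rotation. For those, the feature-based approaches mentioned in the other answers (for example SURF) are the more likely fit.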

Look at this - http://gallery.azureml.net/MachineLearningAPI/02ce55bbc0ab4fea9422fe019995c02f - it supports OCR. This also supports multiple languages and distortion - http://www.projectoxford.ai/doc/vision/OCR

Related

how to extract an object from an image

I want to extract an object such as a man, a car, or something like that from an image. The image is just an ordinary image, not a medical image or another type made for a specific purpose.
I have searched for a long time and found that automatic image segmentation algorithms just segment the image into a set of regions or give the contours in the image, not a semantic object. So I turned to interactive image segmentation algorithms and found some popular ones such as interactive graph cuts and SIOX. I think these algorithms meet my needs.
Furthermore, I also downloaded two interactive image segmentation tools: the first one is the interactive segmentation tool, the second one is the interactive segmentation tool-box.
So my questions are:
1. Is an interactive image segmentation algorithm the right solution for my task, given that performance is the most important factor?
2. If I want to use an automatic image segmentation algorithm, what should I do next?
Any suggestions will be appreciated.
If you want to pick out an object from a single static image with just a few scribbles, I recommend reading
'Closed-form solution to image matting',
or 'Spectral matting',
or 'Lazy snapping',
but in my tests the last doesn't perform as well as the first two methods when dealing with fine structures like hair.
You can find their MATLAB source code very easily with Google.
However, the first two methods aren't very pleasant to use; I think you'll need to modify them a lot to make them easy to use. Their main problem, IMHO, is that they require very careful scribbles on the image: if you draw extra scribbles or put them in the wrong positions, you'll ruin your object cut-out.
Apart from these, you may try Bayesian matting, Poisson matting, etc., which all require a helper image called a trimap, and that is really hard to draw.
Extracting objects from an image, especially a photograph, is not as easy as you might think. You may want to take a look at the OpenCV project.
OpenCV
Other than OpenCV, I would suggest looking at ITK. It is very popular in medical image analysis projects, because there it is known that semi-automatic segmentation tools provide the best results. I think that the methods apply to natural images as well.
Try looking at tools like livewire segmentation and level-set based image segmentation. ITK has several demos that allow you to play with these tools on your own images. A demo application such as this is part of the open source distribution, but it can also be downloaded directly from the ITK servers (look around for instructions).
If this is a business case, you'd better look for companies specialized in "video content analysis". I mean it: reliable people and vehicle detection aren't a single man's project.
General-purpose segmentation tools won't do the trick because they have no notion of what a man or a car looks like. All they are designed to do is find uniform regions in an image.
It is quite late, but there is an algorithm called connected component labeling, which you may find useful.
Here is the wiki link for the algorithm.
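As a minimal sketch of what connected component labeling does, assuming a binary image stored as a list of rows and 4-connectivity: the version below uses a breadth-first flood fill, not the classic two-pass union-find formulation, and the name label_components is made up for this example.

```python
from collections import deque


def label_components(grid):
    """Label 4-connected foreground regions of a binary image.

    grid is a list of rows; non-zero cells are foreground.
    Returns (labels, count): labels has the same shape as grid,
    with 0 for background and 1..count for the components.
    """
    height = len(grid)
    width = len(grid[0]) if height else 0
    labels = [[0] * width for _ in range(height)]
    count = 0
    for y in range(height):
        for x in range(width):
            if not grid[y][x] or labels[y][x]:
                continue  # background, or already part of a component
            count += 1
            labels[y][x] = count
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and grid[ny][nx] and not labels[ny][nx]:
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


image = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1],
    [0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0],
]
labels, count = label_components(image)
```

Note that this only groups pixels that are already marked as foreground. It still needs a thresholding or segmentation step in front of it, and it has no notion of what the object is.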

Is there a large collection of filter functions, with descriptions, that I can look at?

What I want is a place where I can find a list of all the filters used in image processing and how they can be used. The Image Processing Toolbox in MATLAB is one alternative, but I am not very fond of it.
Please suggest some links.
Thanks
Here are a few resources for image noise removal methods. These are mostly research based. The second link has working C++ code that can be downloaded.
Recent trends in denoising - 2007
NL Means denoising
A comprehensive list of the more standard methods (which can easily be implemented using OpenCV's built-in functions if you're not fond of MATLAB) is here. Off the cuff (without knowing the type of noise you are dealing with), I'd say go for the bilateral filter. It's edge-preserving in most cases and comes without any implementation headache in most computer vision libraries.
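To show why the bilateral filter preserves edges, here is a small pure-Python sketch for grayscale images stored as lists of rows. The parameter names and default values are assumptions chosen for this example; in practice a library routine such as OpenCV's bilateralFilter would be the sensible choice.

```python
import math


def bilateral_filter(image, radius=2, sigma_space=2.0, sigma_range=25.0):
    """Smooth a grayscale image while keeping strong edges.

    Each output pixel is a weighted mean of its neighbours. The weight
    is the product of a spatial Gaussian (distance in pixels) and a
    range Gaussian (difference in intensity), so neighbours across a
    strong edge contribute almost nothing.
    """
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            center = image[y][x]
            total = 0.0
            weight_sum = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    ny, nx = y + dy, x + dx
                    if not (0 <= ny < height and 0 <= nx < width):
                        continue  # ignore neighbours outside the image
                    value = image[ny][nx]
                    spatial = math.exp(-(dx * dx + dy * dy) / (2 * sigma_space ** 2))
                    tonal = math.exp(-((value - center) ** 2) / (2 * sigma_range ** 2))
                    total += spatial * tonal * value
                    weight_sum += spatial * tonal
            out[y][x] = total / weight_sum
    return out


# A sharp vertical edge: the filter should leave it essentially intact.
step = [[10, 10, 10, 200, 200, 200] for _ in range(4)]
smoothed = bilateral_filter(step)

# A single noisy pixel on a flat background gets pulled toward its neighbours.
noisy = [[10] * 5 for _ in range(5)]
noisy[2][2] = 20
denoised = bilateral_filter(noisy)
```

The choice of sigma_range decides what counts as an edge: intensity differences much larger than it are treated as edges and kept, smaller ones are treated as noise and smoothed.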

Plotting GPS info with Ruby

I'm looking for ways to programmatically convert my GPS logs to images and would like to do this in Ruby... if that's an acceptable tool. I have no GIS background whatsoever, but as a programmer I think it's an interesting problem to look at.
Here is what I have come up with so far. First you'll need some kind of graphing library. I went for gnuplot as I found a Ruby binding for that one, but R seems hot these days. I created a small script that converts a GPX file and feeds the data to gnuplot, resulting in something like this: http://dl.dropbox.com/u/45672/gpslog.png
This looks fine, but gnuplot really seems to be a tool for creating graphs, not for displaying spatial data. Is this the way to do it, or are there much better solutions available?
Here is another example, any idea how you build stuff like this?
Answer to First Question
Since you stated that you "would like to do this in Ruby...if that's an acceptable tool", I'll go out on a limb and assume that you might be open to a non-Ruby solution if it meets all of your other requirements.
I would recommend Python primarily because in the first chapter of Beginning Python Visualization, Shai Vaingast—the author—goes through an example of reading in GPS data from a GPS receiver and then plots the results. If you're open to a Python-based solution, this book would be a great resource.
Here are the Python packages that are used to read and plot the GPS data:
pySerial to read the GPS data in from the serial port
matplotlib to plot the data. "matplotlib is a library for making 2D plots of arrays in Python. Although it has its origins in emulating the MATLAB® graphics commands, it is independent of MATLAB, and can be used in a Pythonic, object oriented way."
Here's an example figure created by Shai Vaingast showing off a few of the different capabilities of matplotlib for plotting GPS data.
If you are not open to a Python solution, and would prefer Ruby—for whatever reason—I understand. I tried to search for an equivalent of matplotlib in Ruby, but I didn't find an equivalent package.
Answer to Last Question
Here is another example, any idea how you build stuff like this?
Looking at the lower, right-hand corner, it appears that DISLIN was used to create that image. While DISLIN is available for quite a few programming languages, the DISLIN software requirements page does not show that Ruby is supported.
According to the DISLIN website,
DISLIN is a high-level plotting library for displaying data as curves, polar plots, bar graphs, pie charts, 3D-color plots, surfaces, contours and maps.
The software is available for several C, Fortran 77 and Fortran 90/95 compilers on the operating systems UNIX, Linux, FreeBSD, OpenVMS, Windows, Mac OSX and MS-DOS. DISLIN programs are very system-independent, they can be ported from one operating system to another without any changes.
For some operating systems, the programming languages Perl, Python, Java and the C/C++ interpreter Ch are also supported by DISLIN. The DISLIN interpreter DISGCL is available for all supported operating systems. See a complete list of the supported operating systems and compilers.
Do you really want images, or just a way to visualize the data? How about using the google maps api?
Check out this link:
http://google-dox.net/O.Reilly-Google.Maps.Hacks/0596101619/googlemapshks-CHP-4-SECT-10.html
I think that using gnuplot from any programming language is a good starting approach.
However, I strongly suggest adding the set size ratio -1 gnuplot command somewhere in your code, as this will make the x and y axis scales equal in the plot, which is extremely important.
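A minimal sketch of that setup, written in Python only for illustration (the same script text could just as well be produced from Ruby). It assumes a GPX 1.1 file with trkpt elements; the function names, the output file name and the sample data are made up for this example.

```python
import xml.etree.ElementTree as ET

# Namespace used by GPX 1.1 files.
GPX_NS = "{http://www.topografix.com/GPX/1/1}"


def read_track_points(gpx_text):
    """Return the track points of a GPX document as (lon, lat) tuples."""
    root = ET.fromstring(gpx_text)
    return [
        (float(point.get("lon")), float(point.get("lat")))
        for point in root.iter(GPX_NS + "trkpt")
    ]


def gnuplot_script(points, output="track.png"):
    """Build a gnuplot script that draws the track with equal axis scales."""
    lines = [
        "set terminal png size 800,600",
        f"set output '{output}'",
        "set size ratio -1",  # same unit length on the x and y axes
        "unset key",
        "plot '-' using 1:2 with lines",
    ]
    lines += [f"{lon} {lat}" for lon, lat in points]
    lines.append("e")  # ends the inline data block
    return "\n".join(lines) + "\n"


SAMPLE_GPX = """<?xml version="1.0"?>
<gpx version="1.1" xmlns="http://www.topografix.com/GPX/1/1">
  <trk><trkseg>
    <trkpt lat="52.0" lon="4.0"/>
    <trkpt lat="52.001" lon="4.002"/>
  </trkseg></trk>
</gpx>"""

points = read_track_points(SAMPLE_GPX)
script = gnuplot_script(points)
```

The resulting text can be piped into gnuplot's standard input. One caveat: a degree of longitude covers less ground than a degree of latitude away from the equator, so for a faithful picture the coordinates would first have to be projected to a metric grid.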
You could also augment the line with very small point markers equally spaced in time (assuming you have time information in your data, or at least you know that rows are sampled at regular time intervals), so you get a feel of the speed of the movement, which is otherwise lost (i.e., large-spaced point markers on the line mean faster movement). Obviously you should pick a time interval between point markers that makes them appropriately spaced, or compute such time interval automatically: i.e. by computing the length of your curve, converting it in pixel units, and dividing by anything between 10 and 100, to get the total number of points you want to place. The time interval is then given by the total time of the track divided by such number of points. This should work robustly for reasonably regular movements.
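The marker spacing described above could be computed roughly as follows. This is a sketch under stated assumptions: the track is already converted to pixel coordinates and sorted by time, and the names marker_interval, marker_positions and pixels_per_marker are invented for the example.

```python
import math


def marker_interval(track, pixels_per_marker=50.0):
    """Return (time interval, marker count) for a track.

    track is a list of (t_seconds, x_px, y_px) tuples sorted by time.
    pixels_per_marker is the average on-screen spacing you want,
    somewhere between 10 and 100 as suggested above.
    """
    length = sum(
        math.hypot(x1 - x0, y1 - y0)
        for (_, x0, y0), (_, x1, y1) in zip(track, track[1:])
    )
    count = max(1, round(length / pixels_per_marker))
    total_time = track[-1][0] - track[0][0]
    return total_time / count, count


def marker_positions(track, interval):
    """Interpolate the position at every multiple of interval along the track."""
    markers = []
    start = track[0][0]
    k = 0  # index of the next marker; avoids accumulating float error
    for (t0, x0, y0), (t1, x1, y1) in zip(track, track[1:]):
        while start + k * interval <= t1:
            t = start + k * interval
            f = (t - t0) / (t1 - t0) if t1 > t0 else 0.0
            markers.append((x0 + f * (x1 - x0), y0 + f * (y1 - y0)))
            k += 1
    return markers


# Slow for the first 10 seconds (100 px), then three times as fast (300 px).
track = [(0, 0, 0), (10, 100, 0), (20, 100, 300)]
interval, count = marker_interval(track)
markers = marker_positions(track, interval)
```

In this example the markers end up 25 px apart on the slow segment and 75 px apart on the fast one, which is the visual cue for speed.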
Another option is to use a different charting system than gnuplot, which is powerful but a bit old. Options known to me include:
gruff, which is for Ruby but seems to lack 2D plotting abilities (which is an obvious requirement).
XML/SWF Charts, which is powerful and flexible, but commercial. In this case, you would use Ruby or any other programming language to generate an XML file which then gets rendered as an interactive graph.
Google Chart API, which can return chart images over the web when you send an appropriate GET or POST request from your code. In this case, you are interested in the lxy chart type, possibly combined with a scatter chart for the point markers.
The third option seems the most fun.
GDAL is a very popular open source GIS toolkit, and there are GDAL Ruby bindings. If you want map data, OpenStreetMap is very useful. Plotting OSM data together with the GPS track will give pretty nice results. The GDAL/OGR API tutorial is here.
If you want to look more into R, there are Ruby bindings for that too, but there has been no activity on the project for over a year:
http://github.com/alexgutteridge/rsruby
Maybe you have heard of Processing already but have you heard of Ruby-Processing ?
From the Ruby-Processing readme:
Ruby-Processing is a Ruby wrapper for the Processing code art framework.
…
If some quality time with Ruby is your idea of a pleasant afternoon, or you harbor ambitions of entering the fast-paced and not altogether cutthroat world of Code Art, then Ruby-Processing is probably something you should try on for size.
…
Processing is an MIT-developed framework for making little code artifacts, animations, visualizations, and the like, developed originally by Ben Fry and Casey Reas, supported by a small army of open-source contributors. Processing has become a sort of standard for visually-oriented programming, strongly influencing the designs of Nodebox, Shoes, Arduino, and other kindred projects

Java or C for image processing

I am looking in to learning a programming language (take a course) for use in image analysis and processing. Possibly Bioinformatics too. Which language should I go for? C or Java? Other languages are not an option for me. Also please explain why either of the languages is a better option for my application.
You have to balance raw processing power and developer time. Java is getting pretty fast too and if you are finished a couple of days early, you have more time to process the data.
It all depends on volume.
More importantly, I suggest you look for the libraries and frameworks which already exist, see which fits closest to what needs to be done, and choose whatever language the library was written in, be it C, Java or Fortran.
For Java I found BioJava.org as a starting point.
Java isn't too bad for image processing. If you manage your source objects appropriately, you'll have a chance of getting reasonable performance out of it. Some of the things I like about Java that relate to imaging:
Java Advanced Imaging
2D Graphics utilities (take a look at BufferedImages)
ImageJ, etc
Get it to work with JAMA
Ask someone in the field you're working in (ie, bioinformatics)
For solar images, the majority of the work is done in IDL, Fortran, Matlab, Python, C or Perl (PDL). (Roughly in that order ... IDL is definitely first, as the majority of the instrument calibration software is written in IDL)
Because of this, there's a lot of toolkits already written in those languages for our field. Frequently, with large reference data sets, the PI releases some software package as an example of how to interpret / interact with the data format. I can only assume that Bioinformatics would be similar.
If you end up going a different route than the rest of the field, you're going to have a much harder time working with other scientists as you can't share code as easily.
Note -- There are a number of the visualization tools that have been released in our field that were written in Java, but they assume that the images have already been prepped by some other process.
The most popular computer vision (image processing, image analysis) library is OpenCV, which is written in C++ but can also be used with Python and Java (the official OpenCV4Android and the unofficial JavaCV).
There are bioinformatics applications that are basically image processing, so OpenCV will take care of those. But there are also some that are not; they are based on machine learning, for example. So if you need something that is not image/video related, you will need another, bioinformatics-oriented library. OpenCV also has a machine learning module, but it is more focused on computer vision.
About C vs Java, most has been said in the other answers. I should add that these libraries are now C++ based and not plain C. If your applications have real-time processing needs, C++ will probably be better for that; if not, Java will be more than enough, as it is friendlier.
Ideally, you would use something like Java or (even better) Python for "high-level" stuff, and compile in C the routines that require a lot of processing power (for instance using Cython, etc).
Some scientific libraries exist for Python (SciPy and NumPy), and they are a good start, although it isn't yet straightforward to combine Python and C (you need to tweak things a bit).
Just my two pence worth: Java doesn't allow the use of pointers, as opposed to C/C++ or C#. So if you are going to manipulate pixels directly, i.e. write your own image processing functions, they will be much slower than the equivalent in C++. On the other hand, C++ is a total nightmare of a language compared to Java. It will take you at least twice as long to write the equivalent bit of code in C++. So with all the productivity gain you can probably afford to buy a computer that makes up for the difference in runtime ;-)
I know other languages aren't an option for you, but personally I can highly recommend C# for image processing or computer vision: it allows pointers, and hence image processing functions in C# are only half as slow as in C++ (an acceptable trade-off, I think), and it has excellent integration with native C++ and a good wrapper library for OpenCV.
Disclaimer: I work for TunaCode.
If you have to make a choice between different languages to get started with image processing, I would recommend starting with C++. You get raw pointer access, which is a must if you want to operate on individual pixels.
Next, what kind of imaging are you interested in? Just-for-fun image filters, or some heavy stuff like motion estimation, tracking, detection, etc.? For the latter I would recommend you take a look at CUVILib, since sooner rather than later you will need performance in your imaging functionality, and that's what CUVI provides. You can use it standalone if it serves your purposes, or you can combine it with other libraries like Intel IPP, ITK, OpenCV, etc.

How does a marker-based augmented reality algorithm (like ARToolkit's) work?

For my job I've been using a Java version of ARToolkit (NyARToolkit). So far it has proven good enough for our needs, but my boss is starting to want the framework ported to other platforms such as the web (Flash, etc.) and mobile devices. While I suppose I could use other ports, I'm increasingly annoyed by not knowing how the kit works and, beyond that, by some of its limitations. Later I'll also need to extend the kit's abilities to add things like interaction (virtual buttons on cards, etc.), which as far as I've seen aren't supported in NyARToolkit.
So basically, I need to replace ARToolkit with a custom marker detector (and in the case of NyARToolkit, try to get rid of JMF and use a better solution via JNI). However, I don't know how these detectors work. I know about 3D graphics and I've built a nice framework around it, but I need to know how to build the underlying tech :-).
Does anyone know any sources about how to implement a marker-based augmented reality application from scratch? When searching in Google I only find "applications" of AR, not the underlying algorithms :-/.
'From scratch' is a relative term. Truly doing it from scratch, without using any pre-existing vision code, would be very painful and you wouldn't do a better job of it than the entire computer vision community.
However, if you want to do AR with existing vision code, this is more reasonable. The essential sub-tasks are:
Find the markers in your image or video.
Make sure they are the ones you want.
Figure out how they are oriented relative to the camera.
The first task is keypoint localization. Techniques for this include SIFT keypoint detection, the Harris corner detector, and others. Some of these have open source implementations; I think OpenCV has the Harris corner detector in the function GoodFeaturesToTrack.
The second task is making region descriptors. Techniques for this include SIFT descriptors, HOG descriptors, and many many others. There should be an open-source implementation of one of these somewhere.
The third task is also done by keypoint localizers. Ideally you want an affine transformation, since this will tell you how the marker is sitting in 3-space. The Harris affine detector should work for this. For more details go here: http://en.wikipedia.org/wiki/Harris_affine_region_detector
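To make the first sub-task a little more concrete, here is a minimal pure-Python sketch of the Harris corner response on a tiny synthetic image. It is only an illustration under simplifying assumptions (central-difference gradients, an unweighted 3x3 window, k = 0.04) and not how ARToolkit or OpenCV implement detection; in practice you would call a library routine such as OpenCV's cornerHarris.

```python
def harris_response(image, k=0.04):
    """Return the Harris response R = det(M) - k * trace(M)^2 per pixel.

    M is the structure tensor summed over a 3x3 window. R is positive
    at corners, negative along straight edges and zero in flat areas.
    Border pixels are left at 0.
    """
    height, width = len(image), len(image[0])
    grad_x = [[0.0] * width for _ in range(height)]
    grad_y = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            grad_x[y][x] = (image[y][x + 1] - image[y][x - 1]) / 2.0
            grad_y[y][x] = (image[y + 1][x] - image[y - 1][x]) / 2.0

    response = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            sxx = syy = sxy = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    gx = grad_x[y + dy][x + dx]
                    gy = grad_y[y + dy][x + dx]
                    sxx += gx * gx
                    syy += gy * gy
                    sxy += gx * gy
            det = sxx * syy - sxy * sxy
            trace = sxx + syy
            response[y][x] = det - k * trace * trace
    return response


# Synthetic test image: a bright quadrant whose corner sits at (x=4, y=4).
size = 10
corner_image = [
    [1 if (y >= 4 and x >= 4) else 0 for x in range(size)]
    for y in range(size)
]
response = harris_response(corner_image)
strongest = max(
    (response[y][x], (x, y)) for y in range(size) for x in range(size)
)[1]
```

Here the strongest response lands on the corner of the bright quadrant, while points along its straight edges get a negative response. The second and third sub-tasks (describing the region and estimating the marker's orientation) are not covered by this sketch.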

Resources