I'm just curious whether someone knows how to analyse an image. For example:
I have a heatmap picture; now I want to extract the color values and the x,y coordinates and redraw the image with JavaScript & canvas.
Another example would be to recognize patterns in the image (lines, arrows) and extract their direction and length.
A popular image/video analysis library I would recommend is OpenCV. I have used this GitHub fork of the ruby-opencv gem with success. If you scroll down the readme file, you'll see an example of face detection. The unit tests demonstrate how to do other things, like drawing shapes. At a glance, I don't see any tests on extracting pixel data, but it is most definitely possible.
If you need something simpler, you can try out devil. It's more user-friendly and is focused on image manipulation, but you can probably extract pixel data with it as well.
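As a sketch of what extracting pixel data looks like (shown in Python with Pillow here because it is the shortest to demonstrate; the gems above expose the same idea, and in the browser canvas getImageData gives you the same RGBA values directly), with a made-up file name:

```python
# Minimal sketch: extract the color value at every (x, y) of a heatmap.
# Uses Pillow; "heatmap.png" is a placeholder file name.
from PIL import Image

img = Image.open("heatmap.png").convert("RGB")
width, height = img.size
pixels = img.load()

points = []
for y in range(height):
    for x in range(width):
        r, g, b = pixels[x, y]
        points.append((x, y, (r, g, b)))  # enough data to redraw elsewhere
```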
It sounds like you'll be leaning towards OpenCV. It might be useful to look at this previous question, specifically the mention of the Hough transform.
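To make that concrete for the lines/arrows example, here is a rough sketch of line detection with the probabilistic Hough transform in OpenCV's Python bindings; the file name and every threshold are guesses you would need to tune:

```python
# Sketch: detect line segments with the probabilistic Hough transform,
# then derive each segment's length and direction.
import math
import cv2

img = cv2.imread("diagram.png", cv2.IMREAD_GRAYSCALE)
edges = cv2.Canny(img, 50, 150)
lines = cv2.HoughLinesP(edges, 1, math.pi / 180, threshold=50,
                        minLineLength=30, maxLineGap=5)
if lines is not None:
    for x1, y1, x2, y2 in lines[:, 0]:
        length = math.hypot(x2 - x1, y2 - y1)
        angle = math.degrees(math.atan2(y2 - y1, x2 - x1))
        print(f"segment ({x1},{y1})-({x2},{y2}): length={length:.1f}, angle={angle:.1f}")
```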
For a project I am using YOLO to detect phallusia (microbial organisms) that swim into focus in a video. The issue is that I have to train YOLO on my own data. The data needs to be segmented so I can isolate the phallusia. I am not sure how to properly segment/cut out the phallusia to fit the format that YOLO needs. For example, in the picture below I want YOLO to detect when a phallusia is in focus, similar to the one I have boxed in red. Do I just cut out that segment of the image, save it as its own image, and feed that to YOLO? Do all segmented images need to have the same dimensions? I'm not sure what I am doing and could use some guidance.
It looks like you need to start from the basics. OK, no fear. I will try to suggest a simple route to start using YOLO techniques efficiently. Luckily, the web has a lot of examples.
Understand WHAT a YOLO method is.
Andrew Ng's YOLO explanation is a good start, but only if you already know what classification and detection are.
Understand the YOLO Loss function, the heart of the algorithm.
Check the YOLO paper itself; don't be scared. On page 2, in the Unified Detection section, you will find the information about the bounding-box detection used, but be aware that you can use whatever notation you want (even invent a new one), as long as it stays compatible with the loss function, the real core of the algorithm.
Start to implement an example.
As I wrote above, there are plenty of examples. You can check this one if you are familiar with Python and TensorFlow.
Inside it you will find a way to prepare the dataset, which I think is the target of your question. In this case a tool named labelImg is used.
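For reference, darknet-style YOLO expects one .txt file per image, with one line per object in the form <class> <x_center> <y_center> <width> <height>, all normalized by the image size; tools like labelImg can write this for you. A minimal sketch of the conversion, with invented numbers:

```python
# Sketch: convert a pixel-space box (x_min, y_min, x_max, y_max) into a
# darknet label line, where center and size are normalized to [0, 1].
def to_yolo_label(class_id, box, img_w, img_h):
    x_min, y_min, x_max, y_max = box
    x_center = (x_min + x_max) / 2.0 / img_w
    y_center = (y_min + y_max) / 2.0 / img_h
    width = (x_max - x_min) / float(img_w)
    height = (y_max - y_min) / float(img_h)
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"

# One phallusia boxed at (120, 80)-(260, 210) in a 640x480 frame:
print(to_yolo_label(0, (120, 80, 260, 210), 640, 480))
```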
I hope this will be useful. Please share your code when it's ready, I'm curious :). Good luck!
Do I just cut out that segment of the image, save it as its own image, and feed that to YOLO?
You need as many images of your microbial organism as you can get, in different sizes, positions, etc. It doesn't need to be the only thing in the image, but you need to know its <x> <y> <width> <height> position.
Do all segmented images need to have the same dimensions?
No, they can be of any size; YOLO resizes them internally. See the VOC dataset for examples of the images YOLO is normally trained on. A couple of examples:
kitchen, dogs
Not sure what I am doing and could use some guidance.
My advice would be to follow the instructions for "Training YOLO on VOC" from the original YOLO website: https://pjreddie.com/darknet/yolo/
Once you have that working, you will have a better idea of the steps you need to take.
I had similar problems when I wanted to train YOLOv2 on some game cards.
To solve the problem, I took a picture of every game card with my cellphone and cut them out. Because I didn't have enough training data, I wrote a dataset-generator program that produced training data from the photos of the cards. The program can multiply, rotate, and scale an image and then place it on a background.
It can happen that you run into problems if you don't have enough training data. In that case don't panic: from several raw images you can generate a large dataset by rotating and scaling them.
Here you can find my dataset generator, which can produce Pascal VOC style and darknet style training data: https://github.com/szaza/dataset-generator. Feel free to reuse it if you need something similar.
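To make the generator idea concrete, here is a minimal sketch of one augmentation step using Pillow; the file names and parameter ranges are invented, and the real tool linked above does much more:

```python
# Sketch of the augmentation idea: rotate/scale a cut-out object and
# paste it onto a background at a random position, recording the box.
import random
from PIL import Image

background = Image.open("background.jpg").convert("RGBA")
card = Image.open("card.png").convert("RGBA")  # cut-out with transparency

scale = random.uniform(0.5, 1.5)
angle = random.uniform(0, 360)
obj = card.resize((int(card.width * scale), int(card.height * scale)))
obj = obj.rotate(angle, expand=True)  # transparent corners after rotation

x = random.randint(0, background.width - obj.width)
y = random.randint(0, background.height - obj.height)
background.paste(obj, (x, y), obj)  # alpha mask keeps the rotated shape

box = (x, y, x + obj.width, y + obj.height)  # ground truth for the label file
background.convert("RGB").save("generated_0001.jpg")
```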
I am thinking of using the OpenCV library for image analysis. Basically, in my project I want to automate extracting the label from an image of a wine bottle.
[Sample input image and sample output image not included.]
I am wondering what my general strategy for extracting the label should be. I am not asking for direct code, just the general approach to solving the problem.
Thanks!
Sorry for the vague answer, but in applied computer vision there is no such thing as a general approach. Some will disagree, of course, but in reality all CV applications are custom made for some specific purpose/task.
In your case the idea is to find a cylindrical, probably upright object (the bottle) and then find the irregular parts on it.
I would do it like this:
1. Remove as much noise as possible (smooth/sharpen filters).
2. (Optionally) reduce the image data (via (i)FT or (i)DCT, for example).
3. Segment the objects (usually by color homogeneity, by edge detection, or by both).
4. Identify the bottle object (by color, shape, or illumination; glass is transparent).
5. Identify the objects inside the bottle (homogeneous, not transparent, usually sharp edges; color alone is not reliable, since some labels are black on dark glass).
6. (Optional) Project the label back from cylindrical space to a flat texture.
[notes]
Create the app with many scrollbars and checkboxes, so you can change all thresholds and enable/disable filters (or change their order) on the run.
All parts will take a lot of tweaking of thresholds and weights; you will have to do a lot of trial-and-error runs to find the best filters and their configuration for your task.
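To make steps 1-3 concrete, here is a rough OpenCV sketch; every threshold in it is a guess of the kind you would put on those scrollbars:

```python
# Rough sketch of steps 1-3 above: denoise, detect edges, then segment
# candidate objects as contours, keeping tall shapes as bottle candidates.
import cv2

img = cv2.imread("bottle.jpg")
smoothed = cv2.GaussianBlur(img, (5, 5), 0)          # 1. noise removal
gray = cv2.cvtColor(smoothed, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 60, 180)                     # 3. edge-based segmentation
contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)

# 4. crude bottle-candidate filter: a tall bounding box
for c in contours:
    x, y, w, h = cv2.boundingRect(c)
    if h > 2 * w and h > 0.3 * img.shape[0]:
        cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite("candidates.jpg", img)
```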
For my new assignment I am looking for a method to detect the presence of text in an image. The image is a map, for example a Google map. The task is to detect where the street/city labels are placed.
I know that the OpenCV library has algorithms that can detect objects (for example human faces), such as Haar classifiers or HOG (histogram of oriented gradients), but I have heard that the training process for such algorithms is quite difficult.
Do you know of any algorithm, method, or library that could do that (detect the presence of text in an image)?
Thanks,
John
There is a standard problem in vision called text detection in images. It is quite different from OCR: OCR concerns itself with what the text says, while text detection is about determining whether there is text in the image at all. Adi Shavit's third link is a method to address this problem. You can look on Google Scholar for well-cited articles on text detection.
There are several possible approaches you can take.
Use OCR. A search for OCR on Stack Overflow will show many options. These include Tesseract and OCRopus.
If your text uses a very specific, fixed font, you may get away with simple template matching.
In the more general case, you might want to take a look at "Detecting Text in Natural Scenes with Stroke Width Transform".
UPDATE Jan. 2017
The OpenCV 3.2 contrib module now has a text detection module.
It also includes a sample (C++, Python) of how to use it.
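A condensed version of that sample, assuming opencv-contrib-python is installed and the two classifier XML files that ship with the opencv_contrib text module are on disk; the parameter values follow the upstream sample:

```python
# Condensed from OpenCV contrib's textdetection.py sample; requires
# opencv-contrib-python and the classifier XMLs from opencv_contrib.
import cv2

img = cv2.imread("map.png")
channels = cv2.text.computeNMChannels(img)

erc1 = cv2.text.loadClassifierNM1("trained_classifierNM1.xml")
er1 = cv2.text.createERFilterNM1(erc1, 16, 0.00015, 0.13, 0.2, True, 0.1)
erc2 = cv2.text.loadClassifierNM2("trained_classifierNM2.xml")
er2 = cv2.text.createERFilterNM2(erc2, 0.5)

for channel in channels:
    regions = cv2.text.detectRegions(channel, er1, er2)
    rects = cv2.text.erGrouping(img, channel, [r.tolist() for r in regions])
    for x, y, w, h in rects:
        cv2.rectangle(img, (x, y), (x + w, y + h), (0, 0, 255), 2)
cv2.imwrite("text_regions.png", img)
```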
You will need to tune this to a specific type of map image, or the problem is going to be very difficult (see the previous post for the links to articles).
OCR is the way to go, and you should use an existing library. However, OCR is mainly done on text on white backgrounds. To reduce your problem to a regular OCR problem, you should try working in the color space of the map. Most likely the map text has a very specific color, and that alone may be enough to find these pixels. You can then filter the detected pixels based on the size of the connected regions.
If you literally only want to find the locations of the text labels, you can do the above and pretty much skip the OCR step. If the labels are not too close together, simple clustering algorithms can be used to find their respective positions.
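A minimal sketch of that color-space idea with OpenCV; the "near-black labels" assumption and all the size limits here are placeholders you would sample from your own maps:

```python
# Sketch: isolate pixels in the label's color range, then take connected
# components as candidate text locations (skipping OCR entirely).
import cv2
import numpy as np

img = cv2.imread("map.png")
# Assume labels are near-black; keep only dark pixels.
mask = cv2.inRange(img, (0, 0, 0), (80, 80, 80))
# A wide closing bridges the gaps between characters of one label.
mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, np.ones((3, 15), np.uint8))

n, _, stats, centroids = cv2.connectedComponentsWithStats(mask)
for i in range(1, n):  # component 0 is the background
    x, y, w, h, area = stats[i]
    if 50 < area < 5000:  # filter out specks and large map features
        print(f"label candidate at {tuple(centroids[i])}, box {(x, y, w, h)}")
```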
I'm messing around with image manipulation, mostly using Python. I'm not too worried about performance right now, as I'm just doing this for fun. Thus far, I can load bitmaps, merge them (according to some function), and do some REALLY crude analysis (find the brightest/darkest points, that kind of thing).
I'd like to be able to take an image, generate a set of control points (which I can more or less do now), and then smudge the image, starting at a control point and moving in a particular direction. What I'm not sure of is the process of smudging itself. What's a good algorithm for this?
This question is pretty old, but I've recently gotten interested in this very subject, so maybe this will be helpful to someone. I implemented a 'smudge' brush using Imagick for PHP, roughly based on the smudging technique described in this paper. If you want to inspect the code, feel free to have a look at the project: Magickpaint.
Try PythonMagick (ImageMagick library bindings for Python). If you can't find it on your distribution's repositories, get it here: http://www.imagemagick.org/download/python/
It has more effect functions than you can shake a stick at.
One method would be to apply a Gaussian blur (or some other type of blur) to each point in the region defined by your control points.
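A minimal sketch of that idea, blurring a small window that steps along the smudge direction (OpenCV + numpy; the radius, sigma, and step count are arbitrary choices):

```python
# Sketch: blur only a small window around the current point, then step
# the window along the smudge direction and repeat.
import cv2
import numpy as np

img = cv2.imread("input.png").astype(np.float32)

def smudge(img, x, y, dx, dy, steps=20, radius=8):
    h, w = img.shape[:2]
    for _ in range(steps):
        x0, x1 = max(int(x - radius), 0), min(int(x + radius), w)
        y0, y1 = max(int(y - radius), 0), min(int(y + radius), h)
        # blur just this window in place
        img[y0:y1, x0:x1] = cv2.GaussianBlur(img[y0:y1, x0:x1], (0, 0), 3)
        x, y = x + dx, y + dy  # step along the smudge direction

smudge(img, 100, 100, dx=2, dy=1)
cv2.imwrite("smudged.png", np.clip(img, 0, 255).astype(np.uint8))
```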
Another method would be to create a grid that your control points distort, and then use texture-mapping techniques to map the image back onto the distorted grid.
I can vouch for the Gaussian blur mentioned above; it is quite simple to implement and provides a fairly decent blur result.
James
Basically, suppose that I have a fingerprint. I know the dimensions of my image, and I know that the fingerprint is black on a white background, or green on a black background, or something like that.
Is there a way to process only the parts that delimit the image, in this case, the fingerprint? What I'm trying to do is basically this:
1) Delimit fingerprint
2) Extract the important points to compare to other fingerprints
3) Find the best match in a database of other fingerprints whose points were previously extracted
I already have methods for 2 and 3, so now I would just have to delimit the image.
Programming language would have to be Ruby, Java or C++. Ruby preferred, then Java, and God help me if I have to use C++. I don't have any experience with image processing, but I'd like to do this with multiple common formats such as jpg, gif, png, if possible.
I think the best way to do it is to apply an edge-detection filter to your image.
There are many approaches, as suggested by Wikipedia (article), but none of them is trivial, because they work on gradients or kernels. You should check Canny edge detection, which should be straightforward enough to implement: tutorial.
In any case, if you want to avoid going deep into implementation details, you should use OpenCV, a computer vision library that can do these things in a simple way. You can certainly use it from C++ and Java, and I think a wrapper for Ruby is offered too. This is a simple example using that library with the Canny algorithm.
EDIT: actually my answer covers points 2-3, so I'm wondering what you mean by delimiting the image. Keep in mind that scaling and rotation must be considered too if you want to compare different fingerprints: you need a fuzzy comparator. Maybe you should work on the Fast Fourier Transform version of the image, which can handle such things in a better way.
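For completeness, the Canny call itself is tiny. The question asks for Ruby/Java/C++, but the OpenCV API looks much the same everywhere; shown here in Python, with guessed thresholds and file names:

```python
# Sketch: run Canny edge detection on a fingerprint image.
import cv2

img = cv2.imread("fingerprint.png", cv2.IMREAD_GRAYSCALE)
edges = cv2.Canny(img, 100, 200)  # low/high hysteresis thresholds to tune
cv2.imwrite("fingerprint_edges.png", edges)
```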
An easy approach could be to use a threshold, like this:
Convert your image to grayscale, so you have the fingerprint in white on black.
Find a threshold value that captures most of the fingerprint.
Use the open operation (http://en.wikipedia.org/wiki/Mathematical_morphology) to remove noise (experiment with dilating a few times).
Find the center of gravity (x, y) of the image and the standard deviations (vx, vy).
In the box with corners [x-2vx, y-2vy], [x-2vx, y+2vy], [x+2vx, y+2vy], [x+2vx, y-2vy] you will find about 95.4% of the pixels (the two-standard-deviation rule, assuming a roughly Gaussian spread).
You could narrow the box down to find the actual max and min pixels in it, if you have many outliers.
Use the box to clip the original image.
It is a simple method that might work well for your situation :)
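A sketch of those steps with OpenCV and numpy; Otsu thresholding stands in for the hand-picked threshold, and the file names are placeholders:

```python
# Sketch of the steps above: threshold, morphological open, then a
# 2-sigma box around the pixel center of gravity, used to clip the image.
import cv2
import numpy as np

gray = cv2.imread("fingerprint.png", cv2.IMREAD_GRAYSCALE)
_, binary = cv2.threshold(gray, 0, 255,
                          cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)  # print -> white
binary = cv2.morphologyEx(binary, cv2.MORPH_OPEN,
                          np.ones((3, 3), np.uint8))  # remove noise

ys, xs = np.nonzero(binary)           # coordinates of fingerprint pixels
cx, cy = xs.mean(), ys.mean()         # center of gravity
vx, vy = xs.std(), ys.std()           # standard deviations

x0, x1 = int(cx - 2 * vx), int(cx + 2 * vx)
y0, y1 = int(cy - 2 * vy), int(cy + 2 * vy)
clipped = gray[max(y0, 0):y1, max(x0, 0):x1]
cv2.imwrite("fingerprint_clipped.png", clipped)
```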