I'm using dlib's API in python. The functions get_frontal_face_detector() and simple_object_detector() both appear to use the function scan_fhog_pyramid() which apparently creates several scales of the image, and then applies HOG detectors in a sliding window fashion.
Is it possible to use the dlib library but, instead of using a sliding window, scan for ROIs at specific (predetermined) locations? Does anyone have experience with this and can point to where in the file scan_fhog_pyramid() this could be modified?
Alternatively, I could write my own function but I do not have access to Dlib's HOG detector. Is it possible to run just the detector and linear classifier on my own, without using the scan_fhog_pyramid() function to find ROIs?
I hope my question is clear enough...
Thanks.
I'm working on an educational project where I need to detect objects, mainly doors and windows. I have tried to find specific algorithms to do this.
This is the first step in a project to detect all objects and let a user choose the object he wants. Then in the next step the system will define edges of the object accurately.
I want to detect objects by how their color differs from the background or from overlapping objects. I need an algorithm to start with. I have started learning about color spaces and chose the HSV color space. I have read many papers and understand how they work, but I'm still confused and don't know which algorithm will really help.
You can use any segmentation algorithm.
You will need to extract features from the images to use for segmentation. A good approach to feature selection is to use a deep learning technique; I would recommend trying a CNN. The MATLAB toolbox MatConvNet (http://www.vlfeat.org/matconvnet/) can be used to implement CNNs in MATLAB.
You can also find a few pretrained CNN models for segmentation here: http://www.vlfeat.org/matconvnet/pretrained/
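Since the question mentions the HSV color space, a plain hue threshold can serve as a very simple segmentation baseline before trying a CNN. The sketch below is only an illustration in Python rather than MATLAB; the function name segment_by_hue, the thresholds, and the tiny test image are all made up for the example.

```python
import colorsys

def segment_by_hue(image, hue_min, hue_max, min_sat=0.3, min_val=0.2):
    """Return a binary mask (rows of 0/1) marking pixels whose hue lies in
    [hue_min, hue_max] and that are saturated and bright enough.

    `image` is a list of rows of (r, g, b) tuples with values 0..255.
    Hues are in 0..1, as returned by colorsys.rgb_to_hsv.
    """
    mask = []
    for row in image:
        mask_row = []
        for r, g, b in row:
            h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            inside = hue_min <= h <= hue_max and s >= min_sat and v >= min_val
            mask_row.append(1 if inside else 0)
        mask.append(mask_row)
    return mask

# Hypothetical 2x3 test image: pick out the blue pixels.
RED, BLUE, GRAY = (220, 30, 30), (30, 30, 220), (128, 128, 128)
image = [[RED, BLUE, GRAY],
         [BLUE, BLUE, RED]]
blue_mask = segment_by_hue(image, 0.55, 0.75)
```

Thresholding on hue while ignoring low-saturation pixels keeps gray or washed-out areas out of the mask, which is the main reason to work in HSV rather than RGB for this kind of test.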
In the Hebrew University in Jerusalem there are a few MATLAB applications, consisting of both calculations and UI. Since the UI is becoming increasingly complex, it's getting very hard to maintain it.
What I'd like to do is keep the calculations and the rendering of 2D and 3D graphs in MATLAB, but control the entire UI from elsewhere. I know MATLAB exports a COM interface, which is OK for using MATLAB calculations, but I couldn't find a way to pass rendered data (MATLAB plots, basically) back through it.
Is there a way to do that?
The simplest thing for you to do would be to issue an instruction to MATLAB to create the plot (perhaps creating it offscreen, to avoid an unwelcome popup window), adjust its appearance and size, then save it to an image file. Pass the filename back, then load it in from your UI code and display it.
However, that will not of course get you a plot that is "live", so you won't be able to edit it, or click on it/interact with it, or even resize it nicely.
If you need that, I'm afraid there's no documented or supported way to do it. But if you're willing to go undocumented, then MATLAB also has a Java interface (jmi.jar) that you can call from Java, and you can embed a live MATLAB plot within a Java GUI, attaching MATLAB or Java callbacks to plot elements.
Note that that capability is completely undocumented, and may well change from release to release without warning. If you'd like to learn how to approach that, I'd recommend reading through the blog Undocumented MATLAB, and probably buying a copy of the book by that blog's author.
I'm working on a small program for optical mark recognition.
The processing of the scanned form consists of two steps:
1) Find the form in the scanned image, deskew it, and crop the borders.
2) With this "normalized" form, I can simply search the marks by using coordinates from the original document and so on.
For the first step, I'm currently using the homography functions from OpenCV and a perspective transform to map the points. I also tried the SurfDetector.
However, both approaches are quite slow and do not really meet the speed requirements when scanning forms from a document scanner.
Can anyone point me to an alternative algorithm/solution for this specific problem?
Thanks in advance!
Try with ORB or FAST detector: they should be faster than SURF (documentation here).
If those don't match your speed requirement you should probably use a different approach. Do you need scale and rotation invariance? If not, you could try with the cross correlation.
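As a rough illustration of the cross-correlation idea, here is a minimal pure-Python sketch of zero-mean normalized cross-correlation between a small template (for example a registration mark on the form) and every position in a grayscale image. The function name ncc_match and the toy data are assumptions for the example; for real scans, OpenCV's cv2.matchTemplate with cv2.TM_CCOEFF_NORMED computes the same kind of score far more efficiently.

```python
import math

def ncc_match(image, template):
    """Slide `template` over `image` (both lists of rows of numbers) and
    return (best_score, (row, col)) for the best-matching top-left corner.
    The score is the zero-mean normalized cross-correlation, in [-1, 1]."""
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    t_vals = [v for row in template for v in row]
    t_mean = sum(t_vals) / len(t_vals)
    t_dev = [v - t_mean for v in t_vals]
    t_norm = math.sqrt(sum(d * d for d in t_dev))
    best_score, best_pos = -2.0, None
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            w_vals = [image[r + i][c + j] for i in range(th) for j in range(tw)]
            w_mean = sum(w_vals) / len(w_vals)
            w_dev = [v - w_mean for v in w_vals]
            w_norm = math.sqrt(sum(d * d for d in w_dev))
            if t_norm == 0 or w_norm == 0:
                continue  # flat patch: correlation is undefined, skip it
            score = sum(a * b for a, b in zip(w_dev, t_dev)) / (w_norm * t_norm)
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_score, best_pos

# Toy example: a 2x2 diagonal mark hidden at row 2, column 3.
template = [[9, 0],
            [0, 9]]
image = [[0, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0],
         [0, 0, 0, 9, 0, 0],
         [0, 0, 0, 0, 9, 0],
         [0, 0, 0, 0, 0, 0]]
score, position = ncc_match(image, template)
```

Note that this only handles translation: if the scanned form can be rotated or scaled noticeably, plain cross-correlation will miss it, which is why the question about invariance matters.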
Viola-Jones cascade classifier is pretty quick. It is used in OpenCV for Face detection, but you can train it for different purpose. Depending on the appearance of what you call your "form", you can use simpler algorithms such as cross correlation as said by Muffo.
I am working on a Visual C++/CLI "Windows Forms" project that involves image processing. The intensity (grayscale) image values that I have to deal with are short integers acquired at a frame rate of ~400 fps.
Question 1: Is there an image processing library comparable to CImg that runs with managed c++ that I can use to process the images? Great thing about CImg is: it offers a constructor, that accepts a pointer to the first image value in the memory, the number of image pixels and the byte size of the pixel values. This is exactly what I am looking for, but I did not manage to get CImg.h running using managed C++: I got it to compile, but I seem to be unable to instantiate a CImg object.
Question 2: What would be the best approach to draw images in real time on a Form? My first approach was to generate Bitmaps using the SetPixel() method and draw the Bitmaps using a Graphics object. However, this approach proved to be far from real time capability.
Any help on this matter is greatly appreciated!
[edit: I just succeeded in integrating CImg into my C++/CLI project. I can now display the camera output using the CImgDisplay class. However, this can only be a workaround. The application I'm developing consists of an MDIParent, and the camera live view should run in an MDIChild. I do not see any way to do this using CImg (I would be glad to be proven wrong!). Therefore, both questions are still of great importance to me!]
OpenGL would be a good choice. You can load the image into a texture, then draw a textured quad (or two triangles) filling your MDI child window.
I'm messing around with image manipulation, mostly using Python. I'm not too worried about performance right now, as I'm just doing this for fun. Thus far, I can load bitmaps, merge them (according to some function), and do some REALLY crude analysis (find the brightest/darkest points, that kind of thing).
I'd like to be able to take an image, generate a set of control points (which I can more or less do now), and then smudge the image, starting at a control point and moving in a particular direction. What I'm not sure of is the process of smudging itself. What's a good algorithm for this?
This question is pretty old but I've recently gotten interested in this very subject so maybe this might be helpful to someone. I implemented a 'smudge' brush using Imagick for PHP which is roughly based on the smudging technique described in this paper. If you want to inspect the code feel free to have a look at the project: Magickpaint
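For readers who just want the general idea, one common way to implement a smudge brush is to "pick up" the pixels under the brush at the start point and then, at each step along the stroke, blend the carried paint into the image and pick up the blended result. The sketch below is an assumption-laden illustration of that idea on a grayscale image; it is not the technique from the paper or the Magickpaint code, and the function name and parameters are hypothetical.

```python
def smudge(image, start, direction, steps, radius=1, strength=0.5):
    """Smudge a grayscale image (list of rows of numbers) along a stroke.

    start     -- (row, col) where the brush first touches the image
    direction -- (d_row, d_col) integer step taken each iteration
    strength  -- 0..1, how much carried paint is deposited at each step
    Returns a new image; the input is left untouched.
    """
    height, width = len(image), len(image[0])
    out = [[float(v) for v in row] for row in image]
    offsets = [(dy, dx)
               for dy in range(-radius, radius + 1)
               for dx in range(-radius, radius + 1)]
    y, x = start
    # Pick up the paint under the brush at the starting position.
    carried = {}
    for dy, dx in offsets:
        yy, xx = y + dy, x + dx
        if 0 <= yy < height and 0 <= xx < width:
            carried[(dy, dx)] = out[yy][xx]
    for _ in range(steps):
        y += direction[0]
        x += direction[1]
        for dy, dx in offsets:
            yy, xx = y + dy, x + dx
            if not (0 <= yy < height and 0 <= xx < width):
                continue  # brush cell fell off the image
            paint = carried.get((dy, dx), out[yy][xx])
            blended = (1.0 - strength) * out[yy][xx] + strength * paint
            out[yy][xx] = blended          # deposit paint
            carried[(dy, dx)] = blended    # and pick up the mixed result
    return out

# A bright pixel dragged to the right with a 1-pixel brush (radius 0).
row_image = [[100, 0, 0, 0, 0, 0]]
smeared = smudge(row_image, start=(0, 0), direction=(0, 1), steps=3, radius=0)
```

Because the brush keeps re-sampling what it has just deposited, the smear fades out along the stroke, which is what gives the effect its dragged look; a lower strength gives a shorter, subtler smear.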
Try PythonMagick (ImageMagick library bindings for Python). If you can't find it on your distribution's repositories, get it here: http://www.imagemagick.org/download/python/
It has more effect functions than you can shake a stick at.
One method would be to apply a Gaussian blur (or some other type of blur) to each point in the region defined by your control points.
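A minimal sketch of that idea, assuming a grayscale image stored as a list of rows and a rectangular region around the control points (the function names and the rectangle parameters are made up for the example). It uses a separable Gaussian kernel and clamps coordinates at the image border.

```python
import math

def gaussian_kernel(sigma):
    """1-D Gaussian weights covering about 3 sigma on each side, summing to 1."""
    radius = max(1, int(3 * sigma + 0.5))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [wt / total for wt in weights]

def blur_region(image, top, left, bottom, right, sigma=1.0):
    """Gaussian-blur only rows top..bottom-1 and columns left..right-1.
    Pixels outside that rectangle are copied unchanged."""
    height, width = len(image), len(image[0])
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2

    def clamp(value, upper):
        return max(0, min(upper, value))

    # Horizontal pass over the whole image (simple, not optimized).
    horizontal = [
        [sum(k * row[clamp(x + i - radius, width - 1)]
             for i, k in enumerate(kernel))
         for x in range(width)]
        for row in image
    ]
    blurred = [[float(v) for v in row] for row in image]
    # Vertical pass, written back only inside the requested region.
    for y in range(top, bottom):
        for x in range(left, right):
            blurred[y][x] = sum(
                k * horizontal[clamp(y + i - radius, height - 1)][x]
                for i, k in enumerate(kernel)
            )
    return blurred

# A single bright pixel in a 7x7 image, blurred everywhere and in the top rows only.
image = [[0.0] * 7 for _ in range(7)]
image[3][3] = 100.0
full = blur_region(image, 0, 0, 7, 7, sigma=1.0)
partial = blur_region(image, 0, 0, 3, 7, sigma=1.0)
```

A plain blur softens the region but has no direction, so on its own it looks more like defocus than a smudge; it is probably best combined with some displacement along the stroke.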
One method would be to create a grid that your control points deform, and then use texture-mapping techniques to map the image back onto the distorted grid.
I can vouch for the Gaussian blur mentioned above; it is quite simple to implement and gives a fairly decent blur result.
James