Visualization of river- animation via code - animation

I am trying to visualize a river flow- basically, should be able to visualize river current direction and speed based on an user-defined external parameter. This is required to demonstrate vectors in two dimensions- given education needs, animation quality can be minimal- 'tolerable enough'.
I tried a simplistic approach by a blue background with lines indicating currents- comes out very weak and below my low standards!!
Can someone point out a good example/ approach for achieving the same? Thanks.

You can create an image filter that looks like water. Look at Jerry's image filters. Specifically look at the the caustic filter. You could animate it moving from one end of the river to the other end. You can also experiment with varying the time parameter. Since it's open source, you can translate it to other languages.
Here are some links to 3d visualizations.

Related

Performing OCR on gas meter

I want to perform OCR on a gas meter so it can read the value. An example of a meter I want to perform OCR on:
The OCR should return 25539144 in this case.
As you can see there is a bit of a problem: There is a lot of text around the meter. So a normal OCR library wouldn't work here since it would return the text around it aswell.
I already tried object detection to detect the meter but the only one that seems to work well (because I have only 50 pictures) is the azure cognitive services. The problem is that later on it should be able to detect it in a live stream so a web service is not possible.
Can anyone help me in the right direction to tackle this problem?
If the comment about using the colour doesn't help you then you could try this approach:
One possible approach could be to train a model (a NN perhaps) to draw a bounding box around the usage numbers.
You're going to have to draw a few boxes by hand to provide training examples.
Once you've run this "bounding-box creation model" you can crop out all the irrelevant stuff and you'll have a new training set consisting of examples that are easier to learn from.
You can then try retraining your ocr model on this new dataset.

Remove background and get deer as a fore ground?

I want to remove background and get deer as a foreground image.
This is my source image captured by trail camera:
This is what I want to get. This output image can be a binary image or RGB.
I worked on it and try many methods to get solution but every time it failed at specific point. So please first understand what is my exact problem.
Image are captured by a trail camera and camera is motion detector. when deer come in front of camera it capture image.
Scene mode change with respect to weather changing or day and night etc. So I can't use frame difference or some thing like this.
Segmentation may be not work correctly because Foreground (deer) and Background have same color in many cases.
If anyone still have any ambiguity in my question then please first ask me to clear and then answer, it will be appreciated.
Thanks in advance.
Here's what I would do:
As was commented to your question, you can detect the dear and then perform grabcut to segment it from the picture.
To detect the dear, I would couple a classifier with a sliding window approach. That would mean that you'll have a classifier that given a patch (can be a large patch) in the image, output's a score of how much that patch is similar to a dear. The sliding window approach means that you loop on the window size and then loop on the window location. For each position of the window in the image, you should apply the classifier on that window and get a score of how much that window "looks like" a dear. Once you've done that, threshold all the scores to get the "best windows", i.e. the windows that are most similar to a dear. The rational behind this is that if we a dear is present at some location in the image, the classifier will output a high score at all windows that are close/overlap with the actual dear location. We would like to merge all that locations to a single location. That can be done by applying the functions groupRectangles from OpenCV:
http://docs.opencv.org/modules/objdetect/doc/cascade_classification.html#grouprectangles
Take a look at some face detection example from OpenCV, it basically does the same (sliding window + classifier) where the classifier is a Haar cascade.
Now, I didn't mention what that "dear classifier" can be. You can use HOG+SVM (which are both included in OpenCV) or use a much powerful approach of running a deep convulutional neural network (deep CNN). Luckily, you don't need to train a deep CNN. You can use the following packages with their "off the shelf" ImageNet networks (which are very powerful and might even be able to identify a dear without further training):
Decaf- which can be used only for research purposes:
https://github.com/UCB-ICSI-Vision-Group/decaf-release/
Or Caffe - which is BSD licensed:
http://caffe.berkeleyvision.org/
There are other packages of which you can read about here:
http://deeplearning.net/software_links/
The most common ones are Theano, Cuda ConvNet's and OverFeat (but that's really opinion based, you should chose the best package from the list that I linked to).
The "off the shelf" ImageNet network were trained on roughly 10M images from 1000 categories. If those categories contain "dear", that you can just use them as is. If not, you can use them to extract features (as a 4096 dimensional vector in the case of Decaf) and train a classifier on positive and negative images to build a "dear classifier".
Now, once you detected the dear, meaning you have a bounding box around it, you can apply grabcut:
http://docs.opencv.org/trunk/doc/py_tutorials/py_imgproc/py_grabcut/py_grabcut.html
You'll need an initial scribble on the dear to perform grabcu. You can just take a horizontal line in the middle of the bounding box and hope that it will be on the dear's torso. More elaborate approaches would be to find the symmetry axis of the dear and use that as a scribble, but you would have to google, research an implement some method to extract symmetry axis from the image.
That's about it. Not straightforward, but so is the problem.
Please let me know if you have any questions.
Try OpenCV Background Substraction with Mixture of Gaussians models. They should be adaptable enough for your scenes. Of course, the final performance will depend on the scenario, but it is worth trying.
Since you just want to separate the background from the foreground I think you do not need to recognize the deer. You need to recognize an object in motion in the scene. You just need to separate what is static in a significant interval of time (background) from what is not static: the deer.
There are algorithms that combine multiple frames from the same scene in order to determine the background, like THIS ONE.
You mentioned that the scene mode changes with respect to weather changing or day and night considering photos of different deers.
You could implement a solution when motion is detected, instead of taking a single photo, it could take a few ones with some interval of time.
This interval has to be long as to get the deer in different positions or out of the scene and at the same time short enough to not be much affected by scene variations. Perhaps you need to deal with some brightness variation, but I think it is feasible to determine the background using these frames and finally segment the deer in the "motion frame".

Image analysis - how can I extract image label out of bottle

I am thinking of using OpenCV library for image analysis. Basically I want to automate in my project the extraction of image label from wine bottle.
This is the sample input image:
This is the sample output:
I am thinking what should be my general strategy to extract the image. I am not asking for direct code. Just want to know the general approach to solve the problem.
Thanks!
Sorry for vage answer but in applied computer vision is no such thing like general approach.
some will disagree of course but in reality
all CV applications are custom made for some specific purpose/task
in your case is the idea to find cylindric and probably standing object (bottle)
and then finding of irregular parts in it
I would do it like this:
1.remove noise as much as possible (smooth/sharpen filters)
2.(optionaly) reduce image data (via (i)FT or (i)DCT for example)
3.segmentate objects (usually by homogenity of color or by edge detection or by booth)
4.identify bottle object (by color,shape,or illumination (glass is transparent))
5.identify objects inside bottle (homogenity,not transparent,usually sharp edges,color is not good some labels are black on dark glass)
6.(optional) project label back from cylindric space to flat texture
[notes]
create app with many scrollbars and checkboxes
to be able to change all tresholds and enable disable filters or their order on the run
all parts will take a lot of tweaking of tresholds and weights
you have to do a lot of trial and error runs to find the best filters and their config for your task

Making 3D representation of an object with a webcam

Is it possible to make a 3D representation of an object by capturing many different angles using a webcam? If it is, how is it possible and how is the image-processing done?
My plan is to make a 3D representation of a person using a webcam, then from the 3D representation, i will be able to tell the person's vital statistics.
As Bart said (but did not post as an actual answer) this is entirely possible.
The research topic you are interested in is often called multi view stereo or something similar.
The basic idea resolves around using point correspondences between two (or more) images and then try to find the best matching camera positions. When the positions are found you can use stereo algorithms to back project the image points into a 3D coordinate system and form a point cloud.
From that point cloud you can then further process it to get the measurements you are looking for.
If you are completely new to the subject you have some fascinating reading to look forward to!
Bart proposed Multiple view geometry by Hartley and Zisserman, which is a very nice book indeed.
As Bart and Kigurai pointed out, this process has been studied under the title of "stereo" or "multi-view stereo" techniques. To be able to get a 3D model from a set of pictures, you need to do the following:
a) You need to know the "internal" parameters of a camera. This includes the focal length of the camera, the principal point of the image and account for radial distortion in the image.
b) You also need to know the position and orientation of each camera with respect to each other or a "world" co-ordinate system. This is called the "pose" of the camera.
There are algorithms to perform (a) and (b) which are described in Hartley and Zisserman's "Multiple View Geometry" book. Alternatively, you can use Noah Snavely's "Bundler" http://phototour.cs.washington.edu/bundler/ software to also do the same thing in a very robust manner.
Once you have the camera parameters, you essentially know how a 3D point (X,Y,Z) in the world maps to an image co-ordinate (u,v) on the photo. You also know how to map an image co-ordinate to the world. You can create a dense point cloud by searching for a match for each pixel on one photo in a photo taken from a different view-point. This requires a two-dimensional search. You can simplify this procedure by making the search 1-dimensional. This is called "rectification". You essentially take two photos and transform then so that their rows correspond to the same line in the world (simplified statement). Now you only have to search along image rows.
An algorithm for this can be also found in Hartley and Zisserman.
Finally, you need to do the matching based on some measure. There is a lot of literature out there on "stereo matching". Another word used is "disparity estimation". This is basically searching for the match of pixel (u,v) on one photo to its match (u, v') on the other photo. Once you have the match, the difference between them can be used to map back to a 3D point.
You can use Yasutaka Furukawa's "CMVS" or "PMVS2" software to do this. Or if you want to experiment by yourself, openCV is a open-source computer vision toolbox to do many of the sub-tasks required for this.
This can be done with two webcams in the same ways your eyes work. It is called stereoscopic vision.
Have a look at this:
http://opencv.willowgarage.com/documentation/camera_calibration_and_3d_reconstruction.html
An affordable alternative to get 3D data would be the Kinect camera system.
Maybe not the answer you are hoping for but Microsoft's Kinect is doing that exact thing, there are some open source drivers out there that allow you to connect it to your windows/linux box.

Morphing 2 faces images

I would like some help from the aficionados of openCV here.
I would like to know the direction to take (and some advices or piece of code) on how to morph 2 faces together with a kind of ratio saying 10% of the first and 90% of the second.
I have seen functions like cvWarpAffine and cvMakeScanlines but I am not sure how to use them.
So if somebody could help me here, I'll be very grateful.
Thanks in advance.
Unless the images compared are the exact same images, you would not go very far with this.
This is an artificial intelligence problem and needs to be solved as such. Typical solution involves:
Normalising the data (removing noise, skew, ...) from the images
Feature extraction (turn the image into a smaller set of data)
Use a machine learning (typically classifiers) to train the data with your matches
Test the result
Refine previous processes according to the results until you get good recognition
The choice of OpenCV functions used depends on your feature extraction method. Have a look at Eigenface.

Resources