How do I apply the stereo effect to a video-mapped sphere? (three.js)

Assuming all my assets are linked properly, and using oculusRiftEffect.js or StereoEffect.js, how can I make my HTML file Cardboard-compatible? My Gist link is below:
https://gist.github.com/4d9e4c81a6b13874ed52.git
Please advise

Exactly what are you trying to do? "Stereo effect" isn't very descriptive.
However, assuming that you have stereo video files (one for the left eye, one for the right), you'd just play them back in a left sphere (seen by the left eye) and a right sphere for the right eye. The spheres are offset by the interpupillary distance (IPD), usually about 63 mm (in practice it'd be whatever the stereo videos are offset by).
So, you might ask - what happens when I turn around? The IPD goes negative. When I look up or down? It goes to zero. Welcome to stereo video.
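To see why, here's a toy calculation (not three.js code; it just models two eyes offset along a fixed world axis while the view direction rotates about the vertical):

```python
import math

# Fixed eye offset in world space (the two spheres are offset along x).
# 0.063 m is an assumed typical adult IPD.
IPD = 0.063

def effective_baseline(yaw_rad):
    """Signed component of the fixed eye offset perpendicular to the view direction.

    Positive = correct stereo, zero = flat, negative = eyes effectively swapped.
    """
    # Looking along yaw=0, the (IPD, 0) offset is fully perpendicular to the view;
    # its signed perpendicular component is IPD * cos(yaw).
    return IPD * math.cos(yaw_rad)

print(effective_baseline(0.0))          # facing forward: full baseline
print(effective_baseline(math.pi / 2))  # facing sideways: baseline collapses to ~0
print(effective_baseline(math.pi))      # facing backward: negative (swapped) baseline
```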
Note that there are ways around this, but you're not going to get them from a GoPro without a lot of special processing. At best you can sync the IPD direction with the averaged lens separation of the video streams, but you'll always get stitching errors. The equirectangular format (the rectangular "unwrapped sphere" layout) isn't ideal either; you'll always get the wrong answer at the poles. Not using a sphere is probably how this will evolve, but it's the simplest solution until we get a better one.
But given all that, give it a shot; the brain is very forgiving, and it'll look 3D-ish most of the time. This stuff is being refined all the time.

Related

Remove background and get deer as foreground?

I want to remove the background and get the deer as a foreground image.
This is my source image captured by trail camera:
This is what I want to get. This output image can be a binary image or RGB.
I have worked on it and tried many methods to get a solution, but every time it failed at some specific point. So please first understand what my exact problem is.
The images are captured by a trail camera with a motion detector: when a deer comes in front of the camera, it captures an image.
The scene changes with the weather, day and night, etc., so I can't use frame differencing or anything like that.
Segmentation may not work correctly because the foreground (deer) and background have the same color in many cases.
If anything in my question is still ambiguous, please ask me to clarify first and then answer; it will be appreciated.
Thanks in advance.
Here's what I would do:
As was commented on your question, you can detect the deer and then perform GrabCut to segment it from the picture.
To detect the deer, I would couple a classifier with a sliding-window approach. That means you'll have a classifier that, given a patch in the image (it can be a large patch), outputs a score of how similar that patch is to a deer. The sliding-window approach means that you loop over the window size and then over the window location; for each position of the window in the image, you apply the classifier to that window and get a score of how much that window "looks like" a deer. Once you've done that, threshold all the scores to get the "best" windows, i.e. the windows that are most similar to a deer. The rationale behind this is that if a deer is present at some location in the image, the classifier will output a high score at all windows that are close to or overlap the actual deer location. We would like to merge all those locations into a single location. That can be done by applying the function groupRectangles from OpenCV:
http://docs.opencv.org/modules/objdetect/doc/cascade_classification.html#grouprectangles
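A minimal sketch of the sliding-window idea. The scorer here is a dummy stand-in for a real deer classifier (it just scores patch brightness), and a simple box union stands in for groupRectangles:

```python
import numpy as np

def sliding_window_detect(image, score_fn, win=64, step=16, thresh=0.5):
    """Slide a fixed-size window over the image; keep windows scoring above thresh."""
    boxes = []
    h, w = image.shape[:2]
    for y in range(0, h - win + 1, step):
        for x in range(0, w - win + 1, step):
            if score_fn(image[y:y + win, x:x + win]) > thresh:
                boxes.append((x, y, win, win))
    return boxes

def merge_boxes(boxes):
    """Merge all overlapping hits into one box (crude stand-in for cv2.groupRectangles)."""
    if not boxes:
        return None
    xs = [b[0] for b in boxes]; ys = [b[1] for b in boxes]
    xe = [b[0] + b[2] for b in boxes]; ye = [b[1] + b[3] for b in boxes]
    return (min(xs), min(ys), max(xe) - min(xs), max(ye) - min(ys))

# Demo: the dummy "classifier" scores a patch by its mean brightness;
# the bright square plays the role of the deer.
img = np.zeros((256, 256), dtype=np.float32)
img[96:160, 96:160] = 1.0
hits = sliding_window_detect(img, lambda p: p.mean())
print(merge_boxes(hits))
```

A real detector would replace the lambda with HOG+SVM or CNN-feature scoring, as discussed below.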
Take a look at one of the face detection examples in OpenCV; it does basically the same thing (sliding window + classifier), where the classifier is a Haar cascade.
Now, I didn't mention what that "deer classifier" could be. You can use HOG+SVM (both included in OpenCV) or the much more powerful approach of running a deep convolutional neural network (deep CNN). Luckily, you don't need to train a deep CNN yourself. You can use the following packages with their "off the shelf" ImageNet networks (which are very powerful and might even be able to identify a deer without further training):
Decaf- which can be used only for research purposes:
https://github.com/UCB-ICSI-Vision-Group/decaf-release/
Or Caffe - which is BSD licensed:
http://caffe.berkeleyvision.org/
There are other packages of which you can read about here:
http://deeplearning.net/software_links/
The most common ones are Theano, Cuda ConvNet and OverFeat (but that's really opinion-based; you should choose the best package from the list I linked to).
The "off the shelf" ImageNet networks were trained on roughly 1.2M images from 1000 categories. If those categories contain "deer", then you can just use them as is. If not, you can use them to extract features (a 4096-dimensional vector in the case of Decaf) and train a classifier on positive and negative images to build a "deer classifier".
Now, once you've detected the deer, meaning you have a bounding box around it, you can apply GrabCut:
http://docs.opencv.org/trunk/doc/py_tutorials/py_imgproc/py_grabcut/py_grabcut.html
You'll need an initial scribble on the deer to perform GrabCut. You can just take a horizontal line in the middle of the bounding box and hope that it lands on the deer's torso. A more elaborate approach would be to find the symmetry axis of the deer and use that as a scribble, but you would have to google, research and implement some method to extract a symmetry axis from the image.
That's about it. Not straightforward, but so is the problem.
Please let me know if you have any questions.
Try OpenCV background subtraction with Mixture of Gaussians models. They should be adaptable enough for your scenes. Of course, the final performance will depend on the scenario, but it is worth trying.
Since you just want to separate the background from the foreground I think you do not need to recognize the deer. You need to recognize an object in motion in the scene. You just need to separate what is static in a significant interval of time (background) from what is not static: the deer.
There are algorithms that combine multiple frames from the same scene in order to determine the background, like THIS ONE.
You mentioned that the scene changes with the weather, day and night, etc., and that the photos are of different deer.
You could implement a solution where, when motion is detected, instead of taking a single photo the camera takes a few with some interval between them.
This interval has to be long enough to get the deer in different positions or out of the scene, and at the same time short enough not to be much affected by scene variations. Perhaps you'll need to deal with some brightness variation, but I think it is feasible to determine the background from these frames and finally segment the deer in the "motion frame".
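The multi-frame idea above can be sketched with a per-pixel median, a simple and common background estimator (the frames here are synthetic):

```python
import numpy as np

def estimate_background(frames):
    """Per-pixel median across frames: the static scene survives, the moving object doesn't."""
    return np.median(np.stack(frames), axis=0)

rng = np.random.default_rng(1)
scene = rng.integers(0, 200, size=(60, 80)).astype(np.float64)

frames = []
for i in range(5):
    f = scene.copy()
    f[10:30, i * 15:i * 15 + 15] = 255.0   # "deer" at a different position each frame
    frames.append(f)

background = estimate_background(frames)
motion = np.abs(frames[0] - background) > 30   # segment the deer in the first frame
print(motion[20, 5], motion[50, 50])           # deer pixel vs. background pixel
```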

OpenCV: how to detect if video has fast moving object in it?

What would be the best way to detect a fast moving object using OpenCV?
Say, I have 5 random video files:
1) Video of a crowd, people walking, static camera.
2) Video of a cat playing with a ball, shaky iPhone camera.
3) Video of a person being interviewed. Static camera.
4) Animation (3D) of a fast moving car, background is blurred etc. etc.
5) A blurred out video shot with iPhone camera (just camera waved around, nothing is visible).
So I would like to isolate video5 and detect that there is a lot of movement in video4 and video2.
What would be the best approach to do that? I think of using OpenCV2, but if there is a better solution for that, I'd be happy to learn about that.
Any input greatly appreciated. Pseudo-code or just recommendations of specific algorithms.
Thank you
Optical flow: this is one of many ways of detecting motion.
I don't know if you are still working on this, but I found it interesting to answer.
Approach 1:
As suggested by user349026, one of the most intuitive ways is to work with optical flow. It will give you the dominant motion, but optical flow always comes with noise, so you will have to apply some filtering before using it.
Approach 2:
This one is more difficult but gives good results. It is from a CVPR 2013 paper: http://www.irisa.fr/texmex/people/jain/w-Flow/motion_cvpr13.pdf
I think just the introduction of this paper will solve your problem.

How can I get the rectangular areas of difference between two images?

I feel like I have a very typical problem with image comparison, and my googles are not revealing answers.
I want to transmit still images of a desktop every X amount of seconds. Currently, we send a new image if the old and new differ by even one pixel. Very often only something very minor changes, like the clock or an icon, and it would be great if I could just send the changed portion to the server and update the image (way less bandwidth).
The plan I envision is to get a rectangle of an area that has changed. For instance, if the clock changed, screen capture the smallest rectangle that encompasses the changes, and send it to the server along with its (x, y) coordinate. The server will then update the old image by overlaying the rectangle at the coordinate specified.
Is there any algorithm or library that accomplishes this? I don't want it to be perfect, let's say I'll always send a single rectangle that encompasses all the changes (even if many smaller rectangles would be more efficient).
My other idea was to get a diff between the new and old images that's saved as a series of transformations. Then, I would just send the series of transformations to the server, which would then apply this to the old image to get the new image. Not sure if this is even possible, just a thought.
Any ideas? Libraries I can use?
Compare each pixel of the previous frame with the corresponding pixel of the next frame, and keep track of which pixels have changed.
Since you are only looking for a single box to encompass all the changes, you actually only need to keep track of the min-x, min-y (not necessarily from the same pixel), max-x, and max-y. Those four values will give you the edges of your rectangle.
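In code, that min/max tracking is a few lines of NumPy (a sketch; the `tol` parameter lets you ignore sub-threshold noise):

```python
import numpy as np

def diff_bbox(prev, curr, tol=0):
    """Smallest rectangle covering every pixel that changed between two frames."""
    changed = np.any(np.abs(curr.astype(int) - prev.astype(int)) > tol, axis=-1)
    ys, xs = np.nonzero(changed)
    if len(xs) == 0:
        return None                      # frames identical: nothing to send
    x, y = int(xs.min()), int(ys.min())
    w = int(xs.max()) - x + 1
    h = int(ys.max()) - y + 1
    return (x, y, w, h)                  # (x, y, width, height)

# Demo: a "desktop" where only the clock region changed.
prev = np.zeros((100, 200, 3), dtype=np.uint8)
curr = prev.copy()
curr[10:20, 150:190] = 255
print(diff_bbox(prev, curr))
```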
Note that this job (comparing the two frames) should really be off-loaded to the GPU, which could do this significantly faster than the CPU.
Note also that what you are trying to do is essentially a home-grown lossless streaming video compression algorithm. Using one from an existing library would not only be much easier, but also probably much more performant.
This is from an algorithms point of view; not sure if it is easier to implement.
Basically, XOR the two images and compress the result with any information-theoretic algorithm (Huffman coding?).
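A quick sketch of that idea, using zlib's DEFLATE (which includes Huffman coding) as the compressor:

```python
import zlib
import numpy as np

prev = np.zeros((100, 200, 3), dtype=np.uint8)
curr = prev.copy()
curr[10:20, 150:190] = 255                 # small change, e.g. the clock

delta = np.bitwise_xor(prev, curr)         # zero everywhere except the change
packed = zlib.compress(delta.tobytes(), level=9)
print(len(curr.tobytes()), len(packed))    # raw frame size vs. compressed delta

# Receiver side: decompress and XOR back onto its copy of the old frame.
delta_back = np.frombuffer(zlib.decompress(packed), dtype=np.uint8).reshape(prev.shape)
restored = np.bitwise_xor(prev, delta_back)
```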
I know I am very late responding, but I found this question today.
I have done some analysis on image differencing, though the code was written in Java. Kindly look at the link below; it may help:
How to find rectangle of difference between two images
The code finds the differences and keeps the rectangles in a LinkedList. You can use that linked list of rectangles to patch the differences onto the base image.
Cheers !

Image Cropping - Region of Interest Query

I have a set of videos of someone talking. I'm building a lip-recognition system, so I need to perform some image processing on a specific region of the image (the lower chin and lips).
I have over 200 videos, each containing a sentence. It is natural conversation, so the head constantly moves and the lips aren't in a fixed place. I'm having difficulty specifying my region of interest, as it is very tiresome to watch through every video and mark out how big my box needs to be to ensure the lips stay within the ROI.
I was wondering if there is an easier way to check this, perhaps using MATLAB? I was thinking I could crop the video frame by frame and output an image for each frame, and then physically go through the images to see if the lips go out of frame.
I had to solve a similar problem tracking the heads and limbs of students participating in class discussions on video. We experimented with state-of-the-art optical flow tracking from Thomas Brox (link; see the part about large-displacement optical flow). In our case, we had nearly 20 terabytes of video to work through, so we had no choice but to use a C++ and GPU implementation of the optical flow code; I think you will discover too that MATLAB is impossibly slow for video analysis.
Optical flow returns detailed motion vectors. If you mark the bounding box for the mouth and chin in the first frame of the video, you can then follow the tracks given by the optical flow of those pixels, and this will usually give you a good sequence of bounding boxes. You will probably have some errors to clean up, though; you could write a Python script that plays back the sequence of bounding boxes so you can quickly check for them.
The code I wrote for this is in Python, and it's probably not easy to adapt to your data set-up or your problem, but you can find my affine-transformation based optical flow tracking code linked here, in the part called 'Object tracker using dense optical flow.'
The short answer is that this is a very difficult and annoying problem for vision researchers. Most people "solve" it by placing their videos, frame by frame, onto Mechanical Turk, and paying human workers about 2 cents per frame that they analyze. This gives you pretty good results (you'll still have to clean them after collecting it from the Mechanical Turkers), but it's not very helpful when you have tons o' videos and you cannot wait for enough of them to randomly get analyzed on Mechanical Turk.
There definitely isn't any 'out of the box' solution to region-of-interest annotation, though. You'd probably have to pay quite a lot for third-party software that did this automatically. My best guess for that is to check out what face.com would charge you and how well it would perform. Be careful that you don't violate any researcher confidentiality agreements with your data set though, for this or Mechanical Turk.

Image processing - one frame is washed out, other isn't. How to 'fix'?

I have the following 2 consecutive, unaltered frames from a video:
For some reason the camera made the 2nd much more 'washed out' than the 1st. I want to make the 2nd look more like the 1st.
In the video I'm trying to process, there are lots of cases like this, where the 'exposure' changes suddenly from one frame to the next. I am able to find these parts of the video by looking at the image histogram of each frame: when two adjacent frames have histograms that are too far apart, that's where this has happened. So I can find these sections of different exposure, but I'm stumped as to how to fix them.
As a programmer I'm familiar with ImageMagick and that's about it. Since I have lots of frames, an automated, hands-off approach to fixing this is by far the best solution. I am also totally unskilled with graphics editing programmes.
I've tried changing the exposure in ImageMagick (with -level 0,50% etc.), but that doesn't help.
What ImageMagick commands (or other FLOSS image editing tools) will make the 2nd image look more like the 1st?
As some people have pointed out in the comments, the problem is the colour balance. Changing the colour balance makes the 2 images more similar.
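One automated option along those lines is per-channel histogram matching of the washed-out frame against the good one. A NumPy sketch (scikit-image's `exposure.match_histograms` implements the same idea if you'd rather not roll your own):

```python
import numpy as np

def match_histogram(source, reference):
    """Remap each channel of `source` so its histogram matches `reference`."""
    out = np.empty_like(source)
    for c in range(source.shape[2]):
        src = source[..., c].ravel()
        ref = reference[..., c].ravel()
        s_vals, s_idx, s_cnt = np.unique(src, return_inverse=True, return_counts=True)
        r_vals, r_cnt = np.unique(ref, return_counts=True)
        s_cdf = np.cumsum(s_cnt) / src.size        # quantile of each source value
        r_cdf = np.cumsum(r_cnt) / ref.size
        mapped = np.interp(s_cdf, r_cdf, r_vals)   # source quantile -> reference value
        out[..., c] = np.rint(mapped[s_idx]).reshape(source.shape[:2])
    return out

# Demo: frame2 is a washed-out copy of frame1; matching pulls it back.
rng = np.random.default_rng(0)
frame1 = rng.integers(0, 200, (64, 64, 3)).astype(np.uint8)
frame2 = (frame1 * 0.5 + 100).astype(np.uint8)     # low contrast, lifted blacks
fixed = match_histogram(frame2, frame1)
print(abs(float(fixed.mean()) - float(frame1.mean())))   # means should now be close
```

For the video case, you'd match each frame in a washed-out section against the last good frame before the exposure jump.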
