Measuring distances from a time series of images in R, for snow-depth calculation

I am looking to calculate snow depth from a time series of time-lapse images in R. In each image there is a snow stake of known length, from which we can read the snow depth. Instead of reading the depth from each image manually, I wish to automate this process.
My idea is to first draw a reference line over the length of the snow stake, so we know how long the stake is in pixels. Then, as it snows, shrink or grow that line based on the snow line, or automatically redraw the line to where the snow starts. I am wondering a) whether this can be done, b) if anyone has tried something similar, and c) whether someone can point me in the right direction on where to start.
Currently I have tried drawing the line and then using edge detection to automatically redraw and measure the line for every picture, but there is too much noise and our snow stakes don't stand out enough. I have also been using the magick package but haven't found a viable solution yet. Let me know if you have any ideas! Thanks!
Here is an example of a snow stake site in summer.
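For illustration, here is a rough sketch of the kind of per-image measurement I have in mind, written in Python/OpenCV purely to show the idea (I am working in R with magick, where the same steps apply). It assumes a fixed camera, a stake occupying a known vertical pixel strip marked once on a snow-free reference image, and a snow surface that is clearly brighter than the stake; all coordinates, the stake length and the brightness threshold are made-up placeholders.
```python
import cv2
import numpy as np

# Hypothetical reference values, measured once on a snow-free image.
STAKE_X0, STAKE_X1 = 640, 660        # left/right pixel columns of the stake strip
STAKE_TOP, STAKE_BOTTOM = 200, 900   # pixel rows of the stake's top and base
STAKE_LENGTH_CM = 200.0              # real-world length of the exposed stake

def snow_depth_cm(image_path):
    """Estimate snow depth by finding where the bright snow surface meets the stake."""
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    strip = gray[STAKE_TOP:STAKE_BOTTOM, STAKE_X0:STAKE_X1]
    # Average brightness per row; rows covered by snow are much brighter than the stake.
    profile = cv2.blur(strip, (1, 15)).mean(axis=1)
    snow_rows = np.where(profile > 200)[0]   # threshold is an assumption, tune per site
    if snow_rows.size == 0:
        return 0.0                           # no snow detected on the stake
    visible_px = snow_rows[0]                # stake pixels still visible above the snow
    total_px = STAKE_BOTTOM - STAKE_TOP
    return STAKE_LENGTH_CM * (1.0 - visible_px / total_px)
```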

Related

OpenCV and OS X HEAD detection

I'm looking to create a program that detects heads with OpenCV, not just faces. There must be a way to do this. Besides heads, I need to identify the highest pixel of the head (the top of the hair) and the lowest center point of the chin... I'm not finding any OS X OpenCV examples. Here is a picture of what I'd like to achieve: https://pasteboard.co/GF19Fao.jpg
Looks simple enough, right?
I would recommend looking into Haar cascades for this.
This consists of a couple of cascades that are not included in OpenCV.
Method 1:
The profile and frontal face ones are a good starting point.
Personally, I have not tested them yet so it is hard for me to tell you what the results look like.
If the final bounding box fits perfectly over the entire head alone, then you can make the following assumptions as a starting point:
if box = (topLeft, topRight, bottomLeft, bottomRight)
then hairTop = midpoint(topLeft, topRight)
and chin = midpoint(bottomLeft, bottomRight)
If not, you could do two things:
1. Make some measurement readjustments to see how far up or down to move the resulting rectangle in order to find the chin and top hair.
2. You could use a combination of other classifiers as well.
Combine the results of the front/profile face with the mouth classifier to find the chin
Combine the front/profile face with upperbody classifier to find the top hair.
Method 2:
You could also use the front/profile face classifiers to find the top hair and just use the mouth classifier to find the chin.
Either method requires that you run multiple tests to find the optimal values/estimates that fulfill your task.
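A minimal sketch of Method 1 using the stock frontal-face cascade that ships with OpenCV (the extra head cascades would plug in the same way); the input file name is a placeholder, and the midpoints are only rough estimates since a face box usually does not cover the whole head:
```python
import cv2

# Detect frontal faces with a Haar cascade and approximate the top-of-hair and
# chin points as the midpoints of the box's top and bottom edges.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

img = cv2.imread("person.jpg")                      # hypothetical input image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

for (x, y, w, h) in faces:
    hair_top = (x + w // 2, y)      # midpoint of the top edge of the box
    chin = (x + w // 2, y + h)      # midpoint of the bottom edge of the box
    cv2.circle(img, hair_top, 4, (0, 255, 0), -1)
    cv2.circle(img, chin, 4, (0, 0, 255), -1)

cv2.imwrite("annotated.jpg", img)
```
These points would then be shifted by the measurement readjustments described in option 1 above.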

Detecting the release of a ball in real time

I'm working on a project where I'm capturing people making free throw shots via a video camera. I need a way to detect, as fast as possible, the instant the ball is released from a player's hand. I tried researching a lot of detection/tracking algorithms, but everything I've found seemed more suited to tracking the ball itself. While I may eventually want to do that, right now all I need to know is the release timing.
I'm also open to other solutions that don't use the camera (I have a decent budget), but of course I'd like to use the camera if possible/fast enough. I'm also able to mess with the camera positioning/setup, and what I even want in the FOV.
Does anyone have any ideas? I'm pretty stuck right now, and haven't been able to find anything online that can help.
A solution is to use visual markers (motion trackers) on the throwing hands and on the ball. The precision depends on the FPS of the camera.
The assumption is that you know the ball's dimensions and the hand's grip on the ball, which may vary. By using visual markers/trackers you can track the position of the ball relative to the hand. When the distance between the hand marker and the ball's center becomes greater than the distance from the ball's center to its edge (i.e. the ball's radius), you have your release. Schema of the method
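As a rough illustration of that test (the positions would come from whatever marker tracker is used, and the jitter margin is an assumption):
```python
import numpy as np

def is_released(hand_xy, ball_center_xy, ball_radius_px, margin_px=5):
    """Declare release once the hand marker is farther from the ball's center
    than the ball's radius, plus a small margin against tracking jitter."""
    dist = np.linalg.norm(np.asarray(hand_xy) - np.asarray(ball_center_xy))
    return dist > ball_radius_px + margin_px
```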
A better solution is to use a graded meter bar (alternating black and white bars like the ones used on the MythBusters show to track the speed of objects). The moment a color gap appears between the hand and the ball, you have your release. The downside of this approach is that you have to capture the image from a side or top-down angle and use panels to hold the grading.
Your problem is similar to billiard-ball collision detection. I hope you find this paper helpful.
Edit:
There is a powerful and not that expensive tool for motion capture: the Microsoft Kinect. The downside of this tool is that its camera works at 30 fps and it cannot be used accurately in a very sunny scene. However, I have found a scientific paper about using the Kinect to record athletes, including free throws in basketball. Paper here
It's my first answer on SO. Any feedback on how to improve my future answers is appreciated.

Remove background and get deer as a foreground?

I want to remove background and get deer as a foreground image.
This is my source image, captured by a trail camera:
This is what I want to get. This output image can be a binary image or RGB.
I have worked on this and tried many methods, but each one failed at some point. So please first understand my exact problem.
Images are captured by a trail camera that has a motion detector: when a deer comes in front of the camera, it captures an image.
The scene changes with the weather and between day and night, so I can't use frame differencing or anything like that.
Segmentation may not work correctly because the foreground (deer) and the background have similar colors in many cases.
If anything in my question is still ambiguous, please ask me to clarify before answering; it will be appreciated.
Thanks in advance.
Here's what I would do:
As was commented on your question, you can detect the deer and then perform grabcut to segment it from the picture.
To detect the deer, I would couple a classifier with a sliding-window approach. That means you'll have a classifier that, given a patch (possibly a large patch) of the image, outputs a score of how similar that patch is to a deer. The sliding-window approach means that you loop over window sizes and then over window locations. For each position of the window in the image, you apply the classifier to that window and get a score of how much that window "looks like" a deer. Once you've done that, threshold all the scores to keep the "best windows", i.e. the windows most similar to a deer. The rationale is that if a deer is present at some location in the image, the classifier will output a high score for all windows that are close to or overlap the actual deer location. We would like to merge all those locations into a single one. That can be done with the groupRectangles function from OpenCV:
http://docs.opencv.org/modules/objdetect/doc/cascade_classification.html#grouprectangles
Take a look at a face-detection example from OpenCV; it does basically the same thing (sliding window + classifier), where the classifier is a Haar cascade.
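A rough sketch of the sliding-window loop plus groupRectangles, with a placeholder standing in for the actual deer classifier (window sizes, step, threshold and file name are all assumptions):
```python
import cv2

def score_patch(patch):
    """Placeholder: return a real deer-likeness score here (HOG+SVM, CNN features, ...)."""
    return 0.0

img = cv2.imread("trail_cam_frame.jpg")            # hypothetical input frame
candidates = []
for win in (128, 192, 256):                        # loop over window sizes
    step = win // 4
    for y in range(0, img.shape[0] - win, step):   # loop over window positions
        for x in range(0, img.shape[1] - win, step):
            if score_patch(img[y:y + win, x:x + win]) > 0.8:   # score threshold
                candidates.append([x, y, win, win])

# Merge overlapping detections into single boxes, as described above.
boxes, weights = cv2.groupRectangles(candidates, 1, 0.2)
```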
Now, I didn't mention what that "deer classifier" can be. You can use HOG+SVM (both included in OpenCV) or a much more powerful approach: running a deep convolutional neural network (deep CNN). Luckily, you don't need to train a deep CNN yourself. You can use the following packages with their "off the shelf" ImageNet networks (which are very powerful and might even be able to identify a deer without further training):
Decaf- which can be used only for research purposes:
https://github.com/UCB-ICSI-Vision-Group/decaf-release/
Or Caffe - which is BSD licensed:
http://caffe.berkeleyvision.org/
There are other packages of which you can read about here:
http://deeplearning.net/software_links/
The most common ones are Theano, Cuda ConvNet and OverFeat (but that's really opinion-based; you should choose the best package from the list that I linked to).
The "off the shelf" ImageNet networks were trained on roughly 10M images from 1000 categories. If those categories contain "deer", then you can just use them as is. If not, you can use them to extract features (a 4096-dimensional vector in the case of Decaf) and train a classifier on positive and negative images to build a "deer classifier".
Now, once you have detected the deer, meaning you have a bounding box around it, you can apply grabcut:
http://docs.opencv.org/trunk/doc/py_tutorials/py_imgproc/py_grabcut/py_grabcut.html
You'll need an initial scribble on the deer to perform grabcut. You can just take a horizontal line in the middle of the bounding box and hope that it will lie on the deer's torso. A more elaborate approach would be to find the symmetry axis of the deer and use that as the scribble, but you would have to google, research and implement some method for extracting a symmetry axis from the image.
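Alternatively, cv2.grabCut can be initialised directly from the detector's rectangle instead of a scribble; a minimal sketch of that variant (the file name and rectangle are placeholders):
```python
import cv2
import numpy as np

img = cv2.imread("trail_cam_frame.jpg")        # hypothetical input frame
rect = (250, 120, 400, 300)                    # (x, y, w, h) from the detector, an assumption

mask = np.zeros(img.shape[:2], np.uint8)
bgd_model = np.zeros((1, 65), np.float64)
fgd_model = np.zeros((1, 65), np.float64)
cv2.grabCut(img, mask, rect, bgd_model, fgd_model, 5, cv2.GC_INIT_WITH_RECT)

# Pixels labelled definite or probable foreground form the deer mask.
fg_mask = np.where((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD), 255, 0).astype(np.uint8)
deer_only = cv2.bitwise_and(img, img, mask=fg_mask)
cv2.imwrite("deer_foreground.png", deer_only)
```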
That's about it. Not straightforward, but neither is the problem.
Please let me know if you have any questions.
Try OpenCV background subtraction with Mixture of Gaussians models. They should be adaptive enough for your scenes. Of course, the final performance will depend on the scenario, but it is worth trying.
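Something along these lines, for example; the frame names are placeholders and the parameters would need tuning, since a trail camera's long gaps between frames limit how well the model can adapt:
```python
import cv2

# Mixture-of-Gaussians background subtractor from OpenCV.
subtractor = cv2.createBackgroundSubtractorMOG2(history=200, varThreshold=25,
                                                detectShadows=True)

for path in ["frame_001.jpg", "frame_002.jpg", "frame_003.jpg"]:   # hypothetical frames
    frame = cv2.imread(path)
    fg_mask = subtractor.apply(frame)      # white where the model thinks something moved
    cv2.imwrite(path.replace(".jpg", "_mask.png"), fg_mask)
```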
Since you just want to separate the background from the foreground, I think you do not need to recognize the deer. You need to recognize an object in motion in the scene. You just need to separate what is static over a significant interval of time (background) from what is not static: the deer.
There are algorithms that combine multiple frames from the same scene in order to determine the background, like THIS ONE.
You mentioned that the scene changes with the weather and between day and night when considering photos of different deer.
You could implement a solution where, when motion is detected, instead of taking a single photo the camera takes several, spaced by some interval of time.
This interval has to be long enough to get the deer in different positions or out of the scene, and at the same time short enough not to be much affected by scene variations. Perhaps you will need to deal with some brightness variation, but I think it is feasible to determine the background from these frames and finally segment the deer in the "motion frame".
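A sketch of that burst idea, assuming a handful of frames per trigger; file names and thresholds are made up:
```python
import cv2
import numpy as np

# Per-pixel median of a short burst approximates the background, because the
# deer occupies different pixels in each frame.
frames = [cv2.imread(f"burst_{i}.jpg") for i in range(5)]      # hypothetical burst
background = np.median(np.stack(frames), axis=0).astype(np.uint8)

target = frames[2]                                   # the frame we want to segment
diff = cv2.absdiff(target, background)
gray = cv2.cvtColor(diff, cv2.COLOR_BGR2GRAY)
_, mask = cv2.threshold(gray, 30, 255, cv2.THRESH_BINARY)
mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
deer = cv2.bitwise_and(target, target, mask=mask)
cv2.imwrite("motion_frame_segmented.png", deer)
```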

How can I get the rectangular areas of difference between two images?

I feel like I have a very typical image-comparison problem, and my Google searches are not revealing answers.
I want to transmit still images of a desktop every X seconds. Currently, we send a new image if the old and new differ by even one pixel. Very often only something very minor changes, like the clock or an icon, and it would be great if I could just send the changed portion to the server and update the image (way less bandwidth).
The plan I envision is to get a rectangle of an area that has changed. For instance, if the clock changed, screen capture the smallest rectangle that encompasses the changes, and send it to the server along with its (x, y) coordinate. The server will then update the old image by overlaying the rectangle at the coordinate specified.
Is there any algorithm or library that accomplishes this? I don't want it to be perfect; let's say I'll always send a single rectangle that encompasses all the changes (even if many smaller rectangles would be more efficient).
My other idea was to get a diff between the new and old images that's saved as a series of transformations. Then, I would just send the series of transformations to the server, which would then apply this to the old image to get the new image. Not sure if this is even possible, just a thought.
Any ideas? Libraries I can use?
Compare every pixel of the previous frame with the corresponding pixel of the next frame, and keep track of which pixels have changed?
Since you are only looking for a single box to encompass all the changes, you actually only need to keep track of the min-x, min-y (not necessarily from the same pixel), max-x, and max-y. Those four values will give you the edges of your rectangle.
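A small sketch of that min/max bookkeeping (file names are placeholders):
```python
import cv2
import numpy as np

old = cv2.imread("previous_frame.png")
new = cv2.imread("current_frame.png")

changed = np.any(old != new, axis=-1)        # True wherever any channel differs
ys, xs = np.nonzero(changed)
if xs.size:
    x0, y0 = xs.min(), ys.min()              # top-left corner of the enclosing box
    x1, y1 = xs.max() + 1, ys.max() + 1      # bottom-right corner (exclusive)
    patch = new[y0:y1, x0:x1]                # send this patch plus (x0, y0) to the server
```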
Note that this job (comparing the two frames) should really be off-loaded to the GPU, which could do this significantly faster than the CPU.
Note also that what you are trying to do is essentially a home-grown lossless streaming video compression algorithm. Using one from an existing library would not only be much easier, but also probably much more performant.
This is from an algorithmic point of view. Not sure if this is easier to implement.
Basically, XOR the two images and compress the result using any information-theory algorithm (Huffman coding?).
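For instance, a sketch using zlib as a stand-in for the entropy coder (file names are placeholders):
```python
import zlib
import cv2
import numpy as np

# The XOR of two nearly identical frames is mostly zeros, which compresses very well.
old = cv2.imread("previous_frame.png")
new = cv2.imread("current_frame.png")
delta = np.bitwise_xor(old, new)
payload = zlib.compress(delta.tobytes())       # send this to the server
# The server reconstructs the new frame as old XOR delta after zlib.decompress.
```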
I know I am very late responding, but I found this question today.
I have done some analysis on image differencing, but the code was written for Java. Kindly look into the link below; it may help:
How to find rectangle of difference between two images
The code finds the differences and keeps the rectangles in a LinkedList. You can use the LinkedList containing the rectangles to patch the differences onto the base image.
Cheers!

Looking for ways for a robot to locate itself in the house

I am hacking a vacuum cleaner robot to control it with a microcontroller (Arduino). I want to make it more efficient when cleaning a room. For now, it just goes straight and turns when it hits something.
But I am having trouble finding the best algorithm or method to use to know its position in the room. I am looking for an idea that stays cheap (less than $100) and not too complex (one that doesn't require a PhD thesis in computer vision). I can add some discrete markers in the room if necessary.
Right now, my robot has:
One webcam
Three proximity sensors (around 1 meter range)
Compass (not used for now)
Wi-Fi
Its speed can vary if the battery is full or nearly empty
A netbook Eee PC is embedded on the robot
Do you have any idea for doing this? Does any standard method exist for these kind of problems?
Note: if this question belongs on another website, please move it, I couldn't find a better place than Stack Overflow.
The problem of figuring out a robot's position in its environment is called localization. Computer science researchers have been trying to solve this problem for many years, with limited success. One problem is that you need reasonably good sensory input to figure out where you are, and sensory input from webcams (i.e. computer vision) is far from a solved problem.
If that didn't scare you off: one of the approaches to localization that I find easiest to understand is particle filtering. The idea goes something like this:
You keep track of a bunch of particles, each of which represents one possible location in the environment.
Each particle also has an associated probability that tells you how confident you are that the particle really represents your true location in the environment.
When you start off, all of these particles might be distributed uniformly throughout your environment and be given equal probabilities. Here the robot is gray and the particles are green.
When your robot moves, you move each particle. You might also degrade each particle's probability to represent the uncertainty in how the motors actually move the robot.
When your robot observes something (e.g. a landmark seen with the webcam, a wifi signal, etc.) you can increase the probability of particles that agree with that observation.
You might also want to periodically replace the lowest-probability particles with new particles based on observations.
To decide where the robot actually is, you can either use the particle with the highest probability, the highest-probability cluster, the weighted average of all particles, etc.
If you search around a bit, you'll find plenty of examples: e.g. a video of a robot using particle filtering to determine its location in a small room.
Particle filtering is nice because it's pretty easy to understand. That makes implementing and tweaking it a little less difficult. There are other similar techniques (like Kalman filters) that are arguably more theoretically sound but can be harder to get your head around.
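To make the steps above concrete, here is a deliberately simplified 1-D sketch (a real robot would track 2-D position plus heading, and all of the noise figures here are assumptions):
```python
import numpy as np

rng = np.random.default_rng(0)
N = 500
particles = rng.uniform(0.0, 10.0, N)   # candidate positions along a 10 m corridor
weights = np.full(N, 1.0 / N)           # start with equal probabilities

def step(move, measured_dist, wall_at=10.0, motion_noise=0.05, sensor_noise=0.2):
    """Move every particle, re-weight by how well it explains a range reading, resample."""
    global particles, weights
    particles += move + rng.normal(0.0, motion_noise, N)     # motion update with uncertainty
    predicted = wall_at - particles                          # predicted distance to the wall
    weights *= np.exp(-0.5 * ((predicted - measured_dist) / sensor_noise) ** 2)
    weights /= weights.sum()
    # Resample: replace low-probability particles with copies of likely ones.
    idx = rng.choice(N, N, p=weights)
    particles, weights = particles[idx], np.full(N, 1.0 / N)

step(move=0.5, measured_dist=7.3)
estimate = particles.mean()              # average of the particles = position estimate
```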
A QR Code poster in each room would not only make an interesting Modern art piece, but would be relatively easy to spot with the camera!
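OpenCV's built-in QR detector would make spotting such posters fairly painless; a rough sketch (the file name and the idea of encoding the room name in the code are assumptions):
```python
import cv2

detector = cv2.QRCodeDetector()
frame = cv2.imread("webcam_frame.jpg")        # hypothetical webcam capture
room_name, corners, _ = detector.detectAndDecode(frame)
if corners is not None and room_name:
    print("Robot sees the poster for:", room_name)
```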
If you can place some markers in the room, using the camera could be an option. If 2 known markers have an angular displacement (left to right) then the camera and the markers lie on a circle whose radius is related to the measured angle between the markers. I don't recall the formula right off, but the arc segment (on that circle) between the markers will be twice the angle you see. If you have the markers at known height and the camera is at a fixed angle of inclination, you can compute the distance to the markers. Either of these methods alone can nail down your position given enough markers. Using both will help do it with fewer markers.
Unfortunately, those methods are imperfect due to measurement errors. You get around this by using a Kalman estimator to incorporate multiple noisy measurements and arrive at a good position estimate; you can then feed in some dead-reckoning information (which is also imperfect) to refine it further. This part goes pretty deep into math, but I'd say it's a requirement to do a great job at what you're attempting. You can do OK without it, but if you want an optimal solution (in terms of the best position estimate for a given input) there is no better way. If you actually want a career in autonomous robotics, this will play a large part in your future.
Once you can determine your position you can cover the room in any pattern you'd like. Keep using the bump sensor to help construct a map of obstacles and then you'll need to devise a way to scan incorporating the obstacles.
Not sure if you've got the math background yet, but here is the book:
http://books.google.com/books/about/Applied_optimal_estimation.html?id=KlFrn8lpPP0C
This doesn't replace the accepted answer (which is great, thanks!), but I might recommend getting a Kinect and using it instead of your webcam, either through Microsoft's recently released official drivers or the hacked drivers if your Eee PC doesn't have Windows 7 (presumably it does not).
That way the positioning will be improved by the 3D vision. Observing landmarks will now tell you how far away the landmark is, and not just where in the visual field that landmark is located.
Regardless, the accepted answer doesn't really address how to pick out landmarks in the visual field, and simply assumes that you can. While the Kinect drivers may already have feature detection included (I'm not sure) you can also use OpenCV for detecting features in the image.
One solution would be to use a strategy similar to "flood fill" (wikipedia). To get the controller to accurately perform sweeps, it needs a sense of distance. You can calibrate your bot using the proximity sensors: e.g. run motor for 1 sec = xx change in proximity. With that info, you can move your bot for an exact distance, and continue sweeping the room using flood fill.
Assuming you are not looking for a generalised solution, you may actually know the room's shape, size, potential obstacle locations, etc. When the bot exits the factory there is no info about its future operating environment, which kind of forces it to be inefficient from the outset.
If that's your case, you can hardcode that info and then use basic measurements (i.e. rotary encoders on the wheels + compass) to precisely figure out its location in the room/house. No need for Wi-Fi triangulation or crazy sensor setups, in my opinion. At least for a start.
Ever considered GPS? Every position on Earth has unique GPS coordinates, with a resolution of 1 to 3 metres; with differential GPS you can get down to the sub-10 cm range. More info here:
http://en.wikipedia.org/wiki/Global_Positioning_System
And Arduino has lots of options for GPS modules:
http://www.arduino.cc/playground/Tutorials/GPS
After you have collected all the key coordinate points of the house, you can then write the routine for the Arduino to move the robot from point to point (as collected above), assuming it will handle all the obstacle avoidance.
More information can be found here:
http://www.google.com/search?q=GPS+localization+robots&num=100
And in that list I found this, specifically for your case: Arduino + GPS + localization:
http://www.youtube.com/watch?v=u7evnfTAVyM
I was thinking about this problem too. But I don't understand why you can't just triangulate? Have two or three beacons (e.g. IR LEDs of different frequencies) and a rotating IR sensor 'eye' on a servo. You could then get an almost constant fix on your position. I expect the accuracy would be in the low-cm range and it would be cheap. You can then map anything you bump into easily.
Maybe you could also use any interruption in the beacon beams to plot objects that are quite far from the robot too.
You said you have a camera? Did you consider looking at the ceiling? There is little chance that two rooms have identical dimensions, so you can identify which room you are in; the position within the room can be computed from the angular distance to the borders of the ceiling, and the direction can probably be extracted from the position of the doors.
This will require some image processing, but since the vacuum cleaner moves slowly in order to clean efficiently, it will have enough time to compute.
Good luck!
Use an ultrasonic sensor (HC-SR04 or similar).
As mentioned above, sense the distance from the robot to the walls with the sensors, and identify which part of the room you are in with a QR code.
When you are near a wall, turn 90 degrees, move by the width of your robot, turn 90 degrees again (i.e. a 90-degree left turn), and keep moving. I think it will help :)
