I am trying to distinguish car lights from the light reflection from street at night images. For example, in an image like this:
I tried changing to some other color spaces, but it didn't work. For example, cvtColor(image, gray, CV_HSV2BGR_FULL); made it like:
However, in this post it works fine.
Is there any way I can do something similar for this image? I am using OpenCV3.1 with C++ on Windows (Python would also be great).
In general, a reflection is a light source. As far as light propagation is concerned, the light source is the direction the photons are coming from. It might be a source or a reflection.
Thus, a better view on the problem would be to think more about what you are trying to detect. In the post you referenced, the problem is much simpler, since the reflections are not blown out (reaching maximum brightness), while the light sources are (or they are close enough). In your image, due to the low quality of the capture, the reflections and the light sources have similar brightness.
To emphasize why this is a problem, your approach is a local approach. Think about a very small patch (e.g. 3x3 pixels) from your image. Could you tell if it's from a reflection or not?
Therefore, a better view would be to think about shapes. You want to detect car lights, which seem to be round, of about the same size and white.
I would suggest something like a blob detector finely tuned to your problem, or to use your technique to threshold the image, and then run a CCA and measure the circularity/size of the components.
You can also consider doing some cropping/filling in the city lights with black since the camera seems fixed.
Related
I would like to create a plastic material using three.js, something like the lighter fuel container here:
http://wiki.blender.org/index.php/Doc:2.4/Tutorials/Render/Import/SolidWorks
I would be glad if I could get a reasonably simple example to start working from.
I am actually not rendering an image but visualizing a mathematical problem (cellular automata). I need a set of interlocking surfaces (something like sheets of plastic foil) with as much visual information as possible, so I can distinguish between them. Therefore I was looking for: translucency, reflections, rotating an object with a fixed light source, visible edges. Later I will add some animated color coding, but for now I need a good material.
Here is the current status of my code:
https://github.com/jeras/three.js/tree/master/pyca
Here is how this networks look for 1D CA, but I would like to handle a 2D problem:
http://rattus.info/al/files/conference.pdf
Thanks,
Iztok Jeras
Well if you are looking for some examples to start working from , you should go to this three.js tutorials site : http://stemkoski.github.io/Three.js .
There is a lot of examples and the ones you might be interested of are :
the tranlucence
the reflection
the refraction
some bubble effect
Hope this helps
I am in the process of learning how to create a lens flare application. I've got most of the basic components figured out and now I'm moving on to the more complicated ones such as the glimmers / glints / spikeball as seen here: http://wiki.nuaj.net/images/e/e1/OpticalFlaresLensObjects.png
Or these: http://ak3.picdn.net/shutterstock/videos/1996229/preview/stock-footage-blue-flare-rotate.jpg
Some have suggested creating particles that emanate outwards from the center while fading out and either increasing or decreasing in size but I've tried this and there are just too many nested loops which makes performance awful.
Someone else suggested drawing a circular gradient from center white to radius black and using some algorithms to lighten and darken areas thus producing rays.
Does anyone have any ideas? I'm really stuck on this one.
I am using a limited compiler that is similar to C but I don't have any access to antialiasing, predefined shapes, etc. Everything has to be hand-coded.
Any help would be greatly appreciated!
I would create large circle selections, then use a radial gradient. Each side of the gradient is white, but one side has 100% alpha and the other 0%. Once you have used the gradient tool to draw that gradient inside the circle. Deselect it and use the transform tool to Skew or in a sense smash it. Then duplicate it several times and turn each one creating a spiral or circle holding Ctrl to constrain when needed. Then once those several layers are in the rotation or design that you want. Group them in a folder and then you can further effect them all at once with another transform or skew. WHen you use these real smal, they are like little stars. But you can do many different things when creating each one to make them different. Like making each one lower in opacity than the last etc...
I found a few examples of how to do lens-flare 'via code'. Ideally you'd want to do this as a post-process - meaning after you're done with your regular render, you process the image further.
Fragment shaders are apt for this step. The easiest version I found is this one. The basic idea is to
Identify really bright spots in your image and potentially down sample it.
Shoot rays from the fragment to the center of the image and sample some pixels along the way.
Accumalate the samples and apply further processing - chromatic distortion etc - on it.
And you get a whole range of options to play with.
Another more common alternative seems to be
Have a set of basic images (circles, hexes) and render them as a bunch of bright objects, along the path from the camera to the light(s).
Composite this image on top of the regular render of you scene.
The problem is in determining when to turn on lens flare, since it is dependant on whether a light is visible/occluded from a camera. GPU Gems comes to rescue, with better options.
A more serious, physically based implementation is listed in this paper. This is a real-time version of making lens-flares, but you need a hardware that can support both vertex and geometry shaders.
IES (Illuminating Engineering Society) is a file format (.ies) that enhances lights in animation tools. It adds accurate falloff, dispersion, color temperature, spatial emission, brightness and stuff like that. It's an industrial standard to show how lighting products really look like. Many animation tools (Maya, Cinema4D, Blender etc.) are able to utilize this format.
Yet, I'm still searching for a way to import/use IES in WebGL frameworks. Using an animation tool (in my case Blender) to import and process .ies-files and finally export the project to a webgl-format seemed to be the most promising method to me. I tried out a couple of WebGL-Frameworks (three.js, x3dom, coppercube) and many more export formats/import-/converter scripts but none of it produced satisfying results. The processed results showed no lights, default lights or at best the same number of lights with no further attributes.
Does anybody by any chance know of a working combination of animation tool -> export function -> WebGL-framework or WebGL-ies-format-import that would do the trick?
Are there people with the same problem and longing for a solution?
You could maybe transfer some attributes from .ies files (color, intensity and such), but the Three.js renderer simply doesn't support the complex light properties .ies (like the shape of light) is meant to describe. So, even if you were able to import/export those properties, the default Three.js light system would not be able to render them properly.
Even if you implemented your own shaders and lights (you can do that in Three.js), it probably would still be prohibitively slow and/or inaccurate, as you probably need some raytracing/pathtracing -based approach for good enough results. WebGL approach to rendering in general, no matter what library or framework you use, is not very good fit for complex, accurate simulation of lights and shadows.
That being said, I would be highly interested in any solution, even a crude approximate support would be useful.
Three.js now has an example of area lights. It is a very crude approximation (analytic evaluation of the area integral after some serious simplification). There is an article by Insomniac Games in GPU Pro 5 that offers are more accurate solution.
This only helps you with the area and shape of the light; you are on your own with the other attributes.
P.S. It's hard to tell from your question, but if your light format contains spectrum there have recently been some nice articles on real-time spectral lighting:
http://www.numb3r23.net/2013/03/26/spectral-rendering-on-the-gpu-now-with-bumps/
I have a web cam that takes a picture every N seconds. This gives me a collection of images of the same scene over time. I want to process that collection of images as they are created to identify events like someone entering into the frame, or something else large happening. I will be comparing images that are adjacent in time and fixed in space - the same scene at different moments of time.
I want a reasonably sophisticated approach. For example, naive approaches fail for outdoor applications. If you count the number of pixels that change, for example, or the percentage of the picture that has a different color or grayscale value, that will give false positive reports every time the sun goes behind a cloud or the wind shakes a tree.
I want to be able to positively detect a truck parking in the scene, for example, while ignoring lighting changes from sun/cloud transitions, etc.
I've done a number of searches, and found a few survey papers (Radke et al, for example) but nothing that actually gives algorithms that I can put into a program I can write.
Use color spectroanalisys, without luminance: when the Sun goes down for a while, you will get similar result, colors does not change (too much).
Don't go for big changes, but quick changes. If the luminance of the image changes -10% during 10 min, it means the usual evening effect. But when the change is -5%, 0, +5% within seconds, its a quick change.
Don't forget to adjust the reference values.
Split the image to smaller regions. Then, when all the regions change same way, you know, it's a global change, like an eclypse or what, but if only one region's parameters are changing, then something happens there.
Use masks to create smart regions. If you're watching a street, filter out the sky, the trees (blown by wind), etc. You may set up different trigger values for different regions. The regions should overlap.
A special case of the region is the line. A line (a narrow region) contains less and more homogeneous pixels than a flat area. Mark, say, a green fence, it's easy to detect wheter someone crosses it, it makes bigger change in the line than in a flat area.
If you can, change the IRL world. Repaint the fence to a strange color to create a color spectrum, which can be identified easier. Paint tags to the floor and wall, which can be OCRed by the program, so you can detect wheter something hides it.
I believe you are looking for Template Matching
Also i would suggest you to look on to Open CV
We had to contend with many of these issues in our interactive installations. It's tough to not get false positives without being able to control some of your environment (sounds like you will have some degree of control). In the end we looked at combining some techniques and we created an open piece of software named OpenTSPS (Open Toolkit for Sensing People in Spaces - http://www.opentsps.com). You can look at the C++ source in github (https://github.com/labatrockwell/openTSPS/).
We use ‘progressive background relearn’ to adjust to the changing background over time. Progressive relearning is particularly useful in variable lighting conditions – e.g. if lighting in a space changes from day to night. This in combination with blob detection works pretty well and the only way we have found to improve is to use 3D cameras like the kinect which cast out IR and measure it.
There are other algorithms that might be relevant, like SURF (http://achuwilson.wordpress.com/2011/08/05/object-detection-using-surf-in-opencv-part-1/ and http://en.wikipedia.org/wiki/SURF) but I don't think it will help in your situation unless you know exactly the type of thing you are looking for in the image.
Sounds like a fun project. Best of luck.
The problem you are trying to solve is very interesting indeed!
I think that you would need to attack it in parts:
As you already pointed out, a sudden change in illumination can be problematic. This is an indicator that you probably need to achieve some sort of illumination-invariant representation of the images you are trying to analyze.
There are plenty of techniques lying around, one I have found very useful for illumination invariance (applied to face recognition) is DoG filtering (Difference of Gaussians)
The idea is that you first convert the image to gray-scale. Then you generate two blurred versions of this image by applying a gaussian filter, one a little bit more blurry than the first one. (you could use a 1.0 sigma and a 2.0 sigma in a gaussian filter respectively) Then you subtract from the less-blury image, the pixel intensities of the more-blurry image. This operation enhances edges and produces a similar image regardless of strong illumination intensity variations. These steps can be very easily performed using OpenCV (as others have stated). This technique has been applied and documented here.
This paper adds an extra step involving contrast equalization, In my experience this is only needed if you want to obtain "visible" images from the DoG operation (pixel values tend to be very low after the DoG filter and are veiwed as black rectangles onscreen), and performing a histogram equalization is an acceptable substitution if you want to be able to see the effect of the DoG filter.
Once you have illumination-invariant images you could focus on the detection part. If your problem can afford having a static camera that can be trained for a certain amount of time, then you could use a strategy similar to alarm motion detectors. Most of them work with an average thermal image - basically they record the average temperature of the "pixels" of a room view, and trigger an alarm when the heat signature varies greatly from one "frame" to the next. Here you wouldn't be working with temperatures, but with average, light-normalized pixel values. This would allow you to build up with time which areas of the image tend to have movement (e.g. the leaves of a tree in a windy environment), and which areas are fairly stable in the image. Then you could trigger an alarm when a large number of pixles already flagged as stable have a strong variation from one frame to the next one.
If you can't afford training your camera view, then I would suggest you take a look at the TLD tracker of Zdenek Kalal. His research is focused on object tracking with a single frame as training. You could probably use the semistatic view of the camera (with no foreign objects present) as a starting point for the tracker and flag a detection when the TLD tracker (a grid of points where local motion flow is estimated using the Lucas-Kanade algorithm) fails to track a large amount of gridpoints from one frame to the next. This scenario would probably allow even a panning camera to work as the algorithm is very resilient to motion disturbances.
Hope this pointers are of some help. Good Luck and enjoy the journey! =D
Use one of the standard measures like Mean Squared Error, for eg. to find out the difference between two consecutive images. If the MSE is beyond a certain threshold, you know that there is some motion.
Also read about Motion Estimation.
if you know that the image will remain reletivly static I would reccomend:
1) look into neural networks. you can use them to learn what defines someone within the image or what is a non-something in the image.
2) look into motion detection algorithms, they are used all over the place.
3) is you camera capable of thermal imaging? if so it may be worthwile to look for hotspots in the images. There may be existing algorithms to turn your webcam into a thermal imager.
I am building a web application using .NET 3.5 (ASP.NET, SQL Server, C#, WCF, WF, etc) and I have run into a major design dilemma. This is a uni project btw, but it is 100% up to me what I develop.
I need to design a system whereby I can take an image and automatically crop a certain object within it, without user input. So for example, cut out the car in a picture of a road. I've given this a lot of thought, and I can't see any feasible method. I guess this thread is to discuss the issues and feasibility of achieving this goal. Eventually, I would get the dimensions of a car (or whatever it may be), and then pass this into a 3d modelling app (custom) as parameters, to render a 3d model. This last step is a lot more feasible. It's the cropping issue which is an issue. I have thought of all sorts of ideas, like getting the colour of the car and then the outline around that colour. So if the car (example) is yellow, when there is a yellow pixel in the image, trace around it. But this would fail if there are two yellow cars in a photo.
Ideally, I would like the system to be completely automated. But I guess I can't have everything my way. Also, my skills are in what I mentioned above (.NET 3.5, SQL Server, AJAX, web design) as opposed to C++ but I would be open to any solution just to see the feasibility.
I also found this patent: US Patent 7034848 - System and method for automatically cropping graphical images
Thanks
This is one of the problems that needed to be solved to finish the DARPA Grand Challenge. Google video has a great presentation by the project lead from the winning team, where he talks about how they went about their solution, and how some of the other teams approached it. The relevant portion starts around 19:30 of the video, but it's a great talk, and the whole thing is worth a watch. Hopefully it gives you a good starting point for solving your problem.
What you are talking about is an open research problem, or even several research problems. One way to tackle this, is by image segmentation. If you can safely assume that there is one object of interest in the image, you can try a figure-ground segmentation algorithm. There are many such algorithms, and none of them are perfect. They usually output a segmentation mask: a binary image where the figure is white and the background is black. You would then find the bounding box of the figure, and use it to crop. The thing to remember is that none of the existing segmentation algorithm will give you what you want 100% of the time.
Alternatively, if you know ahead of time what specific type of object you need to crop (car, person, motorcycle), then you can try an object detection algorithm. Once again, there are many, and none of them are perfect either. On the other hand, some of them may work better than segmentation if your object of interest is on very cluttered background.
To summarize, if you wish to pursue this, you would have to read a fair number of computer vision papers, and try a fair number of different algorithms. You will also increase your chances of success if you constrain your problem domain as much as possible: for example restrict yourself to a small number of object categories, assume there is only one object of interest in an image, or restrict yourself to a certain type of scenes (nature, sea, etc.). Also keep in mind, that even the accuracy of state-of-the-art approaches to solving this type of problems has a lot of room for improvement.
And by the way, the choice of language or platform for this project is by far the least difficult part.
A method often used for face detection in images is through the use of a Haar classifier cascade. A classifier cascade can be trained to detect any objects, not just faces, but the ability of the classifier is highly dependent on the quality of the training data.
This paper by Viola and Jones explains how it works and how it can be optimised.
Although it is C++ you might want to take a look at the image processing libraries provided by the OpenCV project which include code to both train and use Haar cascades. You will need a set of car and non-car images to train a system!
Some of the best attempts I've see of this is using a large database of images to help understand the image you have. These days you have flickr, which is not only a giant corpus of images, but it's also tagged with meta-information about what the image is.
Some projects that do this are documented here:
http://blogs.zdnet.com/emergingtech/?p=629
Start with analyzing the images yourself. That way you can formulate the criteria on which to match the car. And you get to define what you cannot match.
If all cars have the same background, for example, it need not be that complex. But your example states a car on a street. There may be parked cars. Should they be recognized?
If you have access to MatLab, you could test your pattern recognition filters with specialized software like PRTools.
Wwhen I was studying (a long time ago:) I used Khoros Cantata and found that an edge filter can simplify the image greatly.
But again, first define the conditions on the input. If you don't do that you will not succeed because pattern recognition is really hard (think about how long it took to crack captcha's)
I did say photo, so this could be a black car with a black background. I did think of specifying the colour of the object, and then when that colour is found, trace around it (high level explanation). But, with a black object in a black background (no constrast in other words), it would be a very difficult task.
Better still, I've come across several sites with 3d models of cars. I could always use this, stick it into a 3d model, and render it.
A 3D model would be easier to work with, a real world photo much harder. It does suck :(
If I'm reading this right... This is where AI shines.
I think the "simplest" solution would be to use a neural-network based image recognition algorithm. Unless you know that the car will look the exact same in each picture, then that's pretty much the only way.
If it IS the exact same, then you can just search for the pixel pattern, and get the bounding rectangle, and just set the image border to the inner boundary of the rectangle.
I think that you will never get good results without a real user telling the program what to do. Think of it this way: how should your program decide when there is more than 1 interesting object present (for example: 2 cars)? what if the object you want is actually the mountain in the background? what if nothing of interest is inside the picture, thus nothing to select as the object to crop out? etc, etc...
With that said, if you can make assumptions like: only 1 object will be present, then you can have a go with using image recognition algorithms.
Now that I think of it. I recently got a lecture about artificial intelligence in robots and in robotic research techniques. Their research went on about language interaction, evolution, and language recognition. But in order to do that they also needed some simple image recognition algorithms to process the perceived environment. One of the tricks they used was to make a 3D plot of the image where x and y where the normal x and y axis and the z axis was the brightness of that particular point, then they used the same technique for red-green values, and blue-yellow. And lo and behold they had something (relatively) easy they could use to pick out the objects from the perceived environment.
(I'm terribly sorry, but I can't find a link to the nice charts they had that showed how it all worked).
Anyway, the point is that they were not interested (that much) in image recognition so they created something that worked good enough and used something less advanced and thus less time consuming, so it is possible to create something simple for this complex task.
Also any good image editing program has some kind of magic wand that will select, with the right amount of tweaking, the object of interest you point it on, maybe it's worth your time to look into that as well.
So, it basically will mean that you:
have to make some assumptions, otherwise it will fail terribly
will probably best be served with techniques from AI, and more specifically image recognition
can take a look at paint.NET and their algorithm for their magic wand
try to use the fact that a good photo will have the object of interest somewhere in the middle of the image
.. but i'm not saying that this is the solution for your problem, maybe something simpler can be used.
Oh, and I will continue to look for those links, they hold some really valuable information about this topic, but I can't promise anything.