I'm writing an image processing application which recognizes objects based on their shapes. The issue I'm facing is that one object can be composed of one or more sub-objects, e.g. a human face is an object composed of eyes, a nose and a mouth.
Applying image segmentation creates separate objects, but it does not tell whether one object is inside another.
How can I check efficiently whether an object is contained inside another object?
For now my algorithm is what I would call an 8-point test: choose 8 points at the 8 corners and check whether all of them are inside the other object. If they all are, you can be fairly certain that the entire object is inside the other object... but it has certain limitations and failure cases...
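Roughly, the test looks like this (a minimal MATLAB sketch; outerMask and the inner object's bounding box values xmin/xmax/ymin/ymax are placeholders):
%# 8-point test: outerMask is a binary mask of the outer object,
%# [xmin xmax ymin ymax] the inner object's bounding box
xs = [xmin, (xmin + xmax)/2, xmax];
ys = [ymin, (ymin + ymax)/2, ymax];
[X, Y] = meshgrid(xs, ys);
pts = [X(:), Y(:)];
pts(5,:) = [];                      %# drop the centre, keeping the 8 border points
idx = sub2ind(size(outerMask), round(pts(:,2)), round(pts(:,1)));
allInside = all(outerMask(idx));    %# true if all 8 points lie inside the outer object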
Also, just because the inner object is inside another object, does that mean I should treat it as part of the outer object?
One way to test whether one object is fully inside another is to convert both into binary masks using poly2mask (in case they aren't binary masks already), and to test that all pixels of one object are part of the other object.
%# convert object 1, defined by points [x1,y1], into a mask
%# (note that poly2mask takes the mask size as rows-by-columns, i.e. height first)
msk1 = poly2mask(x1, y1, imageSizeY, imageSizeX);
%# do the same for object 2
msk2 = poly2mask(x2, y2, imageSizeY, imageSizeX);
%# object 1 is fully inside object 2 if every pixel of msk1 is also set in msk2
oneInsideTwo = all(msk2(msk1));
However, is this really necessary? The eyes should always be close to the center of the face, and thus the 8-point method should be fairly robust at identifying whether you found an eye that is part of the face or whether it is a segmentation artifact.
Also, if an eye is on a face, then yes, you would consider it as part of that face - unless you're analyzing pictures of people that are eating eyes, in which case you'd have to test whether the eye is in roughly the right position on the face.
In sum, the answer to your questions is a big "depends on the details of your application".
I have a problem where I want to rotate an Actor about an arbitrary vector. I wonder if there's a standard way of achieving that with Blueprints, given that I have the vector's coordinates. I didn't find anything useful online.
One more smaller issue I encountered, regarding the extraction of that vector:
Is there a way to extract world coordinates of some key-points of an Actor using Blueprints or the UE4 interface?
For example, given a door frame which is rotated 5 degrees around the X axis, can I extract the world coordinates of one of its corners using simple tools such as Blueprints or the interface?
Assuming you are rotating an actor (in your example, that actor should be the door), you can take multiple approaches, but I'll list only three:
Option 1
First, define a Socket in the door mesh, in the position you want to obtain. Then, get its current position with a GetSocketLocation node.
Option 2
If your door is intended to be a blueprint and you need to get a specific point, you can define a Scene Component in that Blueprint in the specific position you want it and then create a function that returns the World Location of that component. This is particularly useful if that position can change in time.
Option 3
Simply have a Vector parameter defining the offset of your given point in the actor's local space. You'll still need a function to translate that offset from local to world space (or world to local), depending on your approach.
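For Option 3, here is a minimal C++ sketch of the local-to-world conversion (the Blueprint equivalent is the Transform Location node; Door and PointOffset are hypothetical names):
// converts a point given in the actor's local space into world coordinates,
// taking the actor's location, rotation and scale into account
FVector GetPointWorldLocation(const AActor* Door, const FVector& PointOffset)
{
    return Door->GetActorTransform().TransformPosition(PointOffset);
}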
Given the context you provided, these are a few ways to interpret your situation.
I have the following questions:
What is the algorithm that bwareafilt uses?
Weird behaviour: when the input matrix is totally black, I get the following error:
Error using bwpropfilt (line 73)
Internal error: p must be positive.
Error in bwareafilt (line 33)
bw2 = bwpropfilt(bw, 'area', p, direction, conn);
Error in colour_reception (line 95)
Iz=bwareafilt(b,1);
I am using this function on snapshots taken from a webcam, and when I block the webcam completely, I get the above error.
So I believe it is caused by some internal implementation mistake. Is this the case? How do I overcome this?
Let's answer your questions one at a time:
What algorithm does bwareafilt use?
bwareafilt is a function from the Image Processing Toolbox that accepts a binary image and determines the unique objects in it. To find unique objects, a connected-components analysis is performed where each object is assigned a unique ID. You can think of this as performing a flood fill on each object individually. A flood fill can be performed using a variety of algorithms, among them depth-first search, where you treat the image as a graph whose nodes are pixels and whose edges connect neighbouring pixels. The flood fill visits all pixels that are connected to each other until there are no more pixels to visit within that object; you then proceed to the next object and repeat the same algorithm until you run out of objects.
Afterwards, it determines the "area" of each object by counting how many pixels belong to that object. Once we know the area of each object, we can either output an image that retains the top n objects or filter the image so that only objects within a certain range of areas are retained.
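If you're curious, here is a rough sketch of what bwareafilt(bw, 1) effectively does, using bwconncomp and regionprops from the same toolbox:
%# approximation of bwareafilt(bw, 1) for a binary image bw with >= 1 object
cc = bwconncomp(bw);                  %# label connected components
stats = regionprops(cc, 'Area');      %# area = pixel count per object
[~, largest] = max([stats.Area]);     %# index of the biggest object
out = false(size(bw));
out(cc.PixelIdxList{largest}) = true; %# keep only that object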
Given your code above, you are trying to output the largest object in the binary image, so you are using the former mode with n = 1.
Weird behaviour with bwareafilt
Given the above description of bwareafilt and your intended application:
I am using this function on snapshots taken from a webcam, and when I block the webcam completely, I get the above error.
... the error is self-explanatory. When you cover the webcam, the entire frame is black and no objects are found in the image. Asking for the object with the largest area therefore makes no sense: you are trying to make bwareafilt return the largest object in an image that contains no objects at all.
As such, if you want to use bwareafilt, I suggest you first check whether the entire image is black. If it isn't, go ahead and use bwareafilt; if it is, skip it.
Do something like this, assuming that b is the image you're trying to process:
if any(b(:))
Iz = bwareafilt(b, 1);
else
Iz = b;
end
The above code uses any to check whether there are any non-zero (white) pixels in your image b. If there are, bwareafilt is called as usual. If there aren't, the output is simply set to what b originally was (which is a dark image anyway).
You can add conditions to make your function robust to any input, for example by adding a simple check of whether the input image is all black, and only applying your object-filtering function when it is not.
Hi!
I am working with objects that have huge numbers of vertices. I am able to show lots of models because I have split them into smaller parts (under 65K vertices each), and I am using three.js cameras. I want to increase performance by using a priority queue: while the user is moving the camera, show only the top 10 models, and when the movement stops, show the rest. That part is not that hard, but I don't want to render models when they are behind another object. Maybe I could send out some rays from the camera's point of view (checking for bounding-box hits) and build the priority queue according to the hit list.
What do you think?
Also, how can I detect on the fly whether I can load the next model or not?
Option A: Occlusion culling; you will need to find a library for this.
Option B: Use an AABB/plane test between the camera frustum's planes and the object's bounding box; this tells you whether an object is in the camera's field of view. (It does not tell you whether the object is actually visible or hidden behind another object, as that is beyond this test alone; WebGL most likely already performs a degree of this culling anyway.)
Implementation:
Google it - three.js supports this, for example:
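A minimal three.js sketch of the frustum test, assuming a camera and a mesh (note: in older three.js versions setFromProjectionMatrix was called setFromMatrix):
// build the camera frustum from the projection and view matrices
const frustum = new THREE.Frustum();
const projScreenMatrix = new THREE.Matrix4()
    .multiplyMatrices(camera.projectionMatrix, camera.matrixWorldInverse);
frustum.setFromProjectionMatrix(projScreenMatrix);

const box = new THREE.Box3().setFromObject(mesh);  // world-space bounding box
const inView = frustum.intersectsBox(box);         // true if the AABB overlaps the frustum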
Option C: Use a maximum object render limit, prioritized by distance from the camera and size of the object. E.g. calculate which objects are visible (Option B), then prioritize the closest and biggest ones and disable the rest.
For example, reusing the frustum and world-space bounding box from Option B:
if (frustum.intersectsBox(box)) {
    // bigger and closer objects get a higher priority
    const size = box.getSize(new THREE.Vector3()).length();
    const distanceToCamera = camera.position.distanceTo(box.getCenter(new THREE.Vector3()));
    const priority = size / distanceToCamera;
}
Make sure your shaders are only doing one pass, as extra passes will roughly double the calculation time (depending on the situation).
Option D: Raycast to the eight corners of the bounding box; if all rays fail, don't render the object. This is pretty accurate but by no means perfect.
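A rough sketch of testing a single corner this way (corner and occluders are hypothetical: one corner of the target's bounding box, and an array of objects that could block it):
// cast a ray from the camera towards the corner
const dir = corner.clone().sub(camera.position).normalize();
const raycaster = new THREE.Raycaster(camera.position, dir);
const hits = raycaster.intersectObjects(occluders, true); // sorted by distance
const cornerDist = camera.position.distanceTo(corner);
// occluded if some other object is hit before the ray reaches the corner
const occluded = hits.length > 0 && hits[0].distance < cornerDist;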
Option A will be the best for sure. Option C is great if you don't mind that small objects far away don't get rendered. Option D works well with objects that have a lot of vertices; you may want to raycast more points of the object depending on the situation. Option B probably won't be useful on its own for your scenario, but it's a part of Option C and of other optimization methods. Overall, there has never been an extremely reliable and optimal way to tell whether something is behind something else.
I want to identify a ball in a picture. I am thinking of using the Sobel edge detection algorithm, with which I can detect the round objects in the image.
But how do I differentiate between different objects? For example, one picture contains a football and another a picture of the moon; how do I tell which object has been detected?
When I use my algorithm I get a ball in both cases. Any ideas?
Well, if all the objects you would like to detect are round, you could even use a Hough transform for circles. This is a very good way of finding round objects.
But your basic problem seems to be classification - sorting the objects on your image into different classes.
For this you don't really need a neural network; you could simply try a nearest-neighbour match. It works a bit like a neural network in that you give it several reference pictures, tell the system what can be seen in each, and it settles on average values for each attribute you detect. This gives you a dictionary of clusters for the different types of objects.
But for this you'll of course first need something that distinguishes a ball from a moon.
Since they are all round objects (which appear as circles), it is useless to compare circularity, circumference, diameter or area - unless your camera is fixed and you know the moon will always have the same, distinct size in your images, unlike a ball.
So basically you need to look inside the objects themselves. You can try to compare their mean colour or grayscale value, or the contrast inside the object (the moon will mostly have mid-gray values, whereas a soccer ball consists of black and white parts).
You could also run edge filters on the segmented objects just to determine which is more "edgy" in its texture, but I guess there are better methods...
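A minimal MATLAB sketch of such attributes (obj is a hypothetical grayscale image of one segmented object, with background pixels set to 0):
pixels = double(obj(obj > 0));   %# look only at pixels inside the object
meanGray = mean(pixels);         %# moon: mid-gray; football: mixture of extremes
contrast = std(pixels);          %# a black-and-white ball gives a high spread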
So basically what you need to do first:
Find several attributes that help you distinguish the different round objects (assuming they are already separated).
Implement something to extract these values from a picture of a round object (which is already segmented, of course, so it has a background of 0).
Build a supervised learning system that you feed several images together with their class, giving it several images of each type (there are many implementations of that online); a sketch follows this list.
Now you have your system running and can give it other objects to classify.
For this you need to segment the objects in the image, e.g. with edge filters or a Hough transform.
For each segmented object in an image, run it through your classification system and it should tell you which class (type of object) it belongs to...
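As a sketch of the classification step, using fitcknn from MATLAB's Statistics and Machine Learning Toolbox (the feature values here are made up):
trainX = [110 12; 105 15; 128 80; 135 75];   %# [meanGray contrast] per training image
trainY = {'moon'; 'moon'; 'ball'; 'ball'};   %# class label of each training image
mdl = fitcknn(trainX, trainY, 'NumNeighbors', 1);
label = predict(mdl, [120 70])               %# classify a new object's features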
Hope that helps... if not, please keep asking...
When you apply an edge detection algorithm you lose information.
Thus the moon and the ball are the same.
The moon has a different color, a different texture, ... You can use this information to differentiate what object has been detected.
That's a question in AI.
If you think about it, the reason you know it's a ball and not a moon, is because you've seen a lot of balls and moons in your life.
So, you need to teach the program what a ball is, and what a moon is. Give it some kind of dictionary or something.
The problem with a dictionary, of course, is that matching the object against all the objects in the dictionary would take time.
So the best solution would probably be to use neural networks. I don't know what programming language you're using, but there are neural network implementations for most languages I've encountered.
You'll have to read a bit about it, decide what kind of neural network, and its architecture.
After you have it implemented, it gets easy. You just give it a lot of pictures to learn from (neural networks take a vector as input, so you can give it the whole picture).
For each picture you give it, you tell it what it is. So you give it, say, 20 different moon pictures and 20 different ball pictures. After that you tell it to learn (usually a built-in function).
The neural network will go over the data you gave it, and learn how to differentiate the 2 objects.
Later you can use the network you trained: give it a picture and it returns a score for each class, like 30% ball, 85% moon.
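In MATLAB, for example, a minimal sketch with patternnet from the Deep Learning Toolbox (trainImgs, targets and testImg are hypothetical variables):
net = patternnet(20);                 %# one hidden layer with 20 neurons
net = train(net, trainImgs, targets); %# trainImgs: one flattened image per column,
                                      %# targets: one row per class ([isBall; isMoon])
scores = net(testImg(:));             %# per-class scores for a new image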
This has been discussed before. Have a look at this question. More info here and here.
I am still shiny new to XNA, so please forgive any stupid questions and statements in this post. (An added issue is that I am using Visual Studio 2010 with .NET 4.0, which also means very few examples exist out on the web - well, none that I could find easily.)
I have two 2D objects in a "game" that I am using to learn more about XNA. I need to figure out when these two objects intersect.
I noticed that the Texture2D objects has a property named "Bounds" which in turn has a method named "Intersects" which takes a Rectangle (the other Texture2D.Bounds) as an argument.
However, when you run the code, the objects always intersect, even if they are on opposite sides of the screen. When I stepped into the code, I noticed that mousing over each Texture2D's Bounds shows four parameters, and the X and Y coordinates always read "X = 0, Y = 0" for both objects (hence they always intersect).
The thing that confuses me is that the Bounds property is on the Texture rather than on the Position (or Vector2) of the objects. I eventually created a little helper method that takes in the objects and their positions and then calculates whether they intersect, but I'm sure there must be a better way.
Any suggestions or pointers would be much appreciated.
Gineer
The Bounds property was added to the Texture2D class to simplify working with Viewports. More here.
You shouldn't think of the texture as being the object itself, it's merely what holds the data that gets drawn to the screen, whether it's used for a Sprite or RenderTarget. The position of objects or sprites and how position/moving is handled is entirely up to you, so you have to track and handle this yourself. That includes the position of any bounds.
The 2D Rectangle Collision tutorial is a good start, as you've already found :)
I found the XNA Creator Club tutorials based on another post to stackoverflow by Ben S. The Collision Series 1: 2D Rectangle Collision tutorial explains it all.
It seems you have to create new rectangles, based on the original rectangles moved around in the game, every time you run the intersection method, so that they contain the updated X and Y coordinates.
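For example, a small helper along those lines (a sketch; the texture and position variables are placeholders you track yourself):
// build a fresh rectangle at the sprite's current position each frame
Rectangle BoundsAt(Texture2D texture, Vector2 position)
{
    return new Rectangle((int)position.X, (int)position.Y,
                         texture.Width, texture.Height);
}

// usage: true when the two sprites overlap
bool collides = BoundsAt(texture1, position1).Intersects(BoundsAt(texture2, position2));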
I am still not quite sure why the original rectangles' positions cannot just be kept up to date, but if this is the way it's supposed to work, that's good enough for me... for now. ;-)