I would like to apply some recursive mathematical operation to an arbitrary photo or image that yields an interesting animation. I mean the first frame is the original picture, and then the pixels are transformed in a way which leads to an interesting animation, for example a fractal animation, diffusion of particles, or something similar. What approaches should I follow, and what software should I use?
This is a very broad question, and it's hard to answer general "how do I do this" type questions. Stack Overflow is designed more for specific "I tried X, expected Y, but got Z instead" type questions. That being said, I'll try to answer in a general sense:
Step 1: Download Processing. You've tagged this with the processing tag, which is for questions about the Processing language. I can't tell whether that was on purpose, but it doesn't matter since Processing happens to be perfect for this task.
Step 2: Create a simple sketch that displays a single image, without any effects or animation. You need to break your problem down into smaller steps.
Step 3: Create a function that takes a PImage or a PGraphics and does the image manipulation for a single iteration of your algorithm. There are plenty of useful functions in the reference for manipulating the pixels in an image.
Step 4: Start with your initial image, and display that for X frames. Then pass that image into the function you created in step 3, and display the result of that for X frames. Then pass the updated image back into your function, and repeat that process as many times as you want.
You could also generate every single frame ahead of time. Completely up to you.
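To make the loop concrete, here is a minimal sketch of the iterate-and-save structure. It's written in Python with Pillow/NumPy purely for illustration (the same structure maps directly onto Processing's setup()/draw() and PImage), and the transform shown, a crude diffusion step, is just a placeholder for whatever recursive operation you choose.

```python
import numpy as np
from PIL import Image

def transform(pixels):
    # One iteration: average each pixel with its 4 neighbours (a crude diffusion step).
    out = pixels.astype(np.float32)
    out = (out
           + np.roll(pixels, 1, axis=0) + np.roll(pixels, -1, axis=0)
           + np.roll(pixels, 1, axis=1) + np.roll(pixels, -1, axis=1)) / 5.0
    return out.astype(np.uint8)

pixels = np.array(Image.open("input.jpg"))   # frame 0 is the original image
for frame in range(100):
    Image.fromarray(pixels).save(f"frame_{frame:04d}.png")
    pixels = transform(pixels)               # feed the result back into the next iteration
```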
There are a bunch of ways to do this, but those steps are the general outline. Please try something out, and break your problem down into smaller steps. Don't try to tackle your whole goal at one time. Instead, ask yourself: what's the smallest, simplest, easiest thing I know I have to do next?
If you get stuck, post an MCVE (note: not your whole sketch, just a small example that shows the problem you're stuck on) and a specific question, and we'll go from there. Good luck.
I'm thinking of making an online version of the game Sprouts, possibly using p5.js, the JavaScript web-browser graphics library.
You can read more about it, but basically two players draw lines with their mouse between points. The lines can be straight or curved in any way. One of the rules is that no two lines can cross.
I haven't started making the game yet, but while planning it ahead, everything seems relatively simple to accomplish except for one problem:
When a user draws a line, I need to figure out if this line intersects with another line. However, since these lines aren't linear or exactly mathematical in any way I'm used to, there doesn't seem to be a simple mathematical way I can check for intersection.
How would I check if two such lines, given that I know the location of every pixel on the lines, cross?
I apologize for having no code; I haven't started it yet. If you wish to include code in your answer, you can use pseudocode or any GUI programming language you might be familiar with. However, I would prefer a purely hypothetical answer, since everything is hypothetical at this stage.
Here are some ideas I have:
For each pixel on the line, I could check if any of the other lines has a pixel at the same position, in which case they overlap. This seems inefficient, so the point below is something else I came up with that is more efficient but less rigorous and reliable.
Before drawing a line, if you make sure all the lines are one color, then for every pixel on the new line you could check whether that pixel is already colored in the lines' color, using something like getPixel(). If so, abort drawing the line. This solution seems prone to many problems and a bit fragile.
These two solutions either trade efficiency for reliability or vice-versa. Are there any other solutions you know? Keep in mind this will run in a browser, so efficiency is important.
Keep in mind this will run in a browser, so efficiency is important.
You need to give yourself a better idea of what kind of "efficiency" is important to you and your users. Both approaches you outlined seem reasonable to me. I wouldn't assume that a solution is inefficient before you try it and measure its performance.
Taking a step back, in general you're going to need to store the lines in some kind of data structure. You said the lines are not mathematical, but you can break the lines down into individual line segments or points, which are mathematical. That could be an array of line segments, or a 2D array of boolean values, or a map of points, or a quadtree. There are many options. Then you need to check for collisions between those lines or points and the new lines or points added by the other player.
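To make that concrete, here is a sketch of the segment-based check: each drawn stroke is stored as a list of sampled points, and a new stroke is rejected if any of its segments properly crosses a segment of an existing stroke. It's plain Python only for illustration; the orientation test ports directly to JavaScript/p5.js.

```python
def orientation(p, q, r):
    # > 0: counter-clockwise turn, < 0: clockwise turn, 0: collinear
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])

def segments_cross(a, b, c, d):
    # True if segment a-b properly crosses segment c-d (shared endpoints don't count).
    d1 = orientation(c, d, a)
    d2 = orientation(c, d, b)
    d3 = orientation(a, b, c)
    d4 = orientation(a, b, d)
    return (d1 * d2 < 0) and (d3 * d4 < 0)

def crosses_existing(new_points, existing_lines):
    # new_points and each existing line are lists of (x, y) sample points along a stroke.
    for i in range(len(new_points) - 1):
        for line in existing_lines:
            for j in range(len(line) - 1):
                if segments_cross(new_points[i], new_points[i + 1], line[j], line[j + 1]):
                    return True
    return False
```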
Another option to consider is decreasing the resolution of your input space. For example, maybe your game window is 500x500 pixels, but you really only need the game board to be 100x100 possible point positions. You could scale that 100x100 game board up so it's displayed at 500x500. This would improve the "efficiency" of whatever solution you come up with.
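As a sketch of the reduced-resolution idea, a boolean grid backing the canvas could look like this; the 500x500 canvas and 100x100 board are just the made-up numbers from the example above.

```python
CANVAS = 500
CELLS = 100
CELL = CANVAS // CELLS   # 5 px of canvas per board cell

occupied = [[False] * CELLS for _ in range(CELLS)]

def cell_of(x, y):
    # Map a canvas pixel to its board cell, clamped to the grid.
    return min(x // CELL, CELLS - 1), min(y // CELL, CELLS - 1)

def point_blocked(x, y):
    cx, cy = cell_of(x, y)
    return occupied[cy][cx]

def mark_point(x, y):
    cx, cy = cell_of(x, y)
    occupied[cy][cx] = True
```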
But again, I wouldn't worry about "efficiency" at this point. Either solution you mentioned seems fine. Get that working and then iterate on it if you notice a problem. Good luck.
This article from Mike Bostock about the Sutherland-Hodgman algorithm may interest you. It is more about the intersection of two polygons than of two lines, but maybe it can be adapted to your problem.
By collision avoidance, I mean stopping the player from walking through something. Like in Mario, he can't just walk through the blocks. I technically succeeded in making this, but it's very, very bad: the player often gets stuck on the block once the player hits it, and I can't figure out how to fix it. I put all the code together in an online p5.js editor, Here
In the code I linked, I'm trying to get the player to not go through any of the terrain constructions I make; the one I currently have set up is a red square named 'block1'.
Stack Overflow isn't really designed for general "how do I do this" type questions. It's for specific "I tried X, expected Y, but got Z instead" type questions. But I'll try to help in a general sense.
You need to take a step back and break your problem down into smaller steps, and then take those steps one at a time. Can you get a simple program working that just shows two hard-coded rectangles that change color if they're colliding? Get that working perfectly before moving on.
Shameless self-promotion: I wrote a tutorial on collision detection available here. That's for regular Processing, but everything should be basically the same for P5.js. Generally you probably want to use grid-based collision detection to figure out which cell your player is in, and then use rectangle-rectangle collision detection to actually check whether the player has hit a block.
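For what it's worth, the rectangle-rectangle (AABB) check is only a few lines. Here it is sketched in Python purely for illustration, with made-up example coordinates; it translates one-to-one into p5.js.

```python
def rects_collide(x1, y1, w1, h1, x2, y2, w2, h2):
    # x/y are top-left corners, w/h are sizes. Overlap exists only if neither
    # rectangle is entirely to one side of (or above/below) the other.
    return (x1 < x2 + w2 and x1 + w1 > x2 and
            y1 < y2 + h2 and y1 + h1 > y2)

# Example: the player rectangle vs. a block rectangle (placeholder values).
player_x, player_y = 40, 60
block_x, block_y = 50, 50
if rects_collide(player_x, player_y, 20, 20, block_x, block_y, 50, 50):
    print("hit the block")  # resolve the collision, e.g. push the player back out
```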
If you're having trouble, please debug your code and try to narrow your problem down to a MCVE and ask a specific technical question. Good luck.
I'm trying to change an int variable using a hand gesture. I'm trying to turn on the webcam in Processing, and if you swing your hand to the left the value changes, and when you swing your hand to the right the value changes again. Any ideas on how to go about this?
This question is pretty vague for StackOverflow. It's hard to answer general "how do I do this" type questions. It's much easier to answer specific "I tried X, expected Y, but got Z instead" type questions. That being said, I'll try to answer in a general sense.
Break your problem down into smaller steps.
Step 1: Can you write a simple sketch that simply displays the stream from a camera? The video library might be able to help you with that.
Step 2: Once you have that working, can you write code that tracks your hand? Try just drawing a dot on top of your hand. You might use OpenCV for this step.
Step 3: If you have code that can track your hand, then you can keep track of its position. You'll know when the position moves to the left or right, and you can take whatever action you want.
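To sketch how the three steps fit together, here is a rough OpenCV/Python version that uses simple frame differencing as a stand-in for real hand tracking (Processing's video and OpenCV libraries expose the same building blocks). The motion threshold and swipe distance are arbitrary numbers you would have to tune.

```python
import cv2

cap = cv2.VideoCapture(0)
value = 0          # the int variable we want to change with swipes
prev_x = None      # x position of the motion centroid in the previous frame

ok, prev = cap.read()
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    diff = cv2.absdiff(gray, prev_gray)                  # where something moved
    _, mask = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)
    m = cv2.moments(mask, binaryImage=True)
    if m["m00"] > 500:                                   # enough moving pixels to care
        cx = int(m["m10"] / m["m00"])                    # x of the motion centroid
        if prev_x is not None:
            if cx - prev_x > 40:
                value += 1                               # swung to the right
            elif prev_x - cx > 40:
                value -= 1                               # swung to the left
        prev_x = cx
    prev_gray = gray
    cv2.imshow("camera", frame)
    if cv2.waitKey(1) & 0xFF == 27:                      # Esc to quit
        break

cap.release()
cv2.destroyAllWindows()
```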
You might have to break these steps down into even smaller sub-steps. If you get stuck, then you can post an MCVE and a specific technical question, and we can go from there. Good luck.
I want to remove background and get deer as a foreground image.
This is my source image captured by trail camera:
This is what I want to get. This output image can be a binary image or RGB.
I worked on it and tried many methods to get a solution, but every time it failed at a specific point. So please first understand what my exact problem is.
Images are captured by a trail camera, and the camera is a motion detector: when a deer comes in front of the camera, it captures an image.
The scene changes with respect to the weather, day and night, etc., so I can't use frame differencing or anything like that.
Segmentation may not work correctly because the foreground (deer) and background have the same color in many cases.
If anyone still finds any ambiguity in my question, please first ask me to clarify and then answer; it will be appreciated.
Thanks in advance.
Here's what I would do:
As was commented on your question, you can detect the deer and then perform GrabCut to segment it from the picture.
To detect the deer, I would couple a classifier with a sliding-window approach. That means you'll have a classifier that, given a patch of the image (it can be a large patch), outputs a score of how similar that patch is to a deer. The sliding-window approach means that you loop over the window size and then over the window location. For each position of the window in the image, you apply the classifier to that window and get a score of how much that window "looks like" a deer. Once you've done that, threshold all the scores to get the "best windows", i.e. the windows most similar to a deer. The rationale behind this is that if a deer is present at some location in the image, the classifier will output a high score for all windows that are close to or overlap the actual deer location. We would like to merge all those locations into a single location. That can be done by applying the function groupRectangles from OpenCV:
http://docs.opencv.org/modules/objdetect/doc/cascade_classification.html#grouprectangles
Take a look at a face detection example from OpenCV; it basically does the same thing (sliding window + classifier), where the classifier is a Haar cascade.
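For concreteness, here is a rough sketch of that sliding-window loop in Python/OpenCV. The looks_like_deer classifier is a hypothetical placeholder for whatever you end up training; the window sizes, stride and score threshold are made-up numbers, and duplicating the candidate list is just the usual trick to keep isolated detections alive through groupRectangles.

```python
import cv2

def detect(image, looks_like_deer, threshold=0.8):
    candidates = []
    for win in (64, 128, 256):                        # loop over window sizes
        for y in range(0, image.shape[0] - win, 16):  # ...and window positions
            for x in range(0, image.shape[1] - win, 16):
                patch = image[y:y + win, x:x + win]
                if looks_like_deer(patch) > threshold:   # hypothetical classifier
                    candidates.append([x, y, win, win])
    # Merge overlapping high-scoring windows into single detections.
    rects, weights = cv2.groupRectangles(candidates + candidates, 1, eps=0.5)
    return rects
```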
Now, I didn't mention what that "deer classifier" can be. You can use HOG+SVM (both are included in OpenCV) or use the much more powerful approach of running a deep convolutional neural network (deep CNN). Luckily, you don't need to train a deep CNN. You can use the following packages with their "off the shelf" ImageNet networks (which are very powerful and might even be able to identify a deer without further training):
Decaf, which can be used only for research purposes:
https://github.com/UCB-ICSI-Vision-Group/decaf-release/
Or Caffe, which is BSD licensed:
http://caffe.berkeleyvision.org/
There are other packages, which you can read about here:
http://deeplearning.net/software_links/
The most common ones are Theano, Cuda ConvNets and OverFeat (but that's really opinion-based; you should choose the best package from the list I linked to).
The "off the shelf" ImageNet network were trained on roughly 10M images from 1000 categories. If those categories contain "dear", that you can just use them as is. If not, you can use them to extract features (as a 4096 dimensional vector in the case of Decaf) and train a classifier on positive and negative images to build a "dear classifier".
Now, once you've detected the deer, meaning you have a bounding box around it, you can apply GrabCut:
http://docs.opencv.org/trunk/doc/py_tutorials/py_imgproc/py_grabcut/py_grabcut.html
You'll need an initial scribble on the deer to perform GrabCut. You can just take a horizontal line in the middle of the bounding box and hope that it lands on the deer's torso. A more elaborate approach would be to find the symmetry axis of the deer and use that as the scribble, but you would have to google, research, and implement some method for extracting a symmetry axis from the image.
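For reference, here is a minimal sketch of the rectangle-initialized GrabCut from the linked tutorial (you can then refine it with scribbles via the mask). The file name and rectangle are placeholders; the rectangle would come from your detector.

```python
import cv2
import numpy as np

img = cv2.imread("trail_cam.jpg")
mask = np.zeros(img.shape[:2], np.uint8)
bgd_model = np.zeros((1, 65), np.float64)
fgd_model = np.zeros((1, 65), np.float64)

rect = (50, 80, 300, 220)  # (x, y, w, h) around the deer, from the detection step
cv2.grabCut(img, mask, rect, bgd_model, fgd_model, 5, cv2.GC_INIT_WITH_RECT)

# Pixels marked definite or probable foreground form the deer mask.
fg_mask = np.where((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD), 255, 0).astype(np.uint8)
deer = cv2.bitwise_and(img, img, mask=fg_mask)
```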
That's about it. Not straightforward, but so is the problem.
Please let me know if you have any questions.
Try OpenCV background subtraction with Mixture of Gaussians models. They should be adaptable enough for your scenes. Of course, the final performance will depend on the scenario, but it is worth trying.
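A short sketch of what that looks like in Python, assuming you can feed the subtractor a sequence of shots from the same camera position (the file names are placeholders); with single, widely spaced trail-cam shots it may struggle.

```python
import cv2

subtractor = cv2.createBackgroundSubtractorMOG2(history=100, varThreshold=16,
                                                detectShadows=True)

for path in ["shot_001.jpg", "shot_002.jpg", "shot_003.jpg"]:  # placeholder file names
    frame = cv2.imread(path)
    fg_mask = subtractor.apply(frame)  # 255 = foreground, 127 = shadow, 0 = background
```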
Since you just want to separate the background from the foreground, I think you do not need to recognize the deer. You need to recognize an object in motion in the scene: you just need to separate what is static over a significant interval of time (the background) from what is not static (the deer).
There are algorithms that combine multiple frames from the same scene in order to determine the background, like THIS ONE.
You mentioned that the scene changes with respect to the weather or day and night, considering photos of different deer.
You could implement a solution where, when motion is detected, instead of taking a single photo, the camera takes a few of them with some interval of time between them.
This interval has to be long enough to get the deer in different positions or out of the scene, and at the same time short enough not to be much affected by scene variations. Perhaps you'd need to deal with some brightness variation, but I think it is feasible to determine the background using these frames and finally segment the deer in the "motion frame".
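As an illustration of that idea, a per-pixel median over the burst of frames is a simple way to estimate the background; the file names and the threshold below are placeholders.

```python
import cv2
import numpy as np

frames = [cv2.imread(f"burst_{i}.jpg") for i in range(5)]
background = np.median(np.stack(frames), axis=0).astype(np.uint8)

motion_frame = frames[0]                               # the frame with the deer in it
diff = cv2.absdiff(motion_frame, background)
gray = cv2.cvtColor(diff, cv2.COLOR_BGR2GRAY)
_, deer_mask = cv2.threshold(gray, 30, 255, cv2.THRESH_BINARY)
```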
I'm trying to find an efficient way of acceptable complexity to
detect an object in an image so I can isolate it from its surroundings
segment that object into its sub-parts and label them so I can then fetch them at will
It's been 3 weeks since I entered the image-processing world, and I've read about so many algorithms (SIFT, snakes, more snakes, Fourier-related, etc.) and heuristics that I don't know where to start and which one is "best" for what I'm trying to achieve. Having in mind that the image dataset of interest is a pretty large one, I don't even know if I should use some algorithm implemented in OpenCV or implement one myself.
To summarize:
Which methodology should I focus on? Why?
Should I use OpenCV for that kind of stuff or is there some other 'better' alternative?
Thank you in advance.
EDIT -- More info regarding the datasets
Each dataset consists of 80K images of products sharing the same:
- concept (e.g. t-shirts, watches, shoes)
- size
- orientation (90% of them)
- background (95% of them)
All pictures in each dataset look almost identical, apart from the product itself, apparently. To make things a little clearer, let's consider only the 'watch dataset':
All the pictures in the set look almost exactly like this:
(again, apart from the watch itself). I want to extract the strap and the dial. The thing is that there are lots of different watch styles and therefore shapes. From what I've read so far, I think I need a template algorithm that allows bending and stretching, so as to be able to match straps and dials of different styles.
Instead of creating three distinct templates (upper part of the strap, lower part of the strap, dial), it would be reasonable to create only one and segment it into 3 parts. That way, I would be confident enough that each part was detected with respect to the others as intended, e.g. the dial would not be detected below the lower part of the strap.
Of all the algorithms/methodologies I've encountered, active shape/appearance models seem to be the most promising. Unfortunately, I haven't managed to find a decent implementation, and I'm not confident enough that that's the best approach to go ahead and write one myself.
If anyone could point out what I should be really looking for (algorithm/heuristic/library/etc.), I would be more than grateful. If again you think my description was a bit vague, feel free to ask for a more detailed one.
From what you've said, here are a few things that pop up at first glance:
The simplest thing to do is binarize the image and do connected components using OpenCV or the CvBlob library. For simple images with a non-complex background this usually yields the objects.
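A rough sketch of that with plain OpenCV (no CvBlob needed); the threshold value is arbitrary and assumes a light, fairly uniform background like your sample image.

```python
import cv2

img = cv2.imread("watch.jpg", cv2.IMREAD_GRAYSCALE)
_, binary = cv2.threshold(img, 200, 255, cv2.THRESH_BINARY_INV)  # object -> white
n, labels, stats, centroids = cv2.connectedComponentsWithStats(binary)

# stats[i] = [x, y, width, height, area]; label 0 is the background.
largest = 1 + stats[1:, cv2.CC_STAT_AREA].argmax()
x, y, w, h, _ = stats[largest]
product = img[y:y + h, x:x + w]   # bounding-box crop of the largest component
```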
However, looking at your sample image, texture-based segmentation techniques may work better: the watch dial, the straps and the background vary widely in texture/roughness, and this could be an ideal way to separate them.
The roughness of a region can easily be found by the Eigen transform (explained a bit on SO; check the link to the research paper provided there), and then the Mean Shift filter can be applied to the output of the Eigen transform. This will give regions clearly separated according to texture. Both the pyramidal Mean Shift and finding eigenvalues by SVD are implemented in OpenCV, so unless you can optimize your own code it's better (and easier) to use the built-in functions (where present) as far as speed and efficiency are concerned.
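For reference, the pyramidal Mean Shift step is a one-liner in OpenCV; here it is applied directly to the color image rather than to an Eigen-transform map, and the spatial/color radii are just starting points to tune.

```python
import cv2

img = cv2.imread("watch.jpg")
# sp = spatial window radius, sr = color window radius
segmented = cv2.pyrMeanShiftFiltering(img, sp=21, sr=30)
```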
I think I would turn the problem around. Instead of hunting for the dial, I would use a set of robust features from the watch to 'stitch' the target image onto a template. The first watch has a set of white squares in the dial, the second watch has a number of white circles. Per type of watch, I would:
Segment out the squares or circles in the dial. Segmentation steps can be tricky as they are usually both scale and light dependent
Estimate the centers or corners of the above found feature areas. These are the new feature points.
Use the Hungarian algorithm to match features between the template watch and the target watch. Alternatively, one can take the surroundings of each feature point in the original image and match these using cross correlation
Use matching features between the template and the target to estimate scaling, rotation and translation
Stitch the image
As the image is now in a known form, one can extract the regions simply via preset coordinates
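Purely as a sketch of steps 3-6 (not a drop-in solution): match the detected feature points with the Hungarian algorithm, estimate a similarity transform from the matches, warp the target onto the template, and then crop by preset template coordinates. Matching on raw point coordinates, as done here, is crude and assumes the images are already roughly aligned; the dial rectangle at the end is a placeholder.

```python
import cv2
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

def align_to_template(target_img, template_pts, target_pts, template_size):
    # Hungarian matching on pairwise point distances (crude, for illustration only).
    cost = cdist(np.float32(template_pts), np.float32(target_pts))
    rows, cols = linear_sum_assignment(cost)
    src = np.float32([target_pts[c] for c in cols])
    dst = np.float32([template_pts[r] for r in rows])

    # Scale + rotation + translation estimated from the matched pairs.
    M, _ = cv2.estimateAffinePartial2D(src, dst)
    return cv2.warpAffine(target_img, M, template_size)  # template_size = (width, height)

# Once aligned, the regions live at known template coordinates, e.g.:
# dial = aligned[120:380, 100:360]
```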