I have implemented global illumination using the Monte Carlo method, following the Scratchapixel tutorial as a guide. My final image renders very noisy! The example below is at 64 samples; I have previously gone as high as 512 and it's still very noisy.
Any ideas what the problem could be?
Edit:
Here is the output with 128 samples and 16x supersampling, resulting in 2048 samples per pixel. Still lots of noise!
Path tracing is pretty noisy; it's the nature of the algorithm. Consider this example from Wikipedia:
The top left image is at 1 sample per pixel, and from there (left to right, top to bottom), each following square doubles that. So the bottom right square is at 32768 spp.
There are other, related algorithms that can reduce noise for the same amount of computation:
Bidirectional path tracing
Photon mapping
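To see why going from 64 to 512 samples helps so little: the noise of a plain Monte Carlo estimate falls only as 1/sqrt(N), so halving the noise costs four times the samples. Here is a minimal sketch of that behaviour. It is not a renderer; the "radiance" is just a uniform random number standing in for one path sample, and the names pixel_estimate and noise_level are made up for this illustration.

```python
import random
import statistics

def pixel_estimate(samples, rng):
    """Average `samples` stand-in radiance values (uniform in [0, 1))."""
    return sum(rng.random() for _ in range(samples)) / samples

def noise_level(samples, trials=400, seed=1):
    """Standard deviation of the estimate across many independent 'pixels'."""
    rng = random.Random(seed)
    estimates = [pixel_estimate(samples, rng) for _ in range(trials)]
    return statistics.pstdev(estimates)

if __name__ == "__main__":
    for n in (64, 256, 1024):
        print(n, "samples -> noise", round(noise_level(n), 4))
```

Each 4x increase in samples should roughly halve the printed noise, which matches the slow improvement visible in the Wikipedia grid.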
Without integrating OpenCV or calling a QR code recognition API, is there a quick and reliable algorithm to determine whether an image contains a QR code?
The goal is to improve the user experience of scanning QR codes. When recognition fails, the program needs to know whether a QR code is actually present, so that it can try scanning again, or whether there is none, so that it can call other procedures.
In response to some of the comments: the detector doesn't need to be 100% accurate; it only needs to return the right answer with reasonable probability. With OpenCV, a Fourier transform would make it easy to check whether an image has strong high-frequency content, which is a good sign that a QR code is present. But integrating OpenCV would greatly increase the size of my program, which I want to avoid.
It's great that you want to provide feedback to a user. Providing graphics that indicate the user is "getting warmer" in finding the QR code can make the process of finding and reading a code quicker and smoother.
It looks like you already have your answer, but to provide a more robust solution and/or have options, you might try one or more of the following:
Use N iterations of morphological closing on the dark pixels, so that the squarish checkerboard pattern more closely resembles a filled square. This was part of a detection method I used to determine whether a DataMatrix (a similar 2D code) was present, whether it was readable or not. Whether this works will depend greatly on your background.
Before applying the FFT, consider finding the affine transform to reduce perspective distortion. Analyzing FFT data can be a pain if the frequencies have a bit of spread because of foreshortening.
You could get some decent results using texture measures such as Local Binary Patterns (LBPs) or older techniques such as Laws' texture measures. You might even get lucky and be able to detect slight differences in the histogram of texture measures between a 2D code and a checkerboard pattern.
In regions of checkerboard-like patterns, look for the 3 guide features at the corners of the QR code. You could try SIFT/SURF-like methods, or perhaps implement a simpler match method by using a limited number of correlation templates that are tested in scale space.
Speaking of scale space: generate an image pyramid to save yourself the trouble of searching for squares in full-resolution images. You could try edge-preserving or non-edge-preserving methods to generate the smaller images in the pyramid, or perhaps a combination of both.
If you have code for fast kernel processing, you might try a corner detection method to reduce the amount of data you process to detect checkerboard-like patterns.
Look for clear bimodal distributions of grayscale values in squarish regions. 2D codes on paper labels tend to have stark contrast even though 2D codes on paper are quite readable at low contrast.
Rather than look for bimodal distribution of grayscale values, you could look for regions where gradient magnitudes are very consistent, nearly unimodal.
If you know the min/max area limits of a readable QR code, you could probabilistically sample the image for patches that match one or more of the above criteria: one mode of gradient magnitudes, nearly evenly spaced corner points, etc. If a patch does look promising, then jump to another random position with the caveat that the new patch was not previously found unpromising.
If you have the memory for an image pyramid, then working with reduced resolution images could be advantageous since you could try a number of tests fairly quickly.
As far as user interaction is concerned, you might also update the "this might be a QR code" graphic multiple times during pre-processing, and indicate degrees of confidence with progressively stronger/greener graphics (or whatever color is appropriate for the local culture). For example, if a patch of texture has a roughly 60% chance of being a QR code, you might display a thin yellowish-green rectangle with a dashed border. For an 80% - 90% likelihood you might display a solid rectangle of a more saturated green color. If you can update the graphics about every 100 - 200 milliseconds then a user will have some idea that some action such as moving the smart phone is helping or hurting.
1) convert the image into grayscale
2) divide the image into n x m cells, say 3 x 3. This is intended to guarantee that at least one cell is fully covered by the QR code, if there is one
3) apply a 2D Fourier transform to each cell. If any cell has a significantly large value in the high-frequency region along both the X and Y axes, there is a high likelihood that a QR code is present
I am addressing a probability issue rather than 100% accurate detection. With this algorithm, a chessboard will be detected as a QR code as well.
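The three steps above could be sketched roughly as follows, without any library. This assumes the image is already grayscale (a list of rows of 0-255 values) and uses a naive DFT, which is only reasonable because the cells are small. The function names, the "upper half of the frequency range on both axes" band, the 0.5 threshold and the min_contrast guard are all my assumptions and would need tuning on real photos, where QR modules span several pixels.

```python
import cmath

def dft_1d(values):
    """Naive discrete Fourier transform; fine for small cells."""
    n = len(values)
    return [sum(values[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def dft_2d(cell):
    rows = [dft_1d(row) for row in cell]           # transform along X
    cols = [dft_1d(col) for col in zip(*rows)]     # then along Y
    return [list(row) for row in zip(*cols)]       # back to [v][u] order

def high_freq_ratio(cell, min_contrast=16):
    """Share of non-DC spectral energy that is high-frequency on both axes."""
    h, w = len(cell), len(cell[0])
    flat = [v for row in cell for v in row]
    if max(flat) - min(flat) < min_contrast:
        return 0.0                                 # nearly flat cell: nothing to find
    spectrum = dft_2d(cell)
    total = high = 0.0
    for v in range(h):
        for u in range(w):
            if u == 0 and v == 0:
                continue                           # skip the DC term
            energy = abs(spectrum[v][u]) ** 2
            total += energy
            # min(u, w - u) folds the mirrored upper half of the spectrum back down
            if min(u, w - u) >= w // 4 and min(v, h - v) >= h // 4:
                high += energy
    return high / total if total > 0 else 0.0

def maybe_qr(gray, grid=3, threshold=0.5):
    """True if any grid cell is dominated by high frequencies in X and Y."""
    ch, cw = len(gray) // grid, len(gray[0]) // grid
    for gy in range(grid):
        for gx in range(grid):
            cell = [row[gx * cw:(gx + 1) * cw] for row in gray[gy * ch:(gy + 1) * ch]]
            if high_freq_ratio(cell) > threshold:
                return True
    return False
```

As noted above, this fires on chessboards and other fine regular textures too, so it is only a cheap "worth retrying" signal.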
I want to write my own focus stacking software but haven't been able to find a suitable explanation of any algorithm for extracting the in-focus portions of each image in the stack.
For those who are not familiar with focus stacking, this Wikipedia article does a nice job of explaining the idea.
Can anyone point me in the right direction for finding an algorithm? Even some key words to search would be helpful.
I realise this is over a year old but for anyone who is interested...
I have had a fair bit of experience in machine vision and this is how I would do it:
Load every image in memory
Perform a Gaussian blur on each image on one of the channels (maybe Green):
The simplest Gaussian kernel is:
1 2 1
2 4 2
1 2 1
The idea is to loop through every pixel and look at the pixels immediately adjacent. The pixel that you are looping through is multiplied by 4, and the neighboring pixels are multiplied by whatever value corresponds to the kernel above.
You can make a larger Gaussian kernel by using the equation:
exp(-(((x*x)/2/c/c+(y*y)/2/c/c)))
where c is the standard deviation of the Gaussian, which controls the strength of the blur
Apply a Laplacian edge detection kernel to each Gaussian-blurred image, but do not apply a threshold
The simplest Laplacian operator is:
-1 -1 -1
-1 8 -1
-1 -1 -1
Same deal as the Gaussian: slide the kernel over the entire image and generate a result.
An equation to work out larger kernels is here:
(-1/pi/c/c/c/c)*(1-(x*x+y*y)/2/c/c)*exp(-(x*x+y*y)/2/c/c)
Take the absolute value of the Laplacian of Gaussian result. This will quantify the strength of edges with respect to the size and strength of your kernel.
Now create a blank image, loop through each pixel and find the strongest edge in the LoG (i.e. the highest value in the image stack) and take the RGB value for that pixel from the corresponding image.
Here is an example in MATLAB that I have created:
http://www.magiclantern.fm/forum/index.php?topic=11886.0
You are free to use it for whatever you like. It will create a file called Outsharp.bmp which is what you are after.
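For readers without MATLAB, the steps described above can also be sketched in plain Python. This is not a port of the linked script, just a minimal illustration under some assumptions: each image is a single-channel list of rows of the same size (for RGB you would measure sharpness on one channel and copy the whole RGB value), only the two 3x3 kernels are used, and borders are handled by clamping. The names convolve3, sharpness and focus_stack are mine.

```python
GAUSS = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]
LAPLACE = [[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]]

def convolve3(img, kernel, scale=1.0):
    """Slide a 3x3 kernel over the image, clamping coordinates at the borders."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ky in range(3):
                for kx in range(3):
                    yy = min(max(y + ky - 1, 0), h - 1)
                    xx = min(max(x + kx - 1, 0), w - 1)
                    acc += kernel[ky][kx] * img[yy][xx]
            out[y][x] = acc * scale
    return out

def sharpness(gray):
    """Absolute Laplacian of the Gaussian-blurred image, with no threshold."""
    blurred = convolve3(gray, GAUSS, 1 / 16)
    return [[abs(v) for v in row] for row in convolve3(blurred, LAPLACE)]

def focus_stack(images):
    """Per pixel, copy the value from the image with the strongest edge response."""
    maps = [sharpness(img) for img in images]
    h, w = len(images[0]), len(images[0][0])
    result = [[0] * w for _ in range(h)]
    index = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            best = max(range(len(images)), key=lambda i: maps[i][y][x])
            index[y][x] = best
            result[y][x] = images[best][y][x]
    return result, index
```

The returned index map is the "focus map": it records which image each pixel came from, which is handy when smoothing or hand-correcting the merge.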
To better your output image you could:
- Compensate for differences in lightness levels between images (i.e. histogram matching or simple level adjustment)
- Create a custom algorithm to reject image noise
- Manually adjust the stack after you have generated it
- Apply a Gaussian blur (be sure to divide the result by 16) on the focus map so that the individual images are better merged
Good luck!
I found this very interesting code for a total variation filter, tvmfilter.
The additional functions this code uses are very confusing, but the denoising is far better than all the filters I have tried so far.
I have figured out the code on my own :)
His additional function "tv" denoises with the ROF model which has been a major research topic for two decades now. See http://www.ipol.im/pub/algo/g_tv_denoising/ for a summary of current methods.
Briefly, the idea behind ROF is to approximate the given noisy image with a piecewise constant image by solving an optimization problem which penalizes the total variation (i.e. the l1-norm of the gradient) of the image.
The reason this performs well is that the other denoising methods you are probably working with denoise by smoothing the image via convolution with a Gaussian. That amounts to penalizing the squared l2-norm of the gradient, i.e. solving the heat equation on the image. While fast to compute, denoising by smoothing blurs edges and thus results in poor image quality. l1-norm optimization preserves edges.
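To make the idea concrete, here is a minimal 1D sketch. It is not the tvmfilter code and not Split Bregman; it is plain gradient descent on a smoothed version of the ROF energy, 0.5 * sum((u - f)^2) + lam * sum(sqrt((u[i+1] - u[i])^2 + eps)), where the eps term is my assumption to make the l1 penalty differentiable. The parameter values are illustrative only.

```python
import math

def smoothed_tv(u, eps=1e-3):
    """Smoothed total variation: sum of sqrt(difference^2 + eps)."""
    return sum(math.sqrt((u[i + 1] - u[i]) ** 2 + eps) for i in range(len(u) - 1))

def rof_energy(u, f, lam, eps=1e-3):
    fidelity = 0.5 * sum((a - b) ** 2 for a, b in zip(u, f))
    return fidelity + lam * smoothed_tv(u, eps)

def tv_denoise_1d(f, lam=0.2, eps=1e-3, iterations=2000):
    """Gradient descent on the smoothed ROF energy, starting from the noisy signal."""
    u = list(f)
    # Step size 1/L, where L bounds the Lipschitz constant of the gradient.
    step = 1.0 / (1.0 + 4.0 * lam / math.sqrt(eps))
    for _ in range(iterations):
        grad = [a - b for a, b in zip(u, f)]       # fidelity term
        for i in range(len(u) - 1):
            d = u[i + 1] - u[i]
            g = lam * d / math.sqrt(d * d + eps)   # derivative of the TV term
            grad[i] -= g
            grad[i + 1] += g
        u = [a - step * g for a, g in zip(u, grad)]
    return u

if __name__ == "__main__":
    noisy_step = [(0.0 if i < 10 else 1.0) + 0.1 * (-1) ** i for i in range(20)]
    print([round(v, 2) for v in tv_denoise_1d(noisy_step)])
```

On the noisy step in the demo, the small wiggles get flattened while the jump between the two levels stays sharp, which is the edge-preserving behaviour a Gaussian blur cannot give you. The slow convergence mentioned for the original ROF method shows up here as the large iteration count.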
It's not clear how Guy solves the tv problem in that code you linked. He references the original ROF paper so it's possible that he's just using the original method (gradient descent) which is quite slow to converge. I suggest you give this code/paper a try: http://www.stanford.edu/~tagoldst/Tom_Goldstein/Split_Bregman.html as it's probably faster than the .m file you are using.
Also, as was mentioned in the comments, you will get better denoising (ie higher SNR) using nonlocal means. However, it will take much longer for the nonlocal means algorithm to work as it requires that you search the entire image for similar patches and compute weights based on them.
Fractals have always been a bit of a mystery for me.
What practical uses (beyond rendering to beautiful images) are there for fractals in the various programming problem domains? And please, don't just list areas that use them. I'm interested in specific algorithms and how fractals are used with those algorithms to solve something in practice. Please at least give a short description of the algorithm.
Absolutely computer graphics. It's not about generating beautiful abstract images, but realistic, non-repeating landscapes. Read about Fractal Landscapes.
Perlin Noise, which might be considered a simple fractal, is used everywhere in computer graphics. The author joked that if he had patented it, he'd be a millionaire by now. Fractals are also used in animation and lossy image compression.
A Peano curve is a space-filling fractal, which allows you to cover a 2-D area (or higher-dimensional region) uniformly with a 1-D path. If you are doing local operations on a multidimensional array, storing and/or accessing the array data in space-filling curve order can increase your cache coherence, for all levels of cache.
Fractal image compression. There are some more applications, though not all in programming, here.
Error diffusion along a Hilbert curve.
It's a simple idea - suppose that you convert an image to a 0-1 black & white bitmap. Converting a 55% brightness pixel to white yields a +45% error. Instead of just forgetting it, you keep the 45% to take into account when processing the next pixel. Suppose its value is 80%. Normally it would be converted to white, but a neighboring pixel is too bright, so taking the +45% error into account, you convert it to black (80%-45%=35%), keeping a -35% error to be spread into next pixels.
This way a 75% gray area will have white/black pixel ratio close to 75/25, which is good. But if you process the pixels left-to-right, the error only spreads in one direction, which yields worse looking images. Enter space-filling curves. Processing the pixels along a Hilbert curve gets good locality of the error spread. More here, with pictures.
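A minimal sketch of the above, assuming a square image whose side n is a power of two and brightness values in [0, 1]. The index-to-coordinate conversion is the standard iterative Hilbert curve mapping; the function names are mine. The error uses the same sign convention as the example: output minus the adjusted input.

```python
def hilbert_d2xy(n, d):
    """Map position d along the Hilbert curve to (x, y) on an n x n grid."""
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:                 # rotate the quadrant if needed
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

def dither_along_hilbert(gray, n):
    """1-bit dithering, carrying the quantization error along the curve."""
    out = [[0] * n for _ in range(n)]
    error = 0.0
    for d in range(n * n):
        x, y = hilbert_d2xy(n, d)
        adjusted = gray[y][x] - error       # e.g. 0.80 - 0.45 = 0.35
        out[y][x] = 1 if adjusted >= 0.5 else 0
        error = out[y][x] - adjusted        # +0.45 for 0.55 -> white
    return out

if __name__ == "__main__":
    flat = [[0.75] * 8 for _ in range(8)]
    for row in dither_along_hilbert(flat, 8):
        print("".join("#" if v else "." for v in row))
```

Because consecutive curve positions are always neighbouring pixels, the carried error stays local in both directions instead of smearing along a scanline.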
Fractals are used in finance for analyzing stock prices. They are also used in the study of complex systems (complexity theory) and in art.
One can use computer science algorithms to compute the fractal dimension, or Hausdorff dimension, of black-and-white images.
It is not that difficult to implement.
It turns out that this is used in biology and medicine to analyze cell samples, for example to assess how aggressive a cancer cell is or how far a disease has progressed. A cell is in general more healthy the higher the dimension is, meaning you wish for low fractal dimension for cancer samples.
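One common way to estimate this in practice is box counting, which gives the box-counting dimension (an estimate that is often used in place of the Hausdorff dimension, not the same thing in general). A minimal sketch, assuming a binary image stored as a list of rows with truthy foreground pixels and at least one foreground pixel; the box sizes and function names are my choices.

```python
import math

def box_count(image, size):
    """Number of size x size boxes containing at least one foreground pixel."""
    boxes = set()
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if value:
                boxes.add((y // size, x // size))
    return len(boxes)

def box_counting_dimension(image, sizes=(1, 2, 4, 8, 16)):
    """Least-squares slope of log(count) against log(1 / box size)."""
    xs = [math.log(1.0 / s) for s in sizes]
    ys = [math.log(box_count(image, s)) for s in sizes]  # needs a non-empty image
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den
```

A filled square comes out at dimension 2 and a straight line at 1; ragged outlines such as cell boundaries land somewhere in between.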
Another use of fractal theory is fractal image interpolation. For example, Perfect Resize 7 uses fractals to resize images with very good quality. They are, most likely, using partitioned iterated function systems (PIFS), which assume that different parts of an image are self-similar to each other. The algorithm is based on searching for self-similar parts of an image and describing the transformations between them.
They are used in image compression; in mobile phones, where the antenna design is a fractal for maximum surface area; in texture generation and mountain generation; and in understanding trees, cliffs, jellyfish, and in emulating any natural phenomenon where there is a degree of recursion and self-similarity at different scales. There are also a lot of scientific applications.
How do I segment a 2D image into blobs of similar values efficiently? The input is an array of integers, holding the hue for non-gray pixels and the brightness for gray pixels.
I am writing a virtual mobile robot in Java, and I am using segmentation to analyze the map and also the image from the camera. This is a well-known problem in computer vision, but on a robot performance matters, so I wanted some input. The algorithm is what matters, so you can post code in any language.
Wikipedia article: Segmentation (image processing)
[PPT] Stanford CS-223-B Lecture 11 Segmentation and Grouping (which says Mean Shift is perhaps the best technique to date)
Mean Shift Pictures (paper is also available from Dorin Comaniciu)
I would downsample, both in colour space and in number of pixels, use a vision method (probably mean shift), and upscale the result.
This is good because downsampling also increases the robustness to noise, and makes it more likely that you get meaningful segments.
You could use floodfill to smooth edges afterwards if you need smoothness.
Some more thoughts (in response to your comment).
1) Did you blend as you downsampled? y[i] = (x[2i] + x[2i+1]) / 2. This should eliminate noise.
2) How fast do you want it to be?
3) Have you tried dynamic mean shift? (Also google for "dynamic x" for all algorithms x.)
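The blending in point 1 extends to 2D by averaging each 2x2 block. A minimal sketch in Python (the function name is mine); note that it assumes the values can be averaged linearly, which holds for brightness but not for hue, since hue wraps around, so hue values near the wrap point would need a circular mean instead.

```python
def downsample_2x(img):
    """Halve each dimension by averaging 2x2 blocks (odd trailing row/column dropped)."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2 * y][2 * x] + img[2 * y][2 * x + 1]
              + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4
             for x in range(w)]
            for y in range(h)]
```

Applying it repeatedly gives an image pyramid, and each level has a quarter of the pixels for the segmentation step to visit.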
Not sure how efficient it is, but you could try using a Kohonen neural network (or self-organizing map; SOM) to group the similar values, where each pixel contains the original color and position and only the color is used for the Kohonen grouping.
You should read up before you implement this though, as my knowledge of the Kohonen network goes as far as that it is used for grouping data - so I don't know what the performance/viability options are for your scenario.
There are also Hopfield Networks. They can be mangled into grouping from what I read.
What I have now:
Make a buffer of the same size as the input image, initialized to UNSEGMENTED.
For each pixel in the image where the corresponding buffer value is still UNSEGMENTED, flood the buffer using the pixel value.
a. The border checking of the flooding is done by checking if pixel is within EPSILON (currently set to 10) of the originating pixel's value.
b. Flood filling algorithm.
Possible issue:
The border check in 2.a is called many times in the flood filling algorithm. I could turn it into a lookup if I could precalculate the border using edge detection, but that may add more time than the current check.
// True when two pixel values are within EPSILON of each other.
private boolean isValuesCloseEnough(int a_lhs, int a_rhs) {
    return Math.abs(a_lhs - a_rhs) <= EPSILON;
}
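For comparison, here is a minimal sketch of the whole scheme in Python rather than Java. It uses an explicit queue instead of recursion so that large blobs cannot overflow the call stack, 4-connectivity, and the same EPSILON test against the originating pixel's value. The function name segment and the return shape are assumptions made for this illustration.

```python
from collections import deque

UNSEGMENTED = -1
EPSILON = 10

def segment(image):
    """Label each pixel with a blob id; returns (labels, number_of_blobs)."""
    h, w = len(image), len(image[0])
    labels = [[UNSEGMENTED] * w for _ in range(h)]
    next_label = 0
    for sy in range(h):
        for sx in range(w):
            if labels[sy][sx] != UNSEGMENTED:
                continue                      # already part of a blob
            seed = image[sy][sx]              # originating pixel's value
            labels[sy][sx] = next_label
            queue = deque([(sy, sx)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and labels[ny][nx] == UNSEGMENTED
                            and abs(image[ny][nx] - seed) <= EPSILON):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
            next_label += 1
    return labels, next_label
```

Every pixel is labelled exactly once and each border check is a single subtraction, so the total work is linear in the number of pixels.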
Possible Enhancement:
Instead of checking every single pixel for UNSEGMENTED, I could randomly pick a few points. If you are expecting around 10 blobs, picking random points in that order may suffice. Drawback is that you might miss a useful but small blob.
Check out Eyepatch (eyepatch.stanford.edu). It should help you during the investigation phase by providing a variety of possible filters for segmentation.
An alternative to flood-fill is the connected-components algorithm. So:
Cheaply classify your pixels, e.g. divide pixels in colour space.
Run connected components to find the blobs
Retain the blobs of significant size
This approach is widely used in early vision approaches. For example in the seminal paper "Blobworld: A System for Region-Based Image Indexing and Retrieval".
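A minimal sketch of those three steps, using a union-find structure for the connected-components pass. The classification here is just integer bucketing of the pixel value, and the bucket width and minimum blob size are placeholder values, not taken from the Blobworld paper.

```python
def find(parent, i):
    """Union-find root lookup with path halving."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i

def find_blobs(image, bucket=32, min_size=2):
    """Return blobs (lists of (y, x)) of same-class, 4-connected pixels."""
    h, w = len(image), len(image[0])
    cls = [[v // bucket for v in row] for row in image]   # 1. cheap classification
    parent = list(range(h * w))
    for y in range(h):                                    # 2. connected components
        for x in range(w):
            i = y * w + x
            if x > 0 and cls[y][x] == cls[y][x - 1]:
                parent[find(parent, i)] = find(parent, i - 1)
            if y > 0 and cls[y][x] == cls[y - 1][x]:
                parent[find(parent, i)] = find(parent, i - w)
    groups = {}
    for i in range(h * w):
        groups.setdefault(find(parent, i), []).append((i // w, i % w))
    return [pixels for pixels in groups.values() if len(pixels) >= min_size]  # 3. keep big blobs
```

Unlike the epsilon flood fill, the result does not depend on which pixel happens to be visited first, because class membership is decided per pixel up front.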