I want to store the intensity of each pixel of an image in an n*n matrix. I am currently storing it in a vector, but for extremely large dimensions the program crashes because it runs out of memory. How do I solve this problem?
If your RAM is too small to hold all the information, you will need to use other means of storage, such as swapping to your hard disk. What kind of information is the intensity? A floating-point number? How many pixels do your large images have? I suspect that your storage class simply creates too much overhead. Which language are you using? Can you supply some code snippets?
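If the data genuinely cannot fit in RAM, one common approach is a memory-mapped file, sketched here in Python with numpy's memmap (the filename and dimension are illustrative): the matrix lives on disk and the OS pages in only the parts you touch.

import numpy as np

n = 50000  # illustrative dimension
# Disk-backed n x n float32 matrix; the OS pages data in and out on
# demand, so resident memory stays bounded even for huge n.
intensity = np.memmap("intensity.dat", dtype=np.float32, mode="w+", shape=(n, n))
intensity[1234, 5678] = 0.5  # random access works like a normal array
intensity.flush()            # write dirty pages back to disk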
This is more like something that I would like to discuss with the community rather than something for which I am seeking an absolute answer.
I am trying to implement the GMM-based background subtraction algorithm from scratch. OpenCV already has a good implementation (MOG2), but I am still implementing it myself because I would like to experiment with some parameters that OpenCV does not expose. However, my implementation was very slow on 4k images and took a huge amount of memory, while OpenCV achieves about 5-10 frames per second or faster and does not take much memory. I am NOT surprised that OpenCV is much faster than mine, but I am curious how that is achieved.
So here are my thoughts:
The GMM approach builds a mixture of Gaussians to describe the background/foreground for each pixel. That being said, each pixel will have 3-5 associated 3-dimensional Gaussian components. We can simplify the computation by using a shared variance across channels instead of a full covariance matrix. Then we need at least 3 means, 1 variance, and 1 weight parameter per Gaussian component. If we assume each pixel maintains 3 components, that is roughly 4000*2000*3*(3+1+1) parameters per image.
Although updating the GMM is not very complex for a single pixel, doing it for all 4000*2000 pixels should still be very expensive.
I don't think the OpenCV MOG2 is accelerated by CUDA, since I tested it on my Mac, which has no dedicated graphics card, and it was still fast.
So my question is:
Does OpenCV compress the image before feeding it into the model and decompress the results on return?
Is it possible to achieve near real-time processing for 4k images (without image compression) with parallelization on CPU?
My implementation used 4000*2000 doubly linked lists to maintain the Gaussian components for the 4k images. I expected this to save some memory, but memory usage still exploded when I tested it on a 4k image.
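For comparison, here is a minimal sketch (assuming NumPy; the shapes and dtypes are my own choices, not OpenCV's actual internals) of a flat structure-of-arrays layout that avoids per-pixel linked-list overhead:

import numpy as np

H, W, K = 2160, 3840, 3  # 4k frame, 3 Gaussians per pixel (assumed)

# One contiguous float32 block per parameter instead of 4000*2000
# per-pixel linked lists of component objects.
means  = np.zeros((H, W, K, 3), dtype=np.float32)       # 3 channel means
var    = np.ones((H, W, K), dtype=np.float32)           # shared variance
weight = np.full((H, W, K), 1.0 / K, dtype=np.float32)  # mixture weights

H*W*K*(3+1+1) float32 values is about 500 MB total, with no pointer or allocator overhead per component, and the update step can then be vectorized over all pixels at once.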
Plus:
I did test the OpenCV MOG2 on a resized image (from (3840, 2160) down to (384, 216)), and the detection seems acceptable.
This might be a weird question... But I would appreciate any opinions on it.
I'm looking for a way to store a huge image in a way that allows random access to small regions without the need to load the whole image into memory. The focus is on speed and on a low memory footprint.
Let's say the image is 50k*50k 32-bit pixels, amounting to ~9 GB (currently it is stored as a single TIFF file). One thing I found that looks quite promising is HDF5: it allows me to store and randomly access (integer) arrays of arbitrary size and dimension. I could put my image's pixels into such a database.
What (other) (simpler?) methods for efficient random pixel access are out there?
I would really like to store the image as a whole and avoid creating tiles, for several reasons: the requested image region dimensions are not constant or predictable; I want to load into memory only the pixels I really need; and I would like to avoid the processing effort involved in tiling the image.
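For what it's worth, a minimal sketch of the HDF5 idea with h5py (the chunk size is a guess to be tuned to your access pattern):

import h5py

# One-time conversion: store the 50k x 50k image as a chunked dataset.
with h5py.File("image.h5", "w") as f:
    dset = f.create_dataset("pixels", shape=(50000, 50000),
                            dtype="uint32", chunks=(256, 256))
    # ... fill dset in row bands from the source TIFF ...

# Later: read an arbitrary region without loading the whole image;
# only the chunks overlapping the slice are read from disk.
with h5py.File("image.h5", "r") as f:
    region = f["pixels"][10000:10500, 22000:22300]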
Does anyone have recommendations for data structures for relatively large maps with high resolution, something like 400 mile x 400 mile with 10-15 ft resolution? Using a 2D array, that would be roughly 200K x 200K cells.
The map only needs to store the elevation and terrain (earth, water, rock, etc.), and I don't think storing tiles is a good strategy.
Thank you!
It depends on what you need to do with it: view it, store it, analyze it, etc...
One thing I can say, however, is that the file will be HUGE at your stated resolution, and you should consider splitting it up into at least a few tiles, ideally 1x1 mile tiles.
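To put rough numbers on it (assuming 10 ft cells, a 16-bit elevation, and a one-byte terrain code):

cells_per_side = 400 * 5280 // 10      # 400 miles at 10 ft resolution
cells = cells_per_side ** 2            # ~4.5e10 cells
bytes_per_cell = 2 + 1                 # int16 elevation + uint8 terrain code
print(cells * bytes_per_cell / 2**30)  # ~125 GiB uncompressed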
The list of raster formats supported by GDAL could serve as a good starting point for exploring various formats, keeping in mind that many software packages (GRASS, ArcGIS, etc.) use GDAL to read and write most raster formats. Note also that some file formats have maximum sizes, which may prevent you from using them for your very large file.
For analysis and non-viewable storage, HDF5 format might be of interest.
If you want people to see the data as a map over the web, then creating small image tile overlays will be the fastest approach to sharing such a large dataset.
Sometimes two image files may be different on a file level, but a human would consider them perceptively identical. Given that, now suppose you have a huge database of images, and you wish to know if a human would think some image X is present in the database or not. If all images had a perceptive hash / fingerprint, then one could hash image X and it would be a simple matter to see if it is in the database or not.
I know there is research around this issue, and some algorithms exist, but is there any tool, like a UNIX command line tool or a library I could use to compute such a hash without implementing some algorithm from scratch?
edit: relevant code from findimagedupes, using ImageMagick
use Image::Magick;

my $image = Image::Magick->new;
$image->Read($file);

$image->Sample("160x160!");             # normalize size (ignore aspect ratio)
$image->Modulate(saturation => -100);   # discard colour
$image->Blur(radius => 3, sigma => 99); # heavy blur to suppress fine detail
$image->Normalize();
$image->Equalize();
$image->Sample("16x16");                # shrink to fingerprint size
$image->Threshold();                    # binarize
$image->Set(magick => 'mono');
my ($blob) = $image->ImageToBlob();     # 16x16x1 bit = 32-byte fingerprint
edit: Warning! The ImageMagick $image object seems to contain information about the creation time of the image file that was read in. This means the blob you get will be different even for the same image if it is retrieved at a different time. To make sure the fingerprint stays the same, use $image->getImageSignature() as the last step.
findimagedupes is pretty good. You can run "findimagedupes -v fingerprint images" to have it print the perceptive hash of each image, for example.
Cross-correlation or phase correlation will tell you if the images are the same, even with noise, degradation, and horizontal or vertical offsets. Using the FFT-based methods will make it much faster than the algorithm described in the question.
The usual algorithm doesn't work for images that are not the same scale or rotation, though. You could pre-rotate or pre-scale them, but that's really processor intensive. Apparently you can also do the correlation in a log-polar space and it will be invariant to rotation, translation, and scale, but I don't know the details well enough to explain that.
MATLAB example: Registering an Image Using Normalized Cross-Correlation
Wikipedia calls this "phase correlation" and also describes making it scale- and rotation-invariant:
The method can be extended to determine rotation and scaling differences between two images by first converting the images to log-polar coordinates. Due to properties of the Fourier transform, the rotation and scaling parameters can be determined in a manner invariant to translation.
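For reference, a bare-bones sketch of phase correlation in NumPy (no windowing or sub-pixel peak refinement, which production code would want); it recovers the (row, col) shift between two same-sized grayscale arrays:

import numpy as np

def phase_correlation_shift(a, b):
    # a, b: equally sized 2-D grayscale arrays
    F = np.fft.fft2(a)
    G = np.fft.fft2(b)
    R = F * np.conj(G)
    R /= np.abs(R) + 1e-12       # keep phase only
    corr = np.fft.ifft2(R).real  # sharp peak at the translation
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Peaks past the midpoint wrap around to negative shifts.
    return tuple(p if p <= s // 2 else p - s for p, s in zip(peak, corr.shape))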
A colour histogram is good for detecting the same image after it has been resized, resampled, etc.
If you want to match different people's photos of the same landmark, it's trickier - look at Haar classifiers. OpenCV is a great free library for image processing.
I don't know the algorithm behind it, but Microsoft Live Image Search just added this capability. Picasa also has the ability to identify faces in images, and groups faces that look similar. Most of the time, it's the same person.
Some machine learning technology like a support vector machine, neural network, naive Bayes classifier or Bayesian network would be best at this type of problem. I've written one each of the first three to classify handwritten digits, which is essentially image pattern recognition.
Resize the image to a 1x1 pixel... if they match exactly, there is a small probability they are the same picture.
Now resize it to a 2x2 pixel image; if all 4 pixels match, there is a larger probability they are the same.
Then 3x3; if all 9 pixels match... good chance, etc.
Then 4x4; if all 16 pixels match... better chance.
Etc...
Doing it this way, you can make efficiency improvements... if the 1x1 pixel grid is off by a lot, why bother checking the 2x2 grid?
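Here is a rough sketch of that idea in Python with Pillow (the grid sizes and the tolerance are arbitrary choices):

from PIL import Image

def likely_same(path_a, path_b, tolerance=8):
    a = Image.open(path_a).convert("RGB")
    b = Image.open(path_b).convert("RGB")
    for side in (1, 2, 3, 4, 8, 16):  # coarse to fine grids
        pa = a.resize((side, side)).getdata()
        pb = b.resize((side, side)).getdata()
        for pix_a, pix_b in zip(pa, pb):
            if any(abs(ca - cb) > tolerance for ca, cb in zip(pix_a, pix_b)):
                return False          # differs at a coarse level: bail out early
    return True                       # survived every level: likely the same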
If you have lots of images, a colour histogram could be used to get a rough measure of closeness before doing a full image comparison of each image against each other (i.e. O(n^2)).
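A sketch of such a pre-filter with OpenCV's histogram functions (the bin count and whatever similarity threshold you apply are up to you):

import cv2

def histogram_similarity(path_a, path_b, bins=8):
    hists = []
    for path in (path_a, path_b):
        img = cv2.imread(path)
        h = cv2.calcHist([img], [0, 1, 2], None, [bins] * 3, [0, 256] * 3)
        cv2.normalize(h, h)
        hists.append(h)
    # Correlation in [-1, 1]; values near 1 mean "roughly the same image".
    return cv2.compareHist(hists[0], hists[1], cv2.HISTCMP_CORREL)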
There is DPEG, "The" Duplicate Media Manager, but its code is not open. It's a very old tool - I remember using it in 2003.
You could use diff to see if they are REALLY different; I guess that would eliminate a lot of useless comparisons. Then, for the algorithm, I would use a probabilistic approach: what are the chances that they look the same? I'd base that on the amount of RGB in each pixel. You could also use other metrics, such as luminosity.
How do I segment a 2D image into blobs of similar values efficiently? The given input is an array of integers, which encodes hue for non-gray pixels and brightness for gray pixels.
I am writing a virtual mobile robot in Java, and I am using segmentation to analyze both the map and the image from the camera. This is a well-known problem in computer vision, but performance matters on a robot, so I wanted some input. The algorithm is what matters, so you can post code in any language.
Wikipedia article: Segmentation (image processing)
[PPT] Stanford CS-223-B Lecture 11 Segmentation and Grouping (which says Mean Shift is perhaps the best technique to date)
Mean Shift Pictures (paper is also available from Dorin Comaniciu)
I would downsample, both in colour space and in number of pixels, use a vision method (probably mean shift), and upscale the result.
This is good because downsampling also increases the robustness to noise, and makes it more likely that you get meaningful segments.
You could use floodfill to smooth edges afterwards if you need smoothness.
Some more thoughts (in response to your comment).
1) Did you blend as you downsampled? y[i] = (x[2i] + x[2i+1]) / 2 should reduce noise.
2) How fast do you want it to be?
3) Have you tried dynamic mean shift? (Also google "dynamic x" for any algorithm x.)
Not sure how efficient it would be, but you could try using a Kohonen neural network (or self-organizing map, SOM) to group the similar values, where each pixel contains the original color and position and only the color is used for the Kohonen grouping.
You should read up on it before implementing this, though, as my knowledge of Kohonen networks goes only as far as knowing that they are used for grouping data - so I don't know how performant or viable they are for your scenario.
There are also Hopfield Networks. They can be mangled into grouping from what I read.
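Going back to the Kohonen suggestion, here is a tiny from-scratch 1-D SOM over pixel colours only (every parameter is illustrative, and real use would need tuning):

import numpy as np

def som_color_groups(pixels, k=8, iters=2000, lr=0.25, seed=0):
    # pixels: (N, 3) integer array of colours
    rng = np.random.default_rng(seed)
    nodes = pixels[rng.integers(len(pixels), size=k)].astype(float)
    for t in range(iters):
        p = pixels[rng.integers(len(pixels))].astype(float)
        bmu = np.argmin(((nodes - p) ** 2).sum(axis=1))  # best matching unit
        for j in range(k):
            influence = np.exp(-abs(j - bmu))            # 1-D neighbourhood
            nodes[j] += lr * (1 - t / iters) * influence * (p - nodes[j])
    # Assign every pixel to its nearest node -> colour group ids
    d = ((pixels[:, None, :] - nodes[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)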
What I have now:
1. Make a buffer of the same size as the input image, initialized to UNSEGMENTED.
2. For each pixel in the image whose corresponding buffer value is still UNSEGMENTED, flood the buffer using that pixel's value.
a. The border check of the flooding tests whether a pixel is within EPSILON (currently set to 10) of the originating pixel's value.
b. The flooding itself is a standard flood-fill algorithm.
Possible issue:
The border check in 2.a. is called many times by the flood-fill algorithm. I could turn it into a lookup if I precalculated the borders using edge detection, but that might cost more time than the current check.
private static final int EPSILON = 10;

// True when two pixel values are close enough to belong to the same blob.
private boolean isValuesCloseEnough(int a_lhs, int a_rhs) {
    return Math.abs(a_lhs - a_rhs) <= EPSILON;
}
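Putting the pieces above together, a compact Python sketch of the same loop (an explicit queue instead of recursion, with EPSILON as in the Java check):

from collections import deque
import numpy as np

UNSEGMENTED = -1
EPSILON = 10

def segment(img):
    # img: 2-D integer array; returns a same-shaped array of blob labels.
    h, w = img.shape
    labels = np.full((h, w), UNSEGMENTED, dtype=np.int32)
    next_label = 0
    for sy in range(h):
        for sx in range(w):
            if labels[sy, sx] != UNSEGMENTED:
                continue
            seed = int(img[sy, sx])  # originating pixel's value
            labels[sy, sx] = next_label
            queue = deque([(sy, sx)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and labels[ny, nx] == UNSEGMENTED
                            and abs(int(img[ny, nx]) - seed) <= EPSILON):
                        labels[ny, nx] = next_label
                        queue.append((ny, nx))
            next_label += 1
    return labels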
Possible Enhancement:
Instead of checking every single pixel for UNSEGMENTED, I could randomly pick a few seed points. If I am expecting around 10 blobs, picking a number of random points on that order may suffice. The drawback is that I might miss a useful but small blob.
Check out Eyepatch (eyepatch.stanford.edu). It should help you during the investigation phase by providing a variety of possible filters for segmentation.
An alternative to flood fill is the connected-components algorithm. So:
Cheaply classify your pixels, e.g. by quantizing the colour space.
Run connected components to find the blobs.
Retain the blobs of significant size.
This approach is widely used in early vision, for example in the seminal paper "Blobworld: A System for Region-Based Image Indexing and Retrieval".
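A sketch of those three steps with SciPy's labelling (the bin width and size threshold are placeholders):

import numpy as np
from scipy import ndimage

def blobs_by_value(img, bins=16, min_size=50):
    # img: 2-D integer array of hue/brightness values as in the question.
    quantized = img // (256 // bins)                 # 1. cheap classification
    blobs = []
    for v in np.unique(quantized):
        labeled, n = ndimage.label(quantized == v)   # 2. connected components
        for i in range(1, n + 1):
            mask = labeled == i
            if mask.sum() >= min_size:               # 3. keep significant blobs
                blobs.append(mask)
    return blobs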