Are there any algorithms or tools that can increase the resolution of an image - besides just a simple zoom that makes each individual pixel in the image a little larger?
I realize that such an algorithm would have to invent pixels that don't really exist in the original image, but I figured there might be some algorithm that could intelligently figure out what pixels to add to the image to increase its resolution.
Interpolation: Image Scaling
For actual algorithms check out image interpolation.
The simple answer to your question is, "Yes, there are algorithms, but none of them are very good." As you mentioned in the question, the limiting factor is the need to invent pixels in order to increase resolution beyond a small amount. (That's why you can't really read a license plate number from the reflection in someone's glasses in a photo taken by a CCTV security camera, like they do in CSI: Miami.)
If all you want to do is create a larger image (for a wall hanging, or such like) then you can use a plugin for Photoshop that will smooth transitions between pixels using existing information. It can't create new pixels, but it can get rid of that boxy, pixelated look.
Addendum to the previous answers: Please note that the answer to your question depends heavily on what exactly you mean by resolution - of the display device, of the capture device, or of the viewing device (i.e., the human eye.) I assume you're talking about raster images (the problem wouldn't exist for vector images.)
You must accept that a picture taken at a higher resolution will contain more image information (i.e. details) than a picture of the same scene taken at a lower resolution. There is no way to add this information out of thin air. Scaling algorithms synthesize some information based on the assumption of continuity between the discrete raster image elements. That "new" information is not actually new but derived from the pre-existing picture information, hence it cannot be considered to have a 100% probability of matching the original scene. Better algorithms might yield better probabilities, but their results will always have a match probability of less than 1.
Enlarging images is risky. Beyond a certain point, enlarging images is a fool's errand; you can't magically synthesize an infinite number of new pixels out of thin air. And interpolated pixels are never as good as real pixels. That's why it's more than a little artificial to upsize the 512x512 Lena image by 500%. It'd be smarter to find a higher resolution scan or picture of whatever you need than it would be to upsize it in software.
From Jeff Atwood
One way to increase resolution is to take multiple exposures, upsize them to 4x areal (2x linear both ways) and use stacking software to merge the images. The final image will be better than any of the originals.
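A minimal sketch of that workflow, assuming the exposures are already aligned and OpenCV is available (file names are placeholders; real stacking software also registers the frames first):

import cv2
import numpy as np

# Hypothetical list of aligned exposures of the same scene.
frames = ["shot1.jpg", "shot2.jpg", "shot3.jpg", "shot4.jpg"]

acc = None
for name in frames:
    img = cv2.imread(name)
    # Upsize to 2x linear (4x areal) before stacking.
    big = cv2.resize(img, None, fx=2, fy=2, interpolation=cv2.INTER_LANCZOS4)
    acc = big.astype(np.float64) if acc is None else acc + big

# Average the stack; noise averages out while real detail reinforces itself.
result = (acc / len(frames)).clip(0, 255).astype(np.uint8)
cv2.imwrite("stacked.jpg", result)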
You can try vectorizing the image with tools like autotrace or potrace and then use it at whatever resolution you like. But it is computationally very costly, so you end up with an image with few colours/features - and even fewer if you need to process the whole image quickly.
Super-resolution algorithms might help in some cases.
I don't know everything that's involved (software/hardware and the initial images necessary), but if you're interested, here are some links:
http://almalence.com/doc/superresolution-comparison/
(It seems Almalence's PhotoAcute fares the best of the ones tested in this article - $30 or $150.) They are at photoacute.com.
Markov Random Fields for SR – a free software package (MIT & Microsoft project)
http://people.csail.mit.edu/billf/project%20pages/sresCode/Markov%20Random%20Fields%20for%20Super-Resolution.html
Most decent image editors have smoothing/interpolating filters to do this kind of resizing/resampling, e.g. IrfanView, which gives you several options for interpolation filters. See Lanczos resampling. ImageMagick's convert program also allows you to do this after specifying a filter.
If you need to do this algorithmically, check out the Image Scaling link suggested by Draemon. What platform will you be doing these interpolations on? Most graphics libraries will have a variety of approaches implemented, allowing you to balance speed against quality.
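As a rough illustration of that trade-off, here is a small OpenCV sketch (the input file name is a placeholder) that upsizes one image with several standard interpolation filters so you can compare them side by side:

import cv2

img = cv2.imread("input.jpg")

# From fastest/blockiest to slowest/smoothest.
filters = {
    "nearest": cv2.INTER_NEAREST,   # the "big pixels" zoom
    "bilinear": cv2.INTER_LINEAR,
    "bicubic": cv2.INTER_CUBIC,
    "lanczos": cv2.INTER_LANCZOS4,  # similar in spirit to Lanczos resampling
}
for name, flag in filters.items():
    up = cv2.resize(img, None, fx=4, fy=4, interpolation=flag)
    cv2.imwrite(f"up_{name}.jpg", up)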
If you just need to resize some images, I recommend GIMP. It can resize images in a variety of ways, at least one of which should produce excellent results in any situation.
As others are pointing out, you can't expect a scaling method to invent information that isn't in the original image. So you can't expect it to be like the moments in CSI where they "zoom and enhance" to see the number on a license plate that was hopelessly blurred in the original image.
I'm a novice at ML but I'm trying to create a model to detect a few objects in my custom photos. Before training my model, I'd like to know if and how I should modify my images to improve its accuracy.
I don't have access to the photos at the moment, however, I can provide an example of the characteristics of the images I'll be working with:
There's a white piece of paper (so white background), and on it are a bunch of insects.
There are a few different kinds of insects, and they look distinct from each other (different colors, shapes, sizes, etc.).
The camera is pretty zoomed out, so each insect is probably ~ 40x40 pixels (so it's not really high definition).
I don't know much about machine learning, but I'd assume that because the insects will be captured in low quality, the model will mainly end up relying on the general shape and color to distinguish/identify the insects (e.g. long or circular spot on photo, etc.).
Therefore, I was wondering if I should do anything to the photos to achieve higher accuracy (before I train it). For example, if I increase the contrast in my photos, would the insects' borders be more defined and thus make it easier for the model to detect/identify them? Or should I convert the images to grayscale or stick with RGB? Are there any other factors that should be considered? Any help will be greatly appreciated!
Edit: I'm not sure why someone voted to close this as opinion-based; I'm not asking for an opinion. I'm trying to understand more about the image-detection process by learning what constitutes a "good" photo versus a "bad" one. Even though this sounds like it's opinion-based, it's not. For example, I'm sure having extremely low-light photos would be terrible for training models. This wouldn't be an opinion, but an evidence-based fact.
Similarly, I'd like to learn what kinds of general characteristics make "better" photos, such as if I should use high contrast, brightness, etc. I think this is an answerable question that is not opinion-based.
You can employ standard preprocessing strategies such as the following (a minimal sketch is shown after the list):
Normalization of the RGB values
Horizontal/Vertical flipping
Affine transformation
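A minimal sketch of those three steps with OpenCV and NumPy (the file name and transform parameters are arbitrary examples, not anything specific to your data):

import cv2
import numpy as np

img = cv2.imread("insects.jpg")  # hypothetical input photo

# 1. Normalize RGB values to the [0, 1] range.
img_norm = img.astype(np.float32) / 255.0

# 2. Horizontal and vertical flips (cheap augmentation).
flip_h = cv2.flip(img, 1)
flip_v = cv2.flip(img, 0)

# 3. A small affine transformation (rotate 10 degrees about the centre, keep scale).
h, w = img.shape[:2]
M = cv2.getRotationMatrix2D((w / 2, h / 2), 10, 1.0)
img_affine = cv2.warpAffine(img, M, (w, h))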
P.S. This is more of a comment than an answer (I can't post comments).
I'm really curious about this because nowadays every channel may modify or compress images in some way, which can be considered an attack on steganography.
We can divide image steganography into two basic types: the first operates in the spatial domain of the image and the second in some kind of transform domain.
The following types of attacks are of interest to me because they are everywhere around us (saving an image on Facebook, creating a thumbnail, saving the image on a mobile platform, etc.):
Compression or recompression of the image - mainly JPEG images, or PNG images with alpha premultiplication.
Resizing or scaling of the image and geometric manipulation - i.e., transformations other than compression, such as rotating the image, changing its scale, etc.
I would like to ask:
What is, in your opinion, the best way to protect an embedded message in an image from compression such as JPEG? What about "infinite" recompression of the image after embedding the message with a steganographic mechanism - would the message still be readable?
Where is the threshold for embedded messages with respect to resizing of images, if there is one? In my opinion, steganography is much more sensitive to resizing of the image than to compression, rotation, or added noise. What do you consider the best way to make steganography resistant to resizing? There is always a point beyond which we cannot go without losing the message, but there should be some threshold.
What about combining the image manipulations from the first and second points?
I have been reading many papers on compression-resistant image steganography, and basically they always use error-correcting codes and Hamming distance to find the threshold of what we can hide without losing information (or how to recover the information over a lossy channel). The first step is to hide our message redundantly in the spatial domain using Hamming distance. With an RGB image we choose, for example, one triple as a one-bit carrier and modify that triple of colours in such a way that the Hamming distance sits "in the centre" of the edges. We could do this with a repetition error-correcting code or any other (best practice is Hamming codes, as in F5).
The idea behind this is that our error-correcting code, with Hamming distances computed on JPEG-compressed images, would ensure that the embedded information is still there after many applications of JPEG compression. Of course, all of this comes at the expense of image capacity, since we are using redundancy through error-correcting codes.
Example link on that method is here:
http://www.cs.unibo.it/babaoglu/courses/security/resources/documents/Steganography.pdf
I don't know much about watermarking techniques for digital images, but we could probably find guidance on this topic there, because the aim of watermarking is almost the same as that of steganography: we are trying to retain copyright information in digital images, or to protect a hidden message in an image in various situations like the above.
I would like to discuss today's mechanisms for protecting information in digital images through steganography. We can share our ideas or sample code to make the world better.
Your first question pertains to lossy methods removing the "noise" (which is, of course, the hidden bits) in your image. You may have to scatter the message with redundancy. Plain LSB embedding may not work well, as the positions of the bits have to be distributed. That means each bit may have to be repeated at various parts of the image, so that you can recover the message even when some of the copies are corrupted. You may also want to add a hash to verify that the message is not corrupted (though the hash itself may be). Redundancy and wider distribution give the bits a good chance to survive; a minimal sketch of that idea follows.
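This toy sketch (not a robust scheme; all names and constants are illustrative) repeats each message bit at several key-seeded pixel positions in one channel's LSB and recovers it by majority vote. Note that plain LSBs will not survive JPEG recompression - for that you would embed redundantly in DCT coefficients instead - but it shows the repetition-plus-scattering idea:

import numpy as np

REPEAT = 9     # how many copies of each bit to embed
SEED = 1234    # shared secret used to scatter the positions

def embed(img, bits):
    """img: HxWx3 uint8 array, bits: list of 0/1. Returns a new image."""
    out = img.copy()
    rng = np.random.default_rng(SEED)
    h, w = out.shape[:2]
    positions = rng.choice(h * w, size=len(bits) * REPEAT, replace=False)
    rows, cols = np.unravel_index(positions, (h, w))
    for i, bit in enumerate(bits):
        for j in range(i * REPEAT, (i + 1) * REPEAT):
            # Overwrite the LSB of channel 0 at a scattered position.
            out[rows[j], cols[j], 0] = (out[rows[j], cols[j], 0] & 0xFE) | bit
    return out

def extract(img, n_bits):
    rng = np.random.default_rng(SEED)
    h, w = img.shape[:2]
    positions = rng.choice(h * w, size=n_bits * REPEAT, replace=False)
    rows, cols = np.unravel_index(positions, (h, w))
    bits = []
    for i in range(n_bits):
        copies = img[rows[i * REPEAT:(i + 1) * REPEAT],
                     cols[i * REPEAT:(i + 1) * REPEAT], 0] & 1
        bits.append(int(copies.sum() > REPEAT // 2))  # majority vote
    return bits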
An idea may be to use proven cryptographic methods like AES or ECC (key management would be another topic). This will make your data bits noise-like. The position indices may also be determined in a similar way. The principle is to create uniform distributions to deter predictability or pattern correlation, for both the data and the locations of the bits.
I hope this gives some guidance for your steganographic design considerations.
I have a web site that allows users to upload images of cars and I would like to put a privacy filter in place to detect registration plates on the vehicle and blur them.
The blurring is not a problem, but is there a library or component (open source preferred) that will help with finding a licence plate within a photo?
Caveats:
I know nothing is perfect, and image recognition of this type will produce false positives and negatives.
I appreciate that we could ask the user to select the area to blur, and we will do this as well, but the question is specifically about finding that data programmatically; so answers such as 'get a person to check every image' are not helpful.
This software method is called 'Automatic Number Plate Recognition' in the UK but I cannot see any implementations of it as libraries.
Any language is great although .Net is preferred.
EDIT: I wrote a Python script for this.
As your objective is blurring (for privacy protection), you basically need a high recall detector as a first step. Here's how to go about doing this. The included code hints use OpenCV with Python.
Convert to Grayscale.
Apply Gaussian Blur.
import cv2

# Read the image, convert to grayscale and smooth out sensor noise.
img = cv2.imread('input.jpg', 1)
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
img_gray = cv2.GaussianBlur(img_gray, (5, 5), 0)
Apply Sobel Filter to detect vertical edges.
Threshold the resultant image using strict threshold or OTSU's binarization.
# Vertical edges respond strongly on licence-plate characters.
img_sobel = cv2.Sobel(img_gray, -1, 1, 0)
_, img_thresh = cv2.threshold(img_sobel, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
Apply a Morphological Closing operation using suitable structuring element. (I used 16x4 as structuring element)
# Close the gaps between characters so the plate becomes one connected blob.
se = cv2.getStructuringElement(cv2.MORPH_RECT, (16, 4))
img_close = cv2.morphologyEx(img_thresh, cv2.MORPH_CLOSE, se)
Find external contours of this image.
# (In OpenCV 3.x, findContours returns three values instead of two.)
contours, _ = cv2.findContours(img_close, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
For each contour, find the minAreaRect() bounding it.
Select rectangles based on aspect ratio, minimum and maximum area, and angle with the horizontal. (I used 2.2 <= Aspect Ratio <= 8, 500 <= Area <=15000, and angle <= 45 degrees)
There may be false positives after this step; to filter them out, use edge density. Edge density is defined as the number of white pixels divided by the total number of pixels in a rectangle. Set a threshold for edge density. (I used 0.5.)
Blur the detected regions.
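A rough sketch of these last steps (minAreaRect candidates, filtering by aspect ratio, area, angle and edge density, then blurring), continuing from the variables in the snippets above; the numeric thresholds are the ones quoted and will need tuning for your images:

result = img.copy()
for cnt in contours:
    (cx, cy), (rw, rh), angle = cv2.minAreaRect(cnt)
    if rw == 0 or rh == 0:
        continue
    aspect = max(rw, rh) / min(rw, rh)
    area = rw * rh
    # Note: minAreaRect's angle convention varies between OpenCV versions,
    # so treat the angle check as approximate.
    if not (2.2 <= aspect <= 8 and 500 <= area <= 15000 and abs(angle) <= 45):
        continue
    x, y, w, h = cv2.boundingRect(cnt)
    # Edge density: white pixels / total pixels inside the candidate box.
    density = cv2.countNonZero(img_thresh[y:y + h, x:x + w]) / float(w * h)
    if density < 0.5:
        continue
    # Blur the accepted candidate region in the output image.
    result[y:y + h, x:x + w] = cv2.GaussianBlur(result[y:y + h, x:x + w], (23, 23), 0)

cv2.imwrite('blurred.jpg', result)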
You can apply other filters you deem suitable to increase recall and precision. The detection can also be trained using HOG+SVM to increase precision.
I coded a C# version based on JavaANPR, but I replaced the AWT library functions with OpenCV.
You can check it at http://anprmx.codeplex.com
There is a new, open source library on GitHub that does ANPR for US and European plates. It looks pretty accurate and it should do exactly what you need (recognize the plate regions). Here is the GitHub project:
https://github.com/openalpr/openalpr
I came across this one, written in Java: javaANPR. I am looking for a C# library as well.
I would like a system where I can point a video camera at some sailing boats, all of which have large, identifiable numbers on them, and have it identify the boats and send a tweet when they sail past a video camera.
I have done some googling about this a couple of months ago. There are quite a few papers about this topic, but I never found any concrete open-source implementation. There are a lot of commercial implementations though, but none of them with a price quote, so they're probably pretty expensive.
try this Simple Automatic Number Plate Recognition System
http://opos.codeplex.com/
Open source and written with C#
Have a look at Java ANPR. Free license plate recognition...
Yes - I use GOCR at http://jocr.sourceforge.net/. It's a command-line application which you can execute from your own application. I use it in a couple of my applications.
High performance ANPR Library - http://www.dtksoft.com/dtkanpr.php. This is commercial, but they provide trial key.
http://licenseplate.sourceforge.net Python (I have not tested it)
It may be worth looking at character recognition (OCR) software, as there are many libraries out there that perform the same kind of thing: reading an image and extracting the characters stored in it. Microsoft Office is able to read TIFF files and return alphanumerics.
Sometimes two image files may be different on a file level, but a human would consider them perceptively identical. Given that, now suppose you have a huge database of images, and you wish to know if a human would think some image X is present in the database or not. If all images had a perceptive hash / fingerprint, then one could hash image X and it would be a simple matter to see if it is in the database or not.
I know there is research around this issue, and some algorithms exist, but is there any tool, like a UNIX command line tool or a library I could use to compute such a hash without implementing some algorithm from scratch?
edit: relevant code from findimagedupes, using ImageMagick
try $image->Sample("160x160!");          # shrink to a fixed size, ignoring aspect ratio
try $image->Modulate(saturation=>-100);  # drop the saturation (greyscale)
try $image->Blur(radius=>3,sigma=>99);   # heavy blur to suppress fine detail and noise
try $image->Normalize();                 # stretch the contrast
try $image->Equalize();                  # equalize the histogram
try $image->Sample("16x16");             # shrink down to the fingerprint size
try $image->Threshold();                 # binarize
try $image->Set(magick=>'mono');         # 1-bit output format
($blob) = $image->ImageToBlob();         # the resulting blob is the fingerprint
edit: Warning! ImageMagick $image object seems to contain information about the creation time of an image file that was read in. This means that the blob you get will be different even for the same image, if it was retrieved at a different time. To make sure the fingerprint stays the same, use $image->getImageSignature() as the last step.
findimagedupes is pretty good. You can run "findimagedupes -v fingerprint images" to let it print "perceptive hash", for example.
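If you would rather stay in Python than shell out to findimagedupes, the third-party imagehash package (together with Pillow) exposes a similar perceptual hash. A minimal sketch, assuming those packages are installed and with placeholder file names:

from PIL import Image
import imagehash  # pip install ImageHash

h1 = imagehash.phash(Image.open("a.jpg"))
h2 = imagehash.phash(Image.open("b.jpg"))

# Subtracting two hashes gives the Hamming distance between them;
# a small distance (e.g. <= 5) usually means perceptually near-identical images.
if h1 - h2 <= 5:
    print("probably the same picture")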
Cross-correlation or phase correlation will tell you if the images are the same, even with noise, degradation, and horizontal or vertical offsets. Using the FFT-based methods will make it much faster than the algorithm described in the question.
The usual algorithm doesn't work for images that are not the same scale or rotation, though. You could pre-rotate or pre-scale them, but that's really processor intensive. Apparently you can also do the correlation in a log-polar space and it will be invariant to rotation, translation, and scale, but I don't know the details well enough to explain that.
MATLAB example: Registering an Image Using Normalized Cross-Correlation
Wikipedia calls this "phase correlation" and also describes making it scale- and rotation-invariant:
The method can be extended to determine rotation and scaling differences between two images by first converting the images to log-polar coordinates. Due to properties of the Fourier transform, the rotation and scaling parameters can be determined in a manner invariant to translation.
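A bare-bones phase-correlation sketch in NumPy (grayscale arrays of equal size assumed; it returns the translation offset and a peak value you can threshold as a similarity score):

import numpy as np

def phase_correlate(a, b):
    """a, b: 2-D float arrays of the same shape (grayscale images)."""
    A = np.fft.fft2(a)
    B = np.fft.fft2(b)
    R = A * np.conj(B)
    R /= np.abs(R) + 1e-12            # normalized cross-power spectrum
    corr = np.fft.ifft2(R).real
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Peak position gives the (dy, dx) offset; a peak height near 1 means a strong match.
    return peak, corr[peak]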
A colour histogram is good for recognising the same image after it has been resized, resampled, etc.
If you want to match different people's photos of the same landmark it's trickier - look at Haar classifiers. OpenCV is a great free library for image processing.
I don't know the algorithm behind it, but Microsoft Live Image Search just added this capability. Picasa also has the ability to identify faces in images, and groups faces that look similar. Most of the time, it's the same person.
Some machine learning technology like a support vector machine, neural network, naive Bayes classifier or Bayesian network would be best at this type of problem. I've written one each of the first three to classify handwritten digits, which is essentially image pattern recognition.
Resize the image to a 1x1 pixel... if they are exact, there is a small probability they are the same picture.
Now resize it to a 2x2 pixel image; if all 4 pixels are exact, there is a larger probability they are the same.
Then 3x3; if all 9 pixels are exact... good chance, etc.
Then 4x4; if all 16 pixels are exact... better chance.
Etc...
Doing it this way, you can make efficiency improvements: if the 1x1 pixel grid is off by a lot, why bother checking the 2x2 grid? Etc.
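A quick sketch of that coarse-to-fine idea with Pillow (the per-channel tolerance and the maximum grid size are arbitrary choices, and the function/file names are made up for the example):

from PIL import Image

def probably_same(path_a, path_b, max_size=8, tolerance=8):
    a = Image.open(path_a).convert("RGB")
    b = Image.open(path_b).convert("RGB")
    for n in range(1, max_size + 1):
        pa = list(a.resize((n, n)).getdata())
        pb = list(b.resize((n, n)).getdata())
        # Compare every pixel at this size; bail out on the first big mismatch.
        for (r1, g1, b1), (r2, g2, b2) in zip(pa, pb):
            if abs(r1 - r2) + abs(g1 - g2) + abs(b1 - b2) > tolerance * 3:
                return False
    return True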
If you have lots of images, a color histogram could be used to get rough closeness of images before doing a full image comparison of each image against each other one (i.e. O(n^2)).
There is DPEG, "The" Duplicate Media Manager, but its code is not open. It's a very old tool - I remember using it in 2003.
You could use diff to see if the files are REALLY different; I guess it would eliminate a lot of useless comparisons. Then, for the algorithm, I would use a probabilistic approach: what are the chances that they look the same? I'd base that on the amount of RGB in each pixel. You could also look at other metrics such as luminosity and the like.
I would like to compare a screenshot of one application (could be a Web page) with a previously taken screenshot to determine whether the application is displaying itself correctly. I don't want an exact-match comparison, because the aspect could be slightly different (in the case of a Web app, depending on the browser, some element could be at a slightly different location). It should give a measure of how similar the screenshots are.
Is there a library / tool that already does that? How would you implement it?
This depends entirely on how smart you want the algorithm to be.
For instance, here are some issues:
cropped images vs. an uncropped image
images with text added vs. others without
mirrored images
The easiest and simplest algorithm I've seen for this is just to do the following steps to each image:
scale to something small, like 64x64 or 32x32, disregard aspect ratio, use a combining scaling algorithm instead of nearest pixel
scale the color ranges so that the darkest is black and lightest is white
rotate and flip the image so that the lightest colour is top left, then top-right is the next darker, bottom-left the next darker (as far as possible, of course)
Edit: A combining scaling algorithm is one that, when scaling 10 pixels down to one, does so using a function that takes the colours of all those 10 pixels and combines them into one. This can be done with algorithms such as averaging, mean value, or more complex ones like bicubic splines.
Then calculate the mean distance pixel-by-pixel between the two images.
To look up a possible match in a database, store the pixel colors as individual columns in the database, index a bunch of them (but not all, unless you use a very small image), and do a query that uses a range for each indexed pixel value, i.e. every image where the pixel value in the small image is within -5 to +5 of the pixel value in the image you want to look up.
This is easy to implement, and fairly fast to run, but of course won't handle most advanced differences. For that you need much more advanced algorithms.
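A stripped-down sketch of the first two steps plus the mean pixel distance (skipping the rotate/flip canonicalisation; the 32x32 size and the helper names are arbitrary choices for the example):

from PIL import Image
import numpy as np

def tiny(path, size=32):
    img = Image.open(path).convert("L").resize((size, size), Image.BILINEAR)
    px = np.asarray(img, dtype=np.float64)
    lo, hi = px.min(), px.max()
    # Stretch so the darkest pixel becomes 0 and the lightest 255.
    return (px - lo) * 255.0 / (hi - lo + 1e-9)

def mean_distance(path_a, path_b):
    return float(np.abs(tiny(path_a) - tiny(path_b)).mean())

# Values close to 0 mean very similar images; tune a cut-off empirically.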
The 'classic' way of measuring this is to break the image up into some canonical number of sections (say a 10x10 grid), compute a histogram of RGB values inside each cell, and compare the corresponding histograms. This type of algorithm is preferred because of both its simplicity and its invariance to scaling and (small!) translation.
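A compact sketch of that grid-of-histograms approach in NumPy (10x10 grid, 8 bins per channel, a plain L1 distance between the concatenated histograms; the function names are just examples):

import numpy as np
from PIL import Image

def grid_histograms(path, grid=10, bins=8):
    img = np.asarray(Image.open(path).convert("RGB"))
    h, w, _ = img.shape
    feats = []
    for gy in range(grid):
        for gx in range(grid):
            cell = img[gy * h // grid:(gy + 1) * h // grid,
                       gx * w // grid:(gx + 1) * w // grid]
            for c in range(3):  # one histogram per RGB channel per cell
                hist, _ = np.histogram(cell[:, :, c], bins=bins, range=(0, 256))
                feats.append(hist / max(cell[:, :, c].size, 1))
    return np.concatenate(feats)

def grid_distance(path_a, path_b):
    return float(np.abs(grid_histograms(path_a) - grid_histograms(path_b)).sum())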
Use a normalised colour histogram. (Read the section on applications here), they are commonly used in image retrieval/matching systems and are a standard way of matching images that is very reliable, relatively fast and very easy to implement.
Essentially a colour histogram will capture the colour distribution of the image. This can then be compared with another image to see if the colour distributions match.
This type of matching is pretty resilient to scaling (once the histogram is normalised) and to rotation/shifting/movement, etc.
Avoid pixel-by-pixel comparisons as if the image is rotated/shifted slightly it may lead to a large difference being reported.
Histograms would be straightforward to generate yourself (assuming you can get access to pixel values), but if you don't feel like it, the OpenCV library is a great resource for doing this kind of stuff. Here is a powerpoint presentation that shows you how to create a histogram using OpenCV.
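For reference, the same idea with OpenCV's built-in histogram functions: a normalised 3-D colour histogram compared with the correlation metric, where a score near 1.0 means very similar colour distributions (file names are placeholders):

import cv2

def colour_hist(path, bins=8):
    img = cv2.imread(path)
    hist = cv2.calcHist([img], [0, 1, 2], None, [bins] * 3,
                        [0, 256, 0, 256, 0, 256])
    return cv2.normalize(hist, hist).flatten()

score = cv2.compareHist(colour_hist("a.jpg"), colour_hist("b.jpg"),
                        cv2.HISTCMP_CORREL)
print(score)   # close to 1.0 -> very similar colour distributions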
Don't video encoding algorithms like MPEG compute the difference between each frame of a video so they can just encode the delta? You might look into how video encoding algorithms compute those frame differences.
Look at this open source image search application: http://www.semanticmetadata.net/lire/. It describes several image similarity algorithms, three of which are from the MPEG-7 standard (ScalableColor, ColorLayout, EdgeHistogram), along with Auto Color Correlogram.
You could use a pure mathematical approach of O(n^2), but it will be useful only if you are certain that there's no offset or anything like that. (Although if you have a few objects with homogeneous coloring, it will still work pretty well.)
Anyway, the idea is to compute the normalized dot product of the two matrices:
C = sum(Pij*Qij)^2 / (sum(Pij^2) * sum(Qij^2))
This quantity is the squared cosine of the angle between the two matrices viewed as vectors.
If the images are very similar (say Pij = Qij), C will be 1; if they are completely different - say the non-zero pixels of P and Q never overlap - then sum(Pij*Qij) is 0 and C drops to 0.
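In NumPy that formula is only a few lines (grayscale arrays of equal shape assumed):

import numpy as np

def cosine_similarity_sq(p, q):
    """p, q: 2-D arrays of the same shape; returns C in [0, 1]."""
    p = p.astype(np.float64).ravel()
    q = q.astype(np.float64).ravel()
    return (p @ q) ** 2 / ((p @ p) * (q @ q))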
You'll need pattern recognition for that. To determine small differences between two images, Hopfield nets work fairly well and are quite easy to implement. I don't know any available implementations, though.
A ruby solution can be found here
From the readme:
Phashion is a Ruby wrapper around the pHash library, "perceptual hash", which detects duplicate and near duplicate multimedia files
How to measure the similarity between two images depends entirely on what you would like to measure - for example contrast, brightness, modality, noise - and then on choosing the most suitable similarity measure. You can choose MAD (mean absolute difference) or MSD (mean squared difference), which are good for measuring brightness differences; there is also CR (correlation coefficient), which is good at representing the correlation between two images. You could also choose histogram-based similarity measures like SDH (standard deviation of the difference image histogram), or multi-modality similarity measures like MI (mutual information) or NMI (normalized mutual information).
Because these similarity measures are costly in time, it is advisable to scale the images down before applying them.
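For the two simplest of those measures, a sketch with Pillow and NumPy (the 64x64 downscale and grayscale conversion are my own arbitrary choices, following the advice to shrink the images first):

import numpy as np
from PIL import Image

def load_small(path, size=(64, 64)):
    return np.asarray(Image.open(path).convert("L").resize(size), dtype=np.float64)

def mad(a, b):   # mean absolute difference
    return np.abs(a - b).mean()

def msd(a, b):   # mean squared difference
    return ((a - b) ** 2).mean()

a, b = load_small("a.png"), load_small("b.png")
print(mad(a, b), msd(a, b))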
I wonder (and I'm really just throwing the idea out there to be shot down) if something could be derived by subtracting one image from the other, compressing the resulting image as a JPEG or GIF, and taking the file size as a measure of similarity.
If you had two identical images, the difference would be a completely flat (all-zero) image, which would compress really well. The more the images differed, the more complex the difference would be to represent, and hence the less compressible.
Probably not an ideal test, and probably much slower than necessary, but it might work as a quick and dirty implementation.
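A quick-and-dirty version of that experiment with Pillow (I use PNG here rather than GIF or JPEG to keep the compression lossless; the byte count of the compressed difference serves as the dissimilarity score):

import io
import numpy as np
from PIL import Image

def diff_compressed_size(path_a, path_b, size=(256, 256)):
    a = np.asarray(Image.open(path_a).convert("L").resize(size), dtype=np.int16)
    b = np.asarray(Image.open(path_b).convert("L").resize(size), dtype=np.int16)
    diff = Image.fromarray(np.abs(a - b).astype(np.uint8))
    buf = io.BytesIO()
    diff.save(buf, format="PNG")      # flat (identical) differences compress to almost nothing
    return len(buf.getvalue())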
You might look at the code for the open source tool findimagedupes, though it appears to have been written in Perl, so I can't say how easy it will be to parse...
Reading the findimagedupes page that I linked, I see that there is a C++ implementation of the same algorithm. Presumably this will be easier to understand.
And it appears you can also use gqview.
Well, not to answer your question directly, but I have seen this happen. Microsoft recently launched a tool called PhotoSynth which does something very similar to determine overlapping areas in a large number of pictures (which could be of different aspect ratios).
I wonder if they have any available libraries or code snippets on their blog.
To expand on Vaibhav's note, Hugin is an open-source 'autostitcher' which should offer some insight into the problem.
There's software for content-based image retrieval, which does (partially) what you need. All references and explanations are linked from the project site and there's also a short text book (Kindle): LIRE
You can use a Siamese network to see whether two images are similar or dissimilar, following this tutorial. The tutorial clusters similar images, and you can use the L2 distance to measure the similarity of two images.
Beyond Compare has pixel-by-pixel comparison for images.
If this is something you will be doing on an occasional basis and doesn't need automating, you can do it in an image editor that supports layers, such as Photoshop or Paint Shop Pro (probably GIMP or Paint.Net too, but I'm not sure about those). Open both screen shots, and put one as a layer on top of the other. Change the layer blending mode to Difference, and everything that's the same between the two will become black. You can move the top layer around to minimize any alignment differences.
Well, a really basic method would be to go through every pixel colour and compare it with the corresponding pixel colour in the second image - but that's probably a very, very slow solution.