Image classification and image resizing

I have a set of images that I am using for a typical classification problem with TensorFlow. The images come in different sizes, so I wrote a small piece of code to resize them all. But the question is: what is the best resizing strategy for training purposes? For example, is it better to resize them regardless of how they scale up or down, or is it better to keep the aspect ratio and add artificial zero padding around the resized images? I believe this is a typical question with some existing studies or solutions. I appreciate your advice.
Regards,
Hamid
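Both strategies are a couple of lines in TensorFlow. Here is a minimal sketch, assuming float image tensors and an example target size of 224x224 (the size itself is just a placeholder):

```python
import tensorflow as tf

TARGET_H, TARGET_W = 224, 224  # example target size; use whatever your network expects

def resize_stretch(image):
    # Strategy 1: resize directly, ignoring the aspect ratio (the image may be distorted).
    return tf.image.resize(image, [TARGET_H, TARGET_W])

def resize_keep_aspect(image):
    # Strategy 2: keep the aspect ratio and zero-pad to the target size.
    return tf.image.resize_with_pad(image, TARGET_H, TARGET_W)
```

Which of the two works better is usually an empirical question for your dataset: padding preserves object shapes, while stretching uses every input pixel for image content.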

Related

Image Dimension Size in Denoising Image using CNN

I have a set of synthetically noisy images. An example is shown below:
I also have their corresponding clean text images as my ground-truth data. Example below:
The dimensions of the two images are 4918 x 5856. Is this an appropriate size for training my convolutional neural network that will perform image denoising? If not, what should I do? Resize or crop? Thanks.
This resolution really is overkill. You could start with 1/64 of the area (roughly 615 x 732), which is already pretty big.
I was facing this problem recently as well. I learned that you need to crop the image into patches of about 500x500 each, denoise each patch, and then stitch them all back together. This usually gives the most accurate results. Let me know if you need anything else!
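As a rough illustration of the patch idea (just a sketch, assuming a 2-D numpy image and some denoise_fn you already have; the 500x500 patch size is the one mentioned above):

```python
import numpy as np

PATCH = 500  # patch size from the suggestion above

def denoise_by_patches(image, denoise_fn):
    """Split the image into PATCH x PATCH tiles, denoise each tile,
    and stitch the results back together."""
    h, w = image.shape[:2]
    out = np.zeros_like(image)
    for y in range(0, h, PATCH):
        for x in range(0, w, PATCH):
            tile = image[y:y + PATCH, x:x + PATCH]
            out[y:y + PATCH, x:x + PATCH] = denoise_fn(tile)
    return out
```

Border tiles can come out smaller than 500x500; if your network needs a fixed input size, pad them before denoising and crop the padding off afterwards.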

Best CNN architectures for small images (80x80)?

I'm new to the computer vision area and I hope you can help me with some fundamental questions regarding CNN architectures.
I know some of the most well-known ones are:
VGG Net
ResNet
Dense Net
Inception Net
Xception Net
They usually need input images of around 224x224x3, and I have also seen 32x32x3.
Regarding my specific problem, my goal is to train on biomedical images of size 80x80 for 4-class classification - at the end I'll have a dense layer with 4 units. My dataset is also quite small (1000 images) and I wanted to use transfer learning.
Could you please help me with the following questions? It seems to me that there is no single correct answer to them, but I need to understand what the correct way of thinking about them should be. I would appreciate some pointers as well.
1. Should I scale my images up to the expected input size? How about the opposite, shrinking them to 32x32 inputs?
2. Should I change the input of the CNNs to 80x80? Which parameters should I change, mainly? Any specific ratio for the kernels and the parameters?
3. I also have another problem: the input requires 3 channels (RGB) but I'm working with grayscale images. Will that change the results a lot?
4. Instead of scaling, should I just fill the surroundings (between the 80x80 and the 224x224) with background? Should the images be centered in that case?
5. Do you have any recommendations regarding which architecture to choose?
6. I've seen some adaptations of these architectures to 3D/volume inputs instead of 2D/images. I have a problem similar to the one described here but with 3D inputs. Is there any common reasoning when choosing a 3D CNN architecture instead of a 2D one?
Thanks in advance!
I am assuming you have basic know-how in using CNNs for classification.
Answering questions 1-3:
You scale your images for several purposes. The smaller the image, the faster the training and inference. However, you lose important information in the process of shrinking the image. There is no one right answer and it all depends on your application. Is real-time processing important? If your answer is no, stick to the original size.
You will also need to resize your images to fit the input size of predefined models if you plan to retrain them. However, since your images are grayscale, you will need to either find models trained on grayscale data or create a 3-channel image by copying the same value into the R, G and B channels. This is not efficient, but it lets you reuse the high-quality models trained by others.
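For the grayscale-to-RGB trick, a minimal TensorFlow sketch (assuming an HxWx1 float tensor and a model expecting 224x224 inputs - both are assumptions, adjust to your model):

```python
import tensorflow as tf

def prepare_for_pretrained(gray_image):
    # gray_image: HxWx1 tensor. Copy the single gray channel into R, G and B,
    # then resize to the input size the pretrained model expects.
    rgb = tf.image.grayscale_to_rgb(gray_image)
    return tf.image.resize(rgb, [224, 224])
```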
The best way I see for you to handle this problem is to train everything from scratch. 1000 may seem like a small amount of data, but since your domain is specific and only requires 4 classes, training from scratch doesn't seem that bad.
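If you do train from scratch, something small is usually enough for 80x80 grayscale inputs and 4 classes. A minimal Keras sketch (the filter counts and dropout rate are illustrative choices, not a recommendation):

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_small_cnn(num_classes=4, input_shape=(80, 80, 1)):
    # Deliberately small: ~1000 images cannot support a very large network.
    return models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.GlobalAveragePooling2D(),
        layers.Dropout(0.5),
        layers.Dense(num_classes, activation="softmax"),
    ])

model = build_small_cnn()
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```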
Question 4
When the size is different, always scale. Filling the surroundings will cause the model to learn the empty space, and that is not what we want.
Also make sure the input size and format at inference time are the same as the input size and format during training.
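One simple way to enforce that is to route both training and inference through the same preprocessing function, for example (the size and scaling here are placeholders):

```python
import tensorflow as tf

def preprocess(image):
    # Used for both training and inference, so the model never sees
    # a size or value range it was not trained on.
    image = tf.image.resize(image, [80, 80])
    return tf.cast(image, tf.float32) / 255.0
```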
Question 5
If processing time is not a problem, use ResNet. If processing time is important, then use MobileNet.
Question 6
Depends on your input data. If you have 3D data then you can use it. More input data usually helps classification. But 2D will be enough to solve certain problems. If you can classify the images by looking at the 2D images, most probably 2D images will be enough to complete the task.
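The mechanical difference between the two is just the layer type and the input rank. A minimal Keras illustration (the shapes are examples only):

```python
from tensorflow.keras import layers, models

# 2-D case: inputs are images, shape (height, width, channels)
model_2d = models.Sequential([
    layers.Input(shape=(80, 80, 1)),
    layers.Conv2D(32, kernel_size=3, activation="relu"),
])

# 3-D case: inputs are volumes, shape (depth, height, width, channels)
model_3d = models.Sequential([
    layers.Input(shape=(80, 80, 80, 1)),
    layers.Conv3D(32, kernel_size=3, activation="relu"),
])
```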
I hope this clears up some of your problems and points you toward a proper solution.

Using a TreeMap with images

To represent the most popular artists from the EchoNest API, I've been trying to set up Silverlight Toolkit's TreeMap using images, with their TreeItemDefinition.ValueBinding defined as the area of the image.
While it mostly fills up the space when the image stretch is set to 'Fill':
When setting the image stretch to 'Uniform', a lot of blank space remains:
On this post, image carving is suggested : Treemapping with a given aspect ratio
How can I know which images should be carved, and at what dimensions they should be carved, if this is possible at all?
Is this problem solvable without human intervention, with a good result?
I don't think there is a way to know which images should be carved and at what dimensions. An ok-ish heuristic might be to check whether the mean energy of an image is above a certain threshold (this can be refined to check only blocks of every image and combine the results later: if the image has blocks without detail/energy, it can be carved, at least in that section).
What I think would be better is to apply seam carving to the already composed image: that will try to carve out the white outlines (adding "artificial" energy to the patches of images might lead to even better results, preserving the shape of each image more). This paper might be of use for checking out other image resizing methods too.
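The mean-energy check from the first paragraph could look roughly like this (a sketch only, assuming grayscale numpy arrays; the block size and whatever threshold you compare against are arbitrary and would need tuning):

```python
import numpy as np

def block_energies(gray, block=64):
    """Mean gradient magnitude per block; low-energy blocks have little detail
    and are the natural candidates for carving."""
    gy, gx = np.gradient(gray.astype(float))
    energy = np.abs(gx) + np.abs(gy)
    h, w = energy.shape
    means = []
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            means.append(energy[y:y + block, x:x + block].mean())
    return np.array(means)
```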

How to detect subjective image quality

For an image-upload tool I want to detect the (subjective) quality of an image automatically, resulting in a rating of the quality.
I have the following idea to realize this heuristically:
Obviously incorporate the resolution into the rating.
Compress it to JPG (75% quality), then compare the JPG size vs. the decompressed size to get a compression ratio. The blurrier the image is, the higher the ratio, since blurry images compress better.
Obviously my approach would use up a lot of cycles and memory if large images are rated, although this would be acceptable in my scenario (fat server, not many uploads), and I could always build in a "short circuit" around the more expensive steps if the image exceeds a certain resolution.
Is there something else I can try, or is there a way to do this more efficiently?
Assessing image quality (the same goes for sound or video) is not an easy task, and there are numerous publications tackling the problem.
Much depends on the nature of the image - a different set of criteria is appropriate for artificially created images (e.g. diagrams) than for natural images (e.g. photographs). There are subtle effects that have to be taken into consideration - like color masking, luminance masking and contrast perception. For some images a given compression ratio is perfectly adequate, while for others it will result in a significant loss of quality.
Here is a free-access publication giving a brief introduction to the subject of image quality evaluation.
The method you mentioned - compressing the image and comparing the result with the original - is far from perfect. What metric do you plan to use? MSE? MSE per block? It is certainly not too difficult to implement, but the results will be difficult to interpret (consider images with and without high-frequency components).
And if you want to delve deeper into the area of image quality assessment, there is also a lot of research done by the machine learning community.
You could try looking in the EXIF tags of the image (using something like exiftool), what you get will vary a lot. On my SLR, for example, you even get which of the focus points were active when the image was taken. There may also be something about compression quality.
The other thing to check is the image histogram - watch out for histograms biased to the left, which suggests under-exposure, or with lots of saturated pixels.
For image blur you could look at the high-frequency components of the Fourier transform; this is probably tapping into the same information the JPG compression uses anyway.
This is a bit of a tricky area because most "rules" you might be able to implement could arguably be broken for artistic effect.
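For concreteness, here is a rough numpy sketch of the histogram and Fourier checks suggested above (the bin ranges and the frequency cutoff are arbitrary choices, not established values):

```python
import numpy as np

def exposure_stats(gray):
    # Fraction of pixels in the darkest and brightest bins of an 8-bit histogram.
    # A heavy left bias hints at under-exposure; a spike on the right at saturation.
    hist, _ = np.histogram(gray, bins=256, range=(0, 256))
    total = hist.sum()
    return hist[:16].sum() / total, hist[240:].sum() / total

def high_freq_share(gray, cutoff=0.25):
    # Share of spectral energy outside the low-frequency core of the 2-D FFT.
    # Blurry images concentrate energy at low frequencies, so this share drops.
    spectrum = np.abs(np.fft.fftshift(np.fft.fft2(gray.astype(float)))) ** 2
    h, w = spectrum.shape
    ch, cw = int(h * cutoff / 2), int(w * cutoff / 2)
    low = spectrum[h // 2 - ch:h // 2 + ch, w // 2 - cw:w // 2 + cw].sum()
    return 1.0 - low / spectrum.sum()
```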
I'd like to shoot down the "obviously incorporate resolution" idea. Resolution tells you nothing. I can scale an image by a factor of 2, quadrupling the number of pixels. This adds no information whatsoever, nor does it improve quality.
I am not sure about the "compress to JPG" idea. JPG is a photo-oriented algorithm. Not all images are photos. Besides, a blue sky compresses quite well. Uniformly grey even better. Do you think exact cloud types determine the image quality?
Sharpness is a bad idea, for similar reasons. Depth of Field is not trivially related to image quality. Items photographed against a black background will have a lot of pixels with quite low intensity, intentionally. Again, this does not signal underexposure, so the histogram isn't a good quality indicator by itself either.
But what if the photos are "commercial"? Does the existing technology still have value if the photos are of everyday objects and purposefully non-artistic?
If I hire hundreds of people to take pictures of park benches I want to quickly know which pictures are of better quality (in-focus, well-lit) and which aren't. I don't want pictures of kittens, people, sunsets, etc.
Or what if the pictures are supposed to be of items for a catalog? No models, just garments. Would image-quality processing help there?
I'm also really interested in working out how blurry a photograph is.
What about this:
measure the byte size of the image when compressed as JPEG
downscale the image to 1/4th
upscale it 4x, using some kind of basic interpolation
compress that version using JPEG
compare the sizes of the two compressed images.
If the size did not go down much (past some percentage threshold), then downscaling and upscaling did not lose much information, and therefore the original image is effectively the same as something that has already been zoomed.
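A rough Pillow version of that test might look like this (the bilinear filter, the 1/4 factor and the 0.9 threshold are all arbitrary choices for the sketch):

```python
import io
from PIL import Image

def jpeg_size(img, quality=75):
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=quality)
    return buf.tell()

def looks_already_blurry(path, threshold=0.9):
    # Compare the JPEG size of the original with a version that was downscaled
    # to 1/4 linear size and scaled back up. If the round trip barely shrinks
    # the file, the original carried little extra detail to begin with.
    img = Image.open(path).convert("RGB")
    original = jpeg_size(img)
    small = img.resize((max(1, img.width // 4), max(1, img.height // 4)), Image.BILINEAR)
    roundtrip = jpeg_size(small.resize(img.size, Image.BILINEAR))
    return roundtrip / original > threshold
```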

Are there algorithms for increasing resolution of an image? [closed]

Are there any algorithms or tools that can increase the resolution of an image - besides just a simple zoom that makes each individual pixel in the image a little larger?
I realize that such an algorithm would have to invent pixels that don't really exist in the original image, but I figured there might be some algorithm that could intelligently figure out what pixels to add to the image to increase its resolution.
Interpolation: Image Scaling
For actual algorithms check out image interpolation.
The simple answer to your question is, "Yes there are algorithms, but none of them are very good." As you mentioned in the question, the limiting factor is the need to invent pixels in order to increase resolution beyond a small amount. (That's why you can't really read a license plate number from the reflection in someone's glasses off of a photo taken from a CCTV security camera, like they do in CSI: Miami.)
If all you want to do is create a larger image (for a wall hanging, or such like) then you can use a plugin for Photoshop that will smooth transitions between pixels using existing information. It can't create new pixels, but it can get rid of that boxy, pixelated look.
Addendum to the previous answers: Please note that the answer to your question depends heavily on what exactly you mean by resolution - of the display device, of the capture device, or of the viewing device (i.e., the human eye.) I assume you're talking about raster images (the problem wouldn't exist for vector images.)
You must accept that a picture taken at a higher resolution will contain more image information (i.e. details) than a picture of the same scene taken at a lower resolution. There is no way to add this information out of thin air. Scaling algorithms synthesize some information based on the assumption of continuity between the discrete raster image elements. That "new" information is not actually new but derived from the pre-existing picture information, hence it cannot be considered to have a 100% probability of matching the original scene. Better algorithms might yield better probabilities, but their results will always have a match probability of less than 1.
Enlarging images is risky. Beyond a certain point, enlarging images is a fool's errand; you can't magically synthesize an infinite number of new pixels out of thin air. And interpolated pixels are never as good as real pixels. That's why it's more than a little artificial to upsize the 512x512 Lena image by 500%. It'd be smarter to find a higher resolution scan or picture of whatever you need than it would be to upsize it in software.
From Jeff Atwood
One way to increase resolution is to take multiple exposures, upsize them to 4x areal (2x linear both ways) and use stacking software to merge the images. The final image will be better than any of the originals.
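A naive sketch of the stacking step (this assumes the exposures are the same size and already aligned; real stacking software also registers the frames, which is most of the work):

```python
import numpy as np
from PIL import Image

def stack_exposures(paths, scale=2):
    # Upsize each exposure 2x linearly (4x areal) and average them.
    frames = []
    for p in paths:
        img = Image.open(p).convert("RGB")
        big = img.resize((img.width * scale, img.height * scale), Image.BICUBIC)
        frames.append(np.asarray(big, dtype=np.float64))
    return np.mean(frames, axis=0).astype(np.uint8)
```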
You can try vectorizing the image with tools like autotrace or potrace and using it at whatever resolution you like. But it is computationally very costly, so you end up with an image with few colors/features - and even fewer if you need to process the whole image quickly.
Super-resolution algorithms might help in some cases.
I don't know everything that's involved (soft/hardware and the initial images necessary), but if you're interested, here are some links:
http://almalence.com/doc/superresolution-comparison/
(Seems like Almalence’s PhotoAcute fares the best of the ones tested in this article - $30 or $150). They are at: www. photoacute dot com
Markov Random Fields for SR – a free software package (MIT & Microsoft project)
http://people.csail.mit.edu/billf/project%20pages/sresCode/Markov%20Random%20Fields%20for%20Super-Resolution.html
Most decent image editors have smoothing/interpolating filters to do this kind of resizing/resampling, e.g. IrfanView, which gives you several options for interpolation filters. See Lanczos resampling. ImageMagick's convert program allows you to do this as well, after specifying a filter.
If you need to do this algorithmically, check out the Image Scaling link suggested by Draemon. What platform will you be doing these interpolations on? Most graphics libraries will have a variety of approaches implemented, allowing you to balance speed against quality.
If you just need to resize some images, I recommend GIMP. It can resize images in a variety of ways, at least one of which should produce excellent results in any situation.
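If you prefer a script to an editor, the same kind of interpolated resampling is a few lines with Pillow (the file names here are placeholders; Lanczos is the filter mentioned above):

```python
from PIL import Image

img = Image.open("input.png")  # placeholder file name
# Double the linear resolution with Lanczos resampling. This interpolates;
# it cannot recover detail that was never captured.
upscaled = img.resize((img.width * 2, img.height * 2), Image.LANCZOS)
upscaled.save("output.png")
```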
As others are pointing out, you can't expect a scaling method to invent information that isn't in the original image. So you can't expect it to be like the moments in CSI where they "zoom and enhance" to see the number on a license plate that was hopelessly blurred in the original image.
