Does image file type matter in terms of accuracy or speed when training/evaluating in machine learning?

I would like to know if the image file type matters at all in image classification using Keras, TensorFlow, or any other machine learning library. For example:
If I train using only JPG files, will the accuracy be significantly affected if I evaluate the model using only PNG files?
If so, would it be better to train using both JPG and PNG files so I can evaluate using both types?
Or does the image file type not matter at all?

The file type does not matter.
During training (and inference, for that matter), images are converted into tensors (you can think of a tensor as a multidimensional array) where each pixel is represented by a small group of numbers (or a single number for grayscale images).
Machine learning is performed on these tensors rather than on the image file itself, so the original file format really doesn't matter.
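As a minimal sketch (the file names here are hypothetical), here is how a JPG and a PNG both decode into the same kind of array before any learning happens:

    import numpy as np
    from PIL import Image

    # Decode each file into a float array in [0, 1]; the file format
    # disappears at this point.
    jpg = np.asarray(Image.open("cat.jpg").convert("RGB"), dtype=np.float32) / 255.0
    png = np.asarray(Image.open("cat.png").convert("RGB"), dtype=np.float32) / 255.0

    print(jpg.shape, png.shape)  # both (height, width, 3)

The one caveat worth noting: JPEG is lossy, so pixel values can differ slightly from a lossless PNG of the same photo, but the tensor's type and structure are identical either way.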

Related

Why does everybody convert images to grayscale before performing operations in OpenCV?

I've been trying to find the answer to why everybody converts an image to grayscale before processing it.
For example, this website with instructions teaching people how to build a simple scanning program converts the photo to grayscale first before issuing commands to manipulate the image itself.
In the second example, this thread on Stack Overflow shows a person also converting the image to grayscale before extracting text from it.
Does this process make the image easier to manipulate? Or does it give better results when extracting text? If so, shouldn't a binary image give the best result in the case of extracting text?
More often than not, grayscale has all the relevant information to complete a particular task. So reducing the image to grayscale greatly simplifies calculations and removes redundancies.
A binary image is great too, but it sacrifices too much information to be useful in many cases. And most libraries support a minimum of 8-bit image processing anyway, so a truly binary data structure is rarely useful.
Imagine having to create a program to recognize text on paper. Having a color image doesn't help you read the text any better. The text can be in various colors, but you can read it even if it's in black and white. You can argue that a binary image should give the same performance, and that is true IF there is no noise, such as a shadow on the paper.
Once noise elements exist in the image, you need more information to separate text from noise, and that is when grayscale is useful.
Moreover, the most used and reliable information for advanced image processing is edges and textures, both of which can be obtained from a grayscale image.
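A minimal OpenCV sketch of that reasoning (the file name is hypothetical): a single global threshold breaks down under uneven lighting such as a shadow, while an adaptive threshold computed from the grayscale neighborhood copes, and edge detection runs on grayscale as well:

    import cv2

    img = cv2.imread("page.jpg")                    # 3-channel BGR image
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)    # 1 channel: a third of the data

    # A single global threshold fails under uneven lighting (e.g. a shadow):
    _, naive = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)

    # An adaptive threshold uses the local grayscale neighborhood instead:
    binary = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                   cv2.THRESH_BINARY, 31, 10)

    edges = cv2.Canny(gray, 100, 200)               # edges come from grayscale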

Caffe - training autoencoder with image data/image label pairs

I am very unfamiliar with Caffe. My task is to train an autoencoder net on image pairs given in .tif format, where one is a grayscale image of nerves and the other is the corresponding binary mask, which shows whether a certain structure is present in the image or not. I have these in the same "train" folder. What I would like to accomplish is a meaningful experiment with these images (segmentation, classification; it is not specified). My first problem is that I do not know how to feed the images into the net without an existing train.txt. Can I use the images directly, or is another format like LMDB or HDF5 needed? Any suggestion is appreciated.
You can accomplish this with simple classification using an existing network (e.g. AlexNet, GoogLeNet, LeNet). You can use only the binary mask or the grayscale image, plus the class name, to do this. NVIDIA DIGITS is a good graphical tool for building the dataset pairs and running the training.
Please see this link:
https://developer.nvidia.com/digits
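If you do want to feed images into Caffe directly rather than through DIGITS, its ImageData layer reads a plain text file with one "<image_path> <integer_label>" pair per line. Here is a minimal sketch of generating such a train.txt, assuming the masks sit next to the images with a "_mask" suffix (that naming rule is an assumption) and that "structure present" means the mask contains at least one nonzero pixel:

    import os

    import numpy as np
    from PIL import Image

    train_dir = "train"
    with open("train.txt", "w") as f:
        for name in sorted(os.listdir(train_dir)):
            if not name.endswith(".tif") or name.endswith("_mask.tif"):
                continue
            mask_path = os.path.join(train_dir, name.replace(".tif", "_mask.tif"))
            mask = np.array(Image.open(mask_path))
            label = int(mask.any())  # 1 if the structure appears anywhere
            f.write("%s %d\n" % (os.path.join(train_dir, name), label))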

How do I convert photographs to tensors?

I am a neophyte neural network user trying to get to grips with TensorFlow. I have used the MNIST dataset as a test, and would now like to use real world data.
Can anyone point me to a "Howto" or paper or source which tells me how to go about converting digital photographs in files (JPEG, PNG, GIF, WMF) into tensors ready for import into TensorFlow, please?
Cheers!
You can use the TensorFlow image functions to load images and convert them into tensors. After loading the images, you will likely want to look at tf.image.resize_bilinear to resize the images to standard sizes.
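A minimal sketch of that flow (the file name is hypothetical; in TensorFlow 2 the old resize ops such as tf.image.resize_bilinear were folded into tf.image.resize, which is bilinear by default). Note that tf.io.decode_image handles JPEG, PNG, GIF, and BMP, but not WMF:

    import tensorflow as tf

    raw = tf.io.read_file("photo.jpg")                   # raw, format-specific bytes
    img = tf.io.decode_image(raw, channels=3,            # decoded pixel tensor
                             expand_animations=False)    # GIFs: first frame only
    img = tf.image.convert_image_dtype(img, tf.float32)  # scale to [0, 1]
    img = tf.image.resize(img, [224, 224])               # a standard, batchable size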
The standard way to load data into TensorFlow is to use a TFRecords file.
Another approach is to convert whatever data you have into a supported format. This approach makes it easier to mix and match data sets and network architectures. The recommended format for TensorFlow is a TFRecords file containing tf.train.Example protocol buffers.
- TensorFlow documentation
Basically, a TFRecord is a binary representation of your data or images along with their labels, file names, and other information. Its main advantages are that it lets you stream data into the model efficiently using TensorFlow's threading, and that it increases flexibility when mixing different datasets and models.
You can use this script to generate your own TFRecord files.
Additionally, you can read about how to use the script here.
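To make the format concrete, here is a minimal sketch (the file name and label are placeholders) of writing one image into a TFRecord file as a tf.train.Example:

    import tensorflow as tf

    with tf.io.TFRecordWriter("images.tfrecord") as writer:
        with open("cat.jpg", "rb") as img_file:
            raw = img_file.read()                    # store the encoded bytes as-is
        example = tf.train.Example(features=tf.train.Features(feature={
            "image_raw": tf.train.Feature(bytes_list=tf.train.BytesList(value=[raw])),
            "label": tf.train.Feature(int64_list=tf.train.Int64List(value=[0])),
        }))
        writer.write(example.SerializeToString())    # one serialized record per image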

Can PDFBox extract vector images?

As per my understanding:
1. Images in .eps format are vector images.
2. When we draw something in Word (like a flowchart), it is stored as a vector image.
I am almost sure about the first, but not sure about the second. Please correct me if I am wrong.
Assuming these two things: when a LaTeX file (with .eps images inserted) or a Word file (containing vector images) is converted into a PDF, do the images get converted into raster images?
Also, I think PDFBox/xpdf can only extract raster images from the PDF (as they are embedded as XObjects), not vector images. Is that understanding correct? This question on Stack Overflow is related, but has not been answered yet.
Your point 1 is incorrect: EPS files are PostScript programs. They may contain vector information, or text, or image data, or all of the above.
As for point 2: in PDF there is no such thing as a 'vector image'. An image means a bitmap and therefore cannot be vector.
If you convert a PostScript program to a PDF file, then the result depends entirely on the conversion program you use. In general, vectors will be retained as vectors, and text as text. However, it is entirely possible that an application might render the entire PostScript program and insert the result as an image in the PDF.
So the answer to your first question ("do the images get converted into raster images") is 'maybe, but probably not'.
I'm afraid I have no idea about the capabilities of PDFBox/xpdf, but since collections of vectors may not be arranged as 'images' in any atomic fashion (they could be held as Form XObjects, or Patterns), there isn't any obvious way to know when to stop extracting. And what format would you store the result in anyway?

Match two images in different formats

I'm working on a software project in which I have to compare a set of 'input' images against another 'source' set of images and find out if there is a match between any of them. The source images cannot be edited/modified in any way; the input images can be scaled/cropped in order to find a match. The images can be in BMP, JPEG, GIF, PNG, or TIFF, of any dimensions.
A constraint: I'm not allowed to use any external libraries. ImageMagick is an exception and can be used.
I intend to use Java/Python. The software is purely command-line based.
I was reading on SO about some common image-comparison algorithms. I'm planning to take two approaches:
1. I could use histograms/buckets to compare the RGB values of the two images (a rough sketch follows this list).
2. Use SIFT/SURF to find keypoint descriptors, compute the Euclidean distance between them, and output the result based on the resultant distance.
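For concreteness, a rough sketch of approach 1 (OpenCV stands in purely for illustration here, despite the library constraint above; the file names are hypothetical):

    import cv2

    a = cv2.imread("input.bmp")   # both decode to BGR pixel arrays,
    b = cv2.imread("source.jpg")  # regardless of the on-disk format

    # 8x8x8-bin color histograms, normalized so image sizes don't matter.
    hist_a = cv2.calcHist([a], [0, 1, 2], None, [8, 8, 8], [0, 256] * 3)
    hist_b = cv2.calcHist([b], [0, 1, 2], None, [8, 8, 8], [0, 256] * 3)
    cv2.normalize(hist_a, hist_a)
    cv2.normalize(hist_b, hist_b)

    score = cv2.compareHist(hist_a, hist_b, cv2.HISTCMP_CORREL)  # 1.0 = identical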
The two images in a comparison can be in different formats. An intuitive thought is that, before analysis/comparison, the two images must be converted to a common format. I reasoned that the image should be converted to the one with lesser quality, e.g. if the two input images are BMP and JPEG, convert the BMP to JPEG. This can be thought of as a pre-processing step.
My question:
Is image conversion to a common format required? Can two images of different formats be compared? If they have to be converted before comparison, is my assumption of converting from the higher-quality format (BMP) to the lower-quality one (JPEG) correct? It'd also be helpful if someone could suggest some algorithms for image conversion.
EDIT
A match is said to be found if the pattern image is found in the source image.
Say, for example, the source image consists of a football field with one player. If the pattern image contains the player EXACTLY as he is in the source image, then it's a match.
No, conversion to a common format on disk is not required, and likely not helpful. If you extract feature descriptors from an image (SIFT/SURF, for example), it matters much less how the original images were stored on disk. The feature descriptors should be invariant to small compression artifacts.
A bit more...
Suppose you have a BMP that is an image of object X in your source dataset.
Then, in your input/query dataset, you have another image of object X, but it has been saved as a JPEG.
You have no idea what noise was introduced in the capture and encoding process that produced either of these images. There are lighting differences, atmospheric effects, lens effects, sensor noise, tone mapping, gamut mapping. Some of these vary from image to image, others vary from camera to camera. All of this happens before the image even gets saved to storage in the camera. Yes, there are also JPEG compression artifacts, but assuming the BMP is "higher" quality and then degrading it through JPEG compression will not help. Perhaps the BMP has even gone through JPEG compression before being saved as a BMP.
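To make the descriptor approach concrete, here is a minimal sketch using OpenCV (purely illustrative given the question's library constraint; the file names and the match threshold are assumptions). Both files decode to plain pixel arrays regardless of format, and only the descriptors are compared:

    import cv2

    # Formats don't matter here: both decode to grayscale pixel arrays.
    src = cv2.imread("source.bmp", cv2.IMREAD_GRAYSCALE)
    pat = cv2.imread("pattern.jpg", cv2.IMREAD_GRAYSCALE)

    sift = cv2.SIFT_create()
    kp_pat, des_pat = sift.detectAndCompute(pat, None)
    kp_src, des_src = sift.detectAndCompute(src, None)

    # Lowe's ratio test: keep matches whose best distance clearly beats
    # the second best, then decide from the count of surviving matches.
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    good = [m for m, n in matcher.knnMatch(des_pat, des_src, k=2)
            if m.distance < 0.75 * n.distance]
    print("match" if len(good) > 10 else "no match")  # threshold 10 is a guess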
