Theory: is it possible to save a circular image?

I'm wondering if it is possible to save a circular image, or if it always has to be a square or rectangular image even if you select a circular ROI?
No specific language; this is more of a theoretical question, so that I know the maximum size I can process.
In my case, I apply a circular mask in MATLAB to a BMP image and it returns square images. I don't need the information around the circle, and therefore I want to reduce the size of my image to save computational cost in the following processing steps.

File formats like PNG or JPG simply do not make sense for a circular image, because the file formats already imply that there is a rectangular collection of pixels to be stored.
Of course, one could define one's own file format. There is nothing preventing you from defining a file format, maybe similar to PNG, that additionally stores a radius (in pixels) and, beyond that, stores only the pixels that are part of the circular region.
Another option is simply a different representation in memory. You could, for example, define a circular region of pixels in memory:
PPPP
PPPPPP
PPPPPPPP
PPPPPP
PPPP
Then you could arrange these pixels (in memory!) to fill a rectangular region:
PPPP PPPPPP PPPPPPPP
PPPPPP PPPP ........
and save this as an image. (Note that the image format has to be lossless. Storing such an image as JPG would not make sense for various reasons). When decoding this image (i.e. when reading the image file), you would additionally need information about the original radius of the circle. Then you could construct a new circular region of pixels, filled with the pixels that have been read from the (rectangular) image.
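That pack/unpack idea can be sketched in a few lines (Python with NumPy here purely for illustration; the idea is language-agnostic, and the function names are made up):

```python
import numpy as np

def circle_mask(radius):
    # Boolean mask of a disc with the given radius, centered in a
    # (2*radius+1) x (2*radius+1) grid.
    d = 2 * radius + 1
    y, x = np.ogrid[:d, :d]
    return (x - radius) ** 2 + (y - radius) ** 2 <= radius ** 2

def pack(image, radius):
    # Keep only the pixels inside the circle, in row-major order.
    return image[circle_mask(radius)]

def unpack(packed, radius, fill=0):
    # Rebuild the square image; the "corner" pixels get a fill value.
    mask = circle_mask(radius)
    out = np.full(mask.shape, fill, dtype=packed.dtype)
    out[mask] = packed
    return out

img = np.arange(25).reshape(5, 5)   # toy 5x5 "image", circle radius 2
flat = pack(img, 2)                 # only the 13 in-circle pixels survive
restored = unpack(flat, 2)          # corners come back as the fill value
```

Storing `flat` plus the radius is exactly the hypothetical file format described above; the radius alone is enough to reconstruct the mask on read.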
Something like this might even make sense when you have a very large circular region, and want to avoid saving the (useless) "corner pixels" in a file. But I doubt that there are realistic application cases for something like this.

Related

Image resizing method during preprocessing for neural network

I am new to machine learning. I am trying to create an input matrix (X) from a set of images (Stanford dog set of 120 breeds) to train a convolutional neural network. I aim to resize images and turn each image into one row by making each pixel a separate column.
If I directly resize images to a fixed size, the images lose their originality due to squishing or stretching, which is not good (first solution).
I can resize by fixing either width or height and then cropping, so that all resulting images are the same size, e.g. 100x100, but critical parts of the image can be cropped away (second solution).
I am thinking of another way of doing it, but I am not sure about it. Assume I want 10000 columns per image. Instead of resizing images to 100x100, I will resize each image so that its total pixel count is around 10000 pixels. So images of size 50x200, 100x100 and 250x40 will all be converted into 10000 columns. For other sizes like 52x198, only the first 10000 pixels out of 10296 will be considered (third solution).
The third solution seems to preserve the original shape of the image. However, it may lose all of that originality when converted into a row, since not all images are of the same size. I wonder about your comments on this issue. It would also be great if you could direct me to sources where I can learn about the topic.
Solution 1 (simply resizing the input image) is a common approach. Unless you have a very different aspect ratio from the expected input shape (or your target classes have tight geometric constraints), you can usually still get good performance.
As you mentioned, Solution 2 (cropping your image) has the drawback of potentially excluding a critical part of your image. You can get around that by running the classification on multiple subwindows of the original image (i.e., classify multiple 100 x 100 sub-images by stepping over the input image horizontally and/or vertically at an appropriate stride). Then, you need to decide how to combine your multiple classification results.
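The subwindow extraction can be sketched like this (window size and stride are hypothetical values; in practice you'd tune both):

```python
import numpy as np

def subwindows(image, size=100, stride=50):
    # Yield every size x size crop, stepping over the image at the
    # given stride; each crop would then be classified separately.
    h, w = image.shape[:2]
    for y in range(0, h - size + 1, stride):
        for x in range(0, w - size + 1, stride):
            yield image[y:y + size, x:x + size]

img = np.zeros((200, 300))           # toy input image
crops = list(subwindows(img))        # 3 rows x 5 columns of windows
```

How you combine the per-window results (max score, voting, averaging) is the separate design decision mentioned above.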
Solution 3 will not work because the convolutional network needs to know the image dimensions (otherwise, it wouldn't know which pixels are horizontally and vertically adjacent). So you need to pass an image with explicit dimensions (e.g., 100 x 100) unless the network expects an array that was flattened from assumed dimensions. But if you simply pass an array of 10000 pixel values and the network doesn't know (or can't assume) whether the image was 100 x 100, 50 x 200, or 250 x 40, then the network can't apply the convolutional filters properly.
Solution 1 is clearly the easiest to implement but you need to balance the likely effect of changing the image aspect ratios with the level of effort required for running and combining multiple classifications for each image.

fast rasterisation and colorization of 2D polygons of known shape to an image file

The shape and positions of all the polygons are known beforehand. The polygons are not overlapping, will be of different colors and shapes, and there could be quite a lot of them. The polygons are defined in floating-point coordinates and will be painted on top of a JPEG photo as annotations.
How could I create the resulting image file as fast as possible after I get to know which color I should give each polygon?
If it would save time I would like to perform as much as possible of the computations beforehand. All information regarding geometry and positions of the polygons are known in advance. The JPEG photo is also known in advance. The only information not known beforehand is the color of each polygon.
The JPEG photo has a size of 250x250 pixels, so that would also be the image size of the resulting rasterised image.
The computations will be done on a Linux computer with a standard graphics card, so OpenGL might be a viable option. I know there are also rasterisation libraries like Cairo that could be used to paint polygons. What I wonder is if I could take advantage of the fact that I know so much of the input in advance and use that to speed up the computation. The only thing missing is the color of each polygon.
Preferably I would like a solution that only precomputes things in the form of data files. In other words, as soon as the polygon colors are known, the algorithm would load the other information from data files (the JPEG file, a polygon geometry file, and/or possibly precomputed data files). Of course it would be faster to start the computation with a "warm" state already in the GPU/CPU/RAM, but I'd like to avoid that. The choice of programming language is not so important, but could for instance be C++.
To give some more background information: The JavaScript library OpenSeadragon that is running in a web browser requests image tiles from a web server. The idea is that measurement points (i.e. the polygons) could be plotted on-the-fly on to pregenerated Zooming Images (DZI format) by the web server. So for one image tile the algorithm would only need to be run one time. The aim is low latency.
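One way to exploit the known geometry (a sketch in Python for illustration; the naive point-in-polygon rasterizer is fine here because it runs only once, offline, and could be replaced by Cairo or OpenGL) is to precompute an index map recording which polygon covers each pixel, save it as a data file, and reduce the per-request work to a single color lookup:

```python
import numpy as np

def point_in_polygon(px, py, poly):
    # Even-odd rule test of one point against a list of (x, y) vertices.
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > py) != (y2 > py):
            if px < x1 + (py - y1) * (x2 - x1) / (y2 - y1):
                inside = not inside
    return inside

def precompute_index_map(polygons, width, height):
    # Done once, offline: -1 means "no polygon" (keep the photo pixel).
    idx = np.full((height, width), -1, dtype=np.int32)
    for k, poly in enumerate(polygons):
        for y in range(height):
            for x in range(width):
                if point_in_polygon(x + 0.5, y + 0.5, poly):
                    idx[y, x] = k
    return idx

def render(photo, idx, colors):
    # The fast path at request time: pure array indexing.
    out = photo.copy()
    covered = idx >= 0
    out[covered] = colors[idx[covered]]
    return out

polygons = [[(2.0, 2.0), (7.0, 2.0), (7.0, 7.0), (2.0, 7.0)]]  # toy square
idx = precompute_index_map(polygons, 10, 10)   # e.g. np.save("idx.npy", idx)
photo = np.zeros((10, 10, 3), dtype=np.uint8)
tile = render(photo, idx, np.array([[255, 0, 0]], dtype=np.uint8))
```

The index map and the decoded photo can both live in data files, so the latency-critical path once the colors arrive is just `render`.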

Matlab - Registration and Cropping of aligned images from two different sources

Good day,
In MATLAB, I have multiple image pairs of various samples. The images in a pair are taken by different cameras. The images are in differing orientations, though I have created transforms (for each image pair) that can be applied to correct that. Their bounds contain the same physical area, but one image has smaller dimensions (i.e. 50x50 vs. 250x250). Additionally, the smaller image is not in a consistent location within the larger image. However, the smaller image is within the borders of the larger image.
What I'd like to do is as follows: after applying my pre-determined transform to the larger image, I want to crop out the part of the larger image that covers the same area as the smaller image.
I know I can specify XData and YData when applying my transforms to output a subset of the transformed image, but I don't know how to relate that to the location of the smaller image. (Note: Transforms were created from control-point structures)
Please let me know if anything is unclear.
Any help is much appreciated.
Seeing how you are specifying control points to get the transformation from one image to another, I'm assuming this is a registration problem. As such, I'm also assuming you are using imtransform to warp one image to another.
imtransform allows you to specify two additional output parameters:
[out, xdata, ydata] = imtransform(in, tform);
Here, in would be the smaller image and tform would be the transformation you created to register the smaller image to warp into the larger image. You don't need to specify the XData and YData inputs here. The inputs of XData and YData will bound where you want to do the transformation. Usually people specify the dimensions of the image to ensure that the output image is always contained within the borders of the image. However in your case, I don't believe this is necessary.
The output variable out is the warped and transformed image that is dictated by your tform object. The other two output variables xdata and ydata are the minimum and maximum x and y values within your co-ordinate system that will encompass the transformed image fully. As such, you can use these variables to help you locate where exactly in the larger image the transformed smaller image appears. If you want to do a comparison, you can use these to crop out the larger image and see how well the transformation worked.
NB: Sometimes the limits of xdata and ydata will go beyond the dimensions of your image. However, because you said that the smaller image will always be contained within the larger image (I'm assuming fully contained), then this shouldn't be a problem. Also, the limits may also be floating point so you'll need to be careful here if you want to use these co-ordinates to crop a minimum spanning bounding box.

Match two images in different formats

I'm working on a software project in which I have to compare a set of 'input' images against another 'source' set of images and find out if there is a match between any of them. The source images cannot be edited/modified in any way; the input images can be scaled/cropped in order to find a match. The images can be in BMP, JPEG, GIF, PNG, or TIFF, of any dimensions.
A constraint: I'm not allowed to use any external libraries. ImageMagick is an exception and can be used.
I intend to use Java/Python. The software is purely command-line based.
I was reading on SO about some common image comparison algorithms. I'm planning to take 2 approaches.
1. I could use histograms/buckets to compare the RGB values of the 2 images being compared.
2. Use SIFT/SURF to find keypoint descriptors, compute the Euclidean distance between them, and output the result based on the resultant distance.
The 2 images being compared can be in different formats. An intuitive thought is that before analysis/comparison, the 2 images must be converted to a common format. I reasoned that the image should be converted to the one with lesser quality, e.g. if the 2 input images are BMP and JPEG, convert the BMP to JPEG. This can be thought of as a pre-processing step.
My question:
Is image conversion to a common format required? Can 2 images of different formats be compared? If they have to be converted before comparison, is my assumption of converting from higher quality (BMP) to lower (JPEG) correct? It would also be helpful if someone could suggest some algorithms for image conversion.
EDIT
A match is said to be found if the pattern image is found in the source image.
Say for example the source image consists of a football field with one player. If the pattern image contains the player EXACTLY as he is in the source image, then its a match.
No, conversion to a common format on disk is not required, and likely not helpful. If you extract feature descriptors from an image (SIFT/SURF, for example), it matters much less how the original images were stored on disk. The feature descriptors should be invariant to small compression artifacts.
A bit more...
Suppose you have a BMP that is an image of object X in your source dataset.
Then, in your input/query dataset, you have another image of object X, but it has been saved as a JPEG.
You have no idea what noise was introduced in the process that produced either of these images. There are lighting differences, atmospheric effects, lens effects, sensor noise, tone mapping, gamut mapping. Some of these vary from image to image, others vary from camera to camera. All this happens before the image even gets saved to storage in the camera. Yes, there are also JPEG compression artifacts, but assuming the BMP is "higher" quality and then degrading it through JPEG compression will not help. Perhaps the BMP has even gone through JPEG compression before being saved as a BMP.
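The same point holds for the histogram approach from the question: it operates on decoded pixel values, so the on-disk format never enters into it. A rough sketch (grayscale for simplicity; bin count is an arbitrary choice):

```python
import numpy as np

def gray_histogram(pixels, bins=64):
    # Normalized grayscale histogram: what gets compared is the
    # decoded pixel data, not the file format it was stored in.
    hist, _ = np.histogram(pixels, bins=bins, range=(0, 256))
    return hist / hist.sum()

def histogram_intersection(h1, h2):
    # 1.0 means identical histograms, 0.0 means fully disjoint.
    return float(np.minimum(h1, h2).sum())

a = np.random.default_rng(0).integers(0, 256, (100, 100))  # toy "image"
b = a.copy()
score = histogram_intersection(gray_histogram(a), gray_histogram(b))
```

Note that a plain histogram is not enough for the "player inside a football field" sub-image matching in the EDIT; that is where keypoint matching (SIFT/SURF) earns its keep.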

Image Compression Algorithm - Breaking an Image Into Squares By Color

I'm trying to develop a mobile application, and I'm wondering about the easiest way to convert an image into a text file, and then be able to recreate the image later in memory from said text. The image(s) in question will contain no more than 16 or so colors, so it should work out fine.
Basically, brute-forcing this solution would require saving each individual pixel's color data to a file. However, this would result in a HUGE file. I know there's a better way - for example, if there's a huge portion of the image that consists of the same color, breaking up the area into smaller squares and rectangles and saving their coordinates and sizes to the file.
Here's an example. The image is supposed to be just black/white. The big color boxes represent theoretical 'data points' in the outputted text file. These boxes would really state their origin, size, and what color they should be.
E.g., top box has an origin of 0,0, a size of 359,48, and it represents the color black.
Saved in a text file, the data would be 0,0,359,48,0.
What kind of algorithm would this be?
NOTE: The SDK that I am using cannot return a pixel's color from an X,Y coordinate. However, I can load external information into the program from a text file and manipulate it that way. This data that I need to export to a text file will be from a different utility that will have the capability to get a pixel's color from X,Y coordinates.
EDIT: Added a picture
EDIT2: Added constraints
Could you elaborate on why you want to save an image (or its parts) as plain text? Can't you use a binary representation instead? Also, if images typically have lots of contiguous runs of pixels of the same color, you may want to use so-called run-length encoding (RLE). Alternatively, one of the Lempel-Ziv family of compression algorithms could be used (LZ77, LZ78, LZW).
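A minimal sketch of RLE over one scanline of color indices (Python, purely for illustration):

```python
def rle_encode(values):
    # Collapse runs of equal values into (value, run_length) pairs.
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [(v, n) for v, n in runs]

def rle_decode(runs):
    # Expand the (value, run_length) pairs back into the original list.
    return [v for v, n in runs for _ in range(n)]

row = [0, 0, 0, 1, 1, 0]        # e.g. color indices of one scanline
encoded = rle_encode(row)       # [(0, 3), (1, 2), (0, 1)]
assert rle_decode(encoded) == row
```

For a 16-color image the pairs are tiny, and writing them out as comma-separated text would match the question's file format idea.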
Encode the image in a compressed format (e.g. JPEG, PNG, GIF, etc.) and then save it as a .txt file or whatever. To recreate the image, just read the file into your program using whatever library function suits your particular needs.
If it's necessary that the .txt file have some textual meaning, then you may be in some trouble.
In computer science there are spatial-index algorithms that recursively subdivide a plane into 4 tiles. If the cells are split uniformly, it looks like a quadtree. If you want to subdivide a plane into patterns of colors, you can use this tiling idea and dynamically change the size of the cells. A good starting point is the Z-order curve or the Hilbert curve.
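That recursive subdivision can be sketched as follows (assuming a square image with a power-of-two side; each emitted tuple is close to the question's x, y, size, color format, with one size value because cells are square):

```python
import numpy as np

def quadtree_boxes(img, x=0, y=0, size=None):
    # Recursively split a square region until a cell is a single
    # color, then emit (x, y, size, color) for that cell.
    if size is None:
        size = img.shape[0]
    cell = img[y:y + size, x:x + size]
    if (cell == cell[0, 0]).all():
        return [(x, y, size, int(cell[0, 0]))]
    half = size // 2
    boxes = []
    for dy in (0, half):
        for dx in (0, half):
            boxes += quadtree_boxes(img, x + dx, y + dy, half)
    return boxes

img = np.zeros((4, 4), dtype=int)
img[0, 0] = 1                     # one odd pixel forces subdivision
boxes = quadtree_boxes(img)       # 4 single-pixel cells + 3 uniform quads
```

Large uniform regions collapse into single tuples, which is exactly the savings the question is after for mostly-flat 16-color images.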
