reading pixel values in 16-bit grayscale PNG in delphi - delphi-xe2

I have been trying to read pixel values from 16-bit grayscale images (.PNG), but I am unable to extract the 16-bit values. They always come out as 8-bit values.
The image loads nicely with this code:
PngImage := TPngImage.Create;
PngImage.LoadFromFile(ImageFile);
and
PngImage.Header.ColorType returns: COLOR_GRAYSCALE
PngImage.Header.BitDepth returns: 16
I then declare the variables:
PngRowData, ExtraPngRowData: pWordArray;
and read them using:
PngRowData := PngImage.Scanline[y];
ExtraPngRowData := PngImage.ExtraScanline[y];
where y is a line number in the image.
The problem is that for all values of x PngRowData[x] is identical to ExtraPngRowData[x], i.e. I do not get the high and the low byte of a 16-bit value.
The documentation of ExtraScanline in the Embarcadero help file is lacking; it only states that this function is used to read the second byte of
the 16-bit pixels and that Scanline is used to read the first byte.
Does anyone have a clue? Just to check the image used, I can load it to ImageJ and in that program the pixels are read nicely as 16-bit integers.
Claus
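For reference, this is the result the two scanlines are expected to produce once combined. It is a minimal sketch in Python rather than Delphi, and it assumes (as the help file's wording suggests) that Scanline holds the first, most significant byte of every 16-bit sample and ExtraScanline the second, least significant byte, with one byte per pixel in each row. PNG itself stores 16-bit samples most significant byte first. The name combine_rows is hypothetical.

```python
def combine_rows(high_bytes, low_bytes):
    """Merge per-pixel high and low bytes into 16-bit grayscale values."""
    if len(high_bytes) != len(low_bytes):
        raise ValueError("both rows must have one byte per pixel")
    # The high byte supplies bits 15..8, the low byte bits 7..0.
    return [(hi << 8) | lo for hi, lo in zip(high_bytes, low_bytes)]


# Two pixels: 0x1234 and 0xFF00, split into their high and low bytes.
scanline = bytes([0x12, 0xFF])        # assumed content of Scanline[y]
extra_scanline = bytes([0x34, 0x00])  # assumed content of ExtraScanline[y]
pixels = combine_rows(scanline, extra_scanline)
```

If that assumption holds, indexing the rows as word arrays would read the bytes of two neighbouring pixels at once, so a byte-array pointer type may be the better fit.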

Related

JPG sizes not as expected

I have read several posts like this one claiming that the size of a JPG texture can be calculated as
If each pixel contains 32 bits of information, then
307,200 * 32 = 9,830,400 bits of information
Divide by the 8 bits to become a byte value
9,830,400 / 8 = 1,228,800 bytes (or 1.17 MB)
which totally makes sense since each pixel is represented by a color value. Here comes the weird part:
I have these two JPG files
First JPG
which has the dimensions 242x198 and uses 24-bit color values.
Second JPG
which has the dimensions 3840x2400 and uses 24-bit color values.
Then I tried to calculate the sizes using the technique above and concluded that
The first JPG must have the size 242*198*24 = 1149984 bits = 1149984/8/1000 = 143.7 kB, yet the actual file size is 47.6 kB. So the calculation apparently gives a number above the actual size. Why?
The second JPG must have the size 3840*2400*24 = 221184000 bits = 221184000/8/1000000 = 27.6 MB, yet the actual file size is 7.33 MB. So the calculation again gives a number above the actual size. Why?
I drew the first JPG myself and made sure to export it without compression (JPG 100).
The whole point of JPEGs is that they are compressed to take up less space. Even at quality 100 a JPEG is still compressed, so the file is smaller than the raw pixel data. If you resave your file as a BMP you'll get the sizes you expect (plus a bit extra for headers and alignment).
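The arithmetic can be checked with a short sketch. The function name is hypothetical, and the reported file sizes are taken from the question and assumed to be decimal kB and MB.

```python
def uncompressed_size_bytes(width, height, bits_per_pixel):
    """Size of the raw pixel data, ignoring headers and row padding."""
    return width * height * bits_per_pixel // 8


small_raw = uncompressed_size_bytes(242, 198, 24)    # 143748 bytes
large_raw = uncompressed_size_bytes(3840, 2400, 24)  # 27648000 bytes

# File sizes reported in the question (assumed decimal units).
small_ratio = small_raw / 47_600
large_ratio = large_raw / 7_330_000

print(f"small: {small_raw} raw bytes, about {small_ratio:.1f}x the JPG size")
print(f"large: {large_raw} raw bytes, about {large_ratio:.1f}x the JPG size")
```

Both files come out a little over three times smaller than the raw data, which is the compression the formula does not account for.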

Octave 4.0 `imwrite` png: `BitDepth` 4?

Matlab's imwrite allows me to specify the paired arguments 'BitDepth',4 when writing a 2D uint16 array to a file '*.png'. My Octave's imwrite doesn't accept the paired arguments 'BitDepth',4. I can exercise some limited control of bit depth, however, if I scale the data for uint8 and save it to a 2D uint8 array; the '*.png' from imwrite is then just over half the size of the file for a uint16 array. I got the idea to do this by looking at imread, for which the bit depth of the source image file determines the uint type of the destination variable. Assuming that the uint type of the source 2D image array similarly determines the bit depth of the imwrite destination file, uint8 yields a bit depth of 8. I found, however, that a bit depth of 2 is often enough for 100 dpi grayscale scans of handwritten notes. Is there an easy way to have such arbitrary bit depth control for imwrite?
Aside: Regarding the reference to uint16 above, I didn't just make that up. It's the default from a conversion from colour RGB. From a web search, I found a conversion method for my old Octave 4.0 (no rgb2gray):
im=imread('rgb.jpeg');
[imInd,Ind]=rgb2ind(im);
imGray16=ind2gray(imInd,Ind); imwrite(imGray16,'gray16b.png');
imGray8=uint8(imGray16/256); imwrite(imGray8,'gray8b.png');
I am using the Octave installation that is part of Cygwin. However, the laptop I use has limited user rights, and upgrading Octave requires phenomenal amounts of time.
Answering your question, you can't set BitDepth when calling imwrite. The function will write an image with the data type of the variable (provided that the image file type supports it).
If you really need arbitrary bit depth control when writing the file, you would need to interact with libpng directly, that is, write your own oct function.
However, there are a few things to note about your comments:
The uint8 file coming out at more than half the size of the uint16 file is, I'm guessing, because the image is compressed.
rgb2ind does not convert to uint16 by default. It will convert to uint8 or uint16 depending on the number of unique colours in your image.
The function rgb2gray is part of the image package. Load that package if you want that function.
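To make concrete what a 2-bit file would hold, here is a minimal sketch in Python (not an Octave solution). It reduces 16-bit samples to 2 bits by keeping the top bits, then packs four samples per byte with the leftmost pixel in the high-order bits, which is how PNG lays out sub-byte samples. Both function names are hypothetical, and writing a real PNG would still require libpng or an equivalent encoder, as noted above.

```python
def reduce_bit_depth(sample, from_bits, to_bits):
    """Keep only the top `to_bits` bits of a `from_bits`-bit sample."""
    return sample >> (from_bits - to_bits)


def pack_2bit_row(samples):
    """Pack 2-bit samples four to a byte, leftmost pixel in the high bits."""
    packed = bytearray()
    for i in range(0, len(samples), 4):
        group = list(samples[i:i + 4])
        group += [0] * (4 - len(group))  # pad the final byte with zero bits
        byte = 0
        for sample in group:
            byte = (byte << 2) | (sample & 0b11)
        packed.append(byte)
    return bytes(packed)


row16 = [0, 16384, 32768, 65535, 65535]
row2 = [reduce_bit_depth(s, 16, 2) for s in row16]
packed = pack_2bit_row(row2)
```

Five 16-bit pixels (10 bytes) shrink to 2 bytes before any compression is applied.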

How an image get converted by base64 algorithm?

After reading in https://en.wikipedia.org/wiki/Base64 about how the word Man gets converted into TWFu by the base64 algorithm, I was wondering how an image gets converted by the same algorithm; after all, this conversion takes bytes, divides them into groups of 6 and then looks up their ASCII value.
My question is: how does an image become a base64-encoded string?
I want an answer that describes the flow from when we save the image on our computer until it becomes a base64 string.
Terms that I hope will be explained in the answer are:
pixels/dpi/ppi/1bit/8bit/24bit/Mime.
Base64 isn't an image encoder, it's a byte encoder, important distinction. Whatever you pass it, whether it be a picture, an mp3, or the string "ilikepie" - it takes those bytes and generates a text representation of them. It has no understanding of anything in your pixels/dpi/ppi/1bit/8bit/24bit/Mime list, that would be the business of the software that reads those original bytes.
Per request I want an answer that describes the flow from when we save the image in our computer until it's become 64base string.
To get to a base64 representation:
Open paint and draw a smiley face.
Save that smiley face as smile.png
Paint uses its png encoder to convert the bitmap of pixels into a stream of bytes that it compresses and appends headers to so that when it sees those bytes again it knows how to display them.
Image is written to disk as series of bytes.
You run a base64 encoder on smile.png.
base64 reads the bytes from disk at the location smile.png refers to and converts their representation and displays the result.
To display that base64 representation in a browser:
browser is handed a resource encoded with base64, which looks something ...
Browser takes the image/png part and knows that the data following it will be the bytes of a png image.
It then sees base64, and knows that the next blob will need to be base64 decoded, before it can then be decoded by its png decoder.
It converts the base64 string to bytes.
It passes those bytes to its png decoder.
It gets a bitmap graphic that it can then display.
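Both halves of that flow can be sketched with Python's standard base64 module. The eight-byte PNG file signature stands in for the contents of smile.png, and the data:image/png;base64, prefix is an assumption about the kind of resource the browser steps refer to.

```python
import base64

# The eight-byte PNG file signature stands in for the bytes of smile.png.
png_bytes = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])

# Encoder side: bytes in, ASCII text out. No knowledge of pixels is involved.
encoded = base64.b64encode(png_bytes).decode("ascii")
data_uri = "data:image/png;base64," + encoded

# Browser side: split off the header, then turn the text back into bytes.
header, payload = data_uri.split(",", 1)
decoded = base64.b64decode(payload)

print(data_uri)
print(decoded == png_bytes)
```

The decoded bytes are identical to the input, and only at that point would a PNG decoder get involved.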
Every image consists of many pixels; the number of pixels is determined by the image resolution.
Image resolution is the number of pixels in a row and the number of rows.
For example, an image with a resolution of 800x600 has 800 pixels in a row and 600 rows.
Every pixel has a bit depth: the number of bits that represent the pixel.
For example, with a bit depth of 1 every pixel is represented by one bit and has only 2 options (0 or 1, black or white).
An image can be saved in many different formats. The most common are bitmap, JPEG and GIF. Whatever format is used, an image is always displayed on a computer screen as a bitmap (uncompressed format). Every format is stored differently.
JPEG is a 24-bit (bit depth) format. When you store the image it is kept in compressed form and you lose some of the image data.
GIF is an up-to-8-bit (bit depth) format. A GIF image can be optimized by removing some of the colours in its palette. It is a lossless format.
Just throwing this in for the Bytes clarification :
"The word MAN gets converted into TWFu by using the base64 algorithm, I was wondering how an image gets converted by the same
algorithm, after all this conversion takes Bytes, divide them into
groups of 6 and then looking for their ASCII value."
"My question is, How an image becomes base64 string?"
correction : MAN becomes TUFO. It is actually Man that becomes TWFu as you've shown above.
Now onto the clarification...
The byte values (binary numbers) of the image can be written in hex notation, which makes it possible to handle those same bytes as a string. Hex digits run from 0 to 9 and then from A = 10 up to F = 15, giving 16 possible values, so each digit stands for 4 bits.
Hex is also called Base16.
Converting bytes to Base64 is simply writing the same bits in a different base: Base16 uses one character per 4 bits, Base64 uses one character per 6 bits.
So 3 bytes (24 bits, or 6 hex digits) always become 4 Base64 characters.
For example :
The beginning 3 bytes of JPEG format are always ff d8 ff
The same as Base64 is : /9j/ ...(see examples at these link1 and link2)
You can test by :
Open any .jpg image in a downloaded free Hex Editor. You can also try online Hex editors but most won't Copy to clipboard. This online HEX viewer will allow Select/Copy but you have to manually remove all those extra minus - in copied hex text (ie: use the Find & Replace option in some text editor), or skip/avoid selecting them before any copy/paste.
Go to this Data Converter and re-type or copy/paste as many bytes (from starting FF up to any X amount) into the [HEX] box and press decode below it. This will show you those bytes as Base64 and even tells you the decimal value of each byte.
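The same check can be done without any website. The following is a minimal hand-rolled sketch of the regrouping (every 3 bytes become four 6-bit groups, each looked up in the standard Base64 alphabet). The function name is hypothetical, and real code would normally use a library encoder instead.

```python
import string

# Standard Base64 alphabet: A-Z, a-z, 0-9, then + and /.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def encode_base64_by_hand(data):
    """Encode bytes as Base64 by regrouping 24 bits into four 6-bit indices."""
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        pad = 3 - len(chunk)
        # Treat the (zero-padded) chunk as one 24-bit number.
        number = int.from_bytes(chunk + b"\x00" * pad, "big")
        chars = [ALPHABET[(number >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        if pad:
            chars[-pad:] = "=" * pad  # mark the missing input bytes
        out.extend(chars)
    return "".join(out)


print(encode_base64_by_hand(b"MAN"))           # TUFO
print(encode_base64_by_hand(b"Man"))           # TWFu
print(encode_base64_by_hand(b"\xff\xd8\xff"))  # /9j/
```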
When you upload any file in an HTML form using <input type="file">, it is transferred to the server in exactly the same form as it is stored on your computer or device. The browser doesn't check what the file format is and treats it as just a block of bytes. For transfer details see How does HTTP file upload work?

Extra data within image (PPM/PAM/PNM)

Is it possible to store extra data in pixels of a binary PNM file in such a way that it can still be read as an image (hopefully by any decoder, but specifically by ffmpeg)?
I have a simulation that saves its data as PPM currently and I'd like a way to record more than three values per pixel in the file, and yet still be able to use it as an image (obviously only the first three/four values will actually affect the image).
In principle I think the TUPLTYPE of PAM should allow me to do this, but I don't know how to make something that's also a readable image from that.
There are two tricks which together can get up to 5 additional bytes per pixel in a PAM file.
First trick:
You can try storing an additional byte of information in the alpha channel and then choose to ignore that information in the decoder. The alpha channel in PAM is enabled by adding _ALPHA to the TUPLTYPE value (with DEPTH raised from 3 to 4 to match), so instead of TUPLTYPE RGB you have TUPLTYPE RGB_ALPHA.
Second trick:
You can set MAXVAL in PAM (or equivalent field in PPM and others) to 65535 instead of 255, which means that every pixel will be described by three 16-bit values instead of three 8-bit ones. Now, for these 16-bit values the 8 least significant bits can be used to store information as they do not affect visual properties of image when shown on typical computer screen.
First + second trick:
This gives you additional 3 x 8 = 24 bits for RGB planes and 16 bits in alpha channel. Which means: 5 bytes.
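A minimal sketch of both tricks combined, assuming the standard PAM header fields (WIDTH, HEIGHT, DEPTH, MAXVAL, TUPLTYPE, ENDHDR) and 16-bit samples written most significant byte first. The helper names are hypothetical, and whether ffmpeg or another decoder renders the result as desired has not been verified here.

```python
import struct


def pack_sample(visible, extra):
    """Put the visible 8-bit value in the high byte, the payload in the low byte."""
    return (visible << 8) | extra


def unpack_sample(sample):
    """Split a 16-bit sample back into (visible, extra)."""
    return sample >> 8, sample & 0xFF


def build_pam(width, height, pixels):
    """Build a 16-bit RGB_ALPHA PAM file from (r, g, b, a) sample tuples."""
    header = (
        "P7\n"
        f"WIDTH {width}\n"
        f"HEIGHT {height}\n"
        "DEPTH 4\n"
        "MAXVAL 65535\n"
        "TUPLTYPE RGB_ALPHA\n"
        "ENDHDR\n"
    ).encode("ascii")
    # Each tuple becomes four big-endian unsigned 16-bit samples.
    body = b"".join(struct.pack(">4H", *pixel) for pixel in pixels)
    return header + body


# One pixel: visible colour (200, 100, 50) plus payload bytes 1, 2, 3 in the
# low bytes, and a full 16-bit payload 0xABCD in the alpha sample.
pixel = (pack_sample(200, 1), pack_sample(100, 2), pack_sample(50, 3), 0xABCD)
pam = build_pam(1, 1, [pixel])
```

A viewer that honours alpha would treat the alpha payload as transparency, which is why the answer suggests ignoring that channel when decoding.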
I've not used PNM file format, but I've done this trick with a .bmp file.
Hijack the least significant bit of the image data and stuff it with whatever binary data you want. Nobody will see the difference between a pixel value of a 0 or 1 (00000000 or 00000001), or the the difference between a 254 or 255 (1111110 or 11111111). For every 8 bytes of image data a byte of extra data can be embedded (6 bytes if you use a limited character set). The file viewing software won't know any difference. Any software which could open the file before the encoding, would be able to read it after.
If you want the data to be more covert/hidden, the bits can be stuffed into the image data with a shuffle routine, where the first bit might be location 50, the second in 123, the third in 32... and after locations 0-255 (first 256 bytes if image data) are used (first 32 bytes of extra data), start the shuffle again.
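The unshuffled version of this trick can be sketched as follows. It is a minimal illustration on a plain byte string standing in for 8-bit image samples. The function names are hypothetical, and the shuffle routine is left out.

```python
def embed_lsb(image_bytes, payload):
    """Hide payload bits in the least significant bit of successive image bytes."""
    # Payload bits, most significant bit of each byte first.
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    if len(bits) > len(image_bytes):
        raise ValueError("payload needs 8 image bytes per payload byte")
    out = bytearray(image_bytes)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit  # clear the low bit, then set it
    return bytes(out)


def extract_lsb(image_bytes, payload_len):
    """Rebuild payload_len bytes from the low bits of the image bytes."""
    out = bytearray()
    for i in range(payload_len):
        byte = 0
        for sample in image_bytes[i * 8:(i + 1) * 8]:
            byte = (byte << 1) | (sample & 1)
        out.append(byte)
    return bytes(out)


cover = bytes(range(100, 132))   # 32 stand-in image bytes
stego = embed_lsb(cover, b"Hi!!")  # 4 payload bytes need exactly 32 image bytes
recovered = extract_lsb(stego, 4)
```

Every sample changes by at most 1, which is the difference the answer describes as invisible.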

Printing the pixel values of YUV image

When I convert an image from RGB to YUV, I can't get a sensible value of Y when I try printing the pixels. I get a value of 377, and when I cast it to an integer I get 255, which I presume is incorrect. Is there a better, or rather correct, way to print the values of YUV image pixels?
Actually I am printing the values (int)src.at<Vec3b>(j,i).val[0] = 255
and src.at<Vec3b>(j,i).val[0] = 377
Also on that note, Y is a combination of R, G and B calculated with some constants, according to the notes. I am actually confused as to how to get the value of Y.
This is a problem of OpenCV. OpenCV does not gracefully handle (scale) YUV or HSV color spaces for uchar format. With Vec3b you have effectively 3-channel uchar, and that ranges [0;255].
The solution is to use another matrix type. With cv::Mat3f you have a 3-channel floating point image. Then the values will be correctly converted by cvtColor function. You can get a Mat3f from a Mat3b by assignment.
Another solution that uses less memory may be Mat3s and Mat3w types, if supported by cvtColor.
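On the side question of how Y is obtained from RGB: a minimal sketch in Python, assuming the commonly used BT.601 weights (0.299, 0.587, 0.114). The exact constants and rounding a given library applies may differ, and the function name is hypothetical.

```python
def luma_bt601(r, g, b):
    """Weighted sum of 8-bit R, G, B, rounded and clamped to 0..255."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    return max(0, min(255, round(y)))


print(luma_bt601(255, 255, 255))  # white
print(luma_bt601(255, 0, 0))      # pure red
print(luma_bt601(0, 0, 0))        # black
```

With these weights a Y of 255 is a legitimate result for a white pixel, so 255 on its own does not prove the conversion failed.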

Resources