How does an image get converted by the base64 algorithm? - windows

After reading on https://en.wikipedia.org/wiki/Base64 about how the word Man gets converted into TWFu by the base64 algorithm, I was wondering how an image gets converted by the same algorithm; after all, this conversion takes bytes, divides them into groups of 6 bits, and then maps each group to a character.
My question is: how does an image become a base64-encoded string?
I want an answer that describes the flow from when we save the image on our computer until it becomes a base64 string.
Terms that I hope will be explained in the answer are:
pixels/dpi/ppi/1bit/8bit/24bit/Mime.

Base64 isn't an image encoder; it's a byte encoder, and that's an important distinction. Whatever you pass it, whether it be a picture, an mp3, or the string "ilikepie", it takes those bytes and generates a text representation of them. It has no understanding of anything in your pixels/dpi/ppi/1bit/8bit/24bit/Mime list; that would be the business of the software that reads those original bytes.
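A quick way to see this for yourself, as a minimal sketch in Python using the standard base64 module:

    import base64

    # The encoder only ever sees bytes, whether they came from a string,
    # an mp3, or a picture.
    print(base64.b64encode(b"Man").decode())       # TWFu
    print(base64.b64encode(b"ilikepie").decode())  # aWxpa2VwaWU=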
Per the request: "I want an answer that describes the flow from when we save the image on our computer until it becomes a base64 string."
To get to a base64 representation:
Open paint and draw a smiley face.
Save that smiley face as smile.png
Paint uses its PNG encoder to convert the bitmap of pixels into a stream of bytes, which it compresses and appends headers to, so that when it sees those bytes again it knows how to display them.
The image is written to disk as a series of bytes.
You run a base64 encoder on smile.png.
The encoder reads the bytes from disk at the location smile.png refers to, converts their representation, and displays the result.
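The final steps sketched in Python (smile.png is the hypothetical file from the Paint step; any file works, since only its bytes matter):

    import base64

    # Read the bytes the PNG encoder wrote to disk and hand them,
    # unchanged, to the base64 encoder.
    with open("smile.png", "rb") as f:
        raw = f.read()
    print(base64.b64encode(raw).decode()[:40], "...")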
To display that base64 representation in a browser:
The browser is handed a resource encoded with base64, which looks something like data:image/png;base64,blahblah...
The browser takes the image/png part and knows that the data following it will be the bytes of a PNG image.
It then sees base64, and knows that the next blob will need to be base64-decoded before it can be decoded by its PNG decoder.
It converts the base64 string to bytes.
It passes those bytes to its png decoder.
It gets a bitmap graphic that it can then display.
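The decode side, sketched in Python (the payload here is just the 8-byte PNG signature, base64-encoded, standing in for a full image):

    import base64

    data_uri = "data:image/png;base64,iVBORw0KGgo="
    header, payload = data_uri.split(",", 1)   # "data:image/png;base64"
    png_bytes = base64.b64decode(payload)
    print(png_bytes)  # b'\x89PNG\r\n\x1a\n' -- the PNG decoder takes it from here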

Every image consists of many pixels; the number of pixels is determined by the image resolution.
Image resolution - the number of pixels in a row and the number of rows.
For example, an image with a resolution of 800x600 has 800 pixels in a row and 600 rows.
Every pixel has a bit depth - the number of bits that represent the pixel.
For example, with a bit depth of 1, every pixel is represented by one bit and has only 2 options (0 or 1 - black or white).
An image can be saved in many different formats. The most common are bitmap, JPEG, and GIF. Whatever format is used, an image is always displayed on a computer screen as a bitmap (an uncompressed format). Every format is saved differently.
JPEG - a 24-bit (bit depth) format. The image is stored in compressed form, and you lose some of the image data.
GIF - up to 8-bit (bit depth) format. A GIF image can be optimized by removing some of the colours in its palette. It is a lossless format.
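To make those numbers concrete, a little arithmetic in Python (assuming an uncompressed bitmap):

    width, height = 800, 600
    bits_per_pixel = 24
    raw_bytes = width * height * bits_per_pixel // 8
    print(raw_bytes)                 # 1440000 bytes of raw pixel data
    # Base64 then emits 4 characters per 3 input bytes (plus padding).
    print((raw_bytes + 2) // 3 * 4)  # 1920000 base64 characters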

Just throwing this in for the bytes clarification:
"The word MAN gets converted into TWFu by using the base64 algorithm, I was wondering how an image gets converted by the same algorithm, after all this conversion takes bytes, divides them into groups of 6 and then looks up their ASCII value."
"My question is, how does an image become a base64 string?"
Correction: MAN becomes TUFO. It is actually Man that becomes TWFu, as you've shown above.
Now onto the clarification...
The byte values (binary numbers) of the image can be represented in hex notation, which makes it possible to process those same bytes as a string. Hex ranges from 0 up to F: 0 to 9, then A = 10 up to F = 15, giving 16 possible values.
Hex is also called Base16.
Converting bytes to Base64 is simply regrouping the same bits: hex splits the byte stream into 4-bit groups (one hex digit each), while Base64 splits it into 6-bit groups and maps each group to one of 64 characters.
For example :
The beginning 3 bytes of JPEG format are always ff d8 ff
The same bytes as Base64 are: /9j/ ...(see examples at link1 and link2)
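You can verify this with a couple of lines of Python (same bits, just regrouped: 4 bits per hex digit versus 6 bits per base64 character):

    import base64

    magic = bytes.fromhex("ffd8ff")          # the JPEG magic bytes
    print(base64.b64encode(magic).decode())  # /9j/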
You can test by :
Open any .jpg image in a downloaded free hex editor. You can also try online hex editors, but most won't copy to the clipboard. This online HEX viewer will allow select/copy, but you have to manually remove all the extra minus signs - in the copied hex text (i.e., use the Find & Replace option in some text editor), or skip/avoid selecting them before any copy/paste.
Go to this data converter and re-type or copy/paste as many bytes (from the starting FF up to any amount) into the [HEX] box and press decode below it. This will show you those bytes as Base64 and even tell you the decimal value of each byte.

When you upload any file in an HTML form using <input type="file">, it is transferred to the server in exactly the same form as it is stored on your computer or device. The browser doesn't check what the file format is and treats it as just a block of bytes. For transfer details, see How does HTTP file upload work?
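For illustration, a minimal sketch with the Python requests library (the URL is hypothetical): the file's bytes go into the multipart body untouched, no base64 involved.

    import requests

    with open("smile.png", "rb") as f:
        r = requests.post("https://example.com/upload", files={"file": f})
    print(r.status_code)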

Related

What exactly is the actual image data inside the IDAT chunk in a PNG file?

The PNG specification here says a PNG file contains a chunk called IDAT which contains the actual image data.
My question is: when I modify (using a hex editor) the LSB of any 1 byte in IDAT, the whole image goes bad (colors change randomly, or the image becomes transparent with some outline remaining, or goes completely blank).
How can just changing 1 byte cause this?
It says exactly what's in there in the specification you linked. Did you consider reading the specification?
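For what it's worth, the short version from the spec is that IDAT holds the filtered scanlines compressed as a zlib (DEFLATE) stream, and compressed streams have essentially no redundancy, so damage anywhere corrupts everything decoded after that point. A sketch of the effect with Python's zlib module (not PNG itself, but the same compression):

    import zlib

    scanlines = bytes(range(256)) * 16
    damaged = bytearray(zlib.compress(scanlines))
    damaged[len(damaged) // 2] ^= 0x01   # flip one LSB mid-stream
    try:
        zlib.decompress(bytes(damaged))
        print("stream survived (very unlikely)")
    except zlib.error as e:
        print("decode failed:", e)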

Converted byte array to image and back to byte array, the values changed

I tried to implement steganography with the following steps :
1. Converted image to buffered image
2. Converted buffered image to Bytes array
3. Made modifications in the byte array
4. Converted byte array back to buffered image
5. Saved it as a jpg file
The problem arose when I read the saved file again, converted it to a byte array, and found that the byte array was different from what I obtained after step 3 (although the differences were small, e.g., 6 converted to 7, 9 to 8, and so on).
I really have no idea why this happened.
1. If you save as a JPEG, the RGB data gets converted to YCbCr. Those two color spaces have different gamuts, so values get clamped.
2. JPEG data may be subsampled, causing data to be changed. You can avoid these changes by not subsampling.
3. The JPEG DCT may introduce small errors (limited to +/-1 if implemented correctly).
4. Quantization will make rather large changes to the data. You can avoid changes at this step by having all 1s in your quantization tables.
No matter what you do, #1 and #3 can introduce changes in the JPEG compression process.
JPG is a lossy image format, so you can't expect it to hold the data exactly after it is saved. It is especially unsuited for steganography, as it will destroy the small details required for this use, even when using the highest quality setting.
Solution is to use a lossless format, like PNG.
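A quick round-trip demonstration (a sketch using the Pillow library; exact JPEG output depends on the encoder and quality setting):

    from PIL import Image

    img = Image.new("RGB", (16, 16), (200, 100, 50))
    img.putpixel((0, 0), (201, 100, 50))   # hide one bit in the red LSB

    img.save("out.png")
    img.save("out.jpg", quality=95)

    print(Image.open("out.png").getpixel((0, 0)))  # (201, 100, 50) -- intact
    print(Image.open("out.jpg").getpixel((0, 0)))  # almost certainly altered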
A BufferedImage may already be a byte array. If, when you create your BufferedImage, you use the encoding TYPE_BYTE_GRAY, TYPE_3BYTE_BGR or TYPE_4BYTE_ABGR, then your BufferedImage is already backed by a byte array. To access the byte array, you do: byte[] buffer = ((DataBufferByte) myImage.getRaster().getDataBuffer()).getData();
And when you write an image as a JPEG, you compress your image lossily. So the information you save is altered and cannot be retrieved as before. You should use PNG/TIFF/BMP instead, PNG being the most common.

Get width and height from jpeg without 0xFF 0xC0

I'm trying to get the file dimensions (width and height) from a jpeg (in this case an Instagram picture)
https://scontent.cdninstagram.com/t51.2885-15/s640x640/sh0.08/e35/11264864_1701024620182742_1335691074_n.jpg
As I understand it, the width and height are defined after the 0xFF 0xC0 marker; however, I cannot find this marker in this picture. Has it been stripped, or is there an alternative marker I should check for?
The JPEG Start-Of-Frame (SOF) marker has 4 possible values:
FFC0 (baseline) - This is the usual mode chosen for photos and encodes fully specified DCT blocks in groupings depending on the color/subsample options chosen.
FFC1 (extended) - This is similar to baseline, but has more than 8 bits per color stimulus.
FFC2 (progressive) - This mode is often found on web pages to allow the image to load progressively as the data is received. Each "scan" of the image progressively defines more coefficients of the DCT blocks until they're fully defined. This effectively provides more and more detail as more scans are decoded.
FFC3 (lossless) - This mode uses a simple Huffman encoding to losslessly encode the image. The only place I've seen this used is on 16-bit grayscale DICOM medical images.
If you're scanning through the bytes looking for an FFCx pattern, be aware that you may encounter one in the embedded thumbnail image (inside an FFE1 Exif marker). To properly find the SOFx of the main image, you'll need to walk the chain of JPEG markers. The 2-byte length (big-endian) follows the 2-byte marker. One last pitfall to avoid is that some JPEG encoders stick extra FF values in between the valid markers. If you encounter a FFFF where a marker should be, just increment the pointer by 1 byte and try again until you hit a valid marker.
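A minimal marker-walking sketch in Python along those lines (not production code; it assumes a well-formed file and relies on SOF always preceding SOS):

    import struct

    def jpeg_dimensions(path):
        with open(path, "rb") as f:
            data = f.read()
        if data[:2] != b"\xff\xd8":
            raise ValueError("not a JPEG (no SOI)")
        i = 2
        while i < len(data) - 1:
            # Some encoders pad with extra 0xFF bytes between markers.
            while data[i] == 0xFF and data[i + 1] == 0xFF:
                i += 1
            if data[i] != 0xFF:
                raise ValueError("lost sync while walking markers")
            marker = data[i + 1]
            i += 2
            if marker in (0xC0, 0xC1, 0xC2, 0xC3):  # SOF0..SOF3 only
                # Frame header: length(2) precision(1) height(2) width(2)
                height, width = struct.unpack(">HH", data[i + 3:i + 7])
                return width, height
            # Every other segment here carries a 2-byte big-endian length
            # that includes itself; skipping by it also skips any thumbnail
            # embedded inside an APP1/Exif segment.
            (seglen,) = struct.unpack(">H", data[i:i + 2])
            i += seglen
        raise ValueError("no SOF marker found")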
You can get it pretty simply at the command-line with ImageMagick which is installed on most Linux distros and is available for OSX and Windows.
identify -ping https://scontent.cdninstagram.com/t51.2885-15/s640x640/sh0.08/e35/11264864_1701024620182742_1335691074_n.jpg
Output
https://scontent.cdninstagram.com/t51....1074_n.jpg JPEG 640x640 640x640+0+0 8-bit sRGB 162KB 0.000u 0:00.000

JPEG - Can EOI Marker appear inside image data after SOS?

I understand that a JPEG file starts with 0xFFD8 (SOI), followed by a number of 0xFFEn segments holding metadata, then a number of segments holding the compression-related data (DQT, DHT, etc.), of which the final one is 0xFFDA (SOS); then comes the actual image data, which ends with 0xFFD9 (EOI). Each of those segments states its length in the two bytes following the JPEG marker, so it is a trivial exercise to calculate the end of a segment/start of the next segment, and the start of the image data can be calculated from the length of the SOS segment.
Up to that point, the appearance of 0xFFD9 (EOI) is irrelevant [1], because the segments are identified by their length. As far as I can see, however, there is no way of determining the length of the image data other than finding the 0xFFD9 (EOI) marker following the SOS segment. For that to be assured, 0xFFD9 must not appear inside the actual image data itself. Is there something built into the JPEG algorithm to ensure that, or am I missing something here?
[1] A second 0xFFD8 and 0xFFD9 can appear if a thumbnail is included in the image, but that is taken care of by the length of the containing segment - usually a 0xFFE1 (APP1) segment from what I have seen. In images I have checked so far, the start and size of the thumbnail image data are still given in the 0x0201 (JPEGInterchangeFormat - offset to JPEG SOI) and 0x0202 (JPEGInterchangeFormatLength - bytes of JPEG data) fields in IFD1, even though these were deprecated in Tech Note #2.
In JPEG, the Compressed value FF is encoded as FF00.
The compressed value FFD9 would be encoded as FF00D9.
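In other words, the entropy-coded data is byte-stuffed, so a bare FFD9 can only be the EOI marker. A toy sketch of the idea in Python (real scan data also contains unstuffed restart markers FFD0-FFD7, ignored here):

    def stuff(scan: bytes) -> bytes:
        return scan.replace(b"\xff", b"\xff\x00")

    def unstuff(encoded: bytes) -> bytes:
        return encoded.replace(b"\xff\x00", b"\xff")

    assert stuff(b"\x12\xff\xd9") == b"\x12\xff\x00\xd9"
    assert unstuff(stuff(b"\x12\xff\xd9")) == b"\x12\xff\xd9"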

Extra data within image (PPM/PAM/PNM)

Is it possible to store extra data in pixels of a binary PNM file in such a way that it can still be read as an image (hopefully by any decoder, but specifically by ffmpeg)?
I have a simulation that saves its data as PPM currently and I'd like a way to record more than three values per pixel in the file, and yet still be able to use it as an image (obviously only the first three/four values will actually affect the image).
In particular, I think the TUPLTYPE of PAM should allow me to do this, but I don't know how to make something that's also a readable image from that.
There are two tricks which together can get you up to 5 additional bytes per pixel in a PAM file.
First trick:
You can store an additional byte of information in the alpha channel and then choose to ignore that information in the decoder. Enabling the alpha channel in PAM is done by appending _ALPHA to the TUPLTYPE argument, so instead of TUPLTYPE RGB you have TUPLTYPE RGB_ALPHA.
Second trick:
You can set MAXVAL in PAM (or the equivalent field in PPM and the other formats) to 65535 instead of 255, which means that every pixel is described by three 16-bit values instead of three 8-bit ones. For these 16-bit values, the 8 least significant bits can be used to store information, as they do not affect the visual properties of the image when shown on a typical computer screen.
First + second trick:
This gives you an additional 3 x 8 = 24 bits in the RGB planes plus the 16 bits of the alpha channel. Which means: 5 bytes.
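A sketch of what such a file could look like when written by hand (Python; the header fields follow the PAM format, the pixel values are arbitrary):

    import struct

    width, height = 2, 2
    header = (f"P7\nWIDTH {width}\nHEIGHT {height}\nDEPTH 4\n"
              f"MAXVAL 65535\nTUPLTYPE RGB_ALPHA\nENDHDR\n").encode()
    with open("out.pam", "wb") as f:
        f.write(header)
        for _ in range(width * height):
            # High bytes carry the visible color; the low byte of each
            # 16-bit sample (and the whole alpha) can hold hidden data.
            f.write(struct.pack(">HHHH", 0xC800, 0x6400, 0x3200, 0xFF00))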
I've not used PNM file format, but I've done this trick with a .bmp file.
Hijack the least significant bit of the image data and stuff it with whatever binary data you want. Nobody will see the difference between a pixel value of 0 or 1 (00000000 or 00000001), or the difference between 254 or 255 (11111110 or 11111111). For every 8 bytes of image data, a byte of extra data can be embedded (only 6 bytes if you use a limited character set). The file viewing software won't know any difference. Any software which could open the file before the encoding would be able to read it after.
If you want the data to be more covert/hidden, the bits can be stuffed into the image data with a shuffle routine, where the first bit might go in location 50, the second in 123, the third in 32... and after locations 0-255 (the first 256 bytes of image data) are used (holding the first 32 bytes of extra data), start the shuffle again.
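A minimal embed/extract pair showing the plain (unshuffled) version of the trick, in Python, operating on raw bytes:

    def embed(carrier: bytearray, secret: bytes) -> None:
        # Pack each secret bit into the LSB of one carrier byte.
        for i, byte in enumerate(secret):
            for bit in range(8):
                j = i * 8 + bit
                carrier[j] = (carrier[j] & 0xFE) | ((byte >> bit) & 1)

    def extract(carrier: bytes, length: int) -> bytes:
        out = bytearray()
        for i in range(length):
            byte = 0
            for bit in range(8):
                byte |= (carrier[i * 8 + bit] & 1) << bit
            out.append(byte)
        return bytes(out)

    pixels = bytearray(range(64))        # stand-in for raw image bytes
    embed(pixels, b"secret!!")
    assert extract(pixels, 8) == b"secret!!"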
