images with invalid jpeg marker - image

I encountered a website where the images consistently throw an 'invalid jpeg marker' error when downloaded. I am wondering whether they are intentionally doing something that causes this error for most users who try to download and use their images.
I want to protect the JPEG resources of my website from unauthorised use. Is it really possible to change something in the JPEG header or metadata so that the images display fine in a browser, but throw an 'invalid jpeg marker' error if someone downloads them for their own use?
(I don't intend to discuss alternative ways of protecting images online or the limitations of it.)

If it can display in a browser, it can display in something else. What is likely happening is that the decoder you are using to view the downloaded images is stricter than your web browser. I presume you are not using your browser to view them after downloading.
You could run the "file" command to make sure the images are actually JPEGs. If so, there are a number of programs available to analyze the structure of a JPEG stream. You could use one to see what's going on with the image. You may have some odd marker ordering, or possibly the JPEG image does not start right at the beginning of the file stream. I've seen some weird JPEGs where there were extra bytes after the APPn marker and before the next JPEG marker. Some decoders ignored the extra bytes; others puked.
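As a rough illustration of the kind of structural analysis described above, here is a minimal Python sketch that walks the header segments of a JPEG stream and reports any stray bytes found where a marker was expected. The function name list_jpeg_segments is made up for this example, and the walk stops at the first SOS marker, so it is not a full validator.

```python
import struct

STANDALONE = {0x01, 0xD9} | set(range(0xD0, 0xD8))  # TEM, EOI, RST0-RST7 carry no length


def list_jpeg_segments(data):
    """Return (offset, marker, length) tuples for the segments up to the first SOS.

    Stray bytes found where a marker was expected are reported with
    marker None and their byte count as the length.
    """
    if data[:2] != b"\xff\xd8":
        raise ValueError("no SOI marker at offset 0; the JPEG may not start at byte 0")
    segments = [(0, 0xD8, 0)]
    pos = 2
    while pos + 1 < len(data):
        if data[pos] != 0xFF:
            # Junk between segments: skip ahead to the next 0xFF byte.
            nxt = data.find(b"\xff", pos)
            if nxt == -1:
                nxt = len(data)
            segments.append((pos, None, nxt - pos))
            pos = nxt
            continue
        if data[pos + 1] == 0xFF:
            pos += 1  # fill byte before the real marker code
            continue
        marker = data[pos + 1]
        if marker in STANDALONE:
            segments.append((pos, marker, 0))
            if marker == 0xD9:
                break
            pos += 2
            continue
        if pos + 4 > len(data):
            raise ValueError("truncated segment header at offset %d" % pos)
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        segments.append((pos, marker, length))
        if marker == 0xDA:
            break  # entropy-coded scan data follows; stop here
        pos += 2 + length
    return segments
```

Running it on a file that a strict decoder rejects and looking for entries whose marker is None shows whether extra bytes sit between segments.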

Related

Does JPEG compression destroy any embedded malicious code inside an image?

I'm trying to figure out the best way to "clean" images that come from an unauthorized source (app visitors) before opening them, similar to WhatsApp.
Scanning each image with antivirus is probably not efficient at large scale, so I came to the assumption that rewriting each incoming image by compressing it as JPEG could result in a clean image without malicious code inside it.
From what I have read so far, JPEG compression should destroy any hidden content and reorder the data structure of the image, which results in a safe image.
What do you think? Am I on the right path to overcome this issue?
There is no code in a JPEG stream. In fact, I don't know of any image format that directs the decoder to execute code.
The worst thing I can think of would be a JPEG stream that, say, embedded malicious code in a thumbnail, COM, or APPn marker. Then another application would look for that image and load the code.
Even this requires something else to get on your system to execute the JPEG "code" and it would be a lot of trouble for something that could be accomplished much easier.
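If the worry is specifically a payload parked in COM or APPn segments, those segments can be dropped without recompressing the image. The following is only a sketch under assumptions: strip_metadata_segments is a hypothetical helper, it keeps APP0 (JFIF) and APP14 (Adobe) by default because decoders may use them to interpret colour, and it copies everything from the first SOS marker onward unchanged.

```python
import struct


def strip_metadata_segments(data, keep=(0xE0, 0xEE)):
    """Return a copy of a JPEG stream without COM and APPn header segments.

    Markers listed in `keep` are preserved. Everything from the first SOS
    marker onward is copied verbatim, including any bytes after EOI.
    """
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream (missing SOI marker)")
    out = bytearray(b"\xff\xd8")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        marker = data[pos + 1]
        if marker == 0xDA:
            out += data[pos:]  # scan data: copy the rest untouched
            return bytes(out)
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        is_metadata = marker == 0xFE or 0xE0 <= marker <= 0xEF
        if not is_metadata or marker in keep:
            out += data[pos:pos + 2 + length]
        pos += 2 + length
    raise ValueError("no SOS marker found")
```

Note the limits of this sketch: segments that appear between the scans of a progressive file, and any data appended after the end of the image, pass through untouched.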

Decode JPEG image stripped from inside a PDF file

I have code that decompresses JPEGs into bitmaps, which works fine for JPEG files; however, when I feed the code a JPEG I have stripped directly from a PDF's XObject, I get errors.
Adobe Reader displays the image fine, so I don't believe it's corrupted. I have read through the JPEG and PDF documentation and don't find any obvious problems.
My question is this: is there anything different between the "JPEG" embedded inside a PDF stream and a normal JPEG? And if so, what is it?
Note: I can manually open the PDF, copy the image, paste it into Paint, and save. When I do this everything works; my problem is that I need this automated.
When my code parses the PDF, strips out the image stream, dumps the binary to a file, and then I try to open this file, it does not work. What am I missing?
My errors seem to be occurring in the Huffman decoding process, the cdt and Huffman tables appear to be read in fine.
Pardon my using the answer section, but I overflowed the comment section.
My questions:
1. What code is failing to decode the JPEG? You say you "have code", but where did that come from? Why do you think that it is reliable?
2. What is the file format of the JPEG stream? JFIF, Adobe, Exif, none specified? Could there be something in the file format that your decoder cannot handle? Does your decoder check for different types of APPn markers?
3. What is the JPEG format? What type of SOF marker? Does this decoder source handle all the normal formats: baseline, extended sequential, progressive? If you have a progressive JPEG and a decoder that only does baseline, you are going to have a problem.
4. How many components does the JPEG stream have? Some Adobe files have 4 components, and decoders may only be able to handle 1 or 3.
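Most of these questions can be answered by reading the header of the extracted stream. Here is a small Python sketch (describe_jpeg is a name invented for this example) that reports the APPn identifiers, the frame type from the SOF marker, and the number of components. It assumes the segments follow each other without stray bytes.

```python
import struct

SOF_NAMES = {
    0xC0: "baseline",
    0xC1: "extended sequential",
    0xC2: "progressive",
    0xC3: "lossless",
}
NOT_SOF = (0xC4, 0xC8, 0xCC)  # DHT, JPG and DAC share the 0xC0-0xCF range


def describe_jpeg(data):
    """Report APPn identifiers, frame type and component count of a JPEG stream."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream (missing SOI marker)")
    info = {"app_markers": []}
    pos = 2
    while pos + 4 <= len(data) and data[pos] == 0xFF:
        marker = data[pos + 1]
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if 0xE0 <= marker <= 0xEF:
            # The identifier is the NUL-terminated string at the start of the payload.
            ident = data[pos + 4:pos + 2 + length].split(b"\x00", 1)[0]
            info["app_markers"].append(("APP%d" % (marker - 0xE0), ident.decode("latin-1")))
        elif 0xC0 <= marker <= 0xCF and marker not in NOT_SOF:
            precision, height, width, ncomp = struct.unpack(">BHHB", data[pos + 4:pos + 10])
            info.update(
                process=SOF_NAMES.get(marker, "SOF%d" % (marker - 0xC0)),
                precision=precision, width=width, height=height, components=ncomp,
            )
            return info
        elif marker == 0xDA:
            break  # reached scan data without seeing a frame header
        pos += 2 + length
    raise ValueError("no SOF marker found before the scan data")
```

A result with process "progressive", or with components equal to 4 alongside an ("APP14", "Adobe") entry, points at exactly the cases a baseline-only, 3-component decoder would fail on.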

Better thumbnail creation of raw images

I'm building a web application (RoR) that manages images that are in raw image format. I need to create thumbnail/web versions of these images to be displayed on the site. Currently, I'm using imagemagick, which delegates to dcraw to produce the jpeg thumbnail. The problem I'm running into is that the thumbnail deviates from the look of the original; the image gets darker and the white balance is sometimes heavily shifted.
I'm assuming that the raw format's default settings can't be read by dcraw, and thus it's left guessing how to parameterize the raw conversion. I can play around with customizing these settings, but it seems that getting it right on one image causes others to be further off the mark.
Is there a better way to do this in order to get a result that more closely mimics what I might see in a raw viewer like Photoshop, or even Mac OS X Preview? Given that Mac OS X supports a variety of digital camera raw formats, is there any way to utilize the OS's ability to render preview images (especially considering that this result is what is expected)?
The raw images that I'm using are 3FRs and fffs (both from Hasselblad).
I can post samples if people are interested.
Thanks
Look at "sips" and "Resizing images using the command line" to get you started.
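As a starting point, here is a hedged Python sketch of calling sips from a script. The helper names and file names are made up for illustration, and it only works on macOS; whether sips can decode a particular 3FR or fff file depends on the raw support in that OS version.

```python
import subprocess


def build_sips_command(src, dst, max_px=1024):
    """Build a sips call that converts src to JPEG, fitting it within max_px pixels."""
    return ["sips", "-s", "format", "jpeg", "-Z", str(max_px), src, "--out", dst]


def make_thumbnail(src, dst, max_px=1024):
    """Run sips; raises CalledProcessError if it exits with a non-zero status."""
    subprocess.run(build_sips_command(src, dst, max_px), check=True, capture_output=True)
```

For example, make_thumbnail("shot.3fr", "shot.jpg", 800) would write a JPEG whose longer side is at most 800 pixels.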

HTML5 Canvas toDataURL 8 bit?

In a web app I am currently creating, the user has to provide images that get stored server-side in a database. To minimize server load I am handling image resizing client-side, courtesy of the HTML5 Canvas, and getting the user to pre-approve the quality of the resized image.
The issue I have run into is this: the file size of the resized image is big. If I resize the same image with Paint.NET I can get a perfectly decent, lightweight 8-bit PNG image. Even the 32-bit Paint.NET image is smaller than the one that turns up on the server via toDataURL. I tried playing around with the toDataURL quality parameter, but changing it has no effect whatsoever: exactly the same data size.
I should mention that I am testing with Chrome 20.0.1132.57 m and that the only browsers that are relevant to the app are the desktop versions of Chrome and Safari.
I know I could do some server side image processing but I want to avoid that if possible. Question - what, if anything can I do to cut down on the image file size sent out from the browser?
Browsers may happily ignore any quality parameter given to toDataURL and similar methods. The specification only defines the quality argument for lossy types such as image/jpeg; PNG is lossless, so the argument has no effect on image/png output.
The only way to control the output exactly would be:
1. Write your own PNG compressor in JS, or use an existing one such as https://github.com/imaya/CanvasTool.PngEncoder
2. Dump the <canvas> data to an ArrayBuffer
3. Pass this to a WebWorker
4. Let the WebWorker compress it using your PNG compressor library
I believe JPEG/PNG encoding and decoding solutions in JS already exist.
Alternatively, you may try canvas.mozGetAsFile() / canvas.toBlob(), but I believe browsers still won't honour quality parameters for PNG output.
https://developer.mozilla.org/en/DOM/HTMLCanvasElement/

How to determine if a photo is corrupted?

I have a requirement wherein I have to determine whether a photo is corrupted and tag it accordingly.
Another thing I need is to determine whether an image has the wrong extension. What I mean by wrong extension is that sometimes I come across a photo that has the extension jpg, but when I load it into IrfanView it reports that the photo is in a different format than the extension suggests.
How can I do this in Delphi?
I have a requirement wherein I have to determine whether a photo is corrupted and tag it accordingly.
You can try some things, but with certain file formats (example: BMP, JPEG to some extent) only a human can ultimately decide if the file is OK or corrupted. The simplest test is to simply load the file into a corresponding object (TJpegImage, TPngObject, etc). If you get an exception while loading you've surely got a corrupted file. Unfortunately if no exception is raised you can't really say the file is not corrupted. I've seen corrupted JPEG files that load just fine into a Delphi TImage and can be opened with Windows's Image Viewer, but are obviously corrupted to a human observer. With BMP images it's even clearer: open up a bitmap, overwrite some bytes in the middle of the file and then open it in a viewer. How can any automated system tell those wrongly colored bits in the middle of the bitmap are actually wrong?
Another thing I need is to determine whether an image has the wrong extension. What I mean by wrong extension is that sometimes I come across a photo that has the extension jpg, but when I load it into IrfanView it reports that the photo is in a different format than the extension suggests.
How about doing some of the same: try to load the file into the object that corresponds to its extension, and if that fails, try opening it with some other formats? This should be easy.
Alternatively you can investigate image headers: most file formats start with a short signature, a few bytes long. You can look up the documentation of all image file formats and find the signature, or you can simply open up a large number of files and look for a pattern in the first 4 bytes. I'd go for this second alternative, since finding proper documentation for all image file formats might be a challenge.
The only way to check whether a file is corrupted is to try reading it as described by its file format, i.e. load a BMP as BMP by reading the BMP header, BMP data, etc. There are many web pages that describe graphics file formats. Of course, if you transmit files and are afraid they will be corrupted in transit, save them along with a checksum such as CRC32, or even a cryptographic hash such as MD5 or SHA1. Then, after transmission, check that the calculated value matches the original.
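The checksum idea is language-independent; here is a minimal Python sketch of it (file_checksums is a name chosen for this example). It reads the file in chunks and returns both a CRC32 and a SHA-1 digest, which you would store with the file before transmission and recompute afterwards.

```python
import hashlib
import zlib


def file_checksums(path, chunk_size=65536):
    """Return (crc32, sha1_hex) for a file, read in chunks to limit memory use."""
    crc = 0
    sha = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            crc = zlib.crc32(chunk, crc)  # continue the running CRC
            sha.update(chunk)
    return crc & 0xFFFFFFFF, sha.hexdigest()
```

If either value differs after the transfer, the bytes changed on the way. This detects transmission damage only; it says nothing about a file that was already broken when the checksum was taken.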
In Delphi there is the unit jpeg and the types TJPEGImage and TBitmap. Try loading the file with them and check for an exception. For other formats there are many libraries; just look for the required file formats.
To check whether the file extension is right, read the first few bytes of the file and compare them with a dictionary of graphics file signatures. For example, GIF files start with GIF, BMP files start with BM, and JPEG files start with the bytes FF D8 (many also contain the string JFIF or Exif a few bytes later). I think the Unix utility file works this way.
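The question asks for Delphi, but the signature lookup is easy to illustrate in Python (used here only because it is the language of the other snippet in this thread). The table below covers a handful of common formats, and the names sniff_format and check_extension are invented for the sketch.

```python
import os

# (signature bytes at offset 0, format name); deliberately not exhaustive
SIGNATURES = [
    (b"\xff\xd8\xff", "jpeg"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
    (b"BM", "bmp"),
    (b"II*\x00", "tiff"),
    (b"MM\x00*", "tiff"),
]

EXTENSION_FORMATS = {
    ".jpg": "jpeg", ".jpeg": "jpeg", ".png": "png",
    ".gif": "gif", ".bmp": "bmp", ".tif": "tiff", ".tiff": "tiff",
}


def sniff_format(header):
    """Return the format whose signature matches the start of header, or None."""
    for signature, name in SIGNATURES:
        if header.startswith(signature):
            return name
    return None


def check_extension(path):
    """Return (format implied by the extension, format found in the file)."""
    expected = EXTENSION_FORMATS.get(os.path.splitext(path)[1].lower())
    with open(path, "rb") as f:
        actual = sniff_format(f.read(16))
    return expected, actual
```

A mismatch between the two values, for example ("jpeg", "png"), flags a file with the wrong extension. A match only proves that the first bytes look right, not that the rest of the file is intact.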
Since you used the term "requirement", I suspect that you're doing a job for someone, possibly as a contract. So make sure that you nail the requirements before worrying about the code.
IMO, you need to get samples of test cases. As others mentioned, failure to load the file as a particular format will be one test. But what about a .jpg that loads ok, but the bottom third is missing? Or a .jpg that loads ok but has green "static" lines in the middle where an error occurred upstream somewhere (on the camera, photoshop, whatever) but then the processing recovered and resumed? In this case, the .jpg may really have green lines in it. Is that considered "corrupt" or not? This is where you need to be careful, especially if it's a contract job.
I have handled this situation by reading the suspicious image and trying to get its shape. The task is done within a try-except block. Following is the code:
import cv2

# cv2.imread does not raise on failure; it returns None when the file
# is missing or cannot be decoded.
image = cv2.imread('./image.jpg')
try:
    dummy = image.shape  # raises AttributeError when image is None
except AttributeError:
    print("[INFO] Image is not available or corrupted.")
This approach should cover all your needs like:
Detecting a corrupted image
Non-image file with an image-type extension detection
Missing image detection etc.

Resources