ImageMagick adds thick horizontal lines to PNGs extracted from PDF - image

Edit July 7, 2017: Downgrading to ImageMagick 6.9.5 solved this problem, which may be Cygwin-specific. I still don't know the underlying cause.
I need to extract data via OCR from images in PDF reports published by Chicago Public Schools. An example PDF is here (NB: this link downloads the file automatically rather than opening it in the browser). Here's a sample image (from PDF page 11, print page 8), extracted with pdfimages -png version 0.52.0 on Cygwin:
I'd like to crop each bar into its own file and extract the text with OCR. But when I try this with ImageMagick (version 7.0.4-5 Q16 x86_64 2017-01-25 according to convert -version), using the command convert chart.png -crop 320x600+0+0 bar.png, I get this image, with horizontal lines that interfere with OCR:
Running pdfimages to extract to PPM format first and then converting to PNG while cropping gives the same result, as does round-trip converting the extracted images to SVG format with ImageMagick's rsvg delegate, and fiddling with the PNG alpha channel changes the line's colors from gray to white or black but doesn't eliminate them. I've found a workaround of round-trip converting extracted images through JPG (introducing ringing artifacts, which I hope are irrelevant). But I don't see why I should have to do this. Incidentally, ImageMagick introduces the lines to PNGs even if I run a null conversion convert chart.png chart.png, which ought to leave the image unchanged:
I have found other complaints that PDF software adds horizontal lines to images, but none of them exactly matches this problem. A discussion thread mentions that versions of the PDF standard somehow differ in their treatment of alpha channels, but my knowledge of graphics is too poor understand the discussion fully; besides, my images get horizontal lines added after they're extracted from the PDF, because of something internal to ImageMagick. Can anyone shed some light on the causes of the grey lines?

Using the latest ImageMagick 7.0.6.0 Q16 Mac OS X, I get a good result. As mentioned above by Bonzo, the correct syntax for IM 7 is magick rather than convert. The use of convert reverts to IM 6. Also do not use magick convert either.
magick chart.png -crop 320x600+0+0 +repage bar.png
If this does not work for you, then there must have been a bug in your older version of IM 7. So you should then upgrade.
Note also the +repage is needed to remove the virtual canvas

Related

Converting TIFF to PDF with GraphicsMagick MediaBox / CropBox resolution

We are currently converting TIFF files to PDF using GraphicsMagick. The TIFF is coming from an eFAX and has a (pixel) resolution of 1728x2200.
If you do the conversion with tiff2pdf or just open it on Preview and convert export it to PDF, it is generated with a MediaBox value of 612x792 point, which is what is expected.
However graphics magick generates a MediaBox of 1728x4400 and a CropBox of 610x792. It all looks good if you open it on a PDF viewer because it's using the CropBox but if you're feeding it to GhostScript after, you don't get the Image on the full page but as a small square inside the document.
The lazy solutions would be to change for Tiff2PDF or add -dUseCropBox to our GhostScript command but I'd like to know what GraphicsMagick option should be used to have the PDF with the good MediaBox. It's like it doesn't understand that the resolution is in Pixels and not in Point. Hope somebody has insights

Remove meta information in picture using "convert" without changing the colors of the picture

I have a problem using the tool convert (imagemagick) to remove metainformation of pictures to make them smaller for faster delivery in websites.
When I use the convert command with following parameters:
convert -format png -strip pic1.png pic2.png
then the converted picture is much more darker than the original one.
I also tried to require the colorspace of the original picture with:
convert -format png -colorspace sRGB -strip pic1.png pic2.png
but it's the same problem.
Has anyone an idea how to solve it?
These are the example pictures:
(Original) Pic1:
(Converted) Pic 2:
The problem is the installed version 6.7.7 of ImageMagick (Standard-Version of Linux Mint 17)
In version 6.4 and (said in comment) in version 6.9 it works.
So the 6.7.7 version is buggy in that case.

MiniMagick's "strip" function makes picture filesize bigger

I have used MiniMagick to compress JPEG files.
With strip function, I want to get rid of EXIF from image. So, I do:
image = MiniMagick::Image.open("my_picture.jpg")
image.strip
image.write("my_picture_small.jpg")
but sometimes the size of my_picture_small.jpg is bigger than my_picture.jpg.
However, when I don't use the strip function, like
image = MiniMagick::Image.open("my_picture.jpg")
# image.strip
image.write("my_picture_small.jpg")
my_picture_small.jpg's size is smaller.
That situation happened with some picture deal with Photoshop and in my CentOS computer, but run well with my Macbook. I don't know why stripping some information led to more storage.
Can anyone explain it?
Have found that ImageMagick will recompress image even if it with any arguments, such as
convert image.jpg new_image.jpg
new_image.jpg will be different from image.jpg more or less. If image.jpg is from a phone or camera or a image processing tools, the degree of difference is also different.
So compress images with MiniMagick or Rmagick that use ImageMagick as there system support, just do convert -strip image.jpg new_image.jpg may led to a unexpected result, avoid to use MiniMagick command if there is no need to greatly compress file.

Uncrush PNG image on ubuntu?

IPA image uses pngcrush to compress PNG image, but I want to uncrush a PNG image on Ubuntu.
Can anyone give me any idea?
The standard PNG utility pngcrush has been modified by Apple, which makes it produce technically invalid PNGs: a new chunk is inserted before the mandatory first chunk IHDR, RGB(A) order of pixel data is inverted, and RGB pixels get premultiplied with their alpha.
Hence, I'd rather call these PNGs "fried", rather than just "crushed".
Try my own pngdefry. The source code is written on a Mac OSX machine but it should be compilable for other OSes as well; it's pretty straightforward C code.

JPEG Shows in Firefox but Not IE8

I'm working on a Sidebar Gadget and cannot get my JPEGs to show up (PNGs work). When I try to open the file by itself in IE8 it doesn't work. Firefox, of course, can open it fine.
JPEG Details:
Dimensions: 1080X900
180 dpi
Bit depth 24
Color representation: uncalibrated
I've found some things talking about the images being compressed incorrectly (?) but I haven't been able to get it working...
Any clues?
IE8 drops support for CMYK JPEG and renders them as the infamous red X without so much as a warning.
If you have ImageMagick:
identify -verbose image.jpg
will show you the image colorspace. If it's CMYK, you can convert to RGB with:
convert broken.jpg -colorspace RGB fixed.jpg
If you need to do CMYK to RGB conversion on a whole batch of JPEG-images, this command may be helpful to you:
for i in *.jpg; do convert "$i" -colorspace RGB "$i"; done
PS: If you'd like to see what is going on, just add -verbose:
for i in *.jpg; do convert "$i" -colorspace RGB -verbose "$i"; done
I had a similar issue with IE8 not displaying two JPEG images. FF, Safari, Chrome all displayed them without complaint but IE acted as if the files were not there. I have no idea what was going on, but a quick image conversion to gif or png fixed the problem. Just another in a long line of confirmations that IE sucks.
Had similar problems with existing images, which will not show up in IE8.
Problem is, as converter42 says: CMYK-Images
Convert them to RGB colorspace and all is good
The Solution with the PNG is not the best, because PNG files can be MUUUCH larger than JPGS.
If you are using photoshop for creating the jpgs. Try the below.
Open the file and go to 'Image' menu
Go to Mode
Select RGB
Save and upload to server.
This should work.
Why are you dealing with the image at 180 dpi and not the 72dpi screen resolution? At screen resolution the image will be roughly double that size. Still, the size is manageable for any browser.
When creating a gadget, you should be using PNGs for all the elements of the gadgets. Are you having issues displaying JPEG photos?
Have you looked for the yellow bar at the top of IE that blocks certain suspicious content from being loaded (popups, activex, javascript, etc.)? If it appears, try telling it to "allow".
Lastly, what are you using to compress your images to JPEG?
EDIT: If you want to do batch conversion use the batch converter in photoshop or use the Actions panel to record the conversion process for a single image, then replay the action on an entire folder. Additionally, you can save this action to a "droplet" which is a small application containing the action that you can drop an image or folder on top to.
Alternatively, if you don't fell like learning Actions, XNView is an excellent image viewer and converter that supports something like 160 different image formats and can batch convert and batch rename huge lists of files.
I fixed this issue by opening the CMYK JPEG file in Windows Paint and then saving as a JPEG, which Paint encodes as RGB by default. Not a great solution because I'm sure that Paint's converter is not as robust as Photoshop's, but this can be a quick fix if the job needs to be done now and there's no access to the tools above.

Resources