Ghostscript command for finding the number of colors used for each page in pdf file - ghostscript

I'm new to Ghostscript. Is there a Ghostscript command for finding the number of colors used on each page of a PDF file? I need to parse the results of this command from a Java program.

There is no such Ghostscript command or device. It would also be difficult to figure out; so much depends on what you mean. Do you intend to count the colour of each pixel in every image, for example? Which colour spaces are you interested in? What about ICCBased colour spaces: do you want the component values, or the CIE values?
[edit]
Yeah there's no Ghostscript equivalent, I did say that.
You would have to intercept every call to the colour operators, examine the components being supplied and see if they were not black and white. For example, if you set a CMYK colour with C=M=Y=0 and K!=0 then it's still black and white. Similar arguments apply for RGB, CIE and ICC colour spaces.
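To make the component test concrete, here is a minimal Python sketch of the kind of check described above. It is not a Ghostscript feature: the function name `is_gray`, the space labels and the tolerance are all made up for illustration, and it only covers the simple device spaces (CIE and ICC spaces would need real colour management).

```python
def is_gray(space, components, tol=1e-9):
    """Return True if a colour specification carries no chromatic content.

    space: "gray", "rgb" or "cmyk" (hypothetical labels for this sketch).
    components: floats in the range 0..1.
    """
    if space == "gray":
        return True
    if space == "rgb":
        r, g, b = components
        # Equal R, G and B means a neutral shade between black and white.
        return abs(r - g) <= tol and abs(g - b) <= tol
    if space == "cmyk":
        c, m, y, _k = components
        # Only the black channel may be non-zero.
        return max(c, m, y) <= tol
    raise ValueError(f"unsupported colour space: {space}")


print(is_gray("cmyk", (0, 0, 0, 0.7)))  # True
print(is_gray("rgb", (1, 0, 0)))        # False
```

Note that this treats gray as "black and white", which is itself one of the definitions that would need to be settled first.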
Now I bet ImageMagick doesn't do that. I suspect it simply uses Ghostscript to render a bitmap (probably RGB) and then counts the number of pixels of each colour in the output. Image manipulation tools pretty much all have to have a way to do that counting already, so it's a low cost for them.
It's also wrong.
It doesn't tell you anything about the original colour. If you render a colour object to a colour space that is different to the one it was specified in, then the rendering engine has to convert it from the colour space it was in, to the expected one. This often leads to colour shifts, especially when converting from RGB to CMYK but any conversion will potentially have this problem.
So if this is what ImageMagick is doing, it's inaccurate at best. It is possible to write PostScript to do this accurately, with some effort, but exactly what counts as 'colour' and 'black and white' is still a problem. You haven't said why you want to know if an input file is 'black and white' (you also haven't said if gray counts as black and white; it's not the same thing).
I'm guessing you intend to either charge more for colour printing, or need to divert colour input to a different printer. In which case you do need to know if the PDF uses (eg) R=G=B=1 for black, because that often will not result in C=M=Y=0 K=1 when rendered to the printer. Not only that, but the exact colour produced may not even be the same from one printer to another (colour conversion is device-dependent), so just because Ghostscript produced pure black doesn't mean that another printer would.
This is not a simple subject.

Related

Read encoded files

I was trying to read some files like images, but when I try to open them with the notepad I found weird codes like this:
ÿH‹\$0H‹t$8HƒÄ _ÃÌÌÌÌÌÌH‰\$H‰l$H‰t$ WAVAWHƒì ·L
So I have the following questions:
Why do I find those weird symbols instead of zeros and ones?
Do programmers do this for security or optimization?
Is this an encoding such as ASCII, in which every symbol has a unique decimal and binary number associated with it?
Can anyone with the corresponding decoder read this information?
Thank you
Most data files like images are stored as raw binary data, which is conventionally displayed as hexadecimal. If you know the format of the file, you can use a hexadecimal editor (I use HexEdit) to look at the data.
A colour is often stored as RGB, meaning Red, Green and Blue, so for instance, this is a dark red:
80 00 00 // (there are no spaces in the real file format, but hex editors add them.)
The format of an image depends on how it's stored. Most image formats have ways of encoding the difference between pixels rather than the actual pixels themselves, because there's a lot of information redundancy between the different pixels.
For instance, if I have a picture of the night sky with a focus on the moon, there's probably a big area in one corner that's all much the same shade of grey; encoding that without optimization would mean a hell of a lot of file that just read:
9080b09080b09080b09080b09080b09080b09080b59080b59080b5...
In this case, the grey is slightly bluish-purple, tending towards a brighter blue at the end. I've stored it as RGB here - R:90, G:80, B:b0 - but there are other formats for that storage too.
Instead of listing every pixel, I could equally say instead "6 lots of bluish-gray then it gets brighter in blue":
=6x9080b0+3x000005+...
This reduces the amount of information I would need to transmit. Most optimizations aren't quite that human-readable, but they operate on similar lines (this is a general information principle used in all kinds of things like .zip files too, not just images).
Note that this is still a lossless format; I could always get back to the actual pixel-perfect image. Bitmaps (.bmp) are lossless (though obviously still digital; they will never capture everything a human sees).
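As a rough illustration of that idea, here is a small Python sketch of plain run-length encoding. It is simpler than the notation above (it stores each run's value rather than a difference from the previous run), and the function names are invented for this example.

```python
from itertools import groupby


def rle_encode(pixels):
    """Collapse consecutive equal pixels into (count, value) pairs."""
    return [(len(list(run)), value) for value, run in groupby(pixels)]


def rle_decode(runs):
    """Expand (count, value) pairs back into the original pixel list."""
    out = []
    for count, value in runs:
        out.extend([value] * count)
    return out


pixels = ["9080b0"] * 6 + ["9080b5"] * 3
runs = rle_encode(pixels)
print(runs)  # [(6, '9080b0'), (3, '9080b5')]

# Lossless: decoding gives back exactly the pixels we started with.
assert rle_decode(runs) == pixels
```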
A number of formats encode the spatial frequencies in an image rather than the pixels themselves. It's a bit like looking at a wave form of music, except it's two-dimensional. Depending on the sampling frequency, information could easily be lost here (and often is). JPEGs (.jpg) use lossy compression like this.
The reason you see ASCII characters is because some of the values just happen to coincide with ASCII text codes. It's pure coincidence; Notepad is doing its best to interpret what's essentially gibberish. For instance this colour sequence:
4e4f424f4459
happens to coincide with the letters "NOBODY", but also represents two pixels next to each other. Both are grey, especially the left (R:4e, G:4f, B:42) with the right-most one being a bit more blue (R:4f, G:44, B:59).
But that's only if your format is storing raw pixel information... which is expensive, so it probably isn't the case.
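The "NOBODY" coincidence can be checked in a few lines of Python. This sketch only assumes the raw, uncompressed RGB layout used in the example above; a real image file would have headers and compression on top.

```python
raw = bytes.fromhex("4e4f424f4459")

# Interpreted as text: each byte is looked up as an ASCII character.
as_text = raw.decode("ascii")

# Interpreted as raw RGB: every three bytes form one pixel.
as_pixels = [tuple(raw[i:i + 3]) for i in range(0, len(raw), 3)]

print(as_text)    # NOBODY
print(as_pixels)  # [(78, 79, 66), (79, 68, 89)]
```

The bytes are identical in both cases; only the interpretation changes.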
Image formats are a pretty specialist area. The famous XKCD cartoon "Digital Data" showcases the optimizations being made in some of them. This is why, generally speaking, you shouldn't use JPEG for text, but use something like PNG (.png) instead.

Imagesc conversion formula

I have a .png image that has been created from some grayscale numbers using Matlab's imagesc tool using the standard color map.
For some reason, I am unable to recover the raw data. Is there a way of recovering the raw data from the image? I tried rgb2gray which more or less worked, but if I replug the new image into imagesc, it gives me a slightly different result. Also, the pixel with the most intensity differs in both images.
So, to clarify: I would love to know how Matlab applies the RGB colormap to the grayscale values when using the standard colormap.
This is the image we are talking about:
http://imgur.com/qFsGrWw.png
Thank you!
No, you will not get the right data if you are using the standard colormap, or jet.
Generally, it's a very bad idea to try to reverse engineer plots, as they will never contain the entirety of the information. This is true in general, but even more so if you use colormaps that do not change in step with the data. In jet, blue covers a massively bigger range of values than orange or any other color. The color changes are non-linear with the data changes, and this will make you lose a lot of resolution. You may know what value orange corresponds to, but blue will be a very wide range of possible values.
In short:
Trying to get data from a representation of data (i.e. plots) is a terrible idea
jet is a terrible idea
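One way to see why an rgb2gray round trip cannot recover the data: the brightness of a jet-style colormap does not rise steadily with the data value. The Python sketch below uses a common piecewise-linear approximation of jet (an assumption, not Matlab's exact colour table) together with the luminance weights documented for rgb2gray.

```python
def jet_approx(x):
    """Piecewise-linear approximation of a jet-style colormap, x in 0..1."""
    def clamp(v):
        return max(0.0, min(1.0, v))
    r = clamp(1.5 - abs(4 * x - 3))
    g = clamp(1.5 - abs(4 * x - 2))
    b = clamp(1.5 - abs(4 * x - 1))
    return r, g, b


def luminance(rgb):
    """Weighted sum of R, G and B, as used for grayscale conversion."""
    r, g, b = rgb
    return 0.2989 * r + 0.5870 * g + 0.1140 * b


# Print data value against the gray level it would end up with.
for x in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(x, round(luminance(jet_approx(x)), 3))
```

Under this approximation the gray level peaks in the middle of the range and falls again towards the top, so the brightest gray pixel is not the largest data value, and two different data values can land on the same gray level.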

Algorithm for changing a specific color when rendering

Given that I have a texture file generated from a font bitmap builder like this:
Now I load it into my program. Then I want to write my text in colors other than black, such as blue or pink, using that original texture file.
What trick or algorithm should I use?
Can anyone help me, please?
Thank you very much.
If you have access to a white version with transparency, you can use hardware vertex coloring to get any color you want, but as I said, the text has to be white. Doing it in a software loop is possible, but only by brute force. Even if you use a library, it will only be scanning every pixel in, converting it, and re-writing it again, which is slow. So use hardware vertex coloring.
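The reason the text has to be white is that this kind of coloring multiplies each texture pixel by the chosen color. Here is a minimal Python sketch of that per-pixel multiply, assuming channels are floats in 0..1; the `tint` helper is hypothetical and stands in for what the graphics hardware would do for you.

```python
def tint(texel, color):
    """Multiply an RGBA texel by an RGB tint; all channels are floats in 0..1."""
    r, g, b, a = texel
    cr, cg, cb = color
    return (r * cr, g * cg, b * cb, a)


white_glyph = (1.0, 1.0, 1.0, 0.8)  # white text pixel, partly transparent
black_glyph = (0.0, 0.0, 0.0, 0.8)  # black text pixel
blue = (0.0, 0.0, 1.0)

print(tint(white_glyph, blue))  # (0.0, 0.0, 1.0, 0.8)
print(tint(black_glyph, blue))  # (0.0, 0.0, 0.0, 0.8)
```

White multiplied by any color gives that color, while black stays black whatever you multiply it by.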

Image Color Picking Script

I have a bunch of sports team logos. What I want to do is find the color that is used for the highest percentage of pixels. So, for the patriots logo below, I would pick out the blue or #000f47 (white will not be an acceptable color), as this is used for the highest percentage of pixels. Obviously I can eyeball each image, use the color picker tool in Gimp/Photoshop, and determine the color. However, I would like to script this if possible.
I can use any format for the picture input. Would it be possible to read the raw bitmap file format and determine this way? What would be an easy format to read? Do any tools support this, like ImageMagick, etc?
Thanks
If you're up for it then it's fairly straightforward to write your own image processor in C#; just run through the pixels, grab the R, G and B values and increment a counter for each unique combination.
Having said that, if the image is anti-aliased then what you or I would eyeball as being blue will be variations of the RGB and the processor would count them separately. You might want to build some allowable tolerances into the processor.
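A sketch of that counting loop, written in Python rather than C# for brevity. It assumes the pixels have already been read into (r, g, b) tuples by whatever image library you use; the function name, the bucket width of 32 and the choice to skip white are all illustrative assumptions.

```python
from collections import Counter


def dominant_color(pixels, bucket=32, ignore=((255, 255, 255),)):
    """Return the centre of the most common colour bucket, or None.

    pixels: iterable of (r, g, b) ints in 0..255.
    bucket: bucket width per channel; larger values tolerate more variation.
    ignore: colours whose buckets are skipped (for example a white background).
    """
    def key_of(rgb):
        return tuple(c // bucket for c in rgb)

    ignored_keys = {key_of(rgb) for rgb in ignore}
    counts = Counter()
    for rgb in pixels:
        key = key_of(rgb)
        if key not in ignored_keys:
            counts[key] += 1
    if not counts:
        return None
    key, _count = counts.most_common(1)[0]
    return tuple(k * bucket + bucket // 2 for k in key)


pixels = ([(0, 15, 71)] * 5 + [(2, 17, 70)] * 4
          + [(255, 255, 255)] * 20 + [(200, 0, 0)] * 6)
print(dominant_color(pixels))  # (16, 16, 80)
```

The two slightly different blues fall into the same bucket and are counted together, which is the tolerance mentioned above. The result is the bucket centre, not an exact pixel value from the image.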
Just to be picky, isn't the most frequent pixel value in the image above white not blue?

Get dominant colors from image discarding the background

What is the best (result, not performance) algorithm to fetch dominant colors from an image. The algorithm should discard the background of the image.
I know I can build an array of colors and how many times they appear in the image, but I need a way to determine what is the background and what is the foreground, and keep only the second (foreground) in mind while reading the dominant colors.
The problem is very hard, especially for gradient backgrounds or backgrounds with patterns (not plain).
Isolating the foreground from the background is beyond the scope of this particular answer, but...
I've found that applying a pixelation filter to an image will draw out a really good set of 'average' colours.
Before
After
I sometimes use this approach to derive a palette of colours with a particular mood. I first find a photograph with the general tones I'm after, pixelate and then sample from the resulting image.
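For anyone who wants to script the pixelation step instead of using an image editor's filter, here is a minimal Python sketch. It assumes the image is already in memory as rows of (r, g, b) tuples and simply averages each square tile; a real filter may differ in detail.

```python
def pixelate(image, block):
    """Average an image over block x block tiles.

    image: list of rows, each row a list of (r, g, b) int tuples.
    Returns a smaller grid with one averaged colour per tile.
    """
    height, width = len(image), len(image[0])
    result = []
    for top in range(0, height, block):
        row = []
        for left in range(0, width, block):
            # Collect the pixels of this tile, clipped at the image edges.
            tile = [image[y][x]
                    for y in range(top, min(top + block, height))
                    for x in range(left, min(left + block, width))]
            n = len(tile)
            row.append(tuple(sum(p[c] for p in tile) // n for c in range(3)))
        result.append(row)
    return result


image = [[(0, 0, 0), (100, 100, 100)],
         [(200, 200, 200), (100, 100, 100)]]
print(pixelate(image, 2))  # [[(100, 100, 100)]]
```

Each colour in the output grid is then a candidate for the palette.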
(Thanks to Pietro De Grandi for the image, found on unsplash.com)
The colour summarizer is a pretty sweet spot for info on this subject, not to mention their seemingly free XML Web API that will produce descriptive colour statistics for an image of your choosing, reporting back the following formatted with swatches in HTML or as XML...
what is the average color hue, saturation and value in my image?
what is the RGB colour that is most representative of the image?
what do the RGB and HSV histograms look like?
what is the image's human readable colour description (e.g. dark pure blue)?
The purpose of this utility is to generate metadata that summarizes an
image's colour characteristics for inclusion in an image database,
such as Flickr. In particular this tool is being used to generate
metadata for Flickr's Color Fields group.
In my experience, though, this tool still misses the "human-readable" / obvious "main" color A LOT of the time. Silly machines!
I would say this problem is closer to "impossible" than "very hard". The only approach to it that I can think of would be to make the assumption that the background of an image is likely to consist of solid blocks of similar colors, while the foreground is likely to consist of smaller blocks of dissimilar colors.
If this assumption is generally true, then you could scan through the whole image and weight pixels according to how similar or dissimilar they are to neighboring pixels. In other words, if a pixel's neighbors (within some arbitrary radius, perhaps) were all similar colors, you would not incorporate that pixel into the overall estimate. If the neighbors tend to be very different colors, you would weight the pixel heavily, perhaps in proportion to the degree of difference.
This may not work perfectly, but it would definitely at least tend to exclude large swaths of similar colors.
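The weighting idea could be sketched in Python roughly as follows. This is only one possible reading of it: the radius, the distance measure (sum of absolute channel differences) and the function name are assumptions, and the image is taken to be rows of (r, g, b) tuples already in memory.

```python
def foreground_weights(image, radius=1):
    """Weight each pixel by its mean colour distance to nearby pixels.

    Pixels in flat regions get weights near 0; pixels whose neighbours
    differ a lot get larger weights.
    """
    height, width = len(image), len(image[0])
    weights = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total, count = 0.0, 0
            for ny in range(max(0, y - radius), min(height, y + radius + 1)):
                for nx in range(max(0, x - radius), min(width, x + radius + 1)):
                    if (ny, nx) == (y, x):
                        continue  # do not compare a pixel with itself
                    here, there = image[y][x], image[ny][nx]
                    total += sum(abs(a - b) for a, b in zip(here, there))
                    count += 1
            weights[y][x] = total / count if count else 0.0
    return weights


gray = (10, 10, 10)
image = [[gray, gray, gray],
         [gray, (200, 0, 0), gray],
         [gray, gray, gray]]
weights = foreground_weights(image)
print(weights[1][1], weights[0][0])  # 210.0 70.0
```

These weights could then be used in place of plain counts when tallying colours, so that large uniform areas contribute little to the result.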
As far as my knowledge of image processing algorithms extends, there is no certain way to get the "foreground"; it is only possible to get the borders between objects. You'll probably have to make do with an average, or your proposed array count method. In that, you'll want to give colours with higher saturation a higher "score" as they're much more prominent.
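One simple way to give saturated colours a higher score is to add each pixel's HSV saturation to its colour's tally instead of adding 1. The Python sketch below does that using the standard library's colorsys module; the function name and the choice of raw saturation as the weight are assumptions for illustration.

```python
import colorsys
from collections import Counter


def saturation_weighted_counts(pixels):
    """Tally colours, weighting each occurrence by its HSV saturation.

    pixels: iterable of (r, g, b) ints in 0..255.
    """
    scores = Counter()
    for r, g, b in pixels:
        _h, s, _v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        scores[(r, g, b)] += s
    return scores


pixels = [(128, 128, 128)] * 10 + [(255, 0, 0)] * 3
scores = saturation_weighted_counts(pixels)
print(scores.most_common(1)[0][0])  # (255, 0, 0)
```

Here the ten gray pixels score nothing because gray has zero saturation, so the three red pixels win even though they are fewer.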
