I'm trying to find documentation for the IBM/Xerox Recorded Image Data Inline Coding (RIDIC) recording algorithm.
Within an IPDS print stream, the image data is wrapped in this RIDIC format. I need to be able to take the stream and decode the image portion back to the original image. There is little to no information out there, as far as I've been able to find.
Here is literally all the information I have on it so far from http://afpcinc.org/wp-content/uploads/2014/07/IPDS-Reference-10.pdf:
"The Recorded Image Data Inline Coding recording algorithm (RIDIC) formats a single image in the binary element sequence of a unidirectional raster scan with no interlaced fields and with parallel raster lines, from left
to right and from top to bottom."
"Each binary element representing an image data element after decompression, without grayscale, is 0 for an
image data element without intensity, and 1 for an image data element with intensity. More than one binary
element can represent an image data element after decompression, corresponding to a grayscale or color
algorithm. Each raster scan line is an integral multiple of 8 bits. If an image occupies an area whose width is
other than an integral multiple of 8 bits, the scan line is padded with zeros."
Any information to work with this algorithm would be greatly appreciated!
Most likely, you're making it a bigger thing than it really is. RIDIC is a recording algorithm: it is the format in which the original image data is arranged prior to compression. Only if the compression is set to "No Compression" would you have to deal with data in the recording format. And then, RIDIC is simply an ordering of bit groups that describe each pixel. E.g. if you had 16-level grayscale, RIDIC encodes each pixel in left-to-right, top-to-bottom order in a nibble, and pads each scan line to a whole number of bytes.
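To make that concrete, here is a minimal Octave sketch (the function name and arguments are mine, not from the IPDS spec) that unpacks uncompressed 1-bit-per-pixel RIDIC data, honoring the byte-boundary padding of each scan line:

function img = ridic_unpack(bytes, width, height)
  % bytes: uint8 vector of uncompressed RIDIC data, 1 bit per pixel,
  % raster-scanned left-to-right, top-to-bottom.
  bytes_per_row = ceil(width / 8);          % each scan line is padded to a byte boundary
  img = false(height, width);
  for r = 1:height
    row = bytes((r-1)*bytes_per_row + 1 : r*bytes_per_row);
    binstr = dec2bin(row, 8);               % one row of '0'/'1' per byte, MSB first
    bits = reshape(binstr.' - '0', 1, []);  % flatten into scan-line bit order
    img(r, :) = logical(bits(1:width));     % drop the zero padding on the right
  end
end

For the 16-level grayscale case you would instead split each byte into its high and low nibbles (e.g. with bitshift/bitand) and map each nibble to one pixel.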
I'm trying to remove damaged JPEG duplicates of some 27k photos. Unfortunately, most of these are damaged duplicates that show half or less of the original image before cutting off into garbage/grey.
Is there any intelligent algorithm to hash a picture that, instead of hashing a reduced-size version of the full image (as aHash, pHash and dHash do), works pixel by pixel (starting top left and reading left to right)?
The thing is, most algorithms just reduce the image size and then create a hash in order to compare the pictures. Because these damaged files are missing most of their data, it's impossible to compare only the first few lines or pixels of an image. The only software coming close to this is AllDup, but it only compares bit by bit and doesn't look at the actual image data.
Would that even be possible?
Thanks in advance.
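For what it's worth, here is a rough Octave sketch of the idea being asked about (assuming the image package for rgb2gray/imresize; nrows and the 8x9 grid are arbitrary choices, not taken from any of the named tools): a dHash-style fingerprint computed only from the top band of each image, so a file that is cut off lower down can still match its original.

function h = top_band_hash(filename, nrows)
  img = imread(filename);
  if ndims(img) == 3, img = rgb2gray(img); end
  top = img(1:min(nrows, size(img, 1)), :);  % use only the (hopefully) undamaged top band
  small = imresize(top, [8 9]);              % classic dHash grid
  d = small(:, 1:8) > small(:, 2:9);         % horizontal gradient bits, read left to right
  h = d(:)';                                 % 64 bits; two files match when sum(h1 ~= h2) is small
end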
I understand that a JPEG file starts with 0xFFD8 (SOI), followed by a number of 0xFFEn segments holding metadata, then a number of segments holding the compression-related data (DQT, DHT, etc.), of which the final one is 0xFFDA (SOS); then comes the actual image data, which ends with 0xFFD9 (EOI). Each of those segments states its length in the two bytes following the JPEG marker, so it is a trivial exercise to calculate the end of a segment/start of the next segment, and the start of the image data can be calculated from the length of the SOS segment.
Up to that point, the appearance of 0xFFD9 (EOI) is irrelevant¹, because the segments are identified by their length. As far as I can see, however, there is no way of determining the length of the image data other than finding the 0xFFD9 (EOI) marker following the SOS segment. For that to be reliable, 0xFFD9 must not appear inside the actual image data itself. Is there something built into the JPEG algorithm to ensure that, or am I missing something here?
¹ A second 0xFFD8 and 0xFFD9 can appear if a thumbnail is included in the image, but that is taken care of by the length of the containing segment - usually a 0xFFE1 (APP1) segment from what I have seen. In the images I have checked so far, the start and size of the thumbnail image data are still given in the 0x0201 (JPEGInterchangeFormat - offset to JPEG SOI) and 0x0202 (JPEGInterchangeFormatLength - bytes of JPEG data) fields in IFD1, even though these were deprecated in Tech Note #2.
In JPEG entropy-coded data, the compressed byte value FF is encoded as FF00 (byte stuffing).
The compressed value FFD9 would therefore be encoded as FF00D9.
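So a bare FFD9 after the SOS header really is the end of the image data. One wrinkle worth knowing: restart markers FFD0-FFD7 also appear unstuffed inside the entropy-coded data, but they are not FFD9, so a simple scan still works. A sketch in Octave (scan_start is assumed to have been computed from the segment lengths, as described in the question):

f = fopen('image.jpg', 'rb');
bytes = fread(f, Inf, 'uint8=>uint8');
fclose(f);
i = scan_start;                          % assumed: first byte after the SOS header
while i < numel(bytes)
  if bytes(i) == 255                     % 0xFF
    m = bytes(i + 1);
    if m == 0 || (m >= 208 && m <= 215)  % FF00 stuffing or a restart marker FFD0-FFD7
      i = i + 2;                         % still entropy-coded data, keep scanning
    elseif m == 217                      % 0xFFD9: the real EOI
      fprintf('EOI found at offset %d\n', i);
      break;
    else
      i = i + 2;                         % any other marker here would be malformed
    end
  else
    i = i + 1;
  end
end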
I'm building a photographic film scanner. The electronic hardware is done; now I have to finish the mechanical advance mechanism, and then I'm almost done.
I'm using a line-scan sensor, so it's one pixel wide by 2000 high. The data stream I will be sending to the PC over USB, via an FTDI FIFO bridge, will be just 1-byte pixel values. The scanner will pull through an entire strip of 36 frames, so I will end up scanning the whole strip. To begin with I'm willing to split the frames up manually in Photoshop, but I would like to implement something in my program to do this for me. I'm using C++ in VS. So, basically, I need to find a way for the PC to detect the near-black strips in between the images on the film, isolate the images, and save them as individual files.
Could someone give me some advice for this?
That sounds pretty simple compared to the things you've already implemented; you could:
calculate an average pixel value per row, and call the resulting signal s(n) (n being the row number).
set a threshold for s(n), setting everything below that threshold to 0 and everything above to 1
Assuming you don't know the exact pixel height of the black bars and the negatives, search for periodicities in s(n). What I describe in the following is total overkill, but that's how I roll (a code sketch follows this list):
use FFTW to calculate a discrete Fourier transform of s(n); call it S(f) (f being the frequency, i.e. 1/period).
find argmax(abs(S(f))); that f corresponds to the spacing of the black bars: number of rows / f is the bar distance.
S(f) is complex, and thus has a phase, arg(S(f_max)) = atan2(imag(S(f_max)), real(S(f_max))); taking -arg(S(f_max))/(2π) times the bar distance gives you the offset (position) of the bars.
To calculate the width of the bars, you could do the same with the second highest peak of abs(S(f)), but it'll probably be easier to just count the average length of 0 around the calculated center positions of the black bars.
To get the exact width of the image strip, only look at the pixels in which the image border may lie: let r_left(x) be the signal over the few columns where the actual image might border the filmstrip material (x being the coordinate along the row). Now use a simplistic high-pass filter, e.g. f(x) := r_left(x) - r_left(x-1), to find the sharpest edge in that region (argmax(abs(f(x)))). Use the average of these edges as the border location.
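Here is a minimal Octave sketch of the averaging/threshold/FFT steps above (using the built-in fft rather than FFTW, and assuming the whole scanned strip already sits in a matrix img with one scan line per row; thresh is a level you would pick by eye):

s = mean(double(img), 2);                  % step 1: average pixel value per row, s(n)
b = double(s < thresh);                    % 1 for the dark bar rows (inverted vs. step 2,
                                           % so the phase below points at the bars)
S = fft(b - mean(b));                      % remove the DC term so the bar period dominates
[~, k] = max(abs(S(2:floor(end/2))));      % strongest periodicity, skipping bin 1 (DC)
k = k + 1;                                 % undo the offset from skipping bin 1
period = numel(b) / (k - 1);               % rows between successive black bars
offset = mod(-angle(S(k)) / (2*pi) * period, period);  % phase of the peak -> bar position

With period and offset in hand, the row ranges of the individual frames follow, and each frame can be cropped out and saved.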
By the way, if you want to write a source block that takes your scanned image as input and outputs a stream of pixel-row vectors, GNU Radio offers a nice way of building a flow graph of connected signal-processing blocks that does exactly what you want, without you having to care about getting data from A to B.
I forgot to add: use the resulting coordinates with something like OpenCV, or any other library that can read images, extract sub-images by coordinates, and save them as new files.
I am enrolled in a Coursera Machine Learning course where I am learning about neural networks. I got some handwritten-digit data from this link: http://yann.lecun.com/exdb/mnist/
Now I want to convert these data into .jpg format, and I am using this code.
function nx = conv(x)  % note: this name shadows the built-in conv() function
  nx = zeros(size(x));
  for i = 1:size(x, 1)
    c = reshape(x(i, :), 20, 20);           % one row back into a 20x20 image
    imwrite(c, 'data.jpg', 'jpg');
    nx(i, :) = (imread('data.jpg'))(:)';    % Octave-only chained indexing
    delete('data.jpg');
  end
end
Then, I run the above code with:
nx=conv(x);
x is 5000 training examples of handwritten digits. Each training example is a 20 x 20 pixel grayscale image of a digit. Each pixel is represented by a floating point number indicating the grayscale intensity at that location.
The 20 x 20 grid of pixels is "unrolled" into a 400-dimensional vector. Each of these training examples becomes a single row in our data matrix x. This gives us a 5000 x 400 matrix x where every row is a training example for a handwritten digit image.
After I run this code, I rewrite an image to disk to check:
imwrite(nx(1,:),'check.jpg','jpg')
However, I find the image is fuzzy. How would I convert these images correctly?
You are saving the images using JPEG, which is a lossy compression algorithm. Lossy compression algorithms achieve a high compression ratio, but at the expense of slightly degrading your image. That's probably why you are seeing fuzzy images: it can be attributed to compression artifacts.
From the looks of it, you want to save to file exactly what the data should be. As such, use a lossless format instead, like PNG. Therefore, change your saving lines of code to use PNG:
imwrite(c,'data.png','png')
nx(i,:)=(imread('data.png'))(:)';
delete('data.png');
Also:
imwrite(nx(1,:),'check.png','png')
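One more thing worth checking (an observation about the posted code, not about PNG): nx(1,:) is a 1x400 row vector, so writing it directly produces a one-pixel-tall image, and after the round trip through imread its values are 0-255 doubles rather than 0-1. Reshaping back to 20x20 and converting to uint8 gives a viewable digit:

imwrite(uint8(reshape(nx(1,:), 20, 20)), 'check.png', 'png')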
I have a binary medical image that I got after manually thresholding a grayscale medical image. During thresholding, I noticed some overlapping regions in the histogram that contained pixels which could be of either type (glandular tissue, which I put in the above-threshold range, or fat tissue, which I put in the below-threshold range).
How can I post-process the binary image to get the exact number of pixels of glandular tissue only, discarding the effect of wrongly thresholded pixels in the overlapping region? Please help.
There is no difficulty in maintaining one counter per pixel type and incrementing it every time you classify a pixel.
If you have more than two classes, a binary image obviously cannot store this information, and you need to count before binarization.
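For the two-class case, the counting described above is a one-liner in Octave/MATLAB (bw is assumed to be your binary image with glandular pixels set to 1):

glandular = nnz(bw);             % pixels classified as glandular tissue
fat = numel(bw) - glandular;     % everything below the threshold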