Pixel format conversion logic: from 32-bit pixels to 24-bit pixels by dropping the alpha channel

I am trying to solve a coding problem whose description is in pic. 1. I find the format conversion logic hard to understand, so I have written down all my questions below. I hope someone can help me figure out the logic.
Question:
It says the program should drop the alpha channel, and it also mentions where the next pixel begins. When I process the first pixel in the input array and drop its alpha channel, is the next pixel mentioned in the description the first pixel or the second pixel in the output array?
About where the next pixel begins, it says the next pixel starts in the 4th byte of the first 32-bit word. I'm not sure whether I should calculate this position using the original data, with the alpha channels still present, or using the data after all the alpha channels have been dropped.
Also, when I locate the first 32-bit word, should I start from the very beginning of the original input array, or from the alpha channel I just dropped?

It says the program should drop the alpha channel, and it also mentions where the next pixel begins. When I process the first pixel in the input array and drop its alpha channel, is the next pixel mentioned in the description the first pixel or the second pixel in the output array?
Neither: it is the second pixel of the input array. You are supposed to read the input array (hence the name) one 32-bit block at a time. The result is saved in the output array.
About where the next pixel begins, it says the next pixel starts in the 4th byte of the first 32-bit word. I'm not sure whether I should calculate this position using the original data, with the alpha channels still present, or using the data after all the alpha channels have been dropped.
Since you have to calculate the resulting output array, you should "calculate" the start positions in the output array. Because you save a sequence of 3x8-bit blocks (24 bits each) in an array of 32-bit elements (it is an int array), the pixels "overlap" the element boundaries: a pixel can start in one int and end in the next. The assignment indicates this like so:
|  3x8   |  3x8   |  3x8   |  3x8   |
|--------|--------|--------|--------|
|11 22 33,11 22 33,11 22 33,11 22 33|
|-----------|-----------|-----------|
|  int[0]   |  int[1]   |  int[2]   |
Also, when I locate the first 32-bit word, should I start from the very beginning of the original input array, or from the alpha channel I just dropped?
You do not change the input array at all. Instead you save the result in the output array and you start at the beginning at output_RGB[0].
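The packing described above can be sketched in Python as follows. This is only an illustration under assumptions, since the problem picture is not available here: each input word is assumed to be laid out in memory as the bytes R, G, B, A (little-endian), and the function name drop_alpha is hypothetical.

```python
import struct

def drop_alpha(input_rgba):
    """Pack 32-bit RGBA words into 24-bit RGB pixels stored in 32-bit words.

    Assumption: each input word is the byte sequence R, G, B, A in memory
    (little-endian). The real assignment may use a different byte order.
    """
    packed = bytearray()
    for word in input_rgba:                    # the input is only read, never modified
        r, g, b, _a = struct.pack("<I", word)  # split one pixel into its 4 bytes
        packed += bytes((r, g, b))             # keep 3 bytes, skip the alpha byte
    packed += bytes(-len(packed) % 4)          # zero-pad to a whole number of words
    count = len(packed) // 4
    return list(struct.unpack("<%dI" % count, bytes(packed)))

# Four identical pixels with bytes 11 22 33 and alpha AA
output_rgb = drop_alpha([0xAA332211] * 4)
print([hex(w) for w in output_rgb])
```

With this layout, four input words become three output words, and the second output pixel begins in the 4th byte of output_rgb[0], which matches the diagram above.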

Related

Editing an image according to desired average color

I'm trying to come up with an algorithm that takes a source image and a desired average color as input and outputs a colorized/modified image whose average color matches.
So let's say that I want an average color to be #ffcc00 - then for any given image I can get a new image that has average color of exactly #ffcc00. I don't mind if the resulting image is heavily modified as long as shapes are recognizable.
How should I approach this?
I think you need to look at your #ffcc00 as three distinct parts. You want to make the red channel average become 255, the green channel average become 204 and the blue channel average become zero.
If your image is unsigned 8-bit, all pixels will be in the range 0..255, i.e. no negative values. So the only way to make the red channel average 255 is if all pixels have red=255. Likewise, the only way to make the blue channel average 0 is if you make the blue component of all pixels zero.
That leaves just the green channel. You have the existing green channel with some mean value and you want to move that mean to 204, so you effectively want to multiply all the green channel pixels by 204/(current mean). If the new mean is higher than the old mean, some pixels will hit 255 and clip, so the resulting mean falls short and you may need to iterate, multiplying by a little more each time until you get what you want. If the desired new mean is lower than the existing mean, multiplying cannot push any pixel below zero, so nothing clips, although rounding to integers may leave the mean slightly off.
Have a look here for a more scientific answer.
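A minimal sketch of the iterate-and-rescale idea for a single channel, in plain Python. The function name scale_to_mean, the tolerance and the iteration limit are assumptions made for illustration; a real image would normally be handled with an image library rather than a list.

```python
def scale_to_mean(values, target, max_iter=50, tol=0.5):
    """Scale 8-bit channel values so that their mean approaches target.

    values: list of ints in 0..255 (one channel of the image).
    The factor is corrected repeatedly because clipping at 255 lowers the mean.
    """
    factor = 1.0
    scaled = list(values)
    for _ in range(max_iter):
        mean = sum(scaled) / len(scaled)
        if abs(mean - target) <= tol or mean == 0:
            break                          # close enough, or nothing to scale
        factor *= target / mean            # correct the factor by the remaining error
        scaled = [min(255, round(v * factor)) for v in values]
    return scaled

green = [10, 100, 200]
new_green = scale_to_mean(green, 204)
print(new_green, sum(new_green) / len(new_green))
```

If the target cannot be reached (for example because too many pixels are zero), the loop simply stops after max_iter passes with the closest result it found.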

Gnuplot: Making a gif of a map generated with matrix?

I generated a .dat file with 100 matrices of size 15x15, and now I want to create a gif which shows the evolution from the first to the last matrix. They are all matrices of 1 and -1, so if I want to display the initial matrix I can copy and paste it into another file and run this in gnuplot:
plot 'firstmatrix.dat' matrix with image
It represents the 1, -1 matrix with yellow and black.
To create the gif I'm trying to do this in gnuplot:
set terminal gif animate delay 20
set output 'evolution.gif'
set xrange [0:15]
set yrange [0:15]
N=15
nframes=5
do for [i=1:int(nframes)] {
    plot 'evolution.dat' every ::(i-1)*N+1::i*N matrix with image
}
I intend to read from the first line of the file to the 15th line, then from the 16th to the 30th, and so on.
I set only 5 frames to see the result more easily, and the gif shows the first matrix in the first frame and nothing more, only white frames.
The error message is four times this one:
warning: Skipping data file with no valid points
So the data for the first frame, the first matrix, is processed correctly but the rest is not. That is my problem: I don't know why it handles the first one properly and none of the others.
Thanks in advance.
It shows only the first matrix in the first frame
You were pretty close, but it also took me some iterations and testing...
Apparently, slicing a block of rows from a matrix requires every :::rowFirst::rowLast (mind the 3 colons at the beginning). gnuplot then apparently uses the row index within the whole matrix as the y-coordinate. Since you want the frames "on top of each other", you need the modulo operator % (check help operators binary). It might have been a bit easier if your matrices were separated by one or two empty lines.
Code:
### animated matrix data
reset session

### create some random data
set print $Data
do for [n=1:20] {
    do for [y=1:15] {
        Line = ''
        do for [x=1:15] {
            Line=Line.sprintf("% 3g",int(rand(0)*2)*2-1)
        }
        print Line
    }
}
set print

set terminal gif animate delay 30
set output "tbMatrixAnimated.gif"
unset key
N=15
do for [i=1:20] {
    plot $Data u 1:(int($2)%N):3 matrix every :::N*(i-1)::N*i-1 with image
}
set output
### end of code
Result: (only 20 matrices)
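The index arithmetic in the plot command can be checked outside gnuplot. The following Python sketch only reproduces the row ranges and the modulo mapping used above; the helper name frame_rows is hypothetical.

```python
def frame_rows(i, n):
    """Return (first, last) zero-based row indices of the i-th matrix (i starts at 1)."""
    return n * (i - 1), n * i - 1

N = 15
for i in (1, 2, 3):
    first, last = frame_rows(i, N)
    # Without the modulo, frame 2 would be drawn at y = 15..29, outside the
    # range of frame 1; with % N every frame lands on y = 0..14.
    print(i, (first, last), (first % N, last % N))
```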

Detecting individual images in an array of images

I'm building a photographic film scanner. The electronic hardware is done; now I have to finish the mechanical advance mechanism and then I'm almost done.
I'm using a line scan sensor, so it's one pixel wide by 2000 pixels high. The data stream I will be sending to the PC over USB through an FTDI FIFO bridge will be just 1-byte pixel values. The scanner will pull through an entire strip of 36 frames, so I will end up scanning the entire strip at once. To begin with I'm willing to split the frames up manually in Photoshop, but I would like to implement something in my program to do this for me. I'm using C++ in VS. So, basically, I need to find a way for the PC to detect the near-black strips between the images on the film, isolate the images and save them as individual files.
Could someone give me some advice for this?
That sounds pretty simple compared to the things you've already implemented; you could
calculate an average pixel value per row, and call the resulting signal s(n) (n being the row number).
set a threshold for s(n), setting everything below that threshold to 0 and everything above to 1
Assuming you don't know the exact pixel height of the black bars and the negatives, search for periodicities in s(n). What I describe in the following is total overkill, but that's how I roll:
use FFTW to calculate a discrete Fourier transform of s(n), call it S(f) (f being the frequency, i.e. 1/period).
find argmax(abs(S(f))) over f > 0 (f = 0 is only the average of s(n)); that f represents the distance between two black bars: number of rows / f is the bar distance.
S(f) is complex, and thus has an argument (phase); atan2(imag(S(f_max)), real(S(f_max))) / (2*pi) * bar distance will give you the offset of the bar pattern within one period (the sign depends on the transform's convention).
To calculate the width of the bars, you could do the same with the second highest peak of abs(S(f)), but it'll probably be easier to just count the average length of 0 around the calculated center positions of the black bars.
To get the exact width of the image strip, only take the pixels in which the image border may lie: r_left(x) would be the signal representing the few pixels in which the actual image might border on the filmstrip material (x being the coordinate along that row). Now, use a simplistic high-pass filter (e.g. f(x) := r_left(x) - r_left(x-1)) to find the sharpest edge in that region (argmax(abs(f(x)))). Use the average of these edges as the border location.
By the way, if you want to write a source block that takes your scanned image as input and outputs a stream of pixel row vectors, using GNU Radio would offer you a nice method of having a flow graph of connected signal processing blocks that does exactly what you want, without you having to care about getting data from A to B.
I forgot to add: Use the resulting coordinates with something like openCV, or any other library capable of reading images and specifying sub-images by coordinates as well as saving to new images.
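The averaging and thresholding steps above, without the FFT part, can be sketched in plain Python. The function name find_frames and the threshold value are assumptions for illustration; the question's program is in C++, and real scans would need a threshold tuned to the film.

```python
def find_frames(lines, threshold):
    """Return (start, end) index pairs of runs of bright scan lines.

    lines: list of scan lines, each a list of 8-bit pixel values.
    threshold: assumed cut-off between near-black bars and image content.
    """
    # s(n): average pixel value of scan line n, thresholded to 0 or 1
    s = [1 if sum(line) / len(line) > threshold else 0 for line in lines]
    frames = []
    start = None
    for n, bright in enumerate(s):
        if bright and start is None:
            start = n                          # a frame begins
        elif not bright and start is not None:
            frames.append((start, n - 1))      # the frame ended on the previous line
            start = None
    if start is not None:                      # the strip ended inside a frame
        frames.append((start, len(s) - 1))
    return frames

# Tiny synthetic strip: 2 dark lines, 3 image lines, 2 dark lines, 4 image lines
strip = [[0, 0]] * 2 + [[200, 100]] * 3 + [[5, 5]] * 2 + [[90, 90]] * 4
print(find_frames(strip, threshold=20))
```

Each returned pair can then be used as the crop coordinates for one frame. Dark scan lines inside an image would split a frame, which is where the periodicity search described above helps.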

"Barcode" reading from scanned image

I want to read a barcode from a scanned image of a page that I printed. The image format is not relevant. I found that the scanned images are of very low quality, and I can understand why reading normal barcodes fails.
My idea is to create a non-standard and very simple barcode at the top of each printed page. It will be 20 squares in a row forming a simple binary code: filled = 1, open = 0. It will be large enough on an A4 page to make detection easy.
At this stage I need to load the image and find the barcode somewhere at the top. It will not be in exactly the same spot in every scan. Then I need to step into each block and build the ID.
Any knowledge or links to info would be awesome.
If you can preset a region of interest that contains the code and nothing else, then detection is pretty easy. Scan a few rays across this region and find the white/black and black/white transitions. Then, knowing where the "cells" should be, you know their polarity.
For this to work, you need to frame your cells with two black ones on both ends to make sure to know where it starts/stops (if the scale is fixed, you can do with just a start cell, but I would not recommend this).
You could have a look at https://github.com/zxing/zxing. I would suggest using a 1D bar code, but one wide enough to match the low resolution of the scanner.
You could also invent your own bar code encoding and try to parse it yourself. Use thick bars for 1 and thin lines for 0. A thick bar would be, for instance, 2 white pixels followed by 4 black pixels. A thin line would be 2 white pixels, 2 black pixels and 2 white pixels. The last two pixels encode the bit value.
The pixel should be the size of the scanned image pixel.
You then process the image scan line by scan line, trying to locate the bar code.
We locate the bar code by comparing a given pixel value sequence with a pattern. This is performed by computing a score function. The sum of squared difference is a good pick. When computing the score we ignore the two pixels encoding the bit value.
When the score is below a threshold, we have found a matching pattern. It is good to add parity bits to the encoded value so that its validity can be checked.
Computing a sum of squares over a sliding window can be optimized.
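A minimal, unoptimized sketch of this matching on one scan line, in plain Python. It assumes ideal pixel values (255 for white, 0 for black), exactly one scanned pixel per code pixel, and an arbitrary threshold; the names PREFIX, CELL, find_pattern and decode are hypothetical.

```python
WHITE, BLACK = 255, 0
PREFIX = [WHITE, WHITE, BLACK, BLACK]  # pixels shared by thick and thin bars
CELL = 6                               # 4 prefix pixels + 2 pixels encoding the bit

def ssd(window, pattern):
    """Sum of squared differences between two equal-length sequences."""
    return sum((w - p) ** 2 for w, p in zip(window, pattern))

def find_pattern(scan_line, pattern, threshold):
    """Return the offsets at which the pattern matches the scan line."""
    width = len(pattern)
    return [
        x for x in range(len(scan_line) - width + 1)
        if ssd(scan_line[x:x + width], pattern) < threshold
    ]

def decode(scan_line, threshold=1000):
    """Read one bit per matched bar: dark bit pixels mean 1, bright mean 0."""
    bits = []
    for x in find_pattern(scan_line, PREFIX, threshold):
        bit_pixels = scan_line[x + 4:x + CELL]   # ignored when scoring the match
        if len(bit_pixels) == 2:
            bits.append(1 if sum(bit_pixels) < 2 * 128 else 0)
    return bits

thick = [WHITE, WHITE, BLACK, BLACK, BLACK, BLACK]
thin = [WHITE, WHITE, BLACK, BLACK, WHITE, WHITE]
line = [WHITE] * 3 + thick + thin
print(decode(line))
```

On a real scan the pixel values are noisy and the scale is not exact, so the threshold and the pattern width would have to be tuned, and the score would be recomputed incrementally rather than from scratch at every offset.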

Extra data within image (PPM/PAM/PNM)

Is it possible to store extra data in pixels of a binary PNM file in such a way that it can still be read as an image (hopefully by any decoder, but specifically by ffmpeg)?
I have a simulation that saves its data as PPM currently and I'd like a way to record more than three values per pixel in the file, and yet still be able to use it as an image (obviously only the first three/four values will actually affect the image).
In particular, I think the TUPLTYPE of PAM should allow me to do this, but I don't know how to make something that's also a readable image from that.
There are two tricks which together can get you up to 5 additional bytes per pixel in a PAM file.
First trick:
You can try storing an additional byte of information in the alpha channel and then choose to ignore that information in the decoder. The alpha channel in PAM is enabled by adding _ALPHA to the TUPLTYPE argument, so instead of TUPLTYPE RGB you have TUPLTYPE RGB_ALPHA.
Second trick:
You can set MAXVAL in PAM (or equivalent field in PPM and others) to 65535 instead of 255, which means that every pixel will be described by three 16-bit values instead of three 8-bit ones. Now, for these 16-bit values the 8 least significant bits can be used to store information as they do not affect visual properties of image when shown on typical computer screen.
First + second trick:
This gives you additional 3 x 8 = 24 bits for RGB planes and 16 bits in alpha channel. Which means: 5 bytes.
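A minimal sketch of both tricks combined, in Python. It builds the bytes of a PAM file with MAXVAL 65535 and TUPLTYPE RGB_ALPHA, putting the visible 8-bit value in the high byte of each color sample and the payload in the low bytes and in the alpha sample. The function name build_pam and the pixel tuple layout are assumptions for illustration; whether a given decoder ignores the alpha data has to be checked separately.

```python
import struct

def build_pam(width, height, pixels):
    """Return the bytes of a 16-bit RGB_ALPHA PAM image carrying extra data.

    pixels: list of (r, g, b, extra) tuples in row order, where r, g, b are
    8-bit values and extra is a bytes object of length 5.
    """
    header = (
        "P7\nWIDTH %d\nHEIGHT %d\nDEPTH 4\nMAXVAL 65535\n"
        "TUPLTYPE RGB_ALPHA\nENDHDR\n" % (width, height)
    ).encode("ascii")
    body = bytearray()
    for r, g, b, extra in pixels:
        samples = [
            (r << 8) | extra[0],          # high byte is visible, low byte is payload
            (g << 8) | extra[1],
            (b << 8) | extra[2],
            (extra[3] << 8) | extra[4],   # alpha sample carries payload only
        ]
        body += struct.pack(">4H", *samples)  # 16-bit PAM samples are big-endian
    return header + bytes(body)

data = build_pam(1, 1, [(255, 204, 0, b"hello")])
print(len(data), data[-8:].hex())
```

A reader that knows the scheme recovers the payload by taking the low byte of each color sample and both bytes of the alpha sample.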
I've not used PNM file format, but I've done this trick with a .bmp file.
Hijack the least significant bit of the image data and stuff it with whatever binary data you want. Nobody will see the difference between a pixel value of 0 and 1 (00000000 or 00000001), or the difference between 254 and 255 (11111110 or 11111111). For every 8 bytes of image data, one byte of extra data can be embedded (every 6 bytes if you use a limited character set that fits in 6 bits). The file viewing software won't know any difference. Any software which could open the file before the encoding will be able to read it afterwards.
If you want the data to be more covert/hidden, the bits can be stuffed into the image data with a shuffle routine, where the first bit might go to location 50, the second to 123, the third to 32... and after locations 0-255 (the first 256 bytes of image data) are used up (the first 32 bytes of extra data), start the shuffle again.
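The basic least-significant-bit scheme, without the shuffle, can be sketched like this in Python. It works on the raw pixel bytes only (not on the file header), and the function names embed and extract are hypothetical.

```python
def embed(image_bytes, payload):
    """Hide the payload bits in the least significant bit of each image byte."""
    if len(payload) * 8 > len(image_bytes):
        raise ValueError("payload too large for this image")
    out = bytearray(image_bytes)
    for i, byte in enumerate(payload):
        for bit in range(8):                      # most significant bit first
            pos = i * 8 + bit
            out[pos] = (out[pos] & 0xFE) | ((byte >> (7 - bit)) & 1)
    return bytes(out)

def extract(image_bytes, length):
    """Recover length payload bytes from the least significant bits."""
    payload = bytearray()
    for i in range(length):
        byte = 0
        for bit in range(8):
            byte = (byte << 1) | (image_bytes[i * 8 + bit] & 1)
        payload.append(byte)
    return bytes(payload)

pixels = bytes(range(100, 132))                   # 32 bytes of fake image data
stego = embed(pixels, b"Hi!")
print(extract(stego, 3))
```

Each image byte changes by at most 1, and the reader has to know the payload length (or store it in the first few embedded bytes) to know where to stop.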

Resources