I'm trying to write a function in Matlab that reads in TIFF images from various cameras and restores them to their correct data values for analysis. These cameras are from a variety of brands, and, so far, store either 12 or 14 bit data into 16 bit output. I've been reading them in using imread, and I was told that dividing by either 16 or 4 would convert the data back to it's original form. Unfortunately, that was when the function was only intended for one brand of camera specifically, which nicely scales data to 16 bit at time of capture so that such a transformation would work.
Since I'd like to keep the whole image property detection thing as automated as possible, I've done some digging in the data for a couple different cameras, and I'm running into an issue that I must be completely clueless about. I've determined (so far) that the pictures will always be stored in one of two ways: such that the previous method will work (they multiply the original data out to fill the 16 bits), or they just stuff the data in directly and append zeroes to the front or back for any vacant bits. I decided to see if I could detect which was which and have been using the following two methods. The images I test should easily have values that fill up the full range from zero to saturation (though sometimes not quite), and are fairly large resolution, so in theory these methods should work:
I start by reading in the image data:
Mframe = imread('signal.tif');
This method attempts to detect the number of bits that ever get used:
bits = 0;
for i = 1:16
Bframe = bitget(Mframe,i);
bits = bits + max(max(Bframe));
end
And this method attempts to find if there has been a scaling operation done:
Mframe = imread('signal.tif');
Dframe = diff(Mframe);
mindiff = min(min(nonzeros(Dframe)));
As a 3rd check I always look at the maximum value of my input image:
maxval = max(max(Mframe));
Please check my understanding here:
The value of maxval should be at 65532 in the case of a 16 bit image containing any saturation.
If the 12 or 14 bit data has been scaled to 16 bit, it should return maxval of 65532, a mindiff of 16 or 4 respectively, and bits as 16.
If the 12 or 14 bit data was stored directly with leading/trailing zeros, it can't return a maxval of 65532, mindiff should not return 16 or 4 (though it IS remotely possible), and bits should show as 12 or 14 respectively.
If an image is actually not reaching saturation, it can't return a maxval of 65532, mindiff should still act as described for the two cases above, and bits could possibly return as one lower than it otherwise would.
Am I correct in the above? If not please show me what I'm not understanding (I'm definitely not a computer scientist), because I seem to be getting data that conflicts with this.
Only one case appears to work just like I expect. I know the data to be 12 bit, and my testing shows maxval near 65532, mindiff of 16, and bits as 15. I can conclude that this image is not saturated and is a 12 bit scaled to 16 bit.
Another case for a different brand I know to have 12 bit output, and testing an image that I know isn't quite saturated gives me maxval of 61056, mindiff of 16, and bits as 12. ???
Yet another case, for yet again another brand, is known to have 14 bit output, and when I test an image I know to be saturated it gives me maxval of 65532, mindiff of 4, and bits as 15. ???
So very confused.
Well, after a lot of digging I finally figured it all out. I wrote some code to help me understand the differences between the different files and discovered that a couple of the cameras had "signatures" of sorts in them. I'm contacting the manufacturers for more information, but one in particular appears to be a timestamp that always occurs in the first 2 pixels.
Anyhow, I wrote the following code to fix the two issues I found and now everything is working peachy:
Mframe = imread('signal.tiff');
minval = min(min(Mframe));
mindiff = min(min(nonzeros(diff(Mframe))));
fixbit = log2(double(mindiff));
if rem(fixbit,2) % Correct Brand A Issues
fixbit = fixbit + 1;
Bframe = bitget(Mframe,fixbit);
[x,y] = find(Bframe==1);
for i=1:length(x)
Mframe(x(i),y(i)) = Mframe(x(i),y(i)) + mindiff;
end
end
for i=1:4 % Correct Brand B Timestamp
Bframe = bitget(Mframe,i);
if any(any(Bframe))
Mframe(1,1) = minval; Mframe(1,2) = minval;
end
end
for i = 1:16 % Get actual bit depth
Bframe = bitget(Mframe,i);
bits = bits + max(max(Bframe));
end
As for the Brand A issues, that camera appears to have bad data in just a few pixels of every frame (not the same every time) where a value appears in a pixel that is a one bit lower difference than should be possible from the pixel below it. For example, in a 12 bit picture the minimum difference should be 16 and a 14 bit picture should have a minimum difference of 4, but they have values that are 8 and 2 lower than the pixel below them. Don't know why that's happening, but it was fairly simple to gloss over.
Related
I'm able to use soundmeter to measure the "root-mean-square (RMS) of sound fragments". Now I want to get the decibels dB measurement from this value somehow.
Afaict the formula is something like:
dB = 20 * log(RMS * P, 10)
where log is base 10, and P is some unknown value power, which (as far as I can tell from https://en.wikipedia.org/wiki/Decibel) depends on the microphone that is used.
Now if I use a sound level app on my iPhone I see the avg noise in the room is 68dB, and the measurements that I receive from the soundmeter --collect --seconds 10 are:
Collecting RMS values...
149 Timeout Collected result:
min: 97
max: 149
avg: 126
Is something wrong with this logic? and how can I determine what value of P to use without calculating it (which I'm tempted to do, and seems to work). I'd assume I'd have to look it up online in some specs page, but that seems quite difficult, and using osx I'm not sure now to figure out what the value of P would be. Also this seems to be dependent on the microphone volume level setting for osx.
soundmeter is not returning the RMS in the unit one would normally expect, which would be calibrated such that a full-scale digital sine wave is 1.0 and silence is 0.0.
I found these snippets of code:
In https://github.com/shichao-an/soundmeter/blob/master/soundmeter/meter.py
data = self.output.getvalue()
segment = pydub.AudioSegment(data)
rms = segment.rms
which calls https://github.com/jiaaro/pydub/blob/master/pydub/audio_segment.py
def rms(self):
if self.sample_width == 1:
return self.set_sample_width(2).rms
else:
return audioop.rms(self._data, self.sample_width)
The function immediately below you can see that the data is divided by the maximum sample value to give the desired scale. I assume ratio_to_db is 20*log10(x)
def dBFS(self):
rms = self.rms
if not rms:
return -float("infinity")
return ratio_to_db(self.rms / self.max_possible_amplitude)
In your particular case you need to take the collected RMS level divided by the 2^N where N is the number of bits per sample to get the RMS level scaled and then convert to dB. This number will be dBFS or decibels relative to digital full-scale and will be between +0 and -inf. To get a positive dBSPL value you need to find the sensitivity of your microphone. You can do this by looking up the specs or calibrating to a known reference. If you want to trust the app on your iPhone and it reports the room noise is 68 dBSPL while your program reads -40 dBFS then you can do simple arithmetic to convert by simply adding the difference of the two (108) to the dBFS number you get.
Is there a well known documented algorithm for (positive) integer streams / time series compression, that would:
have variable bit length
work on deltas
My input data is a stream of temperature measurements from a sensor (more specifically a TMP36 read out by an Arduino). It is physically impossible that big jumps occur between measurements (time constant of sensor). I therefore think my compression algorithm should work on deltas (set a base on stream start and then only difference to next value). Because gaps are limited, I want variable bit length, because differences lower than 4 fit on 2 bits, lower than 8 on 3 bits and so on... But there is a dilemma between telling in stream the bit size of the next delta and just working on, say, 3 bit deltas and telling size only when bigger for instance.
Any idea what algorithm solves than one?
Use variable-length integers to code the deltas between values, and feed that to zlib to do the compression.
First of all there are different formats in existent. One thing I would do first is getting rid of the sign. A sign is usually a distraction when thinking about compression. I usually use the scheme where every positive is 2*v and every negative value is just 2*(-v)-1. So 0 = 0, -1 = 1, 1 = 2, -2 = 3, 2 = 4... .
Since with that scheme you have nothing like 0b11111111 = -1 the leading bits are gone. Now you can think about how to compress those symbols / numbers. One thing you can do is create a representive sample and use it to train a static huffman code. This should be possible within your on chip constraints. Another more simple aproach is using huffman codes for bit lengths and write the bits to stream. So 0 = bitlength 0, -1 = bitlength 1, 2,3 = bitlength length 2, ... . By using huffman codes to describe this bitlength you become quite compact literals.
I usually use a mixture. I use the most frequent symbols / values as raw values and use not so frequent numbers by using bit lengths + bit pattern of the actual value. This way you stay compact and do not have to deal with excessive tables (there are only 64 symbols for 64 bits lengths possible).
Also there are other schemes like leading bit where for example of every byte the first bit (or the highest) marks the last byte of the value so as long as the bit is set there will be another byte for the integer. If it is zero its the last byte of the value.
I usually train a static huffman code for such purposes. Its easy and you can even do the encoding and decoding becoming source code / generate source code out from your code (simply create ifs/switch statements and write your tables as arrays in your code).
You can use Integer compression methods with delta or delta of delta encoding like used in TurboPFor Integer Compression. Gamma coding can be also used if the deltas have very small values.
The current state of the art for this problem is Quantile Compression. It compresses numerical sequences such as integers and typically achieves 35% higher compression ratio than other approaches. It has delta encoding as a built-in feature.
CLI example:
cargo run --release compress \
--csv my.csv \
--col-name my_col \
--level 6 \
--delta-order 1 \
out.qco
Rust API example:
let my_nums: Vec<i64> = ...
let compressor = Compressor::<i64>::from_config(CompressorConfig {
compression_level: 6,
delta_encoding_order: 1,
});
let bytes: Vec<u8> = compressor.simple_compress(&my_nums);
println!("compressed down to {} bytes", bytes.len());
It does this by describing each number with a Huffman code for a range (a [lower, upper] bound) followed by an exact offset into that range.
By strategically choosing the ranges based on your data, it comes close the Shannon entropy of the data distribution.
Since your data comes from a temperature sensor, your data should be very smooth, and you may even consider delta orders higher than 1 (e.g. delta order 2 is "delta-of-deltas").
I have an image with CL_FLOAT format and stores all RGBA channels. Now every 4th pixel of image has integers stored there, I store them clasically as:
image[i * 4 + 3].x = *(float*)(&someInt);
image[i * 4 + 3].y = *(float*)(&someInt2);
etc.
And as I need these to be integers (and not floats), the rest of the pixels have to store floats, so I don't have much options here.
When I read image back from OpenCL I get the values correctly, the problem arises in OpenCL kernel:
Whenever I read image like this (sampler is set just to nearest filtering):
float4 fourthPixel = read_imagef(img, sampler, coords);
And I try to convert it to integer as
int id = as_int(fourthPixel.x);
I don't read correct number (it always returns 0, unless number is quite high in integer form).
I got few points so far - if I store number like 1505353234 it WORKS, giving me back 6539629947781120.000000 - which is correct. If I store smaller numbers, it seems that read_imagef just clamps then down to 0.
So it's quite obvious, that ALL denormalized numbers are clamped down to zero. So, is there any good way to actually force read_imagef to not clamp down denormalized numbers to zero, without adding further instruction (ye i could add 0x7f000000 or such - but I need performance in the code, so this solution is unacceptable)?
So apparently reading image through read_imagei works fine. I also looked to specs and found out that your device can clamp denormalized floats to zero.
For several valid reasons I have to use BSD's random() to generate awfully large amounts of random numbers, and since its cycle is quite short (~2^69, if I'm not mistaken) the quality of such numbers degrades pretty quickly for my use case. I could use the rng board I have access to but it's painfully slow so I thought I could do this trick: take one number from the board, use it to seed random(), use random() to draw numbers and reseed it when the board says a new number is available. The board generates about 100 numbers per second so my guess is that random() hardly gets to cycle over and the generation rate easily keeps up with my requirements of several millions numbers per second.
Anyway, the problem is that random() claims to uniformly draw numbers between 0 and (2^31)-1, but I've been drawing an uncountable amount of numbers and I've never ever seen a 0 nor a (2^31)-1 so far. Maybe some 1 and (2^31)-2, but I've never seen the extremes. Now, I know the problem with random numbers is that you can never be sure (see Dilbert, Debian), but this seem extremely odd nonetheless. Moreover I tried analysing the generated datasets with Octave using the histc() function, and the lowest and the highest bins contain between half and three quarter the amount of numbers of the middle bins (which in turn are uniformly filled, so I guess in some sense the distribution is "uniform").
Can anybody explain this?
EDIT Some code
The board outputs this structure with the three components, and then I do some mumbo-jumbo combining them to produce the seed. I have no specs about this board, it's an ancient piece of hardware thrown together by a previous student some years ago, there's little documentation and this formula I'm using is one of those suggested in the docs. The STEP parameter tells me how may numbers I can draw using one seed so I can optimise performance and throttle down CPU usage at the same time.
float n = fabsf(fmod(sqrt(a.s1*a.s1 + a.s2*a.s2 + a.s3*a.s3), 1.0));
unsigned int seed = n * UINT32_MAX;
srandom(seed);
for(int i = 0; i < STEP; i++) {
long r = random();
n = (float)r / (UINT32_MAX >> 1);
[_numbers addObject:[NSNumber numberWithFloat:n]];
}
Are you certain that
void main() {
while (random() != 0L);
}
hangs indefinitely? On my linux machine (the Gnu C library uses the same linear feedback shift register as BSD, albeit with a different seeding procedure) it doesn't.
According to this reference the algorithm produces 'runs' of consecutive zeroes or ones up to length n-1 where n is the size of the shift register. When this has a size of 31 integers (the default case) we can even be certain that, eventually, random() will return 0 a whopping 30 (but never 31) times in a row! Of course, we may have to wait a few centuries to see it happening...
To extend the cycle length, one method is to run two RNGs, with different periods, and XOR their output. See L'Ecuyer 1988 for some examples.
I am testing my new image file format, which without going into unnecessary detail consists of the PPM RGB 24-bit per pixel format sent through zlib's compression stream, and an 8 byte header appended to the front.
While I was writing up tests to evaluate the performance of the corresponding code which implements this I had one test case which produced pretty terrible results.
unsigned char *image = new unsigned char[3000*3000*3];
for(int i=0;i<3000*3000;++i) {
image[i*3] = i%255;
image[i*3+1] = (i/2)%255;
image[i*3+2] = (i*i*i)%255;
}
Now what I'm doing here is creating a 3000x3000 fully packed 3 byte per pixel image, which has red and green stripes increasing steadily, but the blue component is going to be varying quite a bit.
When I compressed this using the zlib stream for my .ppmz format, it was able to reduce the size from 27,000,049 bytes (the reason it is not an even 27 million is 49 bytes are in the headers) to 25,545,520 bytes. This compressed file is 94.6% the original size.
This got me rather flustered at first because I figured that even if the blue component was so chaotic it couldn't be helped much, at least the red and green components repeated themselves quite a bit. A smart enough compressor ought to be able to shrink to about 1/3 the size...
To test that, I took the original 27MB uncompressed file and RAR'd it, and it came out to 8,535,878 bytes. This is quite good, at 31.6%, even better than one-third!
Then I realized I made a mistake defining my test image. I was using mod 255 when I should be clamping to 255, which is mod 256:
unsigned char *image = new unsigned char[3000*3000*3];
for(int i=0;i<3000*3000;++i) {
image[i*3] = i%256;
image[i*3+1] = (i/2)%256;
image[i*3+2] = (i*i*i)%256;
}
The thing is, there is now just one more value that my pixels can take, which I was skipping previously. But when I ran my code again, the ppmz became a measly 145797 byte file. WinRAR squeezed it into 62K.
Why would this tiny change account for this massive difference? Even mighty WinRAR couldn't get the original file under 8MB. What is it about repeating values every 256 steps that doing so every 255 steps completely changes? I get that with the %255 it makes the first two color components' patterns slightly out of phase, but behavior is hardly random. And then there's just crazy modular arithmetic being dumped into the last channel. But I don't see how it could account for such a huge gap in performance.
I wonder if this is more of a math question than a programming question, but I really don't see how the original data could contain any more entropy than my newly modified data. I think the power of 2 dependence indicates something related to the algorithms.
Update: I've done another test: I switched the third line back to (i*i*i)%255 but left the others at %256. ppmz compression ratio rose a tiny bit to 94.65% and RAR yielded a 30.9% ratio. So it appears as though they can handle the linearly increasing sequences just fine, even when they are out of sync, but there is something quite strange going on where arithmetic mod 2^8 is a hell of a lot more friendly to our compression algorithms than other values.
Well, first of all, computers like powers of two. :)
Most of such compression algorithms use compression blocks that are typically aligned to large powers of two. When your cycle aligns perfectly with these blocks, there is only one "unique sequence" to compress. If your data is not aligned, your sequence will shift a little across each block and the algorithm may not be able to recognize it as one "sequence".
EDIT: (updated from comments)
The second reason is that there's an integer overflow on i*i*i. The result is a double modulus: one over 2^32 and then one over 255. This double modulus greatly increases the length of the cycle making it close to random and difficult for the compression algorithm to find the "pattern".
Mystical has a big part of the answer, but it also pays to look at the mathematical properties of the data itself, especially the blue channel.
(i * i * i) % 255 repeats with a period of 255, taking on 255 distinct values all equally often. A naïve coder (ignoring the pattern between different pixels, or between the R and B pixels) would need 7.99 bits/pixel to code the blue channel.
(i * i * i) % 256 is 0 whenever i is a multiple of 8 (8 cubed is 512, which is of course 0 mod 256);
It's 64 whenever i is 4 more than a multiple of 8;
It's 192 whenever i is 4 less than a multiple of 8 (together these cover all multiples of 4);
It's one of 16 different values whenever i is an even non-multiple of 4, depending on i's residue mod 64.
It takes on one of 128 distinct values whenever i is odd.
This makes for only 147 different possibilities for the blue pixel, with some occuring much more often than others, and a naïve entropy of 6.375 bits/pixel for the blue channel.