Dithered image larger than original (using the Rust image crate)

I'm learning Rust and wanted to try my hand at error diffusion dithering. I've got it working, but the dithered file ends up bigger than the original, which is the opposite of what's supposed to happen. The original JPEG is 605 KB, but the dithered image is a whopping 2.57 MB. My knowledge of the image crate is very limited, and I found all the various structs for representing images confusing, so I must be missing something regarding the API.
Here's the code for dithering the image (I've only included the parts I deemed relevant):
impl DiffusionKernel<'_> {
    pub const FLOYD_STEINBERG: DiffusionKernel<'_> = // Constructor

    fn distribute_error(
        &self,
        error: &(i16, i16, i16),
        image: &mut DynamicImage,
        width: u32,
        height: u32,
        x: u32,
        y: u32,
    ) {
        // "targets" are the pixels the error is distributed to
        for target in self.targets {
            // Checks if the target x and y are in the bounds of the image.
            // Also returns the x and y coordinates of the pixel, because the "target" struct
            // only describes the offset of the target pixel from the pixel currently being processed.
            let (is_valid_target, target_x, target_y) =
                DiffusionKernel::is_valid_target(target, width, height, x, y);
            if !is_valid_target {
                continue;
            }

            let target_pix = image.get_pixel(target_x, target_y);
            // Distribute the error to the target_pix
            // (elided: new_r/new_g/new_b are target_pix's channels plus the weighted error)
            let new_pix = Rgba::from([new_r, new_g, new_b, 255]);
            image.put_pixel(target_x, target_y, new_pix);
        }
    }

    pub fn diffuse(&self, bit_depth: u8, image: &mut DynamicImage) {
        let width = image.width();
        let height = image.height();

        for x in 0..width {
            for y in 0..height {
                let pix = image.get_pixel(x, y);
                let pix_quantized = ColorUtil::reduce_color_bit_depth(pix, bit_depth); // Quantizes the color
                let error = (
                    pix.0[0] as i16 - pix_quantized.0[0] as i16,
                    pix.0[1] as i16 - pix_quantized.0[1] as i16,
                    pix.0[2] as i16 - pix_quantized.0[2] as i16,
                );

                image.put_pixel(x, y, pix_quantized);
                self.distribute_error(&error, image, width, height, x, y);
            }
        }

        // Distributing the error ends up creating colors like 7, 7, 7 or 12, 12, 12 instead of 0, 0, 0 for black,
        // so here I'm just ensuring that the colors are correctly quantized.
        // I think the algorithm shouldn't behave like this; I'll try to fix it later.
        for x in 0..width {
            for y in 0..height {
                let pix = image.get_pixel(x, y);
                let pix_quantized = ColorUtil::reduce_color_bit_depth(pix, bit_depth);
                image.put_pixel(x, y, pix_quantized);
            }
        }
    }
}
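For reference, the classic Floyd–Steinberg kernel distributes the quantization error to four neighbours with weights 7/16, 3/16, 5/16 and 1/16 (each target channel becomes old_value + error * weight, clamped to 0..=255). Since the actual constructor is elided above, the following is only a hypothetical sketch of what the kernel data could look like; the Target struct and its field names are assumptions, not the original code:

// Hypothetical sketch; struct and field names are assumptions, not the OP's real types.
struct Target {
    dx: i32,        // horizontal offset from the pixel being processed
    dy: i32,        // vertical offset
    numerator: i16, // weight = numerator / 16
}

pub struct DiffusionKernel<'a> {
    targets: &'a [Target],
}

pub const FLOYD_STEINBERG: DiffusionKernel<'static> = DiffusionKernel {
    targets: &[
        Target { dx: 1, dy: 0, numerator: 7 },  // right
        Target { dx: -1, dy: 1, numerator: 3 }, // below-left
        Target { dx: 0, dy: 1, numerator: 5 },  // below
        Target { dx: 1, dy: 1, numerator: 1 },  // below-right
    ],
};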
Here's the code for loading and saving the image:
let format = "jpg";
let path = String::from("C:\\...\\Cat.".to_owned() + format);
let trimmed_path = path.trim(); // This needs to be here if I'm getting the path from the console

let bfr = Reader::open(trimmed_path)
    .unwrap()
    .with_guessed_format()
    .unwrap();
let mut dynamic = bfr.decode().unwrap();

// dynamic = dynamic.grayscale();
error_diffusion::DiffusionKernel::FLOYD_STEINBERG.diffuse(1, &mut dynamic);

dynamic
    .save(trimmed_path.to_owned() + "_dithered." + format)
    .expect("There was an error saving the image.");

OK, so I got back to trying to figure this out, and it looks like you just need an image encoder like PngEncoder and a file to write to in order to lower the bit depth of an image. The encoder takes bytes, not pixels, but thankfully, images have an as_bytes method which returns what you need.
Here's the code:
let img = image::open(path).expect("Failed to open image.");
let (width, height) = img.dimensions();
let writer = File::create(path.to_owned() + "_out.png").unwrap();

// This is the best encoder configuration for black/white images, which is my output
// (grayscale with multiple colors -> black/white using dithering).
let encoder = PngEncoder::new_with_quality(writer, CompressionType::Best, FilterType::NoFilter);
encoder
    .write_image(img.as_bytes(), width, height, ColorType::L8)
    .expect("Failed to write image.");

Related

Convert RAW-image in Bayer-encoding (RGGB) and 16 bit depth to RGB: Final image is too dark and has a greenish cast (Rust)

I use a Rust library to parse raw ARW images (Sony Raw Format). I get a raw buffer of 16-bit pixels, it gives me the CFA (Color Filter Array, which is RGGB), and the data buffer contains height * width pixels in Bayer encoding. Each pixel is stored as 16 bits (however, I think the camera only uses 12 or 14 of the 16 bits for each pixel).
I'm using a Bayer library for the demosaicing. Currently, my final image is too dark and has a greenish cast after the demosaic step. I suspect the error is that, before I pass the data to the bayer library, I try to transform each 16-bit value to 8 bits by dividing it by u16::MAX and multiplying it by u8::MAX. However, I don't know if this is the right approach.
I guess I need to perform additional steps between parsing the raw file and passing it to the bayer library. Any advice would be appreciated.
I can confirm that at least some demosaicing works. Here's a screenshot of the resulting image:
Current Code
The libraries I'm using are rawloader and bayer
let decoded_raw = rawloader::decode_file(path).unwrap();
let decoded_image_u16 = match &decoded_raw.data {
    RawImageData::Integer(data) => data,
    RawImageData::Float(_) => panic!("not supported yet"),
};

// u16 to u8 (this is probably wrong)
let mut decoded_image_u8 = decoded_image_u16
    .iter()
    .map(|val| {
        // todo find out how to interpret the u16!
        let val_f32 = *val as f32;
        let u16_max_f32 = u16::MAX as f32;
        let u8_max_f32 = u8::MAX as f32;
        (val_f32 / u16_max_f32 * u8_max_f32) as u8
    })
    .collect::<Vec<u8>>();

// prepare final RGB buffer
let bytes_per_pixel = 3; // RGB
let mut demosaic_buf = vec![0; bytes_per_pixel * decoded_raw.width * decoded_raw.height];
let mut dst = bayer::RasterMut::new(
    decoded_raw.width,
    decoded_raw.height,
    bayer::RasterDepth::Depth8,
    &mut demosaic_buf,
);

// DEMOSAIC
// adapter so that `bayer::run_demosaic` can read from the Vec
let mut decoded_image_u8 = ReadableByteSlice::new(decoded_image_u8.as_slice());
bayer::run_demosaic(
    &mut decoded_image_u8,
    bayer::BayerDepth::Depth8,
    // RGGB is definitely right for my ARW file
    bayer::CFA::RGGB,
    bayer::Demosaic::Linear,
    &mut dst,
)
.unwrap();
I'm not sure if this is connected to the actual problem, but your conversion is way overkill.
To convert from the full range of a u16 to the full range of a u8, use:
(x >> 8) as u8
fn main() {
    let convert = |x: u16| (x >> 8) as u8;
    println!("{} -> {}", 0, convert(0));
    println!("{} -> {}", 30000, convert(30000));
    println!("{} -> {}", u16::MAX, convert(u16::MAX));
}
This prints:
0 -> 0
30000 -> 117
65535 -> 255
I might be able to help you further if you post the input image, but without being able to reproduce your problem I don't think there will be much else here.
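One more thing worth checking, tied to the 12/14-bit remark in the question: if the sensor data only occupies the low 12 or 14 bits of each u16, then normalizing by u16::MAX (or shifting right by 8) leaves the result 4x to 16x darker than it should be, which would explain the dark output. The greenish cast, in turn, is what un-white-balanced raw data typically looks like, so applying the camera's white-balance coefficients after demosaicing would be the next step. Below is a minimal sketch of the scaling, assuming 14-bit samples and ignoring black-level subtraction (the sensor_bits value is an assumption, not something this snippet reads from rawloader):

// Sketch only: `sensor_bits` is assumed; a real pipeline would also subtract the
// black level and apply white balance before or after demosaicing.
fn raw_to_u8(val: u16, sensor_bits: u32) -> u8 {
    let max = ((1u32 << sensor_bits) - 1) as f32; // 16383 for 14-bit data
    ((val as f32 / max).min(1.0) * 255.0) as u8
}

let decoded_image_u8: Vec<u8> = decoded_image_u16
    .iter()
    .map(|val| raw_to_u8(*val, 14))
    .collect();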

How to interpret result of MLMultiArray in semantic segmentation through CoreML?

I am trying to implement a semantic segmentation model in my application. I have been able to convert the u2net model to a CoreML model. I am unable to get a workable result from the MLMultiArray output. The specification description is as follows:
input {
  name: "input"
  type {
    imageType {
      width: 512
      height: 512
      colorSpace: RGB
    }
  }
}
output {
  name: "x_1"
  type {
    multiArrayType {
      shape: 1
      shape: 3
      shape: 512
      shape: 512
      dataType: FLOAT32
    }
  }
}
The model works great when opening it and using the model preview functionality in Xcode. It shows the 2 different labels in 2 colours (there are only 2 classes + 1 background). I want to have the same output in my application, however when I manually process the MLMultiArray output to a CGImage I get different results. I am using the code provided here like this:
let image = output.cgImage(min: -1, max: 1, channel: 0, axes: (1,2,3))
This gives me something that looks somewhat usable but it has a lot of gradient within each channel. What I need is an image with simply 1 color value for each label.
I have tried converting the output of the model directly to an image through this sample code. This simply shows 'Inference Failed' in the Xcode model preview. When I try removing the unnecessary extra dimension in the MultiArray output I get this error:
"Error reading protobuf spec. validator error: Layer 'x_1' of type 'Convolution' has output rank 3 but expects rank at least 4."
What does the model preview in Xcode do that I am not doing? Is there a post-processing step I need to take to get usable output?
Answering my own question:
It turns out the pixel values in each channel represent the probability that the pixel belongs to the class represented by that channel.
In other words, for each position take the channel with the highest value: that channel is the pixel's class (an argmax across channels).
func getLabelsForImage() {
    ....
    setup model here
    ....
    guard let output = try? model.prediction(input: input) else {
        fatalError("Could not generate model output.")
    }

    let channelCount = 10
    // Ugly, I know. But works:
    let colors = [NSColor.red.usingColorSpace(.sRGB)!, NSColor.blue.usingColorSpace(.sRGB)!, NSColor.green.usingColorSpace(.sRGB)!, NSColor.gray.usingColorSpace(.sRGB)!, NSColor.yellow.usingColorSpace(.sRGB)!, NSColor.purple.usingColorSpace(.sRGB)!, NSColor.cyan.usingColorSpace(.sRGB)!, NSColor.orange.usingColorSpace(.sRGB)!, NSColor.brown.usingColorSpace(.sRGB)!, NSColor.magenta.usingColorSpace(.sRGB)!]

    // I don't know my min and max output, -64 and 64 seems to work OK for my data.
    var firstData = output.toRawBytes(min: Float32(-64), max: Float32(64), channel: 0, axes: (0,1,2))!.bytes
    var outputImageData: [UInt8] = []
    for _ in 0..<firstData.count {
        let r: UInt8 = UInt8(colors[0].redComponent * 255)
        let g: UInt8 = UInt8(colors[0].greenComponent * 255)
        let b: UInt8 = UInt8(colors[0].blueComponent * 255)
        let a: UInt8 = UInt8(colors[0].alphaComponent * 255)
        outputImageData.append(r)
        outputImageData.append(g)
        outputImageData.append(b)
        outputImageData.append(a)
    }

    for i in 1..<channelCount {
        let data = output.toRawBytes(min: Float32(-64), max: Float32(64), channel: i, axes: (0,1,2))!.bytes
        for j in 0..<data.count {
            if data[j] > firstData[j] {
                firstData[j] = data[j]
                let r: UInt8 = UInt8(colors[i].redComponent * 255)
                let g: UInt8 = UInt8(colors[i].greenComponent * 255)
                let b: UInt8 = UInt8(colors[i].blueComponent * 255)
                let a: UInt8 = UInt8(colors[i].alphaComponent * 255)
                outputImageData[j*4] = r
                outputImageData[j*4+1] = g
                outputImageData[j*4+2] = b
                outputImageData[j*4+3] = a
            }
        }
    }

    let image = imageFromPixels(pixels: outputImageData, width: 512, height: 512)
    image.writeJPG(toURL: labelURL.deletingLastPathComponent().appendingPathComponent("labels.jpg"))
}
// I found this function here: https://stackoverflow.com/questions/38590323/obtain-nsimage-from-pixel-array-problems-swift
func imageFromPixels(pixels: UnsafePointer<UInt8>, width: Int, height: Int) -> NSImage { // No need to pass another CGImage
    let rgbColorSpace = CGColorSpaceCreateDeviceRGB()
    let bitmapInfo: CGBitmapInfo = CGBitmapInfo(rawValue: CGImageAlphaInfo.premultipliedLast.rawValue)
    let bitsPerComponent = 8 // number of bits in UInt8
    let bitsPerPixel = 4 * bitsPerComponent // RGBA uses 4 components
    let bytesPerRow = bitsPerPixel * width / 8 // bitsPerRow / 8 (in some cases, you need some padding)
    let providerRef = CGDataProvider(
        data: NSData(bytes: pixels, length: height * bytesPerRow) // Do not put `&` as pixels is already an `UnsafePointer`
    )
    let cgim = CGImage(
        width: width,
        height: height,
        bitsPerComponent: bitsPerComponent,
        bitsPerPixel: bitsPerPixel,
        bytesPerRow: bytesPerRow, // -> not bits
        space: rgbColorSpace,
        bitmapInfo: bitmapInfo,
        provider: providerRef!,
        decode: nil,
        shouldInterpolate: true,
        intent: CGColorRenderingIntent.defaultIntent
    )
    return NSImage(cgImage: cgim!, size: NSSize(width: width, height: height))
}

Drawing a GDI bitmap of variable size with StretchBlt causes Black unshrinking borders to appear

So I'm trying to render the client area of another window onto my own. For this I use the struct
pub struct Peek {
    id: String,
    target_window: *mut HWND__,
    out_dc: *mut HDC__,
    width: i32,
    height: i32,
}
with a method:
pub fn snap(&mut self) {
    if unsafe { IsWindow(self.target_window) }.is_positive() {
        let mut rect: RECT = RECT::default();
        unsafe { assert!(GetWindowRect(self.target_window, &mut rect).is_positive()) };
        let width = rect.right - rect.left;
        let height = rect.bottom - rect.top;

        if self.height != height || self.width != width {
            self.height = height;
            self.width = width;
            unsafe {
                let screen_dc = GetDC(null_mut());
                assert!(!screen_dc.is_null(), format!("Error getting screen dc: {}", Error::last_os_error()));
                let new_size_bmp = CreateCompatibleBitmap(screen_dc, self.width, self.height);
                assert!(!new_size_bmp.is_null(), format!("Error creating new bitmap: {}", Error::last_os_error()));
                assert!(ReleaseDC(null_mut(), screen_dc) == 1, format!("Error releasing screen_dc: {}", Error::last_os_error()));
                let ejected_hgdi = SelectObject(self.out_dc, new_size_bmp as HGDIOBJ);
                assert!(!ejected_hgdi.is_null(), format!("Error selecting new size bmp into out_dc: {}", Error::last_os_error()));
                assert!(DeleteObject(ejected_hgdi) != 0,
                    format!("Error deleting old contents of out_dc: {}\nwidth => {}\nheight => {}",
                        Error::last_os_error(), self.width, self.height));
            };
        }

        let target_dc = unsafe { GetDC(self.target_window) };
        assert!(!target_dc.is_null(),
            format!("Error getting dc, (it`s a null): {}", Error::last_os_error()));
        unsafe {
            assert!(
                BitBlt(self.out_dc,
                    0,
                    0,
                    self.width,
                    self.height,
                    target_dc,
                    rect.left,
                    rect.top,
                    SRCCOPY).is_positive()
            );
        }
        unsafe {
            assert!(
                ReleaseDC(self.target_window, target_dc) == 1,
                format!("Error releasing dc: {}", Error::last_os_error())
            )
        };
    } else {
        panic!()
    }
}
I've called this method in my window loop like so:
snapper.snap();
unsafe {
    assert!(StretchBlt(
        dc,
        0,
        0,
        snapper.width,
        snapper.height,
        snapper.out_dc,
        0,
        0,
        snapper.width,
        snapper.height,
        SRCCOPY
    ).is_positive())
};
It works, but there is a major display problem whose origin I can't pinpoint. Suppose this is the window I want to copy from:
When I run my program, which creates a transparent window and renders the window beneath onto it, something like this occurs:
The image gets resized in a sensible manner, but the black sidelines stay. I assume they are recreated on every iteration of the loop, because just before calling snap my loop fills the window with white (the transparency color key) and calls RedrawWindow.
What's even stranger is that those black borders expand with the frame but never shrink: after enlarging the target window and then shrinking it again, the copied image does adjust its size, but the borders only ever grow.
I'm not sure whether it has to do with the way the DC reshapes after the new bitmap is selected into it, or with some math fluke, but I've been beating my head against this for a couple of hours now, so I'm asking here.
P.S. The crate I use is winapi-rs

How to save ndarray in Rust as image?

I have an ndarray:
let mut a = Array3::<u8>::from_elem((50, 40, 3), 3);
and I use the image library:
let mut imgbuf = image::ImageBuffer::new(50, 40);
How could I save my ndarray as an image?
If there is a better image library than image for this, I could use it.
The easiest way is to ensure that the array is in standard layout (C-contiguous), with the image dimensions in (height, width, channel) order (HWC) or an equivalent memory layout. This is necessary because image expects rows to be contiguous in memory.
Then, build a RgbImage using the type's from_raw function.
use image::RgbImage;
use ndarray::Array3;

fn array_to_image(arr: Array3<u8>) -> RgbImage {
    assert!(arr.is_standard_layout());

    let (height, width, _) = arr.dim();
    let raw = arr.into_raw_vec();

    RgbImage::from_raw(width as u32, height as u32, raw)
        .expect("container should have the right size for the image dimensions")
}
Example of use:
let mut array: Array3<u8> = Array3::zeros((200, 250, 3)); // 250x200 RGB
for ((x, y, z), v) in array.indexed_iter_mut() {
    *v = match z {
        0 => y as u8,
        1 => x as u8,
        2 => 0,
        _ => unreachable!(),
    };
}
let image = array_to_image(array);
image.save("out.png")?;
The output image:
Below are a few related helper functions, in case they are necessary.
Ndarrays can be converted to standard layout by calling the method as_standard_layout, available since version 0.13.0. Before this version, you would need to collect each array element into a vector and rebuild the array, like so:
fn to_standard_layout<A, D>(arr: Array<A, D>) -> Array<A, D>
where
    A: Clone,
    D: Dimension,
{
    let v: Vec<_> = arr.iter().cloned().collect();
    let dim = arr.dim();
    Array::from_shape_vec(dim, v).unwrap()
}
Moreover, converting an ndarray in the layout (width, height, channel) to (height, width, channel) is also possible by swapping the first two axes and making the array C-contiguous afterwards:
fn wh_to_hw(mut arr: Array3<u8>) -> Array3<u8> {
    arr.swap_axes(0, 1);
    arr.as_standard_layout().to_owned()
}
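For example, a (250, 200, 3) width-major array becomes a (200, 250, 3) array that array_to_image above accepts. A small usage sketch:
let whc: Array3<u8> = Array3::zeros((250, 200, 3)); // (width, height, channel)
let hwc = wh_to_hw(whc);                            // (height, width, channel)
let image = array_to_image(hwc);
image.save("out_wh.png")?;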

Low framerate when running Piston example

I'm building a simple 2D game in Rust using Piston. I used examples from the Piston documentation and expanded on them, and it works quite well. However, I get pretty bad performance:
Drawing only 2 squares gives me a framerate of about 30-40 FPS
Drawing 5 000 squares gives me a framerate of about 5 FPS
This is on a Core i7 @ 2.2 GHz running Windows 10, Rust version 1.8, Piston version 0.19.0.
Is this expected or have I made any mistakes in my code? Am I even measuring the FPS correctly?
extern crate piston_window;
extern crate piston;
extern crate rand;

use piston_window::*;
use rand::Rng;

fn main() {
    const SIZE: [u32; 2] = [600, 600];
    const GREEN: [f32; 4] = [0.0, 1.0, 0.0, 1.0];
    const NUM: u32 = 1000; // change this to change the number of polygons
    const SQUARESIZE: f64 = 10.0;

    // Create a Glutin window.
    let window: PistonWindow = WindowSettings::new("test", SIZE)
        .exit_on_esc(true)
        .build()
        .unwrap();

    let mut frames = 0;
    let mut passed = 0.0;
    let mut rng = rand::thread_rng();

    for e in window {
        if let Some(_) = e.render_args() {
            e.draw_2d(|c, g| {
                // clear the screen
                clear(GREEN, g);
                for i in 0..NUM {
                    // setting up so that it looks pretty
                    let x = (i % SIZE[0]) as f64;
                    let y = (i % SIZE[1]) as f64;
                    let fill = (x / (SIZE[0] as f64)) as f32;
                    let color: [f32; 4] = [fill, 1.0 - fill, fill, fill];
                    let x = rng.gen_range::<f64>(0.0, SIZE[0] as f64);

                    // draw the square
                    let square = rectangle::square(0.0, 0.0, SQUARESIZE);
                    let transform = c.transform.trans(x - SQUARESIZE / 2.0, y - SQUARESIZE / 2.0);
                    rectangle(color, square, transform, g);
                }
                frames += 1;
            });
        }
        if let Some(u) = e.update_args() {
            passed += u.dt;
            if passed > 1.0 {
                let fps = (frames as f64) / passed;
                println!("FPS: {}", fps);
                frames = 0;
                passed = 0.0;
            }
        }
    }
}
Thank you for your help.
EDIT: taskmgr tells me that it only uses about 17K memory, but one of my physical CPU cores maxes out when the FPS drops below about 20.
EDIT 2: Changed the code to a complete working example.
