I am currently attempting to losslessly compress RGB24 files using H.264 in FFmpeg. However, the color space transformation applied before the H.264 compression (RGB24 -> YUV444) has proven to be lossy (I'm guessing due to quantisation error).
Is there anything else I can use (e.g. another program) to transform my RGB24 files to YUV losslessly, before compressing them with lossless H.264?
The ultimate goal is to compress an RGB24 file and then decompress it, with the decompressed file exactly matching the original file, e.g. RGB24 -> YUV444 -> compressed YUV444 -> decompressed YUV444 -> RGB24.
Is this at all possible?
This is a copy/paste from my answer here:
RGB-frame encoding - FFmpeg/libav
Let's look at the colorspace conversion.
void YUVfromRGB(double& Y, double& U, double& V, const double R, const double G, const double B)
{
Y = 0.257 * R + 0.504 * G + 0.098 * B + 16;
U = -0.148 * R - 0.291 * G + 0.439 * B + 128;
V = 0.439 * R - 0.368 * G - 0.071 * B + 128;
}
And plug in some dummy values:
R = 255, G = 255, B = 255
Y = 235
R = 0, G = 0, B = 0
Y = 16
As you can see, the full 0 -> 255 range is squeezed into 16 -> 235, and the result still has to be stored in 8 bits, so several distinct RGB values end up mapping to the same digital YUV value. Thus there are colors in the RGB colorspace that do not survive the trip through the (digital) YUV color space, and the conversion is lossy by definition.
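To make that concrete, here is a minimal Python sketch (my own, using the studio-range coefficients from the function above and a standard approximate inverse) that rounds each value to 8 bits, as an encoder must, and counts how many grey levels fail to survive the round trip:

def yuv_from_rgb(r, g, b):
    # Same studio-range BT.601 coefficients as in YUVfromRGB() above
    y = 0.257 * r + 0.504 * g + 0.098 * b + 16
    u = -0.148 * r - 0.291 * g + 0.439 * b + 128
    v = 0.439 * r - 0.368 * g - 0.071 * b + 128
    return y, u, v

def rgb_from_yuv(y, u, v):
    # Standard approximate inverse of the matrix above
    r = 1.164 * (y - 16) + 1.596 * (v - 128)
    g = 1.164 * (y - 16) - 0.813 * (v - 128) - 0.391 * (u - 128)
    b = 1.164 * (y - 16) + 2.018 * (u - 128)
    return r, g, b

lost = 0
for c in range(256):
    y, u, v = (round(x) for x in yuv_from_rgb(c, c, c))   # stored as 8-bit integers
    r, g, b = (round(x) for x in rgb_from_yuv(y, u, v))
    if (r, g, b) != (c, c, c):
        lost += 1
print(lost, "of 256 grey levels do not survive the round trip")

Even before any video compression happens, dozens of grey levels collide, which matches the loss you are seeing.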
Related
I'm trying to sync CSS and FFmpeg color correction. The goal is to create a tool that converts CSS brightness/saturation/contrast/gamma filter values to the corresponding FFmpeg values and vice versa.
e.g.
-vf "eq=brightness=0.3:saturation=1.3:contrast=1.1"
→
filter="brightness(30%) saturate(130%) contrast(110%)"
While the algorithms for the CSS properties are documented at the W3C, I have failed to find the ones FFmpeg uses. I've tried digging through the source on GitHub: starting from here I've unfolded function calls, but it is "a bit" too hard to navigate a project that is 20 years and 104k commits old. :)
I'll be very grateful if anyone can help me figure out the precise formulas for brightness, saturation, contrast, and gamma. Any hints appreciated. Thanks.
This is the core function:
static void create_lut(EQParameters *param)
{
    int i;
    double g = 1.0 / param->gamma;
    double lw = 1.0 - param->gamma_weight;

    for (i = 0; i < 256; i++) {
        double v = i / 255.0;
        v = param->contrast * (v - 0.5) + 0.5 + param->brightness;

        if (v <= 0.0) {
            param->lut[i] = 0;
        } else {
            v = v * lw + pow(v, g) * param->gamma_weight;

            if (v >= 1.0)
                param->lut[i] = 255;
            else
                param->lut[i] = 256.0 * v;
        }
    }

    param->lut_clean = 1;
}
The filter operates only on 8-bit YUV input. This function builds a look-up table (LUT) mapping every 8-bit input value 0-255 to an output value; that table is then applied to the input pixels.
The functions with names of the form set_parameter, like set_gamma, convert the user-supplied argument into the final value used in the function above. contrast is applied only to the luma plane; saturation to the chroma planes.
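As a starting point for matching this against the CSS formulas, here is a rough Python transcription of create_lut (my own sketch; note that the set_* functions may rescale the command-line values before they reach these parameters, so check those too):

import numpy as np

def eq_lut(contrast=1.0, brightness=0.0, gamma=1.0, gamma_weight=1.0):
    # Transcription of create_lut() above, for 8-bit luma values
    v = np.arange(256) / 255.0
    v = contrast * (v - 0.5) + 0.5 + brightness   # contrast about mid-grey, then brightness
    v = np.clip(v, 0.0, None)                     # the C code maps v <= 0 to 0
    v = v * (1.0 - gamma_weight) + v ** (1.0 / gamma) * gamma_weight
    return np.clip(256.0 * v, 0, 255).astype(np.uint8)

lut = eq_lut(contrast=1.1, brightness=0.3)        # illustrative values only
print(lut[0], lut[128], lut[255])

Saturation is applied to the chroma planes separately and does not go through this luma LUT.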
I was analyzing a 12-bit-per-pixel, GRBG, little-endian, 1920x1280 raw image, but I am confused about how the data / RGB pixels are stored. The image size is 4915200 bytes, and 4915200 / (1920 x 1280) = 2. That means each pixel takes 2 bytes, with 4 of those 16 bits used as padding. I tried to inspect the image with a hex editor, but I have no idea how the pixels are laid out. Please share if you have any idea.
Image Link
That means each pixel takes 2 bytes and 4 bits in 2bytes are used for padding
Well, sort of. It means each sample is stored in two consecutive bytes, with 4 bits of padding. But in raw images, samples aren't exactly pixels. Raw images have not been demosaiced yet; they are raw, after all. For GRBG, the Bayer pattern alternates rows of G R G R ... and B G B G ..., so each 2x2 block contains two green samples, one red and one blue.
What's in the file is a 1920x1280 grid of 12+4-bit samples, arranged in the same order the pixels would have been, but each sample carries only one channel, namely the one that corresponds to its position in the Bayer pattern.
Additionally, the color space is probably linear, not gamma-compressed. The color balance is unknown unless you reverse-engineer it. A proper decoder would have a calibrated color matrix, but I don't have that.
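As a quick sanity check of the byte layout, a short Python/NumPy sketch (assuming the 12 significant bits sit in the low bits of each little-endian 16-bit word; the file name is just illustrative) would be:

import numpy as np

# One little-endian 16-bit word per Bayer sample
raw = np.fromfile('Image_12bpp_grbg_LittleEndian_1920x1280.raw', dtype='<u2')
samples = raw & 0x0FFF                 # keep the 12 data bits
print(samples.max())                   # should not exceed 4095 if the layout guess is right
mosaic = samples.reshape(1280, 1920)   # same grid as the pixels, one channel per site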
I combined these two things and guessed a color balance to do a really basic decoding (with bad demosaicing, just to demonstrate that the above information is probably accurate):
Using this C# code:
// "data" is assumed to hold the raw bytes of the file (e.g. from File.ReadAllBytes)
Bitmap bm = new Bitmap(1920, 1280);
for (int y = 0; y < 1280; y += 2)
{
    int i = y * 1920 * 2;
    for (int x = 0; x < 1920; x += 2)
    {
        const int stride = 1920 * 2;
        // Read one GRBG cell as little-endian 16-bit words:
        // d0 = G, d1 = R (same row), d2 = B, d3 = G (next row)
        int d0 = data[i] + (data[i + 1] << 8);
        int d1 = data[i + 2] + (data[i + 3] << 8);
        int d2 = data[i + stride] + (data[i + stride + 1] << 8);
        int d3 = data[i + stride + 2] + (data[i + stride + 3] << 8);
        i += 4;
        // Crude gamma (sqrt) plus guessed per-channel gains, clamped to 8 bits
        int r = Math.Min((int)(Math.Sqrt(d1) * 4.5), 255);
        int b = Math.Min((int)(Math.Sqrt(d2) * 9), 255);
        int g0 = Math.Min((int)(Math.Sqrt(d0) * 5), 255);
        int g3 = Math.Min((int)(Math.Sqrt(d3) * 5), 255);
        int g1 = Math.Min((int)(Math.Sqrt((d0 + d3) * 0.5) * 5), 255);
        // Very rough demosaic: reuse r and b across the 2x2 block, blend the greens
        bm.SetPixel(x, y, Color.FromArgb(r, g0, b));
        bm.SetPixel(x + 1, y, Color.FromArgb(r, g1, b));
        bm.SetPixel(x, y + 1, Color.FromArgb(r, g1, b));
        bm.SetPixel(x + 1, y + 1, Color.FromArgb(r, g3, b));
    }
}
You can load your image into a Numpy array and reshape correctly like this:
import numpy as np
# Load image and reshape
img = np.fromfile('Image_12bpp_grbg_LittleEndian_1920x1280.raw',dtype=np.uint16).reshape((1280,1920))
print(img.shape)
(1280, 1920)
Then you can demosaic and scale to get a 16-bit PNG. Note that I don't know your calibration coefficients so I guessed:
#!/usr/bin/env python3
# Demosaicing Bayer Raw image
# https://stackoverflow.com/a/68823014/2836621
import cv2
import numpy as np
filename = 'Image_12bpp_grbg_LittleEndian_1920x1280.raw'
# Set width and height
w, h = 1920, 1280
# Read mosaiced image as GRGRGR...
# BGBGBG...
bayer = np.fromfile(filename, dtype=np.uint16).reshape((h,w))
# Extract g0, g1, b, r from mosaic
g0 = bayer[0::2, 0::2] # every second pixel down and across starting at 0,0
g1 = bayer[1::2, 1::2] # every second pixel down and across starting at 1,1
r = bayer[0::2, 1::2] # every second pixel down and across starting at 0,1
b = bayer[1::2, 0::2] # every second pixel down and across starting at 1,0
# Apply (guessed) color matrix for 16-bit PNG
R = np.sqrt(r) * 1200
B = np.sqrt(b) * 2300
G = np.sqrt((g0+g1)/2) * 1300 # very crude
# Stack into 3 channel
BGR16 = np.dstack((B,G,R)).astype(np.uint16)
# Save result as 16-bit PNG
cv2.imwrite('result.png', BGR16)
Keywords: Python, raw, image processing, Bayer, de-Bayer, mosaic, demosaic, de-mosaic, GBRG, 12-bit.
I'm new to HLSL. I am trying to convert the color space of an image captured using the DXGI Desktop Duplication API from BGRA to YUV444, using a texture as the render target.
I have set up my pixel shader to perform the required transformation. Taking the 4:2:0 sub-sampled YUV from the render-target texture and encoding it as H.264 using ffmpeg, I can see the image.
The problem is: it is greenish.
The input color information for the shader is of float data type, but the coefficient matrices available for RGB to YUV conversion assume integer color values.
If I use the clamp function and take integers out of the input color, I lose accuracy.
Any suggestions and directions are welcome. Please let me know if any other information would help.
I suspect the pixel shader I wrote, as I am working with it for the first time. Here is the pixel shader:
float3 rgb_to_yuv(float3 RGB)
{
    float y = dot(RGB, float3(0.29900f, -0.16874f, 0.50000f));
    float u = dot(RGB, float3(0.58700f, -0.33126f, -0.41869f));
    float v = dot(RGB, float3(0.11400f, 0.50000f, -0.08131f));
    return float3(y, u, v);
}

float4 PS(PS_INPUT input) : SV_Target
{
    float4 rgba, yuva;
    rgba = tx.Sample(samLinear, input.Tex);
    float3 ctr = float3(0, 0, .5f);
    return float4(rgb_to_yuv(rgba.rgb) + ctr, rgba.a);
}
The render target is mapped to a CPU-readable texture, and the YUV444 data is copied into 3 BYTE arrays and supplied to the ffmpeg libx264 encoder.
The encoder writes the encoded packets to a video file.
Here, for each 2x2 block of pixels, I take one U (Cb), one V (Cr) and 4 Y values.
I retrieve the YUV420 data from the texture as:
for (size_t h = 0, uvH = 0; h < desc.Height; ++h)
{
    for (size_t w = 0, uvW = 0; w < desc.Width; ++w)
    {
        dist = resource1.RowPitch * h + w * 4;
        distance = resource.RowPitch * h + w * 4;
        distance2 = inframe->linesize[0] * h + w;
        data = sptr[distance + 2];
        pY[distance2] = data;

        if (w % 2 == 0 && h % 2 == 0)
        {
            data1 = sptr[distance + 1];
            distance2 = inframe->linesize[1] * uvH + uvW++;
            pU[distance2] = data1;

            data1 = sptr[distance];
            pV[distance2] = data1;
        }
    }
    if (h % 2)
        uvH++;
}
EDIT 1: Adding the blend state description:
D3D11_BLEND_DESC BlendStateDesc;
BlendStateDesc.AlphaToCoverageEnable = FALSE;
BlendStateDesc.IndependentBlendEnable = FALSE;
BlendStateDesc.RenderTarget[0].BlendEnable = TRUE;
BlendStateDesc.RenderTarget[0].SrcBlend = D3D11_BLEND_SRC_ALPHA;
BlendStateDesc.RenderTarget[0].DestBlend = D3D11_BLEND_INV_SRC_ALPHA;
BlendStateDesc.RenderTarget[0].BlendOp = D3D11_BLEND_OP_ADD;
BlendStateDesc.RenderTarget[0].SrcBlendAlpha = D3D11_BLEND_ONE;
BlendStateDesc.RenderTarget[0].DestBlendAlpha = D3D11_BLEND_ZERO;
BlendStateDesc.RenderTarget[0].BlendOpAlpha = D3D11_BLEND_OP_ADD;
BlendStateDesc.RenderTarget[0].RenderTargetWriteMask = D3D11_COLOR_WRITE_ENABLE_ALL;
hr = m_Device->CreateBlendState(&BlendStateDesc, &m_BlendState);
FLOAT blendFactor[4] = {0.f, 0.f, 0.f, 0.f};
m_DeviceContext->OMSetBlendState(nullptr, blendFactor, 0xffffffff);
m_DeviceContext->OMSetRenderTargets(1, &m_RTV, nullptr);
m_DeviceContext->VSSetShader(m_VertexShader, nullptr, 0);
m_DeviceContext->PSSetShader(m_PixelShader, nullptr, 0);
m_DeviceContext->PSSetShaderResources(0, 1, &ShaderResource);
m_DeviceContext->PSSetSamplers(0, 1, &m_SamplerLinear);
m_DeviceContext->IASetPrimitiveTopology(D3D11_PRIMITIVE_TOPOLOGY_TRIANGLELIST);
EDIT 2: The Y U V values calculated on the CPU are 45 200 170, while the values after the pixel shader (which involves floating-point calculations) are 86 141 104.
The corresponding R G B values are 48 45 45. What could be making the difference?
It looks like your matrix is transposed.
According to: www.martinreddy.net/gfx/faqs/colorconv.faq under [6.4] ITU.BT-601 Y'CbCr:
Y'= 0.299*R' + 0.587*G' + 0.114*B'
Cb=-0.169*R' - 0.331*G' + 0.500*B'
Cr= 0.500*R' - 0.419*G' - 0.081*B'
You misinterpreted the behavior of numpy.dot in the source you copied.
Also, it looks like @harold is correct: you should be offsetting both U and V.
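As a quick NumPy illustration of the transposition (a sketch of mine, not the shader itself): dotting the RGB vector with the rows of the matrix gives the intended Y'CbCr, while dotting it with the columns reproduces the coefficient layout used in the question's shader.

import numpy as np

# Rows are the Y', Cb, Cr coefficient sets from the FAQ quoted above
M = np.array([[ 0.299,  0.587,  0.114],
              [-0.169, -0.331,  0.500],
              [ 0.500, -0.419, -0.081]])
rgb = np.array([0.2, 0.5, 0.3])

print(M @ rgb)     # correct: each row dotted with RGB
print(M.T @ rgb)   # transposed: the layout the question's shader uses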
Based on this Wikipedia article, to convert RGB -> YUV444 (BT.601) you should use this function:
float3 RGBtoYUV(float3 c)
{
    float3 yuv;
    yuv.x = dot(c, float3(0.299, 0.587, 0.114));
    yuv.y = dot(c, float3(-0.14713, -0.28886, 0.436));
    yuv.z = dot(c, float3(0.615, -0.51499, -0.10001));
    return yuv;
}
Also, what's the format of the texture that you load into your shader?
Considering that you are using float4 rgba, yuva;, did you convert BGRA -> RGBA first?
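One quick way to check whether a BGRA/RGBA mix-up alone explains the green tint is to push a test color through the matrix in both channel orders (a small NumPy sketch, independent of the shader):

import numpy as np

# BT.601 full-range RGB -> YUV matrix from the function above
M = np.array([[ 0.299,    0.587,    0.114  ],
              [-0.14713, -0.28886,  0.436  ],
              [ 0.615,   -0.51499, -0.10001]])

pixel_rgb = np.array([0.8, 0.2, 0.1])   # a reddish test color
pixel_bgr = pixel_rgb[::-1]             # the same bytes read in BGR order

print(M @ pixel_rgb)   # intended YUV
print(M @ pixel_bgr)   # what you get if the texture is actually BGRA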
I am given a .bin file. I know that the elements in this file correspond to Y Cb Cr values (4:2:2). Also, the data type is 8 bits. How can I view this?
I found a pretty good site, http://rawpixels.net/, which does what I expect but for the YUV format. I want the same for the YCbCr format.
Preliminary Google searches give conversion to RGB, which is not what I want.
I have attached an example .bin file on dropbox. The size of image is 720 X 576.
From Wikipedia
Y′CbCr is often confused with the YUV color space, and typically the terms YCbCr and YUV are used interchangeably, leading to some confusion; when referring to signals in video or digital form, the term "YUV" mostly means "Y′CbCr".
If you are on a Linux-based system and have access to FFmpeg, the following command correctly displays the data:
ffplay -f rawvideo -video_size 720x576 -pix_fmt yuyv422 38.bin
Another good tool for displaying RGB/YCbCr images is vooya, which is free for Linux but not for Windows.
My own tool, yuv-viewer works as well.
Hope this helps.
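If you would rather produce a viewable file than use ffplay, something along these lines should also work (a hedged sketch using OpenCV; the 720x576 geometry and the file name 38.bin are taken from the question):

import numpy as np
import cv2

w, h = 720, 576
# The data is packed Y0 Cb Y1 Cr (YUYV/YUY2), i.e. two bytes per pixel
raw = np.fromfile('38.bin', dtype=np.uint8).reshape(h, w, 2)
bgr = cv2.cvtColor(raw, cv2.COLOR_YUV2BGR_YUY2)
cv2.imwrite('38.png', bgr)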
You can up-sample the 4:2:2 down-sampled chroma like this:
////////////////////////////////////////////////////////////////////////////////
// unpack.c
// Mark Setchell
//
// Convert YCbCr 4:2:2 format file to full YCbCr without chroma subsampling
//
// Compile with:
// gcc -o unpack unpack.c
// Run with:
// ./unpack < input.bin > output.bin
////////////////////////////////////////////////////////////////////////////////
#include <stdio.h>
#include <sys/uio.h>
#include <unistd.h>
#include <sys/types.h>
int main(){
    unsigned char ibuf[4];   // Input data buffer format: Y Cb Y Cr
    unsigned char obuf[6];   // Output data buffer format: Y Cb Cr Y Cb Cr

    // Read 4 bytes at a time, and upsample chroma
    while(fread(ibuf,4,1,stdin)==1){
        obuf[0]=ibuf[0];
        obuf[1]=ibuf[1];
        obuf[2]=ibuf[3];
        obuf[3]=ibuf[2];
        obuf[4]=ibuf[1];
        obuf[5]=ibuf[3];
        fwrite(obuf,6,1,stdout);
    }
    return 0;
}
Then you would run this to up-sample:
./unpack < input.bin > output.bin
and then use ImageMagick convert to get a PNG (or JPEG, or TIF) like this:
convert -size 720x576 -depth 8 yuv:output.bin image.png
In theory, ImageMagick should be able to do the up sampling itself (and not need a C program) with a command line like this, but I can't seem to make it work:
convert -interlace none -sampling-factor 4:2:2 -size 720x576 -depth 8 yuv:input.bin image.jpg
If anyone knows why - please comment!
This is a slightly different version of my other answer, insofar as it up-samples the chroma, and also converts the YUV to RGB and then creates a NetPBM PNM format file. That means that you only need to install the pnmtopng utility from NetPBM to get to a PNM image - and NetPBM is much lighter weight and simpler to install than ImageMagick.
////////////////////////////////////////////////////////////////////////////////
// yuv2pnm.c
// Mark Setchell
//
// Convert YUV 4:2:2 format file to RGB PNM format without chroma subsampling
//
// Compile with:
// gcc -o yuv2pnm yuv2pnm.c
//
// Run with:
// ./yuv2pnm < input.bin > output.pnm
//
// and then use ImageMagick to go to PNG format, or JPEG or TIF, with
//
// convert output.pnm image.png
//
// or, all in one line (still with ImageMagick) to JPEG:
//
// ./yuv2pnm < input.bin | convert pnm:- image.jpg
//
// or, use the (simpler-to-install) NetPBM's "pnmtopng" to convert to a PNG file
//
// ./yuv2pnm < input.bin | pnmtopng - > image.png
////////////////////////////////////////////////////////////////////////////////
#include <stdio.h>

#define MIN(a,b) (((a) < (b)) ? (a) : (b))
#define MAX(a,b) (((a) > (b)) ? (a) : (b))

void YUV2RGB(unsigned char Y, unsigned char U, unsigned char V, unsigned char *RGB)
{
    int R, G, B;
    R = Y + (1.370705 * (V - 128));
    G = Y - (0.698001 * (V - 128)) - (0.337633 * (U - 128));
    B = Y + (1.732446 * (U - 128));
    // Clamp each channel to 0..255 before storing in unsigned chars
    RGB[0] = MAX(0, MIN(255, R));
    RGB[1] = MAX(0, MIN(255, G));
    RGB[2] = MAX(0, MIN(255, B));
}

int main(int argc, char *argv[]){
    unsigned char buf[4];   // Input data buffer format: Y Cb Y Cr
    unsigned char RGB[6];   // Output data buffer format: R G B R G B
    int width  = 720;
    int height = 576;

    // Write PNM header
    fprintf(stdout, "P6\n");
    fprintf(stdout, "%d %d\n", width, height);
    fprintf(stdout, "255\n");

    // Read 4 bytes at a time, upsample chroma and convert to 2 RGB pixels
    while(fread(buf, 4, 1, stdin) == 1){
        YUV2RGB(buf[0], buf[1], buf[3], &RGB[0]);
        YUV2RGB(buf[2], buf[1], buf[3], &RGB[3]);
        fwrite(RGB, 6, 1, stdout);
    }
    return 0;
}
NetPBM format is described here. Note PNM is an abbreviation that includes PPM.
Find below the formulas to convert YUV data to RGB:
R = Y + 1.4075 * (V - 128)
G = Y - 0.3455 * (U - 128) - (0.7169 * (V - 128))
B = Y + 1.7790 * (U - 128)
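For example, a small Python helper applying these formulas (with clamping to the 0-255 range, which the bare formulas do not do):

def yuv_to_rgb(y, u, v):
    r = y + 1.4075 * (v - 128)
    g = y - 0.3455 * (u - 128) - 0.7169 * (v - 128)
    b = y + 1.7790 * (u - 128)
    clamp = lambda x: max(0, min(255, int(round(x))))
    return clamp(r), clamp(g), clamp(b)

print(yuv_to_rgb(128, 128, 128))   # mid grey -> (128, 128, 128)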
I am very new to Matlab. I am trying to convert an RGB image to YUV, and convert it back to RGB. This is my code:
RGB = imread('ist.jpg');
R = RGB(:,:,1);
G = RGB(:,:,2);
B = RGB(:,:,3);
Y = 0.299 * R + 0.587 * G + 0.114 * B;
U = -0.14713 * R - 0.28886 * G + 0.436 * B;
V = 0.615 * R - 0.51499 * G - 0.10001 * B;
R = Y + 1.139834576 * V;
G = Y -.3946460533 * U -.58060 * V;
B = Y + 2.032111938 * U;
RGB = cat(3,R,G,B);
imshow(RGB);
The final image that Matlab shows me is very blueish and very different from the initial RGB image. Also, when I compare the blue-channel values of certain pixels before and after the conversion, they differ. What am I doing wrong?
Also, if there is a more efficient and shorter way to access an image's Y, U and V values, that would be even better.
I would be really thankful for any help of any kind.
I do not have MATLAB access any more so can't test this. However, imread is most likely returning uint8 data. Do whos and see what the data type is. If it is uint8, the RGB->YUV->RGB conversion is clipping. Try this:
RGB = double(imread('ist.jpg')); % convert to double
% RGB->YUV->RGB like you have them in the current code
RGB = cat(3,R,G,B)./255; % since it's a double now, need 0..1 range for imshow.
% Divide 0..255 by 255 to get 0..1.
imshow(RGB);
See this for more discussion of imshow and data types.
There are functions in the Image Processing Toolbox to do that: ycbcr2rgb and rgb2ycbcr.