Which flag to use for better quality with sws_scale? - ffmpeg

/* values for the flags, the stuff on the command line is different */
#define SWS_FAST_BILINEAR 1
#define SWS_BILINEAR 2
#define SWS_BICUBIC 4
#define SWS_X 8
#define SWS_POINT 0x10
#define SWS_AREA 0x20
#define SWS_BICUBLIN 0x40
#define SWS_GAUSS 0x80
#define SWS_SINC 0x100
#define SWS_LANCZOS 0x200
#define SWS_SPLINE 0x400
Which one is better for image quality? What are the differences? Are they all lossy?
I'm trying to convert RGB24 into YUV420P.

The RGB24 to YUV420 conversion itself is lossy. The scaling algorithm is probably used in downscaling the chroma information. I'd say the quality is:
point << bilinear < bicubic < lanczos/sinc/spline
I don't really know the others.
Under rare circumstances sinc is the ideal scaler and lossless, but those conditions are usually not met.
Are you also scaling the video? Otherwise I'd go for bicubic.


Turn off sw_scale conversion to planar YUV 32 byte alignment requirements

I am experiencing artifacts on the right edge of scaled and converted images when converting into planar YUV pixel formats with sw_scale. I am reasonably sure (although I can not find it anywhere in the documentation) that this is because sw_scale is using an optimization for 32 byte aligned lines, in the destination. However I would like to turn this off because I am using sw_scale for image composition, so even though the destination lines may be 32 byte aligned, the output image may not be.
Example.
Full output frame is 1280x720 yuv422p10le. (this is 32 byte aligned)
However into the top left corner I am scaling an image with an outwidth of 1280 / 3 = 426.
426 in this format is not 32 byte aligned, but I believe sw_scale sees that the output linesize is 32 byte aligned and overwrites the width of 426 putting garbage in the next 22 bytes of data thinking this is simply padding when in my case this is displayable area.
This is why I need to disable this optimization, or somehow trick sw_scale into believing it does not apply, while keeping the rest of the program intact (which otherwise works fine).
I have tried adding extra padding to the destination lines so they are no longer 32 byte aligned; this did not help as far as I can tell.
Edit: code example below (rendering omitted for ease of use).
Also, here is a similar issue; unfortunately, as I stated there, the fix will not work for my use case: https://github.com/obsproject/obs-studio/pull/2836
Use the commented line of code to swap between an output width which is and isn't 32 byte aligned.
#include "libswscale/swscale.h"
#include "libavutil/imgutils.h"
#include "libavutil/pixelutils.h"
#include "libavutil/pixfmt.h"
#include "libavutil/pixdesc.h"
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h> /* for memset() */
int main(int argc, char **argv) {
    /// Set up a 1280x720 window, and an item with 1/3 width and height of the window.
    int window_width, window_height, item_width, item_height;
    window_width = 1280;
    window_height = 720;
    item_width = (window_width / 3);
    item_height = (window_height / 3);
    int item_out_width = item_width;
    /// This line sets the item width to be 32 byte aligned; uncomment to see uncorrupted results.
    /// Note %16 because outformat is 2 bytes per component.
    //item_out_width -= (item_width % 16);
    enum AVPixelFormat outformat = AV_PIX_FMT_YUV422P10LE;
    enum AVPixelFormat informat = AV_PIX_FMT_UYVY422;
    int window_lines[4] = {0};
    av_image_fill_linesizes(window_lines, outformat, window_width);
    uint8_t *window_planes[4] = {0};
    window_planes[0] = calloc(1, window_lines[0] * window_height);
    window_planes[1] = calloc(1, window_lines[1] * window_height);
    window_planes[2] = calloc(1, window_lines[2] * window_height); /// Fill the window with all 0s; this is green in YUV.
    int item_lines[4] = {0};
    av_image_fill_linesizes(item_lines, informat, item_width);
    uint8_t *item_planes[4] = {0};
    item_planes[0] = malloc(item_lines[0] * item_height);
    memset(item_planes[0], 100, item_lines[0] * item_height);
    struct SwsContext *ctx;
    ctx = sws_getContext(item_width, item_height, informat,
                         item_out_width, item_height, outformat, SWS_FAST_BILINEAR, NULL, NULL, NULL);
    /// Check a block in the normal region
    printf("Pre scale normal region %d %d %d\n", (int)((uint16_t*)window_planes[0])[0], (int)((uint16_t*)window_planes[1])[0],
           (int)((uint16_t*)window_planes[2])[0]);
    /// Check a block in the corrupted region (should be all zeros); these values should be outside the converted region
    int corrupt_offset_y = (item_out_width + 3) * 2; /// (item_out_width + 3) * 2 bytes per component, Y PLANE
    int corrupt_offset_uv = (item_out_width + 3);    /// (item_out_width + 3) * (2 bytes per component >> 1 for horiz subsampling), U and V PLANES
    printf("Pre scale corrupted region %d %d %d\n", (int)(*((uint16_t*)(window_planes[0] + corrupt_offset_y))),
           (int)(*((uint16_t*)(window_planes[1] + corrupt_offset_uv))), (int)(*((uint16_t*)(window_planes[2] + corrupt_offset_uv))));
    sws_scale(ctx, (const uint8_t**)item_planes, item_lines, 0, item_height, window_planes, window_lines);
    /// Perform the same tests after scaling
    printf("Post scale normal region %d %d %d\n", (int)((uint16_t*)window_planes[0])[0], (int)((uint16_t*)window_planes[1])[0],
           (int)((uint16_t*)window_planes[2])[0]);
    printf("Post scale corrupted region %d %d %d\n", (int)(*((uint16_t*)(window_planes[0] + corrupt_offset_y))),
           (int)(*((uint16_t*)(window_planes[1] + corrupt_offset_uv))), (int)(*((uint16_t*)(window_planes[2] + corrupt_offset_uv))));
    return 0;
}
Example Output:
//No alignment
Pre scale normal region 0 0 0
Pre scale corrupted region 0 0 0
Post scale normal region 400 400 400
Post scale corrupted region 512 36865 36865
//With alignment
Pre scale normal region 0 0 0
Pre scale corrupted region 0 0 0
Post scale normal region 400 400 400
Post scale corrupted region 0 0 0
I believe sw_scale sees that the output linesize is 32 byte aligned and overwrites the width of 426 putting garbage in the next 22 bytes of data thinking this is simply padding when in my case this is displayable area.
That's actually correct; swscale indeed does that, good analysis. There are two ways to get rid of this:
disable all SIMD code using av_set_cpu_flags_mask(0).
write the re-scaled 426xN image in a temporary buffer and then manually copy the pixels into the unpadded destination plane.
The reason ffmpeg/swscale overwrites the destination is performance. If you don't care about runtime and want the simplest code, use the first solution. If you do want performance and don't mind slightly more complicated code, use the second solution.

How do I decompose the value of 192 in the code `SetConsoleTextAttribute(hStdout, 192)` into foreground and background color?

I'm trying to understand how SetConsoleTextAttribute works.
Per the definition in consoleapi2.h, this code
#include <windows.h>
#include <stdio.h>
int main(){
HANDLE hStdout = GetStdHandle(STD_OUTPUT_HANDLE);
SetConsoleTextAttribute(hStdout, FOREGROUND_RED| BACKGROUND_BLUE);
printf(" \n");
}
sets the console's foreground color to red and its background to blue.
This code examines the combined value, which is 20:
printf("%d", FOREGROUND_RED | BACKGROUND_BLUE);
per the definitions in consoleapi2.h:
#define FOREGROUND_RED 0x0004 // text color contains red.
#define BACKGROUND_BLUE 0x0010 // background color contains blue.
So far so good. I understood the code clearly until I tried the following code
#include <windows.h>
#include <stdio.h>
int main(){
HANDLE hStdout = GetStdHandle(STD_OUTPUT_HANDLE);
SetConsoleTextAttribute(hStdout, 192);
printf(" \n");
}
and got an unexpected result.
I made up the value 192, but I don't know how it works. How do I decompose the value into a foreground and a background color?
Could someone give me a clue?
The wAttributes parameter to SetConsoleTextAttribute is of type WORD, i.e. a 16-bit unsigned integer. The lower 8 bits of the character attributes encode the color information.
The following diagram illustrates what the individual bits mean:
+===+===+===+===+===+===+===+===+
| 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | bit
+===+===+===+===+===+===+===+===+
| I | R | G | B | I | R | G | B | meaning
+---+---+---+---+---+---+---+---+
| background | foreground | scope
+---+---+---+---+---+---+---+---+
RGB are the individual red-green-blue color channels. If the respective bit is set, the color channel is on, otherwise it is off. I designates the intensity. When set it selects the "bright" color, otherwise it refers to the "dark" variant.
The value 192 is 0b1100'0000 in binary. The I and R bits of the background color are set, meaning "bright red". None of the foreground color bits are set, so the foreground color is "black".

what is the difference between time_base and fps in x264 and ffmpeg?

There are codec parameters time_base and fps in x264. We can actually infer fps if we have time_base, so why does x264 provide an extra fps parameter? It really confuses me, and fps doesn't seem to have any effect if I set both of them. The ffmpeg encoder context also has these two parameters:
typedef struct x264_param_t
{
    /* ... */
    uint32_t i_fps_num;
    uint32_t i_fps_den;
    uint32_t i_timebase_num; /* Timebase numerator */
    uint32_t i_timebase_den; /* Timebase denominator */
    /* ... */
} x264_param_t;

Using atmega16 to reduce a DC motor speed with cytron md10c motor driver

I have a DC motor whose speed I want to reduce to 25%, so I used phase-correct PWM via the motor driver. I was able to do it through Timer1, but my assistant professor wants me to do it with the 8-bit Timer0. I wrote the code below and it ran, but the motor ran at full speed. Are there calculations that must be done before writing the code, and if so, what are they?
Note: the motor frequency is 100-250 Hz.
I am working with an internal frequency of 1 MHz and a prescaler of 1024.
#define F_CPU 1000000UL
#include <avr/io.h>
#include <util/delay.h>

int main(void)
{
    DDRB = DDRB | (1<<PB3); // set OC0 as output pin --> pin where the PWM signal is generated by the MCU
    /* set CS02:0 to 101 to work with the 1024 prescaler
       set WGM0[1:0] to 01 to work with phase-correct PWM
       set COM01:0 to 11 to set OC0 on compare match when up-counting, clear OC0 on compare match
       when down-counting */
    TCCR0 = 0b00111101;
    OCR0 = 64; // 25% DUTY CYCLE
    while (1)
    {
        ;
    }
}
Your question actually forces us to guess a bit - you're not giving enough facts to really help you. So I'll guess you're using fast PWM, and guess you're controlling the motor speed with the PWM duty cycle.
The motor frequency and prescaler values are actually not that interesting - if you want a speed reduction, you want to change the duty cycle, I assume.
A duty cycle of 25% on a 16-bit timer is $10000/4 = $4000 (I guess that's what you set the output compare register of your 16-bit timer to).
Obviously, on an 8-bit timer, a duty cycle of 25% is $100/4 = $40.
Also note that what you need to write into TCCR0 to achieve the same thing on Timer0 is entirely different from what you write into TCCR1 for pretty much the same action - the bit positions are completely different. Consult the data sheet; I think you got that wrong.
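For reference, a sketch of the Timer0 setup described above, using the bit names from the ATmega16 datasheet (untested; verify the register layout against the datasheet for your part):

```c
#define F_CPU 1000000UL
#include <avr/io.h>

int main(void)
{
    DDRB |= (1 << PB3);             // OC0 as output

    // Phase-correct PWM: WGM01:0 = 01 -> WGM00 set, WGM01 clear
    // Non-inverting:     COM01:0 = 10 -> clear OC0 on up-counting compare match
    // Prescaler 1024:    CS02:0  = 101
    TCCR0 = (1 << WGM00) | (1 << COM01) | (1 << CS02) | (1 << CS00);

    OCR0 = 64;                      // 64/255 is roughly a 25% duty cycle

    while (1)
        ;
}
```

Note that this bit pattern (0b01100101) differs from the 0b00111101 in the question: that value has WGM01 set instead of WGM00, which selects CTC mode rather than phase-correct PWM, and COM01:0 = 11 inverts the output.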

View .bin file (YCbCr 4:2:2 format)

I am given a .bin file. I know that the elements in this file correspond to Y Cb Cr values (4:2:2). Also, the data type is 8 bits. How can I view this?
I found a pretty good site: http://rawpixels.net/ which does what is expected but for YUV format. I want for YCbCr format.
A preliminary Google search gives conversion to RGB, which is not what I want.
I have attached an example .bin file on Dropbox. The size of the image is 720 x 576.
From Wikipedia
Y′CbCr is often confused with the YUV color space, and typically the
terms YCbCr and YUV are used interchangeably, leading to some
confusion; when referring to signals in video or digital form, the
term "YUV" mostly means "Y′CbCr".
If you are on a linux-based system and have access to ffmpeg, the following command correctly displays the data
ffplay -f rawvideo -video_size 720x576 -pix_fmt yuyv422 38.bin
Another good tool for displaying RGB/YCbCr images is vooya, which is free for Linux but not for Windows.
My own tool, yuv-viewer works as well.
Hope this helps.
You can up-sample the 4:2:2 down-sampled chroma like this:
////////////////////////////////////////////////////////////////////////////////
// unpack.c
// Mark Setchell
//
// Convert YCbCr 4:2:2 format file to full YCbCr without chroma subsampling
//
// Compile with:
// gcc -o unpack unpack.c
// Run with:
// ./unpack < input.bin > output.bin
////////////////////////////////////////////////////////////////////////////////
#include <stdio.h>

int main(void){
    unsigned char ibuf[4]; // Input data buffer format: Y Cb Y Cr
    unsigned char obuf[6]; // Output data buffer format: Y Cb Cr Y Cb Cr
    // Read 4 bytes at a time, and upsample chroma
    while (fread(ibuf, 4, 1, stdin) == 1) {
        obuf[0] = ibuf[0]; // Y0
        obuf[1] = ibuf[1]; // Cb
        obuf[2] = ibuf[3]; // Cr
        obuf[3] = ibuf[2]; // Y1
        obuf[4] = ibuf[1]; // Cb (repeated)
        obuf[5] = ibuf[3]; // Cr (repeated)
        fwrite(obuf, 6, 1, stdout);
    }
    return 0;
}
Then you would run this to up-sample:
./unpack < input.bin > output.bin
and then use ImageMagick convert to get a PNG (or JPEG, or TIF) like this:
convert -size 720x576 -depth 8 yuv:output.bin image.png
In theory, ImageMagick should be able to do the up sampling itself (and not need a C program) with a command line like this, but I can't seem to make it work:
convert -interlace none -sampling-factor 4:2:2 -size 720x576 -depth 8 yuv:input.bin image.jpg
If anyone knows why - please comment!
This is a slightly different version of my other answer, insofar as it up-samples the chroma, and also converts the YUV to RGB and then creates a NetPBM PNM format file. That means that you only need to install the pnmtopng utility from NetPBM to get to a PNM image - and NetPBM is much lighter weight and simpler to install than ImageMagick.
////////////////////////////////////////////////////////////////////////////////
// yuv2pnm.c
// Mark Setchell
//
// Convert YUV 4:2:2 format file to RGB PNM format without chroma subsampling
//
// Compile with:
// gcc -o yuv2pnm yuv2pnm.c
//
// Run with:
// ./yuv2pnm < input.bin > output.pnm
//
// and then use ImageMagick to go to PNG format, or JPEG or TIF, with
//
// convert output.pnm image.png
//
// or, all in one line (still with ImageMagick) to JPEG:
//
// ./yuv2pnm < input.bin | convert pnm:- image.jpg
//
// or, use the (simpler-to-install) NetPBM's "pnmtopng" to convert to a PNG file
//
// ./yuv2pnm < input.bin | pnmtopng - > image.png
////////////////////////////////////////////////////////////////////////////////
#include <stdio.h>

// Clamp to the valid 0..255 sample range. (A plain MIN() only caps the
// top end; negative conversion results would wrap around otherwise.)
static unsigned char clamp(int v){ return v < 0 ? 0 : (v > 255 ? 255 : v); }

void YUV2RGB(unsigned char Y, unsigned char U, unsigned char V, unsigned char *RGB)
{
    int R = Y + (1.370705 * (V - 128));
    int G = Y - (0.698001 * (V - 128)) - (0.337633 * (U - 128));
    int B = Y + (1.732446 * (U - 128));
    RGB[0] = clamp(R);
    RGB[1] = clamp(G);
    RGB[2] = clamp(B);
}

int main(int argc, char *argv[]){
    unsigned char buf[4]; // Input data buffer format: Y Cb Y Cr
    unsigned char RGB[6]; // Output data buffer format: R G B R G B
    int width = 720;
    int height = 576;
    // Write PNM header
    fprintf(stdout, "P6\n");
    fprintf(stdout, "%d %d\n", width, height);
    fprintf(stdout, "255\n");
    // Read 4 bytes at a time, upsample chroma and convert to 2 RGB pixels
    while (fread(buf, 4, 1, stdin) == 1) {
        YUV2RGB(buf[0], buf[1], buf[3], &RGB[0]);
        YUV2RGB(buf[2], buf[1], buf[3], &RGB[3]);
        fwrite(RGB, 6, 1, stdout);
    }
    return 0;
}
NetPBM format is described here. Note PNM is an abbreviation that includes PPM.
Below is another commonly used set of coefficients for converting YUV data into RGB:
R = Y + 1.4075 * (V - 128)
G = Y - 0.3455 * (U - 128) - (0.7169 * (V - 128))
B = Y + 1.7790 * (U - 128)
