Sobel edge-detection, weird output

I'm trying to implement a Sobel edge-detection algorithm for a YUV camera stream. Initially it seemed quite easy, but I'm not sure if this approach is correct:
I'm applying the filter only to the Y component of each pixel and setting U and V to 0 (black-and-white image).
Afterwards, to check the result, I'm sending frames through the serial port, but first I convert the image from YUV to JPG.
The black-and-white image works perfectly and I can see it in the PC application I wrote, but when I apply the Sobel filter to the Y component I get this:
The code:
#define index(xx, yy) ((yy * width + xx) * 2) & 0xFFFFFFFE // address multiple of 2
(...............)
for (y = 1; y < height - 1; y++) {
    for (x = 1; x < width - 1; x++) {
        pixel_valueY_h = 0.0;
        pixel_valueY_v = 0.0;
        for (j = -1; j < 2; j++) {
            for (i = -1; i < 2; i++) {
                offset = index(x + i, y + j);
                pixel_valueY_h += (sobel_h[j + 1][i + 1]) * input[offset + 1]; // offset+1 => Y component
                pixel_valueY_v += (sobel_v[j + 1][i + 1]) * input[offset + 1];
            }
        }
        offset = index(x, y);
        pixel_value = sqrt1((pixel_valueY_h * pixel_valueY_h) + (pixel_valueY_v * pixel_valueY_v));
        if (pixel_value > 255) pixel_value = 255;
        if (pixel_value < 0)   pixel_value = 0;
        // output frame
        output[offset]     &= 0x00;                              // U and V components = 0
        output[offset + 1] &= (255 - (unsigned char)pixel_value);
    }
}
(...............)
(...............)
Any clue about what is happening?
Thanks in advance.

Finally I've got it working. The problem was the memory addressing done by the macro #define index(xx, yy) ((yy * width + xx) * 2) & 0xFFFFFFFE, which was producing incorrect addresses, most likely because the macro arguments are not parenthesized: a call like index(x + i, y + j) expands to y + j * width + x + i rather than (y + j) * width + (x + i).
Instead, I wrote the expression (((yy * width + xx) * 2) & 0xFFFFFFFE) inline in the code (with the actual coordinates substituted), and like that, without any other modifications, it works perfectly.
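For what it's worth, here is a tiny standalone demo of that expansion problem and of a parenthesized version of the macro. This is my own sketch (index_bad / index_good are made-up names), not code from the project above:

#include <stdio.h>

/* Unparenthesized version from the question: index_bad(x + i, y + j) expands to
 * ((y + j * width + x + i) * 2) & 0xFFFFFFFE, which is the wrong address. */
#define index_bad(xx, yy)  ((yy * width + xx) * 2) & 0xFFFFFFFE

/* Parenthesized version: each argument is evaluated as a whole expression. */
#define index_good(xx, yy) ((((yy) * width + (xx)) * 2) & 0xFFFFFFFE)

int main(void)
{
    int width = 320, x = 10, y = 5, i = 1, j = 1;
    /* bad:  5 + 1*320 + 10 + 1 = 336, then *2 = 672 */
    printf("bad:  %u\n", (unsigned)(index_bad(x + i, y + j)));
    /* good: (6*320 + 11) * 2 = 3862 */
    printf("good: %u\n", (unsigned)(index_good(x + i, y + j)));
    return 0;
}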
Thanks.

Related

FFmpeg color correction algorithm

I'm trying to sync CSS and FFmpeg color correction. The goal is to create a tool that converts CSS brightness/saturation/contrast/gamma filter values to the corresponding FFmpeg values and vice versa.
e.g.
-vf "eq=brightness=0.3:saturation=1.3:contrast=1.1"
→
filter="brightness(30%) saturate(130%) contrast(110%)"
While the algorithms for the CSS properties are available at the W3C, I have failed to find the ones for FFmpeg. I've tried digging through GitHub: starting from here I've unfolded function calls, but it is "a bit" too hard to navigate a project that is 20 years and 104k commits old. :)
I'll be very grateful if anyone can help me figure out the precise formulas for brightness, saturation, contrast, and gamma. Any hints. Thanks.
This is the core function:
static void create_lut(EQParameters *param)
{
    int i;
    double g = 1.0 / param->gamma;
    double lw = 1.0 - param->gamma_weight;

    for (i = 0; i < 256; i++) {
        double v = i / 255.0;
        v = param->contrast * (v - 0.5) + 0.5 + param->brightness;

        if (v <= 0.0) {
            param->lut[i] = 0;
        } else {
            v = v * lw + pow(v, g) * param->gamma_weight;

            if (v >= 1.0)
                param->lut[i] = 255;
            else
                param->lut[i] = 256.0 * v;
        }
    }

    param->lut_clean = 1;
}
The filter operates only on 8-bit YUV inputs. This function creates a look-up table (LUT) mapping every 8-bit input value 0-255 to an output value; the table is then applied to the input pixels.
The functions with names of the form set_parameter, such as set_gamma, convert the user-supplied argument into the final value used in the function above. contrast is applied only to the luma plane; saturation only to the chroma planes.
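For illustration, here is a minimal sketch (mine, not FFmpeg's; apply_eq_lut and its parameter names are my own) of building a LUT with exactly that formula and running it over an 8-bit luma plane:

#include <math.h>
#include <stddef.h>
#include <stdint.h>

/* Build a 256-entry LUT from eq-style parameters as described above and apply
 * it to an 8-bit luma plane. Illustrative sketch only, not the filter itself. */
static void apply_eq_lut(uint8_t *luma, size_t count,
                         double contrast, double brightness,
                         double gamma, double gamma_weight)
{
    uint8_t lut[256];
    double g  = 1.0 / gamma;
    double lw = 1.0 - gamma_weight;

    for (int i = 0; i < 256; i++) {
        double v = i / 255.0;
        v = contrast * (v - 0.5) + 0.5 + brightness;   /* contrast around mid-gray, then brightness */
        if (v <= 0.0) {
            lut[i] = 0;
        } else {
            v = v * lw + pow(v, g) * gamma_weight;      /* blend linear and gamma-corrected value */
            lut[i] = v >= 1.0 ? 255 : (uint8_t)(256.0 * v);
        }
    }

    for (size_t n = 0; n < count; n++)
        luma[n] = lut[luma[n]];
}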

Direct Access to CreateDIBitmap Bits

[The final fix, which works unconditionally: use SetDIBitsToDevice, not BitBlt, to copy out the post-text-draw image data. With this change, all occurrences of the problem are gone.]
I fixed the problem I'm having, but for the life of me I can't figure out why it occurred.
1. Create a bitmap with CreateDIBitmap. Get a pointer to the bitmap bits.
2. Select the bitmap into a memory DC.
3. Background fill the bitmap by directly writing the bitmap memory.
4. TextOut.
5. No text displays.
What fixed the problem: changing item 3 from a direct fill to a call to FillRect. All is well; it works perfectly.
This is under Windows 10 but from what little I could find on the web, it spans all versions of Windows. NO operations work on the bitmap - even calling FillRect - after the manual write. No savvy, Kimosabe. Elsewhere in the app, I even build gradient fills by directly writing to that bitmap memory and there is no problem. But once TextOut is called after the manual fill, the bitmap is locked (effectively) and no further functions work on it - nor do any return an error.
I'm using a font with a 90 degree escapement. Have not tried it with a "normal" font, 0 degree escapement. DrawTextEx with DT_CALCRECT specifically states it only works on 0 degree escapement fonts so I had to use TextOut for this reason.
Very bizarre.
No, there were no stupid mistakes like using the same text color as the background color. I've spent too long on this for that. One option people have available is that the endless energy that would normally be spent destroying the question and/or the person who asked it could instead be used to write a few lines of code and try it for yourself.
Here's a function to make a bitmap. Don't pass a plain colour, pass a gradient fill, say going from white to pinkish.
Does it display correctly? If so, does the TextOut call on top of that work?
static HBITMAP MakeBitmap(unsigned char *rgba, int width, int height, VOID **buff)
{
    VOID *pvBits;            // pointer to DIB section bits
    HBITMAP answer;
    BITMAPINFO bmi;
    HDC hdcScreen, hdc;
    int x, y;
    int red, green, blue, alpha;

    // set up bitmap info (zero the rest of the header first)
    memset(&bmi, 0, sizeof(bmi));
    bmi.bmiHeader.biSize = sizeof(BITMAPINFOHEADER);
    bmi.bmiHeader.biWidth = width;
    bmi.bmiHeader.biHeight = height;
    bmi.bmiHeader.biPlanes = 1;
    bmi.bmiHeader.biBitCount = 32;           // four 8-bit components
    bmi.bmiHeader.biCompression = BI_RGB;
    bmi.bmiHeader.biSizeImage = width * height * 4;

    hdcScreen = GetDC(NULL);
    hdc = CreateCompatibleDC(hdcScreen);
    answer = CreateDIBSection(hdc, &bmi, DIB_RGB_COLORS, &pvBits, NULL, 0x0);

    // copy the RGBA input into the DIB, premultiplying by alpha and
    // converting to the bottom-up BGRA layout of the DIB section
    for (y = 0; y < height; y++)
    {
        for (x = 0; x < width; x++)
        {
            red   = rgba[(y * width + x) * 4];
            green = rgba[(y * width + x) * 4 + 1];
            blue  = rgba[(y * width + x) * 4 + 2];
            alpha = rgba[(y * width + x) * 4 + 3];
            red   = (red   * alpha) >> 8;
            green = (green * alpha) >> 8;
            blue  = (blue  * alpha) >> 8;
            ((UINT32 *)pvBits)[(height - y - 1) * width + x] =
                (alpha << 24) | (red << 16) | (green << 8) | blue;
        }
    }

    DeleteDC(hdc);
    ReleaseDC(NULL, hdcScreen);   // don't leak the screen DC
    *buff = pvBits;
    return answer;
}
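If anyone wants to reproduce it, here is a rough usage sketch of the sequence that ended up working (FillRect for the background, TextOut, then SetDIBitsToDevice to get the result out). This is my own sketch; the sizes, brush, text, and the DemoTextOnDib name are placeholders:

#include <windows.h>
#include <stdlib.h>
#include <string.h>

/* Rough repro sketch: build the DIB section with MakeBitmap() above, select it
 * into a memory DC, fill the background with FillRect (the variant that worked),
 * draw text with TextOut, then push the bits to the target DC with
 * SetDIBitsToDevice. */
static void DemoTextOnDib(HDC hdcTarget)
{
    const int width = 256, height = 128;
    unsigned char *rgba = (unsigned char *)malloc((size_t)width * height * 4);
    VOID *bits = NULL;

    memset(rgba, 0xFF, (size_t)width * height * 4);        /* opaque white input */
    HBITMAP hbm = MakeBitmap(rgba, width, height, &bits);

    HDC hdcMem = CreateCompatibleDC(hdcTarget);
    HGDIOBJ old = SelectObject(hdcMem, hbm);

    RECT rc = { 0, 0, width, height };
    FillRect(hdcMem, &rc, (HBRUSH)GetStockObject(LTGRAY_BRUSH)); /* GDI fill, not a direct write */

    SetBkMode(hdcMem, TRANSPARENT);
    SetTextColor(hdcMem, RGB(0, 0, 0));
    TextOut(hdcMem, 10, 10, TEXT("Hello"), 5);

    /* Describe the DIB for SetDIBitsToDevice (same layout MakeBitmap used). */
    BITMAPINFO bmi;
    memset(&bmi, 0, sizeof(bmi));
    bmi.bmiHeader.biSize        = sizeof(BITMAPINFOHEADER);
    bmi.bmiHeader.biWidth       = width;
    bmi.bmiHeader.biHeight      = height;      /* bottom-up, as in MakeBitmap */
    bmi.bmiHeader.biPlanes      = 1;
    bmi.bmiHeader.biBitCount    = 32;
    bmi.bmiHeader.biCompression = BI_RGB;

    GdiFlush();  /* make sure GDI has finished drawing before the bits are read */
    SetDIBitsToDevice(hdcTarget, 0, 0, width, height, 0, 0, 0, height,
                      bits, &bmi, DIB_RGB_COLORS);

    SelectObject(hdcMem, old);
    DeleteDC(hdcMem);
    DeleteObject(hbm);
    free(rgba);
}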

Can't isolate pixels from av_frame_copy_to_buffer

I'm trying to pull the YUV pixel data from an AVFrame, modify the pixels, and put it back into FFmpeg.
I'm currently using this to retrieve the YUV buffer
const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(base->format);
int baseSize = av_image_get_buffer_size(base->format, base->width, base->height, 32);
uint8_t *baseBuffer = (uint8_t*)malloc(baseSize);
av_image_copy_to_buffer(baseBuffer, baseSize, base->data, base->linesize, base->format, base->width, base->height, 32);
But I can't seem to correctly target pixels in that buffer. From the source code, the planes appear to be stacked on top of each other in the buffer, which led me to attempt this:
int width  = base->width;
int height = base->height;
int chroma2h = desc->log2_chroma_h;
int linesizeY = base->linesize[0];
int linesizeU = base->linesize[1];
int linesizeV = base->linesize[2];
int chromaHeight = (height + (1 << chroma2h) - 1) >> chroma2h;

int x = 100;
int y = 100;

uint8_t *vY = baseBuffer;
uint8_t *vU = baseBuffer + (linesizeY * height);
uint8_t *vV = baseBuffer + (linesizeY * height) + (linesizeU * chromaHeight);

vY += x + (y * linesizeY);
vU += x + (y * linesizeU);
vV += x + (y * linesizeV);
Using that, if I try to modify pixels in the range 300,300 to 400,400 I get a small box darker than the rest of the video, along with horizontal stripes of darkness across the video. The original color is still there, so I think I'm still touching the Y plane with all three pointers.
How can I actually hit the pixels I want to hit?
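In case a concrete reference helps, here is a minimal sketch of addressing one pixel directly through the frame's plane pointers rather than through a copied buffer. It is my own code, assuming an 8-bit planar format such as AV_PIX_FMT_YUV420P; poke_pixel is a made-up name:

#include <libavutil/frame.h>
#include <libavutil/pixdesc.h>

/* Sketch: read/modify the pixel at (x, y) straight from an AVFrame's planes.
 * Each plane has its own linesize, and chroma coordinates are scaled by the
 * subsampling shifts. Call av_frame_make_writable(frame) first if the frame's
 * buffers may be shared. */
static void poke_pixel(AVFrame *frame, int x, int y, uint8_t newY)
{
    const AVPixFmtDescriptor *desc =
        av_pix_fmt_desc_get((enum AVPixelFormat)frame->format);
    int cx = x >> desc->log2_chroma_w;
    int cy = y >> desc->log2_chroma_h;

    uint8_t *pY = frame->data[0] + y  * frame->linesize[0] + x;
    uint8_t *pU = frame->data[1] + cy * frame->linesize[1] + cx;
    uint8_t *pV = frame->data[2] + cy * frame->linesize[2] + cx;

    *pY = newY;          /* e.g. darken or brighten the luma */
    (void)pU; (void)pV;  /* chroma left untouched in this sketch */
}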

How to transform mouse location in isometric tiling map?

So I've managed to write the first part (the algorithm) that calculates each tile's position for drawing the map (see below). However, I need to be able to convert the mouse location to the corresponding cell, and I've been almost pulling my hair out because I can't figure out how to get the cell from the mouse location. My concern is that it involves some pretty heavy math, or maybe it's something easy that I'm just not able to notice.
For example, if the mouse position is 112;35, how do I calculate/transform it to get that the cell at that position is 2;3?
Maybe there is some really good math-minded programmer here who can help me with this, or someone who knows how to do it or can give some information?
var cord:Point = new Point();
cord.x = (x - 1) * 28 + (y - 1) * 28;
cord.y = (y - 1) * 14 + (x - 1) * (- 14);
Speaking of the map, each cell (a transparent tile of 56x28 pixels) is placed in the center of the previous cell (or at the zero position for cell 1;1); above is the code I use for converting cell-to-position. I tried a lot of things and calculations for position-to-cell, but each of them failed.
Edit:
After reading a lot of information, it seems that using an off-screen color map (where colors are mapped to tiles) is the fastest and most efficient solution?
I know this is an old post, but I want to update it since some people might still be looking for answers to this issue, just like I was earlier today. However, I figured this out myself. There is also a much better way to render this so you don't get tile-overlapping issues.
The code is as simple as this:
mouse_grid_x = floor((mouse_y / tile_height) + (mouse_x / tile_width));
mouse_grid_y = floor((-mouse_x / tile_width) + (mouse_y / tile_height));
mouse_x and mouse_y are mouse screen coordinates.
tile_height and tile_width are the actual tile size, not the size of the image itself. As you can see in my example picture, I've added dirt under my tile; this is just for easier rendering, the actual size is 24 x 12. The coordinates are also "floored" to keep the resulting grid x and y rounded down.
Also notice that I render these tiles starting from y = 0 and x = tile_width / 2 (the red dot). This means my 0,0 actually starts at the top corner of the tile (tilted) and not out in open air. Think of these tiles as rotated squares; you still want to start from the 0,0 pixel.
Tiles are rendered beginning with Y = 0 and X = 0 up to the map size. After the first row is rendered you skip a few pixels down and to the left; this makes the next line of tiles overlap the first one, which is a great way to keep the layers overlapping correctly. You should render the tile, then whatever is on that tile, before moving on to the next.
I'll add a render example too:
for (yy = 0; yy < map_height; yy++)
{
    for (xx = 0; xx < map_width; xx++)
    {
        // draw tiles here with tile coordinates:
        tile_x = (xx * 12) - (yy * 12) - (tile_width / 2)
        tile_y = (yy * 6) + (xx * 6)
        // also draw whatever is on this tile here before moving on
    }
}
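For completeness, here is the same idea sketched in C (tile size 24x12 as above; draw_tile() is a hypothetical stand-in for whatever the engine uses to blit a tile):

#include <math.h>

#define TILE_WIDTH  24
#define TILE_HEIGHT 12

/* Hypothetical stub: the real code would blit the tile sprite at (px, py). */
static void draw_tile(int xx, int yy, int px, int py)
{
    (void)xx; (void)yy; (void)px; (void)py;
}

/* Render loop: each row shifts half a tile down and to the left, so later
 * rows overlap earlier ones correctly. */
static void render_map(int map_width, int map_height)
{
    for (int yy = 0; yy < map_height; yy++) {
        for (int xx = 0; xx < map_width; xx++) {
            int tile_x = (xx * TILE_WIDTH / 2) - (yy * TILE_WIDTH / 2) - (TILE_WIDTH / 2);
            int tile_y = (yy * TILE_HEIGHT / 2) + (xx * TILE_HEIGHT / 2);
            draw_tile(xx, yy, tile_x, tile_y);
            /* draw whatever sits on this tile here before moving on */
        }
    }
}

/* Picking: the floor-based formula from the answer above. */
static void pick_cell(double mouse_x, double mouse_y, int *grid_x, int *grid_y)
{
    *grid_x = (int)floor( mouse_y / TILE_HEIGHT + mouse_x / TILE_WIDTH);
    *grid_y = (int)floor(-mouse_x / TILE_WIDTH  + mouse_y / TILE_HEIGHT);
}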
(1) x' = 28x - 28 + 28y - 28 = 28x + 28y - 56
(2) y' = -14x + 14 + 14y - 14 = -14x + 14y

Transformation matrix:

[ 28   28  -56 ]   [x]   [x']
[-14   14    0 ] * [y] = [y']
[  0    0    1 ]   [1]   [1 ]

Invert it:

[ 28   28  -56 ]^-1
[-14   14    0 ]
[  0    0    1 ]

Calculate that with a plotter (I like wims):

[ 1/56  -1/28   1 ]
[ 1/56   1/28   1 ]
[    0      0   1 ]

x = (1/56)x' - (1/28)y' + 1
y = (1/56)x' + (1/28)y' + 1
I rendered the tiles like above.
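In code form, the inverted transform above is just this (a small sketch of mine, using the same 56x28 tile constants as the question):

/* Sketch: apply the inverse of the cell-to-position transform derived above.
 * Screen coordinates (xs, ys) map back to fractional, 1-based cell coordinates;
 * round or floor the results to pick the containing cell. */
static void screen_to_cell(double xs, double ys, double *cell_x, double *cell_y)
{
    *cell_x = xs / 56.0 - ys / 28.0 + 1.0;
    *cell_y = xs / 56.0 + ys / 28.0 + 1.0;
}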
The solution is VERY simple!
First thing: my tile width and height are both 32. This means that in the isometric view the width = 32 and the height = 16!
MapHeight in this case is 5 (the max Y value).
y_iso and x_iso == 0 when y_mouse = MapHeight/tilewidth/2 and x_mouse = 0;
when x_mouse += 1, y_iso -= 1.
So first of all I calculate the "per-pixel transformation":
TileY = ((y_mouse * 2) - ((MapHeight * tilewidth) / 2) + x_mouse) / 2;
TileX = x_mouse - TileY;
To find the tile coordinates I just divide both by the tile width:
TileY = TileY / 32;
TileX = TileX / 32;
DONE!
I never had any problems!
I found an algorithm on this site: http://www.tonypa.pri.ee/tbw/tut18.html. I couldn't get it to work properly, but I changed it by trial and error to this form and it works for me now.
int x = mouse.x + offset.x - tile[0;0].x; // tile[0;0].x is the x value from which the map was drawn
int y = mouse.y + offset.y;
double _x = ((2 * y + x) / 2);
double _y = ((2 * y - x) / 2);
double tileX = Math.round(_x / (tile.height - 1)) - 1;
double tileY = Math.round(_y / (tile.height - 1));
This is my map generation:
for (int x = 0; x < max_X; x++)
    for (int y = 0; y < max_Y; y++)
        map.drawImage(image,
            ((max_X - 1) * tile.width / 2) - ((tile.width - 1) / 2 * (y - x)),
            ((tile.height - 1) / 2) * (y + x));
One way would be to rotate it back to a square projection:
First translate y so that the dimensions are relative to the origin:
x0 = x_mouse;
y0 = y_mouse-14
Then scale by your tile size:
x1 = x/28; //or maybe 56?
y1 = y/28
Then rotate by the projection angle
a = atan(2/1);
x_tile = x1 * cos(a) - y1 * sin(a);
y_tile = y1 * cos(a) + x1 * sin(a);
I may be missing a minus sign, but that's the general idea.
Although you didn't mention it in your original question, I think you said in the comments that you're programming this in Flash, in which case Flash comes with Matrix transformation functions. The most robust way to convert between coordinate systems (e.g. to isometric coordinates) is using Matrix transformations:
http://help.adobe.com/en_US/FlashPlatform/reference/actionscript/3/flash/geom/Matrix.html
You would want to rotate and scale the matrix in the inverse of how you rotated and scaled the graphics.

Help please with CGBitmapContext and 16 bit images

I'd LOVE to know what I'm doing wrong here. I'm a bit of a newbie with CGImageRefs, so any advice would help.
I'm trying to create a bitmap image whose pixel values are a weighted sum of the pixels from another bitmap, where both bitmaps are 16 bits per channel. For some reason I had no trouble getting this to work with 8-bit images, but it fails miserably with 16-bit. My guess is that I'm just not setting things up correctly. I've tried using CGFloat, float, and UInt16 as the data types, but nothing has worked. The input image has no alpha channel. The output image I get looks like colored snow.
relevant stuff from the header:
UInt16 *inBaseAddress;
UInt16 *outBaseAddress;
CGFloat inAlpha[5];
CGFloat inRed[5];
CGFloat inGreen[5];
CGFloat inBlue[5];
CGFloat alphaSum, redSum, greenSum, blueSum;
int shifts[5];
CGFloat weight[5];
CGFloat weightSum;
I create the context for the input bitmap (a CGImageRef created with CGImageSourceCreateImageAtIndex(source, 0, NULL)) using:
size_t width = CGImageGetWidth(inBitmap);
size_t height = CGImageGetHeight(inBitmap);
size_t bitmapBitsPerComponent = CGImageGetBitsPerComponent(inBitmap);
size_t bitmapBytesPerRow = (width * 4 * bitmapBitsPerComponent / 8);
CGColorSpaceRef colorSpace = CGImageGetColorSpace(inBitmap);
CGBitmapInfo bitmapInfo = kCGBitmapByteOrderDefault | kCGImageAlphaNoneSkipLast;
CGContextRef inContext = CGBitmapContextCreate(NULL, width, height, bitmapBitsPerComponent,
                                               bitmapBytesPerRow, colorSpace, bitmapInfo);
The context for the output bitmap is created in the same way. I draw the inBitmap into the inContext using:
CGRect rect = {{0,0},{width,height}};
CGContextDrawImage(inContext, rect, inBitmap);
Then I initialize the inBaseAddress and outBaseAddress like so:
inBaseAddress = CGBitmapContextGetData(inContext);
outBaseAddress = CGBitmapContextGetData(outContext);
Then I fill the outBaseAddress with values from the inBaseAddress:
for (n = 0; n < 5; n++)
{
    inRed[n]   = inBaseAddress[inSpot + 0 + shifts[n]];
    inGreen[n] = inBaseAddress[inSpot + 1 + shifts[n]];
    inBlue[n]  = inBaseAddress[inSpot + 2 + shifts[n]];
    inAlpha[n] = inBaseAddress[inSpot + 3 + shifts[n]];
}

alphaSum = 0.0;
redSum = 0.0;
greenSum = 0.0;
blueSum = 0.0;

for (n = 0; n < 5; n++)
{
    redSum   += inRed[n]   * weight[n];
    greenSum += inGreen[n] * weight[n];
    blueSum  += inBlue[n]  * weight[n];
    alphaSum += inAlpha[n] * weight[n];
}

outBaseAddress[outSpot + 0] = (UInt16)roundf(redSum);
outBaseAddress[outSpot + 1] = (UInt16)roundf(greenSum);
outBaseAddress[outSpot + 2] = (UInt16)roundf(blueSum);
outBaseAddress[outSpot + 3] = (UInt16)roundf(alphaSum);
As a simple check I've tried:
outBaseAddress[outSpot + 0] = inBaseAddress[inSpot + 0];
outBaseAddress[outSpot + 1] = inBaseAddress[inSpot + 1];
outBaseAddress[outSpot + 2] = inBaseAddress[inSpot + 2];
outBaseAddress[outSpot + 3] = inBaseAddress[inSpot + 3];
which works and at least means that the contexts and pointers to the bitmap data are working.
Thanks for any input. This has been pretty frustrating since it worked just fine with 8bit images.
OK, I've got it figured out. I needed to set the bitmapInfo to kCGBitmapByteOrder16Little for the 16-bit images and to kCGBitmapByteOrder32Little for the 8-bit images. I'm a bit surprised by this actually, as I would have expected it to be the other way around (32Little for 16-bit and 16Little for 8-bit).
I also needed to declare the pointers to the bitmap data as UInt8* and UInt16* respectively. It also appears that I have to include an alpha channel in the bitmapContext. I'm not sure why, but the context returned was always nil without it.
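For reference, here is a minimal sketch of creating such a context the way the fix describes. This is my reconstruction, not the original code: Create16BitContext is a made-up name, I've assumed a DeviceRGB color space, and the exact alpha constant (kCGImageAlphaPremultipliedLast here) may need adjusting for your pipeline:

#include <CoreGraphics/CoreGraphics.h>

/* Sketch: create a 16-bit-per-component RGBA bitmap context with an alpha
 * channel and kCGBitmapByteOrder16Little, and return a UInt16 pointer to its
 * pixels. Error handling is omitted. */
static CGContextRef Create16BitContext(size_t width, size_t height, UInt16 **outPixels)
{
    size_t bitsPerComponent = 16;
    size_t bytesPerRow = width * 4 * bitsPerComponent / 8;   /* 4 components per pixel */
    CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
    CGBitmapInfo bitmapInfo = kCGImageAlphaPremultipliedLast | kCGBitmapByteOrder16Little;

    CGContextRef ctx = CGBitmapContextCreate(NULL, width, height,
                                             bitsPerComponent, bytesPerRow,
                                             colorSpace, bitmapInfo);
    CGColorSpaceRelease(colorSpace);

    if (outPixels)
        *outPixels = (UInt16 *)CGBitmapContextGetData(ctx);
    return ctx;
}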
It sounds like a byte-ordering problem.
Have you checked that CGImageGetBitsPerComponent is returning 16? As a matter of style, if you're assuming you're creating a bitmap context with 16 bits per component (since you treat the data as UInt16*), you should explicitly set size_t bitmapBitsPerComponent = 16.
What is your shifts array for? It seems like the most likely place for an error, since it affects the address you're reading from, but you don't explain it at all. Are the values in shifts multiples of 16?
