I am decoding raw H.265 data using the avcodec_decode_video2 API. When I examine the resulting AVFrame instance, pictYUV, I see that pictYUV->format is AV_PIX_FMT_YUV420P and pictYUV->data[0] points to the Y plane. Both of these are expected. However, pictYUV->data[1] seems to contain the V-plane data and pictYUV->data[2] seems to contain the U-plane data. My intuition was that pictYUV->data would store the planes in Y, U, V order, not Y, V, U. Is the data always ordered as YVU, or is there some flag I failed to look at? Regards.
AV_PIX_FMT_YUV420P is a planar YUV format (note the P at the end of its name), so the Y, U, and V samples are stored in separate planes. There are also packed YUV formats, in which the components are interleaved.
If you are getting the data from an IP camera, it is normal to get a planar format.
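To make the planar layout concrete, here is a minimal Python sketch (not libav code) that splits one tightly packed YUV 4:2:0 frame into its three planes. As far as I know, libavutil defines AV_PIX_FMT_YUV420P with data[0] = Y, data[1] = U (Cb) and data[2] = V (Cr), and the sketch follows that order. The function name split_i420 is made up for illustration, and it assumes no row padding, whereas a real AVFrame may have linesize[i] larger than the plane width.

```python
def split_i420(frame: bytes, width: int, height: int):
    """Split one tightly packed YUV 4:2:0 planar frame into (Y, U, V) planes.

    Assumption: rows are not padded (stride == plane width).
    """
    # Each chroma plane is subsampled by 2 horizontally and vertically.
    cw, ch = (width + 1) // 2, (height + 1) // 2
    y_size, c_size = width * height, cw * ch
    if len(frame) != y_size + 2 * c_size:
        raise ValueError("unexpected frame size for YUV 4:2:0")
    y = frame[:y_size]
    u = frame[y_size:y_size + c_size]
    v = frame[y_size + c_size:]
    return y, u, v


# Tiny 4x2 frame: 8 luma bytes, then 2 U bytes, then 2 V bytes.
frame = bytes(range(8)) + b"\x64\x65" + b"\xc8\xc9"
y, u, v = split_i420(frame, 4, 2)
```

If the colours of a decoded picture look swapped, exchanging u and v in a sketch like this is a quick way to check whether the consumer expects the YV12-style order instead.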
Since what ffmpeg generally does is read an audio, image, or video file in a given codec and then convert it to a different codec, it must at some point hold the raw values of the media, which are:
for audio, the raw samples as int / float (e.g. 2*44100 samples per second for stereo audio)
rgba pixel data for images (as int8 array)
for video, an array of images plus the linked audio streams
How can I read those raw values and get them into memory or onto disk from, let's say, C++ / Python / Java?
best regards
ffmpeg is just a command-line tool. The libraries behind the scenes are part of the libav* family, i.e. libavformat, libavcodec, libavutil, libswscale, libswresample, etc.
You can use those libraries directly in C or C++, or through some sort of FFI in other languages. (You can also pipe some raw formats, such as y4m.)
Going from a file name to a frame buffer will take a little more code than just "open()", but there are many tutorials online, and other Stack Overflow questions that answer that.
Note:
rgba pixel data for images (as int8 array)
RGBA is not a very common format for video. It's usually YUV, usually with subsampled chroma planes. It's also usually planar, so instead of a single int8 array it's an array of pointers to several int8 arrays.
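For the piping route mentioned above, a minimal Python sketch might look like the following. It only covers the reading side: it pulls fixed-size raw frames out of any binary stream. The helper name read_raw_frames, the file name input.mp4 and the choice of RGBA are assumptions for illustration. The commented-out command uses the rawvideo output format and -pix_fmt option of the ffmpeg CLI, and you need to know the width and height in advance because raw video carries no header.

```python
import io


def read_raw_frames(stream, width, height, bytes_per_pixel=4):
    """Yield fixed-size raw frames (e.g. packed RGBA) from a binary stream."""
    frame_size = width * height * bytes_per_pixel
    while True:
        buf = stream.read(frame_size)
        if len(buf) < frame_size:  # end of stream, or a truncated last frame
            break
        yield buf


# Hypothetical usage with the ffmpeg CLI (input.mp4 is a placeholder):
#   import subprocess
#   proc = subprocess.Popen(
#       ["ffmpeg", "-i", "input.mp4", "-f", "rawvideo", "-pix_fmt", "rgba", "-"],
#       stdout=subprocess.PIPE)
#   for frame in read_raw_frames(proc.stdout, width, height):
#       ...

# Self-contained demo: three 2x2 RGBA frames of zeros in an in-memory stream.
demo = io.BytesIO(bytes(2 * 2 * 4 * 3))
frames = list(read_raw_frames(demo, 2, 2))
```

Asking for a planar pixel format instead (for example yuv420p) changes only the frame size and how each buffer is split into planes.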
I'm using FFMS2 (FFmpegSource), a wrapper around libav, to get video and audio frames from a file.
Video decoding is working fine. However I'm facing some issues with audio decoding.
FFMS2 provides a simple function, FFMS_GetAudio(FFMS_AudioSource *A, void *Buf, int64_t Start, int64_t Count, FFMS_ErrorInfo *ErrorInfo);, to get the decoded samples. The decoded data is returned in a buffer provided by the user.
For a single channel, interpreting the data is straightforward: the samples start at the first byte of the user buffer. With two channels, however, there are two possibilities: the decoded data could be planar or interleaved, depending on the sample format returned by FFMS_GetAudioProperties. In my case the sample format is always planar, which means the decoded data should be in two separate data planes, data[0] and data[1]. This is how libav/ffmpeg describes planar data, and PortAudio also treats planar data as two separate data planes.
However, FFMS_GetAudio takes just a single buffer from the user. So can I assume that, for planar data,
data[0] = buf, data[1] = buf + offset, where offset is half the length of the buffer returned by FFMS_GetAudio?
FFMS2 does not provide any good documentation for this interpretation. It would be a great help if someone could provide more information on this.
FFMS2 currently does not support outputting planar audio. More recent revisions (post-2.17) automatically interleave planar audio, while older versions from before libav added support for planar audio simply ignore all planes after the first.
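So with a recent FFMS2 the buffer would hold L0 R0 L1 R1 ... rather than two back-to-back planes. If you need planes (for example for an API that wants data[0] and data[1]), you can split the buffer yourself. Here is a minimal Python sketch of that de-interleaving step. It assumes signed 16-bit little-endian samples, which may not match your file (check the sample format reported by FFMS_GetAudioProperties), and deinterleave_s16 is a made-up name.

```python
import struct


def deinterleave_s16(buf: bytes, channels: int):
    """Split interleaved signed 16-bit little-endian PCM into per-channel lists."""
    count = len(buf) // 2  # number of 16-bit samples across all channels
    samples = struct.unpack("<%dh" % count, buf[:count * 2])
    # Channel ch owns every `channels`-th sample, starting at index ch.
    return [list(samples[ch::channels]) for ch in range(channels)]


# Three stereo sample pairs: (1, -1), (2, -2), (3, -3).
interleaved = struct.pack("<6h", 1, -1, 2, -2, 3, -3)
planes = deinterleave_s16(interleaved, 2)
```

Here planes[0] plays the role of data[0] (left) and planes[1] the role of data[1] (right).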
I'm building one part of an H.264 encoder. To test the system, I need to create input images for encoding. We have a program that reads an image in a raw file format into RAM for us to use.
My question is how to create a raw file: bitmap or TIFF (I don't want to use a compressed format like JPEG)? I googled and found a lot of raw file types. So which type should I use, and how do I create it? I think I will use C/C++ or Matlab to create the raw file.
P.S.: the format I need is YUV (or Y Cb Cr) 4:2:0 with 8-bit colour depth.
The easiest raw format is just a stream of numbers, representing the pixels. Each raw format can be associated with metadata such as:
width, height
row stride, i.e. bytes per image row (e.g. GStreamer and X Window align each row to dword boundaries)
bits per pixel
byte format / endianness (if 16 bits per pixel or more)
number of image channels
color system HSV, RGB, Bayer, YUV
order of channels, e.g. RGBA, ABGR, GBR
planar vs. packed (or FOURCC code)
or this metadata can be just an internal specification...
I believe one of the easiest approaches (after a steep learning curve, of course :) is to use e.g. GStreamer, where you can use existing file/stream sources that read data from a camera, a file, a pre-existing JPEG, etc., and pass those raw streams through a defined pipeline. One useful element is filesink, which simply writes one or a few successive raw data frames to your filesystem. The GStreamer infrastructure has possibly hundreds of converters and filters, including, by the way, an H.264 encoder...
I would bet that if you just dump your memory, the output will already conform to some FOURCC format (also recognized by GStreamer).
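If you only need a synthetic test input, the "stream of numbers" idea is easy to do by hand. Here is a minimal Python sketch that builds one 8-bit YUV 4:2:0 planar frame (Y plane, then U, then V, commonly called I420) with a horizontal luma ramp and neutral chroma. The function name, the test pattern and the output file name are all made up for illustration, and the sketch assumes even dimensions and no row padding.

```python
def make_i420_gradient(width: int, height: int) -> bytes:
    """Build one 8-bit YUV 4:2:0 planar frame: luma ramp, neutral chroma."""
    if width % 2 or height % 2:
        raise ValueError("this sketch assumes even dimensions")
    # Y plane: one byte per pixel, row by row, ramping 0..255 left to right.
    y_plane = bytes((x * 255) // (width - 1)
                    for _ in range(height) for x in range(width))
    # U and V planes: quarter resolution each; 128 means "no colour".
    chroma = bytes([128]) * ((width // 2) * (height // 2))
    return y_plane + chroma + chroma


frame = make_i420_gradient(176, 144)

# To put it on disk (the file name is just an example):
#   with open("test_176x144.yuv", "wb") as f:
#       f.write(frame)
```

Writing several such frames one after another gives a raw multi-frame clip; the width, height and layout then have to be passed to the encoder separately, since the file has no header.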
I'm trying to encode geometric data in an image file to decode in-browser using Canvas. Beyond what I learned from reading about the GIF, PNG and BMP formats today, I don't know much about image files (or binary files in general! I grok binary math conversions, but I've never had to interrogate or write binary data without something abstracting it for me).
This Mozilla tutorial (https://developer.mozilla.org/En/HTML/Canvas/Pixel_manipulation_with_canvas) indicates that Canvas reads the image as an array of 8-bit values, every four representing RGBA.
This leads me to believe I want to encode my data as an array of 8-bit values, and put it between an image header and an image footer.
What's the simplest way to do this?
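One possible route, sketched here in Python using only the standard library: PNG is lossless, so the byte values survive, but it is not quite "header + raw bytes + footer". The pixel rows are zlib-compressed and every chunk carries a CRC. The sketch stores three payload bytes per RGB pixel and pads the last row with zeros. The function names are made up, and using RGB rather than RGBA is my assumption, on the understanding that browsers may alter colour values of pixels that are not fully opaque.

```python
import struct
import zlib


def _chunk(tag: bytes, data: bytes) -> bytes:
    """One PNG chunk: big-endian length, type, data, CRC over type + data."""
    crc = zlib.crc32(tag + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + tag + data + struct.pack(">I", crc)


def bytes_to_png(payload: bytes, width: int) -> bytes:
    """Pack payload into an 8-bit RGB PNG, 3 payload bytes per pixel."""
    row_bytes = width * 3
    height = max(1, -(-len(payload) // row_bytes))  # ceiling division
    padded = payload.ljust(row_bytes * height, b"\x00")
    # Every scanline is prefixed with filter type 0 (no filtering).
    raw = b"".join(b"\x00" + padded[r * row_bytes:(r + 1) * row_bytes]
                   for r in range(height))
    # Bit depth 8, colour type 2 (RGB), default compression/filter, no interlace.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 2, 0, 0, 0)
    return (b"\x89PNG\r\n\x1a\n" + _chunk(b"IHDR", ihdr)
            + _chunk(b"IDAT", zlib.compress(raw)) + _chunk(b"IEND", b""))


png = bytes_to_png(bytes(range(10)), width=2)
```

On the Canvas side the decoder would then read the R, G and B entries of the pixel array and skip every fourth (alpha) value; the original payload length has to be stored somewhere too, for example in the first few bytes.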
I have a raw YUV file; all I know at this point is that the clip has a resolution of 176x144.
The Y plane is 176x144 = 25344 bytes, and the UV plane is half of that. I did some reading about YUV, and there are different formats corresponding to different ways the Y and UV planes are stored.
How can I perform some sort of check in Cocoa to find the raw YUV file format? Is there a file header in the YUV frame from which I can extract some information?
Thanks in advance to everyone
Unfortunately, if it's just a raw YUV stream, it will just be the data for the frames written to disk, one after another. There probably won't be a header that indicates what specific format is being used.
It sounds like you have determined that it's a YUV 4:2:2 stream, so you just need to determine the interleaving order (the most common possibilities are listed here). In response to your previous question, I posted a function which converts a frame from the UYVY (Y422) YUV format to the 2VUY format used by Apple's YUV OpenGL extension. Your best bet may be to try that out and see how the images look, then modify the interleaving format until the colors and image clear up.
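Since there is no header, one cheap sanity check before experimenting with byte orders is to see which chroma subsampling is even consistent with the file size. Here is a minimal sketch in Python (the same arithmetic carries over to Cocoa), assuming 8 bits per sample and no padding; the function names are made up. For 176x144, if U and V together take 12672 bytes the frame is 38016 bytes (4:2:0), while if U and V take 12672 bytes each it is 50688 bytes (4:2:2).

```python
def frame_sizes(width: int, height: int) -> dict:
    """Bytes per frame for common 8-bit YUV subsamplings (no padding assumed)."""
    luma = width * height
    return {
        "4:2:0": luma * 3 // 2,  # U and V each a quarter of the luma size
        "4:2:2": luma * 2,       # U and V each half of the luma size
        "4:4:4": luma * 3,       # U and V each the full luma size
    }


def candidate_layouts(file_size: int, width: int, height: int) -> dict:
    """Map each subsampling that divides file_size evenly to its frame count."""
    return {name: file_size // size
            for name, size in frame_sizes(width, height).items()
            if file_size % size == 0}


sizes = frame_sizes(176, 144)
# For a real file: candidate_layouts(os.path.getsize(path), 176, 144)
matches = candidate_layouts(266112, 176, 144)
```

This narrows down the subsampling only. It cannot distinguish planar from packed layouts or tell UYVY from YUY2, and some file sizes match more than one entry, so the visual check described above is still needed.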