How to create a RAW image?

I'm building one part of an H.264 encoder. To test the system, I need to create input images for encoding. We already have a program that reads an image file into RAM in the format we use.
My question is: how do I create a raw file, bitmap or TIFF (I don't want to use a compressed format like JPEG)? I googled and found a lot of raw file types. So which type should I use, and how do I create it? I think I will use C/C++ or MATLAB to create the raw file.
P.S.: the format I need is YUV (or Y'CbCr) 4:2:0 with 8-bit colour depth.

The easiest raw format is just a stream of numbers, representing the pixels. Each raw format can be associated with metadata such as:
width, height
stride (bytes per image row; e.g. gstreamer & X Window align each row to dword boundaries)
bits per pixel
byte format / endianness (if 16 bits per pixel or more)
number of image channels
color system: HSV, RGB, Bayer, YUV
order of channels, e.g. RGBA, ABGR, GBR
planar vs. packed (or FOURCC code)
or this metadata can be just an internal specification...
I believe one of the easiest approaches (after, of course, a steep learning curve :) is to use e.g. gstreamer, where you can use existing file/stream sources that read data from a camera, a file, a pre-existing JPEG, etc. and pass those raw streams through a defined pipeline. One useful element is a filesink, which would simply write a single raw data frame, or a few successive ones, to your filesystem. The gstreamer infrastructure has possibly hundreds of converters and filters, btw, including an h264 encoder...
I would bet that if you just dump your memory buffer, the output will already conform to some FOURCC format (also recognized by gstreamer).
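If you would rather write such a file yourself than go through gstreamer, here is a minimal C++ sketch that dumps one planar YUV 4:2:0 (I420) frame, matching the format asked for above. The QCIF resolution, the gradient test pattern and the file name are only illustrative choices, not anything the question prescribes.

// Minimal sketch: write one planar YUV 4:2:0 (I420) frame as a raw file.
// The 176x144 size, the gradient pattern and "test.yuv" are illustrative.
#include <cstdint>
#include <cstdio>
#include <vector>

int main() {
    const int width = 176, height = 144;          // QCIF, just an example
    std::vector<uint8_t> y(width * height);
    std::vector<uint8_t> u((width / 2) * (height / 2));
    std::vector<uint8_t> v((width / 2) * (height / 2));

    // Fill with a simple gradient so the frame is easy to recognize visually.
    for (int r = 0; r < height; ++r)
        for (int c = 0; c < width; ++c)
            y[r * width + c] = static_cast<uint8_t>((r + c) & 0xFF);
    for (int r = 0; r < height / 2; ++r)
        for (int c = 0; c < width / 2; ++c) {
            u[r * (width / 2) + c] = 128;          // neutral chroma
            v[r * (width / 2) + c] = 128;
        }

    // Raw I420 layout: Y plane, then U, then V, with no header at all.
    FILE* f = std::fopen("test.yuv", "wb");
    if (!f) return 1;
    std::fwrite(y.data(), 1, y.size(), f);
    std::fwrite(u.data(), 1, u.size(), f);
    std::fwrite(v.data(), 1, v.size(), f);
    std::fclose(f);
    return 0;
}

For a quick visual check, a raw-video-aware player such as ffplay -f rawvideo -pixel_format yuv420p -video_size 176x144 test.yuv should display the frame (exact option names may vary between ffmpeg versions).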

Related

ffmpeg - Read raw Data instead of converting to different Format

Since what ffmpeg generally does is read an audio / image / video file in a given codec and then convert it to a different codec, it must at some point hold the raw values of the media, which are:
for audio: the raw samples (2 × 44100 samples per second in the case of stereo audio) as int / float
for images: RGBA pixel data (as an int8 array)
for video: an array of images plus the linked audio streams
How can I essentially just read those raw values and get them into memory / onto disk in, let's say, C++ / Python / Java?
ffmpeg is just a command-line tool. The libraries behind the scenes are part of the libav* family, i.e. libavformat, libavcodec, libavutil, libswscale, libswresample, etc.
You can use those libraries directly in C or C++, or use some sort of FFI in other languages. (You can also pipe some raw formats such as y4m.)
Going from a file name to a frame buffer will take a little more code than just "open()", but there are many tutorials online, and other Stack Overflow questions that answer that.
Note:
rgba pixel data for images (as int8 array)
RGBA is not a very common format for video. It's usually YUV, and it usually uses subsampling for the chroma planes. It's also usually planar, so instead of one int8 array it's an array of pointers to several int8 arrays.
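As a concrete illustration of the piping option mentioned above, here is a minimal C++ sketch that reads raw planar yuv420p frames from an ffmpeg pipe. The input file name and resolution are assumptions you would replace (in practice you'd query them first, e.g. with ffprobe), and popen is POSIX-only.

// Minimal sketch: read raw yuv420p frames piped out of the ffmpeg CLI.
// "in.mp4" and 1280x720 are illustrative; the resolution must match the input.
#include <cstdint>
#include <cstdio>
#include <vector>

int main() {
    const int width = 1280, height = 720;              // assumed, must match the input
    const size_t frame_size = width * height * 3 / 2;  // planar 4:2:0: Y + U/4 + V/4

    // Ask ffmpeg to decode and write raw planar YUV to stdout (POSIX popen).
    FILE* pipe = popen(
        "ffmpeg -v quiet -i in.mp4 -f rawvideo -pix_fmt yuv420p -", "r");
    if (!pipe) return 1;

    std::vector<uint8_t> frame(frame_size);
    size_t count = 0;
    while (std::fread(frame.data(), 1, frame_size, pipe) == frame_size) {
        // frame[0 .. w*h)                  -> Y plane
        // frame[w*h .. w*h + w*h/4)        -> U plane
        // frame[w*h + w*h/4 .. frame_size) -> V plane
        ++count;
    }
    pclose(pipe);
    std::printf("read %zu frames\n", count);
    return 0;
}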

Get width and height from jpeg without 0xFF 0xC0

I'm trying to get the file dimensions (width and height) from a jpeg (in this case an Instagram picture)
https://scontent.cdninstagram.com/t51.2885-15/s640x640/sh0.08/e35/11264864_1701024620182742_1335691074_n.jpg
As I understand it, the width and height are defined after the 0xFF 0xC0 marker; however, I cannot find this marker in this picture. Has it been stripped, or is there an alternative marker I should check for?
The JPEG Start-Of-Frame (SOF) marker has 4 possible values:
FFC0 (baseline) - This is the usual mode chosen for photos and encodes fully specified DCT blocks in groupings depending on the color/subsample options chosen
FFC1 (extended) - This is similar to baseline, but has more than 8 bits per color stimulus
FFC2 (progressive) - This mode is often found on web pages to allow the image to load progressively as the data is received. Each "scan" of the image progressively defines more coefficients of the DCT blocks until they're fully defined. This effectively provides more and more detail as more scans are decoded
FFC3 (lossless) - This mode uses a simple Huffman encoding to losslessly encode the image. The only place I've seen this used is on 16-bit grayscale DICOM medical images
If you're scanning through the bytes looking for an FFCx pattern, be aware that you may encounter one in the embedded thumbnail image (inside an FFE1 Exif marker). To properly find the SOFx of the main image, you'll need to walk the chain of JPEG markers. The 2-byte length (big-endian) follows the 2-byte marker. One last pitfall to avoid is that some JPEG encoders stick extra FF values in between the valid markers. If you encounter an FFFF where a marker should be, just increment the pointer by 1 byte and try again until you hit a valid marker.
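Here is a minimal C++ sketch of the marker walk described above; the function name is mine and error handling is kept to the bare minimum, so treat it as an outline rather than a hardened parser.

// Walk the JPEG marker chain and read width/height from the SOF0..SOF3 segment.
#include <cstddef>
#include <cstdint>

// Returns true and fills width/height if an SOF0..SOF3 marker is found.
bool jpeg_dimensions(const uint8_t* buf, size_t len, int* width, int* height) {
    if (len < 4 || buf[0] != 0xFF || buf[1] != 0xD8) return false;   // no SOI, not a JPEG
    size_t pos = 2;
    while (pos + 4 <= len) {
        if (buf[pos] != 0xFF) return false;         // lost sync with the marker chain
        uint8_t marker = buf[pos + 1];
        if (marker == 0xFF) { ++pos; continue; }    // padding FF, skip one byte
        if (marker >= 0xD0 && marker <= 0xD9) {     // RSTn / SOI / EOI: no length field
            pos += 2;
            continue;
        }
        if (marker == 0xDA) return false;           // reached scan data without finding SOF
        uint16_t seg_len = (buf[pos + 2] << 8) | buf[pos + 3];  // big-endian, includes itself
        if (marker >= 0xC0 && marker <= 0xC3) {     // SOF0..SOF3
            if (pos + 9 > len) return false;
            // segment layout: length(2), precision(1), height(2), width(2), ...
            *height = (buf[pos + 5] << 8) | buf[pos + 6];
            *width  = (buf[pos + 7] << 8) | buf[pos + 8];
            return true;
        }
        pos += 2 + seg_len;                         // jump over this segment (skips thumbnails too)
    }
    return false;
}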
You can get it pretty simply at the command-line with ImageMagick which is installed on most Linux distros and is available for OSX and Windows.
identify -ping https://scontent.cdninstagram.com/t51.2885-15/s640x640/sh0.08/e35/11264864_1701024620182742_1335691074_n.jpg
Output
https://scontent.cdninstagram.com/t51....1074_n.jpg JPEG 640x640 640x640+0+0 8-bit sRGB 162KB 0.000u 0:00.000

simple image format for encoding geometry data and decoding in HTML5 Canvas

I'm trying to encode geometric data in an image file to decode in-browser using Canvas. Beyond what I learned from reading about the GIF, PNG and BMP formats today, I don't know much about image files (or binary files in general! I grok binary math conversions, but I've never had to interrogate or write binary data without something abstracting it for me).
This Mozilla tutorial (https://developer.mozilla.org/En/HTML/Canvas/Pixel_manipulation_with_canvas) indicates that Canvas reads the image as an array of 8-bit values, every four representing RGBA.
This leads me to believe I want to encode my data as an array of 8-bit values, and put it between an image header and an image footer.
What's the simplest way to do this?
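One minimal option, sketched below in C++ under stated assumptions: an uncompressed 24-bit BMP is about the simplest "header + raw 8-bit values" container to write by hand. All names and sizes here are illustrative, and note two caveats: BMP rows are stored bottom-up, and a browser may apply colour management on readback, so byte-exact round-tripping through getImageData() is not guaranteed (an untagged sRGB PNG is the usual safer choice).

// Minimal sketch: pack arbitrary bytes into the pixel area of an uncompressed
// 24-bit BMP that a browser can load into an <img> and read via canvas.
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <vector>

static void put_u16(std::vector<uint8_t>& b, uint16_t v) {
    b.push_back(v & 0xFF); b.push_back(v >> 8);
}
static void put_u32(std::vector<uint8_t>& b, uint32_t v) {
    for (int i = 0; i < 4; ++i) b.push_back((v >> (8 * i)) & 0xFF);
}

int main() {
    const uint8_t payload[] = "geometry bytes go here";     // example data
    const int width = 64;                                   // arbitrary
    const int row_bytes = ((width * 3 + 3) / 4) * 4;        // rows padded to 4 bytes
    const int height = (sizeof(payload) + row_bytes - 1) / row_bytes;
    const uint32_t pixel_bytes = row_bytes * height;

    std::vector<uint8_t> bmp;
    // BITMAPFILEHEADER (14 bytes)
    bmp.push_back('B'); bmp.push_back('M');
    put_u32(bmp, 54 + pixel_bytes);          // total file size
    put_u32(bmp, 0);                         // reserved
    put_u32(bmp, 54);                        // offset of pixel data
    // BITMAPINFOHEADER (40 bytes)
    put_u32(bmp, 40);                        // header size
    put_u32(bmp, width);
    put_u32(bmp, height);
    put_u16(bmp, 1);                         // planes
    put_u16(bmp, 24);                        // bits per pixel
    put_u32(bmp, 0);                         // BI_RGB, no compression
    put_u32(bmp, pixel_bytes);
    put_u32(bmp, 2835); put_u32(bmp, 2835);  // ~72 DPI, mostly ignored
    put_u32(bmp, 0); put_u32(bmp, 0);        // palette fields unused
    // Pixel data: the payload, zero-padded to fill whole rows.
    // Remember BMP rows are bottom-up, so the last row appears first on screen.
    std::vector<uint8_t> pixels(pixel_bytes, 0);
    std::memcpy(pixels.data(), payload, sizeof(payload));
    bmp.insert(bmp.end(), pixels.begin(), pixels.end());

    FILE* f = std::fopen("data.bmp", "wb");
    if (!f) return 1;
    std::fwrite(bmp.data(), 1, bmp.size(), f);
    std::fclose(f);
    return 0;
}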

How is HDR data stored?

I am wondering what the data structure is behind storing images with HDR data. I understand how regular images (RGBA) and cubemaps are stored. I doubt it's as simple as storing multiple images at different exposures inside the same file.
You've probably moved on long ago, but I thought it worth posting references for anyone else who happened upon this question.
Here is an old reference for the Radiance .pic (now .hdr) file format. The useful info starts at the bottom of page 29.
http://radsite.lbl.gov/radiance/refer/filefmts.pdf
excerpt:
The basic idea is to store a 1-byte mantissa for each of three
primaries, and a common 1-byte exponent. The accuracy of these values
will be on the order of 1% (+/-1 in 200) over a dynamic range from
10^-38 to 10^38.
And here is a more recent reference for JPEG HDR format: http://www.anyhere.com/gward/papers/cic05.pdf
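To make the shared-exponent idea quoted above concrete, here is a minimal C++ sketch of the classic float-to-RGBE conversion: one 8-bit mantissa per primary plus one common 8-bit exponent. The struct and function names are mine, not anything defined by the Radiance spec.

// Minimal sketch of Radiance-style RGBE encoding/decoding.
#include <algorithm>
#include <cmath>
#include <cstdint>

struct RGBE { uint8_t r, g, b, e; };

RGBE float_to_rgbe(float r, float g, float b) {
    float v = std::max({r, g, b});
    if (v < 1e-32f) return {0, 0, 0, 0};           // too dark to represent
    int e;
    float m = std::frexp(v, &e);                   // v = m * 2^e, with m in [0.5, 1)
    float scale = m * 256.0f / v;                  // maps the largest channel below 256
    return { static_cast<uint8_t>(r * scale),
             static_cast<uint8_t>(g * scale),
             static_cast<uint8_t>(b * scale),
             static_cast<uint8_t>(e + 128) };      // exponent stored with a +128 bias
}

void rgbe_to_float(RGBE p, float* r, float* g, float* b) {
    if (p.e == 0) { *r = *g = *b = 0.0f; return; }
    float f = std::ldexp(1.0f, static_cast<int>(p.e) - (128 + 8));  // 2^(e-128) / 256
    *r = p.r * f; *g = p.g * f; *b = p.b * f;
}

This is why the accuracy is on the order of 1%: each channel keeps only 8 bits of mantissa, while the shared exponent supplies the huge dynamic range.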
It's generally a matter of increasing the range of values (in an HSV sense) representable, so you can use e.g. RGB[A] where each element is a 16-bit int, 32-bit int, float, double etc. instead of a JPEG-type-quality 8-bit int. There's a trade-off between increasing the range represented, retaining fine gradations within that range, and whether some particular intensity levels are given priority via some non-linearity in the mapping (e.g. storing a log of the value).
The raw file from the camera normally stores the 12-14 bit values from the Bayer mask - so effectively a greyscale image. These are sometimes compressed losslessly (in Canon or Nikon) or stored as 16-bit values (Olympus). The header also contains the white balance and gain calibrations for the red, green, blue masked pixels so you can generate a color image.
Once you have a color image you can store it however you want; normally 16-bit RGB is the easiest.
Here is some information on the Radiance file format, used for HDR images. It uses 32-bit floating-point numbers.
First, I am not sure there is a public format for storing multiple images at different exposures in one file, because that usage is rare. Those multiple images are used as one kind of HDR source, but they are not HDR themselves; they are just normal LDR (L for low) or SDR (S for standard) images, encoded like the JPEGs from digital cameras.
It is more common to store the result in an HDR format, and the point is, just as everyone mentioned, that the values are floating point.
There are some HDR formats:
OpenEXR
TIF
Radiance
...
You can get more info from Wikipedia.

Finding YUV file format in Cocoa

I have a file in a raw YUV format; all I know at this point is that the clip has a resolution of 176x144.
The Y plane is 176x144 = 25344 bytes, and the UV plane is half of that. Now, I did some reading about YUV, and there are different formats corresponding to different ways the Y & UV planes are stored.
Now, how can I perform some sort of check in Cocoa to find the raw YUV file format? Is there a file header in the YUV frame from which I can extract some information?
Unfortunately, if it's just a raw YUV stream, it will just be the data for the frames written to disk, one after another. There probably won't be a header that indicates what specific format is being used.
It sounds like you have determined that it's a YUV 4:2:2 stream, so you just need to determine the interleaving order (the most common possibilities are listed here). In response to your previous question, I posted a function which converts a frame from the UYVY (Y422) YUV format to the 2VUY format used by Apple's YUV OpenGL extension. Your best bet may be to try that out and see how the images look, then modify the interleaving format until the colors and image clears up.
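Since there is no header, about the only automated sanity check you can do is compare the file size against the frame sizes implied by candidate layouts for the known 176x144 resolution. Here is a minimal C++ sketch of that idea; the example file size and the short layout list are illustrative, and in practice you would stat() the real file and may find that more than one layout divides evenly.

// Minimal sketch: infer plausible raw YUV layouts from the total file size.
#include <cstdio>

int main() {
    const long width = 176, height = 144;
    const long file_size = 3839616;     // example value; stat() the real file instead

    struct Layout { const char* name; long bytes_per_frame; };
    const Layout layouts[] = {
        { "4:2:0 planar (I420/YV12)", width * height * 3 / 2 },
        { "4:2:2 packed (UYVY/YUY2)", width * height * 2     },
        { "4:4:4 planar",             width * height * 3     },
    };

    for (const Layout& l : layouts) {
        if (file_size % l.bytes_per_frame == 0)
            std::printf("%s fits: %ld frames\n",
                        l.name, file_size / l.bytes_per_frame);
    }
    return 0;
}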
