When I encode an RGB24 frame with H.264 I get "input width is greater than stride"...
By the way, if I give it a raw image in YUV420P format, ffmpeg encodes it successfully...
What I wanted to know is:
i) Do we have to supply a YUV format for encoding? Can't we give an RGB frame for H.264 encoding?
ii) If we can give an RGB frame, what is the trick?
I know this is a bit late (no answers since 2010), but it sounds like you need (or needed) to adjust the wrapping of your image data.
From the following MSDN article (I know it's MSDN, but its explanation of the concepts involved is REALLY good):
When a video image is stored in memory, the memory buffer might
contain extra padding bytes after each row of pixels. The padding
bytes affect how the image is stored in memory, but do not affect how
the image is displayed.
The stride is the number of bytes from one row of pixels in memory to
the next row of pixels in memory. Stride is also called pitch. If
padding bytes are present, the stride is wider than the width of the
image, as shown in the following illustration.
Read more here
Look at what you've specified for both your image width and image stride. Whatever data you are supplying per row has more bytes than you've specified for the stride (and, I'm guessing, the width as well, if they're in agreement).
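As a sketch of the relationship described above: the stride is the pixel row's byte length rounded up to some alignment boundary (4 bytes is a common choice, e.g. for Windows DIBs; the alignment value here is an assumption, your API may use 16 or 32).

```python
def aligned_stride(width, bytes_per_pixel, align=4):
    """Stride = bytes from one row to the next, rounded up to an alignment boundary."""
    row = width * bytes_per_pixel
    return (row + align - 1) // align * align

# A 638-pixel-wide RGB24 row holds 1914 bytes of pixel data, but with 4-byte
# row alignment the stride becomes 1916, i.e. 2 padding bytes follow each row.
assert aligned_stride(638, 3) == 1916
assert aligned_stride(640, 3) == 1920  # already aligned: stride == width * 3
```

If your code walks the buffer assuming stride == width * bytes_per_pixel, every row after the first will be read at the wrong offset.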
Without going into much detail, I'm trying to do an HTTP POST of a dummy image to a server to cause the server to create an internal record of the image in its database. The image is to later be replaced in storage with the actual image that's supposed to be there without the server knowing.
Unfortunately, this server is "smart" and validates whatever image data is being sent to it; it will reject random bytes if they don't match some image format (e.g. jpeg, gif, png. etc).
Naturally, the most obvious approach would be to send the smallest gif possible (1x1 grey pixel; ~26 bytes). Unfortunately, this server keeps an immutable record of the dimensions of the image that it reads... so a 1x1 pixel image won't cut it.
So my question is, what's the smallest possible scaled image of a solid colour I can send as a dummy instead? Ideally, a completely uniformly grey image of 100x100 pixels in this format should be roughly the same as a 1000x2000 image of the same colour, due to compression.
(Forgive me if the tags aren't very good; I'm not sure where this should go)
You can possibly achieve what you want with a specially-crafted GIF file.
The GIF format allows you to specify "logical screen width" and "logical screen height" values in the "Logical Screen Descriptor" section at the start of the file, which define the size of the image.
However, you don't actually need to encode the pixels for the entire image; any pixels that are not encoded are considered transparent. Instead, the GIF file contains one or more "Image Descriptor" sections, each of which encodes the pixels for a sub-region of the image. This is used for compressing GIF animations (only the sub-regions of the image that change from the previous frame need to be encoded), but it can also be used for single-frame images.
So what you can do is output a single Image Descriptor encoding a 1x1 transparent-pixel region of the image, and set the logical screen width and height values to your desired image size. This creates a uniformly transparent GIF image of arbitrary size at a fixed file size (42 bytes).
You just need to modify bytes 6-7 (the width) and bytes 8-9 (the height) of this transparent 1x1 pixel GIF; each is a 16-bit value, and GIF uses little-endian byte order.
Here is an example file, a 1024 x 1024 pixel transparent GIF image.
This file loads up correctly for me as a 1k by 1k transparent image in the GIMP image editor, but some file viewers seem to base the image size on the size of the image descriptors and display it as 1x1 pixels, which is wrong AFAIK. You'll have to test whether your server reads them correctly.
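As a sketch of the patching step: the bytes below are a commonly circulated minimal 1x1 transparent GIF (43 bytes in this variant, not the exact file linked above), and the function rewrites the width and height fields of its Logical Screen Descriptor.

```python
import struct

# A minimal 1x1 transparent GIF: header, 2-color global palette, a Graphic
# Control Extension with the transparency flag set, and a single Image
# Descriptor covering a 1x1 region.
TRANSPARENT_GIF = (
    b"GIF89a"
    b"\x01\x00\x01\x00"                          # logical screen: width=1, height=1
    b"\x80\x00\x00"                              # global color table flag, 2 entries
    b"\x00\x00\x00\xff\xff\xff"                  # palette: black, white
    b"\x21\xf9\x04\x01\x00\x00\x00\x00"          # GCE: transparent color index 0
    b"\x2c\x00\x00\x00\x00\x01\x00\x01\x00\x00"  # image descriptor: 1x1 at (0,0)
    b"\x02\x02\x44\x01\x00"                      # LZW data: one transparent pixel
    b"\x3b"                                      # trailer
)

def fake_sized_gif(width, height):
    """Return a transparent GIF whose logical screen is width x height."""
    # Bytes 6-7 hold the width and bytes 8-9 the height, little-endian 16-bit.
    return TRANSPARENT_GIF[:6] + struct.pack("<HH", width, height) + TRANSPARENT_GIF[10:]

gif = fake_sized_gif(1024, 1024)
```

The file size stays constant no matter what dimensions you claim, since only the two header fields change.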
I'm trying to get the file dimensions (width and height) from a jpeg (in this case an Instagram picture)
https://scontent.cdninstagram.com/t51.2885-15/s640x640/sh0.08/e35/11264864_1701024620182742_1335691074_n.jpg
As I understand it, the width and height are defined after the 0xFF 0xC0 marker; however, I cannot find this marker in this picture. Has it been stripped, or is there an alternative marker I should check for?
The JPEG Start-Of-Frame (SOF) marker has 4 possible values:
FFC0 (baseline) - This is the usual mode chosen for photos; it encodes fully specified DCT blocks in groupings depending on the color/subsample options chosen.
FFC1 (extended) - This is similar to baseline, but with more than 8 bits per color stimulus.
FFC2 (progressive) - This mode is often found on web pages to allow the image to load progressively as the data is received. Each "scan" of the image progressively defines more coefficients of the DCT blocks until they're fully defined, effectively providing more and more detail as more scans are decoded.
FFC3 (lossless) - This mode uses simple Huffman encoding to losslessly encode the image. The only place I've seen it used is on 16-bit grayscale DICOM medical images.
If you're scanning through the bytes looking for an FFCx pattern, be aware that you may encounter one in the embedded thumbnail image (inside an FFE1 Exif marker). To properly find the SOFx of the main image, you'll need to walk the chain of JPEG markers. The 2-byte length (big-endian) follows the 2-byte marker. One last pitfall to avoid is that some JPEG encoders stick extra FF values in between the valid markers. If you encounter a FFFF where a marker should be, just increment the pointer by 1 byte and try again until you hit a valid marker.
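A minimal sketch of that marker walk (covering only the four SOF values named above; a production parser would also handle the other SOFn codes):

```python
import struct

SOF_MARKERS = {0xC0, 0xC1, 0xC2, 0xC3}              # baseline, extended, progressive, lossless
STANDALONE = {0x01, 0xD8} | set(range(0xD0, 0xD8))  # TEM, SOI, RST0-7: no length field

def jpeg_dimensions(data):
    """Walk the JPEG marker chain and return (width, height) from the first SOFx."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG (missing SOI)")
    i = 2
    while i < len(data) - 1:
        if data[i] != 0xFF:
            raise ValueError("expected a marker at offset %d" % i)
        while data[i + 1] == 0xFF:   # skip fill bytes some encoders emit between markers
            i += 1
        marker = data[i + 1]
        i += 2
        if marker == 0xD9:           # EOI: end of image
            break
        if marker in STANDALONE:
            continue
        (length,) = struct.unpack(">H", data[i:i + 2])  # big-endian, includes its own 2 bytes
        if marker in SOF_MARKERS:
            # segment layout: length(2) precision(1) height(2) width(2) components...
            _precision, height, width = struct.unpack(">BHH", data[i + 2:i + 7])
            return width, height
        i += length                  # jump over the segment (this also skips Exif thumbnails)
    raise ValueError("no SOF marker found")
```

Because the walk jumps over each segment by its declared length, any FFCx bytes inside an FFE1 Exif segment's embedded thumbnail are never misread as the main image's SOF.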
You can get it pretty simply at the command-line with ImageMagick which is installed on most Linux distros and is available for OSX and Windows.
identify -ping https://scontent.cdninstagram.com/t51.2885-15/s640x640/sh0.08/e35/11264864_1701024620182742_1335691074_n.jpg
Output
https://scontent.cdninstagram.com/t51....1074_n.jpg JPEG 640x640 640x640+0+0 8-bit sRGB 162KB 0.000u 0:00.000
I'd like to know which image format lets me losslessly encode 0xFFFFFF (16,777,216) colors while occupying the least space on disk. I know that BMP, JPEG (in its lossless variant), TIFF, and PNG (just to name some) can be lossless, but which one, considering also zipping or whatever, occupies the least space?
A PNG image (16million.png) containing all possible RGB888 colors was published in 1996. It occupies 115,989 bytes. I have converted the same image to a MNG file of just 472 bytes. The current version of pngcrush (1.8.0) brings the PNG file down to 91514 bytes.
See Khalid Sayood's Lossless Compression Handbook.
If on the other hand you are asking about a format that can represent a single pixel in any one of the 16 million colors, then PNG takes 69 bytes
including the 8-byte PNG signature, the IHDR, IEND, and IDAT chunk overhead, and several bytes of zlib overhead within the IDAT chunk, while a simple PPM file only takes 14 bytes to represent such single-pixel images (P6 1 1 255 \n red green blue).
Between those extremes, the best compression depends upon the content of the image.
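To illustrate the single-pixel PPM arithmetic above, a minimal binary (P6) file really does come to 14 bytes:

```python
def single_pixel_ppm(red, green, blue):
    """A minimal binary PPM: 'P6 <width> <height> <maxval>\\n' plus one raw RGB triple."""
    return b"P6 1 1 255\n" + bytes([red, green, blue])

ppm = single_pixel_ppm(128, 128, 128)   # a single grey pixel
assert len(ppm) == 14                   # 11-byte header + 3 bytes of pixel data
```

There is no compression at all in PPM; it wins here only because its header overhead is tiny.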
I'm building one part of an H.264 encoder. To test the system, I need to create input images for encoding. We have a program that reads an image into RAM in the file format we use.
My question is how to create a raw file: bitmap or TIFF (I don't want to use a compressed format like JPEG)? I googled and found a lot of raw file types. So which type should I use, and how do I create it? I think I will use C/C++ or Matlab to create the raw file.
P/S: the format I need is YUV (or Y'CbCr) 4:2:0 with 8-bit colour depth.
The easiest raw format is just a stream of numbers representing the pixels. Each raw format can be associated with metadata such as:
width, height
bytes per image row, i.e. stride (e.g. gstreamer & X Window align each row to dword boundaries)
bits per pixel
byte format / endianness (if 16 bits per pixel or more)
number of image channels
color system: HSV, RGB, Bayer, YUV
order of channels, e.g. RGBA, ABGR, GBR
planar vs. packed (or FOURCC code)
or this metadata can just be an internal specification...
I believe one of the easiest approaches (after, of course, a steep learning curve :) is to use e.g. gstreamer, where you can use existing file/stream sources that read data from a camera, a file, a pre-existing JPEG etc. and pass those raw streams through a defined pipeline. One useful element is filesink, which simply writes one or a few successive raw data frames to your filesystem. The gstreamer infrastructure has possibly hundreds of converters and filters, btw., including an H.264 encoder...
I would bet that if you just dump your memory, the output will already conform to some FOURCC format (also recognized by gstreamer).
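For the planar YUV 4:2:0, 8-bit case the asker wants, such a "stream of numbers" is just three concatenated planes per frame: a full-resolution Y plane followed by quarter-resolution U and V planes. A minimal sketch (filename and frame count are arbitrary):

```python
def yuv420p_gray_frame(width, height, luma=128):
    """One raw YUV 4:2:0 planar frame: full-res Y plane, then quarter-res U and V."""
    assert width % 2 == 0 and height % 2 == 0
    y_plane = bytes([luma]) * (width * height)
    chroma = bytes([128]) * ((width // 2) * (height // 2))  # 128 = neutral chroma
    return y_plane + chroma + chroma  # plane order: Y, U, V

# Write a few identical frames as a headerless .yuv file; a decoder or encoder
# must be told the resolution and pixel format out of band, since the file
# itself carries no metadata.
with open("test_320x240.yuv", "wb") as f:
    for _ in range(3):
        f.write(yuv420p_gray_frame(320, 240))
```

Each frame is width * height * 3/2 bytes, which is also the per-frame size ffmpeg expects when reading rawvideo input at that resolution.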
Bitmap images contain a pure representation of the raw data. A 512x512 24-bit bitmap image (like game textures) is 768 KB in size, as it should be. Why is a 512x512 image at 8-bit 257 KB instead of 256 KB? Also, a 256x256 8-bit image is 65 KB instead of 64 KB! (66,614 bytes instead of 65,536 bytes); but the 24-bit one is exactly as it should be.
Thanks... I'm confused.
What do you mean by "bitmap image"? It is true that they contain raw data, but they may also contain a header.
And/or there could be some rounding error in whatever interface is telling you "65 KB". What is the exact length in bytes of the 64 KB image? Is it 65,536 bytes, or something more?
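Assuming these are ordinary uncompressed Windows BMP files (14-byte file header plus 40-byte BITMAPINFOHEADER, no extra chunks), the header-and-palette overhead accounts for the observed sizes exactly:

```python
def bmp_file_size(width, height, bpp):
    """Expected size of an uncompressed (BI_RGB) BMP file."""
    headers = 14 + 40                      # BITMAPFILEHEADER + BITMAPINFOHEADER
    palette = 256 * 4 if bpp == 8 else 0   # 8-bit BMPs carry a 256-entry BGRA palette
    row = (width * bpp // 8 + 3) // 4 * 4  # each pixel row is padded to a 4-byte boundary
    return headers + palette + row * height

assert bmp_file_size(256, 256, 8) == 66614    # exactly the 66,614 bytes observed
assert bmp_file_size(512, 512, 8) == 263222   # ~257 KB: palette + headers on top of 256 KB
assert bmp_file_size(512, 512, 24) == 786486  # 768 KB of pixels + 54 bytes of headers
```

So the "extra" bytes are the 54 bytes of headers plus, for 8-bit images, the 1024-byte color palette; the 24-bit files looked exact only because 54 bytes vanish when the size is rounded to whole kilobytes.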