How to combine/merge slices into one slice per video frame (H.264)

I need to find a way, or code, that can combine multiple slices (NAL units) into a single slice per video frame.
I have some clips. Most of their frames are encoded with multiple slices per frame using H.264, but this triggers problems on our decoders, which we can't change or modify; the decoder works fine if there is a single slice per frame. I tried FFmpeg to combine multiple slices into one, but I haven't found any way to do that.
Can anybody suggest something to do that? It would be best if there were some code that can do this, such as an open-source tool.
Thanks

You can't do this without re-encoding, so you will need another decoder which doesn't have problems with multiple slices per frame.
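If re-encoding is acceptable, a straightforward way to get single-slice frames is to transcode with libx264, which emits one slice per frame by default; -slices 1 makes that explicit (file names are placeholders, and you'll want to pick your own quality settings):

ffmpeg -i input.mp4 -c:v libx264 -slices 1 -crf 18 -preset medium -c:a copy output.mp4

Note that this is a full re-encode, so it changes the bitstream (and, slightly, the quality), which is exactly why it can't be done by bitstream manipulation alone.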

Related

How would I create a radially offset mosaic of rtsp streams that transitions to a logo

I'm new to stack overflow, but I've been researching how to do this for a couple weeks to no avail. I'm hoping perhaps one of you has some knowledge I haven't seen online yet.
Here is a crude illustration of what I hope to accomplish. I have a video wall of eight monitors: four each of two different sizes. The way it's set up now, all eight monitors are treated together as one big monitor displaying an oddly shaped cutout of a desktop.
Eventually I need each individual monitor to display a separate RTSP stream for about thirty seconds, then have the entire display, all eight monitors in conjunction, fade out into a large logo.
My problem right now is that I don't know of a way to mask an RTSP stream so it looks like this rather than this, let alone how to arrange the streams into a weirdly spaced, oddly angled, multiple-aspect-ratio mosaic like in the original illustration.
Thank you all for your time. I'm just an intern here without insane technical knowhow, but I'll try to clarify as much as I can.
-J
I believe -filter_complex is one of the ffmpeg CLI flags that you need. You can find many examples online, but here are a few links of interest:
Here's an ffmpeg wiki on creating a mosaic https://trac.ffmpeg.org/wiki/Create%20a%20mosaic%20out%20of%20several%20input%20videos
FFMpeg - Combine multiple filter_complex and overlay functions
That should get you started, but you will probably need to add customization depending on frame size and formats.
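For instance, a minimal 2x2 mosaic built with -filter_complex and the xstack filter might look like this sketch (the RTSP URLs and the 960x540 tile size are placeholders; your real wall will need its own per-monitor geometry):

ffmpeg \
    -i rtsp://camera1/stream -i rtsp://camera2/stream \
    -i rtsp://camera3/stream -i rtsp://camera4/stream \
    -filter_complex "[0:v]scale=960:540[tl];[1:v]scale=960:540[tr];[2:v]scale=960:540[bl];[3:v]scale=960:540[br];[tl][tr][bl][br]xstack=inputs=4:layout=0_0|960_0|0_540|960_540[v]" \
    -map "[v]" -c:v libx264 output.mkv

The masking and the final fade to a logo could then be layered on top of [v] with the overlay and fade filters, for example using a logo image with an alpha channel.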

FFMPEG API -- How much do stream parameters change frame-to-frame?

I'm trying to extract raw streams from devices and files using ffmpeg. I notice the crucial frame information (Video: width, height, pixel format, color space, Audio: sample format) is stored both in the AVCodecContext and in the AVFrame. This means I can access it prior to the stream playing and I can access it for every frame.
How much do I need to account for these values changing frame-to-frame? I found https://ffmpeg.org/doxygen/trunk/demuxing__decoding_8c_source.html#l00081 which indicates that at least width, height, and pixel format may change frame to frame.
Will the color space and sample format also change frame to frame?
Will these changes be temporary (a single frame) or lasting (a significant block of frames) and is there any way to predict for this stream which behavior will occur?
Is there a way to find the most descriptive attributes that this stream is capable of producing, such that I can scale all the lower-quality frames up, but not offer a result that is needlessly higher quality than the source, even if this is a device or a network stream where I cannot play all the frames in advance?
The fundamental question is: how do I reconcile the flexibility of this API with the restriction that raw streams (my output) have no way of specifying a change of stream attributes mid-stream? I imagine I will need either to predict the most descriptive attributes to give the stream, or to offer a new stream when the attributes change. Which choice to make depends on whether these values will change rapidly or stay relatively stable.
So, to add to what @szatmary says, the typical use case for stream parameter changes is adaptive streaming:
imagine you're watching YouTube on a laptop with various methods of internet connectivity, and suddenly your bandwidth decreases. Your stream will automatically switch to a lower bandwidth. FFmpeg (which is used by Chrome) needs to support this.
alternatively, imagine a similar scenario in an RTC video chat.
The reason FFmpeg does what it does is that the API is essentially trying to accommodate the common denominator. Videos shot on a phone won't ever change resolution. Neither will most videos exported from video editing software. Even videos from youtube-dl will typically not switch resolution; that is a client-side decision, and youtube-dl simply won't do it. So what should you do? I'd just use the stream information from the first frame(s) and rescale all subsequent frames to that resolution. This will work for 99.99% of the cases. Whether you want to accommodate your service to the remaining 0.01% depends on what type of videos you think people will upload and whether resolution changes make any sense in that context.
Does the colorspace change? It could (theoretically) change in software that mixes screen recordings with video fragments, but it's highly unlikely in practice. The sample format changes about as often as the video resolution: quite often in the adaptive scenario, but whether you care depends on your service and the types of videos you expect to get.
Usually not often, or ever. However, this depends on the codec and on options chosen at encode time. I pass the decoded frames through swscale just in case.
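To illustrate that swscale pass, here is a minimal sketch in C, assuming you pin the output attributes to those of the first decoded frame (normalize_frame and the static reference state are illustrative names; error handling is omitted):

#include <libavutil/frame.h>
#include <libswscale/swscale.h>

/* Reference attributes taken from the first decoded frame. */
static int have_ref, ref_w, ref_h;
static enum AVPixelFormat ref_fmt;
static struct SwsContext *sws;

/* Returns `in` untouched if it already matches the reference,
 * otherwise a newly allocated frame converted to match. */
AVFrame *normalize_frame(AVFrame *in)
{
    if (!have_ref) {
        ref_w    = in->width;
        ref_h    = in->height;
        ref_fmt  = in->format;
        have_ref = 1;
    }
    if (in->width == ref_w && in->height == ref_h && in->format == ref_fmt)
        return in;  /* the common case: attributes never change */

    sws = sws_getCachedContext(sws, in->width, in->height, in->format,
                               ref_w, ref_h, ref_fmt,
                               SWS_BILINEAR, NULL, NULL, NULL);
    AVFrame *out = av_frame_alloc();
    out->width  = ref_w;
    out->height = ref_h;
    out->format = ref_fmt;
    av_frame_get_buffer(out, 0);
    sws_scale(sws, (const uint8_t * const *)in->data, in->linesize,
              0, in->height, out->data, out->linesize);
    out->pts = in->pts;  /* keep timing intact */
    return out;
}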

Detect frames that have a given image/logo with FFmpeg

I'm trying to split a video by detecting the presence of a marker (an image) in the frames. I've gone over the documentation and I see removelogo but not detectlogo.
Does anyone know how this could be achieved? I know what the logo is and the region it will be on.
I'm thinking I can extract all frames to PNGs and then analyse them one by one (or n by n), but it might be a lengthy process...
Any pointers?
ffmpeg doesn't have any such ability natively. The delogo filter simply works by taking a rectangular region in its parameters and interpolating that region based on its surroundings; it'll fill in the region regardless of what it previously contained.
If you need to detect the presence of a logo, that's a totally different task. You'll need to create it yourself; if you're serious about this, I'd recommend that you start familiarizing yourself with the ffmpeg filter API and get ready to get your hands dirty. If the logo has a distinctive color, that might be a good way to detect it.
Since what you're after is probably going to just be outputting information on which frames contain (or don't contain) the logo, one filter to look at as a model will be the blackframe filter (which searches for all-black frames).
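For a sense of what that output looks like, blackframe just logs the frames it matches to stderr; a custom logo-detection filter could report the same way (input.mp4 is a placeholder):

ffmpeg -i input.mp4 -vf blackframe -an -f null -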
You can write a detect-logo module: decode the video (to YUV 420P format), feed each raw frame to this module, and do a SAD (Sum of Absolute Differences) over the region where you expect the logo. If the SAD is negligible, it's a match; record the frame number. You can then split the video at these frames.
SAD is done only on the Y (luma) plane. To save processing, you can scale the video to a lower resolution before decoding it.
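A minimal sketch of that SAD check in C (the function name and parameters are illustrative; frame_y would be AVFrame->data[0] and frame_stride would be AVFrame->linesize[0]):

#include <stdint.h>
#include <stdlib.h>

/* Sum of absolute differences between a reference logo patch and the
 * same region of a decoded frame's luma plane (YUV 420P). */
uint64_t luma_sad(const uint8_t *frame_y, int frame_stride,
                  const uint8_t *logo,    /* w*h reference luma patch */
                  int x, int y, int w, int h)
{
    uint64_t sad = 0;
    for (int row = 0; row < h; row++) {
        const uint8_t *f = frame_y + (size_t)(y + row) * frame_stride + x;
        const uint8_t *l = logo + (size_t)row * w;
        for (int col = 0; col < w; col++)
            sad += abs(f[col] - l[col]);
    }
    return sad;  /* compare against a tuned threshold; small SAD = match */
}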
I have successfully detected logos using a Raspberry Pi and a Coral AI accelerator in conjunction with ffmpeg to extract the JPEGs. Crop the image to just the logo, then apply it to your trained model. Even then, you will need to sample a minute or so of video to determine the actual logo's identity.

Forcing custom H.264 intra-frames (keyframes) at encode-time?

I have a video sequence that I'd like to skip to specific frames at playback-time (my player is implemented using AVPlayer in iOS, but that's incidental). Since these frames will fall at unpredictable intervals, I can't use the standard "keyframe every N frames/seconds" functionality present in most video encoders. I do, however, know the target frames in advance.
In order to do this skipping as efficiently as possible, I need to force the target frames to be i-frames at encode time. Ideally in some kind of GUI which would let me scrub to a frame, mark it as a keyframe, and then (re)encode my video.
If such a tool isn't available, I have the feeling this could probably be done by rolling a custom encoder with libavcodec, but I'd rather use a higher-level (and preferably scriptable) tool to do the job if a GUI isn't possible. Is this the kind of task ffmpeg or mencoder can be bent to?
Does anybody have a technique for doing this? Also, it's entirely possible that this is an impossible task because of some fundamental ignorance I have of the h.264 codec. If so, please do put me right.
ffmpeg has a -force_key_frames option that accepts a series of arbitrary timestamps as well as other ways to specify the frames. From the documentation:
-force_key_frames 0:05:00,...
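A full command might look like this (the timestamps are placeholders for your own target frames; a re-encode with libx264 is needed for the forced keyframes to actually be created):

ffmpeg -i input.mp4 -c:v libx264 -force_key_frames "0:00:10,0:00:27.5,0:01:04" -c:a copy output.mp4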
Answered my own question: it's possible to set custom compression keyframes in Apple Compressor.
Compression markers are also known as manual compression markers. These are markers you can add to a Final Cut Pro sequence (or in the Compressor Preview window) to indicate when Compressor should generate an MPEG I-frame during compression.
Source.
Could you not use chapter markers to jump between sections? Not an ideal solution but a lot easier to achieve.
You can use this software:
http://www.applesolutions.com/bantha/MH.html

Seeking by bytes in FFmpeg

I'd appreciate your advice on the following. I'm developing a video converter based on FFmpeg's libavformat, and I need to implement an accurate seeking API. First of all, I developed an indexer of the video stream which just saves the presentation timestamps (PTS) of every packet. My encoder then uses this index to seek within the video file. Before these operations, I remux the file to an MP4 container, for example; remuxing is needed for videos which have no correct index inside, or no index at all. I need to implement seeking by bytes, of course with the previously built index. I have tried many ways to implement this, but without any success. Maybe you know how to implement accurate seeking by bytes in FFmpeg? Best regards.
Yury!
You must seek to the GOP that contains the destination frame, then decode forward until you reach the correct frame.
Before that, you have to decode both GOPs, the one containing the first frame and the one containing the last frame of the segment, and save their positions.
These position values are then used to find your correct frames.
I worked this out two years ago.
If this isn't clear, I can give you the algorithm in more detail.
Best regards.
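For the byte-seeking part specifically, libavformat exposes it directly through AVSEEK_FLAG_BYTE, although not every demuxer supports byte seeking. A minimal sketch, assuming byte_pos comes from your previously built index (seek_to_byte is an illustrative name; error handling is reduced to the essentials):

#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>

/* Seek to an indexed byte offset (ideally that of a keyframe at or
 * before the target), then let the caller decode forward to the
 * desired frame. */
int seek_to_byte(AVFormatContext *fmt, AVCodecContext *dec,
                 int64_t byte_pos)
{
    int ret = av_seek_frame(fmt, -1, byte_pos,
                            AVSEEK_FLAG_BYTE | AVSEEK_FLAG_BACKWARD);
    if (ret < 0)
        return ret;             /* demuxer may not support byte seeking */
    avcodec_flush_buffers(dec); /* drop stale decoder state */
    /* Now read packets with av_read_frame() and decode, discarding
     * frames whose PTS is below the target PTS from your index. */
    return 0;
}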
