Why is the audio of my MP4 file going out of sync? (Windows)

I am having a problem converting a WMV file to MP4. I use x264.exe with this command to get the video stream:
x264 --output temporal.264 --fps 25 --preset slow --bitrate 2048 --vbv-maxrate 2048 --vbv-bufsize 9600 --min-keyint 48 --keyint 48 --scenecut 0 --no-scenecut --pass 1 --video-filter "resize:width=640,height=480" Original.wmv
Then I use ffmpeg.exe to extract the audio stream with this line:
ffmpeg -i .wmv -acodec libfdk_aac -b:a 32000 temporal.aac
Finally, I use MP4Box to merge the streams with these lines:
MP4Box -add temporal.264 Final.mp4
MP4Box -add temporal.aac Final.mp4
The problem is that the audio in Final.mp4 is out of sync. It starts fine but drifts further out of sync over time.
I ran this command:
MP4Box -info 010004470063PE-10022017083824-2_MultiMedia--1.mp4
and found that the durations of the two streams differ. Output of the command:
* Movie Info *
Timescale 600 - 2 tracks
Computed Duration 01:00:03.643 - Indicated Duration 01:00:03.643
Fragmented File: no
File suitable for progressive download (moov before mdat)
File Brand isom - version 1
Compatible brands: isom avc1
Created: GMT Wed Jun 27 16:31:44 2018
Modified: GMT Wed Jun 27 16:31:44 2018
File has root IOD (9 bytes)
Scene PL 0xff - Graphics PL 0xff - OD PL 0xff
Visual PL: AVC/H264 Profile (0x7f)
Audio PL: AAC Profile # Level 2 (0x29)
No streams included in root OD
Track # 1 Info - TrackID 1 - TimeScale 25000
Media Duration 00:59:57.520 - Indicated Duration 00:59:57.520
Track has 1 edit lists: track duration is 00:59:57.320
Media Info: Language "Undetermined (und)" - Type "vide:avc1" - 89938 samples
Visual Track layout: x=0 y=0 width=640 height=480
MPEG-4 Config: Visual Stream - ObjectTypeIndication 0x21
AVC/H264 Video - Visual Size 640 x 480
AVC Info: 1 SPS - 1 PPS - Profile Main # Level 3
NAL Unit length bits: 32
Chroma format YUV 4:2:0 - Luma bit depth 8 - chroma bit depth 8
SPS#1 hash: 41EE779BEF2AA71A7131EAFD3C77C7E3BC95FD8E
PPS#1 hash: 086E1D72A40A0E8CF35D102F34A9DF6CD44D6CEF
Self-synchronized
RFC6381 Codec Parameters: avc1.4D401E
Average GOP length: 250 samples
Track # 2 Info - TrackID 2 - TimeScale 44100
Media Duration 01:00:03.644 - Indicated Duration 01:00:03.644
Media Info: Language "Undetermined (und)" - Type "soun:mp4a" - 155196 samples
MPEG-4 Config: Audio Stream - ObjectTypeIndication 0x40
MPEG-4 Audio AAC LC - 2 Channel(s) - SampleRate 44100
Synchronized on stream 1
RFC6381 Codec Parameters: mp4a.40.2
All samples are sync
I am not sure why this is happening, because the original WMV is perfectly synchronized. Any help?

.aac is a raw container with no timestamps, so if there are PTS gaps in the source audio, they will be lost.
You have two workarounds:
a) extract to a timed samples container
ffmpeg -i .wmv -acodec libfdk_aac -b:a 32000 -vn temporal.m4a
b) fill in the gaps and extract
ffmpeg -i .wmv -af aresample=async=1 -acodec libfdk_aac -b:a 32000 temporal.aac
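The size of the mismatch can be checked by hand from the MP4Box -info output in the question. Below is a rough sketch in Python, assuming a constant 25 fps video (the --fps 25 passed to x264) and 1024 PCM samples per AAC-LC frame. The function names are illustrative only, and the calculation says nothing about where in the file the drift accumulates.

```python
# Figures copied from the MP4Box -info output in the question.
VIDEO_FRAMES = 89938     # samples in track 1 (video)
VIDEO_FPS = 25           # assumed constant, as forced by --fps 25
AUDIO_FRAMES = 155196    # samples in track 2 (AAC frames)
AUDIO_RATE = 44100       # Hz
AAC_FRAME_SIZE = 1024    # PCM samples per AAC-LC frame (assumption)


def video_duration(frames: int, fps: float) -> float:
    """Duration in seconds of a constant-frame-rate video track."""
    return frames / fps


def audio_duration(frames: int, frame_size: int, rate: int) -> float:
    """Duration in seconds of an audio track made of fixed-size frames."""
    return frames * frame_size / rate


video_s = video_duration(VIDEO_FRAMES, VIDEO_FPS)
audio_s = audio_duration(AUDIO_FRAMES, AAC_FRAME_SIZE, AUDIO_RATE)
drift_s = audio_s - video_s

print(f"video {video_s:.3f} s, audio {audio_s:.3f} s, drift {drift_s:.3f} s")
```

Under these assumptions the two tracks come out at 3597.520 s and 3603.644 s, which matches the 00:59:57.520 and 01:00:03.644 reported by MP4Box, so the streams end roughly 6.1 seconds apart after an hour.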

Based on Gyan's response, I used ffmpeg to transcode the WMV file to MP4 directly; separating the audio and video streams was a bad idea. In the end I used the following command for transcoding:
ffmpeg -i <input>.wmv -c:v libx264 -preset slow -crf 23 -c:a aac -r 25 -b:a 48k -strict -2 -max_muxing_queue_size 4000 <output>.mp4

Related

Speed up hardcoding subs with ffmpeg vs handbrake

I have a video that needs hardcoded subtitles for legacy devices. I usually use ffmpeg for this and it works fine, but it is very slow.
This is the command I usually use:
ffmpeg -i test-cut.mp4 -vf "subtitles=test-cut.srt" -c:v libx264 -crf 24 -vsync passthrough -c:a copy test.ffmpeg.mp4
The other day I tried HandBrakeCLI and it was a lot faster on the same file.
I used this HandBrakeCLI command to convert and hardcode. I know it's not like for like, but you would expect ffmpeg to be faster since it's not converting the audio and really only has to convert the video at the frames that need subtitles.
HandBrakeCLI --preset "Very Fast 1080p30" --format av_mp4 -i test-cut.mp4 --srt-file test-cut.srt --srt-burn=1 -o test.handbrake.mp4
Is there a way to have ffmpeg imprint the subtitles only on the parts of the video that need them, instead of converting the whole video?
For example, if I have a minute-long video and one sentence is spoken 30 seconds in, and its subtitle needs to be displayed for 5 seconds, can ffmpeg convert only that 5-second segment and just copy the rest of the video and audio to the new output?
The input video is 720p at 24 FPS, and the speed difference is usually that HandBrake runs at about 2x and ffmpeg at about 0.7x relative to the video duration. I know I'm using a 1080p profile for HandBrake. Both were tested on the same machine and the same video.
Versions
ffmpeg version 4.2.2
HandBrake 1.3.1
video info
original video
CONTAINER......: MPEG-4
SIZE...........: 13.3 MiB
RUNTIME........: 45 s 94 ms
VIDEO CODEC....: avc1, High#L3.1, 8 bits
RESOLUTION.....: 1280x720
BITRATE........: 2 402 kb/s
FRAMERATE......: 24.000 FPS
AUDIO..........: AAC, 2 channels, 66.2 kb/s
handbrake output
CONTAINER......: MPEG-4
SIZE...........: 2.81 MiB
RUNTIME........: 45 s 51 ms
VIDEO CODEC....: x264, avc1, Main#L4, 8 bits
RESOLUTION.....: 1118x692
BITRATE........: 355 kb/s
FRAMERATE......: 24.000 FPS
AUDIO..........: AAC, 2 channels, 160 kb/s
ffmpeg output
CONTAINER......: MPEG-4
SIZE...........: 3.90 MiB
RUNTIME........: 45 s 94 ms
VIDEO CODEC....: x264, avc1, High#L3.1, 8 bits
RESOLUTION.....: 1280x720
BITRATE........: 651 kb/s
FRAMERATE......: 23.976 (23976/1000) FPS
AUDIO..........: AAC, 2 channels, 66.2 kb/s
Would be grateful if anyone could provide any suggestions to get ffmpeg faster at hardcoding subs.
With HandBrake you used the Very Fast preset. You can also specify a preset with ffmpeg; the default is medium, but you can change it with -preset veryfast.
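As a sketch of that suggestion, the question's ffmpeg command with -preset veryfast added could be assembled like this in Python. The helper name burn_subs_command is hypothetical, and the command is only built and printed here, not run, so any speed gain still has to be measured on the actual file.

```python
import shlex
from typing import List


def burn_subs_command(video: str, subs: str, output: str,
                      preset: str = "veryfast", crf: int = 24) -> List[str]:
    """Build the subtitle-burning command from the question plus a preset."""
    return [
        "ffmpeg", "-i", video,
        "-vf", f"subtitles={subs}",         # burn the subtitles into the picture
        "-c:v", "libx264",
        "-preset", preset,                  # faster presets trade compression for speed
        "-crf", str(crf),
        "-vsync", "passthrough",            # kept from the original command
        "-c:a", "copy",                     # audio is copied, not re-encoded
        output,
    ]


cmd = burn_subs_command("test-cut.mp4", "test-cut.srt", "test.ffmpeg.mp4")
print(shlex.join(cmd))
```

Keeping the command as an argument list avoids shell quoting problems if it is later passed to subprocess.run.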

Recording streams audio with ffmpeg for Cloud Speech-to-Text

Good evening,
I am trying to record audio with the following features:
codec: flac
sampling rate: 16000 Hz
I am testing with the following line of code:
ffmpeg -t 15 -i http://198.15.86.218:9436/stream -codec:a flac -b:a 16k example.flac
But when reviewing the output file, I get the following:
codec: flac
sampling rate: 44000 Hz
Could someone guide me on the correct use of the ffmpeg options?
-b:a is to set the bitrate*. For sampling rate, you have to use -ar.
Use
ffmpeg -t 15 -i http://198.15.86.218:9436/stream -codec:a flac -ar 16k example.flac
*for a lossless codec, bitrate setting is irrelevant.
"Free Lossless Audio Codec" (FLAC) is lossless, so the output bit rate cannot be controlled precisely. -b:a 16k actually tries to set the audio output bit rate to 16 kbit/s.
In your case you need the audio sampled at 16000 Hz, so the correct option is -ar (audio rate):
ffmpeg -t 15 -i http://198.15.86.218:9436/stream -c:a flac -ar 16000 example.flac
The FLAC bit rate cannot be set directly, but you can trade encoding speed for file size with -compression_level, which takes values from 0 to 12 with 5 as the default. The ffmpeg documentation for the FLAC encoder describes the other parameters.
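The options discussed in these answers can be combined in a small Python helper. This is only a sketch: the function name flac_record_command is hypothetical, the 0 to 12 bound is assumed from the documentation of ffmpeg's native FLAC encoder, and the command is built and printed rather than run.

```python
import shlex
from typing import List


def flac_record_command(url: str, output: str, seconds: int = 15,
                        sample_rate: int = 16000,
                        compression_level: int = 5) -> List[str]:
    """Build an ffmpeg command that records a stream to FLAC at a given sample rate."""
    if not 0 <= compression_level <= 12:
        raise ValueError("compression_level must be between 0 and 12")
    return [
        "ffmpeg", "-t", str(seconds), "-i", url,
        "-c:a", "flac",
        "-ar", str(sample_rate),                        # sample rate, not bit rate
        "-compression_level", str(compression_level),   # speed vs. file size
        output,
    ]


cmd = flac_record_command("http://198.15.86.218:9436/stream", "example.flac")
print(shlex.join(cmd))
```

Note that -b:a does not appear at all, since it has no meaningful effect for a lossless encoder.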

MPEG-DASH - Multiplexed Representations Issue

I'm trying to learn ffmpeg, MP4Box, and MPEG-DASH, but I'm running into an issue with the .mp4 I'm using. I'm using ffmpeg to demux the mp4 with this command:
ffmpeg -i test.mp4 -c:v copy -g 72 -an video.mp4 -c:a copy audio.mp4
Once the two files are created, I use MP4Box to segment the files for the dash player using this command:
MP4Box -dash 4000 -frag 1000 -rap -segment-name segment_ output.mp4
Which does create all the files I think I need. Then I point the player to the output_dash.mpd and nothing happens except a ton of messages in the console:
[8] EME detected on this user agent! (ProtectionModel_21Jan2015)
[11] Playback Initialized
[21] [dash.js 2.3.0] MediaPlayer has been initialized
[64] Parsing complete: ( xml2json: 3.42ms, objectiron: 2.61ms, total: 0.00603s)
[65] Manifest has been refreshed at Wed Apr 12 2017 12:16:52 GMT-0600 (MDT)[1492021012.196]
[72] MediaSource attached to element. Waiting on open...
[77] MediaSource is open!
[77] Duration successfully set to: 148.34
[78] Added 0 inline events
[78] No video data.
[79] No audio data.
[79] No text data.
[79] No fragmentedText data.
[79] No embeddedText data.
[80] Multiplexed representations are intentionally not supported, as they are not compliant with the DASH-AVC/264 guidelines
[81] No streams to play.
Here is the MP4Box -info on the video I'm using:
* Movie Info *
Timescale 1000 - Duration 00:02:28.336
Fragmented File no - 2 track(s)
File suitable for progressive download (moov before mdat)
File Brand mp42 - version 512
Created: GMT Wed Feb 6 06:28:16 2036
File has root IOD (9 bytes)
Scene PL 0xff - Graphics PL 0xff - OD PL 0xff
Visual PL: Not part of MPEG-4 Visual profiles (0xfe)
Audio PL: Not part of MPEG-4 audio profiles (0xfe)
No streams included in root OD
iTunes Info:
Name: Rogue One - A Star Wars Story
Artist: Lucasfilm
Genre: Trailer
Created: 2016
Encoder Software: HandBrake 0.10.2 2015060900
Cover Art: JPEG File
Track # 1 Info - TrackID 1 - TimeScale 90000 - Duration 00:02:28.335
Media Info: Language "Undetermined" - Type "vide:avc1" - 3552 samples
Visual Track layout: x=0 y=0 width=1920 height=816
MPEG-4 Config: Visual Stream - ObjectTypeIndication 0x21
AVC/H264 Video - Visual Size 1920 x 816
AVC Info: 1 SPS - 1 PPS - Profile High # Level 4.1
NAL Unit length bits: 32
Pixel Aspect Ratio 1:1 - Indicated track size 1920 x 816
Self-synchronized
Track # 2 Info - TrackID 2 - TimeScale 44100 - Duration 00:02:28.305
Media Info: Language "English" - Type "soun:mp4a" - 6387 samples
MPEG-4 Config: Audio Stream - ObjectTypeIndication 0x40
MPEG-4 Audio MPEG-4 Audio AAC LC - 2 Channel(s) - SampleRate 44100
Synchronized on stream 1
Alternate Group ID 1
I know I need to separate the video and audio and I think that's where my issue is. The command I'm using probably isn't doing the right thing.
Is there a better command to demux my mp4?
Is the MP4Box command I'm using best for segmenting the files?
If I use different files, will they always need to be demuxed?
One thing to mention, if I use the following commands everything works fine, but there is no audio because of the -an which means it's only video:
ffmpeg -i test.mp4 -c:v copy -g 72 -an output.mp4
MP4Box -dash 4000 -frag 1000 -rap -segment-name segment_ output.mp4
UPDATE
I noticed that the video had no audio stream, but the audio had the video stream which is why I got the mux error. I thought that might be an issue so I ran this command to keep the unwanted streams out of the outputs:
ffmpeg -i test.mp4 -c:v copy -g 72 -an video.mp4 -c:a copy -vn audio.mp4
then I run:
MP4Box -dash 4000 -frag 1000 -rap -segment-name segment_ video.mp4 audio.mp4
now I no longer get the Multiplexed representations are intentionally not supported... message, but now I get:
[122] Video Element Error: MEDIA_ERR_SRC_NOT_SUPPORTED
[123] [object MediaError]
[125] Schedule controller stopping for audio
[126] Caught pending play exception - continuing (NotSupportedError: Failed to load because no supported source was found.)
I tried playing the video and audio independently through Chrome and they both work, just not through the dash player. Ugh, this is painful to learn, but I feel like I'm making progress.
I ended up going with Bento4. I'm not sure why I couldn't get MP4Box working, but Bento4 worked very easily and had me up and running within a few hours.
You are not using the right profile. Judging by the logs, I assume you're using the DASH-IF player (dash.js). In that case, you need to use this command:
MP4Box -dash 4000 -frag 1000 -profile dashavc264:onDemand -rap -segment-name segment_ output.mp4

Wav audio file compression not working

Is it possible to compress a wav audio file without reducing the sampling rate?
I have an audio file with a 256 kb/s bit rate and an 8000 Hz sampling rate. I would just like to reduce the bit rate to 128 or 64 kb/s.
I tried converting to MP3 and back to WAV:
ffmpeg -i input.wav 1.mp3
ffmpeg -i "1.mp3" -acodec pcm_s16le -ar 4000 out.wav
but this reduced the sampling rate as well. With -ab the output still has the default 256 kb/s bit rate:
ffmpeg -i "1.mp3" -acodec pcm_s16le -ab 128 out.wav
PCM (WAV) is uncompressed, so -b:a/-ab is ignored.
The bitrate of WAV is directly affected by the sample rate, channel layout, and bits per sample.
Calculating PCM/WAV bitrate
Assuming 8000 samples per second, stereo channel layout, 16 bits per sample:
sample rate × number of channels × bits per sample = bitrate
8000 × 2 × 16 = 256000 bits/s, or 256 kb/s
Getting channels, sample rate, bit depth
You can just view the output of ffmpeg -i input.wav or use ffprobe for a more concise output:
$ ffprobe -loglevel error -select_streams a -show_entries stream=sample_rate,channels,bits_per_sample -of default=nw=1 input.wav
sample_rate=8000
channels=2
bits_per_sample=16
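The formula above can be applied directly to the ffprobe output. Here is a minimal Python sketch, assuming the key=value format shown above; the helper names parse_ffprobe_entries and pcm_bitrate are made up for illustration.

```python
def parse_ffprobe_entries(text: str) -> dict:
    """Parse 'key=value' lines as printed by ffprobe with -of default=nw=1."""
    entries = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep:  # skip lines that are not key=value pairs
            entries[key] = int(value)
    return entries


def pcm_bitrate(sample_rate: int, channels: int, bits_per_sample: int) -> int:
    """Bit rate of uncompressed PCM audio in bits per second."""
    return sample_rate * channels * bits_per_sample


probe_output = """sample_rate=8000
channels=2
bits_per_sample=16"""

info = parse_ffprobe_entries(probe_output)
bitrate = pcm_bitrate(info["sample_rate"], info["channels"], info["bits_per_sample"])
print(f"{bitrate} bits/s = {bitrate // 1000} kb/s")  # 256000 bits/s = 256 kb/s
```

Halving either the channel count or the bit depth halves the result, which is why 8000 Hz mono 16-bit and 8000 Hz stereo 8-bit both come out at 128 kb/s.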
Changing the bitrate
Bitrate should not be a consideration when using WAV. If bitrate is a problem then WAV is the wrong choice for you. That being said, you can change the bitrate by changing:
The sample rate (-ar)
The number of channels (-ac)
The bit depth. For PCM/WAV the bit depth is the number listed in the encoder name: -c:a pcm_s24le, -c:a pcm_s16le, -c:a pcm_u8, etc. See ffmpeg -encoders.
Examples for 128 kb/s (this will probably sound bad):
ffmpeg -i input.wav -ar 8000 -ac 1 -c:a pcm_s16le output.wav
ffmpeg -i input.wav -ar 8000 -ac 2 -c:a pcm_u8 output.wav
Another option is to use a lossless compressed format. The quality will be the same as WAV but the file size can be significantly smaller. Example for FLAC:
$ ffmpeg -i audio.wav audio.flac
$ ls -alh audio.wav audio.flac
6.1M audio.flac
11M audio.wav
I usually do this using Audacity
1) Import the WAV file into Audacity
2) Then File>Export
3) Choose "Constant" and then from the Quality drop-down select your required bit-rate
I haven't tried that with ffmpeg, but the command should be:
ffmpeg -i input.wav -ab 64000 output.wav

How can I generate encoded HEVC bitstream using ffmpeg?

I am able to encode a YUV file to MP4 using HEVC:
ffmpeg.exe -f rawvideo -s 1920x1080 -pix_fmt yuv420p -i input.yuv -c:v hevc -r 30 -x265-params crf=27 -vframes 300 -an -y test.mp4
Here is what mp4box -info test.mp4 shows:
* Movie Info *
Timescale 1000 - Duration 00:00:10.000
1 track(s)
Fragmented File: no
File suitable for progressive download (moov before mdat)
File Brand isom - version 512
Created: UNKNOWN DATE
Modified: UNKNOWN DATE
File has no MPEG4 IOD/OD
iTunes Info:
Encoder Software: Lavf56.11.100
Track # 1 Info - TrackID 1 - TimeScale 15360 - Media Duration 00:00:10.000
Track has 1 edit lists: track duration is 00:00:10.000
Media Info: Language "Undetermined" - Type "vide:hev1" - 300 samples
Visual Track layout: x=0 y=0 width=1920 height=1080
MPEG-4 Config: Visual Stream - ObjectTypeIndication 0x23
HEVC Video - Visual Size 1920 x 1080
HEVC Info: Profile Main # Level 5 - Chroma Format 1
NAL Unit length bits: 32 - general profile compatibility 0x60000000
Parameter Sets: 1 VPS 1 SPS 1 PPS
SPS resolution 1920x1080
Bit Depth luma 8 - Chroma 8 - 1 temporal layers
But how can I get a decodable bitstream? I tried:
mp4box -raw 1 test.mp4 -out out.bin
It gives:
Extracting MPEG-H HEVC stream to hevc
But out.bin couldn't be decoded by HM or Elecard.
Thanks
Use
ffmpeg -i input.mp4 -c:v hevc -f hevc out.bin
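to generate an Annex B bytestream. This can be fed to another decoder. Note that -c:v hevc re-encodes the video; to keep the existing bitstream unchanged, use -c:v copy -bsf:v hevc_mp4toannexb instead, which only converts the stream to Annex B.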
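A quick way to check whether an extracted file is plausibly in Annex B format is to look for a start code at the very beginning. The Python sketch below is a heuristic only: the function name is hypothetical, and a length-prefixed stream whose first length field happens to begin with 00 00 01 would be misclassified.

```python
def looks_like_annex_b(data: bytes) -> bool:
    """Heuristic: True if data begins with an Annex B start code.

    Annex B streams separate NAL units with 00 00 01 or 00 00 00 01,
    whereas MP4-style streams prefix each NAL unit with its length.
    """
    return data.startswith(b"\x00\x00\x01") or data.startswith(b"\x00\x00\x00\x01")


# Synthetic examples (not taken from a real file): 0x40 0x01 is the
# two-byte NAL header of an HEVC VPS.
annex_b_head = b"\x00\x00\x00\x01\x40\x01"
length_prefixed_head = b"\x00\x00\x00\x17\x40\x01"   # 4-byte length of 23

print(looks_like_annex_b(annex_b_head))          # True
print(looks_like_annex_b(length_prefixed_head))  # False
```

To test a real file, read its first few bytes in binary mode (for example the first four bytes of out.bin) and pass them to the function.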
