I am using fluent-ffmpeg to resize a video.
I can't figure out what's happening, though. I have two video files; one works but the other doesn't. I've been scouring the mediainfo outputs of both files, checking for discrepancies, but other than file size, duration, etc. there's no difference (same codec, format, width/height, frame rate, etc.).
Here's a link to both files.
I've been reading these video files into fluent-ffmpeg using an input stream, like so:
await new Promise((resolve, reject) => {
ffmpeg(file.stream)
.output(path)
.size('426x240')
.on('start', function() {
console.log('started');
})
.on('error', function(err) {
  console.log('An error occurred: ' + err.message);
  reject(err); // settle the promise so the await doesn't hang on failure
})
.on('progress', function(progress) {
console.log('... frames: ' + progress.frames);
})
.on('end', function() {
console.log('Finished processing');
resolve();
})
.run();
});
The working file prints:
started
... frames: 86
... frames: 107
Finished processing
But the non-working file doesn't seem to have any frames, and prints:
started
... frames: 0
Finished processing
Any idea what could be wrong?
The ffmpeg command being executed:
ffmpeg -i pipe:0 -y -filter:v scale=w=426:h=240 uploads/works.mp4
"I've been scouring the mediainfo outputs of both files, checking for discrepancies but other than filesize, duration etc. there's no difference"
There is a difference, but only in full mode: run mediainfo -f on the files and you'll see:
IsStreamable : Yes
for the working file, and
IsStreamable : No
for the non-working file.
A "no" here means the input needs to support seeking: the header (the MP4 moov atom) is at the end of the file, so the player has to seek to the end to parse the header, then seek back to the beginning to read the data.
It seems like ffmpeg has trouble probing the file when you pass it as a stream, but it does work when you pass it as a file. This could be because probing/demuxing can optionally use seeks. I tried increasing the probe buffer but didn't get it to work.
This does not work:
cat doesnt_work.mp4 | ffmpeg -i pipe:0 test.mp4
But this works:
ffmpeg -i doesnt_work.mp4 test.mp4
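If you can preprocess the files, a likely workaround is to remux them so the moov atom sits at the front of the file; ffmpeg's -movflags faststart option does exactly that. A minimal sketch in Python (assuming ffmpeg is on PATH and doesnt_work.mp4 is the problem file):
import subprocess

# Remux without re-encoding and move the moov atom to the front,
# so the result can be read from a non-seekable pipe.
subprocess.run(
    ["ffmpeg", "-i", "doesnt_work.mp4",
     "-c", "copy",                 # copy streams, no re-encode
     "-movflags", "faststart",     # relocate the moov atom to the start
     "streamable.mp4"],
    check=True,
)
After that, cat streamable.mp4 | ffmpeg -i pipe:0 test.mp4 should behave like the working file.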
Related
While trying to use PyAV to encode live mono audio from a microphone into a compressed audio stream (using mp2 or flac as the encoder), the program kept raising ValueError: [Errno 22] Invalid argument.
To remove the live microphone source as a cause of the problem, and to make the problematic code easier for others to run/test, I have removed the mic source and now just generate a pure tone as a sequence of input buffers.
All attempts to figure out the missing or mismatched or incorrect argument have just resulted in seeing documentation and examples that are the same as my code.
I would like to know from someone who has used PyAV successfully for mono audio what the correct method and parameters are for encoding mono frames into the mono stream.
The package used is av 10.0.0 installed with
pip3 install av --no-binary av
so it uses my package-manager provided ffmpeg library, which is version 4.2.7.
The problematic python code is:
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Recreating an error 22 when encoding sound with PyAV.
Created on Sun Feb 19 08:10:29 2023
@author: andrewm
"""
import typing
import sys
import math
import fractions
import av
from av import AudioFrame
""" Ensure some PyAudio constants are still defined without changing
the PyAudio recording callback function and without depending
on PyAudio simply for reproducing the PyAV bug [Errno 22] thrown in
File "av/filter/context.pyx", line 89, in av.filter.context.FilterContext.push
"""
class PA_Stub():
paContinue = True
paComplete = False
pyaudio = PA_Stub()
"""Generate pure tone at given frequency with amplitude 0...1.0 at
sampling frequency fs and beginning at phase offset 'phase'.
Returns the new phase after the sinusoid has cycled over the
sampling window length.
"""
def generate_tone(
freq:int, phase:float, amp:float, fs, samp_fmt, buffer:bytearray
) -> float:
assert samp_fmt == "s16", "Only s16 supported atm"
samp_size_bytes = 2
n_samples = int(len(buffer)/samp_size_bytes)
window = [int(0) for i in range(n_samples)]
theta = phase
phase_inc = 2*math.pi * freq / fs
for i in range(n_samples):
v = amp * math.sin(theta)
theta += phase_inc
s = int((2**15-1)*v)
window[i] = s
for sample_i in range(len(window)):
byte_i = sample_i * samp_size_bytes
enc = window[sample_i].to_bytes(
2, byteorder=sys.byteorder, signed=True
)
buffer[byte_i] = enc[0]
buffer[byte_i+1] = enc[1]
return theta
channels = 1
fs = 44100 # Record at 44100 samples per second
fft_size_samps = 256
chunk_samps = fft_size_samps * 10 # Record in chunks that are multiples of fft windows.
# print(f"fft_size_samps={fft_size_samps}\nchunk_samps={chunk_samps}")
seconds = 3.0
out_filename = "testoutput.wav"
# Store data in chunks for 3 seconds
sample_limit = int(fs * seconds)
sample_len = 0
frames = [] # Initialize array to store frames
ffmpeg_codec_name = 'mp2' # flac, mp3, or libvorbis make same error.
sample_size_bytes = 2
buffer = bytearray(int(chunk_samps*sample_size_bytes))
chunkperiod = chunk_samps / fs
total_chunks = int(math.ceil(seconds / chunkperiod))
phase = 0.0
### uncomment if you want to see the synthetic data being used as a mic input.
# with open("test.raw","wb") as raw_out:
# for ci in range(total_chunks):
# phase = generate_tone(2600, phase, 0.8, fs, "s16", buffer)
# raw_out.write(buffer)
# print("finished gen test")
# sys.exit(0)
# #----
# Using mp2 or mkv as the container format gets the same error.
with av.open(out_filename+'.mp2', "w", format="mp2") as output_con:
output_con.metadata["title"] = "My title"
output_con.metadata["key"] = "value"
channel_layout = "mono"
sample_fmt = "s16p"
ostream = output_con.add_stream(ffmpeg_codec_name, fs, layout=channel_layout)
assert ostream is not None, "No stream!"
cctx = ostream.codec_context
cctx.sample_rate = fs
cctx.time_base = fractions.Fraction(numerator=1,denominator=fs)
cctx.format = sample_fmt
cctx.channels = channels
cctx.layout = channel_layout
print(cctx, f"layout#{cctx.channel_layout}")
# Define PyAudio-style callback for recording plus PyAV transcoding.
def rec_callback(in_data, frame_count, time_info, status):
global sample_len
global ostream
frames.append(in_data)
nsamples = int(len(in_data) / (channels*sample_size_bytes))
frame = AudioFrame(format=sample_fmt, layout=channel_layout, samples=nsamples)
frame.sample_rate = fs
frame.time_base = fractions.Fraction(numerator=1,denominator=fs)
frame.pts = sample_len
frame.planes[0].update(in_data)
print(frame, len(in_data))
for out_packet in ostream.encode(frame):
output_con.mux(out_packet)
for out_packet in ostream.encode(None):
output_con.mux(out_packet)
sample_len += nsamples
retflag = pyaudio.paContinue if sample_len<sample_limit else pyaudio.paComplete
return (in_data, retflag)
print('Beginning')
### some e.g. PyAudio code which starts the recording process normally.
# istream = p.open(
# format=sample_format,
# channels=channels,
# rate=fs,
# frames_per_buffer=chunk_samps,
# input=True,
# stream_callback=rec_callback
# )
# print(istream)
# Normally at this point you just sleep the main thread while
# PyAudio calls back with mic data, but here it is all generated.
for ci in range(total_chunks):
phase = generate_tone(2600, phase, 0.8, fs, "s16", buffer)
ret_data, ret_flag = rec_callback(buffer, ci, {}, 1)
print('.', end='')
print(" closing.")
# Stop and close the istream
# istream.stop_stream()
# istream.close()
If you uncomment the RAW output part, you will find the generated data can be imported into Audacity as PCM s16 mono 44100 Hz and plays the expected tone, so the generated audio data does not seem to be the problem.
The normal program console output up until the exception is:
<av.AudioCodecContext audio/mp2 at 0x7f8e38202cf0> layout#4
Beginning
<av.AudioFrame 0, pts=0, 2560 samples at 44100Hz, mono, s16p at 0x7f8e38202eb0> 5120
.<av.AudioFrame 0, pts=2560, 2560 samples at 44100Hz, mono, s16p at 0x7f8e382025f0> 5120
The stack trace is:
Traceback (most recent call last):
File "Dev/multichan_recording/av_encode.py", line 147, in <module>
ret_data, ret_flag = rec_callback(buffer, ci, {}, 1)
File "Dev/multichan_recording/av_encode.py", line 121, in rec_callback
for out_packet in ostream.encode(frame):
File "av/stream.pyx", line 153, in av.stream.Stream.encode
File "av/codec/context.pyx", line 484, in av.codec.context.CodecContext.encode
File "av/audio/codeccontext.pyx", line 42, in av.audio.codeccontext.AudioCodecContext._prepare_frames_for_encode
File "av/audio/resampler.pyx", line 101, in av.audio.resampler.AudioResampler.resample
File "av/filter/graph.pyx", line 211, in av.filter.graph.Graph.push
File "av/filter/context.pyx", line 89, in av.filter.context.FilterContext.push
File "av/error.pyx", line 336, in av.error.err_check
ValueError: [Errno 22] Invalid argument
Edit: it's interesting that the error happens on the 2nd AudioFrame; the first one is apparently encoded okay, even though both frames are given the same attribute values aside from the presentation timestamp (pts). Leaving pts out and letting PyAV/ffmpeg generate it does not fix the error, so an incorrect pts does not seem to be the cause.
After a brief glance at av/filter/context.pyx, the exception must come from a bad return value from res = lib.av_buffersrc_write_frame(self.ptr, frame.ptr)
Digging into av_buffersrc_write_frame in the ffmpeg source, it is not clear what could be causing this error. The only obvious candidate is a mismatch between channel layouts, but my code sets the same layout on the Stream and the Frame. That problem was identified by an old question, "pyav - cannot save stream as mono", whose answer (that one required parameter is undocumented) is the only reason the code now has the layout='mono' argument when creating the stream.
The program output shows layout #4 is being used, and from https://github.com/FFmpeg/FFmpeg/blob/release/4.2/libavutil/channel_layout.h you can see this is the value of the symbol AV_CH_FRONT_CENTER, which is the only channel in the MONO layout.
The mismatch is surely some other object property or an undocumented parameter requirement.
How do you encode mono audio to a compressed stream with PyAV?
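For reference, here is a stripped-down sketch of the pattern I am trying to get working (it uses numpy for brevity, which the full program above does not; treat it as an untested sketch, not a known-good answer):
import numpy as np
import av

fs = 44100
with av.open("tone_sketch.mp2", "w") as out_con:
    stream = out_con.add_stream("mp2", fs, layout="mono")
    t = np.arange(int(fs * 3.0)) / fs                    # 3 seconds of samples
    pcm = (0.8 * np.sin(2 * np.pi * 2600 * t) * (2**15 - 1)).astype(np.int16)
    for off in range(0, len(pcm), 2560):                 # same chunk size as above
        chunk = pcm[off:off + 2560]
        # packed s16 mono: shape (1, n_samples)
        frame = av.AudioFrame.from_ndarray(chunk.reshape(1, -1), format="s16", layout="mono")
        frame.sample_rate = fs
        for packet in stream.encode(frame):
            out_con.mux(packet)
    for packet in stream.encode(None):                   # flush once, at the very end
        out_con.mux(packet)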
I installed the Google Assistant SDK on a Raspberry Pi 3. The speaker is a Home Mini Bluetooth speaker, paired and connected to the Pi; I played audio from YouTube and it works. Even Google says it's connected!
However, when running this command in the terminal (inside the env):
googlesamples-assistant-pushtotalk --project-id (not going to paste ID) --device-model-id
I get the following:
/home/pi/env/lib/python3.5/site-packages/google/auth/crypt/_cryptography_rsa.py:22: CryptographyDeprecationWarning: Python 3.5 support will be dropped in the next release of cryptography. Please upgrade your Python.
import cryptography.exceptions
INFO:root:Connecting to embeddedassistant.googleapis.com
Traceback (most recent call last):
File "/home/pi/env/bin/googlesamples-assistant-pushtotalk", line 8, in
sys.exit(main())
File "/home/pi/env/lib/python3.5/site-packages/click/core.py", line 722, in call
return self.main(*args, **kwargs)
File "/home/pi/env/lib/python3.5/site-packages/click/core.py", line 697, in main
rv = self.invoke(ctx)
File "/home/pi/env/lib/python3.5/site-packages/click/core.py", line 895, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/pi/env/lib/python3.5/site-packages/click/core.py", line 535, in invoke
return callback(*args, **kwargs)
File "/home/pi/env/lib/python3.5/site-packages/googlesamples/assistant/grpc/pushtotalk.py", line 351, in main
flush_size=audio_flush_size
File "/home/pi/env/lib/python3.5/site-packages/googlesamples/assistant/grpc/audio_helpers.py", line 190, in init
blocksize=int(block_size/2), # blocksize is in number of frames.
File "/home/pi/env/lib/python3.5/site-packages/sounddevice.py", line 1345, in init
**_remove_self(locals()))
File "/home/pi/env/lib/python3.5/site-packages/sounddevice.py", line 762, in init
samplerate)
File "/home/pi/env/lib/python3.5/site-packages/sounddevice.py", line 2571, in _get_stream_parameters
info = query_devices(device)
File "/home/pi/env/lib/python3.5/site-packages/sounddevice.py", line 569, in query_devices
raise PortAudioError('Error querying device {0}'.format(device))
sounddevice.PortAudioError: Error querying device -1
When using arecord -l or aplay -l in the terminal, I get the same message for both: "aplay: device_list:270: no soundcards found..."
Also, running a test in the terminal with speaker-test -t wav, the test runs but produces no sound:
speaker-test 1.1.3
Playback device is default
Stream parameters are 48000Hz, S16_LE, 1 channels
WAV file(s)
Rate set to 48000Hz (requested 48000Hz)
Buffer size range from 9600 to 4194304
Period size range from 480 to 4096
Using max buffer size 4194304
Periods = 4
was set period_size = 4096
was set buffer_size = 4194304
0 - Front Left
Time per period = 0.339021
0 - Front Left
Time per period = 0.315553
0 - Front Left
Time per period = 0.315577
(keeps generating, but with no sound)
Finally, /home/pi/.asoundrc (viewed with sudo nano) while connected to the speaker contains:
pcm.!default {
type plug
slave.pcm {
type bluealsa
device "x❌x❌x:x"
profile "a2dp"
}
}
ctl.!default {
type bluealsa
}
And /etc/asound.conf (sudo nano) seems to have been generated with different contents, also while connected to the same speaker:
pcm.!default {
type asym
capture.pcm "mic"
playback.pcm "speaker"
}
pcm.mic {
type plug
slave.pcm {
type bluealsa
device "XX:XX:XX:XX:XX:XX"
profile "sco"
}
}
pcm.speaker {
type plug
slave.pcm {
type bluealsa
device "XX:XX:XX:XX:XX:XX"
profile "sco"
}
}
I tried copying the contents of /etc/asound.conf into /home/pi/.asoundrc and ran speaker-test -t wav, but I get:
speaker-test 1.1.3
Playback device is default
Stream parameters are 48000Hz, S16_LE, 1 channels
WAV file(s)
ALSA lib bluealsa-pcm.c:680:(_snd_pcm_bluealsa_open) Couldn't get BlueALSA transport: No such device
Playback open error: -19, No such device
So, what's the deal?
Following the docs, I use the command below to draw a box and fill it with color.
ffmpeg -i output.mp4 -vf "drawbox=x=0:y=570:w=in_w:h=40:color=pink@0.5:t=fill" output_1.mp4
I got an error.
[Parsed_drawbox_0 @ 0x7fa5c6f05840] [Eval @ 0x7ffee6f23bc0] Undefined constant or missing '(' in 'fill'
Last message repeated 5 times
[Parsed_drawbox_0 @ 0x7fa5c6f05840] Error when evaluating the expression 'fill'.
[Parsed_drawbox_0 @ 0x7fa5c6f05840] Failed to configure input pad on Parsed_drawbox_0
You seem to be using an older ffmpeg version; upgrade or use max in place of fill.
Alternatively, you can set t (thickness) equal to h (the box height); t=max does this:
import ffmpeg
istream = ffmpeg.input("aaa.mp4")
istream = ffmpeg.drawbox(istream, x=0, width='iw', height='ih/2', y='ih/2', color='red', t="max")
ostream = ffmpeg.output(istream, "bbb.mp4")
ffmpeg.run(ostream)
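For reference: on an ffmpeg build new enough to know t=fill, the original command line above should work as written; t=max is the older equivalent of t=fill.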
When I used AndroidFFmpeg to play an audio file (an MP3), I got a "Header missing" issue while seeking.
Video playback is OK; the error occurs only with MP3 files.
Does anyone know how to fix it?
I have a temporary workaround: in mpegaudiodec.c, comment out the error return:
header = AV_RB32(buf);
if (header>>8 == AV_RB32("TAG")>>8) {
av_log(avctx, AV_LOG_DEBUG, "discarding ID3 tag\n");
return buf_size;
}
if (ff_mpa_check_header(header) < 0) {
av_log(avctx, AV_LOG_ERROR, "Header missing\n");
//return AVERROR_INVALIDDATA; do not return errors
}
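Note that this only hides the symptom: after a seek the decoder is presumably being handed data that does not begin on an MPEG audio frame boundary (or that starts inside an ID3 tag), which is why ff_mpa_check_header fails. A cleaner fix would make the seek go through the demuxer so packets stay frame-aligned.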
A couple of similar questions are on stackoverflow, but I haven't been able to figure this exact problem out.
I want to get a list of the fourccs for the avi codecs that FFMpeg can decode.
I know how to list all the formats (ffmpeg -formats) and codecs (ffmpeg -codecs), but neither gives me an accessible list of FourCCs, and neither does any documentation I can find.
I need this list so that my application can read the FourCC of an AVI file and determine whether to use ffmpeg or VfW (or DirectX) to try to decode the file.
Is there some ffmpeg command that can give me this list?
To extend Darren's answer (and because the comment facility doesn't allow this much text), here is the full list of codecs parsed from the isom_8c source file on ffmpeg.org:
raw yuv2 2vuy yuvs L555 L565 B565 24BG
BGRA RGBA ABGR b16g b48r bxbg bxrg bxyv
NO16 DVOO R420 R411 R10k R10g r210 AVUI
AVrp SUDS v210 bxy2 v308 v408 v410 Y41P
yuv4 jpeg mjpa AVDJ AVRn dmb1 mjpb SVQ1
svq1 svqi SVQ3 mp4v DIVX XVID 3IV2 h263
s263 dvcp dvc dvpp dv5p dv5n AVdv AVd1
dvhq dvhp dvh1 dvh2 dvh4 dvh5 dvh6 dvh3
VP31 rpza cvid 8BPS smc rle WRLE qdrw
WRAW avc1 ai5p ai5q ai52 ai53 ai55 ai56
ai1p ai1q ai12 ai13 ai15 ai16 m1v1 mpeg
m1v m2v1 hdv1 hdv2 hdv3 hdv4 hdv5 hdv6
hdv7 hdv8 hdv9 hdva mx5n mx5p mx4n mx4p
mx3n mx3p xd54 xd55 xd59 xd5a xd5b xd5c
xd5d xd5e xd5f xdv1 xdv2 xdv3 xdv4 xdv5
xdv6 xdv7 xdv8 xdv9 xdva xdvb xdvc xdvd
xdve xdvf xdhd xdh2 AVmp mjp2 tga tiff
gif png MNG vc-1 avs2 drac AVdn H263
3IVD AV1x AVup sgi dpx exr apch apcn
I don't know if it's comprehensive but the source code seems to contain a list of FourCCs.
Look at http://ffmpeg.org/doxygen/trunk/isom_8c-source.html
There are lots of lines like this:
{ CODEC_ID_AAC, MKTAG('m', 'p', '4', 'a') }
You should be able to download the latest source and write a script to pick them out.
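For example, a quick-and-dirty script along these lines would do it (a sketch; the path to the source file and the exact macro spelling of the codec IDs may differ between ffmpeg versions):
import re
import sys

# Matches entries like: { CODEC_ID_AAC, MKTAG('m', 'p', '4', 'a') }
PATTERN = re.compile(
    r"\{\s*(?:AV_)?CODEC_ID_(\w+)\s*,\s*"
    r"MKTAG\('(.)',\s*'(.)',\s*'(.)',\s*'(.)'\)"
)

with open(sys.argv[1]) as src:  # e.g. libavformat/isom.c from the ffmpeg source tree
    for codec, *chars in PATTERN.findall(src.read()):
        print("".join(chars), "->", codec)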
It is possible to get the mapping via the avformat API, without digging into the source code.
#include <libavformat/avformat.h>

uint32_t tag = MKTAG('H', '2', '6', '4');
const struct AVCodecTag *table[] = { avformat_get_riff_video_tags(), 0 };
enum AVCodecID vcodec = av_codec_get_id(table, tag);
The functions avformat_get_riff_video_tags, avformat_get_riff_audio_tags and av_codec_get_id are all defined in "libavformat/avformat.h".
You can also get the mapping for a specific format using the AVOutputFormat.codec_tag or AVInputFormat.codec_tag tables.