How can I determine the length (in ms) of an audio file (e.g. .wav) using RubyAudio?
s = RubyAudio::Sound.open("1.wav")
You can get the sound info with:
song_info = s.info
The info contains the sample rate and the number of frames, which you can use to calculate the duration of the sound file. Use float arithmetic so integer division doesn't truncate the result, and multiply by 1000 to get milliseconds:
duration_ms = song_info.frames * 1000.0 / song_info.samplerate
From a cursory look at the docs, it looks like you can't do that with RubyAudio.
Have you tried looking at ruby-mp3info? I don't know if it's still actively developed, nor if it works for multiple audio formats, but it claims to be able to give you the duration of an mp3.
An alternative would be to estimate the duration from the bitrate and the file size.
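For a constant-bitrate file the estimate is simple: duration in seconds ≈ (file size in bytes × 8) / bitrate in bits per second. For example, a 4,000,000-byte MP3 at 128 kbps gives 4,000,000 × 8 / 128,000 = 250 seconds. For VBR files this is only a rough approximation.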
RubyAudio doesn't appear to have been updated in six years, and its documentation is sparse. If you're able to, I'd recommend using rtaglib instead.
However, if you're married to RubyAudio, it looks like you can get both a frame count (Audio::Soundfile#frames) and a sample (frame) rate (Audio::Soundfile#samplerate). Knowing this, you should be able to divide the number of frames by the sample rate to get the length of the file in seconds.
I've unsuccessfully mucked around with this on my own and need help.
Given the public web camera feed at https://itsvideo.arlingtonva.us:8011/live/cam58.stream/playlist.m3u8, I'd like to be able to capture the video feed into an MP4 or MPG file with a reasonably accurate timestamp, using the Windows command line (so I can put it into a batch script, etc.).
This is probably easy for someone who is already a wiz with VLC or FFmpeg or some such tool.
Additional wish list items would be to call up a higher resolution stream for a shorter duration (so as to balance I/O impact) and/or to just get still images instead of the video offered.
For instance, the .m3u8 playlist file has the following contents:
#EXTM3U
#EXT-X-VERSION:3
#EXT-X-STREAM-INF:BANDWIDTH=214105,CODECS="avc1.100.40",RESOLUTION=352x288
chunklist_w977413411.m3u8
Would there be a way to substitute any of these to increase the resolution and reduce the video duration in a corresponding way so that net I/O is the same? Or even to just get a still image, whether higher res or not?
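In case a concrete starting point helps: assuming a stock ffmpeg build on the PATH, the usual approach is to read the playlist URL directly and copy the stream for a fixed duration (the 60-second duration and file names below are placeholders):
ffmpeg -i "https://itsvideo.arlingtonva.us:8011/live/cam58.stream/playlist.m3u8" -c copy -t 60 capture.mp4
A single still image can be grabbed the same way with -frames:v 1 (this one re-encodes the frame to JPEG):
ffmpeg -i "https://itsvideo.arlingtonva.us:8011/live/cam58.stream/playlist.m3u8" -frames:v 1 snapshot.jpg
In a batch file, the output name can embed a timestamp via the %date% and %time% variables, though their format is locale-dependent and usually needs some substring massaging first.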
The first problem is with audio resampling. I'm trying to modify doc/examples/transcode_aac.c so that it also resamples from 41100 Hz to 48000 Hz; as shipped, it contains a warning that it can't do that.
Using doc/examples/resampling_audio.c as a reference, I saw that before calling swr_convert, I need to find the number of audio samples at the output with code like this:
int dst_nb_samples = av_rescale_rnd(
    input_frame->nb_samples + swr_get_delay(resampler_context, 41100),
    48000, 41100, AV_ROUND_UP);
The problem is, when I just set int dst_nb_samples = input_frame->nb_samples (which is 1024), it encodes and plays normally, but when I use the av_rescale_rnd computation (which yields 1196), the audio is slowed down and distorted, as if there were skips in it.
The second problem is with muxing WebM with Opus audio.
When I set AVStream->time_base to 1/48000 and increase AVFrame->pts by 960, the resulting file reports a much longer duration in the player: 17 seconds of audio shows as 16m11s, but it plays normally.
When I increase pts by 20 instead, the duration displays correctly, but I get a lot of [libopus @ 00ffa660] Queue input is backward in time messages during encoding. The same happens with a pts increment of 30.
Should I try a time base of 1/1000? WebM always has its timecodes in milliseconds, and Opus has a packet size of 20 ms (960 samples at 48000 Hz).
(Search for pts += 20; in the file linked below to find the relevant code.)
Here is the whole file; all modifications I made are marked with //MINE: http://www.mediafire.com/file/jlgo7x4hiz7bw64/transcode_aac.c
Here is the file I tested it on: http://www.mediafire.com/file/zdy0zarlqw3qn6s/480P_600K_71149981_soundonly.mkv
The easiest way to achieve that is by using swr_convert_frame, which takes an input frame and resamples it into a completely different output frame.
You can read more about it here: https://ffmpeg.org/doxygen/3.2/swresample_8h_source.html
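A minimal sketch of that call (assuming swr_ctx was created with swr_alloc() and in_frame is the decoded input frame; as of FFmpeg 3.x, swr_convert_frame configures the context from the two frames' parameters if it isn't initialized yet, and allocates the output buffer if the output frame has none):

// Output frame describing the target format; swr_convert_frame() fills it in.
AVFrame *out_frame = av_frame_alloc();
out_frame->channel_layout = AV_CH_LAYOUT_STEREO;
out_frame->sample_rate    = 48000;
out_frame->format         = AV_SAMPLE_FMT_S16;

// The resampler buffers samples internally, so out_frame->nb_samples
// may differ from in_frame->nb_samples on any given call.
int ret = swr_convert_frame(swr_ctx, out_frame, in_frame);
if (ret < 0) {
    // handle the error (e.g. AVERROR_INPUT_CHANGED if the input format changed)
}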
dst_nb_samples can be calculated like this:
dst_nb_samples = 48000.0 / audio_stream->codec->sample_rate * inputAudioFrame->nb_samples;
Yours is probably correct too; I didn't check, but this is the one I've used before, and the number you gave checks out. So the real problem is probably somewhere else. Try supplying 960 samples at a time, in sync with the video frames; to do this you need to store the decoded audio in an additional linear buffer first. See if that fixes the problem.
And/or:
Secondly, my experience says the audio pts increases by the number of samples per frame (e.g. 960 samples at 48000 Hz, which also matches 50 fps video: 48000/50 = 960), not by milliseconds. If you supply 1196 samples, use pts += 1196 (if you don't use the additional buffer I mentioned above). This is different from video frame pts.
You are definitely on the right path. I'll examine the source code if I have time. Anyway, hope that helps.
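To make the buffering suggestion above concrete, here is a rough sketch using AVAudioFifo, the same helper transcode_aac.c itself uses (out_frame, enc_frame and next_pts are placeholders; enc_frame must already have buffers for 960 samples, e.g. via av_frame_get_buffer()):

// Allocate once; sample format and channel count must match the encoder input.
AVAudioFifo *fifo = av_audio_fifo_alloc(AV_SAMPLE_FMT_S16, 2, 1);

// After each resampler call, push however many samples came out;
// the FIFO reallocates itself as needed.
av_audio_fifo_write(fifo, (void **)out_frame->data, out_frame->nb_samples);

// Drain in fixed Opus-sized chunks of 960 samples; pts then advances
// by 960 per packet in a 1/48000 time base (20 ms per packet).
while (av_audio_fifo_size(fifo) >= 960) {
    av_audio_fifo_read(fifo, (void **)enc_frame->data, 960);
    enc_frame->nb_samples = 960;
    enc_frame->pts = next_pts;
    next_pts += 960;
    // send enc_frame to the encoder here
}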
I am working on a project where everybody has to activate a part of a song. I have about 7000 mp3s, each with the same length as the final mix but containing only a small piece of audio. So, for example, you can hear a drum hit at the 15th second and the rest of the mp3 (about 4 min.) is silence.
I use the amix filter to add all the mp3s together, 32 at a time.
In my first test run, the first mp3s mixed in come out nearly silent (I set the volume on the mix to the number of tracks). The sound is also of poor quality after the mix. Can I fix this?
Or do you think this can't be done with ffmpeg? Do you know an alternative program to do this?
If you are using the amix filter, add normalize=0 at the end, before specifying the output file. This makes ffmpeg keep your audio inputs at their original volume instead of scaling them down.
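For example, mixing three files (placeholder names; the normalize option requires a reasonably recent ffmpeg build):
ffmpeg -i 1.mp3 -i 2.mp3 -i 3.mp3 -filter_complex "amix=inputs=3:normalize=0" mix.mp3
With normalize=0 the inputs are summed as-is, so early tracks don't get progressively quieter as you mix batch after batch.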
I'm trying to make a simple CLI program that parses an SRT subtitle file and creates a new one, editing the timestamps to fit the desired framerate.
E.g. I have a one-hour video track that runs at 25.0 fps, with proper subtitles.
When encoding the same video at 23.976 fps, the output video is a few seconds shorter (approximately 3 seconds).
I've tried applying the following cross-multiplication to each time value in my SRT file:
timestamp = timestamp * outputfps / inputfps
This produces captions that are approx. 3 minutes early compared to the input SRT (for the last captions; for the first ones the offset is obviously smaller), whereas the maximum offset should be 3 seconds, going by the new video file's length.
This is all new for me and it seems obvious that something's wrong with the way I convert these timestamps. Could you please highlight my mistake?
Edit: According to j_random_hacker's clever answer, the video should have the same duration at 25 fps as at 12 fps, which is easily verified. It seems the 3-second offset I get is there no matter what the output framerate is; I guess there's some sort of trimming happening behind the scenes.
The main question remains: how does one convert a subtitle track so it doesn't go out of sync as the video file plays? (See my own comment below if this is unclear.)
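For what it's worth, the arithmetic can be checked in isolation. This small sketch (hypothetical helper names, no real SRT parsing or error handling) reproduces the drift described above on the last caption of a one-hour track:

#include <stdio.h>

/* Parse "HH:MM:SS,mmm" into milliseconds. */
static long parse_ms(const char *s) {
    int h, m, sec, ms;
    sscanf(s, "%d:%d:%d,%d", &h, &m, &sec, &ms);
    return ((h * 60L + m) * 60L + sec) * 1000L + ms;
}

/* Print milliseconds back as "HH:MM:SS,mmm". */
static void print_ts(long t) {
    printf("%02ld:%02ld:%02ld,%03ld\n",
           t / 3600000, t / 60000 % 60, t / 1000 % 60, t % 1000);
}

int main(void) {
    double scale = 23.976 / 25.0;       /* outputfps / inputfps, as in the question */
    long t = parse_ms("01:00:00,000");  /* last caption of a one-hour track */
    print_ts((long)(t * scale + 0.5));  /* prints 00:57:32,544 */
    return 0;
}

The scaled caption lands roughly two and a half minutes early, and the drift grows with the timestamp, while the observed difference between the two encodes is a flat 3 seconds. That mismatch suggests the fps ratio is the wrong correction entirely, which matches the edit above.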
I have a series of video files encoded in MPEG-2 (I can change this encoding), and I have to produce a movie in Flash FLV (this is a requirement, I can't change that encoding).
One destination movie is a compilation of different source video files.
I have a playlist defining the destination movie. For example:
Video file      Position   Offset   Length
little_gnomes    0          0        8.5
fairies          5.23       0.12    12.234
pixies          14          0        9.2
Video file is the name of the file; position is when the file should start (on the master timeline); offset is the offset within the video file; and length is the length of video to play. The numbers are seconds (doubles).
This would result in something like this (final movie timeline):
0--5.23|--8.5|--14|--17.464|--23.2|
little_gnomes **************
fairies *********************
pixies *****************
Where videos overlap, the last video added overrides the earlier one; the audio should be mixed.
The resulting video track would be:
0--5.23|--8.5|--14|--17.464|--23.2|
little_gnomes *******
fairies ***********
pixies *****************
While the resulting audio would be:
0--5.23|--8.5|--14|--17.464|--23.2|
little_gnomes 11111112222222
fairies 222222211112222222222
pixies 22222222221111111
Where a 1 or a 2 is the number of audio tracks mixed at that point.
There can be a maximum of 3 audio tracks.
I need to write a program which takes the playlist as input and produces the FLV file. I'm open to any solution (it must be free/open source).
An existing tool that can do this would be the simplest option, but I found none. As for making my own solution, the only thing I found is ffmpeg; I was able to do basic things with it, but the documentation is terribly lacking.
It can be in any language, and it doesn't have to be super fast (if it takes 30 minutes to build a 1-hour movie, that's fine).
The solution will run on OpenSolaris-based x64 servers. If I have to use Linux, that would work too, but Windows is out of the question.
I finally ended up writing my own solution from scratch, using the ffmpeg libraries. It's a lot of boilerplate code, but in the end the logic is not complicated.
I found the MLT framework, which helped me greatly.
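As a pointer for anyone with the same problem: MLT ships a command-line front end, melt, that can already express simple cut-and-append edits. Something along these lines (a sketch from memory; IN and OUT are placeholder frame numbers, since melt addresses clips by frame rather than by seconds, so the playlist's offset and length have to be converted at the project fps):
melt little_gnomes.mpg fairies.mpg in=IN out=OUT pixies.mpg -consumer avformat:movie.flv
The overlapping-video-plus-mixed-audio part of the playlist needs MLT's multitrack and transition features (or its C API), which is presumably where the framework helps most.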
Here are two related questions:
Command line video editing tools
https://superuser.com/questions/74028/linux-command-line-tool-for-video-editing
Avisynth sounds as if it might do what you want, but it's Windows-only.
You may very well end up writing your own application using the FFmpeg library. You're right, the documentation could be better... but the tutorial by Stephen Dranger is a good place to start (if you don't know it yet).
Well, if you prefer Java, I've written several similar programs using Xuggler's API.
If your videos/images are already online, you may use the Stupeflix API to create the final videos. You can change the soundtrack, add filters to the video, and much more. Here are the documentation and an online demo: https://developer.stupeflix.com/documentation/