How to play and seek fragmented MP4 audio using MSE SourceBuffer? - ffmpeg

If you end up here, you might want to take a look at shaka-player and the accompanying shaka-streamer. Use it. Don't implement this yourself unless you really have to.
I have been trying for quite some time to play an audio track on Chrome, Firefox, Safari, etc., but I keep hitting brick walls. My current problem is that I am not able to seek within a fragmented MP4 (or MP3).
At the moment I am converting audio files such as MP3 to fragmented MP4 (fMP4) and sending them chunk-wise to the client. I define a CHUNK_DURATION_SEC (chunk duration in seconds) and compute a chunk size like this:
chunksTotal = Math.ceil(this.track.duration / CHUNK_DURATION_SEC);
chunkSize = Math.ceil(this.track.fileSize / this.chunksTotal);
With this I partition the audio file and can fetch it in full by advancing chunkSize bytes for each chunk:
| chunk 1 | chunk 2 | ... | chunk n |
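The partitioning above can be sketched as a pure function. This is a minimal sketch, not the original implementation: `chunkByteRange` and `rangeHeader` are hypothetical helpers, and the last range is clamped to the file size, which the formulas above do not do. Keep in mind that equal-sized byte ranges only approximate equal durations, and a range boundary will generally not coincide with an fMP4 fragment boundary.

```typescript
// Hypothetical helper: inclusive byte range for a zero-based chunk index.
interface ByteRange {
  start: number;
  end: number;
}

function chunkByteRange(chunkIndex: number, fileSize: number, chunksTotal: number): ByteRange | null {
  if (chunksTotal <= 0 || chunkIndex < 0 || chunkIndex >= chunksTotal) {
    return null;
  }
  const chunkSize = Math.ceil(fileSize / chunksTotal);
  const start = chunkIndex * chunkSize;
  if (start >= fileSize) {
    return null; // nothing left to fetch
  }
  // Clamp so the last chunk never reaches past the end of the file.
  const end = Math.min(start + chunkSize, fileSize) - 1;
  return { start, end };
}

// Formats the range the way the HTTP Range request header expects it.
function rangeHeader(range: ByteRange): string {
  return `bytes=${range.start}-${range.end}`;
}
```

For a 1000-byte file split into 3 chunks, the last chunk covers bytes 668 to 999 instead of running to 1001.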
How audio files are converted to fMP4
ffmpeg -i input.mp3 -acodec aac -b:a 256k -f mp4 \
-movflags faststart+frag_every_frame+empty_moov+default_base_moof \
This seems to work with Chrome and Firefox (so far).
How chunks are appended
After following this example, and realizing that it's simply not working as it is explained here, I threw it away and started over from scratch. Unfortunately without success. It's still not working.
The following code is supposed to play a track from the very beginning to the very end. However, I also need to be able to seek. So far, this is simply not working. Seeking will just stop the audio after the seeking event got triggered.
The code
/* Desired chunk duration in seconds. */
const AUDIO_EVENTS = [
class ChunksLoader {
/** The total number of chunks for the track. */
public readonly chunksTotal: number;
/** The length of one chunk in bytes */
public readonly chunkSize: number;
/** Keeps track of requested chunks. */
private readonly requested: boolean[];
/** URL of endpoint for fetching audio chunks. */
private readonly url: string;
private track: Track,
private sourceBuffer: SourceBuffer,
private logger: NGXLogger,
) {
this.chunksTotal = Math.ceil(this.track.duration / CHUNK_DURATION_SEC);
this.chunkSize = Math.ceil(this.track.fileSize / this.chunksTotal);
this.requested = [];
for (let i = 0; i < this.chunksTotal; i++) {
this.requested[i] = false;
this.url = `${environment.apiBaseUrl}/api/tracks/${}/play`;
* Fetch the first chunk.
public begin() {
* Handler for the "timeupdate" event. Checks if the next chunk should be fetched.
* @param currentTime
* The current time of the track which is currently played.
public handleOnTimeUpdate(currentTime: number) {
const nextChunkIndex = Math.floor(currentTime / CHUNK_DURATION_SEC) + 1;
const hasAllChunks = this.requested.every(val => !!val);
if (nextChunkIndex === (this.chunksTotal - 1) && hasAllChunks) {
this.logger.debug('Last chunk. Calling mediaSource.endOfStream();');
if (this.requested[nextChunkIndex] === true) {
if (currentTime < CHUNK_DURATION_SEC * (nextChunkIndex - 1 + 0.25)) {
* Fetches the chunk if it hasn't been requested yet. After the request finished, the returned
* chunk gets appended to the SourceBuffer-instance.
* @param chunkIndex
* The chunk to fetch.
private maybeFetchChunk(chunkIndex: number) {
const start = chunkIndex * this.chunkSize;
const end = start + this.chunkSize - 1;
if (this.requested[chunkIndex] == true) {
this.requested[chunkIndex] = true;
if ((end - start) == 0) {
this.logger.warn('Nothing to fetch.');
const totalKb = ((end - start) / 1000).toFixed(2);
this.logger.debug(`Starting to fetch bytes ${start} to ${end} (total ${totalKb} kB). Chunk ${chunkIndex + 1} of ${this.chunksTotal}`);
const xhr = new XMLHttpRequest();
xhr.open('get', this.url);
xhr.setRequestHeader('Authorization', `Bearer ${AuthenticationService.getJwtToken()}`);
xhr.setRequestHeader('Range', 'bytes=' + start + '-' + end);
xhr.responseType = 'arraybuffer';
xhr.onload = () => {
this.logger.debug(`Range ${start} to ${end} fetched`);
this.logger.debug(`Requested size: ${end - start + 1}`);
this.logger.debug(`Fetched size: ${xhr.response.byteLength}`);
this.logger.debug('Appending chunk to SourceBuffer.');
export enum StreamStatus {
export class PlayerState {
status: StreamStatus = StreamStatus.NOT_INITIALIZED;
providedIn: 'root'
export class MediaSourcePlayerService {
public track: Track;
private mediaSource: MediaSource;
private sourceBuffer: SourceBuffer;
private audioObj: HTMLAudioElement;
private chunksLoader: ChunksLoader;
private state: PlayerState = new PlayerState();
private state$ = new BehaviorSubject<PlayerState>(this.state);
public stateChange = this.state$.asObservable();
private currentTime$ = new BehaviorSubject<number>(null);
public currentTimeChange = this.currentTime$.asObservable();
private httpClient: HttpClient,
private logger: NGXLogger
) {
get canPlay() {
const state = this.state$.getValue();
const status = state.status;
return status == StreamStatus.PAUSED;
get canPause() {
const state = this.state$.getValue();
const status = state.status;
return status == StreamStatus.PLAYING || status == StreamStatus.SEEKING;
public playTrack(track: Track) {
this.track = track;
public play() {
public pause() {
public stop() {
public seek(seconds: number) {
this.audioObj.currentTime = seconds;
private startPlayingFrom(seconds: number) {
this.logger.info(`Start playing from ${seconds.toFixed(2)} seconds`);
this.mediaSource = new MediaSource();
this.mediaSource.addEventListener('sourceopen', this.onSourceOpen);
this.audioObj = document.createElement('audio');
this.addEvents(this.audioObj, AUDIO_EVENTS, this.handleEvent);
this.audioObj.src = URL.createObjectURL(this.mediaSource);
private onSourceOpen = () => {
this.mediaSource.removeEventListener('sourceopen', this.onSourceOpen);
this.mediaSource.duration = this.track.duration;
this.sourceBuffer = this.mediaSource.addSourceBuffer('audio/mp4; codecs="mp4a.40.2"');
// this.sourceBuffer = this.mediaSource.addSourceBuffer('audio/mpeg');
this.chunksLoader = new ChunksLoader(
private handleEvent = (e) => {
const currentTime = this.audioObj.currentTime.toFixed(2);
const totalDuration = this.track.duration.toFixed(2);
this.logger.warn(`MediaSource event: ${e.type} (${currentTime} of ${totalDuration} sec)`);
const currentStatus = this.state$.getValue();
switch (e.type) {
case 'playing':
currentStatus.status = StreamStatus.PLAYING;
case 'pause':
currentStatus.status = StreamStatus.PAUSED;
case 'timeupdate':
case 'seeking':
currentStatus.status = StreamStatus.SEEKING;
if (this.mediaSource.readyState == 'open') {
private addEvents(obj, events, handler) {
events.forEach(event => obj.addEventListener(event, handler));
Running it will give me the following output:
Apologies for the screenshot but it's not possible to just copy the output without all the stack traces in Chrome.
I also tried following this example and calling sourceBuffer.abort(), but that didn't work. It looks more like a hack that used to work years ago, yet it is still referenced in the docs (see "Example" -> "You can see something similar in action in Nick Desaulnier's bufferWhenNeeded demo ..").
case 'seeking':
currentStatus.status = StreamStatus.SEEKING;
if (this.mediaSource.readyState === 'open') {
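When the seeking event fires, the player has to decide whether the target time is already buffered and, if not, which chunk to request. A minimal sketch of that decision follows. `isTimeBuffered` and `chunkIndexForTime` are hypothetical helpers, `TimeRangesLike` mirrors the `length`/`start(i)`/`end(i)` shape of the DOM `TimeRanges` object exposed as `sourceBuffer.buffered`, and the time-to-chunk mapping assumes chunks of roughly equal duration, which byte-based partitioning only approximates.

```typescript
// Minimal shape of the DOM TimeRanges object (only what the sketch needs).
interface TimeRangesLike {
  readonly length: number;
  start(index: number): number;
  end(index: number): number;
}

// True if `seconds` falls inside one of the buffered ranges.
function isTimeBuffered(buffered: TimeRangesLike, seconds: number): boolean {
  for (let i = 0; i < buffered.length; i++) {
    if (seconds >= buffered.start(i) && seconds < buffered.end(i)) {
      return true;
    }
  }
  return false;
}

// Zero-based index of the chunk expected to contain `seconds`,
// clamped to the valid range of chunk indices.
function chunkIndexForTime(seconds: number, chunkDurationSec: number, chunksTotal: number): number {
  const index = Math.floor(seconds / chunkDurationSec);
  return Math.min(Math.max(index, 0), chunksTotal - 1);
}
```

In a seeking handler this could be used to skip the fetch when `isTimeBuffered(sourceBuffer.buffered, audioObj.currentTime)` is true. Note that a chunk appended out of order can only be parsed if it begins at a fragment boundary, which arbitrary byte ranges do not guarantee.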
Trying with MP3
I have tested the above code under Chrome by converting tracks to MP3:
ffmpeg -i input.mp3 -acodec aac -b:a 256k -f mp3 output.mp3
and creating a SourceBuffer using audio/mpeg as type:
I have the same problem when seeking.
The issue without seeking
The above code has another issue:
After two minutes of playing, the audio playback starts to stutter and comes to a halt prematurely. So, the audio plays up to a point and then it stops without any obvious reason.
For whatever reason there is another canplay and playing event. A few seconds later, the audio simply stops.
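The post does not establish why playback stalls. One thing worth ruling out, as an assumption rather than a diagnosis, is appending while a previous append is still in progress: `SourceBuffer.appendBuffer()` throws an `InvalidStateError` when `updating` is true. A minimal sketch of a queue that serializes appends follows. `AppendQueue` and `SourceBufferLike` are hypothetical names, and the interface lists only the members the sketch uses.

```typescript
// Only the SourceBuffer members this sketch relies on.
interface SourceBufferLike {
  readonly updating: boolean;
  appendBuffer(data: ArrayBuffer): void;
  addEventListener(type: 'updateend', listener: () => void): void;
}

class AppendQueue {
  private readonly pending: ArrayBuffer[] = [];
  private readonly sourceBuffer: SourceBufferLike;

  constructor(sourceBuffer: SourceBufferLike) {
    this.sourceBuffer = sourceBuffer;
    // Every finished append gives the next queued chunk its turn.
    sourceBuffer.addEventListener('updateend', () => this.flush());
  }

  // Queue a chunk; it is appended immediately if the buffer is idle.
  push(chunk: ArrayBuffer): void {
    this.pending.push(chunk);
    this.flush();
  }

  // Number of chunks still waiting to be appended.
  get size(): number {
    return this.pending.length;
  }

  private flush(): void {
    if (this.sourceBuffer.updating) {
      return; // an append is in flight; wait for 'updateend'
    }
    const next = this.pending.shift();
    if (next !== undefined) {
      this.sourceBuffer.appendBuffer(next);
    }
  }
}
```

With such a queue, the `xhr.onload` handler would call `queue.push(xhr.response)` instead of appending directly.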


Streaming audio from FFMPEG to browser via WebSocket and WebAudioApi

My project has 2 parts:
a web interface that the user accesses
and a standalone app installed on the computer, that acts as a websocket server.
From the web UI, the user has to hear his computer's microphone.
At the moment I have a working solution that listens to the microphone and sends the raw PCM audio chunks back to the web UI, which is able to play them. However, serious lag accumulates over time, even though everything runs on the same computer, so there is no network latency involved. That is why I am now testing FFMPEG.
So, here's the FFMPEG command for streaming microphone data:
ffmpeg.exe -re -f dshow -i audio="Microphone (HD Pro Webcam C920)" -ar 44100 -ac 1 -f f32le pipe:1
The data is sent successfully via WebSocket, but playing it with the Web Audio API is not working; I don't hear anything.
Can anyone point out what I am doing wrong?
Here's the web javascript:
let ipOfAudioServer = 'localhost';
let wsClient = null;
var audioCtx = null;
var subcounter = 0;
var audiobuffer = [];
var source = null;
// must match the values in the audio-server. Although the audio-server could send 2 channels, we restrict ourselves to one to save bandwidth
var sampleRate = 44100;
var channels = 1;
var microphone = 'Microphone (HD Pro Webcam C920)';
// this method reads current position from the audiobuffer and plays the audio
// the method will re-call itself, in order to play the next item in queue
this.play = function(soundName) {
var ffs = audiobuffer[subcounter];
if (ffs) {
var frameCount = ffs.byteLength;
console.log(frameCount, audiobuffer.length);
var myAudioBuffer = audioCtx.createBuffer(channels, frameCount, sampleRate);
if (myAudioBuffer != null)
subcounter += 1;
source = audioCtx.createBufferSource();
source.buffer = myAudioBuffer;
source.onended = () => { console.log("finished, continuing to seek buffer!"); play(soundName); }
// just in case the counter got to be bigger than the actual amount of items in the list, set it back to last one
if (subcounter > audiobuffer.length)
subcounter = audiobuffer.length;
// the method to initialize WS client
this.initWebsocketClient = function ()
if (wsClient == null)
wsClient = new WebSocket(`ws://${ipOfAudioServer}:23233`, "protocol");
wsClient.binaryType = "arraybuffer";
wsClient.onmessage = function (event)
if (typeof event.data === 'object') {
// clear memory in case buffer is already too big
if (subcounter > 50) {
console.log('cleared memory');
audiobuffer = [];
subcounter = 0;
if (audiobuffer.length == 1) {
else {
if (event.data == 'stopMicrophone=ok') {
wsClient = null;
audiobuffer = [];
subcounter = 0;
// method used in send() which will actually send the message only after connection has been established successfully.
this.waitForConnection = function (callback, interval) {
if (wsClient.readyState === 1) {
} else {
var that = this;
// optional: implement backoff for interval here
setTimeout(function () {
that.waitForConnection(callback, interval);
}, interval);
// using this method to send WS messages to the audio-server
this.send = function (message, callback)
this.waitForConnection(function () {
if (typeof callback !== 'undefined') {
}, 1000);
// called by clicking the start button
function startCapture() {
if (audioCtx == null)
audioCtx = new (window.AudioContext || window.webkitAudioContext)();
audiobuffer = [];
subcounter = 0;
// called by clicking the stop button
function stopCapture() {
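With -f f32le, ffmpeg emits raw 32-bit little-endian floats, so each WebSocket message carries byteLength / 4 samples rather than byteLength frames, which is what the createBuffer call above is given. A minimal sketch of the conversion follows. `decodeF32le` is a hypothetical helper, and it assumes every message contains whole, interleaved frames, which a pipe-based stream does not guarantee.

```typescript
// Splits interleaved 32-bit float PCM into one Float32Array per channel.
function decodeF32le(data: ArrayBuffer, channels: number): Float32Array[] {
  const bytesPerSample = 4;
  // Trailing bytes that do not form a complete frame are ignored.
  const frameCount = Math.floor(data.byteLength / (bytesPerSample * channels));
  const view = new DataView(data);
  const out: Float32Array[] = [];
  for (let c = 0; c < channels; c++) {
    out.push(new Float32Array(frameCount));
  }
  for (let frame = 0; frame < frameCount; frame++) {
    for (let c = 0; c < channels; c++) {
      const offset = (frame * channels + c) * bytesPerSample;
      out[c][frame] = view.getFloat32(offset, true); // true = little-endian
    }
  }
  return out;
}
```

In the browser, the buffer would then be created with the returned frame count, and each array copied in with `myAudioBuffer.copyToChannel(samples[c], c)` before `source.start()` is called.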

How to seek to a position in a song Discord.js?

I am having difficulty seeking to a specified timestamp in the current song. I have separate files for all my commands. I want to create a seek.js file that takes a specified time as input and passes it to the play.js file (which plays the current song in the queue), but I can't find a way to do this.
This is my play command.
const { Collector } = require("discord.js");
const ytdlDiscord = require("ytdl-core-discord");
//const play = require("../commands/play");
module.exports = {
async play(song, message){
const queue = message.client.queue.get(message.guild.id);
if(!queue.connection.dispatcher &&{;
queue.textChannel.send(`**Cadenza** left successfully`).catch(console.error);
else return;
return queue.textChannel.send(`**Music Queue Ended**`);
let stream = await ytdlDiscord(song.url,{filter: 'audioonly', quality: 'highestaudio', highWaterMark: 1<<25});
let streamType = song.url.includes("") ? "opus" : "ogg/opus";
queue.connection.on("disconnect", () => message.client.queue.delete(message.guild.id));
const dispatcher = queue.connection
.play(stream, {type: streamType, highWaterMark: 1})
.on("finish", () => {
let last = queue.songs.shift();
queue.songs.push(last);[0], message);
queue.songs.shift();[0], message);
.on("error", (err) => {
queue.songs.shift();[0], message);
dispatcher.setVolumeLogarithmic(queue.volume / 100);
queue.textChannel.send(`Started Playing **${song.title}**`);
seek command
const { play } = require("../include/play");
function timeConvert(str){
const t = str.split(':');
let s = 0, m = 1;
while(t.length > 0){
s += m * parseInt(t.pop(), 10);
m = m * 60;
return s;
module.exports = {
name: 'seek',
description: 'Seeks to a certain point in the current track.',
const queue = message.client.queue.get(message.guild.id);
if(!queue) return"There is no song playing.").catch(console.error);
queue.playing = true;
let time = timeConvert(args[0]);
if( time > queue.songs[0].duration)
return`**Input a valid time**`);
const timeMs = time * 1000; // same offset in milliseconds; redeclaring `time` with let would be a SyntaxError
// main code here
How can I pass the time variable to play() so that the current song seeks to that amount?
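One way to approach this, sketched under assumptions: parse the timestamp into seconds, then hand that value to whatever starts the stream. `parseTimestamp` and `buildPlayOptions` are hypothetical helpers. The `seek` field assumes that discord.js v12 accepts a seek offset in seconds in the options passed to `connection.play()` and ignores it for opus and ogg/opus stream types; check the documentation for the version in use.

```typescript
// Converts "ss", "mm:ss" or "hh:mm:ss" into seconds; NaN for anything else.
function parseTimestamp(text: string): number {
  const parts = text.trim().split(':');
  if (parts.length > 3) {
    return NaN;
  }
  let seconds = 0;
  for (const part of parts) {
    if (!/^\d+$/.test(part)) {
      return NaN; // empty or non-numeric component
    }
    seconds = seconds * 60 + parseInt(part, 10);
  }
  return seconds;
}

// Hypothetical shape of the options object handed to connection.play().
interface PlayOptions {
  type: string;
  seek: number;
  highWaterMark: number;
}

function buildPlayOptions(type: string, seekSeconds: number): PlayOptions {
  return { type, seek: Math.max(0, seekSeconds), highWaterMark: 1 };
}
```

If play() gained a third parameter such as `seekSeconds = 0`, seek.js could call `play(queue.songs[0], message, parseTimestamp(args[0]))` and play.js could pass `buildPlayOptions(streamType, seekSeconds)` on to the connection.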

Can't advertise on bluetooth

I want to create a Gatt Server in my Xamarin.Forms app so that other devices can scan for it via bluetooth. I am using this plugin:
This is my code to create a Gatt Server and advertise data:
server = CrossBleAdapter.Current.CreateGattServer();
var service = server.AddService(serviceGuid, true);
var characteristic = service.AddCharacteristic(
CharacteristicProperties.Read |
CharacteristicProperties.Write | CharacteristicProperties.WriteNoResponse,
GattPermissions.Read | GattPermissions.Write
var notifyCharacteristic = service.AddCharacteristic
CharacteristicProperties.Indicate | CharacteristicProperties.Notify,
GattPermissions.Read | GattPermissions.Write
IDisposable notifyBroadcast = null;
notifyCharacteristic.WhenDeviceSubscriptionChanged().Subscribe(e =>
var @event = e.IsSubscribed ? "Subscribed" : "Unsubscribed";
if (notifyBroadcast == null)
notifyBroadcast = Observable
.Where(x => notifyCharacteristic.SubscribedDevices.Count > 0)
.Subscribe(_ =>
Debug.WriteLine("Sending Broadcast");
var dt = DateTime.Now.ToString("g");
var bytes = Encoding.UTF8.GetBytes("SendingBroadcast");
characteristic.WhenReadReceived().Subscribe(x =>
var write = "HELLO";
// you must set a reply value
x.Value = Encoding.UTF8.GetBytes(write);
x.Status = GattStatus.Success; // you can optionally set a status, but it defaults to Success
characteristic.WhenWriteReceived().Subscribe(x =>
var write = Encoding.UTF8.GetString(x.Value, 0, x.Value.Length);
Debug.WriteLine("in WhenWriteReceived() value: " + write);
// do something value
await server.Start(new AdvertisementData
LocalName = "DariusServer",
ServiceUuids = new List<Guid>() { serverServiceGuid }
I am using this app to scan for my advertisement data:
I can't discover my app with it, and I don't know what I'm doing wrong. I am testing with a real device, an SM-T350 tablet.
I spent countless hours trying to get this plugin to work, with no luck. However, this native code works, for anyone else who has the same problem:
private async Task AndroidBluetooth()
await Task.Delay(5000); // just to make sure bluetooth is ready to go, this probably isn't needed, but good for peace of mind during testing
BluetoothLeAdvertiser advertiser = BluetoothAdapter.DefaultAdapter.BluetoothLeAdvertiser;
var advertiseBuilder = new AdvertiseSettings.Builder();
var parameters = advertiseBuilder.SetConnectable(true)
AdvertiseData data = (new AdvertiseData.Builder()).AddServiceUuid(new ParcelUuid(Java.Util.UUID.FromString("your UUID here"))).Build();
MyAdvertiseCallback callback = new MyAdvertiseCallback();
advertiser.StartAdvertising(parameters, data, callback);
catch(Exception e)
public class MyAdvertiseCallback : AdvertiseCallback
public override void OnStartFailure([GeneratedEnum] AdvertiseFailure errorCode)
// put a break point here, in case something goes wrong, you can see why
public override void OnStartSuccess(AdvertiseSettings settingsInEffect)
Just to note, it wouldn't work if I included the device name, because the advertisement would be too large in that case with a service UUID (max 31 bytes, I believe).
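The size limit mentioned above can be checked with a little arithmetic. A legacy BLE advertising payload holds at most 31 bytes, and each field costs a length byte and a type byte on top of its data. A minimal sketch follows. `advertisingPayloadSize` is a hypothetical helper, and it assumes a 3-byte flags field, 128-bit service UUIDs, and a local name sent in full.

```typescript
const MAX_LEGACY_ADV_BYTES = 31;

// UTF-8 byte length of a string (assumes well-formed surrogate pairs).
function utf8Length(text: string): number {
  let bytes = 0;
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (code < 0x80) {
      bytes += 1;
    } else if (code < 0x800) {
      bytes += 2;
    } else if (code >= 0xd800 && code <= 0xdbff) {
      bytes += 4; // surrogate pair encodes one 4-byte code point
      i++;
    } else {
      bytes += 3;
    }
  }
  return bytes;
}

interface AdvertisementLayout {
  includeFlags: boolean;
  uuid128Count: number;
  localName?: string;
}

// Each field = 1 length byte + 1 type byte + data.
function advertisingPayloadSize(layout: AdvertisementLayout): number {
  let size = layout.includeFlags ? 3 : 0;
  if (layout.uuid128Count > 0) {
    size += 2 + 16 * layout.uuid128Count;
  }
  if (layout.localName !== undefined) {
    size += 2 + utf8Length(layout.localName);
  }
  return size;
}
```

Under these assumptions, one 128-bit UUID plus the name "DariusServer" needs 35 bytes, which exceeds the limit, while the UUID alone needs 21 bytes and fits.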

BufferedSoundStream can't play wav files

I'm making an app for my drum classes, and to make it cross-platform I've chosen Urho.Sharp, because it has a low-level sound API as well as rich graphics capabilities.
As a first step I'm making a metronome app. For that I'm working with BufferedSoundStream, adding audio to it and then the needed silence, as described here:
But the result is not a recognizable sound at all; it is as if random bits got into the buffered stream.
This is my code:
/// this code initializes the sound subsystem
void CreateSound()
// Sound source needs a node so that it is considered enabled
node = new Node();
SoundSource source = node.CreateComponent<SoundSource>();
soundStream = new BufferedSoundStream();
// Set format: 44100 Hz, sixteen bit, stereo
soundStream.SetFormat(44100, true, true);
// Start playback. We don't have data in the stream yet, but the SoundSource will wait until there is data,
// as the stream is by default in the "don't stop at end" mode
/// this code preloads all sound resources
readonly Dictionary<PointSoundType, string> SoundsMapping = new Dictionary<PointSoundType, string>
{PointSoundType.beat, "wav/beat.wav"},
{PointSoundType.click, "wav/click.wav"},
{PointSoundType.click_accent, "wav/click_accent.wav"},
{PointSoundType.crash, "wav/crash.wav"},
{PointSoundType.foot_hh, "wav/foot_hh.wav"},
{PointSoundType.hh, "wav/hh.wav"},
{PointSoundType.open_hh, "wav/open_hh.wav"},
{PointSoundType.ride, "wav/ride.wav"},
{PointSoundType.snare, "wav/snare.wav"},
{PointSoundType.tom_1, "wav/tom_1.wav"},
{PointSoundType.tom_2, "wav/tom_2.wav"},
Dictionary<PointSoundType, Sound> SoundCache = new Dictionary<PointSoundType, Sound>();
private void LoadSoundResources()
// preload all sounds
foreach (var s in SoundsMapping)
SoundCache[s.Key] = ResourceCache.GetSound(s.Value);
Debug.WriteLine("resource loaded: " + s.Value + ", length = " + SoundCache[s.Key].Length);
/// this code fills up the stream with audio
private void UpdateSound()
// Try to keep 1/10 seconds of sound in the buffer, to avoid both dropouts and unnecessary latency
//float targetLength = 1.0f / 10.0f;
// temporary increase buffer to 1s
float targetLength = 1.0f;
float requiredLength = targetLength - soundStream.BufferLength;
if (requiredLength < 0.0f)
uint numSamples = (uint)(soundStream.Frequency * requiredLength);
// check if stream is still full
if (numSamples == 0)
var silencePause = new short[44100];
// iterate and play all sounds
SoundCache.All(s =>
soundStream.AddData(s.Value.Handle, s.Value.DataSize);
// add silence
soundStream.AddData(silencePause, 0, silencePause.Length);
return true;
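For the format set above (44100 Hz, 16-bit, stereo), the amount of silence is easy to get wrong, because every frame holds one sample per channel. A minimal sketch of the arithmetic follows. `pcm16SampleCount` and `pcm16ByteCount` are hypothetical helpers, and whether AddData counts its length in samples or in bytes is not confirmed here.

```typescript
// Number of 16-bit samples needed for `seconds` of audio.
function pcm16SampleCount(seconds: number, frequencyHz: number, stereo: boolean): number {
  const channels = stereo ? 2 : 1;
  const frames = Math.round(seconds * frequencyHz);
  return frames * channels;
}

// Same duration expressed in bytes (2 bytes per 16-bit sample).
function pcm16ByteCount(seconds: number, frequencyHz: number, stereo: boolean): number {
  return pcm16SampleCount(seconds, frequencyHz, stereo) * 2;
}
```

By this arithmetic, an array of 44100 shorts holds half a second of stereo silence at 44100 Hz, not a full second.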
Make sure your wav files are in the resource cache. Then play the Urho.Audio.Sound directly instead of the BufferedSoundStream. This is just a different overload of the same method, Urho.Audio.SoundSource.Play(), but it works.
int PlaySound(string sSound)
var cache = Application.Current.ResourceCache;
Urho.Audio.Sound sound = cache.GetSound(sSound);
if (sound != null)
Node soundNode = scene.CreateChild("Sound");
Urho.Audio.SoundSource soundSource = soundNode.CreateComponent<Urho.Audio.SoundSource>();
soundSource.Gain = 0.99f;
return 1;
return 0;
Since you're using urhosamples, you can start each drum sample from the OnUpdate override, something like this:
public float fRun = 0.0f;
public int iRet = 0; // keep counting the played sounds
public override void OnUpdate(float timeStep)
fRun = fRun + timeStep;
int iMS = (int)(10f * fRun); // tenth of seconds
if (iMS == 100) iRet = iRet + PlaySound("wav/hh.wav");
if (iMS == 120) iRet = iRet + PlaySound("wav/hh.wav");
if (iMS == 140) iRet = iRet + PlaySound("wav/hh.wav");
if (iMS == 160) iRet = iRet + PlaySound("wav/open_hh.wav");
if (iMS >= 160) fRun = 0.8f;

Can I get timecode from directshow video?

I'm trying to make a timecode counter for a video player based on GMFBridge and DirectShow.
I'm using a Timer to call GetCurrentPosition() every 200ms but I believe it's not accurate. I'd like at least to get the frame number (from start) of the current frame when a video is running.
Can this actually be done?
I'm using DirectShowLib .NET library.
To my knowledge this is hard to achieve. In a solution I work on, I did the following to get the 'frame number':
public int NumberOfFrames
return (int)(Duration / AverageTimePerFrame);
public double AverageTimePerFrame
return videoInfoHeader.AvgTimePerFrame / 10000000.0;
public int GetCurrentFrame(double currentTime)
int noOfFrames = (int)(Duration / AverageTimePerFrame);
return Convert.ToInt32(Math.Min(noOfFrames - 1, Math.Floor(currentTime / AverageTimePerFrame)));
I got the videoInfoHeader by doing:
// Get the media type from the SampleGrabber
AMMediaType media = new AMMediaType();
hr = sampGrabber.GetConnectedMediaType(media);
if ((media.formatType != FormatType.VideoInfo) || (media.formatPtr == IntPtr.Zero))
throw new NotSupportedException("Unknown Grabber Media Format");
// Grab the size info
videoInfoHeader = (VideoInfoHeader)Marshal.PtrToStructure(media.formatPtr, typeof(VideoInfoHeader));
However this is obviously tailored to my own use-case, hopefully it helps you a bit though. Good luck!
Added CurrentTime code (the locker is for my own usage; you can most likely remove it):
public double CurrentTime
lock (locker)
IMediaPosition mediaPos = fFilterGraph as IMediaPosition;
int hr;
if (value >= 0 && value <= Duration)
hr = mediaPos.put_CurrentPosition(value);
lock (locker)
IMediaPosition mediaPos = fFilterGraph as IMediaPosition;
int hr;
double currentTime;
hr = mediaPos.get_CurrentPosition(out currentTime);
return currentTime;
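To turn the position into a timecode display, the frame index can be formatted as hours, minutes, seconds and frames. A minimal sketch follows. `frameIndexAt` and `formatTimecode` are hypothetical helpers, and they assume a constant, integer frame rate and non-drop-frame timecode. The position is converted to 100 ns ticks first, the unit AvgTimePerFrame is divided out of above, so the division works on whole numbers and avoids floating-point drift.

```typescript
const TICKS_PER_SECOND = 10000000; // 100 ns units

// Zero-based frame index for a position given in seconds.
function frameIndexAt(seconds: number, avgTimePerFrameTicks: number, totalFrames: number): number {
  const ticks = Math.round(seconds * TICKS_PER_SECOND);
  const index = Math.floor(ticks / avgTimePerFrameTicks);
  return Math.min(Math.max(index, 0), totalFrames - 1);
}

// Formats a frame index as HH:MM:SS:FF (non-drop-frame).
function formatTimecode(frameIndex: number, framesPerSecond: number): string {
  const pad = (n: number): string => (n < 10 ? '0' + n : String(n));
  const frames = frameIndex % framesPerSecond;
  const totalSeconds = Math.floor(frameIndex / framesPerSecond);
  const secs = totalSeconds % 60;
  const minutes = Math.floor(totalSeconds / 60) % 60;
  const hours = Math.floor(totalSeconds / 3600);
  return `${pad(hours)}:${pad(minutes)}:${pad(secs)}:${pad(frames)}`;
}
```

For a 25 fps video, AvgTimePerFrame is 400000 ticks, so a position of 1.02 seconds maps to frame 25, displayed as 00:00:01:00.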
