What is the best setup for livestreaming from a DJI drone to a local server? - dji-sdk

I need to stream live video data from DJI drones to a local server using the Android Mobile SDK. The video data will then be processed on the server by other software (human recognition). What is the recommended approach to achieve the best streaming quality? We use a Mavic 2 for development and a Matrice 210 V2 RTK for production, so the answer should primarily focus on the possibilities of the Matrice 210 V2 RTK.
At the moment we use DJI's LiveStreamManager like this:
DJISDKManager.getInstance().getLiveStreamManager().setLiveUrl( rtmpURL );
DJISDKManager.getInstance().getLiveStreamManager().startStream();
We use an nginx web server with the RTMP module as the streaming endpoint, which pushes the stream to the network and records it to disk at the same time. This works, but the quality of these streams is quite bad: the stream lags quite often and pixelates on faster drone movements.
Since I am not familiar with video processing, I need recommendations on how to solve this requirement. My questions are:
- Do I need to use the ffmpeg library to preprocess the stream in some way?
- Is there a better solution than nginx + RTMP module for the backend server?
- Would it be possible to utilize the outputs of the Matrice remote controller to deliver higher quality?

- Do I need to use the ffmpeg library to preprocess the stream in some way?
If you want to handle the stream yourself, you need to use ffmpeg to decode the raw livestream into actual picture frames. See my answer to question 3.
- Is there a better solution than nginx + RTMP module for the backend server?
This highly depends on the use case. If you are not satisfied with the video quality, it'd be more reliable to decode the stream yourself with ffmpeg. I actually don't know how much nginx + RTMP does automatically, since I've never worked with it.
- Would it be possible to utilize the outputs of the Matrice remote controller to deliver higher quality?
The raw livestream that can be accessed with the SDK is a raw H.264 or H.265 stream (NAL units), depending on the drone model used.
In my application I register the videoDataCallback (a VideoFeeder.VideoDataListener) in the onComponentChange method with the camera ComponentKey and broadcast each received buffer via LocalBroadcastManager.
The livestream can then be accessed anywhere in the app by simply subscribing to the broadcast.
private VideoFeeder.VideoDataListener videoDataCallback = (videoBuffer, size) -> {
    Intent intent = new Intent("videoStream");
    intent.putExtra("content", videoBuffer);
    intent.putExtra("size", size);
    LocalBroadcastManager.getInstance(this).sendBroadcast(intent);
};
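For reference, here is a rough sketch of how such a listener can be attached, assuming the primary video feed of dji.sdk.camera.VideoFeeder from a recent 4.x Mobile SDK release (addVideoDataListener); where exactly you call this (e.g. once onComponentChange reports the camera as connected) depends on your app structure:
// Sketch: attach the videoDataCallback above to the primary video feed once
// the product/camera is connected. Assumes the VideoFeeder API of MSDK 4.x.
private void registerVideoDataListener() {
    VideoFeeder.VideoFeed primaryFeed = VideoFeeder.getInstance().getPrimaryVideoFeed();
    if (primaryFeed != null) {
        primaryFeed.addVideoDataListener(videoDataCallback);
    }
}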
A simple way to send the livestream over the network is a TCP socket: start a thread and open a Socket with an IP address and port:
this.s = new Socket(this.ipAddress, this.port);
this.s.setKeepAlive(true);
this.s.setTcpNoDelay(true);
this.dos = new DataOutputStream(this.s.getOutputStream());
Put a loop around this so it reconnects if the connection or sending fails, for example:
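A minimal sketch of such a loop (the run() method of the sender thread); the back-off intervals and the convention that sendNAL() sets dos to null when a write fails are illustrative additions, not part of the original code:
@Override
public void run() {
    while (running) {                  // 'running' is a volatile boolean field
        try {
            this.s = new Socket(this.ipAddress, this.port);
            this.s.setKeepAlive(true);
            this.s.setTcpNoDelay(true);
            this.dos = new DataOutputStream(this.s.getOutputStream());
            while (running && this.dos != null) {
                Thread.sleep(500);     // idle while sendNAL() writes on 'dos'
            }
            this.s.close();
        } catch (IOException | InterruptedException e) {
            this.dos = null;           // force a fresh connection on the next pass
        }
        try {
            Thread.sleep(1000);        // back off briefly before reconnecting
        } catch (InterruptedException ignored) { }
    }
}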
In the BroadcastReceiver, accumulate the bytes by checking the buffer size (bos is a simple ByteArrayOutputStream); a full 2032-byte buffer means more data for the current NAL unit follows, while a smaller buffer marks its end:
private BroadcastReceiver frameReceiverNAL = new BroadcastReceiver() {
    @Override
    public void onReceive(Context context, Intent intent) {
        try {
            int frameSize = intent.getIntExtra("size", 0);
            if (frameSize == 2032) {
                // full buffer: keep accumulating this NAL unit
                bos.write(intent.getByteArrayExtra("content"), 0, frameSize);
            } else {
                // smaller buffer: last part of the NAL unit, flush and send it
                bos.write(intent.getByteArrayExtra("content"), 0, frameSize);
                final byte[] sendUnit = bos.toByteArray();
                bos.reset();
                if (numAsyncTasks < MAX_NUMBER_ASYNCTASKS) {
                    numAsyncTasks++;
                    AsyncTask.execute(() -> sendNAL(sendUnit));
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
};
private void sendNAL(byte[] sendUnit) {
    try {
        if (dos != null) {
            dos.write(sendUnit);
            dos.flush();
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
    numAsyncTasks--;
}
MAX_NUMBER_ASYNCTASKS is an upper cap that prevents the app from crashing if it cannot send the NAL units fast enough (this resulted in 500+ unfinished AsyncTasks before crashing). In my case I set the number to 128.
On the server side you need to create a ServerSocket to receive the byte stream. Depending on the programming language, you can either use ffmpeg directly or use a native interface, e.g. bytedeco, to create the frames from the stream.
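As a rough illustration (not from the original answer), a minimal Java receiver could accept the TCP connection and pipe the raw H.264 bytes straight into an external ffmpeg process that writes individual JPEG frames; the port, output pattern and ffmpeg flags are assumptions and will need tuning (e.g. -f hevc for an H.265 feed):
import java.io.InputStream;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical receiver: forwards the raw NAL byte stream from the app into
// ffmpeg's stdin; ffmpeg decodes it and writes one JPEG per frame.
public class NalReceiver {
    public static void main(String[] args) throws Exception {
        try (ServerSocket server = new ServerSocket(30000);   // port is arbitrary
             Socket client = server.accept()) {

            Process ffmpeg = new ProcessBuilder(
                    "ffmpeg", "-f", "h264", "-i", "pipe:0", "frames/frame_%06d.jpg")
                    .redirectErrorStream(true)
                    .redirectOutput(ProcessBuilder.Redirect.INHERIT)
                    .start();

            try (InputStream in = client.getInputStream();
                 OutputStream toFfmpeg = ffmpeg.getOutputStream()) {
                byte[] buf = new byte[4096];
                int n;
                while ((n = in.read(buf)) != -1) {
                    toFfmpeg.write(buf, 0, n);   // pass the bytes through unchanged
                }
            }
            ffmpeg.waitFor();
        }
    }
}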
Another way would be to decode the stream on the smartphone into YUV images (2.8 MB each) and optionally convert them to JPEG to reduce the size of a single frame (~100 KB each). The YUV or JPEG image can then be sent to the server using a TCP socket. The SDK provides a method for creating YUV images in the class DJICodecManager, and the tutorials contain an example for converting them to JPEG.
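For the JPEG route, a conversion along these lines works with the standard Android android.graphics.YuvImage class, assuming the frame arrives in NV21 layout (the layout actually delivered by DJICodecManager may differ, so verify it first); the quality value of 80 is an arbitrary choice:
// Illustrative YUV-to-JPEG conversion using android.graphics.YuvImage.
// Assumes NV21 data; width/height must match the decoded frame.
public static byte[] yuvToJpeg(byte[] nv21, int width, int height) {
    YuvImage image = new YuvImage(nv21, ImageFormat.NV21, width, height, null);
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    image.compressToJpeg(new Rect(0, 0, width, height), 80, out);
    return out.toByteArray();
}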

Related

Why is Chromecast unable to stream this HLS video? "Neither ID3 nor ADTS header was found" / Error NETWORK/315

I'm trying to stream some URLs to my Chromecast through a sender app. They're HLS/m3u8 URLs.
Here's one such example URL: https://qa-apache-php7.dev.kaltura.com/p/1091/sp/109100/playManifest/entryId/0_wifqaipd/protocol/https/format/applehttp/flavorIds/0_h65mfj7f,0_3flmvnwc,0_m131krws,0_5407xm9j/a.m3u8
However they never seem to load on the Chromecast, despite other HLS/m3u8 URLs working (example of an HLS stream that does work).
It's not related to CORS as they indeed have the proper CORS headers.
I notice they have separate audio groups in the root HLS manifest file.
When I hook it up to a custom receiver app, I get the following logs:
The relevant bits are (I think): Neither ID3 nor ADTS header was found at 0 and cast.player.api.ErrorCode.NETWORK/315 (which I believe is a consequence of the first).
These are perfectly valid, working HLS URLs. They play back perfectly in Safari on iOS and desktop, as well as in VLC.
Is there something I need to be doing (either in my sender app or my receiver app) to enable something like the audio tracks? The docs seem to indicate something about that.
I also found this Google issue where a person had a similar issue, but solved it somehow that I can't understand. https://issuetracker.google.com/u/1/issues/112277373
How do I play back this URL on Chromecast properly? Do I need to do something in code?
This already has a solution here, but I will add this answer in case someone looks up the exact error message/code.
The problem lies in the hlsSegmentFormat, which is initialized to TS for multiplexed segments but currently defaults to packed audio for HLS with alternate audio tracks.
The solution is to intercept the CAF LOAD request and set the correct segment format:
const context = cast.framework.CastReceiverContext.getInstance();
const playerManager = context.getPlayerManager();
// intercept the LOAD request
playerManager.setMessageInterceptor(cast.framework.messages.MessageType.LOAD, loadRequestData => {
    loadRequestData.media.hlsSegmentFormat = cast.framework.messages.HlsSegmentFormat.TS;
    return loadRequestData;
});
context.start();
Source: Google Cast issue tracker
For those who manage multiple video sources in various formats and don't want to arbitrarily force the HLS fragment format to TS, I suggest tracking the error and setting a flag that forces the format on the next retry (by default, the receiver tries 3 times before giving up).
First, have a global flag to enable the HLS segment format override:
setHlsSegmentFormat = false;
Then detect the error:
playerManager.addEventListener(cast.framework.events.EventType.ERROR,
    event => {
        if (event.detailedErrorCode == cast.framework.events.DetailedErrorCode.HLS_NETWORK_INVALID_SEGMENT) {
            // Failed parsing HLS fragments. Will retry with HLS segments format set to 'TS'
            setHlsSegmentFormat = true;
        }
    }
);
Finally, handle the flag when intercepting the playback request:
playerManager.setMediaPlaybackInfoHandler(
    (loadRequest, playbackConfig) => {
        if (setHlsSegmentFormat) {
            loadRequest.media.hlsSegmentFormat = cast.framework.messages.HlsSegmentFormat.TS;
            // clear the flag to not force the format for subsequent playback requests
            setHlsSegmentFormat = false;
        }
    }
);
The playback will quickly fail the first time and will succeed on the next attempt. The loading time is a bit longer, but the HLS segment format is only forced when required.

ThreeJS Positional Audio with WebRTC streams produces no sound

I am trying to use ThreeJS positional audio with WebRTC to build a sort of 3D room audio chat feature. I am able to get the audio streams sent across the clients; however, the positional audio does not seem to work: irrespective of where the user (camera) moves, the intensity of the audio remains the same. Some relevant code is posted below.
The getUserMedia call has the following in its promise callback:
// create the listener
listener = new THREE.AudioListener();
// add it to the camera object
camera.object3D.add( listener );
//store the local stream to be sent to other peers
localStream = stream;
Then, for each WebRTC peer connection, I attach the received stream to a mesh using PositionalAudio:
// create the sound object
sound = new THREE.PositionalAudio( listener ); // using the listener created earlier
const soundSource = sound.context.createMediaStreamSource( stream );
// set the media stream as the sound source
sound.setNodeSource( soundSource );
// assume that I have a handle to the obj where the sound should be attached
obj.object3D.add( sound );
This is done for each of the clients, and these local streams are sent via WebRTC to one another; however, there is no sound from the speakers. Thanks.

Live-Streaming webcam webm stream (using getUserMedia) by recording chunks with MediaRecorder over WEB API with WebSockets and MediaSource

I'm trying to broadcast a webcam's video to other clients in real time, but I encounter some problems when viewers start watching in the middle.
For this purpose, I get the webcam's stream using getUserMedia (and all its siblings).
Then, on a button click, I start recording the stream and send each segment/chunk/whatever you call it to the broadcaster's WebSocket backend:
var mediaRecorder = new MediaRecorder(stream);
mediaRecorder.start(1000);
mediaRecorder.ondataavailable = function (event) {
    uploadVideoSegment(event); // wrap with a blob and call socket.send(...)
};
On the server side (Web API, using Microsoft.Web.WebSockets), I receive the byte[] perfectly as-is.
Then I send the byte[] to the viewers that are currently connected to the broadcaster, read it in the socket's onmessage event using a FileReader, and append the Uint8Array to the sourceBuffer of the MediaSource that is the src of the HTML5 video element.
When the viewers get the byte[] from the beginning, specifically the first 126 bytes (which start with the EBML header, 0x1A45DFA3, and end at the beginning of the first Cluster, 0x1F43B675) followed by the whole bulk of the media, it plays fine.
The problem occurs when a new viewer joins in the middle and fetches the second chunk or later.
I've been trying to research this and get my hands a little dirty in various ways. I understand that the header is essential (http://www.slideshare.net/mganeko/media-recorder-and-webm) and that there is something concerning keyframes, but I got confused very quickly.
So far, I have tried to write my own simple WebM parser in C# (based on a Node.js project on GitHub - https://github.com/mganeko/wmls). I split the header off the first chunk, cached it, and tried to send it along with each later chunk. Of course it didn't work.
I think the MediaRecorder may be splitting a Cluster in the middle when the ondataavailable event fires (I've noticed that the start of the second chunk doesn't begin with a Cluster header).
At this point I got stuck, not knowing how to use the parser to make it work.
Then I read about using ffmpeg to convert the WebM stream so that each frame is also a keyframe - Encoding FFMPEG to MPEG-DASH – or WebM with Keyframe Clusters – for MediaSource API (in Chris Nolet's answer).
I tried to use FFMpegConverter (for .Net) using:
var conv = new FFMpegConverter();
var outputStream = new MemoryStream();
var liveMedia = conv.ConvertLiveMedia("webm", outputStream, "webm", new ConvertSettings { VideoCodec = "vp8", CustomOutputArgs = "-g 1" });
liveMedia.Start();
liveMedia.Write(vs.RawByteArr, 0, vs.RawByteArr.Length); //vs.RawByteArr is the byte[] I got from the MediaRecorder
liveMedia.Stop();
byte[] buf = new byte[outputStream.Length];
outputStream.Position = 0;
outputStream.Read(buf, 0, (int)outputStream.Length);
I'm not familiar with FFmpeg, so I'm probably not passing the parameters correctly, although that's what I saw in the answer (it's only described very briefly there).
Of course I encountered here plenty of problems:
When using WebSockets, running the FFMpegConverter simply forced the WebSocket channel to close. (I'd be glad if someone could explain why.)
I didn't give up: I rewrote everything without WebSockets, using HttpGet (for fetching the segment from the server) and HttpPost (with multipart blobs and all the rest for posting the recorded chunks), and tried to use the FFMpegConverter as mentioned above.
For the first segment it worked, but it output a byte[] with half the length of the original one (I'd be glad if someone could explain that as well), and for the other chunks it threw an exception (every time, not just once) saying the pipe has ended.
I'm getting lost.
Please help me, anybody. The main four questions are:
How can I get the chunks that follow the MediaRecorder's first chunk to play?
(Meanwhile, I just get the sourceBuffer close/end events fired, and the sourceBuffer is detached from its parent MediaSource object (causing an exception like "the sourceBuffer has been removed from its parent") because the byte[] passed to it is not valid. Maybe I'm not using the WebM parser I wrote correctly to detect the important parts in the second chunk, which, by the way, doesn't start with a Cluster - which is why I wrote that the MediaRecorder seems to be cutting the Cluster in the middle.)
Why does FFmpeg cause the WebSockets to be closed?
Am I using FFMpegConverter.ConvertLiveMedia with the correct parameters to get a new WebM segment with all the information needed to play it as a standalone chunk, without depending on the former chunks (as Chris Nolet said in his answer in the SO link above)?
Why does the FFMpegConverter throw a "the pipe has ended" exception?
Any help will be extremely highly appreciated.

Control Chromecast Audio volume using Chrome API

I need to display a proper volume level in the UI on the client (sender) side when working with Chromecast Audio. I see there are two ways of receiving (and possibly also setting) volume from the Chromecast: the Receiver and Media namespaces. In my understanding, the Receiver namespace stores the device's general volume, while the Media namespace stores the volume of the currently played track.
It seems that I can't get the media volume with a GET_STATUS request on the Media namespace before I load any tracks with a LOAD request. So how do I correctly display the volume that will be used before any media is loaded? Switching the UI from the receiver volume to the media volume after media is loaded doesn't look like a good solution and would surprise users.
I also fail to control the volume using a SET_VOLUME request on the Receiver namespace - I get no reply from the Chromecast:
Json::Value msg, response;
msg["type"] = "SET_VOLUME";
msg["requestId"] = ++request_id;
msg["volume"]["level"] = value; // float
response = send("urn:x-cast:com.google.cast.receiver", msg);
If the following lines are used instead of the last one, media volume is controlled OK:
msg["mediaSessionId"] = m_media_session_id;
response = send("urn:x-cast:com.google.cast.media", msg);
What am I doing wrong here?
In order to set the volume on the receiver, you should be using the SDK's APIs instead of sending a hand-crafted message. For example, you should use setReceiverVolumeLevel(). Also, use the receiver volume and not the stream volume.

How to retrieve HTTP headers from a stream in ffmpeg?

I'm currently making an audio streaming app on Android. I'm using the Android NDK combined with ffmpeg, and it's working pretty well so far.
Right now I would like to retrieve the SHOUTcast metadata contained in the stream's headers while streaming. Apparently ffmpeg doesn't provide a direct way to do that, but I'm pretty sure it's technically possible to retrieve HTTP headers from the stream, as we are receiving all the bytes while streaming.
Does anyone know how to retrieve HTTP headers from a stream using ffmpeg?
In case you are looking for shoutcast metadata...
Since FFmpeg 2.0 there is built-in support for it; the HTTP protocol implementation exposes the relevant AVOptions.
Implementation
Set the icy AVOption to 1 when calling avformat_open_input. This will set the Icy-MetaData HTTP header when opening the stream:
AVDictionary *options = NULL;
av_dict_set(&options, "icy", "1", 0);
AVFormatContext* container = avformat_alloc_context();
int err = avformat_open_input(&container, url, NULL, &options);
Then poll the icy_metadata_packet or icy_metadata_headers AVOption on your context to retrieve the current metadata:
char* metadata = NULL;
av_opt_get(container, "icy_metadata_packet", AV_OPT_SEARCH_CHILDREN, (uint8_t**) &metadata);
printf("icy_metadata_packet: %s\n", metadata);
av_free(metadata);
metadata = NULL;
av_opt_get(container, "icy_metadata_headers", AV_OPT_SEARCH_CHILDREN, (uint8_t**) &metadata);
printf("\nicy_metadata_headers:\n%s\n", metadata);
av_free(metadata);
Next you'll probably want to get the metadata information up to the Java layer of your Android app. I'm not familiar with the NDK, so you'll have to figure this out for yourself ;)
Example Output
icy_metadata_packet: StreamTitle='Zelda Reorchestrated - Twilight Symphony - Gerudo Desert';
icy_metadata_headers:
icy-br: 192
icy-description: Radio Hyrule
icy-genre: Remix
icy-name: Radio Hyrule
icy-pub: 1
icy-url: http://radiohyrule.com/
More Information
Find out more on the mailing list where the patch was proposed.
The options are defined on the AVClass for the HTTP and HTTPS context (code).
This involves two separate operations on the HTTP response and doesn't have much to do with android-ffmpeg.
See sections '1.1.3' and '1.1.6' here.
Assuming you are using the default implementation of HttpClient on Android, the APIs are very similar. There is a bridge package for Android that wraps the Apache HttpClient libs used in my example.
When you get the response, you do one thing to get the response headers (see links), then another thing to get the stream object from the entity, and then use JNI to pass a pointer to that stream over to ffmpeg's I/O.
