I would like to create an SDP media field with its attributes, and there are a few things I don't understand. I've read through the relevant RFC and I understand most of what each field means, but what I don't understand is how to derive, from a JMF Audio/Video Format, which parameters of the format make up the rtpmap registry entries I need to use. I often see fields like these:
m=audio 12548 RTP/AVP 0 8 101
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-16
a=silenceSupp:off - - - -
a=ptime:20
a=sendrecv
These are received from the PBX server I'm connecting to. What do they mean in terms of JMF audio format properties? (I do understand that these are standard audio formats commonly used in telecommunications.)
UPDATE:
I was mainly wondering about the format parameters '0 8 101' at the end of
m=audio 12548 RTP/AVP 0 8 101
I know they are referenced from this list, but how do I determine, based on the JMF media format, which ones to use?
Thanks in advance,
Adam Zehavi.
You can use any of the codecs listed in the SDP. The agent that sent the SDP is stating that it supports all of the codecs listed.
In the SDP example you've provided you could start sending RTP encoded with either G711 ULAW (PCMU) or G711 ALAW (PCMA).
I'm not sure if this is what you asked for, but:
PCMU/8000: 1-channel, 8000 Hz, µ-law encoded format
PCMA/8000: 1-channel, 8000 Hz, A-Law encoded format
telephone-event: DTMF digits, telephone tones
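To tie those rtpmap lines back to JMF: the static payload numbers (0 and 8 here) are fixed by the RTP audio/video profile (RFC 3551), while anything from 96 up (like 101) is dynamic and only means what the accompanying a=rtpmap line says. Here is a rough, untested sketch of that mapping in JMF terms; the class and method names are mine and purely illustrative:

import javax.media.format.AudioFormat;

// Untested sketch: map the static RTP/AVP payload numbers from the m= line to
// the JMF audio formats they correspond to. 0 and 8 are fixed by RFC 3551;
// 101 is dynamic and defined only by its a=rtpmap/a=fmtp lines (DTMF events).
public class PayloadTypes {

    // Returns the JMF format for a static payload type, or null for dynamic
    // or unhandled types.
    public static AudioFormat toJmfFormat(int payloadType) {
        switch (payloadType) {
            case 0:  // a=rtpmap:0 PCMU/8000 -> G.711 mu-law, 8 kHz, mono
                return new AudioFormat(AudioFormat.ULAW_RTP, 8000, 8, 1);
            case 8:  // a=rtpmap:8 PCMA/8000 -> G.711 A-law, 8 kHz, mono
                     // (JMF has no dedicated A-law RTP constant, so the plain
                     // ALAW encoding string is used here as an assumption)
                return new AudioFormat(AudioFormat.ALAW, 8000, 8, 1);
            default: // e.g. 101 telephone-event/8000 is not an audio codec
                return null;
        }
    }
}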
Well, after a long while of searching and not really understanding, I can now answer my own question.
In my eyes, the only use for SDP would have been for each side to state its media capabilities to the other. I did not realize it takes the form of a negotiation, and I didn't understand the need for such a deep negotiation about media. I thought client 1 could offer X, Y, Z, W, client 2 would respond that it can only receive X and W, and then client 1 would say "OK, I'll send you format W"...
I don't know why, but this made perfect sense to me, so I'm going to design the SDP wrapper of my application in this manner, and use JMF formats for comparison instead of dealing with the guts of SDP over and over. I'll try to design a general template that performs all these annoying text-generating methods from a JMF format array, just the way I think it should be. The only thing that surprises me is that I didn't find anything like this already made...
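Roughly, what I have in mind is something like this untested sketch (the class and method names are purely illustrative, and only the two static G.711 payload types are covered):

import javax.media.format.AudioFormat;

// Untested sketch of the "wrapper" idea: build the m=/a=rtpmap lines from a
// JMF format array instead of hand-writing SDP text everywhere.
public class SdpMediaBuilder {

    // Very small mapping: JMF encoding -> static RTP payload type + rtpmap name.
    private static String rtpmapFor(AudioFormat f) {
        String enc = f.getEncoding();
        if (AudioFormat.ULAW_RTP.equalsIgnoreCase(enc)) return "0 PCMU/8000";
        // no dedicated A-law RTP constant in JMF, plain ALAW used as an approximation
        if (AudioFormat.ALAW.equalsIgnoreCase(enc))     return "8 PCMA/8000";
        return null; // anything else would need a dynamic payload type
    }

    public static String buildAudioMedia(int port, AudioFormat[] formats) {
        StringBuilder m = new StringBuilder("m=audio " + port + " RTP/AVP");
        StringBuilder attrs = new StringBuilder();
        for (AudioFormat f : formats) {
            String rtpmap = rtpmapFor(f);
            if (rtpmap == null) continue;                  // skip unsupported formats
            String payload = rtpmap.split(" ")[0];
            m.append(' ').append(payload);                 // payload list on the m= line
            attrs.append("a=rtpmap:").append(rtpmap).append("\r\n");
        }
        return m.append("\r\n").append(attrs).toString();
    }
}

Comparing what the PBX offers against my own JMF format array then becomes a matter of matching encodings instead of parsing SDP text over and over.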
Thanks for all your help, and if anyone ever wonders about this subject again, just start by reading this RFC.
I've been struggling with the following problem and can't figure out a solution. The provided Java server application sends PCM audio data in chunks over a WebSocket connection. There are no headers, etc. My task is to play these raw chunks of audio data in the browser without any delay. In the earlier version, I used audioContext.decodeAudioData because I was getting the full array with the 44-byte header at the beginning. Now there is no header, so decodeAudioData cannot be used. I'll be very grateful for any suggestions and tips. Maybe I have to use some JS decoding library; any example or link will help me a lot.
Thanks.
1) Your requirement "play these raw chunks of audio data in the browser without any delay" is not possible. There is always some amount of time needed to send audio, receive it, and play it. Read about the term "latency." First you must settle on a realistic requirement. It might be 1 second or 50 milliseconds, but you need something realistic.
2) WebSockets use TCP. TCP is designed for reliable communication, congestion control, etc. It is not designed for fast, low-latency communication.
3) Give more information about your problem. Are your client and server communicating over the Internet or over a local LAN? This will hugely affect your performance and design.
4) The 44-byte header was a WAV file header. It describes the type of data (sample rate, mono/stereo, bits per sample). You must know this information to be able to play the audio. If you know the PCM type, you could insert the header yourself and use your decoder as you did before (see the sketch after this answer). Otherwise, you need to construct an audio player manually.
Streaming audio over networks is not a trivial task.
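For the "insert it yourself" route from point 4, here is a minimal sketch, in the same Java as the server side, of prepending a 44-byte RIFF/WAVE header to a raw PCM chunk. The class name and parameters are made up for the example, and it only works if the sample rate, channel count, and bits per sample you pass in match what the server actually produces:

import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Sketch only: builds a 44-byte RIFF/WAVE header for a block of raw PCM so the
// browser can keep using decodeAudioData. Sample rate / channels / bits must
// match what the Java server is actually sending.
public class WavHeader {

    public static byte[] withHeader(byte[] pcm, int sampleRate, int channels, int bitsPerSample) {
        int byteRate = sampleRate * channels * bitsPerSample / 8;
        int blockAlign = channels * bitsPerSample / 8;

        ByteBuffer b = ByteBuffer.allocate(44 + pcm.length).order(ByteOrder.LITTLE_ENDIAN);
        b.put("RIFF".getBytes());                 // chunk id
        b.putInt(36 + pcm.length);                // chunk size = rest of the file
        b.put("WAVE".getBytes());
        b.put("fmt ".getBytes());                 // fmt sub-chunk
        b.putInt(16);                             // fmt sub-chunk size for PCM
        b.putShort((short) 1);                    // audio format 1 = uncompressed PCM
        b.putShort((short) channels);
        b.putInt(sampleRate);
        b.putInt(byteRate);
        b.putShort((short) blockAlign);
        b.putShort((short) bitsPerSample);
        b.put("data".getBytes());                 // data sub-chunk
        b.putInt(pcm.length);
        b.put(pcm);                               // the raw samples themselves
        return b.array();
    }
}

With the header in front, each chunk is a tiny self-contained WAV, so decodeAudioData can be used on it again.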
As the picture shows, Chromecast payload type 127 RTP packets coming from a Nexus 5 contain lots of "zzzz..." compared to those coming from Ubuntu.
It looks like this "zzzzz..." is not meaningful for screen mirroring, so I'm pretty confused about why there is so much redundant content in the packets.
For days I have been trying to stream an mp4 file with ffserver.
I read many questions like these:
https://superuser.com/questions/563591/streaming-mp4-with-ffmpeg
Begin stream simple mp4 with ffserver
http://ffmpeg.gusari.org/viewtopic.php?f=12&t=1190
http://ffmpeg.org/pipermail/ffserver-user/2012-July/000204.html
HTML5 - How to stream large .mp4 files?
In the end I can't figure out: is mp4 streamable or not?
Is there a way to do this with ffserver?
Is there any sample? I've read the help pages, but they are mostly about live streaming; I just want to stream a simple mp4 file.
Yes.
Streaming an mp4 file is very much possible with ffserver. However, it might require some reading of the documentation:
https://ffmpeg.org/ffmpeg.html
https://ffmpeg.org/ffserver.html
The crucial part is writing the configuration file for ffserver (ffserver.conf). As far as I know, ffmpeg provides a list of sample configurations.
They might be a bit outdated, but if you try to run them, ffserver will tell you if something isn't as it should be :)
Edit:
(Since I only have a rep of 1, I can't post more than 2 links, so I removed the sample links and show a rather simple example below.)
To stream an mp4 file, you should consider that ffserver might have problems streaming in the mp4 format. You can still stream an mp4 file, but in a different format.
A very simple way would be like this:
<Stream streamTest.asf> #ASF as the streaming Format
File "/tmp/video1.mp4" #or wherever you store your Videos
</Stream>
The server converts the file on its own, but if you really want to stream in mp4 you may have to take a closer look at "fragmented mp4".
To watch the stream, use a player that can handle ASF (I used VLC) and open the URL:
ip-address:port/streamTest.asf
Summary:
I should say that I am also still learning the ways of ffserver, so there might be some mistakes :)
This is a short summary of the chapters from the ffserver-documentation to get started.
5.2 Global options
The options in this chapter specify your server settings: for example, how many simultaneous requests should be handled, on what port you want to stream, etc. For people who are completely new to ffserver, most of the default values should be sufficient.
5.3 Feed section
The feed section is one of the core parts of ffserver. Since a feed can serve multiple streams, it might be useful to build that first.
Note: a feed is only necessary if you want to a) live stream, b) stream files that are not stored on your server, or c) process the file before streaming.
5.4 Stream section
Here you can actually build your own stream. There are a lot of variables that can be changed, and I recommend starting slowly with adding/customizing options.
From this point on, the documentation does a decent job. So now you know what you need (again, I feel like the possibilities are countless, but I'm still a beginner^^) and where to find the basics.
The structure of your ffserver.conf might (but doesn't have to) look like this:
#Options from 5.2
HTTPPort 8090
#...
#Feed (Options from 5.3)
<Feed feed1.ffm>
#...
</Feed>
#
#Stream (Options from 5.4)
<Stream stream1.asf>
Feed feed1.ffm
Format asf
NoAudio
#...
</Stream>
Since this is my first post, I hope it is not too chaotic :)
ffserver.conf:
HTTPPort 8090
HTTPBindAddress 0.0.0.0
RTSPPort 8091
MaxHTTPConnections 2000
MaxClients 1000
MaxBandwidth 1000
CustomLog -
<Stream 1.mp4>
File "/path/1.mp4"
Format rtp
</Stream>
Start:
ffserver -f ffserver.conf
Play:
ffplay rtsp://localhost:8091/1.mp4
I have an application that sends raw H.264 NALUs generated by encoding on the fly with x264's x264_encoder_encode. I receive them over plain TCP, so I am not missing any frames.
I need to be able to decode such a stream in the client using hardware acceleration on Windows (DXVA2). I have been struggling to find a way to get this to work with FFmpeg. Perhaps it would be easier to try Media Foundation or DirectShow, but they won't accept raw H.264.
I either need to:
Change the code of the server application to give back an MP4 stream. I am not that experienced with x264. I was able to get raw H.264 by calling x264_encoder_encode, following the answer to this question: How does one encode a series of images into H264 using the x264 C API? How can I go from this to something wrapped in MP4 while still being able to stream it in real time?
At the receiver, I could wrap it with MP4 headers and feed it into something that can play it using DXVA. I wouldn't know how to do this.
I could find another way to accelerate it using DXVA with FFmpeg, or something else that accepts the raw format.
An important restriction is that I need to be able to pre-process each decoded frame before displaying it. Any solution that does decoding and displaying in a single step would not work for me.
I would be fine with any of these solutions.
I believe you should be able to use H.264 packets off the wire with Media Foundation. There's an example on page 298 of this book http://www.docstoc.com/docs/109589628/Developing-Microsoft-Media-Foundation-Applications# that uses an HTTP stream with Media Foundation.
I'm only learning Media Foundation myself and am trying to do something similar to you; in my case I want to use H.264 payloads from RTP packets, and from my understanding that will require a custom IMFSourceReader. Accessing the decoded frames should also be possible from what I've read, since there seems to be complete flexibility in chaining components together into topologies.
What I want to do is the following procedure:
Get a frame from the Webcam.
Encode it with an H264 encoder.
Create a packet with that frame with my own "protocol" to send it via UDP.
Receive it and decode it...
It would be a live stream.
Well, I just need help with the second step.
I'm retrieving camera images with the AForge framework.
I don't want to write frames to files and then decode them; I guess that would be very slow.
I would like to handle the encoded frames in memory and then create the packets to be sent.
I need to use an open-source encoder. I already tried x264, following this example:
How does one encode a series of images into H264 using the x264 C API?
but it seems it only works on Linux, or at least that's what I thought after I saw around 50 errors when trying to compile the example with Visual C++ 2010.
I have to make clear that I already did a lot of research (a week of reading) before writing this, but I couldn't find a (simple) way to do it.
I know there is the RTMP protocol, but the video stream will only ever be watched by one person at a time, and RTMP is more oriented toward streaming to many people. I also already streamed with an Adobe Flash application I made, but it was too laggy.
I would also like your advice on whether it's OK to send frames one by one or whether it would be better to send several of them in each packet.
I hope that at least someone can point me in the right direction.
My English is not good; apologies. :P
PS: It doesn't have to be in .NET; it can be in any language as long as it works on Windows.
Many many many many thanks in advance.
You could try your approach using Microsoft's DirectShow technology. There is an open-source x264 wrapper available for download at Monogram.
If you download the filter, you need to register it with the OS using regsvr32. I would suggest doing some quick testing to find out whether this approach is feasible: use the GraphEdit tool to connect your webcam to the encoder and have a look at the configuration options.
I would also like your advice on whether it's OK to send frames one by one or whether it would be better to send several of them in each packet.
This really depends on the required latency: the more frames you package, the less header overhead, but the more latency, since you have to wait for multiple frames to be encoded before you can send them. For live streaming, latency should be kept to a minimum, and the typical protocols used are RTP over UDP. This implies that your maximum packet size is limited to the MTU of the network, often requiring IDR frames to be fragmented and sent in multiple packets.
My advice would be to not worry about sending more frames in one packet until/unless you have a reason to. This is more often necessary with audio streaming since the header size (e.g. IP + UDP + RTP) is considered big in relation to the audio payload.
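If you do roll your own protocol over UDP as described in the question, one naive way to stay under the MTU is to split each encoded frame into numbered fragments with a tiny custom header. The sketch below is only an illustration of that idea; the class, the header layout, and the 1400-byte payload limit are all made-up assumptions, and this is not how RTP itself fragments H.264:

import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Naive fragmentation sketch for a home-grown protocol: each UDP packet gets a
// tiny header (frame number, fragment index, fragment count) followed by a
// slice of the encoded frame, so large IDR frames stay under the MTU.
public class FramePacketizer {

    private static final int MTU_PAYLOAD = 1400;   // conservative payload size per packet
    private static final int HEADER_SIZE = 12;     // 3 ints: frameNo, fragIndex, fragCount

    public static List<byte[]> packetize(int frameNo, byte[] encodedFrame) {
        int chunk = MTU_PAYLOAD - HEADER_SIZE;
        int fragCount = (encodedFrame.length + chunk - 1) / chunk;
        List<byte[]> packets = new ArrayList<>(fragCount);

        for (int i = 0; i < fragCount; i++) {
            int offset = i * chunk;
            int len = Math.min(chunk, encodedFrame.length - offset);
            ByteBuffer p = ByteBuffer.allocate(HEADER_SIZE + len);
            p.putInt(frameNo).putInt(i).putInt(fragCount);  // minimal custom header
            p.put(encodedFrame, offset, len);               // slice of the encoded frame
            packets.add(p.array());
        }
        return packets;                                     // send each via DatagramSocket
    }
}

The receiver would then collect fragments with the same frame number, reassemble them (dropping the frame if any fragment is lost), and hand the bytes to the decoder.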