MP4 Fast Forward/Rewind - directshow.net

I need help with implementing fast forward and rewind. I'm using directshow in c# and have played with IMediaSeeking however the results suck! SetRate does not work at all and SetPositions is choppy and apparently has sync issues with multiple threads so it ceases to run after the first time it's called. I played with Imediaposition but could not get it to work at all. My graph is simply
FileSourceAsync -> Intel Splitter -> MainConcept Decoder -> Decklink Render
After scanning the supported interfaces the filesource and decoder do not apparently support IMediaSeeking.
Does anybody have any ideas or clues that can help me fast forward and rewind an mp4 file in a directshow graph?
Cheers.

IMediaSeeking works properly when underlying filter implement it properly. One of the filters you use (Intel's?) seems to be having issues with seeking. Perhaps you can replace it with a better alternative.

Related

Real time microphone audio manipulation windows

I would like to make an app (Target pc windows) that let you modify the micro input in real time, like introducing sound effects or even modulating your voice.
I searched over the internet and only found people telling that it would not be possible without using a virtual audio cable.
However I know some apps with similar behavior (voicemod, resonance) not using a virtual audio cable so I would like some help about how can be done (just the name of a library capable would be enough) or where to start.
Firstly, you can use professional ready-made software for that - Digital audio workstation (DAW) in combination with a huge number of plugins for that.
See 5 steps to real-time process your instrument in the DAW.
And What is (audio) direct monitoring?
If you are sure you have to write your own, you can use libraries for real-time audio processing (as far as I know, C++ is better for this than C#).
These libraries really works. They are specially designed for realtime.
https://github.com/thestk/rtaudio
http://www.portaudio.com/
See also https://en.wikipedia.org/wiki/Csound
If you don't have a professional sound interface yet, but want to minimize a latency, read about Asio4All
The linked tutorial worked for me. In it, a sound is recorded and saved to a .wav.
The key to having this stream to a speaker would be opening a SourceDataLine and outputting to that instead of writing to a wav file. So, instead of outputting on line 59 to AudioSystem.write, output to a SourceDataLine write method.
IDK if there will be a feedback issue. Probably good to output to headphones and not your speakers!
To add an effect, the AudioInputLine has to be accessed and processed in segments. In each segment the following needs to happen:
obtain the byte array from the AudioInputLine
convert the audio bytes to PCM
apply your audio effect to the PCM (if the effect is a volume change over time, this could be done by progressively altering a volume factor between 0 to 1, multiplying the factor against the PCM)
convert back to audio bytes
write to the SourceDataLine
All these steps have been covered in StackOverflow posts.
The link tutorial does some simplification in how file locations, threads, and the stopping and starting are handled. But most importantly, it shows a working, live audio line from the microphone.

Onvif playback stream cannot seek

I'm trying to obtain playback video streams from some Axis and Hikvision cameras, using Onvif.
I'm doing this in a C# application, and the resulted stream is played in VLC.
Using the FindRecordings/GetRecordingSearchResult calls and then GetReplayUri I can obtain the playback stream (RTSP/H264), but here I have this problem: this behaves like a live stream - I can only use play and pause. I cannot use the time cursor to seek, cannot play in reverse.
So I find this unusable for a playback application - you have to watch the entire recording (days or hours of recording!) in order to see a specific event in time. And once you play it, you cannot go back 1 minute to see it again.
This seems quite stupid to me, so I believe that I'm doing something wrong in my code. Maybe I'm missing some configuration in order to obtain a 'true' playback stream.
My question is: is this playback stream behavior the 'standard' one, and I cannot expect more on this? Or some of you have this working ok (seek, reverse, frame by frame stepping), so I will know it can be done.
Thank you.
Reverse playback is possible, but it is not easy. First, the reverse replay is initiated using the Scale header field with a negative value. As an example:
PLAY rtsp://192.168.0.1/path/to/recording RTSP/1.0
Cseq: 123
Session: 12345678
Require: onvif-replay
Range: clock=20090615T114900.440Z-
Rate-Control: no
Scale: -1.0
After the stream is initialized, you will get GOPs in reverse order, not just reversed frames. I don't know if VLC supports this way of operating.
Be aware that only devices with the ReversePlayback capability support reverse playback.
Please refer to the streaming specification for further details.
This is not a real solution to the problem above, but maybe it would help others to deal with this situation.
Some cameras with which I worked were continuously recording on the same video file (so the time range was not known) and they were reporting (via RTSP) the available time interval like this:
range:npt=0-
Due to this, VLC was not displaying any time interval in the time slider, so it was not
allowing for seek. In my case, it was a requirement to use VLC, so I had to find a workaround to the problem.
This was a module which was acting like a proxy, and it sit between VLC and the RTSP source (camera). So all RTSP traffic between VLC and camera was going via this module which I controlled, so I could easily change the responses from camera in a way which was ok for VLC, so I got the seek capability available in VLC.

Playing H.264 video in an application through ffmpeg using DXVA2 acceleration

I am trying to output H.264 video in a Windows application. I am moderately familiar with FFMPEG and I have been successful at getting it to play H.264 in a SDL window without a problem. Still, I would really benefit from using Hardware Acceleration (probably through DXVA2)
I am reading raw H264 video, no container, no audio ... just raw video (and no B-frames, just I and P). Also, I know that all the systems that will use this applications have Nvidia GPUs supporting at least VP3.
Given that set of assumptions I was hoping to cut some corners, make it simple instead of general, just have it working for my particular scenario.
So far I know that I need to set the hardware acceleration in the codec context by filling the hwaccel member through a call to ff_find_hwaccel. My plan is to look at Media Player Classic Home Cinema which does a pretty good job at supporting DXVA2 using FFMPEG when decoding H.264. However, the code is quite large and I am not exactly sure where to look. I can find the place where ff_find_hwaccel is called in h264.c, but I was wondering where else should I be looking at.
More specifically, I would like to know what is the minimum set of steps that I have to code to get DXVA2 through FFMPEG working?
EDIT: I am open to look at VLC or anything else if someone knows where I can find the "important" piece of code that does the trick. I just mentioned MPC-HC because I think it is the easiest to get to compile in Windows.

Encode WebCam frames with H.264 on .NET

What i want to do is the following procedure:
Get a frame from the Webcam.
Encode it with an H264 encoder.
Create a packet with that frame with my own "protocol" to send it via UDP.
Receive it and decode it...
It would be a live streaming.
Well i just need help with the Second step.
Im retrieving camera images with AForge Framework.
I dont want to write frames to files and then decode them, that would be very slow i guess.
I would like to handle encoded frames in memory and then create the packets to be sent.
I need to use an open source encoder. Already tryed with x264 following this example
How does one encode a series of images into H264 using the x264 C API?
but seems it only works on Linux, or at least thats what i thought after i saw like 50 errors when trying to compile the example with visual c++ 2010.
I have to make clear that i already did a lot of research (1 week reading) before writing this but couldnt find a (simple) way to do it.
I know there is the RTMP protocol, but the video stream will always be seen by one peroson at a(/the?) time and RTMP is more oriented to stream to many people. Also i already streamed with an adobe flash application i made but was too laggy ¬¬.
Also would like you to give me an advice about if its ok to send frames one by one or if it would be better to send more of them within each packet.
I hope that at least someone could point me on(/at?) the right direction.
My english is not good maybe blah blah apologies. :P
PS: doesnt has to be in .NET, it can be in any language as long as it works on Windows.
Many many many many thanks in advance.
You could try your approach using Microsoft's DirectShow technology. There is an opensource x264 wrapper available for download at Monogram.
If you download the filter, you need to register it with the OS using regsvr32. I would suggest doing some quick testing to find out if this approach is feasible, use the GraphEdit tool to connect your webcam to the encoder and have a look at the configuration options.
Also would like you to give me an advice about if its ok to send frames one by one or if it would be better to send more of them within each packet.
This really depends on the required latency: the more frames you package, the less header overhead, but the more latency since you have to wait for multiple frames to be encoded before you can send them. For live streaming the latency should be kept to a minimum and the typical protocols used are RTP/UDP. This implies that your maximum packet size is limited to the MTU of the network often requiring IDR frames to be fragmented and sent in multiple packets.
My advice would be to not worry about sending more frames in one packet until/unless you have a reason to. This is more often necessary with audio streaming since the header size (e.g. IP + UDP + RTP) is considered big in relation to the audio payload.

How to play multiple mp3/wma files at once?

I have the need to play multiple soundeffects at once in my WP7 app.
I currently have it working with wav files that takes around 5 megabyte, instead of 500kb when coded in wma/mp3.
Current part of the code:
Stream stream = TitleContainer.OpenStream(String.Format("/location/{0}.wav", value)
SoundEffect effect = SoundEffect.FromStream(stream);
effect.Play();
This works great in a loop, preparing all effects, and then playing them.
However, I would really like to use mp3/wma/whatever-codec to slim my xap file down.
I tried to use MediaElement, but it appears that you also can't use that to play multiple files. Also the XNA MediaPlayer can't be instantiated, and as far as I experienced can't be made to play multiple files at once.
The only solution I see left is that I somehow decode the mp3 to wav and feed that Stream to SoundEffect.
Any ideas on how to accomplish the multiple playback? Or suggestions on how to decode mp3 to wav?
On the conversion... sorry - but I don't think there's any api currently available for WMA or MP3 decoding.
Also, I don't think there are any implementations of MP3, WMA or Ogg decoders which are available in pure c# code - all of them I've seen use DirectShow or PInvoke - e.g. see C# Audio Library.
I personally do expect audio/video compression/decompression to be available at some point in the near future in the WP7 APIs - but I can't guess when!
For some simple compression you can try things like shipping mono instead of stereo files, or shipping 8 bit rather than 16 bit audio files - these are easy to convert back to 16 bit (with obvious loss of resolution) on the phone.
Using compression like zip might also help for some sound effects... but I wouldn't expect it to be hugely successful.

Resources