I am using a fairly common way to capture screenshots using Direct3D and encode them into an .mp4 video.
while (true)
{
    HRESULT hr = pDevice->GetFrontBufferData(0, pSurface);
    // code to create and send the sample to an IMFSinkWriter
}
It works, though it is painfully slow (only good for about 15 fps), and there is a bigger problem: I have to calculate each sample's timestamp manually, so the resulting video's length ends up incorrect and dependent on CPU speed.
Is there any way to capture a screenshot via a callback at a fixed interval (say, 30 fps), without having to use the infinite loop?
I am implementing a media streaming source in C# on Windows Phone 8 for streaming SHOUTcast URLs.
I can play the buffered stream from the URLs. Now I have to implement seeking over the buffered audio data.
I tried seeking forward and backward by one second from the GUI. Below is the code for rewinding:
if (BackgroundAudioPlayer.Instance.CanSeek)
{
    TimeSpan position = BackgroundAudioPlayer.Instance.Position;
    // Subtract from the whole position, not just its Seconds component
    BackgroundAudioPlayer.Instance.Position = position - TimeSpan.FromSeconds(1);
}
But the player stalls for a long time before it starts playing again.
I think I have to implement the following method found in the MediaStreamSource implementation:
protected override void SeekAsync(long seekToTime)
{
    // seekToTime is in 100-nanosecond ticks
    ReportSeekCompleted(seekToTime);
}
I would like to know how to implement forward and backward seeking using MediaStreamSource without the delay.
I'm writing a DirectShow source filter which is registered under CLSID_VideoInputDeviceCategory, so it is seen as a video capture device (Skype, for example, views it as another webcam).
My source filter is based on the VCam example from here, and, for now, the filter produces exactly the same output as that example (random colored pixels with one video output pin, no audio yet), all implemented in the FillBuffer() method of the one and only output pin.
Now the real scenario is a bit trickier: the filter uses a file handle to a hardware device, opened using the CreateFile() API call (opening the device is out of my control and is done by a third-party library). It should then read chunks of data from this handle (usually 256-512 byte chunks).
The device is a WinUSB device, and the third-party framework just "gives" me an opened file handle to read chunks from.
The data read by the filter is a *.mp4 file, which is streamed from the device to the "handle".
This scenario is equivalent to a source filter reading a *.mp4 file from disk (in "chunks") and pushing its data into the DirectShow graph, but without the ability to read the file entirely from start to end, so the file size is unknown (correct?).
I'm pretty new to DirectShow and I feel as though I'm missing some basic concepts. I'll be happy if anyone can direct me to solutions/resources/explanations for the following questions:
1) From various sources on the web and the Microsoft SDK (v7.1) samples, I understood that for an application (such as Skype) to build a correct and valid DirectShow graph (so it will render the video and audio successfully), the source filter's pin (which inherits from CSourceStream) should implement the "GetMediaType" method. Depending on that implementation's return value, an application will be able to build the correct graph to render the data, i.e., connect the filters in the correct order. If this is correct, how would I implement it in my case so that the graph will be built to render *.mp4 input in chunks (we can assume constant chunk sizes)?
2) I've noticed that the FillBuffer() method is supposed to call SetTime() on the IMediaSample object it gets (and fills). I'm reading raw *.mp4 data from the device. Will I have to parse the data and extract the frames and time values from the stream? If yes, an example would be great.
3) Will I have to split the data received from the file handle (the "chunks") into video and audio, or can the data be pushed to the graph without any manipulation in the source filter? If a split is needed, how can it be done (the data is not continuous and is split into chunks), and will this affect the desired implementation of "GetMediaType"?
Please feel free to correct me if I'm using incorrect terminology.
Thanks :-)
This is a good question. On the one hand this is doable, but there are some specifics involved.
First of all, a filter registered under the CLSID_VideoInputDeviceCategory category is expected to behave as a live video source. Registering it there makes it discoverable by applications (such as Skype, as you mentioned), but those applications will then attempt to configure the video resolution, and they expect video to arrive at real-time rate. Some applications (Skype among them) do not expect compressed video such as H.264 there and will simply reject such a device. Nor can you attach audio directly to this filter, because applications will not even look for audio there (I'm not sure whether your filter has audio, but you mentioned an .MP4 file, so audio might be present).
On your questions:
1 - You would get a better picture of application requirements by checking which interface methods applications call on your filter. Most of the methods are implemented by the BaseClasses, which convert the calls into internal methods such as GetMediaType. Yes, you need to implement it, and by doing so you will, among other things, enable your filter to connect to downstream filter pins by offering the specific media types you support.
Again, those cannot be MP4 chunks, even if such an approach can work in other DirectShow graphs. As a video capture device you should be delivering individual video frames, preferably uncompressed (they could be compressed too, but you would immediately run into compatibility issues with applications).
A solution you might consider is to embed a fully featured graph internally, inject your MP4 chunks into it, and let that pipeline parse and decode them and deliver the frames to a custom renderer, from which you re-expose them off your virtual device. This can be a good design, though it assumes a certain understanding of how filters work internally.
2 - Your device is typically treated as (and expected to be) a live source, which means you deliver video in real time and frames are not necessarily time-stamped. You can still put times there, and you definitely need to extract the time stamps from your original media (or have the internal graph mentioned in item 1 do it), but be prepared for applications to strip the time stamps, especially for preview purposes, since the source is "live".
3 - Getting back to audio: you cannot implement audio on the same virtual device. Well, you can, and such a filter might even work in a custom-built graph, but it is not going to work with applications. They will look for a separate audio device, and if you implement one, they will instantiate it separately. So you are expected to implement both a virtual video source and a virtual audio source, and handle the synchronization between them behind the scenes. This is where timestamps become important: by providing them correctly you will keep lip sync in the live session matching what it originally was in the media file you are streaming from.
I am recording audio using the XNA Microphone class and saving the recorded data in isolated storage in WAV format. If the length of the audio is small, my app works fine. But as it increases, the memory consumed by the app also increases, which drastically slows down the device. The following code is used to play the audio:
using (IsolatedStorageFile isoStore = IsolatedStorageFile.GetUserStoreForApplication())
{
    using (IsolatedStorageFileStream fileStream = isoStore.OpenFile(AudioFilePath, FileMode.Open))
    {
        // Use the local variable, not the FileStream type name
        sound = SoundEffect.FromStream(fileStream);
        sound.Play();
    }
}
Any suggestions on how to handle the memory issue while playing large audio files? Or how can I save the PCM in other formats (WMA, MP3) to reduce the size?
SoundEffect isn't intended for playing long pieces of audio. As the name suggests, it is intended for short pieces, and for playing lots of them, possibly at the same time.
To play longer pieces of audio you should consider MediaElement.
I'm currently reading the MSDN documentation on rendering a stream to an audio renderer, or in other words, playing back my captured microphone data:
http://msdn.microsoft.com/en-us/library/dd316756%28v=vs.85%29.aspx
That page provides an example.
My problem is that I can't really follow the project's flow.
I currently have a separate class storing the parameters below, which I obtained from the capture process.
These parameters are continuously overwritten as the program captures streaming audio data from the microphone.
BYTE *pData; // pointer to the captured frames
UINT32 bufferframecount;
DWORD flag;
WAVEFORMATEX *pwfx;
My questions are:
How does the loadData() function really work?
Is it supposed to grab the parameters I'm writing from the capture process?
How does the program send the data to the audio renderer and play it through my speakers?
The loadData() function fills in the audio buffer pointed to by pData. The example abstracts the audio source, so this could be anything from a .wav file to the microphone audio that you already captured.
So, if you are trying to build from that example, I would implement the MyAudioSource class and have it read PCM or float samples from a file whenever loadData() is called. Then, if you run the program, it should play the audio from the file through the speakers.