mix 2 real webcam into a fake webcam - windows

i need to get the streaming from 2 webcams on the same computer, and mix it as a fake webcam (so then i can use the fake webcam on any software).
I have seen that camcamx is for mac, webcamstudio is for linux, but i need a solution for windows and i can't find it, so i was thinking to write my own small app.
I can program with C#, Java and lazarus, but examples or library or whatever in any language will help anyway.
i will need to make a fake webcam that can be used as a webcam (detected on my computer as a usb webcam), and some code to grasp the stream from two real webcam and mix everything together (there will be like a primary webcam that will be bigger and a secondary webcam that will be smaller, on a corner of the big image)
Anyone can help me on that?

This is not a trivial exercise but it can be done. I know because I've done it before. :)
I implemented this in C++.
What you need to do is to create what's known as a shared memory server. A shared memory server is a region of ram that more than one process can access. Here's how to create one using Named Shared Memory under Windows:
http://msdn.microsoft.com/en-us/library/windows/desktop/aa366551(v=vs.85).aspx
In your application that mixes the video from the two cameras, you need to create a DirectShow rendering filter (CBaseRenderer) that writes the mixed video frame into this shared memory.
On the other end, you need to create a separate Visual Studio DLL project that will implement a DirectShow capture filter (CSource and CSourceStream) that will read the video bitmaps your main application writes into this buffer. This VS project needs to be a registerable DLL that can be called to register it as a DirectShow capture device for windows.
Your main application will create and maintain this shared memory buffer when it is operating. If another application (like a video conferencing program) accesses the capture device, all that will come from the device will be a blank buffer until you main application stars feeding real video frames into it.
Tip #1: Since this is a multi-threaded operation, you will need an event handle to signal the capture filter that a frame is ready. You will also need a mutex to control access to the buffer by the "rendering" thread in your application and the "capture" thread in the capture device.
Tip #2: You won't need to call UnmapViewOfFile or CloseHandle on the memory pointers until the rendering or capture filters are disposed.
There is a lot of code you will need to grind out, so any useful examples will be beyond the scope of this discussion. This should get you going in the right direction. Good luck!

I think your question is too far out of scope for what this site is all about. You're talking about thousands and thousands of lines of code and intimate knowledge of drivers, video decoding, mixing, etc., etc. if you're going to write this software on your own.
With that said, there probably is software for this for Windows. I'd start here:
http://alternativeto.net/software/webcamstudio/

Capture video from real webcam: Video Capture on MSDN
Fake webcam: the well known starting point is Vivek's sample/project available at http://tmhare.mvps.org/downloads.htm, see also this post "Fake" DirectShow video capture device
Getting all together is doable, though not trivial.

Related

How to programatically create custom input/output devices?

How do i create a custom media input/output device like a speaker or microphone that i can select from a program like Skype. For example i could make a GreyScale webcam that reads the webcam and makes it greyscale or a custom Beep Speaker that takes anything a program sends to the speaker and adds a beep after 3 seconds etc. An example would be this:
http://www.videohelp.com/tools/UScreenCapture
I just need help on how to create the actual (virtual?) device, not how to make it greyscale etc. I can figure that out later.
Where do i even begin to search for tutorials/readings on this? As per the tags, i prefer qt/c++ related but it doesn't necessarily have to be that. Just a nudge in the right direction to get me started would be fine.
You need to create a device driver. What that entails depends entirely on the platform and the type of device you want to emulate.
Start with the documentation of your operating system and look up references as if you were developing a new hardware device. But you'll just skip any actual hardware interfaces.
Nevertheless this is likely to require kernel programming, so Qt is likely to be inappropriate.

Can I programatically save the data stream sent to the sound card as a WAV file?

In Windows XP, you can configure your sound card properties via the preloaded windows software. In the recording properties, if "stereo mix" or "wave out" (or something similar) is selected as the recording device, programs that can record audio ("Sound Recorder" in windows for example) record a decent quality wave file of the audio stream. I usually use Goldwave from download.com to do this as an example of a third-party application that functions the same.
Well, I've had trouble getting this scenario to happen on Windows Vista or later in a direct no-bullsh*t manner as described above. It's more than just Vista+, it's also that some sound cards don't have that option at all.
I was just wondering if there is a way to run a windows-friendly program (VB?) that takes your audio output stream and converts it (in realtime, obviously) to a WAV file with the default sampling rate as other WAV files have.
Ideally, it would cool if it worked on any operating system, so is it possible to write a web service that "listens" to your audio card like that without making the computer think it's getting a virus attack or something?
Possibly related question:
How to save web audio streaming to file ( c++ / java )
I'm only aware of one manufacturer of sound cards that enabled that option (Creative). However Vista and beyond support a "loopback" mode which gives you effectively the same functionality. You need to use the low level WASAPI rendering stack but it should work just fine.
https://github.com/rdp/virtual-audio-output-sniffer provides a directshow input device to capture the sum of wave out for vista+
You could use low level waveOut API injection and capture what it receives.
I have SkypeMXrecorder, a software that does just that - inject into any exe and 'sniffs' what's going out from it and into the sound hardware. But, it seems rather complicated to implement...

How do I read a video camera in a win32 C program

I have this garden variety USB video camera, and it came with two mini-apps, one that just lets you see what the camera sees, and one that records to an .avi file.
But what's the API if I want to grab images from the camera in my own C program? I am making the assumptions that it's (1) possible and (2) desirable to make some call and have a 2D array of pixel information filled in.
What I really want to do is tinker with image processing algorithms, and for that I'd really like to get my code around some live data.
EDIT -
Having had a healthy exposure to Linux, I can grasp how (ideally/in theory) you could open() the device, use ioctl() to configure it, and read() the data. And I'm virtually certain that that's not how Windows is going to present the API. Not knowing what function names Windows might use for a video device API, or even if it has one, makes it difficult to look up, at least with the win32 api search capabilities that I have at my disposal.
You'll probably need the DirectShow API, provided that's how the camera operates. If the manufacturer created their own code path, you'll need their API.
Your first step, as pointed out by ChrisBD, is to check if Windows supports your device.
If that is the case you have three possible Windows APIs for capture:
DirectShow
VFW. Has more or less been replaced by DirectShow
MediaFoundation. Is the newest API that is intended to replace DirectShow. AFAIK not fully implemented yet and only available in Vista.
From the three DirectShow is the best choice. However, learning and using DirectShow is not a trivial task. An excellent example can be found here.
Another possibility is to use OpenCV. OpenCV is an image processing library, that you can also use for processing the images. OpenCV has an image capture API that provides a simpler abstraction and is easier to use than the Windows APIs.
The API is the way to go.
A good indication of whether the camera requires a bespoke one or not is to see if it is recognised by a PC without the manufacturer's applications installed. If windows has the drivers built in the you should be able to use the windows APIs to capture the images.
Alternatively if you know what compression codec has been used for the AVI file you could unpack it.
Ideally it would be good if you could capture the video in native (YUV, RGB15 or similar) format as then you can work on compression as well as manipulation.

Capturing the desktop with Windows Media Format(WMF)

I am using Windows Media Format SDK to capture the desktop in real time and save it in a WMV file (actually this is an oversimplification of my project, but this is the relevant part). For encoding, I am using the Windows Media Video 9 Screen codec because it is very efficient for screen captures and because it is available to practically everybody without the need to install anything, as the codec is included with Windows Media Player 9 runtime (included in Windows XP SP1).
I am making BITMAP screen shots using the GDI functions and feed those BITMAPs to the encoder. As you can guess, taking screen shots with GDI is slow, and I don't get the screen cursor, which I have to add manually to the BITMAPs. The BITMAPs I get initially are DDBs, and I need to convert those to DIBs for the encoder to understand (RGB input), and this takes more time.
Firing a profiler shows that about 50% of the time is spent in WMVCORE.DLL, the encoder. This is to be expected, of course as the encoding is CPU intensive.
The thing is, there is something called Windows Media Encoder that comes with a SDK, and can do screen capture using the desired codec in a simpler, and more CPU friendly way.
The WME is based on WMF. It's a higher lever library and also has .NET bindings. I can't use it in my project because this brings unwanted dependencies that I have to avoid.
I am asking about the method WME uses for feeding sample data to the WMV encoder. The encoding takes place with WME exactly like it takes place with my application that uses WMF. WME is more efficient than my application because it has a much more efficient way of feeding video data to the encoder. It doesn't rely on slow GDI functions and DDB->DIB conversions.
How is it done?
The source to CamStudio, a GPL'd screencasting app that's been around for years (commercially and then open-srcd later) might be useful?
http://sourceforge.net/project/showfiles.php?group_id=131922
I'd suggest looking at the guts of VNC clients too, though they're probably very simplistic (I think just grabbing screenshots then jpg'ing the tiles that have changed since the last capture).
You might want to consider not using WMV9 as the encoder for on-the-fly encoding if it is too cpu-heavy? Maybe use an older, less efficient compressor (like MS RLE) as used by HyperCam and then compress to WMV afterwards? MS RLE has been a default install since at least Win2000 I believe:
http://wiki.multimedia.cx/index.php?title=Microsoft_RLE
CamStudio's Lossless codec is GPL (same link as above), that offers pretty good compression (though you'd need to bundle the dll in your installer) and could be used on the fly, it works well with high compression on all modern systems.
It's been ages since I've done any Win32 coding, but AFAIK, WMF as a format is basically a list of GDI commands and their parameters which would explain why it is much more efficient to encode...
You'd probably need to hook into the top level GDI context (just as Remote Desktop does, I guess) and capture the GDI commands as they are called. I seem to remember there being some way of creating a WMF output GDI context which means you may be able to just delegate calls to it in some way.
I'm guessing here, but you may be able to find example code for the above in the TightVNC/QuickVNC for Windows projects as they would have to do something like that to capture changes on screen in an efficient way.
Have you checked out the BB FlashBack library?
I am on a similar hunt, and I have just started evaluating the BB FlashBack library.
I am not sure about the external dependencies or install footprint. It appears to have a proprietary codec that has to be installed, but the installation of the codec can be handled by the exposed BB FlashBack API.
Beware, there are licensing restrictions (Runtime setting of license keys, ...)
I can send you the CHM from the SDK via e-mail if you want to evaluate the API before committing to a licensed download.
Things I am in the midst of evaluating:
Proper captures of WPF views
mouse cursor tracking
Size of stored movie
How to display stored movie without proprietary codec (i.e. SWF export)
--Batgar

How do I capture the audio that is being played?

Does anyone know how to programmatically capture the sound that is being played (that is, everything that is coming from the sound card, not the input devices such as a microphone).
Assuming that you are talking about Windows, there are essentially three ways to do this.
The first is to open the audio device's main output as a recording source. This is only possible when the driver supports it, although most do these days. Common names for the virtual device are "What You Hear" or "Wave Out". You will need to use a suitable API (see WaveIn or DirectSound in MSDN) to do the capturing.
The second way is to write a filter driver that can intercept the audio stream before it reaches the physical device. Again, this technique will only work for devices that have a suitable driver topology and it's certainly not for the faint-hearted.
This means that neither of these options will be guaranteed to work on a PC with arbitrary hardware.
The last alternative is to use a virtual audio device, such as Virtual Audio Cable. If this device is set as the defualt playback device in Windows then all well-behaved apps will play through it. You can then record from the same device to capture the summed output. As long as you have control over the device that the application you want to record uses then this option will always work.
All of these techniques have their pros and cons - it's up to you to decide which would be the most suitable for your needs.
You can use the Waveform Audio Interface, there is an MSDN article on how to access it per PInvoke.
In order to capture the sound that is being played, you just need to open the playback device instead of the microphone. Open for input, of course, not for output ;-)
If you were using OSX, Audio Hijack Pro from Rogue Amoeba probably is the easiest way to go.
Anyway, why not just looping your audio back into your line in and recording that? This is a very simple solution. Just plug a cable in your audio output jack and your line in jack and start recordung.
You have to enable device stero mix. if you do this, direct sound find this device.

Resources