why is it necessary to send pcm samples to ALSA? - linux-kernel

I understand that if hardware decoders are not present then we need to send PCM samples to ALSA, but is that solely because of a hardware limitation, or is there a requirement from ALSA as well?

You need to use a sample format that is both known to ALSA and supported by the device.
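As a minimal sketch (the "default" device name and S16_LE stereo at 48 kHz are assumptions, not requirements), this is roughly how you declare to ALSA which PCM format you intend to write; the call fails if neither the device nor a plugin in the chain can accept that format:

    #include <alsa/asoundlib.h>
    #include <cstdio>

    int main()
    {
        snd_pcm_t* pcm = nullptr;
        if (snd_pcm_open(&pcm, "default", SND_PCM_STREAM_PLAYBACK, 0) < 0)
            return 1;

        /* 16-bit signed little-endian, stereo, 48 kHz, interleaved frames. */
        int err = snd_pcm_set_params(pcm,
                                     SND_PCM_FORMAT_S16_LE,
                                     SND_PCM_ACCESS_RW_INTERLEAVED,
                                     2, 48000,
                                     1,        /* allow the "plug" layer to resample */
                                     500000);  /* 0.5 s latency target */
        if (err < 0)
            fprintf(stderr, "format not usable: %s\n", snd_strerror(err));

        snd_pcm_close(pcm);
        return 0;
    }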

Related

How Does Windows Handle Non-Realtime USB Audio Sources?

I am currently researching the feasibility of making a device that outputs PCM audio through the USB audio class streaming interface. This device has its own clock and, importantly, does not generate samples at a multiple of 1 Hz as the USB spec can specify, and it produces packets in asynchronous mode. How does Windows handle it when a USB audio stream consistently gives it samples at a rate above or below what the USB descriptor indicates, and at what level of the OS is this handled?
Second (and depending on the answer to the first question this may already be answered), the entire purpose of this project would be to capture this digital audio in its native format and sampling rate. What Windows application APIs would provide the exact PCM input from the USB audio stream, with no interpolation or other alterations or artifacts?
I don't know about Windows specifically here, but Java on Windows would likely be set up to read the data as an AudioInputStream and would output it via a SourceDataLine.
As far as timing issues, PCM processed by the SourceDataLine will be configured to a given sample rate and byte structure (configuration details provided in an AudioFormat class). The code underlying a SourceDataLine employs a buffer and, I think, some sort of BlockingQueue or a native-code implementation of something similar; I'm not entirely clear on this latter detail.
But the gist is that the SourceDataLine will suspend operation until it is able to fulfill its task. Thus, if the native code's DAC function is not ready, the SourceDataLine thread will suspend and wait until the output stage is ready to accept the next block of data for processing.
There are multiple transmission layers on the incoming data, much of which I don't know enough about. But I presume that if you have a way of assembling the incoming packets into a stream (with whatever buffering is required), then you should be able to receive and play PCM. Surely there are structures in C that would provide the equivalent functions of the Java classes I cited.
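To illustrate the gist of that blocking hand-off outside of Java, here is a rough C++ sketch of a bounded queue of PCM blocks: the receiving side blocks when the output stage has not yet drained earlier blocks, and the output side blocks when nothing has arrived. All names are made up for illustration; this is not any particular audio API.

    #include <condition_variable>
    #include <cstddef>
    #include <deque>
    #include <mutex>
    #include <vector>

    class PcmBlockQueue {
    public:
        explicit PcmBlockQueue(std::size_t maxBlocks) : maxBlocks_(maxBlocks) {}

        // Called by the receiving side; blocks while the queue is full.
        void push(std::vector<short> block) {
            std::unique_lock<std::mutex> lock(mutex_);
            notFull_.wait(lock, [&] { return blocks_.size() < maxBlocks_; });
            blocks_.push_back(std::move(block));
            notEmpty_.notify_one();
        }

        // Called by the output/DAC side; blocks while the queue is empty.
        std::vector<short> pop() {
            std::unique_lock<std::mutex> lock(mutex_);
            notEmpty_.wait(lock, [&] { return !blocks_.empty(); });
            std::vector<short> block = std::move(blocks_.front());
            blocks_.pop_front();
            notFull_.notify_one();
            return block;
        }

    private:
        std::size_t maxBlocks_;
        std::deque<std::vector<short>> blocks_;
        std::mutex mutex_;
        std::condition_variable notEmpty_, notFull_;
    };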

Hardware H264 encoding ID3D11Texture2D with Media Foundation

I am working on a project which captures the screen and encodes it. I can already capture the screen using the desktop duplication API (Win8+). Using the API I can get ID3D11Texture2D textures, transfer them from the GPU to the CPU, and then use libx264 to encode them.
However, pulling textures from the GPU to the CPU can be a bottleneck which can potentially decrease the FPS. libx264 also takes up CPU cycles (depending on quality) to encode frames. As an optimization, I am looking to encode the ID3D11Texture2D textures on the GPU itself instead of using the CPU for encoding.
I have already checked the documentation and some sample code, but I have had no success. I would appreciate it if someone could point me to a resource that does exactly what I want reliably.
Video encoders, hardware and software, might be available in different form factors. Windows itself offers extensible APIs with a choice of encoders, and additional encoders might be available as libraries and SDKs. You are already using one such library (x264). Hardware encoders are typically vendor-specific and depend on the available hardware, which is involved directly in the process of encoding. If you are interested in a solution for specific hardware, it might make sense to check for the respective SDK.
Otherwise, the typical generic interface for hardware-backed video encoding on Windows is the Media Foundation Transform (MFT). Microsoft provides a stock software-only H.264 video encoder which is unlikely to give any advantage over x264, except for the fact that it shares the MFT interface with other options. Video hardware drivers, however, often install additional MFTs backed by a hardware implementation for the hardware available. Examples of such are:
Intel® Quick Sync Video H.264 Encoder MFT
NVIDIA H.264 Encoder MFT
AMDh264Encoder
Offered by different vendors, these MFTs provide similar functionality, and using them to encode H.264 is a good way to take advantage of hardware video encoding across a wide range of hardware (a short enumeration sketch follows the links below).
See also:
Registering and Enumerating MFTs
Basic MFT Processing Model
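As a sketch of that enumeration (not a complete encoder setup), MFTEnumEx can be asked for hardware-backed transforms whose output type is H.264; the friendly names printed are typically the vendor MFTs listed above:

    #include <objbase.h>
    #include <mfapi.h>
    #include <mfidl.h>
    #include <mftransform.h>
    #include <cstdio>

    #pragma comment(lib, "mfplat.lib")
    #pragma comment(lib, "mfuuid.lib")
    #pragma comment(lib, "ole32.lib")

    int main()
    {
        // COM and Media Foundation must be initialized before enumerating MFTs.
        CoInitializeEx(nullptr, COINIT_MULTITHREADED);
        MFStartup(MF_VERSION);

        // Ask only for hardware-backed encoders that output H.264.
        MFT_REGISTER_TYPE_INFO output = { MFMediaType_Video, MFVideoFormat_H264 };
        IMFActivate** activates = nullptr;
        UINT32 count = 0;
        HRESULT hr = MFTEnumEx(MFT_CATEGORY_VIDEO_ENCODER,
                               MFT_ENUM_FLAG_HARDWARE | MFT_ENUM_FLAG_SORTANDFILTER,
                               nullptr,   // any input type
                               &output,
                               &activates, &count);
        if (SUCCEEDED(hr))
        {
            for (UINT32 i = 0; i < count; i++)
            {
                // Friendly name, e.g. "NVIDIA H.264 Encoder MFT".
                WCHAR* name = nullptr;
                UINT32 length = 0;
                if (SUCCEEDED(activates[i]->GetAllocatedString(
                        MFT_FRIENDLY_NAME_Attribute, &name, &length)))
                {
                    wprintf(L"%u: %s\n", i, name);
                    CoTaskMemFree(name);
                }
                activates[i]->Release();
            }
            CoTaskMemFree(activates);
        }

        MFShutdown();
        CoUninitialize();
        return 0;
    }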
You have to check whether sharing a texture between the GPU encoder and DirectX is possible.
I know it's possible to share a texture between the Nvidia decoder and DirectX, because I've done it successfully. Nvidia has some interop capability, so first check whether you can share the texture to do everything on the GPU.
With Nvidia you can do this: Nvidia decoding -> DirectX display, all on the GPU.
Check whether DirectX display -> Nvidia encoding is possible (knowing Nvidia offers interop).
For Intel and ATI, I don't know whether they provide interop with DirectX.
The main thing is to check whether you can interop your DirectX texture with the GPU encoder's texture.
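If you go the MFT route from the first answer, a rough sketch of handing a captured ID3D11Texture2D to the encoder without a CPU copy is to wrap it in an IMFSample via MFCreateDXGISurfaceBuffer; note that the encoder must additionally be bound to a DXGI device manager, which is not shown here.

    #include <d3d11.h>
    #include <mfapi.h>
    #include <mfidl.h>

    HRESULT WrapTextureAsSample(ID3D11Texture2D* texture, IMFSample** outSample)
    {
        // Wrap the DXGI surface in a media buffer (FALSE = not bottom-up).
        IMFMediaBuffer* buffer = nullptr;
        HRESULT hr = MFCreateDXGISurfaceBuffer(__uuidof(ID3D11Texture2D), texture,
                                               0, FALSE, &buffer);
        if (FAILED(hr))
            return hr;

        IMFSample* sample = nullptr;
        hr = MFCreateSample(&sample);
        if (SUCCEEDED(hr))
            hr = sample->AddBuffer(buffer);
        buffer->Release();

        if (FAILED(hr))
        {
            if (sample) sample->Release();
            return hr;
        }
        *outSample = sample;  // caller sets sample time/duration and calls ProcessInput()
        return S_OK;
    }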

How to encode a bitmap with the H264 video encoder MFT in Windows

My application encodes frames captured with the GDI or DXGI method. Currently I am doing the encoding with the help of the x264 library.
AFAIK x264 is a software-based library; I want to do the encoding with the help of the GPU, so it can save CPU cycles, and I hope the speed will also be faster.
After searching, I found the H.264 Video Encoder MFT, which does H.264 encoding.
But a couple of questions remain unanswered for me.
1) Is it faster than the x264 encoding library?
2) Can a bitmap frame be encoded with the help of this MFT?
- I have seen that only the MFVideoFormat_I420, MFVideoFormat_IYUV, MFVideoFormat_NV12, MFVideoFormat_YUY2, and MFVideoFormat_YV12 formats are supported
3) Is it hardware accelerated (i.e. does it use the CPU or the GPU)?
- Initially my understanding was that it uses the GPU, but I got confused after reading this post: MFT Encoder (h264) High CPU utilization.
4) Can the H.264 Video Encoder MFT be used standalone without the sink writer, as I have to send the data over the network?
5) Is there any other alternative on Windows?
It might be that some questions are very silly; please feel free to edit.
The Media Foundation H.264 Video Encoder is a software encoder. From my [subjective] experience it is slower than x264 and, perhaps more importantly, x264 offers a wider range of settings, specifically when it comes to choosing modes at the speed-over-quality end of the range. Either way, the stock MS encoder is not hardware accelerated.
However, there might be other MFTs available (typically installed with the respective hardware drivers) that do hardware-accelerated H.264 encoding. You can discover them by enumerating MFTs; perhaps the most popular is the Intel Quick Sync Video (QSV) encoder.
The HardwareVideoEncoderTransform app does the enumeration and provides you with the relevant details.
Typical input is NV12; some offer other input choices (such as 32-bit RGB). If you need other formats, you will have to pre-convert the input (see the sketch below).
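As a sketch of such a pre-conversion (using the common BT.601 integer coefficients; the 32-bit BGRA layout and even frame dimensions are assumptions), a plain CPU fallback could look like this; in practice a color-converter MFT or a GPU shader would do the same job faster:

    #include <cstdint>
    #include <vector>

    // Hypothetical helper: convert a 32-bit BGRA frame (as GDI/DXGI capture
    // usually produces) to NV12 with BT.601 coefficients. Width/height assumed even.
    std::vector<uint8_t> BgraToNv12(const uint8_t* bgra, int width, int height)
    {
        std::vector<uint8_t> nv12(width * height * 3 / 2);
        uint8_t* yPlane  = nv12.data();                   // full-resolution luma
        uint8_t* uvPlane = nv12.data() + width * height;  // interleaved UV, 2x2 subsampled

        for (int y = 0; y < height; ++y)
        {
            for (int x = 0; x < width; ++x)
            {
                const uint8_t* p = bgra + (y * width + x) * 4;
                int b = p[0], g = p[1], r = p[2];

                yPlane[y * width + x] =
                    (uint8_t)(((66 * r + 129 * g + 25 * b + 128) >> 8) + 16);

                // One U/V pair per 2x2 block, sampled from its top-left pixel.
                if ((x % 2 == 0) && (y % 2 == 0))
                {
                    uint8_t* uv = uvPlane + (y / 2) * width + x;
                    uv[0] = (uint8_t)(((-38 * r - 74 * g + 112 * b + 128) >> 8) + 128); // U
                    uv[1] = (uint8_t)(((112 * r - 94 * g - 18 * b + 128) >> 8) + 128);  // V
                }
            }
        }
        return nv12;
    }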
Hardware-backed encoders' CPU consumption is low, and their efficiency depends on the hardware implementation. Yes, you can use them standalone, either entirely standalone or wrapped as a DirectShow filter and included in a normal DirectShow pipeline.
Alternative H.264 encoders are typically SDK-based, or wrappers over those SDKs in DirectShow/MFT form factors, because vendors package their implementations in well-known forms already familiar to multimedia developers.

What API should be used for real-time audio in OSX?

I am looking to build an application that requires real-time (or as close to it as possible) control of audio output in OS X.
I need the ability to send audio samples to the sound card with as much control and as little delay as possible, since when the audio frames are sent will be closely tied to a timer event run via the clock.
Is the Audio Queue what I am looking for?
Audio Units can be configured for the lowest latency (using short buffers of raw PCM samples) in OS X and iOS. The Audio Queue API is built on top of Audio Units and may add buffering overhead, thus increasing latency.
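A minimal sketch of that Audio Unit route (assuming the default output device, mono 32-bit float PCM at 44.1 kHz, and a test tone standing in for your own timer-driven samples):

    #include <AudioUnit/AudioUnit.h>
    #include <CoreFoundation/CoreFoundation.h>
    #include <cmath>

    // The HAL pulls PCM from this callback in small buffers; filling it directly
    // is what keeps the output path short and the latency low.
    static OSStatus Render(void*, AudioUnitRenderActionFlags*,
                           const AudioTimeStamp*, UInt32, UInt32 frames,
                           AudioBufferList* buffers)
    {
        static double phase = 0.0;
        float* out = static_cast<float*>(buffers->mBuffers[0].mData);
        for (UInt32 i = 0; i < frames; ++i)   // 440 Hz test tone
        {
            out[i] = static_cast<float>(0.2 * sin(phase));
            phase += 2.0 * 3.14159265358979 * 440.0 / 44100.0;
        }
        return noErr;
    }

    int main()
    {
        AudioComponentDescription desc = {};
        desc.componentType = kAudioUnitType_Output;
        desc.componentSubType = kAudioUnitSubType_DefaultOutput;  // default output device
        desc.componentManufacturer = kAudioUnitManufacturer_Apple;

        AudioUnit unit;
        AudioComponentInstanceNew(AudioComponentFindNext(nullptr, &desc), &unit);

        // Mono, packed 32-bit float PCM at 44.1 kHz on the unit's input scope.
        AudioStreamBasicDescription fmt = {};
        fmt.mSampleRate = 44100.0;
        fmt.mFormatID = kAudioFormatLinearPCM;
        fmt.mFormatFlags = kAudioFormatFlagsNativeFloatPacked;
        fmt.mChannelsPerFrame = 1;
        fmt.mBitsPerChannel = 32;
        fmt.mBytesPerFrame = 4;
        fmt.mBytesPerPacket = 4;
        fmt.mFramesPerPacket = 1;
        AudioUnitSetProperty(unit, kAudioUnitProperty_StreamFormat,
                             kAudioUnitScope_Input, 0, &fmt, sizeof(fmt));

        AURenderCallbackStruct cb = { Render, nullptr };
        AudioUnitSetProperty(unit, kAudioUnitProperty_SetRenderCallback,
                             kAudioUnitScope_Input, 0, &cb, sizeof(cb));

        AudioUnitInitialize(unit);
        AudioOutputUnitStart(unit);
        CFRunLoopRunInMode(kCFRunLoopDefaultMode, 5.0, false);  // play for ~5 s
        AudioOutputUnitStop(unit);
        AudioComponentInstanceDispose(unit);
        return 0;
    }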

How to get audio data from the Macbook microphone?

I am looking to write a small audio processing program, and I need some way to get audio input from the microphone in a MacBook.
Buffer polling? Notifications? What class/framework should I be aware of?
One of the easiest ways is with Audio Queues. It's fairly abstracted, with a fair bit of documentation and examples, and simpler than delving into Audio Units and the depths of Core Audio.
Here is the official link.
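A minimal sketch of that Audio Queue approach (assuming the default input device and mono 16-bit PCM at 44.1 kHz as an illustrative format); filled buffers arrive in a callback, which is the notification mechanism asked about:

    #include <AudioToolbox/AudioToolbox.h>
    #include <CoreFoundation/CoreFoundation.h>
    #include <cstdio>

    // Called each time a buffer of microphone samples is ready.
    static void InputCallback(void*, AudioQueueRef queue, AudioQueueBufferRef buffer,
                              const AudioTimeStamp*, UInt32,
                              const AudioStreamPacketDescription*)
    {
        printf("got %u bytes of PCM\n", (unsigned)buffer->mAudioDataByteSize);
        AudioQueueEnqueueBuffer(queue, buffer, 0, nullptr);  // hand the buffer back
    }

    int main()
    {
        // Mono, 16-bit signed integer PCM at 44.1 kHz.
        AudioStreamBasicDescription fmt = {};
        fmt.mSampleRate       = 44100.0;
        fmt.mFormatID         = kAudioFormatLinearPCM;
        fmt.mFormatFlags      = kLinearPCMFormatFlagIsSignedInteger
                              | kLinearPCMFormatFlagIsPacked;
        fmt.mChannelsPerFrame = 1;
        fmt.mBitsPerChannel   = 16;
        fmt.mBytesPerFrame    = 2;
        fmt.mFramesPerPacket  = 1;
        fmt.mBytesPerPacket   = 2;

        AudioQueueRef queue;
        AudioQueueNewInput(&fmt, InputCallback, nullptr, nullptr, nullptr, 0, &queue);

        // Three ~0.5 s buffers rotating through the callback.
        for (int i = 0; i < 3; ++i)
        {
            AudioQueueBufferRef buffer;
            AudioQueueAllocateBuffer(queue, 44100, &buffer);
            AudioQueueEnqueueBuffer(queue, buffer, 0, nullptr);
        }

        AudioQueueStart(queue, nullptr);
        CFRunLoopRunInMode(kCFRunLoopDefaultMode, 10.0, false);  // capture for ~10 s
        AudioQueueStop(queue, true);
        AudioQueueDispose(queue, true);
        return 0;
    }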
Use Core Audio: http://developer.apple.com/mac/library/documentation/MusicAudio/Conceptual/CoreAudioOverview/Introduction/Introduction.html
