Is the format of a Windows audio loopback capture fixed? Or is it sound card dependent? - windows

I am using the Windows Core Audio API to do loopback capture and then process the data. On my machine I get a 48000 Hz sample rate with 32-bit floats for the format. Is that what Windows is using internally? I'm wondering if I'm tapping the output before any hardware-specific conversion, so the format is always the same, or if I might be getting 16-bit ints on some other machine.

There is clearly some variation between machines, at least with respect to sample rate, as WASAPI on my machine gives 32-bit floats at 44100 Hz. The documentation for GetMixFormat (remarks section, paragraphs 2 and 3) suggests that the supplied format is the internal format used for mixing, and that it may well differ from what the sound card actually accepts as input, but it doesn't make clear exactly which formats may be used. I suspect this is intentionally vague so as to encourage developers to handle multiple formats, in case several may be in use. That said, given that they abstract the mix format from the sound card, I would be surprised if they used different internal formats on different machines.
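Since GetMixFormat may report either 32-bit float or 16-bit PCM depending on the machine, the safe approach is to normalize whatever you capture. A minimal Python sketch of that normalization step (the function name and parameters are my own, not part of any Windows API):

```python
import struct

def frames_to_floats(raw: bytes, bits_per_sample: int, is_float: bool):
    """Normalize captured PCM bytes to floats in [-1.0, 1.0]."""
    if is_float and bits_per_sample == 32:
        # Already IEEE floats; just unpack them (WASAPI data is little-endian).
        return list(struct.unpack("<%df" % (len(raw) // 4), raw))
    if not is_float and bits_per_sample == 16:
        # Signed 16-bit PCM: scale into the same float range.
        ints = struct.unpack("<%dh" % (len(raw) // 2), raw)
        return [s / 32768.0 for s in ints]
    raise ValueError("unhandled format: %d-bit, float=%s"
                     % (bits_per_sample, is_float))
```

With a front door like this, the rest of your processing code never needs to know which format the mixer happened to use.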

Related

AFSK Encoding and Decoding in Ruby

I'm looking to write an 'Audio Frequency Shift Keying' (AFSK) encoder and decoder that could be used to transmit and receive data using radio/sound waves.
Bell 202 is the AFSK standard I'm trying to work from; it encodes binary data as two tones, 1200 Hz and 2200 Hz, which stand in for the 1 and 0 of binary.
I'm trying to use win32-sound within Ruby to generate those tones, which it does. The issue is that switching from one tone to the other takes about 1 second, which is far too slow for the 1200 baud speed that Bell 202 is capable of operating at.
My question is: is there any other method to create these tones and switch between them faster? Also, while I'm here, is there a gem that would allow me to decode those tones back into binary data?
I tried to write an AFSK modulator/demodulator in Ruby a while ago. I got a working prototype that could write and read WAV files but came nowhere near real time performance (I optimised it up to about 8x real time if I recall correctly).
Some tips:
Ruby + Audio pretty much sucks, you probably won't find many Gems for demodulation
For me, only a pure C solution (with a small Objective-C wrapper so I could use it in an iPhone app) could provide the necessary performance boost
Eliminating discontinuities in the encoded signal by using Continuous Phase FSK (CPFSK) helps a lot with the signal quality
I used a coherent demodulator followed by a state machine which worked out pretty well https://www.kth.se/polopoly_fs/1.141538!/Menu/general/column-content/attachment/lec9.pdf
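To make the CPFSK tip concrete, here's a minimal encoder sketch in Python (stdlib only; the constants and function name are illustrative, not from any gem). The key detail is the phase accumulator: it carries over across bit boundaries, so switching tones never produces a jump in the waveform.

```python
import math

MARK_HZ = 1200.0   # Bell 202 tone for a "1"
SPACE_HZ = 2200.0  # Bell 202 tone for a "0"
BAUD = 1200.0
RATE = 48000       # output sample rate

def afsk_encode(bits, rate=RATE):
    """Generate CPFSK samples: the phase accumulates continuously,
    so there is no discontinuity when the tone switches."""
    samples = []
    phase = 0.0
    samples_per_bit = int(rate / BAUD)  # 40 samples per bit at 48 kHz
    for bit in bits:
        freq = MARK_HZ if bit else SPACE_HZ
        step = 2.0 * math.pi * freq / rate
        for _ in range(samples_per_bit):
            samples.append(math.sin(phase))
            phase = (phase + step) % (2.0 * math.pi)
    return samples
```

Writing these samples to a WAV file (e.g. with Ruby's wav-file gem or Python's wave module) and playing that is also a way around the 1-second tone-switching latency of win32-sound: you synthesize the whole packet up front instead of switching tones live.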

The reason behind endianness?

I was wondering why some architectures use little-endian and others big-endian. I remember reading somewhere that it has to do with performance; however, I don't understand how endianness can influence it. I also know that:
The little-endian system has the property that the same value can be read from memory at different lengths without using different addresses.
Which seems a nice feature, but, even so, many systems use big-endian, which probably means big-endian has some advantages too (if so, which?).
I'm sure there's more to it, most probably digging down to the hardware level. Would love to know the details.
I've looked around the net a bit for more information on this question, and there is quite a range of answers and reasonings to explain why big- or little-endian ordering may be preferable. I'll do my best to explain here what I found:
Little-endian
The obvious advantage of little-endianness is what you mentioned already in your question... the fact that a given value can be read from the same memory address at a variety of widths. As the Wikipedia article on the topic states:
Although this little-endian property is rarely used directly by high-level programmers, it is often employed by code optimizers as well as by assembly language programmers.
Because of this, multi-precision mathematical routines are easier to write: the significance of a byte always corresponds to its memory address, whereas with big-endian numbers this is not the case. This seems to be the argument for little-endianness that is quoted over and over again; given its prevalence, I would have to assume that the benefits of this ordering are relatively significant.
Another interesting explanation that I found concerns addition and subtraction. When adding or subtracting multi-byte numbers, the least significant byte must be fetched first to see if there is a carryover to more significant bytes. Because the least-significant byte is read first in little-endian numbers, the system can parallelize and begin calculation on this byte while fetching the following byte(s).
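The "same value at different widths" property is easy to demonstrate with Python's struct module (a toy illustration, not tied to any particular architecture):

```python
import struct

value = 42
le = struct.pack("<I", value)   # 4-byte little-endian: b'\x2a\x00\x00\x00'
be = struct.pack(">I", value)   # 4-byte big-endian:    b'\x00\x00\x00\x2a'

# Little-endian: reading the same address at 1, 2, or 4 bytes
# yields the same value, because the low byte comes first.
assert struct.unpack("<B", le[:1])[0] == value
assert struct.unpack("<H", le[:2])[0] == value
assert struct.unpack("<I", le[:4])[0] == value

# Big-endian: a narrower read from the same address gives a
# different value (here the high-order zero bytes).
assert struct.unpack(">H", be[:2])[0] == 0
```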
Big-endian
Going back to the Wikipedia article, the stated advantage of big-endian numbers is that their magnitude can be more easily estimated because the most significant digit comes first. Relatedly, it is simple to tell whether a number is positive or negative by examining the high bit of the byte at the lowest address, since big-endian stores the most significant byte first.
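As a small illustration of that sign check (Python, purely illustrative): in a big-endian encoding, a single byte read at the lowest address is enough to determine the sign.

```python
import struct

neg = struct.pack(">i", -5)  # big-endian 32-bit two's complement
pos = struct.pack(">i", 5)

# The sign bit is the high bit of the byte at offset 0.
assert neg[0] & 0x80 != 0   # negative
assert pos[0] & 0x80 == 0   # positive
```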
What is also stated when discussing the benefits of big-endianness is that the binary digits are ordered as most people order base-10 digits. This is advantageous performance-wise when converting from binary to decimal.
While all these arguments are interesting (at least I think so), their applicability to modern processors is another matter. In particular, the addition/subtraction argument was most valid on 8 bit systems...
For my money, little-endianness seems to make the most sense and is by far the most common when looking at all the devices which use it. I think that the reason why big-endianness is still used, is more for reasons of legacy than performance. Perhaps at one time the designers of a given architecture decided that big-endianness was preferable to little-endianness, and as the architecture evolved over the years the endianness stayed the same.
The parallel I draw here is with JPEG, which is a big-endian format despite the fact that virtually all the machines that consume it are little-endian. While one can ask what the benefits of JPEG being big-endian are, I would venture to say that for all intents and purposes the performance arguments mentioned above don't make a shred of difference. The fact is that JPEG was designed that way, and so long as it remains in use, that way it shall stay.
I would assume that it was once the hardware designers of the first processors who decided which endianness would best integrate with their preferred/existing/planned micro-architecture for the chips they were developing from scratch.
Once established, and for compatibility reasons, the endianness was more or less carried on to later generations of hardware, which would support the 'legacy' argument for why both kinds still exist today.

fast encoding video codec?

Can anybody compare popular video codecs by encoding speed? I understand that better compression usually requires more processing time, but it's also possible that some codecs still provide comparably good compression with fast encoding. Any comparison links?
Thanks for your help.
[EDIT]: codecs can be compared by the algorithms they use, regardless of the particular implementation, hardware, or video source, something like big-O notation for mathematical algorithms
When comparing VP8 and x264, VP8 shows 5-25 times lower encoding speed with 20-30% lower quality on average. For example, the x264 High-Speed preset is faster and has higher quality than any of the VP8 presets on average.
It's tough to compare feature sets vs. speed/quality.
See a quality comparison here: http://www.compression.ru/video/codec_comparison/h264_2012/
The following paragraph is from VP9 encoding/decoding performance vs. HEVC/H.264 by Ronald S. Bultje:

x264 is an incredibly well-optimized encoder, and many people still use it. It's not that they don't want better bitrate/quality ratios, but rather, they complain that when they try to switch, it turns out these new codecs have much slower encoders, and when you increase their speed settings (which lowers their quality), the gains go away. Let's measure that! So, I picked a target bitrate of 4000kbps for each encoder, using otherwise the same settings as earlier, but instead of using the slow presets, I used variable-speed presets (x265/x264: --preset=placebo-ultrafast; libvpx: --cpu-used=0-7).
This is one of those topics where Your Mileage May Vary widely. If I were in your position, I'd start off with a bit of research on Wikipedia, and then gather the tools to do some testing and benchmarking. The source video format will probably affect overall encoding speed, so you should test with video that you intend to use on the Production system.
Video encoding time can vary widely depending on the hardware used, and whether you used an accelerator card, and so on. It's difficult for us to make any hard and fast recommendations without explicit knowledge of your particular set up.
The only way to make decisions like this is to test these things yourself. I've done the same thing when comparing virtualisation tools. It's fun, too!

What is a good format for storing sounds on windows compressed?

Currently we use .wav files for storing the sounds in our product. However, these can get large. I know there are many different sound formats out there; what is the best one to use that will:
1) Work on all windows-based systems (XP+)
2) Doesn't add a lot of extra code (i.e., including a 3 MB library to play MP3s will offset any gains I get from removing the .wav files)
3) Isn't GPL or some code I can't use (ideally just something in the Windows SDK, or maybe just a different compression scheme for .wav that compresses better and works nicely with sndPlaySound(..) or something similar)
Any ideas would be appreciated, thanks!
While WAV files are typically uncompressed, they can be compressed with various codecs and still be played with the system APIs. The largest factors in the overall size are the number of channels (mono or stereo), the sample rate (11k, 44.1k, etc.), and the sample size (8-bit, 16-bit, 24-bit). This link discusses the various compression schemes supported for WAV files and associated file sizes:
http://en.wikipedia.org/wiki/WAV
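Before reaching for a codec, it's worth quantifying those three factors: for uncompressed PCM, the payload size is simply channels × sample rate × bytes per sample × duration. A quick sketch (the helper name is mine, for illustration):

```python
def wav_data_bytes(channels, sample_rate, bits_per_sample, seconds):
    """Size of the raw PCM payload of an uncompressed WAV file."""
    return channels * sample_rate * (bits_per_sample // 8) * seconds

# One minute of CD-quality stereo:
#   2 ch * 44100 Hz * 2 bytes * 60 s = 10,584,000 bytes (~10 MB)
# The same minute in mono, 11025 Hz, 8-bit is only 661,500 bytes,
# so halving channels/rate/depth may already buy you most of the win.
```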
Beyond that, you could resort to encoding the data to WMA files, which are also richly supported without third party libraries, but would probably require using the Windows Media SDK or DirectShow for playback.
This article discusses the WMA codecs and levels of compression that can be expected:
http://www.microsoft.com/windows/windowsmedia/forpros/codecs/audio.aspx
If it is the totality of the files that 'gets large' rather than individual files, and the time taken by an extra decompression step does not prevent timely playback, you might consider zipping up the files yourself and unzipping them as needed. I realize this sounds, and in many cases may be, inefficient, but if MP3 is ruled out it may be worth looking at, depending on other considerations not mentioned in your question.
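If you go the zip route, Python's stdlib shows the idea compactly (the helper names are mine; in a C++ product you would call zlib or a similar library directly): DEFLATE-compress each WAV into an archive at build time, then extract it to memory just before handing it to sndPlaySound.

```python
import io
import zipfile

def zip_payload(name, data):
    """DEFLATE-compress one payload into an in-memory zip archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(name, data)
    return buf.getvalue()

def unzip_payload(archive, name):
    """Extract one payload back out of the archive bytes, losslessly."""
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        return zf.read(name)
```

Quiet or repetitive audio compresses well this way; dense music compresses much less, which is why a lossy codec like WMA wins on size for that material.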
I'd look at DirectShow and see if you can use the DirectShow MP3 or WMA codecs to compress the audio stream. All the DLLs are in-box on Windows so there's no additional redistributable needed.

decoding 802.11 b

I have raw data grabbed from a spectrometer that was monitoring Wi-Fi (802.11b) channel 6
(two laptops in ad-hoc mode pinging each other).
I would like to decode this data in MATLAB.
I have it as a complex vector of 4.6 million samples.
Their spectrum looks quite nice. I am looking for a document a bit less complicated than the IEEE 802.11 standard (which I have).
I can share the measurement data with other people.
There's now a few solutions around for decoding 802.11 using Software Defined Radio (SDR) techniques. As mentioned in a previous answer there is software that is based on gnuradio - specifically there's gr-ieee802-11 and also 802.11n+. Plus the higher end SDR boards like WARP utilise FPGA based implementations of 802.11. There's also a bunch of implementations of 802.11 for Matlab available e.g. 802.11a.
If your data is really raw, then you basically have to build every piece of the signal processing chain in software, which is possible but not really straightforward. Have you checked the relevant Wikipedia page? You might use gnuradio instead of starting from scratch.
I have used the IEEE 802.11 standard to code and decode data in MATLAB.
Coding data is an easy task.
Decoding is a bit more sophisticated.
I agree with Stan, it is going to be tough doing everything yourself. You may get some ideas from the projects on CGRAN, like:
https://www.cgran.org/wiki/WifiLocalization