simple .wav or .mp3 playback in Windows - where has it gone? - winapi

Is there a "modern" replacement for the old Windows sndPlaySound() function, which was a very convenient way of playing a .wav file in the background while you focused on other matters? I now find myself needing to play an .mp3 file in the background and am wondering how to accomplish the same thing in a relatively easy way that the system supports inherently. Perhaps there's a COM component to accomplish basic .mp3 playback?

Over the years there have been a few audio- and media-related APIs, and there are several ways to achieve the goal.
The best in terms of avoiding third-party libraries, OS version coverage, feature set, and simplicity is the DirectShow API. It is 15 years old and still beats the hell out of its rivals; it is supported in all versions of Windows that current and a few previous versions of Visual Studio can target, except WinRT.
The code snippet below plays MP3 and WMA files. It is C++, but since it is all COM it is easily portable across languages.
#include "stdafx.h"
#include <dshow.h>
#include <dshowasf.h>
#include <atlcom.h>
#pragma comment(lib, "strmiids.lib")
#define V(x) ATLVERIFY(SUCCEEDED(x))
int _tmain(int argc, _TCHAR* argv[])
{
static LPCTSTR g_pszPath = _T("F:\\Music\\Cher - Walking In Memphis.mp3");
V(CoInitialize(NULL));
{
CComPtr<IGraphBuilder> pGraphBuilder;
V(pGraphBuilder.CoCreateInstance(CLSID_FilterGraph));
CComPtr<IBaseFilter> pBaseFilter;
V(pBaseFilter.CoCreateInstance(CLSID_WMAsfReader));
CComQIPtr<IFileSourceFilter> pFileSourceFilter = pBaseFilter;
ATLASSERT(pFileSourceFilter);
V(pFileSourceFilter->Load(CT2COLE(g_pszPath), NULL));
V(pGraphBuilder->AddFilter(pBaseFilter, NULL));
CComPtr<IEnumPins> pEnumPins;
V(pBaseFilter->EnumPins(&pEnumPins));
CComPtr<IPin> pPin;
ATLVERIFY(pEnumPins->Next(1, &pPin, NULL) == S_OK);
V(pGraphBuilder->Render(pPin));
CComQIPtr<IMediaControl> pMediaControl = pGraphBuilder;
CComQIPtr<IMediaEvent> pMediaEvent = pGraphBuilder;
ATLASSERT(pMediaControl && pMediaEvent);
V(pMediaControl->Run());
LONG nEventCode = 0;
V(pMediaEvent->WaitForCompletion(INFINITE, &nEventCode));
}
CoUninitialize();
return 0;
}
If you are playing your own files and you are sure they do not contain large ID3 tag sections, the code can be about half as long.
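For illustration, a hedged sketch of that shorter variant, which simply lets the graph builder pick the source, decoder, and renderer filters via IGraphBuilder::RenderFile (same ATL setup and file-path placeholder as above):

#include "stdafx.h"
#include <dshow.h>
#include <atlcom.h>

#pragma comment(lib, "strmiids.lib")

#define V(x) ATLVERIFY(SUCCEEDED(x))

int _tmain(int argc, _TCHAR* argv[])
{
    static LPCTSTR g_pszPath = _T("F:\\Music\\Cher - Walking In Memphis.mp3");
    V(CoInitialize(NULL));
    {
        CComPtr<IGraphBuilder> pGraphBuilder;
        V(pGraphBuilder.CoCreateInstance(CLSID_FilterGraph));
        // Let the graph builder choose the filters itself
        V(pGraphBuilder->RenderFile(CT2COLE(g_pszPath), NULL));
        CComQIPtr<IMediaControl> pMediaControl = pGraphBuilder;
        CComQIPtr<IMediaEvent> pMediaEvent = pGraphBuilder;
        ATLASSERT(pMediaControl && pMediaEvent);
        V(pMediaControl->Run());
        LONG nEventCode = 0;
        V(pMediaEvent->WaitForCompletion(INFINITE, &nEventCode));
    }
    CoUninitialize();
    return 0;
}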

A simple answer to a lot of problems like this is to simply call out to a command-line program with system("play.exe soundfile.mp3") or equivalent. Just treat the command line as another API, an API that has extensive functionality and is standard, portable, flexible, easy to debug, and easy to modify. It may not be as efficient as calling a library function, but that often doesn't matter, particularly if the program being called is already in the disk cache. Incidentally, avoid software complexity just because it's "modern"; often that's evidence of an architecture astronaut and poor programming practice.
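A trivial sketch of this approach (player.exe and the file name are placeholders for whatever command-line player happens to be installed; note that std::system blocks until the player exits, so background playback would mean spawning the process asynchronously, e.g. with CreateProcess):

#include <cstdlib>

int main()
{
    // Blocks until the (placeholder) external player has finished the file
    return std::system("player.exe soundfile.mp3");
}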

When you say "Modern", do you mean a Windows 8 WinRT API? Or do you mean "an API slightly newer than the ones invented for Windows 3.1"?
A survey of audio and video APIs can be found here.
For classic Windows desktop applications, there's PlaySound, which can play any WAV file.
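As a minimal sketch (the file path is a placeholder), asynchronous WAV playback with PlaySound looks roughly like this:

#include <windows.h>
#include <mmsystem.h>
#pragma comment(lib, "winmm.lib")

int main()
{
    // SND_ASYNC returns immediately and plays the WAV in the background
    PlaySound(TEXT("C:\\path\\to\\sound.wav"), NULL, SND_FILENAME | SND_ASYNC);
    Sleep(5000); // demo only: keep the process alive long enough to hear it
    return 0;
}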
For MP3, my team invented a solution using DirectSound and the Windows Media Format SDK. The latter can decode any WMA and MP3 file. We fed the audio stream directly into a DSound buffer. This is not for the faint of heart.
You could likely use the higher-level alternative, the Windows Media Player API.
DirectShow is a legacy alternative, but it is easy to get something up and working with it. Media Foundation is the replacement for DirectShow.
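If Media Foundation is acceptable, a hedged sketch using the MFPlay helper API (available since Windows 7, deprecated but still shipping; the file path is a placeholder) might look like this:

#include <windows.h>
#include <mfplay.h>
#pragma comment(lib, "mfplay.lib")

int wmain()
{
    CoInitializeEx(NULL, COINIT_APARTMENTTHREADED); // MFPlay needs COM initialized
    IMFPMediaPlayer *pPlayer = NULL;
    // Create a player for the file and start playback immediately;
    // no callback and no video window are needed for audio-only playback
    HRESULT hr = MFPCreateMediaPlayer(L"C:\\path\\to\\music.mp3", TRUE,
                                      MFP_OPTION_NONE, NULL, NULL, &pPlayer);
    if (SUCCEEDED(hr))
    {
        Sleep(10000);        // demo only: keep the process alive while audio plays
        pPlayer->Shutdown();
        pPlayer->Release();
    }
    CoUninitialize();
    return 0;
}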

Related

how to play audio output to a device using ffmpeg's avdevice library?

How can I use ffmpeg's avdevice C library to output audio to an audio device (specifically ALSA)? All I could find is its doxygen, and the only useful thing I was able to take out of it is this quote:
"the (de)muxers in libavdevice are of the AVFMT_NOFILE type (they use their own I/O functions). The filename passed to avformat_open_input() often does not refer to an actually existing file, but has some special device-specific meaning - e.g. for xcbgrab it is the display name."
but I don't understand where I specify AVFMT_NOFILE or where I specify which device I want to use. I see how I can get an AVOutputFormat pointer, but then what do I do with it?
Update:
So now I have found the function avformat_alloc_output_context2, and my code looks like this:
AVPacket *pkt = av_packet_alloc();
avformat_alloc_output_context2(&ofmt_ctx, NULL, "alsa", NULL);
avformat_new_stream(ofmt_ctx, NULL);
while (av_read_frame(fmt_ctx, pkt) == 0) {
    av_write_frame(ofmt_ctx, pkt);
}
fmt_ctx is the input file's AVFormatContext.
But I am still getting the error '[alsa # 0x555daf361140] Invalid packet stream index: 1'. What am I missing?
The OP is almost two years old... This is a rabbit hole because of the complexity of open-source, C-based software released 14+ years ago with 1k+ contributors, which supports multiple operating systems through layers of abstraction, one of those being the AVFMT_NOFILE type. There is a reason why audio and video on operating systems take time to get right.
I'm looking into this now and my suggestion is to start here:
https://github.com/FFmpeg/FFmpeg/tree/master/doc/examples
These are demo examples of specific functionality. A quick search on AVFMT_NOFILE brought up this:
https://github.com/FFmpeg/FFmpeg/blob/b6aeee2d8bedcd8cfc6aa91cc124c904a78adb1e/doc/examples/remuxing.c#L132
For examples on finding devices depending on OS, check this link:
https://github.com/rbrisita/codio-sui/wiki/FFmpeg#list-devices
Share your findings if you can!
Godspeed.
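For what it's worth, the 'Invalid packet stream index' error usually means packets from every input stream (an MP3 with embedded cover art, for instance, exposes a video stream too) are being pushed into an output that has only one stream. Below is a hedged sketch of just the remuxing loop that addresses that, assuming the input audio stream index has already been found and stored in audio_idx; note that the ALSA output device generally expects decoded PCM, so compressed inputs will still need a decode/resample step:

extern "C" {
#include <libavformat/avformat.h>
#include <libavcodec/avcodec.h>
#include <libavdevice/avdevice.h>
}

// Sketch: forward only the audio stream of fmt_ctx to the default ALSA device.
// Error handling omitted for brevity.
void play_audio_stream(AVFormatContext *fmt_ctx, int audio_idx)
{
    avdevice_register_all();                       // registers output devices such as "alsa"

    AVFormatContext *ofmt_ctx = NULL;
    avformat_alloc_output_context2(&ofmt_ctx, NULL, "alsa", "default");

    // One output stream, with parameters copied from the input audio stream
    AVStream *out_stream = avformat_new_stream(ofmt_ctx, NULL);
    avcodec_parameters_copy(out_stream->codecpar, fmt_ctx->streams[audio_idx]->codecpar);

    avformat_write_header(ofmt_ctx, NULL);

    AVPacket *pkt = av_packet_alloc();
    while (av_read_frame(fmt_ctx, pkt) == 0) {
        if (pkt->stream_index == audio_idx) {      // drop video/cover-art packets
            pkt->stream_index = 0;                 // map to the single output stream
            av_interleaved_write_frame(ofmt_ctx, pkt);
        }
        av_packet_unref(pkt);
    }

    av_write_trailer(ofmt_ctx);
    av_packet_free(&pkt);
    avformat_free_context(ofmt_ctx);
}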

Does ios_base::sync_with_stdio(false) affect <fstream>?

It is well known that ios_base::sync_with_stdio(false) helps the performance of cin and cout in <iostream> by preventing synchronization between C and C++ I/O. However, I am curious whether it makes any difference at all with <fstream>.
I ran some tests with GNU C++11 and the following code (with and without the ios_base::sync_with_stdio(false) call):
#include <fstream>
#include <iostream>
#include <chrono>
using namespace std;

ofstream out("out.txt");

int main() {
    auto start = chrono::high_resolution_clock::now();
    long long val = 2;
    long long x = 1 << 22;
    ios_base::sync_with_stdio(false);
    while (x--) {
        val += x % 666;
        out << val << "\n";
    }
    auto end = chrono::high_resolution_clock::now();
    chrono::duration<double> diff = end - start;
    cout << diff.count() << " seconds\n";
    return 0;
}
The results are as follows:
With sync_with_stdio(false): 0.677863 seconds (average 3 trials)
Without sync_with_stdio(false): 0.653789 seconds (average 3 trials)
Is this to be expected? Is there a reason for the nearly identical, if not slower, speed with sync_with_stdio(false)?
Thank you for your help.
The idea of sync_with_stdio() is to allow mixing input and output to the standard stream objects (stdin, stdout, and stderr in C, and std::cin, std::cout, std::cerr, and std::clog as well as their wide-character counterparts in C++) without any need to worry about characters being buffered in any of the buffers of the involved objects. Effectively, with std::ios_base::sync_with_stdio(true) the C++ IOStreams can't use their own buffers. In practice that normally means that buffering at the std::streambuf level is entirely disabled. Without a buffer, IOStreams are rather expensive, though, as they process individual characters, potentially involving multiple virtual function calls. Essentially, the speed-up you get from std::ios_base::sync_with_stdio(false) comes from allowing both the C and the C++ library to use their own buffers.
An alternative approach could be to share the buffer between the C and C++ library facilities, e.g., by building the C library facilities on top of the more powerful C++ library facilities (before people complain that this would be a terrible idea, making C I/O slower: that is actually not true at all with a proper implementation of the standard C++ library IOStreams). I'm not aware of any non-experimental implementation which does use that. With this setup std::ios_base::sync_with_stdio(value) wouldn't have any effect at all.
Typical implementations of IOStreams use different stream buffers for the standard stream objects than for file streams. Part of the reason is probably that the standard stream objects are normally not opened using a name but some other entity identifying them, e.g., a file descriptor on UNIX systems, and it would require a "back door" interface to allow using a std::filebuf for the standard stream objects. However, at least early implementations of Dinkumware's standard C++ library, which shipped (ships?), e.g., with MSVC++, used std::filebuf for the standard stream objects. This std::filebuf implementation was just a wrapper around FILE*, i.e., it literally implemented what the C++ standard says rather than implementing it semantically. That was already a terrible idea to start with, but it was made worse by inhibiting std::streambuf-level buffering for all file streams with std::ios_base::sync_with_stdio(true), as that setting also affected file streams. I do not know whether this [performance] problem has been fixed since. Old issues of the C/C++ Users Journal and/or P. J. Plauger's "The [draft] Standard C++ Library" should contain a discussion of this implementation.
tl;dr: According to the standard std::ios_base::sync_with_stdio(false) only changes the constraints for the standard stream objects to make their use faster. Whether it has other effects depends on the IOStream implementation and there was at least one (Dinkumware) where it made a difference.
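For contrast, here is a sketch of essentially the same loop writing to std::cout rather than a file stream; this is where the setting is typically expected to make a measurable difference (actual numbers will of course vary by implementation):

#include <iostream>
#include <chrono>

int main() {
    std::ios_base::sync_with_stdio(false);   // comment out to compare
    auto start = std::chrono::high_resolution_clock::now();
    long long val = 2;
    for (long long x = 1 << 22; x--; ) {
        val += x % 666;
        std::cout << val << '\n';            // standard stream, not a file stream
    }
    auto end = std::chrono::high_resolution_clock::now();
    std::chrono::duration<double> diff = end - start;
    std::cerr << diff.count() << " seconds\n"; // report on stderr, away from the data
    return 0;
}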

OSX CoreAudio play through, no Graph API, no CAPublicUtility

This question is not about AU plugins, but about integrating AudioUnits as building blocks of standalone application programs. After much trying, I can't figure out the simplest "graphless" connection of two AudioUnits that would function as a play-through.
I understand how powerful and sufficient a single audio unit of subtype kAudioUnitSubType_HALOutput can be for capturing, rendering, live-processing, and forwarding of any audio input data. However, play-through only seems to work as long as one is either using full-duplex audio hardware or creating an aggregate I/O device from the built-in devices at the user level.
However, the built-in devices are not full duplex, and "aggregating" them also has certain disadvantages. Therefore I've decided to study a hard-coded two-unit connection (without plunging into the Graph API) and test its behavior with non-full-duplex hardware.
Unfortunately, I have found neither comprehensive documentation nor example code for creating the simplest two-unit play-through using only the straightforward connection paradigm suggested in Apple Technical Note TN2091:
AudioUnitElement halUnitOutputBus = 1;     // 1 suggested by TN2091 (else 0)
AudioUnitElement outUnitInputElement = 1;  // 1 suggested by TN2091 (else 0)

AudioUnitConnection halOutToOutUnitIn;
halOutToOutUnitIn.sourceAudioUnit = halAudioUnit;
halOutToOutUnitIn.sourceOutputNumber = halUnitOutputBus;
halOutToOutUnitIn.destInputNumber = outUnitInputElement;

AudioUnitSetProperty(outAudioUnit,                      // connection destination
                     kAudioUnitProperty_MakeConnection, // property key
                     kAudioUnitScope_Input,             // destination scope
                     outUnitInputElement,               // destination element
                     &halOutToOutUnitIn,                // connection definition
                     sizeof(halOutToOutUnitIn));
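For reference, TN2091 also requires enabling input on the AUHAL and binding it to a capture device before such a connection carries any data; a sketch of that step (deviceID standing for whichever input AudioDeviceID is in use) would be along these lines:

UInt32 enableInput = 1;
UInt32 disableOutput = 0;

// Enable input on the AUHAL (input scope, element 1) ...
AudioUnitSetProperty(halAudioUnit, kAudioOutputUnitProperty_EnableIO,
                     kAudioUnitScope_Input, 1, &enableInput, sizeof(enableInput));
// ... and disable its own output (output scope, element 0), per TN2091
AudioUnitSetProperty(halAudioUnit, kAudioOutputUnitProperty_EnableIO,
                     kAudioUnitScope_Output, 0, &disableOutput, sizeof(disableOutput));
// Bind the AUHAL to a concrete capture device
AudioUnitSetProperty(halAudioUnit, kAudioOutputUnitProperty_CurrentDevice,
                     kAudioUnitScope_Global, 0, &deviceID, sizeof(deviceID));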
My task is to avoid involving Graphs if possible, or, even worse, the CARingBuffers from the so-called PublicUtility, which has been plagued by bugs and latency issues for years and involves some ambitious assumptions, such as:
#if TARGET_OS_WIN32
#include <windows.h>
#include <intrin.h>
#pragma intrinsic(_InterlockedOr)
#pragma intrinsic(_InterlockedAnd)
#else
#include <CoreFoundation/CFBase.h>
#include <libkern/OSAtomic.h>
#endif
Thanks in advance for any hint which may point me in the right direction.

windows audio waveOutSetVolume cross connects midiOutSetVolume

I have a program that generates both MIDI and WAV audio. I need to be able to control the volume and balance of the MIDI and WAV audio separately, and in theory it seems like all I need to do is call
unsigned short left = (unsigned short)(wavvol*wavbal/100.0);
unsigned short right = (unsigned short)(wavvol*(100-wavbal)/100.0);
MMRESULT err = waveOutSetVolume(hWaveOut, left | (right<<16)); // for audio
and
unsigned short left = (unsigned short)(midivol*midibal/100.0);
unsigned short right = (unsigned short)(midivol*(100-midibal)/100.0);
MMRESULT err = midiOutSetVolume(s_hmidiout, left | (right<<16)); // for midi
for MIDI.
The problem is, controlling the MIDI volume also sets the WAV volume and vice versa; it's like they are glued together inside Windows.
Does anyone know if there is a way to unglue them?
BTW, I'm on Windows 7, and I know Microsoft messed up audio in Win7. On XP I had an audio control panel with separate controls for MIDI and wave; that seems to have gone. I guess they just mix it down internally now and don't expose that even at the API level, so maybe I've answered my own question.
Still interested to know if there is a better answer though.
Thanks, Steve
I don't think they can be separated. You could move to the newer IAudioClient interface and use two WASAPI sessions to control the volumes separately, one for WAV and one for MIDI. This won't work on anything below Vista though.
Alternatively, you could track the volume levels in code and, as long as you don't play back both WAV and MIDI at the same time, reset them before playback.
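A hedged sketch of the per-session volume idea follows: one IAudioClient per stream (e.g. one for WAV output, one for MIDI rendering) gives each its own ISimpleAudioVolume. Shared mode is assumed; error handling, CoInitializeEx, and lifetime management are omitted.

#include <windows.h>
#include <mmdeviceapi.h>
#include <audioclient.h>
#include <audiopolicy.h>
#pragma comment(lib, "ole32.lib")

// Create a shared-mode WASAPI session on the default render device and
// return its per-session volume control. The caller keeps the IAudioClient
// alive, since it owns the session.
ISimpleAudioVolume* CreateSessionVolume(IAudioClient **ppClientOut)
{
    IMMDeviceEnumerator *pEnum = NULL;
    IMMDevice *pDevice = NULL;
    IAudioClient *pClient = NULL;
    ISimpleAudioVolume *pVolume = NULL;
    WAVEFORMATEX *pwfx = NULL;

    CoCreateInstance(__uuidof(MMDeviceEnumerator), NULL, CLSCTX_ALL,
                     __uuidof(IMMDeviceEnumerator), (void**)&pEnum);
    pEnum->GetDefaultAudioEndpoint(eRender, eConsole, &pDevice);
    pDevice->Activate(__uuidof(IAudioClient), CLSCTX_ALL, NULL, (void**)&pClient);

    pClient->GetMixFormat(&pwfx);
    pClient->Initialize(AUDCLNT_SHAREMODE_SHARED, 0,
                        10000000 /* 1 second buffer */, 0, pwfx, NULL);
    pClient->GetService(__uuidof(ISimpleAudioVolume), (void**)&pVolume);

    CoTaskMemFree(pwfx);
    pDevice->Release();
    pEnum->Release();
    *ppClientOut = pClient;   // keep this alive for as long as the session is used
    return pVolume;           // e.g. pVolume->SetMasterVolume(0.5f, NULL);
}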

Are the *A Win32 API calls still relevant?

I still see advice about using the LPTSTR/TCHAR types, etc., instead of LPWSTR/WCHAR. I believe Unicode support was well established by Win2k, and I frankly don't write code for Windows 98 anymore (excepting special cases, of course). Given that I don't care about Windows 98 (or, even less, ME), as they're decade-old operating systems, is there any reason to use the compatibility TCHAR, etc. types? Why still advise people to use TCHAR? What benefit does it add over using WCHAR directly?
If someone tells you to walk up to 1,000,000 lines of non-_UNICODE C++, with plenty of declarations using char instead of wchar_t or TCHAR or WCHAR, you had better be prepared to cope with the non-Unicode Win32 API. Conversion on a large scale is quite costly, and may not be something the source-o-money is prepared to pay for.
As for new code, well, there's so much example code out there using TCHAR that it may be easier to cut and paste, and there is in some cases some friction between WCHAR as wchar_t and WCHAR as unsigned short.
Who knows, maybe some day MS will add a UTF-32 data type under TCHAR?
Actually, the Unicode versions of the functions were introduced with Win32 in 1993, with Windows NT 3.1. In fact, on the NT-based OSes, almost all the *A functions just convert to Unicode and call the *W version internally. Also, support for the *W functions on 9x does exist through the Microsoft Layer for Unicode.
For new programs, I would definitely recommend using the TCHAR macros or WCHARs directly. I doubt MS will be adding support for any other character sizes during NT's lifetime. For existing code bases, I guess it would depend on how important it is to support Unicode versus the cost of fixing it. The *A functions need to stay in Win32 forever for backward compatibility.
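As a small illustration of the difference (MessageBox is used here only as an example API):

#include <windows.h>
#include <tchar.h>
#pragma comment(lib, "user32.lib")

int main()
{
    // TCHAR build: MessageBox expands to MessageBoxW when UNICODE is defined,
    // MessageBoxA otherwise; string literals need the _T()/TEXT() macro.
    MessageBox(NULL, _T("Hello"), _T("TCHAR build"), MB_OK);

    // Explicit wide-character call: always the *W function, regardless of the
    // UNICODE macro, with plain L"" wide literals.
    MessageBoxW(NULL, L"Hello", L"Explicit WCHAR", MB_OK);
    return 0;
}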
