What is a good compressed format for storing sounds on Windows? - winapi

Currently we ship our product's sounds as .wav files. However, these can get large. I know there are many different sound formats out there, but which one is best to use that:
1) Works on all Windows-based systems (XP+)
2) Doesn't add a lot of extra code (i.e. including a 3 MB library to play MP3s would offset any gains I get from removing the .wav files)
3) Isn't GPL or some other code I can't use (ideally just something in the Windows SDK, or maybe a different compression scheme for .wav that compresses better and works nicely with sndPlaySound(..) or something similar)
Any ideas would be appreciated, thanks!

While WAV files are typically uncompressed, they can be compressed with various codecs and still be played with the system APIs. The largest factors in the overall size are the number of channels (mono or stereo), the sample rate (11 kHz, 44.1 kHz, etc.), and the sample size (8-bit, 16-bit, 24-bit). This link discusses the various compression schemes supported for WAV files and the associated file sizes:
http://en.wikipedia.org/wiki/WAV
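To make the size factors concrete, here is a minimal sketch of the arithmetic for uncompressed PCM data. The function name and the two example settings are illustrative assumptions, and the small WAV header is ignored.

```python
def pcm_wav_bytes(seconds, sample_rate_hz, bits_per_sample, channels):
    """Approximate size of the audio data in an uncompressed PCM WAV file."""
    bytes_per_sample = bits_per_sample // 8
    return int(seconds * sample_rate_hz * bytes_per_sample * channels)


# Hypothetical 10-second sound stored two different ways.
cd_quality = pcm_wav_bytes(10, 44100, 16, 2)   # 44.1 kHz, 16-bit, stereo
low_quality = pcm_wav_bytes(10, 11025, 8, 1)   # 11.025 kHz, 8-bit, mono

print(cd_quality)                 # 1764000
print(low_quality)                # 110250
print(cd_quality // low_quality)  # 16
```

Under these assumptions, dropping to mono, 8-bit, 11.025 kHz shrinks the data by a factor of 16 before any codec is involved, at an obvious cost in quality.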
Beyond that, you could resort to encoding the data to WMA files, which are also richly supported without third-party libraries, but would probably require using the Windows Media SDK or DirectShow for playback.
This article discusses the WMA codecs and levels of compression that can be expected:
http://www.microsoft.com/windows/windowsmedia/forpros/codecs/audio.aspx

If it is the total size of all the files that 'gets large', rather than the size of any individual file, and the extra decompression step would not delay playback noticeably, you might consider zipping the files yourself and unzipping them as needed. I realize this sounds inefficient, and in many cases it may be, but if MP3 is ruled out it may be worth a look, depending on other considerations not mentioned in your question.
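A rough sketch of the zip-and-unzip idea, using only the Python standard library. The helper names and the file name beep.wav are made up for illustration, and the generated sound is pure silence, which compresses far better than real audio would; actual savings need to be measured on your own files.

```python
import io
import wave
import zipfile


def make_silent_wav(seconds=1, sample_rate_hz=11025):
    """Build a small 8-bit mono WAV in memory (stand-in for a real sound)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(1)            # 1 byte per sample = 8-bit
        w.setframerate(sample_rate_hz)
        w.writeframes(bytes([128]) * (seconds * sample_rate_hz))
    return buf.getvalue()


def pack_sounds(sounds):
    """Compress a {name: wav_bytes} mapping into one in-memory zip archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, data in sounds.items():
            zf.writestr(name, data)
    return buf.getvalue()


def load_sound(archive_bytes, name):
    """Decompress a single sound on demand, leaving the rest compressed."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        return zf.read(name)


original = make_silent_wav()
archive = pack_sounds({"beep.wav": original})
restored = load_sound(archive, "beep.wav")
print(len(original), len(archive), restored == original)
```

The restored bytes are a complete WAV image, so on Windows they could be handed to an in-memory playback call (for example PlaySound with the SND_MEMORY flag) without writing a temporary file.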

I'd look at DirectShow and see if you can use the DirectShow MP3 or WMA codecs to compress the audio stream. All the DLLs are in-box on Windows so there's no additional redistributable needed.

Related

How to sequentially extract video segments from mp4/mkv video files?

Is there a way to programmatically walk through a file and extract (for example) consecutive 10-second fragments from it into separate files?
A colleague is currently using ffmpeg for this, but I'm thinking that invoking this command once per segment will result in a lot of file seeking that may not be necessary.
Or am I wrong, is there some kind of time index in these files that makes the seeks very fast?
The language currently used for calling ffmpeg is Ruby, so a compiled C library might work, and a Java library would definitely work (with JRuby).

Is the format of a Windows audio loopback capture fixed? Or is it sound card dependent?

I am using the Windows Core Audio API to do loopback capture and then process the data. On my machine I get a 48000 Hz sampling rate with 32-bit floats for the format. Is that what Windows is using internally? I'm wondering if I'm tapping the output before any hardware-specific conversion, so that the format is always the same, or if I might be getting 16-bit ints on some other machine?
There is clearly some variation between machines, at least with respect to sample rate, as WASAPI on my machine gives 32-bit floats at 44100 Hz. The documentation for GetMixFormat (remarks section, paragraphs 2 and 3) suggests that the supplied format is the internal format used for mixing, and that it may well differ from what the sound card actually accepts as input, but it doesn't make clear exactly which formats may be used. I suspect that this is intentionally vague so as to encourage developers to handle multiple formats in case they are used somewhere. That said, given that they are abstracting the mix format from the sound card, I would be surprised if they used different internal formats on different machines.
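If you do want to handle more than one format defensively, the processing code can normalise whatever arrives into floats first. The following is only a sketch: the function and its two parameters are hypothetical stand-ins for whatever you read out of the format structure returned by GetMixFormat, and only two sample layouts are covered.

```python
import struct


def to_floats(raw, bits_per_sample, is_float):
    """Convert little-endian raw samples to floats in roughly [-1.0, 1.0]."""
    if is_float and bits_per_sample == 32:
        count = len(raw) // 4
        return list(struct.unpack("<%df" % count, raw[:count * 4]))
    if not is_float and bits_per_sample == 16:
        count = len(raw) // 2
        ints = struct.unpack("<%dh" % count, raw[:count * 2])
        return [s / 32768.0 for s in ints]
    raise ValueError("unsupported sample format")


# The same two sample values delivered in two different layouts.
as_int16 = struct.pack("<2h", 16384, -32768)
as_float32 = struct.pack("<2f", 0.5, -1.0)

print(to_floats(as_int16, 16, False))   # [0.5, -1.0]
print(to_floats(as_float32, 32, True))  # [0.5, -1.0]
```

Raising on an unknown layout, rather than guessing, makes it obvious when a machine reports a format the code has never been tested with.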

What FFmpeg performance settings to use for processing videos for the web

I have a few questions regarding the usage of ffmpeg for processing videos for the web. I'm a beginner, so please bear with me (although I have read some docs on the internet).
Performance
First of all, given the fact that FFMPEG utilizes all cores at 100%, what is the actual parallelism efficiency?
Let's assume the following scenario. I have a video (fullHD, doesn't matter what encoders / compression format was used to obtain the video) and I want to resize (downscale) to various sizes (e.g. 240px, 480px and 720px height) using mp4 format (thus using libx264 with aac codecs).
Using ffmpeg, I see that all of my laptop's cores (8) are used at 100%, and I was wondering which scenario would improve the overall performance of the whole processing task. This leads to basically 2 scenarios. Taking the video mentioned above as input, to obtain the 3 output videos (at 240px, 480px and 720px heights), we either:
1) Process the input video to obtain 1 output video at a time, and let all the cores work at the same time at 100%;
2) Process the video to obtain all output videos in parallel, by binding each output video to a single processor core which will work at 100%.
So the question actually reduces to the parallelism efficiency of the ffmpeg program.
That is, letting ffmpeg run the task procVideo (which takes 1 input video and produces 1 output video, with transcoding, downscaling and so on) on N processor cores doesn't mean it will finish the task N times faster than running the same task bound to a single core. So if the efficiency is smaller than 100%, it's better to run N procVideo tasks in parallel, each bound to a single core, rather than doing the task sequentially for each output video.
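The efficiency argument can be written down as a small calculation. This is a sketch with made-up timings, not measurements, and it assumes every output takes equally long and that there are no more outputs than cores; the function names are hypothetical.

```python
def parallel_efficiency(time_one_core, time_n_cores, n_cores):
    """Speedup divided by core count; 1.0 would mean perfect scaling."""
    speedup = time_one_core / time_n_cores
    return speedup / n_cores


def scenario_totals(time_one_core, time_n_cores, n_outputs):
    """Wall-clock time of both scenarios for n_outputs equally long jobs."""
    one_at_a_time = n_outputs * time_n_cores   # scenario 1: all cores per job
    all_in_parallel = time_one_core            # scenario 2: one core per job
    return one_at_a_time, all_in_parallel


# Made-up timings: 80 s on one core, 20 s on all 8 cores.
efficiency = parallel_efficiency(80.0, 20.0, 8)
totals = scenario_totals(80.0, 20.0, 3)

print(efficiency)  # 0.5
print(totals)      # (60.0, 80.0)
```

With these invented numbers the efficiency is only 50%, yet scenario 1 still finishes first, because three single-core jobs leave five of the eight cores idle. The "run N tasks in parallel" conclusion only follows directly when the number of outputs matches the number of cores.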
Codecs
Other than the above performance problem, the usage of codecs bugs me. I am trying to obtain mp4 videos because of the wide implementation of the format in html5 browsers.
So having a video as input in any format, I want to convert it to mp4. So I'm using libx264 codec with aac.
Use libx264, x264 or h264 for video encoding/decoding?
Use libfdk_aac, libaacplus or aac for audio encoding/decoding to aac?
Also, I would like to know what the licensing fees are for each of the above codecs, as the online resources on these are quite limited / hard to understand.
If anyone could shed some light on those questions, I would really be grateful! Thanks for your time!
There are a few unrelated questions here.
FFmpeg performance
All that follows is based on my personal experience, and is by no means empirical evidence.
Try as you might, you'll be very hard-pressed to find software that is better optimized for performance than FFmpeg.
Also keep in mind that most of the work in this case will be done by libx264, which is very mature and insanely fast. (Just try to encode an equivalent video to H.265 using ffmpeg and the not-quite-mature-yet x265, and you'll understand what I mean).
So in summary, you can assume that a single encoding is as fast as possible on the machine, and parallelizing will not improve anything.
An alternative solution to test is to ask ffmpeg to encode several files in a single invocation, so that the decoding part of the pipeline is only done once, as explained here: https://trac.ffmpeg.org/wiki/Creating%20multiple%20outputs.
In the end, you should test each case by carefully measuring the total encoding time for each scenario.
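As a starting point for such a test, here is a sketch that builds a single ffmpeg invocation with several outputs and a helper to time any command. The output file names and the choice of the scale filter are my own assumptions; check them against the wiki page and your ffmpeg build before relying on them.

```python
import subprocess
import time


def build_multi_output_command(source, heights):
    """Return an ffmpeg argument list producing one H.264/AAC MP4 per height."""
    cmd = ["ffmpeg", "-i", source]
    for h in heights:
        cmd += [
            "-vf", "scale=-2:%d" % h,  # keep aspect ratio, force an even width
            "-c:v", "libx264",
            "-c:a", "aac",
            "out_%dp.mp4" % h,         # hypothetical output name
        ]
    return cmd


def time_command(cmd):
    """Run a command and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start


command = build_multi_output_command("input.mp4", [240, 480, 720])
print(" ".join(command))
# To measure, run: elapsed = time_command(command)
```

Timing this single invocation against three separate ones (and against three concurrent ones) on the same input should show which scenario actually wins on your machine.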
Codecs
x264 and libx264 are one and the same, the difference being that the latter is used by ffmpeg instead of being a standalone tool.
By "h264", I'm not sure what you mean. In ffmpeg, h264 is only a decoder, while libx264 is the encoder. You don't have much choice there.
About AAC, all essential information is present in this web page: https://trac.ffmpeg.org/wiki/Encode/AAC
So if you can obtain a build of ffmpeg linked against libfdk_aac, this is the safest bet for good quality audio.
License fees
This is a very sensitive subject. Most people will outright refuse to give you advice, and I'm no exception, because any legal advice implies liability in case of litigation.
To sum things up, see the following URLs:
https://en.wikipedia.org/wiki/H.264/MPEG-4_AVC#Patent_licensing
https://en.wikipedia.org/wiki/Advanced_Audio_Coding#Licensing_and_patents
Some might argue that the information is made hard to understand on purpose, in order to confuse the general public.

What is the fastest way to combine audio files on a web server?

Disclaimer: Forgive my ignorance of audio/sound processing, my background is web and mobile development and this is a bespoke requirement for one of my clients!
I have a requirement to concatenate 4 audio files, with a background track playing behind all 4 audio files. The source audio files can be created in any format, or have any treatment applied to them, to improve the processing time, but the output quality is still important. For clarity, the input files could be named as follows (.wav is only an example format):
background.wav
segment-a.wav
segment-b.wav
segment-c.wav
segment-d.wav
And would need to be structured something like this:
[------------------------------background.wav------------------------------]
[--segment-a.wav--][--segment-b.wav--][--segment-c.wav--][--segment-d.wav--]
I have managed to use the SoX tool to achieve the concatenation portion of the above using MP3 files, but on a reasonably fast computer I am getting roughly an hour's worth of concatenated audio per minute of processing, which isn't fast enough for my requirements, and I haven't applied the background sound or any 'nice to haves' such as trimming/fading yet.
My questions are:
Is SoX the best/only tool for this kind of operation?
Is there any way to make the process faster without sacrificing (too much) quality?
Would changing the input file format result in improved performance? If so, which format is best?
Any suggestions from this excellent community would be much appreciated!
SoX may not be the best tool, but I doubt you will find anything much better without hand-coding.
I would venture to guess that you are doing pretty well to process that much audio in that time. You might do better, but you'll have to experiment. You are right that probably the main way to improve speed is to change the file format.
MP3 and OGG will probably give you similar performance, so first identify how MP3 compares to uncompressed audio, such as WAV or AIFF. If MP3/OGG is better, try different compression ratios and sample rates to see which goes faster. With WAV files, you can try lowering the sample rate (you can do this with MP3/OGG as well). If this is speech, you can probably go as low as 8 kHz, which should speed things up considerably. For music, I would say 32 kHz, but it depends on the requirements. Also, try mono instead of stereo, which should also speed things up.
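To see why uncompressed input is worth testing, note that with raw 16-bit mono PCM the whole job reduces to appending samples and adding them together, with no decoding step. The sketch below illustrates that idea only; it is not how SoX works internally, the function names and the background gain are assumptions, and reading or writing the actual files is left out.

```python
from array import array


def concat(segments):
    """Join 16-bit mono sample arrays end to end."""
    out = array("h")
    for seg in segments:
        out.extend(seg)
    return out


def mix(foreground, background, background_gain=0.5):
    """Sum two 16-bit mono sample arrays, clipping to the 16-bit range."""
    length = max(len(foreground), len(background))
    out = array("h")
    for i in range(length):
        fg = foreground[i] if i < len(foreground) else 0
        bg = background[i] if i < len(background) else 0
        value = int(fg + bg * background_gain)
        out.append(max(-32768, min(32767, value)))
    return out


# Tiny made-up sample data standing in for the segments and the background.
segments = [array("h", [1000, 2000]), array("h", [30000])]
background = array("h", [2000, 2000, 10000, 400])

result = mix(concat(segments), background)
print(list(result))  # [2000, 3000, 32767, 200]
```

Every sample is touched once, so halving the sample rate or going from stereo to mono halves the work, which is the effect the advice above is relying on.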

Quick, multi-OS, command line conversion of JPEG-2000 to JPEG

I am working on a web script that handles image processing using ImageMagick. It takes relevant parameters, executes an ImageMagick command at the command line or shell depending on OS, and passes the raw image data back to the script. The language of the web script is obviously not pertinent.
Simple use cases include:
convert -resize 750 H:/221136.png - which just resizes the input image to 750 width and outputs the raw data to the console. More complex use cases involve rotating, resizing, cropping/panning, and drawing.
The script works great and is quite fast for PNG, GIF, and JPEG inputs, even at fairly large (4000x5000) resolutions. Unfortunately my input data also includes JPEG-2000. A 10-15 megabyte JPEG-2000 file takes a truly insane amount of time for ImageMagick to process, on the order of 10-15 seconds. It is not suitable for live, on-the-fly conversion.
I know quick conversion of JPEG-2000 to JPEG for web output is possible, because a piece of Enterprise software I work with does it fairly on-the-fly. I'm not sure which library they use--the DLL/so they use is DL80JP2KLib.dll/.so. Looking it up, it seems that a company called DataLogic makes this, but they don't seem to have any obviously relevant programs on their site.
Ideally I'm looking for a solution (plug-in?) that would either enable ImageMagick to convert these high resolution JPEG-2000 images on-the-fly like it does with PNG, GIF, or JPEG... or a separate command utility that I can run in advance of ImageMagick to convert the JPEG-2000 to an intermediate format that ImageMagick can process quickly.
The servers that will run this script have 32 gigs of RAM and beefy processors. Assume that speed of conversion is more important than resource usage efficiency. Assume also that while I need some semblance of quality, image lossiness is not an urgent concern. Licensing requirements and/or price are not important, except that I need to be able to test it myself for speed on a few sample files before we buy. The ideal solution is also (relatively) OS independent.
I tried an application from Kakadu Software and it's fairly quick, on the order of 3-4 seconds, but that's still not fast enough. If it's not possible to get below, say, one second, I will look at batch converting files in advance.
I have uploaded a representative file (JPEG-2000, ~8MB) to MediaFire:
http://www.mediafire.com/?yxv0j6vdwx0k996
I have found ExactImage to be much faster in the past.
http://www.exactcode.de/site/open_source/exactimage/
Mark Tyler (original author of mtPaint) once split out the excellent graphics handling parts into a separate library (mtpixel, since abandoned as a separate project, but included in mtcelledit at its Google Code home).
