bsnes multi-threaded WASAPI module by Fatbag

What is audio latency?

Audio latency is how long it takes before a rendered sound is sent out to your speakers.

Why can't bsnes use 3ms latency when foobar2000 and other programs using WASAPI say that my system supports it?

I'm not denying that your system supports playing back audio at this latency. The issue isn't that you're using "stupid Windows". In fact, even a large number of useless processes and services running in the background has very little effect on your latency.

A larger latency is required because some things that bsnes emulates take longer than 3ms to compute. Be aware that if, for whatever reason, less than 3ms of audio gets rendered within its 3ms time frame, the deadline is missed, the audio device plays back the only thing it's got, and you'll hear a click, unless the audio had already been rendered further in advance.

On average, bsnes is more than quick enough for the job. If it can play a game at double speed, it can render audio at double speed. But this is all average case. Dozens of times each second, jobs must be performed which each might consume just 2 measly milliseconds, and these outliers are enough to set back the audio past these tiny deadlines.
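To make the average-case versus outlier point concrete, here is a rough sketch of a toy model, not anything taken from the driver itself. The function name `count_underruns`, the 1ms chunk size, and the render times are all made-up values for illustration. The model assumes the device buffer starts full, drains in real time while a chunk is being rendered, and that the emulator waits whenever the buffer is full.

```python
def count_underruns(render_ms, chunk_ms, latency_ms):
    """Count missed deadlines in a toy model of an audio buffer.

    render_ms  -- wall-clock time spent rendering each chunk (hypothetical data)
    chunk_ms   -- how much audio each rendered chunk contains
    latency_ms -- capacity of the device buffer
    """
    buffered = latency_ms  # assume we start with a full buffer
    underruns = 0
    for cost in render_ms:
        # The device keeps playing while we render this chunk.
        buffered -= cost
        if buffered < 0:
            # The device ran dry before the chunk was ready: audible click.
            underruns += 1
            buffered = 0
        # Submit the chunk; with audio sync we never exceed the buffer size.
        buffered = min(buffered + chunk_ms, latency_ms)
    return underruns


# Mostly twice as fast as real time, with one slow job in the middle.
costs = [0.5] * 20 + [4.0] + [0.5] * 20
clicks_at_3ms = count_underruns(costs, chunk_ms=1.0, latency_ms=3.0)
clicks_at_10ms = count_underruns(costs, chunk_ms=1.0, latency_ms=10.0)
```

In this made-up run the average render time is well under real time, yet the single slow job still causes a click at 3ms, while the 10ms buffer absorbs it.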

So, to clarify, it's not your operating system's bad scheduler. It is precisely the single-core performance of your processor that determines how low your latency can go.

If we can't have pro audio latency, what makes WASAPI better than XAudio2?

Lower audio latency means we have more accurate video synchronization in bsnes. Let me explain.

Compare bsnes with audio sync on, video sync off, and WASAPI initialized with (A) 5ms latency and double buffering, versus (B) 10ms latency with no double buffering. When the audio is submitted, the audio API reserves it all until it's due. However, video is serial. It needs to be timed so that each frame hits once per V-blank.

In Setup B, audio samples and video frames are rendered quickly until the audio buffer's length is satisfied, which is when everything stops for the remainder of the audio latency period. If it takes 4ms to complete this, the video frame due at the end of the audio period (10ms in the future) is displayed 4ms in, and then the emulator just stops what it's doing for the next 6ms. The visual skipping you see is the result of seeing the future frame too soon and for too long.

In Setup A, it still takes 4ms to render 10ms of audio and the frames that go with it. However, once this point is reached, it only takes 1ms before the emulator can begin rendering a small additional amount of audio and video. Rather than grabbing data quickly all at once, we've split the workload into smaller chunks, each delayed by this shorter amount by the audio driver.

Because WASAPI allows us to play back audio at lower latency, we can buffer more to bring it back up to XAudio2-level latency, giving us the benefit of a more serialized video rendition.
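As a back-of-the-envelope check on the two setups, the sketch below uses a deliberately simplified steady-state model. It assumes a constant emulation speed (10ms of audio in 4ms, so 2.5x real time) and that, once the buffers are full, the emulator blocks for one period at a time. The function name `steady_state_stall_ms` is hypothetical, and the result is not the same quantity as the 1ms first wait described above.

```python
def steady_state_stall_ms(period_ms, speed):
    """Time the emulator sits idle per period once the buffers are full.

    period_ms -- length of one audio period handed to the device
    speed     -- emulation speed relative to real time (assumed constant)
    """
    render_ms = period_ms / speed  # wall-clock time to render one period
    return period_ms - render_ms   # the rest of the period is spent waiting


SPEED = 10 / 4  # 10ms of audio rendered in 4ms, as in the example above

stall_b = steady_state_stall_ms(10, SPEED)  # Setup B: one 10ms period
stall_a = steady_state_stall_ms(5, SPEED)   # Setup A: 5ms periods, double buffered
```

Under these assumptions Setup B stalls for 6ms at a time and Setup A for 3ms, at the same total latency. Shorter stalls mean each video frame is shown closer to when it is actually due.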

Does this driver use a lot of CPU? How will it affect bsnes performance?

It's demanding on thread execution time, which WASAPI gives us automatically (according to Microsoft's documentation on AvSetMmThreadCharacteristics). Recall that this driver is multithreaded. It runs on a thread separate from bsnes. This has three benefits: it does not steal time from the bsnes thread, it's allowed to submit a buffer rendered earlier to WASAPI at any time (not just when sample() is called, which may be many milliseconds in the future), and it can send silence to WASAPI when there are no more rendered buffers available (so, when you pause emulation, you don't hear ugly sawtooth noise).
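The idea of a separate output thread that falls back to silence can be sketched roughly as follows. This is an illustration only, not the driver's code and not the WASAPI API: `next_buffer`, `output_loop`, and the `submit` callback are hypothetical names, and `time.sleep` stands in for waiting on the audio device.

```python
import queue
import threading
import time


def next_buffer(rendered, frames):
    """Return the oldest rendered buffer, or silence if none is ready."""
    try:
        return rendered.get_nowait()
    except queue.Empty:
        return [0] * frames  # nothing rendered: send silence, not stale data


def output_loop(rendered, submit, frames, periods, period_s):
    """Runs on its own thread; hands one buffer to the device per period."""
    for _ in range(periods):
        submit(next_buffer(rendered, frames))
        time.sleep(period_s)  # stand-in for waiting on the device


# The emulator thread would put buffers here whenever sample() fills one.
rendered = queue.Queue()
rendered.put([1, 2, 3, 4])

sent = []  # stand-in for the audio device
worker = threading.Thread(
    target=output_loop, args=(rendered, sent.append, 4, 3, 0.001)
)
worker.start()
worker.join()
```

Here only one buffer was rendered, so the first period carries real audio and the next two carry silence, which is what you'd want when emulation is paused.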

As for actually using raw CPU power, it uses very little of that! If you open the task manager, you'll find bsnes still uses about one core.

It's recommended that you use a dual-core CPU. I know, I know, "bsnes is supposed to be about raw per-core performance, not about more cores", but these days, the only CPUs you'll find that are single-core are embedded and low-end.

Is bit-perfect audio output possible with this driver?

If your latency is high enough, almost. Recall that the audio had to be resampled from the very ugly non-whole sampling rate used by the SNES, itself reinterpreted as having been at your specified input frequency. As far as outputting to your speakers goes, the audio will not pass through the system audio mixer, so it will not face audio effects (such as the evil auto-normalizer, or on some systems, dynamic range compressor) nor dithering artifacts.

This (unfortunately?) means that all other running programs mute while bsnes is running. If you're an over-obsessed accuracy whore, enjoy!
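For a rough picture of why resampling rules out strictly bit-perfect output, here is a minimal linear-interpolation resampler. It is only a sketch: the driver's actual resampling method isn't described here, `resample_linear` is a hypothetical name, and the rates in the example are toy values rather than the real SNES rate.

```python
def resample_linear(samples, in_rate, out_rate):
    """Resample by linear interpolation (a simple stand-in technique).

    samples  -- input samples, treated as having been recorded at in_rate
    in_rate  -- the input frequency you told the driver
    out_rate -- the output frequency of the audio device
    """
    n_out = int(len(samples) * out_rate / in_rate)
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = i * in_rate / out_rate  # position in the input stream
        idx = int(pos)
        frac = pos - idx
        a = samples[idx]
        b = samples[min(idx + 1, last)]  # clamp at the final sample
        out.append(a + (b - a) * frac)   # new values appear between inputs
    return out


doubled = resample_linear([0.0, 1.0, 2.0, 3.0], in_rate=1, out_rate=2)
```

Every other output sample is a value that never existed in the input, so what reaches the speakers can only ever be an approximation of what the emulated hardware produced.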

Can I turn on video sync too?

Yes. The second thread means audio buffers can continue to be sent to WASAPI while the video thread is waiting for its chance to draw to the screen. Depending on your system and the numbers you have picked, you may or may not see smoother video with both synchronizations enabled.

Revision 5, 2011-10-15: http://www.phppoll.org/WASAPI/bsnes_wasapi.7z

Compatibility check

How to configure the WASAPI driver for your system

Setting the output frequency

Test it out

Setting the input frequency

Setting the latency

Fine tuning the input frequency

Questions