bsnes multi-threaded WASAPI module by Fatbag

Revision 5, 2011-10-15:

Compatibility check

The WASAPI driver is for Vista and up only. If you are running a 64-bit version of Windows Vista, the driver will only work in a 64-bit compile of bsnes. On Windows 7 and up, you are not tied to 64-bit bsnes.

For best results, update your audio hardware driver such that it's modern enough to have received the WHQL certification for Windows 7.

How to configure the WASAPI driver for your system

Find the "settings.cfg" configuration file. Click Start, type in %appdata%\bsnes, and hit enter. Open the settings.cfg file with any text editor; even Notepad will work.

Then, look for the following settings and set them as follows:
video.synchronize = false
audio.driver = "WASAPI"
audio.synchronize = true
audio.mute = false
audio.volume = 100
audio.latency = 60
audio.inputFrequency = 32040
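
If you would rather script this step than edit the file by hand, here is a minimal sketch in Python. It assumes the configuration file is plain "name = value" lines, as in the settings above. The function name apply_settings is made up for illustration and is not part of bsnes. Settings that are not already present in the file are not appended.

```python
def apply_settings(text, settings):
    """Return text with each 'name = value' line updated from settings.

    Lines whose name is not in settings are kept unchanged.
    """
    out = []
    for line in text.splitlines():
        name, sep, _ = line.partition("=")
        key = name.strip()
        if sep and key in settings:
            out.append(f"{key} = {settings[key]}")
        else:
            out.append(line)
    return "\n".join(out)


# The values recommended above, written exactly as they appear in the file.
wanted = {
    "video.synchronize": "false",
    "audio.driver": '"WASAPI"',
    "audio.synchronize": "true",
    "audio.mute": "false",
    "audio.volume": "100",
    "audio.latency": "60",
    "audio.inputFrequency": "32040",
}
```

Read the file into a string, pass it through apply_settings(text, wanted), and write the result back. Keep a backup of the original file first.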

Setting the output frequency

Determine the native sampling rate of your audio device. This is simply the highest frequency output format supported by your audio device. To find this value, right-click the sound icon and choose Playback devices. Right-click the speakers you're using and choose Properties. Click the Advanced tab, and click the drop-down box under Default Format. Look for the item with the highest "Hz". You do not need to actually select it; the WASAPI driver will switch to it automatically and temporarily.

Record this value in the bsnes.cfg file for the setting "audio.outputFrequency".

Test it out

Save bsnes.cfg and then open no more than one instance of bsnes. You should not see an error message; if bsnes tells you that the audio driver failed to initialize, make sure you're using 64-bit bsnes if you're on 64-bit Vista, and that you have no other instances of bsnes running. If you still experience the problem (which at this point should occur only to very few people), try a lower output frequency supported by your system.

Play a ROM to test the audio and video. There should be minimal audio cracks and video skips.

Setting the input frequency

Run Ver Greeneye's Frequency Test tool to determine the best suited input audio frequency for your system. Don't test too rigorously yet; just run the tool once or twice and record the recommended value, rounded to the nearest whole (important), in bsnes.cfg for audio.inputFrequency.
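
Since the rounding matters, here is a small sketch of that step in Python. The reading 32040.6 is a made-up example, not a value the tool is known to report, and input_frequency_line is a hypothetical helper name.

```python
import math


def input_frequency_line(measured_hz):
    """Round a measured frequency to the nearest whole Hz (halves round up)
    and format it as the line to put in the configuration file."""
    whole = math.floor(measured_hz + 0.5)
    return f"audio.inputFrequency = {whole}"


print(input_frequency_line(32040.6))  # audio.inputFrequency = 32041
```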

Setting the latency

Lower the audio.latency value in bsnes.cfg and restart bsnes. You should repeat this process until you find what you think is the lowest latency value that does not give you audio cracks. Don't be afraid to settle with something a little high. The difference in playback lag is barely (if at all) perceivable, and you'll eliminate all of your actual clicking.

Fine tuning the input frequency

Experiment around with the audio.inputFrequency value in bsnes.cfg and restart bsnes. It's a game of Warmer/Colder. You should repeat this process until you believe you've found what gives you the least video skipping. devin's bsnesdemo.sfc ROM can help you spot video skipping.


What is audio latency?

Audio latency is how long it takes before a rendered sound is sent out to your speakers.
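
To get a feel for the numbers, a latency in milliseconds can be converted into the number of sample frames that sit queued ahead of the speakers. This is plain arithmetic, sketched in Python. The 48000 Hz output rate is only an example and latency_frames is a hypothetical helper name.

```python
def latency_frames(latency_ms, output_hz):
    """Number of whole sample frames covered by latency_ms at output_hz."""
    return latency_ms * output_hz // 1000


print(latency_frames(60, 48000))  # 2880
print(latency_frames(3, 48000))   # 144
```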

Why can't bsnes use 3ms latency when foobar2000 and other programs using WASAPI say that my system supports it?

I'm not denying that your system supports playing back audio at this latency. The issue isn't that you're using "stupid Windows". In fact, even a large number of useless processes and services running in the background has very, very little effect on your latency.

The reason a larger latency is required is that some things that bsnes emulates take longer than 3ms to compute. Be aware that if, for whatever reason, less than 3ms of audio gets rendered in its 3ms time frame, the deadline is missed, the audio device plays back the only thing it's got, and you'll hear a click, unless enough audio had already been rendered further in advance.

On average, bsnes is more than quick enough for the job. If it can play a game at double speed, it can render audio at double speed. But this is all average case. Dozens of times each second, jobs must be performed which each might consume just 2 measly milliseconds, and these outliers are enough to set back the audio past these tiny deadlines.
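
The effect of one slow job can be shown with a toy model in Python. This is a rough sketch, not how bsnes or the driver measures anything. It assumes audio is rendered in fixed chunks, that the emulator blocks once "latency" milliseconds of audio are queued, and that the render times below (1.5ms normally, one 5ms outlier) are invented for illustration.

```python
def count_clicks(render_costs_ms, chunk_ms, latency_ms):
    """Count missed deadlines for a list of per-chunk render times.

    lead is how many milliseconds of audio are queued ahead of playback.
    """
    lead = latency_ms  # start with a full buffer
    clicks = 0
    for cost in render_costs_ms:
        lead -= cost              # playback keeps consuming while we render
        if lead < 0:              # the device ran dry: audible click
            clicks += 1
            lead = 0
        lead = min(lead + chunk_ms, latency_ms)  # queue the chunk, block if full
    return clicks


# Twice real-time speed on average, with a single 5ms outlier at the end.
costs = [1.5] * 9 + [5.0]
print(count_clicks(costs, chunk_ms=3, latency_ms=3))  # 1
print(count_clicks(costs, chunk_ms=3, latency_ms=6))  # 0
```

With only 3ms queued, the single outlier is enough to cause a click even though the average render time is half the deadline. With 6ms queued, the same outlier is absorbed.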

So, for clarification, it's not your operating system's bad scheduler. It's, precisely, the single-core performance of your processor that determines how low your latency can go.

If we can't have pro audio latency, what makes WASAPI better than XAudio2?

Lower audio latency means we have more accurate video synchronization in bsnes. Let me explain.

Compare bsnes with audio sync on, video sync off, and WASAPI initialized with (A) 5ms latency and double buffering, versus (B) 10ms latency with no double buffering. When the audio is submitted, the audio API reserves it all until it's due. However, video is serial. It needs to be timed so that each frame hits once per V-blank.

In Setup B, audio samples and video frames are rendered quickly until the audio buffer's length is satisfied, which is when everything stops for the remainder of the audio latency period. If it takes 4ms to complete this, the video frame due at the end of the audio period (10ms in the future) is displayed 4ms in, and then the emulator just stops what it's doing for the next 6ms. The visual skipping you see is the result of seeing the future frame too soon and for too long.

In Setup A, it still takes 4ms to render 10ms of audio and the frames that go with it. However, once this point is reached, the emulator only has to wait 1ms before it can begin rendering a small additional amount of audio and video. Rather than grabbing data quickly all at once, we've split the workload into tinier chunks, each time delayed this lesser amount by the audio driver.

Because WASAPI allows us to play back audio at lower latency, we can buffer more to bring it back up to XAudio2-level latency, giving us the benefit of a more serialized video rendition.
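
A rough way to put numbers on the two setups, sketched in Python. It assumes a steady state where the emulator renders one period of audio and then blocks until the device frees up a period, and it reuses the figure of 4ms of work per 10ms of audio from the example above. The function name idle_gap_ms is made up, and this simple model is only an approximation of the behaviour described.

```python
def idle_gap_ms(period_ms, render_ms_per_10ms=4):
    """Time the emulator sits blocked in each period once the buffers are full."""
    render_ms = period_ms * render_ms_per_10ms / 10
    return period_ms - render_ms


print(idle_gap_ms(5))   # Setup A, 5ms periods:  3.0
print(idle_gap_ms(10))  # Setup B, 10ms periods: 6.0
```

Both setups queue about the same total amount of audio, but with the shorter period the emulator never stalls for more than 3ms at a stretch instead of 6ms, so frames are spread out more evenly.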

Does this driver use a lot of CPU? How will it affect bsnes performance?

It's demanding on thread execution time, which WASAPI gives us automatically (according to Microsoft's documentation on AvSetMmThreadCharacteristics). Recall that this driver is multithreaded. It runs on a thread separate from bsnes. This has three benefits: it does not steal time from the bsnes thread, it's allowed to submit a buffer rendered earlier to WASAPI at any time (not just when sample() is called, which may be many milliseconds in the future), and it can send silence to WASAPI when there are no more rendered buffers available (so, when you pause emulation, you don't hear ugly sawtooth noise).
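
The silence fallback can be sketched as a producer/consumer pair in Python. This is not the driver's actual code: the real driver is part of bsnes and talks to WASAPI, while here a plain queue stands in for the rendered buffers and a list stands in for the audio device. All names (next_buffer, audio_thread, FRAMES_PER_BUFFER) are hypothetical.

```python
import queue
import threading

FRAMES_PER_BUFFER = 4  # tiny on purpose, to keep the example readable


def next_buffer(rendered, frames=FRAMES_PER_BUFFER):
    """Return the oldest rendered buffer, or silence if none is ready."""
    try:
        return rendered.get_nowait()
    except queue.Empty:
        return [0] * frames


def audio_thread(rendered, submit, periods):
    """Stand-in for the driver thread: submit one buffer per device period."""
    for _ in range(periods):
        submit(next_buffer(rendered))


rendered = queue.Queue()
rendered.put([1, 2, 3, 4])  # the emulator rendered one buffer, then paused

sent = []  # stands in for the audio device
worker = threading.Thread(target=audio_thread, args=(rendered, sent.append, 3))
worker.start()
worker.join()
print(sent)  # [[1, 2, 3, 4], [0, 0, 0, 0], [0, 0, 0, 0]]
```

The emulator side only ever puts buffers on the queue. The audio side never waits for it, which is why a paused emulator produces silence rather than a repeated stale buffer.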

As for actually using raw CPU power, it uses very little of that! If you open the task manager, you'll find bsnes still uses about one core.

It's recommended that you use a dual-core CPU. I know, I know, "bsnes is supposed to be about raw per-core performance, not about more cores", but these days, the only CPUs you'll find that are single-core are embedded and low-end.

Is bit-perfect audio output possible with this driver?

If your latency is high enough, almost. Recall that the audio had to be resampled from the very ugly non-whole sampling rate used by the SNES, itself reinterpreted as having been at your specified input frequency. As far as outputting to your speakers goes, the audio will not pass through the system audio mixer, so it will not face audio effects (such as the evil auto-normalizer, or on some systems, dynamic range compressor) nor dithering artifacts.

This (unfortunately?) means that all other running programs mute while bsnes is running. If you're an over-obsessed accuracy whore, enjoy!

Can I turn on video sync too?

Yes. The second thread means audio buffers can continue to be sent to WASAPI while the video thread is waiting for its chance to draw to the screen. Depending on your system and the numbers you have picked, you may or may not see smoother video when you have both synchronizations enabled.