Paving the way for the next generation audio codec for TRUE Wireless Stereo (TWS) applications - PART 4 : Achieving the ultimate audio experience

By Hai YU, Clément MOULIN (Dolphin Design)

This fourth part of the white paper "Paving the way for the next generation audio codec for True Wireless Stereo (TWS) applications" discusses pure audio performance.

Figure 1 highlights the function under discussion among the main functions embedded in earbuds.

Fig.1: Functional blocks of a typical TWS earbud chip

1- Enhanced audio performance

Sound quality remains the number one purchase driver for both Bluetooth and smart speakers. According to “The State of Play Report 2020” (a global analysis of user behaviors and desires driving consumer audio) by Qualcomm, “Sound quality remains the single most important factor for audio lovers worldwide, as it has for several years — and is the only purchase criterion that is more influential to consumers that we surveyed than price”.

Many factors contribute to an enhanced customer experience and high-resolution audio quality in TWS chips (TWS earbuds/headsets): the key characteristics of the selected microphone and speaker, the BLE (Bluetooth Low Energy) connection link, the codec format, the noise cancellation/reduction software running on the dedicated signal processing accelerator or hardware (e.g., DSP), and so on. This paper focuses mainly on the key characteristics that should be evaluated up front when selecting an audio codec IP for TWS earbud applications. In this section, we discuss the pivotal audio codec IP considerations that are vital to ensure high audio quality and a good user experience.

To achieve enhanced, high-resolution audio quality, the choice of audio codec is critical because it performs the audio/voice signal conversion (analog-to-digital and digital-to-analog). Other factors also play a key role, but the matching between the ADC and the microphone, the matching between the DAC (headphone output) and the speaker, and the quality of the audio signal conversion are the fundamental elements of the entire audio signal processing chain.

Fig.2: True Wireless Stereo (TWS) earbuds use (from: The State of Play Report 2020 Qualcomm)

2- Which architecture for TWS applications?

How does one choose the most appropriate analog-to-digital converter (ADC) architecture for Audio-/Voice-First Devices (TWS earbuds or headsets) and their applications among so many different ADC architectures? There are successive approximation register (SAR) ADCs, Sigma-Delta ADCs (also called Delta-Sigma Modulator ADCs, DSM), pipeline ADCs, integrating ADCs, and so on. Pipeline ADCs are mostly used at the highest sample rates, such as in RF applications and software-defined radios. Many TWS chip designers and SoC integrators find it difficult to choose between the remaining topologies, SAR and Sigma-Delta, for their digital audio applications.

First and foremost, Sigma-delta converters have an innate advantage over SAR ADCs: they require no special trimming or calibration, even to attain 16 to 24 bits of resolution. In addition, they do not require anti-aliasing filters with steep roll-offs at the analog inputs, because their sampling rate is much higher than the effective bandwidth (oversampling architecture), which allows most of the filtering to be done by the embedded digital decimation filters.
One subtle issue is that most TWS SoC integrators select a SAR ADC because of its less complex architecture and lower power consumption compared to conventional Sigma-delta ADCs, since silicon area and power budgets are very tight in TWS applications. This choice can limit audio performance, however: although SAR ADCs can achieve the same level of headline audio performance (SNR, THD, THD+N, etc.), they have a much higher input-referred noise floor (far above the microphone’s noise floor), making them unsuitable for mid-field and far-field audio usage scenarios (we will elaborate on this later). SAR ADCs are also not compatible with digital MEMS microphones.
Moreover, advanced Sigma-delta ADCs can provide dual-mode operation to dynamically adapt the trade-off between audio performance and power consumption.
Last, but not least, several advantageous features can be embedded in Sigma-delta ADCs, such as an analog gain control stage in front of the converter to achieve a higher dynamic range, synchronized audio data frequencies in the Sigma-delta ADC and DAC to reduce crosstalk, a programmable sampling frequency, and so on.

Dolphin Design’s ultra-high performance ADC and DAC

Dolphin Design’s 16 to 24-bit Sigma-delta high resolution ADC and DAC are specifically designed for Voice First Devices/Applications. They offer enhanced audio performance on both the ADC and the DAC and, more importantly, a comprehensive codec architecture and design methodology that takes into account noise resilience features, microphone characteristics, the headset output amplifier, programmable analog gain control, etc.

First, the audio codec is very sensitive to noise (power supply noise, ambient noise, PGA noise, microphone noise, …), and this noise degrades audio performance. Therefore, the embedded ultra-low noise linear regulator is designed to give the best resilience to power supply noise. Coupled with the embedded ultra-low noise input PGA and the integrated low-noise microphone bias for driving MEMS microphones, it helps the codec reach its full performance. All these design precautions make the audio codec more robust to noise sources. Moreover, and more importantly from the SoC integrator’s perspective, the contentious topics that arise during audio integration are handled transparently, which dramatically eases the design cycle and thus shortens the time-to-market of the final audio product.

Dolphin Design’s ultra-low power and high-performance analog-to-digital converter (ADC) is based on a 16 to 24-bit Sigma-delta high resolution architecture with a recording path that targets Voice First Applications. In TSMC 22nm uLL process technology, the ADC SNR is 106 dB A-Weighted. The two-stage configurable fine-tuning (programmable gain step) gain control unit allows a wide Dynamic Range of 106 dB A-Weighted, which makes the ADC well suited to both near-field and far-field voice application scenarios. Moreover, it offers an ultra-low input-referred noise of 3.8 µVrms A-Weighted at a PGA gain of +20 dB, which is a key parameter that determines whether the ADC input noise floor matches the microphone’s requirements. Lastly, its ultra-fast wake-up capability (< 1 ms) makes it an ideal companion for Voice Activity Detection (VAD), minimizing power consumption in keyword spotting or voice-triggered applications.

Dolphin Design’s high performance digital-to-analog converter (DAC) for the playback path is also based on a 16 to 24-bit Sigma-delta high resolution architecture, with very high SNR and an ultra-low noise floor on the headphone output, which delivers best-in-class audio quality for music playback. The DAC can be programmed to different operating modes to meet different application requirements, especially for power consumption reduction and optimization. In TSMC 22nm uLL process technology, the SNR is 120 dB A-Weighted with a THD of -90 dB. The two-stage configurable fine-tuning (programmable gain step) gain control and the highly programmable headphone amplifier work efficiently together with the DAC, and the single stereo differential capless headset output minimizes the BoM cost. This combination of enhanced performance, ultra-low noise floor on the HP output and lower BoM cost makes it a very good fit for TWS earbud chips.

Microphones are vital for providing the high-quality input that all the applications mentioned here need to deliver an outstanding user experience and excellent audio quality. MEMS microphones with best-in-class audio quality specifications can deliver the required performance. Every microphone can record a range of sound pressure levels (SPLs); this is known as the dynamic range of a microphone. The upper limit of the dynamic range is defined as the acoustic overload point (AOP), while the lower limit is defined by the microphone’s self-noise. This lower threshold is known as the “noise floor” of a microphone, and it determines the signal-to-noise ratio (SNR). A microphone can only pick up signals whose SPL is above its noise floor.
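The relationship between AOP, noise floor, SNR and dynamic range can be illustrated with a few lines of Python. This is a minimal sketch, not taken from the original paper: it assumes the common datasheet convention that microphone SNR is specified against a 94 dB SPL (1 Pa) reference tone, and the example microphone values (65 dB SNR, 120 dB SPL AOP) are hypothetical.

```python
# Assumption: microphone SNR is quoted relative to a 94 dB SPL (1 Pa) reference,
# which is the usual datasheet convention (not stated in the text above).
REFERENCE_SPL_DB = 94.0


def mic_noise_floor_db_spl(snr_db: float) -> float:
    """Equivalent acoustic noise floor of the microphone, in dB SPL."""
    return REFERENCE_SPL_DB - snr_db


def mic_dynamic_range_db(aop_db_spl: float, snr_db: float) -> float:
    """Dynamic range: distance between the AOP and the noise floor, in dB."""
    return aop_db_spl - mic_noise_floor_db_spl(snr_db)


# Hypothetical MEMS microphone figures, for illustration only.
example_snr_db = 65.0
example_aop_db_spl = 120.0

example_noise_floor = mic_noise_floor_db_spl(example_snr_db)
example_dynamic_range = mic_dynamic_range_db(example_aop_db_spl, example_snr_db)

print(f"Noise floor:   {example_noise_floor:.1f} dB SPL")
print(f"Dynamic range: {example_dynamic_range:.1f} dB")
```

With these hypothetical figures, any sound quieter than the computed noise floor is masked by the microphone's self-noise, and any sound louder than the AOP is distorted.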

SNR and input range are important parameters for assessing individual microphone performance. A pair of TWS earbuds incorporates up to six microphones (three per earbud). Depending on the customer, either a high signal-to-noise ratio (SNR) or a low total harmonic distortion (THD) at high sound pressure levels (SPL) may be desired. Since the audio codec (ADC recording path) interfaces directly with the microphone, it is best to ensure that the noise performance of the codec’s ADC matches the selected microphone, to avoid degrading the audio recording quality.

The main contributors that can degrade the audio recording quality are the ambient noise, the noise floor of the microphone, the noise from the biasing circuitry of the microphone, and the noise from the ADC and its associated Programmable Gain Amplifier (PGA), as shown in Figure 3.

Fig.3: The main noise contributors

The SNR is often used as the decisive criterion to match an audio ADC with a microphone; however, it is not as simple as SNR_ADC > SNR_microphone. Two audio ADCs may have the same SNR but different noise floors, because the SNR of an ADC depends on both its maximum input voltage and its noise floor. To ensure a match between the ADC and the microphone, both the SNR and (most importantly) the noise floor must be taken into consideration.

Therefore, the best way to determine whether the ADC matches the microphone requirements is to compute the input-referred noise of the ADC. The input noise floor of the ADC is calculated from its maximum input voltage (in Vrms) and its SNR (in dB):

$$V_{noise,in} = \frac{V_{in,max}}{10^{SNR/20}}$$

As an example, if we consider an ADC with 1.27 Vpp max input voltage swing (equivalent to 450 mVrms) and 85 dB of SNR, its input-referred noise is equal to:

$$V_{noise,in} = \frac{450\ \mathrm{mVrms}}{10^{85/20}} \approx 25.3\ \mu\mathrm{Vrms}$$

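It is important to note that an ADC’s SNR often varies with the gain of the PGA, as shown in Figure 4. In our example, the 85 dB SNR was obtained with a 0 dB input gain. This ADC offers a 75 dB SNR when the gain of the PGA is set to 20 dB. Since a 20 dB gain divides the maximum input voltage by ten (45 mVrms), the new input-referred noise is equal to:

$$V_{noise,in} = \frac{45\ \mathrm{mVrms}}{10^{75/20}} \approx 8.0\ \mu\mathrm{Vrms}$$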
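The input-referred noise calculation described above can be sketched in Python. This is an illustrative reconstruction rather than the authors' own code: it assumes the usual definition SNR = 20·log10(V_max / V_noise), and it assumes that a PGA gain of G dB reduces the maximum signal accepted at the input by the same G dB. The function names are hypothetical.

```python
import math


def vpp_to_vrms(vpp: float) -> float:
    """Convert a sine-wave peak-to-peak voltage to RMS."""
    return vpp / (2.0 * math.sqrt(2.0))


def input_referred_noise_vrms(max_input_vrms_at_0db: float,
                              snr_db: float,
                              pga_gain_db: float = 0.0) -> float:
    """Input-referred noise for a given SNR and PGA gain.

    Assumption: the full-scale level seen at the input shrinks by the PGA
    gain, so the SNR measured at that gain is relative to the reduced level.
    """
    full_scale_at_input = max_input_vrms_at_0db / 10 ** (pga_gain_db / 20.0)
    return full_scale_at_input / 10 ** (snr_db / 20.0)


# Figures of the example ADC discussed in the text.
max_input_vrms = 0.450  # 1.27 Vpp expressed in Vrms
noise_at_0db = input_referred_noise_vrms(max_input_vrms, snr_db=85.0)
noise_at_20db = input_referred_noise_vrms(max_input_vrms, snr_db=75.0,
                                          pga_gain_db=20.0)

print(f"1.27 Vpp = {vpp_to_vrms(1.27) * 1e3:.0f} mVrms")
print(f"Input-referred noise at  0 dB gain: {noise_at_0db * 1e6:.1f} uVrms")
print(f"Input-referred noise at 20 dB gain: {noise_at_20db * 1e6:.1f} uVrms")
```

Although the SNR drops by 10 dB when the gain is raised to 20 dB, the noise seen at the input falls from roughly 25 µVrms to roughly 8 µVrms, which is why the high-gain figure matters more for low-level microphone signals.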

Figure 4 illustrates the evolution of the noise floor when the PGA gain changes from 0 dB to 20 dB. In most cases, the ADC will have to convert signals with a magnitude below 100 mVrms. Consequently, as the microphone generates low-magnitude signals, it is more relevant to consider the input-referred noise floor at a high PGA gain than the SNR usually specified at 0 dB.

Fig.4: The evolution of the noise floor with different PGA gains

The Dynamic Range is defined as the ratio between the maximum input signal when the gain is set to 0 dB and the input noise floor at the highest gain setting. The Dynamic Range is more representative of the microphone application requirement, since it gives the maximum voltage span over the complete gain range of the ADC.
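As a hedged illustration of this definition, the sketch below computes the Dynamic Range of the example ADC used earlier (450 mVrms maximum input at 0 dB gain, 75 dB SNR at 20 dB gain). It assumes that 20 dB is the highest gain setting of that ADC, which the text does not state explicitly, and that the input full scale shrinks by the PGA gain; the function name is hypothetical.

```python
import math


def dynamic_range_db(max_input_vrms_at_0db: float,
                     noise_vrms_at_max_gain: float) -> float:
    """Ratio, in dB, between the largest input signal (gain = 0 dB)
    and the input-referred noise floor at the highest gain setting."""
    return 20.0 * math.log10(max_input_vrms_at_0db / noise_vrms_at_max_gain)


# Example ADC from the text; 20 dB is assumed to be its highest PGA gain.
max_input_vrms = 0.450
max_gain_db = 20.0
snr_at_max_gain_db = 75.0

# Input-referred noise at the highest gain (full scale reduced by the gain).
noise_at_max_gain = (max_input_vrms / 10 ** (max_gain_db / 20.0)) \
    / 10 ** (snr_at_max_gain_db / 20.0)

adc_dynamic_range = dynamic_range_db(max_input_vrms, noise_at_max_gain)
print(f"Dynamic Range: {adc_dynamic_range:.1f} dB")
```

Under these assumptions the Dynamic Range equals the SNR at the highest gain plus that gain (75 dB + 20 dB), which is larger than the 85 dB SNR specified at 0 dB.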

Fig.5: Dynamic Range of the reference ADC

Figure 6 shows that the noise floor of the ADC is roughly equal to that of the microphone, and that the dynamic range of the microphone is fully covered by the reference ADC. We can therefore consider the microphone and the ADC to be properly paired.
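A pairing check of this kind can be sketched as follows. This is only an illustration under stated assumptions: the microphone figures (sensitivity of -38 dBV at 94 dB SPL, 65 dB SNR, 120 dB SPL AOP) are hypothetical, the ADC figures are those of the example ADC discussed earlier (about 8 µVrms input-referred noise at high gain, 450 mVrms maximum input at 0 dB gain), and uncorrelated noise sources are assumed to add in power.

```python
import math

REFERENCE_SPL_DB = 94.0  # assumed reference level for sensitivity and SNR


def dbv_to_vrms(level_dbv: float) -> float:
    """Convert a level in dBV to volts RMS."""
    return 10 ** (level_dbv / 20.0)


def mic_output_noise_vrms(sensitivity_dbv: float, snr_db: float) -> float:
    """Electrical noise at the microphone output."""
    return dbv_to_vrms(sensitivity_dbv - snr_db)


def mic_max_output_vrms(sensitivity_dbv: float, aop_db_spl: float) -> float:
    """Electrical output level when the microphone reaches its AOP."""
    return dbv_to_vrms(sensitivity_dbv + (aop_db_spl - REFERENCE_SPL_DB))


def pairing_report(mic_noise: float, mic_max: float,
                   adc_noise: float, adc_max: float) -> dict:
    """Noise is compared at the high-gain setting, headroom at 0 dB gain,
    following the Dynamic Range definition given in the text."""
    total_noise = math.hypot(mic_noise, adc_noise)  # power sum of both sources
    return {
        "noise_penalty_db": 20.0 * math.log10(total_noise / mic_noise),
        "headroom_ok": mic_max <= adc_max,
    }


# Hypothetical analog MEMS microphone.
mic_noise = mic_output_noise_vrms(sensitivity_dbv=-38.0, snr_db=65.0)
mic_max = mic_max_output_vrms(sensitivity_dbv=-38.0, aop_db_spl=120.0)

# Example ADC from the text.
report = pairing_report(mic_noise, mic_max, adc_noise=8.0e-6, adc_max=0.450)

print(f"Microphone noise:  {mic_noise * 1e6:.1f} uVrms")
print(f"Microphone max:    {mic_max * 1e3:.0f} mVrms")
print(f"SNR penalty added by the ADC: {report['noise_penalty_db']:.1f} dB")
print(f"AOP within ADC input range:   {report['headroom_ok']}")
```

When the two noise floors are roughly equal, as in this example, the ADC costs a few dB of system SNR; an ADC with a noise floor well below that of the microphone would make the penalty negligible.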

Fig.6: The microphone and the ADC pairing

Which microphone type is most suitable when designing ANC earbuds for TWS applications? In general, it is highly recommended to select a microphone with high sensitivity, low power, high SNR and low THD (or THD+N). MEMS microphones (analog or digital) offer several benefits over Electret Condenser Microphones (ECM): higher SNR, smaller package size, lower power and higher sensitivity.

For TWS earbuds and TWS smart speakers, MEMS microphones (especially digital ones) are more robust against Wi-Fi or Bluetooth RF signals, which improves voice pick-up quality and stability. In addition, since TWS earbuds are space-constrained, MEMS microphones are attractive for their smaller package sizes, and because the analog and digital circuitry integrated in the MEMS microphone reduces both printed circuit board area and component cost. Consequently, MEMS microphones are highly recommended for TWS earbud applications. For TWS earbuds with the ANC feature, from the cost perspective, the feedforward (FF) and feedback (FB) microphones used for hybrid ANC should be analog MEMS microphones (the two analog microphones must have the same characteristics), and a digital MEMS microphone should be used for phone calls.

This section explained the important role that the audio codec plays in achieving enhanced audio performance, how to match the ADC with a microphone, how to specify a microphone’s characteristics and apply them to a system’s gain staging, and why sensitivity, although related to SNR, is not an indication of the microphone’s quality in the way that SNR is. Whether designing with an analog or a digital MEMS microphone, this section should help a designer choose the best microphone for an application and get the fullest performance from that device.

ABOUT THE AUTHORS

Hai YU received his PhD in Nano and Micro Electrical Engineering in 2011 from TIMA laboratory, Grenoble Alpes University for his work on low-cost highly efficient fault tolerant processor design for mitigating the reliability issues in nanometric technologies. Hai joined Dolphin Design in 2012 and is currently working as Lead Application Engineer, focusing on Audio & Processing IPs platforms.

Clément MOULIN graduated from ENSEEIHT Toulouse in electronics and signal processing in 2006. After 8 years of leading hardware development in the NFC domain, Clément joined Dolphin Design in 2020 and is currently working as an Application Engineer, focusing on Audio & Processing IPs platforms.
