Improve speech recognition in applications such as smartphones and tablets with high-performance voice capture SoCs

The market for mobile and portable devices such as smartphones and laptops has grown rapidly in recent years. Although these products keep integrating new functions to enhance the user experience, basic voice communication still leaves ample room for improvement, especially in noisy environments, where speech intelligibility must be improved while the natural fidelity of speech is preserved. For example, a user walking through a crowded commercial district may be surrounded by car horns, engine roar, construction noise, crowd noise, footsteps, and even wind noise, all of which make clear voice communication difficult. In addition, manufacturers are adding video calling capabilities to emerging tablets and other devices. When these devices are used for conference calls, the surroundings may also include a variety of noises, such as office noise, nearby conversations, computer noise, keystroke noise, and clinking glassware, so good call quality is again hard to achieve.

In these applications, several methods can be used to reduce or filter out environmental noise and improve voice communication quality, such as special noise reduction microphones, analog circuit noise reduction, or digital circuit noise reduction (see Table 1). Each method has its own characteristics. By comparison, digital circuit noise reduction is flexible, requires less complex acoustic design, and delivers superior noise reduction. Of course, beyond providing good noise reduction, portable device designers also face multiple design constraints and challenges, such as size, energy consumption, physical acoustic design, audio fidelity, and cost.

Table 1: Comparison of different noise reduction techniques.

Advanced dual microphone real-time adaptive noise reduction technology
ON Semiconductor has recently introduced the BelaSigna R261, a high-performance voice capture system-on-chip (SoC) based on digital circuit noise reduction technology. The device features advanced dual-microphone noise reduction that helps designs deliver excellent noise reduction (see Figure 1). This signal processing technology accepts signals from two microphones, distinguishes between different types of signals, extracts the useful speech information, and suppresses ambient noise, thereby improving speech recognition.

Figure 1: The BelaSigna R261 uses an advanced real-time adaptive noise reduction algorithm.

The BelaSigna R261 has a speech extraction algorithm built into its integrated ROM. The algorithm uses one or more sensors to extract the propagating wave signal without prior knowledge of the sound source or sensor locations. It applies global optimization criteria, works in the frequency, time, and spatial domains simultaneously, places no restriction on the number of sound sources or sensors, and works optimally regardless of the signal-to-noise ratio (SNR). This makes it ideal for applications such as cell phones and portable computers that need to extract useful speech signals from different noise environments.

This adaptive noise suppression algorithm provides 25 dB of noise suppression and separates the desired speech from ambient noise in real time. It is suitable for a variety of speech sources and speaker positions while preserving natural sound quality (after processing by other solutions, the sound is often not natural and full), and it works effectively with microphones of varying quality.
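To put the 25 dB figure in perspective, the short Python sketch below converts a decibel value into the equivalent power and amplitude ratios using the standard decibel definitions. It assumes the 25 dB figure describes a reduction in noise level; the function names are illustrative and not part of any ON Semiconductor tool.

```python
def db_to_power_ratio(db: float) -> float:
    """Convert a level difference in dB to a power ratio (10 dB per decade)."""
    return 10 ** (db / 10)


def db_to_amplitude_ratio(db: float) -> float:
    """Convert a level difference in dB to an amplitude ratio (20 dB per decade)."""
    return 10 ** (db / 20)


# Noise suppression figure quoted in the article.
SUPPRESSION_DB = 25.0

power_ratio = db_to_power_ratio(SUPPRESSION_DB)
amplitude_ratio = db_to_amplitude_ratio(SUPPRESSION_DB)

print(f"{SUPPRESSION_DB:.0f} dB suppression: noise power reduced by a factor of about {power_ratio:.0f}")
print(f"{SUPPRESSION_DB:.0f} dB suppression: noise amplitude reduced by a factor of about {amplitude_ratio:.1f}")
```

Under that assumption, 25 dB corresponds to roughly a 316-fold reduction in noise power, or about an 18-fold reduction in noise amplitude.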

Analysis of key features of BelaSigna R261
BelaSigna R261 is a high-performance voice capture SoC that integrates a digital signal processor (DSP), voltage regulator, phase-locked loop (PLL), level shifter, and ROM. Compared with other solutions, this high level of integration can reduce the bill of materials (BOM). As shown in Figure 2, the device supports direct input from two microphones, has the noise reduction algorithm built into the integrated ROM, uses a DSP-based application controller that provides high performance at ultra-low power consumption, provides dual-channel analog output, and supports digital microphone output. In addition, the built-in power management module supports supply voltages from 1.8 V to 3.3 V, the on-chip PLL provides multiple frequency options, and an I2C interface is also provided.

Figure 2: BelaSigna R261 high-performance voice capture SoC functional architecture diagram.

It is particularly worth mentioning that the dual-microphone real-time adaptive noise reduction algorithm in the BelaSigna R261 provides two basic algorithm modes: long-distance pickup mode (algorithm mode 0) and close-range pickup mode (algorithm mode 1). Algorithm mode 0 is optimized for long-distance pickup. It can pick up voice from up to 6 meters away while suppressing noise and supports 360-degree omnidirectional pickup, which suits laptops, hands-free phones and conference phones, and the hands-free call mode of mobile phones. In this mode, excellent speech intelligibility is provided even when the user is not facing the microphone or is far from it, giving the user more freedom of movement. Algorithm mode 1 is optimized for close-range pickup, where the user is very close to the microphone (less than 5 cm away). Picking up the voice at close range effectively suppresses various environmental noises, so this mode suits mobile phones, learning machines, walkie-talkies, and other equipment that operates in loud environments.
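The choice between the two basic modes can be illustrated with a small Python helper. This is only a sketch under stated assumptions: the names `AlgorithmMode` and `suggest_mode` are hypothetical and do not come from the device documentation, the article does not describe how a mode is actually selected on the chip, and treating every distance between 5 cm and 6 m as a long-distance case is a simplification.

```python
from enum import IntEnum


class AlgorithmMode(IntEnum):
    """Hypothetical labels for the two basic modes named in the article."""
    LONG_DISTANCE = 0  # algorithm mode 0: pickup up to 6 m, 360-degree
    CLOSE_RANGE = 1    # algorithm mode 1: talker closer than 5 cm


# Distance limits quoted in the article, in meters.
CLOSE_RANGE_LIMIT_M = 0.05
LONG_DISTANCE_LIMIT_M = 6.0


def suggest_mode(distance_m: float) -> AlgorithmMode:
    """Suggest a basic mode from the expected talker-to-microphone distance.

    Assumption: anything from 5 cm up to 6 m is treated as long-distance use.
    """
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    if distance_m < CLOSE_RANGE_LIMIT_M:
        return AlgorithmMode.CLOSE_RANGE
    if distance_m <= LONG_DISTANCE_LIMIT_M:
        return AlgorithmMode.LONG_DISTANCE
    raise ValueError("beyond the 6 m pickup range quoted for mode 0")


# Example: a handset held at the mouth versus a laptop on a desk.
print(suggest_mode(0.03).name)
print(suggest_mode(0.60).name)
```

A handset held about 3 cm from the mouth maps to the close-range mode, while a laptop about 60 cm away maps to the long-distance mode.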

In addition to these two basic algorithm modes, BelaSigna R261 also offers a custom algorithm mode to help manufacturers meet specific application needs. This mode supports special configurations and can be adjusted by loading new algorithm parameters through an external EEPROM or the I2C control interface. Algorithm performance can be optimized for the specific application, microphone type, microphone position, and other system parameters.

Table 2: BelaSigna R261 supports long-distance pickup, close-range pickup, and custom modes.

As mentioned above, the BelaSigna R261 offers a high level of integration with a built-in adaptive noise reduction algorithm and can be connected directly to a digital microphone interface or to the microphone input of the main chip (baseband processor). Therefore, in addition to supporting multiple pickup modes, another important advantage of this device is its ease of integration. It minimizes the time and engineering effort required for design-in, because the design team does not have to develop or acquire algorithms or design complex support and interface circuits.

The device also enables cost-conscious original equipment manufacturers (OEMs) to use two inexpensive (not necessarily matched) omnidirectional microphones in their designs, making microphone placement more flexible and eliminating the need to tune the microphones on the production line, which further saves time and cost. Housed in an extremely compact 5.3 mm² WLCSP package (available in 26-ball and 30-ball versions), this SoC takes up significantly less board space than other options, making it suitable for even the most space-constrained portable consumer electronics form factors. In addition, the device draws 15 mA at 3.3 V, which is extremely low power consumption.
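The current and voltage figures above translate directly into a power figure through P = V × I. The Python sketch below works through that arithmetic. It assumes the 15 mA figure is the total supply current at 3.3 V, and the function name is illustrative only.

```python
def supply_power_mw(voltage_v: float, current_ma: float) -> float:
    """Return DC supply power in milliwatts from volts and milliamps (P = V * I)."""
    return voltage_v * current_ma


# Figures quoted in the article for the BelaSigna R261.
SUPPLY_VOLTAGE_V = 3.3
SUPPLY_CURRENT_MA = 15.0

power_mw = supply_power_mw(SUPPLY_VOLTAGE_V, SUPPLY_CURRENT_MA)
print(f"Estimated supply power: {power_mw:.1f} mW")
```

Under that assumption, the device dissipates roughly 49.5 mW at 3.3 V.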

BelaSigna R261 Application Design Points
Because the ROM-based noise reduction algorithm of the BelaSigna R261 is very flexible, there are many possible choices for the microphone layout (physical acoustic design). The default algorithm, however, works optimally only when the microphones are laid out as follows: 1) both microphones face the user's mouth; 2) the midpoint between the two microphones lies 10 to 25 mm from each microphone. Of course, other microphone layouts can also be used in custom mode.

In terms of circuit design, the BelaSigna R261 is designed to support both digital and analog processing in a single system. Due to this mixed-signal circuit nature, careful design of printed circuit board (PCB) routing is critical to maintaining high audio fidelity. To avoid coupling noise into the audio signal path, keep the digital signal traces away from the analog signal traces. To avoid electrical feedback coupling, it is also necessary to isolate the input traces from the output traces.

In terms of grounding design, the ground plane should be divided into two parts: the analog ground plane (VSSA) and the digital ground plane (VSSD). The two ground planes should be connected together at a single point, the star connection point, which should be located at the ground of the capacitor at the output of the power regulator. Of course, these are just some of the issues that designers need to be aware of when designing with the BelaSigna R261. Detailed design points can be found in Reference 2.

Portable device audio system designers need high-performance voice capture solutions that are easy to integrate into their systems while meeting their size, power, and cost requirements. ON Semiconductor, the premier supplier of high-performance silicon solutions for energy-efficient electronics, offers designers an easy choice with the BelaSigna R261 high-performance voice capture SoC. This device has a high level of integration, a built-in advanced adaptive noise reduction algorithm, and support for multiple voice pickup modes. It enables applications such as smartphones, walkie-talkies, notebooks, and tablet computers to provide clear and comfortable voice communication with extremely high design flexibility. At the same time, it is small and low in power consumption and allows the use of low-cost microphones, so manufacturers of portable consumer electronics products can greatly improve speech recognition and customer satisfaction and speed up product launch.
