Instructions for use of echo cancellation

AEC stands for Acoustic Echo Cancellation. It adaptively tracks changes in the echo path and suppresses, in real time, the speaker's echo signal that reaches the microphone, which improves recognition of the target voice. Evaluate whether to enable AEC according to the actual application scenario. When this function is enabled, speech recognition can still interrupt playback during long announcements or while audio media is playing (for example, a speaker playing MP3 songs).

This document describes how to use this feature.

1. Echo cancellation algorithm

The schematic block diagram of the AEC application principle is shown below. The reference signal is derived from signal B played by the speaker, and the human voice is the target voice signal A. During playback, signal B and signal A mix in the application environment and enter the chip together. Using the reference signal and the mixed signal, the AEC algorithm suppresses signal B and improves the signal-to-noise ratio of signal A before it enters the speech recognition engine, thereby improving the recognition effect.

Figure 1-1 Schematic Diagram of Echo Cancellation
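
To make the principle above concrete, the following is a minimal sketch of a normalized LMS (NLMS) adaptive filter in C, a common textbook approach to echo cancellation. It is not the algorithm implemented in the SDK, whose internals are not described here. The names nlms_t, nlms_init, nlms_step and the tap count NLMS_TAPS are illustrative assumptions.

```c
#include <stddef.h>

#define NLMS_TAPS 8  /* hypothetical length, far shorter than a real echo path */

typedef struct {
    float w[NLMS_TAPS];  /* adaptive estimate of the echo path */
    float x[NLMS_TAPS];  /* recent reference samples, x[0] is the newest */
} nlms_t;

/* Reset the weights and the reference history to zero. */
static void nlms_init(nlms_t *f)
{
    for (size_t i = 0; i < NLMS_TAPS; i++) {
        f->w[i] = 0.0f;
        f->x[i] = 0.0f;
    }
}

/* Process one sample. ref is the reference (signal B), mic is the mixed
 * microphone signal. Returns the residual: mic minus the estimated echo. */
static float nlms_step(nlms_t *f, float ref, float mic, float mu)
{
    float echo_est = 0.0f;
    float power = 1e-6f;  /* small constant avoids division by zero */

    /* Shift the reference history and insert the newest sample. */
    for (size_t i = NLMS_TAPS - 1; i > 0; i--) {
        f->x[i] = f->x[i - 1];
    }
    f->x[0] = ref;

    /* Estimate the echo and measure the reference power. */
    for (size_t i = 0; i < NLMS_TAPS; i++) {
        echo_est += f->w[i] * f->x[i];
        power += f->x[i] * f->x[i];
    }

    float err = mic - echo_est;

    /* Normalized update: the step shrinks as the reference power grows. */
    for (size_t i = 0; i < NLMS_TAPS; i++) {
        f->w[i] += mu * err * f->x[i] / power;
    }
    return err;
}
```

Calling nlms_step once per sample returns the residual, which approximates signal A once the weights have converged. The step size mu must lie between 0 and 2 for this kind of filter to remain stable; smaller values adapt more slowly but more smoothly.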

2. Echo cancellation algorithm software configuration method

Open the ci_ssp_config.c file in the SDK package. The echo cancellation algorithm exposes the following parameters for tuning:

aec_config_t aec_config =
{
  .mic_channel_num = 1, // Number of microphone signal channels
  .ref_channel_num = 1, // Number of reference signal channels
  .aec_control_mode = COMPUTE_REF_AMPL_MODE, // ENABLE_PLAYING_STATE_MODE: control AEC based on the playback state; COMPUTE_REF_AMPL_MODE: control AEC based on the reference signal amplitude
                                             // COMPUTE_REF_AMPL_MODE is the default; ENABLE_PLAYING_STATE_MODE is recommended if the playback state is accurate
  .aec_gain = 1.0f,     // Gain value
  .aec_enable_threshold = 6000.0f, // Reference signal decision threshold; 0 disables the check, so AEC processes continuously
  .aec_mic_div_ref_thr = 0.05f,
  .nlp_flag = 2,        // Non-linear processing mode, default 2. 0: not used; 1: Mode 1 (greater distortion); 2: Mode 2 (less distortion); 3: Mode 2 first, then Mode 1
  .aggr_mode = 1,       // Default 1
  .fft_size = 256,      // Number of frequency points for frequency-domain processing
  .alc_off_codec_adc_gain_mic = 20,
  .alc_off_codec_adc_gain_ref = 0
};

Attention

For two-channel stereo applications, modify the ref_channel_num and mic_channel_num parameters.
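
The role of aec_enable_threshold can be pictured with the following C sketch. It assumes that COMPUTE_REF_AMPL_MODE compares a per-frame amplitude measure of the reference signal against the threshold. The measure used here (mean absolute value) and the helper name ref_frame_is_active are hypothetical, since the SDK's internal computation is not documented in this section.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decides whether AEC should process this frame.
 * The amplitude measure is an assumption, not the SDK's actual method. */
static bool ref_frame_is_active(const int16_t *ref, size_t n, float threshold)
{
    if (threshold <= 0.0f) {
        return true;   /* threshold 0: check disabled, always process */
    }
    if (n == 0) {
        return false;  /* empty frame carries no reference signal */
    }

    float sum = 0.0f;
    for (size_t i = 0; i < n; i++) {
        int32_t s = ref[i];
        sum += (float)(s < 0 ? -s : s);
    }
    /* Mean absolute amplitude compared against the configured threshold. */
    return (sum / (float)n) >= threshold;
}
```

With the default threshold of 6000.0f, a frame of near-silent reference samples would be skipped under this model, while a threshold of 0 keeps AEC running on every frame.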

3. Echo cancellation algorithm application configuration method

The AEC reference signal (that is, the signal played by the speaker) can come from either of the following two sources:

1: The reference comes from playback through the voice module's own speaker; this application is referred to as internal AEC;
2: The reference comes from speaker playback by another, external player; this application is referred to as external AEC.

3.1 Precautions for the use of internal AEC

3.1.1 Hardware circuit description

In the single-microphone solution, the unused codec channel (micpr) of the CI130X chip serves as the AEC reference signal input channel. At low volume, a Class AB amplifier such as the SMG4890 is recommended; at high volume, a Class D amplifier is recommended. The AEC feedback signal is standard I2S, 16-bit at a 16 kHz sampling rate. With a full-scale voltage of 3.3 V, the voltage amplitude after the reference voltage divider circuit at maximum volume is in the range of 100-150 mV.

Figure 3-1 Internal AEC reference hardware circuit diagram
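
As a worked example of the level requirement above, the C sketch below computes the output of a simple two-resistor divider and checks it against the 100-150 mV window. It assumes the 3.3 V full-scale figure is the swing at the divider input. The function names and the resistor values in the usage note are illustrative and are not taken from the reference schematic in Figure 3-1.

```c
/* Output of a two-resistor divider: Vout = Vin * R2 / (R1 + R2).
 * r_top_ohm is the series resistor, r_bottom_ohm the resistor to ground. */
static double divider_out_mv(double vin_mv, double r_top_ohm, double r_bottom_ohm)
{
    return vin_mv * r_bottom_ohm / (r_top_ohm + r_bottom_ohm);
}

/* Returns 1 when the level falls inside the 100-150 mV window that the
 * text gives for the reference signal at maximum volume, otherwise 0. */
static int ref_level_in_window(double vout_mv)
{
    return vout_mv >= 100.0 && vout_mv <= 150.0;
}
```

For instance, divider_out_mv(3300.0, 22000.0, 1000.0) evaluates to roughly 143 mV, which is inside the window, while a 10 kΩ and 1 kΩ pair gives 300 mV, which is too high.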

3.1.2 Notes

If your company designs the hardware itself, pay attention to the following points; our company can assist in checking the schematic. For two-channel stereo, a two-channel codec is required to capture the stereo loopback signal as the AEC reference. If a Class D amplifier is used, a filter circuit must be added (refer to the Class D amplifier filter circuit in the external AEC section below).

3.2 Precautions for the use of external AEC

In external AEC applications with a single microphone, the external reference signal can be input as an analog signal through MIC R. With dual microphones, an external codec is required to capture the reference signal. The peripheral hardware circuit differs depending on whether the power amplifier output is digital or analog.

3.2.1 The power amplifier outputs an analog signal

(1) Hardware principle

The AEC reference signal is standard I2S, 16-bit at a 16 kHz sampling rate.

The amplifier output signal is analog (for example, from a Class AB power amplifier), and the reference signal path uses an ADC (Shunxin ES7243E, single-ended input) to sample the reference signal, as shown below. The full-scale voltage is 3.3 V, and the voltage amplitude after the reference divider circuit at maximum volume is 100-150 mV.

For the voltage divider circuit, refer to the following figure:

Figure 3-2 Circuit diagram of analog power amplifier

Synchronize the player’s playback status to the AEC module:

If you use ENABLE_PLAYING_STATE_MODE to synchronize the external playback state, set the playback state with the following commands:

ciss_set(CI_SS_PLAY_STATE, CI_SS_PLAY_STATE_PLAYING); // Set the current state to playing
ciss_set(CI_SS_PLAY_STATE, CI_SS_PLAY_STATE_IDLE);    // Set the current state to not playing
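
One way to keep the state in sync is to issue the two calls above from the player's start and stop hooks. The C sketch below illustrates this under stated assumptions: the hook names on_playback_start and on_playback_stop are hypothetical, and ciss_set and the CI_SS_* constants are replaced by host-side stand-ins with assumed types and values so that the example compiles without the SDK.

```c
/* Host-side stand-ins. On target, remove these and include the SDK header
 * that declares ciss_set and the CI_SS_* constants. The values and the
 * function signature below are assumptions, not the SDK definitions. */
enum { CI_SS_PLAY_STATE = 0 };
enum { CI_SS_PLAY_STATE_IDLE = 0, CI_SS_PLAY_STATE_PLAYING = 1 };

static int g_last_play_state = CI_SS_PLAY_STATE_IDLE;

static void ciss_set(int key, int value)  /* assumed signature */
{
    if (key == CI_SS_PLAY_STATE) {
        g_last_play_state = value;  /* record the state for inspection */
    }
}

/* Hypothetical hook: call where the external player starts output. */
static void on_playback_start(void)
{
    ciss_set(CI_SS_PLAY_STATE, CI_SS_PLAY_STATE_PLAYING);
}

/* Hypothetical hook: call where the external player stops output. */
static void on_playback_stop(void)
{
    ciss_set(CI_SS_PLAY_STATE, CI_SS_PLAY_STATE_IDLE);
}
```

Placing the calls in a single pair of hooks makes it harder for the reported state to drift from the real playback state, which matters because ENABLE_PLAYING_STATE_MODE is only recommended when that state is accurate.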

3.2.2 The power amplifier outputs a digital signal

(1) Hardware principle

The AEC feedback signal is standard I2S, 16-bit at a 16 kHz sampling rate.

The amplifier output signal is digital (for example, from a Class D power amplifier), and the reference signal path uses an ADC (such as the AD51050 or Shunxin ES7243E, differential input) to sample the reference signal, as shown in the figure below. The full-scale voltage is 3.3 V, and the voltage amplitude after the reference divider circuit at maximum volume is 100-150 mV.

For the voltage divider circuit, refer to the following figure:

Figure 3-3 Digital power amplifier AEC related circuit diagram

If the audio source played by the product (or voice module) is two-channel stereo, it must pass through a dual-channel codec before it is connected to the power amplifier.

Synchronize the player’s playback status to the AEC module:

If you use ENABLE_PLAYING_STATE_MODE to synchronize the external playback state, set the playback state with the following commands:

ciss_set(CI_SS_PLAY_STATE, CI_SS_PLAY_STATE_PLAYING); // Set the current state to playing
ciss_set(CI_SS_PLAY_STATE, CI_SS_PLAY_STATE_IDLE);    // Set the current state to not playing

4. Precautions and influencing factors of AEC effect

4.1 Reference signal considerations

Avoid saturation of the reference analog signal

Figure 4-1 Reference Analog Signal Saturation Spectrum

Figure 4-2 Saturation Waveform of Reference Analog Signal

Debugging suggestions: If the signal is already distorted before the voltage divider circuit, adjust the volume or replace the low-impedance speaker.
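
When reviewing recordings, saturation of this kind usually shows up as flat-topped peaks. A rough way to flag it programmatically is sketched below in C. The function name, the margin parameter, and whatever run length you treat as suspicious are assumptions for illustration; this is not part of the SDK.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the longest run of consecutive samples whose magnitude lies
 * within `margin` of the largest magnitude in the buffer. Long runs
 * suggest a flat-topped (saturated) waveform rather than isolated peaks.
 * Buffers whose peak does not exceed `margin` are treated as silence. */
static size_t longest_flat_top_run(const int16_t *pcm, size_t n, int32_t margin)
{
    int32_t peak = 0;
    for (size_t i = 0; i < n; i++) {
        int32_t a = pcm[i] < 0 ? -(int32_t)pcm[i] : (int32_t)pcm[i];
        if (a > peak) {
            peak = a;
        }
    }
    if (peak <= margin) {
        return 0;
    }

    size_t best = 0;
    size_t run = 0;
    for (size_t i = 0; i < n; i++) {
        int32_t a = pcm[i] < 0 ? -(int32_t)pcm[i] : (int32_t)pcm[i];
        run = (a >= peak - margin) ? run + 1 : 0;
        if (run > best) {
            best = run;
        }
    }
    return best;
}
```

Because the check is relative to the observed peak, it can also catch a signal that was flattened before the voltage divider and therefore sits well below digital full scale. Low-frequency tones naturally spend several samples near their peak, so the run length that counts as suspicious depends on the test material.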

The AEC reference signal is a small analog signal that is susceptible to interference, so pay special attention to keeping it clean during PCB layout. Route the reference analog signal as far away from high-frequency signals as possible. If the circuit design is flawed, artifacts such as pulses, clutter, and aliasing are usually introduced into the captured reference signal or microphone signal, which leads to data loss and degrades the AEC effect.

Debugging suggestions: When this occurs, first modify the hardware circuit design to ensure that the signal is clean, and then carry out acoustic testing. Interference on the reference signal has a very obvious impact on the AEC effect.

A reference signal taken from the back end (output) of the power amplifier is closest to the actual sound of the speaker. If the power amplifier introduces no delay, the signal at its front end can be used as the reference. If the power amplifier does introduce a delay (especially power amplifiers with built-in EQ), take the reference signal from the back end. When the signal is taken from the back end of the power amplifier, be careful not to exceed the range of the ADC.
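
To judge whether a power amplifier introduces a delay, one option is to record its input and output together and look for the lag that best aligns them. The C sketch below does this with a plain cross-correlation search. The function name and the idea of recording both points simultaneously are assumptions for illustration; the SDK does not provide this helper.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the lag in samples (0..max_lag) at which `b` best matches a
 * delayed copy of `a`, by maximizing the sum of a[i] * b[i + lag].
 * `a` would be the amplifier input and `b` the amplifier output. */
static size_t estimate_delay(const int16_t *a, const int16_t *b,
                             size_t n, size_t max_lag)
{
    size_t best_lag = 0;
    int64_t best = INT64_MIN;

    for (size_t lag = 0; lag <= max_lag && lag < n; lag++) {
        int64_t acc = 0;
        for (size_t i = 0; i + lag < n; i++) {
            acc += (int64_t)a[i] * b[i + lag];
        }
        if (acc > best) {
            best = acc;
            best_lag = lag;
        }
    }
    return best_lag;
}
```

At a 16 kHz sampling rate each sample of lag corresponds to 62.5 microseconds. A result that stays at 0 for varied program material suggests the front-end signal is usable as the reference; a consistent non-zero lag points toward taking the reference from the back end.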

4.2 MIC signal considerations

Avoid clipping of the captured MIC voice signal

Debugging suggestions: Play a full-scale sweep signal at the maximum speaker volume. The speaker must not crackle, resonate, or show similar defects, and neither the captured MIC signal nor the reference signal may be clipped. Excessive speaker volume leads to more severe distortion, so choose a volume range for practical use in which the speaker does not distort.

4.3 Speaker Considerations

Reduce harmonic distortion, which is caused by a system that is not fully linear. Across the entire audio path, the factors that affect harmonic distortion of the signal include the distortion curve of the speaker unit, the structural design of the speaker's rear cavity, the speaker's front mesh cover, and the structure of the microphone sound hole. All of these increase the distortion of the microphone signal. The greater the distortion of the microphone signal, the lower the similarity between the microphone signal and the loopback (reference) signal, and the worse the echo cancellation performance. If the frequency response of the speaker is too poor in the 200 Hz to 4 kHz range, or the structure resonates, the AEC effect will also suffer.

Speakers usually have some distortion, and it worsens as the volume approaches the speaker's playback limit. In use, try to keep the speaker working in the amplitude range where distortion is small. Different speakers have different inherent distortion.

Choose a speaker with low distortion. Figure 4-3-1 shows the sound played by 4 speakers. Some speakers, such as the last three in the figure, have relatively large distortion in certain frequency bands. This distortion seriously affects the AEC effect, so avoid such speakers where possible.

Figure 4-3-1 Sound played by 4 speakers

When using the speaker, avoid the microphone signal clipping caused by the loud sound of the speaker.

Figure 4-3-2 Avoid clipping the microphone signal

Debugging suggestions: Try not to use micro speakers, ultra-thin speakers, or other speakers with a high resonance frequency, because this type of speaker unit has large low-frequency distortion. If the rear cavity structure of the speaker allows, a passive radiator or guide tube design can strengthen the low frequencies while reducing nonlinear vibration of the speaker.

4.4 Environmental Considerations

Reduce mechanical noise. In products such as home appliances or robots, fan noise and motor noise can give the microphone signal a particularly high noise floor, which greatly affects the recognition effect.

Debugging suggestions: In this type of equipment, pay special attention to the distance between the microphone and the internal fan and motor. If the structural space allows, seal the microphone separately to prevent sound transmission inside the enclosure.

4.5 Position relationship between microphone and speaker

Debugging suggestions: In principle, where the structural space permits, place the microphone as far from the speaker as possible, and do not let the microphone extend beyond the plane that the speaker faces. The closer the microphone is to the speaker, the lower the signal-to-noise ratio and the greater the chance of microphone distortion. Considering structural size, a distance of about 10 cm between the microphone and the speaker is recommended. Under these conditions, the position relationship between the microphone and the speaker has little impact on the AEC processing effect.

Figure 4-4 Position of Microphone and Speaker

Note: Keep the speaker's sound outlet away from the microphone, and avoid pointing it at the microphone.

5. Methods for debugging effects

Figure 5-1 shows the time-domain view of a capture from the recording board. The original left channel data (upper half of the figure) is a mixture of the target voice and the sound played by the speaker. After AEC processing, the sound played by the speaker is suppressed and the signal-to-noise ratio of the target voice is improved, as shown in the right channel data (lower half of the figure).

Figure 5-1 Recording board acquisition time domain effect

Figure 5-2 below shows the frequency domain display for the same audio, and this result is normal.

Figure 5-2 Frequency domain display of audio
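
To put a number on what Figure 5-1 shows, the energy of the two channels can be compared over a segment where only the speaker is playing. The C sketch below assumes a 16-bit interleaved stereo recording with the raw microphone mixture on the left channel and the AEC output on the right, as in Figure 5-1. The type and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    double left_energy;   /* raw microphone mixture (before AEC) */
    double right_energy;  /* AEC output */
} channel_energy_t;

/* Sums the squared samples of each channel of an interleaved L/R buffer.
 * `frames` is the number of stereo sample pairs. */
static channel_energy_t stereo_energy(const int16_t *interleaved, size_t frames)
{
    channel_energy_t e = {0.0, 0.0};
    for (size_t i = 0; i < frames; i++) {
        double l = interleaved[2 * i];
        double r = interleaved[2 * i + 1];
        e.left_energy += l * l;
        e.right_energy += r * r;
    }
    return e;
}

/* Linear power ratio of input to output. Returns 0 when the output is
 * completely silent, because the ratio is undefined in that case. */
static double suppression_ratio(channel_energy_t e)
{
    if (e.right_energy <= 0.0) {
        return 0.0;
    }
    return e.left_energy / e.right_energy;
}
```

A ratio of 100 corresponds to 20 dB of echo suppression (10 times the base-10 logarithm of the ratio). The figure is only meaningful on segments without the target voice, since AEC is meant to leave that voice intact.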

When the IIS output is set with audio_pre_rslt_write_data((int16_t*)ref, (int16_t*)micl), the reference signal data and the left microphone channel are output, as shown in Figure 5-3. During playback the reference signal carries data; when there is no playback, the reference signal is very small and close to 0.

Figure 5-3 Reference signal and microphone signal waveforms

Figure 5-4 shows the frequency domain display of the same sound:

Figure 5-4 Reference signal and microphone signal frequency domain

Attention

Try to ensure that the reference signal and microphone signal are not distorted and not too small.

6. Exception debugging steps

Attention

When debugging exceptions, record and save the audio; this helps rapid analysis.

Step 1: Determine whether there is an exception in the SDK configuration.
Step 2: Use the recording board to capture the reference signal (REF), output signal (DST) and microphone signal (MIC) to determine the cause of the abnormality.
- At present, many abnormal situations appear in the reference signal. Known examples are:
- 1: There is no data in the reference channel. Check the hardware path, and check whether the ref is input on the left or the right channel (left channel by default);
- 2: The reference signal interference is relatively large. Check the hardware path;
- 3: The reference signal amplitude deviates greatly from the standard value (too large or too small). Check the voltage divider circuit and the ALC gain value, using the standard board as a reference;
- 4: The reference signal saturates during voltage division. Check the voltage divider circuit and the ALC gain value (see: AEC abnormal debugging case).
- Problems with an abnormal microphone signal are:
- 1: The ALC is not turned off during AEC processing. Check the playback state and the ALC switching state;
- 2: The microphone signal is clipped. Check the playback state, the ALC switching state, and the ALC gain value;
- 3: The microphone sensitivity differs. Replace the microphone to confirm whether the microphone is abnormal;
- 4: The microphone amplitude is too small and AEC is not effective. Adjust the threshold value in AEC processing and change the ALC gain.
Step 3: Swap in different modules to determine whether differences in the peripheral hardware cause the exception.
Peripheral devices include the microphone, speaker, voltage divider circuit, power amplifier, codec, and so on.
Step 4: Check whether environmental factors reduce the AEC effect. The AEC effect decreases in environments with strong reverberation.
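
The reference-signal checks in Step 2 can be partly automated when working through saved recordings. The following C sketch sorts a reference capture into the cases listed above using its peak amplitude. The enum, the function name, and the min_peak and max_peak limits are hypothetical; suitable limits would come from measurements on the standard board.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum {
    REF_NO_DATA,    /* item 1: nothing on the reference channel */
    REF_TOO_SMALL,  /* item 3: amplitude well below the expected level */
    REF_TOO_LARGE,  /* items 3 and 4: amplitude too high, possible saturation */
    REF_OK
} ref_status_t;

/* Classifies a reference capture by its peak magnitude. min_peak and
 * max_peak are caller-supplied limits in int16 sample units. */
static ref_status_t classify_reference(const int16_t *ref, size_t n,
                                       int32_t min_peak, int32_t max_peak)
{
    int32_t peak = 0;
    for (size_t i = 0; i < n; i++) {
        int32_t a = ref[i] < 0 ? -(int32_t)ref[i] : (int32_t)ref[i];
        if (a > peak) {
            peak = a;
        }
    }

    if (peak == 0) {
        return REF_NO_DATA;
    }
    if (peak < min_peak) {
        return REF_TOO_SMALL;
    }
    if (peak > max_peak) {
        return REF_TOO_LARGE;
    }
    return REF_OK;
}
```

Run it on a segment recorded during playback; an idle reference channel is expected to be near zero and would otherwise be reported as too small. A disconnected channel may still carry a little noise rather than exact zeros, in which case it is also reported as REF_TOO_SMALL.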

When an AEC application does not meet expectations, use the standard SDK and hardware as a reference, and analyze the specific cause by comparing against the standard reference module and accessories.