Recognition Performance and Testing

Why is recognition more difficult at faster speech rates?

As speech rate increases, pronunciation tends to change. Very fast speech distorts pronunciation and alters the acoustic characteristics of the voice. As in human conversation, a listener has more difficulty understanding someone who speaks very quickly. Recognition accuracy therefore drops, especially in noisy environments.

We currently offer technical solutions that maintain high recognition rates even at moderately fast speech rates. For details, please contact our technical support.

What aspects should be considered during product testing?

During product testing, keep people from talking near the test area. The test environment should contain only the noise required by the test, with no other interfering sounds. The test room should have minimal reverberation to reduce echo. During whole system testing, prevent machine operating noise from interfering with the test.

For best results, follow our standard testing procedures, pay attention to the testers’ speaking speed, and make sure the microphone meets our recommended specifications.

What aspects should be checked if recognition rate issues are found after whole system testing?

If this occurs, first check the mechanical structure of the whole system: whether the microphone is molded and installed according to our recommendations, whether it is properly sealed and fixed in place, and whether it is kept away from noise sources. If no issues are found, remove the voice module and microphone from the product and test them as a bare board in the same environment to see whether recognition improves. If the bare board performs well, the problem lies in the structure, so focus on structural checks. If the bare board also performs poorly, analyze and improve the recognition rate based on bare board performance.

What aspects should be checked if recognition rate issues are found after voice module testing?

If you are using a module we provide, check whether the command words need optimization and adjust them in software. If you are using a custom circuit board, program one of our standard modules with the same firmware and use the same microphone for a comparative test in the same environment. If our standard module performs normally, check your board design against our hardware requirements. If you cannot locate the issue, contact our technical support for assistance.

Additionally, it’s recommended to record video of the test site and simultaneously record audio at the module’s microphone position using a phone. Send both the video and audio to our technical support for faster analysis.

Is there an easy way to collect test audio recordings?

You can use a phone to record audio. Place the phone’s microphone in the same direction as the voice module’s microphone and position it next to the module (avoiding vibration or air outlet interference from other devices). Start recording on the phone and test according to the actual product testing method.

What are the environmental requirements for testing?

For optimal results, use the equipment models we recommend, such as high-fidelity speakers. Use microphones with a sensitivity of -32±3dB. The test room should be between 4m × 4m and 6m × 6m, simulating a home environment.

The test room should be soundproofed or relatively isolated, so that sounds outside the room and sounds inside the test environment do not interfere with each other. There should be no obvious noise outside the test room (such as traffic or market noise). Reverberation value range: 0.3-0.6. Before testing, verify the background noise level: 35-45dB in quiet environments, and 58-60dB for news noise. In quiet environments, test voice or broadcast sound should be around 60dB. In noisy environments, it should be 70-75dB, with SNR > 15dB.
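The level requirements in this answer can be checked with simple arithmetic before a test run. The following is a minimal sketch, assuming both levels are measured in dB at the microphone position with the same meter, so that the SNR is the difference between the two readings. The function names are illustrative and not part of any provided tool.

```python
def snr_db(speech_db: float, noise_db: float) -> float:
    """Return SNR in dB, taken as speech level minus noise level (both in dB)."""
    return speech_db - noise_db


def snr_ok(speech_db: float, noise_db: float, min_snr_db: float = 15.0) -> bool:
    """True when the SNR is strictly greater than the required minimum."""
    return snr_db(speech_db, noise_db) > min_snr_db


if __name__ == "__main__":
    # Quiet room: 60 dB speech over 40 dB background noise.
    print(snr_db(60, 40), snr_ok(60, 40))
    # Noisy room: 75 dB speech over 58 dB noise.
    print(snr_db(75, 58), snr_ok(75, 58))
    # Noisy room: 70 dB speech over 60 dB noise.
    print(snr_db(70, 60), snr_ok(70, 60))
```

Not every pairing inside the stated ranges clears the threshold: 70dB speech over 60dB noise gives an SNR of only 10dB, so it is worth checking the measured values rather than relying on the ranges alone.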

For accurate test results, it’s recommended to use at least 2 test modules per group and take the average of the results.
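Averaging across modules can be sketched as follows. This is only an illustration, assuming each module's result is recorded as a pair of (correctly recognized commands, total commands played); the function names and the sample numbers are hypothetical.

```python
from statistics import mean


def recognition_rate(recognized: int, total: int) -> float:
    """Fraction of test utterances that the module recognized correctly."""
    if total <= 0:
        raise ValueError("total must be positive")
    return recognized / total


def group_average(results):
    """Average the per-module recognition rates of one test group.

    results: iterable of (recognized, total) pairs, one pair per module.
    """
    return mean(recognition_rate(r, t) for r, t in results)


if __name__ == "__main__":
    # Two modules in one group, each tested with 100 utterances.
    print(f"{group_average([(95, 100), (97, 100)]):.2%}")
```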

What precautions should be taken when using our automated testing?

We provide an automated speech recognition testing tool; usage instructions are available in our documentation center. The tool only measures the command word recognition rate and false recognition rate. To ensure accurate results, disable all debugging serial port prints except recognition results in the automated testing firmware. If the firmware includes voice prompts, the voice prompt audio should not exceed 2 seconds; otherwise it will affect the overall recognition results. If the firmware requires wake-up before command recognition, the automated speech recognition testing tool is not suitable and manual testing is recommended.
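The 2-second limit on voice prompts can be verified before building the test firmware. The sketch below assumes the prompts are available as PCM WAV files on the PC; the function names are illustrative, and only the Python standard library `wave` module is used.

```python
import wave

MAX_PROMPT_SECONDS = 2.0  # limit stated in the answer above


def duration_seconds(frame_count: int, sample_rate: int) -> float:
    """Playback duration of an audio clip from its frame count and sample rate."""
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")
    return frame_count / sample_rate


def prompt_within_limit(frame_count: int, sample_rate: int,
                        limit: float = MAX_PROMPT_SECONDS) -> bool:
    """True when the clip is no longer than the allowed prompt duration."""
    return duration_seconds(frame_count, sample_rate) <= limit


def wav_prompt_within_limit(path: str) -> bool:
    """Check a WAV file on disk against the prompt duration limit."""
    with wave.open(path, "rb") as wav:
        return prompt_within_limit(wav.getnframes(), wav.getframerate())
```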

For automated testing, the firmware must use a single network, and the test audio must use standardized audio files. To improve testing efficiency, we recommend removing the silence at the beginning and end of each automated test audio file. During automated testing, the HUB must have a separate power supply, audio must be played through an artificial mouth or high-fidelity speakers, and the SNR must be > 15dB in both quiet and noisy environments.
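Removing leading and trailing silence can be scripted. The following is a rough sketch rather than a recommended tool: it assumes 16-bit mono PCM WAV files and treats any sample whose absolute value is at or below a threshold as silence. The threshold of 500 is an arbitrary illustrative value that would need tuning for real recordings.

```python
import array
import sys
import wave


def trim_bounds(samples, threshold=500):
    """Return (start, end) so that samples[start:end] drops edge silence."""
    start, end = 0, len(samples)
    while start < end and abs(samples[start]) <= threshold:
        start += 1
    while end > start and abs(samples[end - 1]) <= threshold:
        end -= 1
    return start, end


def trim_wav(src_path, dst_path, threshold=500):
    """Copy a 16-bit mono WAV file without its leading and trailing silence."""
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        if params.sampwidth != 2 or params.nchannels != 1:
            raise ValueError("this sketch handles 16-bit mono PCM only")
        samples = array.array("h", src.readframes(params.nframes))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    start, end = trim_bounds(samples, threshold)
    trimmed = samples[start:end]
    if sys.byteorder == "big":
        trimmed.byteswap()  # back to little-endian before writing
    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)  # frame count in the header is fixed up on write
        dst.writeframes(trimmed.tobytes())
```

A fixed amplitude threshold is the simplest possible criterion; recordings with audible background noise may need an energy-based criterion instead.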

What precautions should be taken during manual testing?

For manual testing, we recommend testers aged 18-60 (except for children’s products) who read the command words in standard Mandarin. Command words should not be read too quickly; maintain a rate of 150-180 Chinese characters per minute. In quiet environments, keep the voice or broadcast sound at around 60dB. In noisy environments, keep it at 70-75dB, with SNR > 15dB. The test distance should be 3m-5m. When playing noise, do not point the noise speaker directly at the microphone; point it in the same direction as the microphone or away from it, so that the noise does not directly interfere with the voice signal reaching the microphone.
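A tester's reading speed can be checked against the 150-180 characters per minute range by timing a known phrase. The sketch below is illustrative only; the function names are hypothetical and the timing is assumed to come from a stopwatch or a recording.

```python
def chars_per_minute(char_count: int, seconds: float) -> float:
    """Reading speed in characters per minute for a timed utterance."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return char_count * 60.0 / seconds


def rate_in_range(char_count: int, seconds: float,
                  low: float = 150.0, high: float = 180.0) -> bool:
    """True when the reading speed falls inside the recommended range."""
    return low <= chars_per_minute(char_count, seconds) <= high


if __name__ == "__main__":
    # Eight characters read in three seconds.
    print(chars_per_minute(8, 3), rate_in_range(8, 3))
```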

Is the programming time the same on automated machines and on a PC?

The programming time is basically the same as long as the baud rate settings are identical, since programming involves a full partition upgrade in either case.
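To see why the baud rate dominates, a lower bound on the transfer time can be estimated from the image size. This is a rough sketch under stated assumptions: 8N1 serial framing (10 bits on the wire per byte) and no allowance for protocol overhead, flash erase time, or verification, none of which are specified here.

```python
def min_transfer_seconds(image_bytes: int, baud_rate: int,
                         bits_per_byte: int = 10) -> float:
    """Lower bound on serial transfer time, assuming 8N1 framing by default."""
    if baud_rate <= 0:
        raise ValueError("baud_rate must be positive")
    return image_bytes * bits_per_byte / baud_rate


if __name__ == "__main__":
    # The same hypothetical 2 MiB image at two common baud rates.
    for baud in (115200, 921600):
        print(baud, round(min_transfer_seconds(2 * 1024 * 1024, baud), 1))
```

Because the estimate depends only on image size and baud rate, two programmers running at the same baud rate should take about the same time.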

Error: Firmware size exceeds flash configuration

Verify that the correct chip model is selected during packaging
Check if any partition content is too large to fit during packaging
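The second check can be illustrated with a small script. This is a sketch only: the partition names, sizes, and flash capacity below are hypothetical and do not describe the actual packaging tool or any real chip's partition layout.

```python
def oversized_partitions(partitions):
    """Return names of partitions whose content is larger than their capacity.

    partitions: dict mapping name -> (content_bytes, capacity_bytes).
    """
    return [name for name, (content, capacity) in partitions.items()
            if content > capacity]


def layout_fits_flash(partitions, flash_bytes):
    """True when the combined partition capacities fit in the flash."""
    return sum(capacity for _, capacity in partitions.values()) <= flash_bytes


if __name__ == "__main__":
    # Hypothetical layout for a 2 MiB flash.
    layout = {
        "firmware": (900 * 1024, 1024 * 1024),
        "voice_prompts": (700 * 1024, 512 * 1024),  # content too large
        "model": (400 * 1024, 512 * 1024),
    }
    print(oversized_partitions(layout))
    print(layout_fits_flash(layout, 2 * 1024 * 1024))
```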

Is there a power-saving mode available? Since the system is battery-powered, can the voice chip’s frequency be reduced when not in use?

An external switching circuit is required.

The progress bar stops at a few percent during firmware download

(1) Check if the correct chip model is selected

(2) Try using a USB-to-TTL converter with a crystal oscillator

Is the PA control pin in platform-generated firmware triggered by a high or a low level by default?

It depends on the board type: J-type is generally low-level triggered, while S-type is high-level triggered.

Why is there slight noise from the speaker when the audio module is not responding?
¶

Check for PCB trace interference using our checklist. If the issue persists, ensure that PLAYER_CONTROL_PA is configured so that the power amplifier is enabled only during playback.
