Chipintelli Non-Chinese Language Product Recognition Acceptance Criteria

Table 1. Near-field Dataset Only (recording distance ≤ 1 m)

Utterances per command	50-99	100-200	≥200
Quiet recognition rate (background noise 35-45 dB, SNR ≥15 dB, test distance <3 m)	>85 %	>90 %	>95 %
Noisy recognition rate (ambient noise 55-60 dB, SNR ≥15 dB, test distance <3 m)	>60 %	>70 %	>80 %

Table 2. Far-field + Near-field Dataset (recording distance ≤ 5 m)

Utterances per command	50-99	100-200	≥200
Quiet recognition rate (background noise 35-45 dB, SNR ≥15 dB, test distance <5 m)	>85 %	>90 %	>95 %
Noisy recognition rate (ambient noise 55-60 dB, SNR ≥15 dB, test distance <5 m)	>60 %	>70 %	>80 %
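
The thresholds are the same in both tables, so an acceptance check comes down to a tier lookup followed by a strict comparison. The Python sketch below shows one way this could be written. The names `TIERS`, `required_rate`, `recognition_rate` and `passes`, and the `command -> (correct, total)` data layout, are assumptions made for illustration and are not part of the criteria. Because the tables list both "100-200" and "≥200", the sketch assumes that exactly 200 utterances belongs to the top tier. The rate is computed as the mean of the per-command rates, following the definition given in the General Recording & Test Notes.

```python
# Required recognition rates (percent) per utterance-count tier, from Tables 1 and 2.
# Each entry: (minimum utterances per command, quiet threshold, noisy threshold).
TIERS = [
    (200, 95.0, 80.0),  # assumption: exactly 200 utterances counts as the top tier
    (100, 90.0, 70.0),
    (50, 85.0, 60.0),
]


def required_rate(utterances_per_command: int, condition: str) -> float:
    """Return the threshold (percent) that the measured rate must exceed."""
    if condition not in ("quiet", "noisy"):
        raise ValueError("condition must be 'quiet' or 'noisy'")
    for minimum, quiet, noisy in TIERS:
        if utterances_per_command >= minimum:
            return quiet if condition == "quiet" else noisy
    raise ValueError("the tables define no threshold below 50 utterances per command")


def recognition_rate(results: dict[str, tuple[int, int]]) -> float:
    """Mean of per-command rates (percent); results maps command -> (correct, total)."""
    if not results:
        raise ValueError("no results")
    per_command = [100.0 * correct / total for correct, total in results.values()]
    return sum(per_command) / len(per_command)


def passes(results: dict[str, tuple[int, int]],
           utterances_per_command: int, condition: str) -> bool:
    # The tables use strict inequalities (e.g. ">85 %"), so equality does not pass.
    return recognition_rate(results) > required_rate(utterances_per_command, condition)
```

For instance, two commands recognized 18/20 and 19/20 times give per-command rates of 90 % and 95 %, hence a recognition rate of 92.5 %. That clears the quiet threshold for the 100-200 tier (>90 %) but not the one for the top tier (>95 %).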

False Wake-up Requirement (applies to both scenarios)

Noise-source test: fewer than 3 false wake-ups within 24 h.
Test conditions: a TV programme as the single noise source, placed 1.5 m from the microphone and on-axis with it (0°); the noise source must NOT contain command words or acoustically similar content.
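
As a rough illustration of how a 24-hour test log could be scored against this requirement, the following Python sketch counts the false wake-ups that fall inside the test window. The function name and the input format (offsets in seconds from the start of the test) are assumptions made for the example.

```python
TEST_DURATION_S = 24 * 60 * 60  # 24 h test window, in seconds
MAX_FALSE_WAKEUPS = 3           # requirement: fewer than 3 within 24 h


def false_wakeup_test_passes(event_times_s: list[float]) -> bool:
    """event_times_s: offset (seconds from test start) of each false wake-up."""
    # Only events inside the 24 h window count towards the limit.
    in_window = [t for t in event_times_s if 0 <= t < TEST_DURATION_S]
    return len(in_window) < MAX_FALSE_WAKEUPS
```

With this sketch, two false wake-ups in the window pass and a third one fails the test.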

General Recording & Test Notes

Record in a normal, quiet home environment; pronounce each command word 3-5 times at a relatively fast speaking rate.
Speakers must use clear, standard pronunciation; maintain a balanced male/female ratio.
Command words should be 4-8 syllables long and phonetically distinct to reduce confusion and false wake-ups.
Recognition rate = average of the correct-recognition rates across all command words.
Near-field data: at least one microphone located ≤1 m from the speaker.
Far-field data: at least two microphones located ≤5 m from the speaker, with the farther microphone ≥3 m away; we recommend recording simultaneously with 3-4 devices at different distances.
Audio data must be saved in WAV format with a sample rate no lower than 44.1 kHz.
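
The audio-format requirement can be checked automatically before a dataset is submitted. The sketch below uses Python's standard `wave` module; the function name `check_wav` is an assumption for illustration. Note that `wave` reads uncompressed PCM WAV files, so a file in another WAV encoding may be rejected by this check even if its sample rate is acceptable.

```python
import wave

MIN_SAMPLE_RATE_HZ = 44_100  # "no lower than 44.1 kHz"


def check_wav(path: str) -> bool:
    """Return True if the file parses as WAV and its sample rate is >= 44.1 kHz."""
    try:
        with wave.open(path, "rb") as wav:
            return wav.getframerate() >= MIN_SAMPLE_RATE_HZ
    except (wave.Error, EOFError):
        # Not a readable WAV file (bad header, truncated, or unsupported encoding).
        return False
```

A missing file still raises `FileNotFoundError`, which seems preferable to silently reporting it as a format failure.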