Command Words and Firmware Production Guide
1. Speech recognition processing flow and required resources
The speech recognition flow and the resources it requires are shown in the figure below. The microphone converts speech into a digital signal and sends it to the NN for recognition. NN recognition requires two resources, an acoustic model and a language model, and outputs a string. That string is then looked up in the command word information table. If it is not found, the result is treated as a misrecognition and is not processed. If it is found, the recognition is valid: the application obtains the information associated with the matched command word, performs the corresponding function, and finally calls the prompt sound player to play the prompt sound.
Notes
- Language model: generated from the command words and used for NN recognition;
- Acoustic model: used for NN recognition; it generally depends on factors such as language and application scenario;
- Command word information table: stores information related to each command word, such as the command string, whether it is a wake-up word, and the corresponding prompt tone;
- Prompt tone: the feedback played after a command word is recognized. Currently, MP3 is supported.
The following sections describe how to generate the required resources.
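The lookup step described above can be illustrated with a minimal sketch. It is not SDK code: `CommandInfo`, `COMMAND_TABLE`, `handle_recognition`, and the table contents are hypothetical names and values chosen only to show the "found / not found" decision.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CommandInfo:
    """One row of a hypothetical command word information table."""
    command: str
    is_wakeup: bool
    prompt_id: int


# Illustrative table contents only; a real table is produced from the .xls file.
COMMAND_TABLE = {
    "hello device": CommandInfo("hello device", True, 1),
    "turn on the light": CommandInfo("turn on the light", False, 2),
}


def handle_recognition(nn_output: str) -> Optional[int]:
    """Return the prompt ID to play, or None if the string is not a known command."""
    info = COMMAND_TABLE.get(nn_output)
    if info is None:
        # Not in the table: treated as a misrecognition and ignored.
        return None
    # Application-specific processing would happen here, before the prompt is played.
    return info.prompt_id
```

A string that is missing from the table yields `None`, which corresponds to the "not processed" branch of the flow.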
2. Make command word model file
- Log in to the Chipintelli Speech AI Development Platform and follow the procedure in ☞Develop firmware components.
3. Generate voice
- From the main menu of the platform, enter the “Broadcast Audio Synthesis” interface and follow the steps in ☞Develop firmware components.
4. Make firmware
4.1. Edit command word information table file
Copy the command word information table file downloaded in section 2.2, “[60000]{xxxx}.xls”, to the path %SDK_PATH%\projects\sample_xxx\firmware\user_file\cmd_info, replacing the original file whose name begins with [60000]. Then modify it according to the project logic, mainly by associating broadcasts, setting wake-up words, and adjusting recognition sensitivity.
Notes
- Model name: sets the model type that the current set of command words corresponds to. At present there are two: NN ID (acoustic model file ID) and ASR ID (language model file ID).
- Model ID: sets the model ID number for the current set of command words. Any number 0 or greater can be used, but it must match the [ID] prefix of the file name. For example, the file [3]asr_xxx_cmd.dat has file ID 3, so the model ID for ASR ID is 3.
- Command word: the command word string.
- Command word ID: an ID defined by the developer, which makes logic quick to develop and implement. By default, different command words cannot share the same command word ID. If they must, modify the script file “cmd_info.bat” and add “--no-cmd-id-duplicate-check” after the cmd_info.exe command.
- Command word semantic ID: a unique string ID defined by Chipintelli. If the product is intended for home networking, this ID can be used to resolve command word conflicts among multiple devices.
- Confidence: adjusts the recognition sensitivity of the command word and helps reduce false recognition.
- Wake-up words: marks the command word as a wake-up word.
- Compound words: marks the command word as a compound word, which acts as both a wake-up word and a command word and so saves a separate wake-up step.
- Expected words: used when a command word is particularly difficult to recognize.
- Unexpected words: used when a command word is recognized too easily, so that a similar, correct command word cannot be recognized and false recognition results.
- Special word count: used when a short command word pre-empts a long command word that begins with the same content. For example, with “heating” and “heating for three minutes”, saying “heating for three minutes” may also produce the result “heating”. The solution is to set a special word count for the “heating” command word: after “heating” is recognized, the system waits briefly to see whether a similar command result follows and, if so, discards “heating”. The value should not be set too large, otherwise the response time of “heating” increases noticeably.
- Broadcasting type: specifies how a broadcast is chosen when there are multiple broadcast options. Two methods are currently supported: “random selection” (select_index is set to -1 when calling the broadcast interface) and “user-defined selection” (select_index is set to the value to be selected when calling the broadcast interface).
- Broadcast ID: the ID of the broadcast audio file (that is, the audio serial number in Chapter 4). A combined broadcast joins IDs with ‘+’. If there are multiple broadcast options, each option occupies one column, up to a maximum of 127 columns.
- Model group ID: used for multi-model switching. In the SDK demo, 0 is the command word model by default and 1 is the wake-up word model.
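The effect of the special word count can be sketched as follows. This is only an illustration of the "wait, then discard the short result" idea from the notes above. The SDK's actual mechanism and the unit of the special word count are not described here, so the `ticks` time unit, the `resolve` function, and the prefix test are all assumptions.

```python
from typing import Dict, List, Tuple


def resolve(results: List[Tuple[int, str]], wait_ticks: Dict[str, int]) -> List[str]:
    """Filter recognition results, dropping short commands superseded by longer ones.

    results    -- (tick, command) pairs in arrival order (hypothetical time unit)
    wait_ticks -- per-command waiting window, standing in for the special word count
    """
    accepted = []
    for i, (tick, cmd) in enumerate(results):
        wait = wait_ticks.get(cmd, 0)
        # A short command is discarded when a longer command starting with the
        # same text arrives inside its waiting window.
        superseded = wait > 0 and any(
            later_cmd != cmd
            and later_cmd.startswith(cmd)
            and later_tick - tick <= wait
            for later_tick, later_cmd in results[i + 1:]
        )
        if not superseded:
            accepted.append(cmd)
    return accepted
```

The trade-off mentioned in the notes is visible here: a larger window catches more long commands, but "heating" on its own cannot be acted on until the window has elapsed.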
Prompt
- If a broadcast is not associated with any command word, you can create a dummy command word: its string is not used to generate the language model and will never be recognized, but the broadcast can still be played through that string.
- The ID in the file name of the command word information table must be 60000 and cannot be changed, for example [60000]{xxxx}.xls.
- Make appropriate use of combined broadcast, selective broadcast, and multi-model switching, which can reduce firmware size and save flash space.
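To make the broadcast columns concrete, here is a small sketch of how combined broadcasts ('+'-joined IDs) and the two selection methods could be modelled. The function names are hypothetical, and treating a non-negative select_index as a 0-based column index is an assumption; only the -1 "random selection" value and the 127-column limit come from the notes above.

```python
import random
from typing import List

MAX_BROADCAST_COLUMNS = 127  # limit stated in the Broadcast ID note


def parse_broadcast(option: str) -> List[int]:
    """Split a combined broadcast such as '1+2' into audio IDs [1, 2]."""
    return [int(part) for part in option.split("+")]


def select_broadcast(options: List[str], select_index: int = -1) -> List[int]:
    """Pick one broadcast option and return the audio IDs to play in order."""
    if not options or len(options) > MAX_BROADCAST_COLUMNS:
        raise ValueError("expected between 1 and 127 broadcast options")
    if select_index == -1:
        # "Random selection": any of the configured options may be played.
        chosen = random.choice(options)
    elif 0 <= select_index < len(options):
        # "User-defined selection": assumed here to be a 0-based column index.
        chosen = options[select_index]
    else:
        raise IndexError("select_index is out of range")
    return parse_broadcast(chosen)
```

For example, `select_broadcast(["1+2", "3"], 0)` yields the two-part combined broadcast, which reuses existing audio files instead of storing a separate recording for the combined phrase.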
4.2. Edit code to meet project requirements
- User logic is mainly implemented in the “system_msg_deal.c” file;
- The UserTaskManageProcess function is the user logic processing task, which handles various messages such as voice recognition messages, key messages, and serial port messages;
- Find the message to be handled and implement the corresponding logic, for example IO control, broadcast voice selection, model switching, parameter adjustment, or serial port reporting;
- If information needs to be retained across power-off, you can save it with the ci_nvdm module; see the volume-setting code in the standard demo in the SDK.
Notes
- If a command word switches the model and that command also includes a voice broadcast, pay attention to the call order of the model switching interface and the voice broadcast; the correct order depends on which model the broadcast belongs to.
4.3. Compose and burn firmware
4.3.1. Copy resource file
Firmware production directory is shown below:
- Put the language model file (asr_zn_214_CI130x.dat) generated in Chapter 2 into the asr directory under the firmware directory, and set the file ID number according to the ASR ID in the command word information table edited in Section 4.1, for example [0]asr_zn_214_CI130x.dat. If dual networks are used, the language models for both the wake-up words and the command words should be placed in this directory;
- Put the acoustic model file (GE-CH-S-V00214.fefixbin3676) generated in Chapter 2 into the dnn directory under the firmware directory, and set the file ID number according to the NN ID in the command word information table edited in Section 4.1, for example [0]GE-CH-S-V00214.fefixbin3676. If the model already exists in the dnn directory, there is no need to replace it;
- Put the broadcast audio files (the wav files in the TTS_wav directory) generated in Chapter 3 into the voice directory under the firmware directory, and set the folder ID number according to VOICE GROUP in the command word information table edited in Section 4.1, for example [0]voice;
- Compile the project code to generate user_code.bin in the user_code directory;
- Compose the partition bin files: double-click “Composite partition bin file.bat”. When it finishes, a bin file with the same name as the directory is generated in each of the asr, dnn, user_file, and voice directories.
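Because every resource is matched to the command word information table through its [ID] prefix, a quick check of the file names can catch mismatches before composing the bin files. The sketch below is a hypothetical helper, not part of the SDK tools; it only parses the `[ID]name` pattern shown in the examples above and reports which expected IDs have no file.

```python
import re
from typing import Iterable, Optional, Set, Tuple

# Matches names such as "[0]asr_zn_214_CI130x.dat"; an optional space after "]" is tolerated.
_ID_PREFIX = re.compile(r"^\[(\d+)\]\s*(.+)$")


def split_id_prefix(name: str) -> Optional[Tuple[int, str]]:
    """Return (ID, remaining name), or None if the name has no [ID] prefix."""
    match = _ID_PREFIX.match(name)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


def missing_ids(expected: Set[int], file_names: Iterable[str]) -> Set[int]:
    """Return the IDs listed in the table that no file in the directory carries."""
    present = set()
    for name in file_names:
        parsed = split_id_prefix(name)
        if parsed is not None:
            present.add(parsed[0])
    return expected - present
```

For instance, if the table uses ASR IDs 0 and 1 but the asr directory holds only `[0]asr_zn_214_CI130x.dat`, `missing_ids({0, 1}, [...])` returns `{1}`.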
4.3.2. Packaging Firmware
To package the firmware, double-click “Package Upgrade.bat”, select “CI130X” in the pop-up interface, and then select “Firmware Package”. For more ways to use the tool, press F1 to view the help:
Enter the packaging interface:
Notes
- Config: software and hardware information area;
- User, ASR, DNN, Voice, UserFile, etc.: firmware partition information area.
- Menu bar.
Packaging steps:
1. Fill in the software and hardware information in the version information area;
2. Select or fill in the bin file path of each partition;
3. Click “Package Firmware”;
4. If a pop-up window reports an address conflict, adjust the size of each partition and repeat step 3;
5. The pop-up prompt “Firmware has been generated” indicates that packaging succeeded.
For more information about the firmware packaging interface and problem handling, please refer to SDK document ☞Instructions for Serial Port Upgrade Tool.
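As a rough illustration of what an address conflict between partitions means, the sketch below checks whether any partition's address range runs into the next one. How the packaging tool actually detects conflicts is not documented here; the `(name, start, size)` layout, the function name, and the example addresses are all assumptions.

```python
from typing import List, Tuple

Partition = Tuple[str, int, int]  # (name, start address, size in bytes), hypothetical layout


def find_conflicts(partitions: List[Partition]) -> List[Tuple[str, str]]:
    """Return pairs of neighbouring partitions whose address ranges overlap."""
    ordered = sorted(partitions, key=lambda p: p[1])
    conflicts = []
    for (name_a, start_a, size_a), (name_b, start_b, _) in zip(ordered, ordered[1:]):
        # A partition conflicts with the next one if it ends after the next one starts.
        if start_a + size_a > start_b:
            conflicts.append((name_a, name_b))
    return conflicts


# Made-up example: "asr" ends at 0x3000 but "dnn" starts at 0x2800.
layout = [("user", 0x0000, 0x1000), ("asr", 0x1000, 0x2000), ("dnn", 0x2800, 0x1000)]
```

With this layout, `find_conflicts(layout)` reports the `asr`/`dnn` pair. Moving `dnn` to start at 0x3000 or shrinking `asr` clears it, which mirrors the "adjust the size of each partition" step.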
4.3.3. Burn firmware
Click “Firmware Upgrade” in the package upgrade tool:
1. Select or fill in the firmware path;
2. Check the serial port connected to the device to be upgraded;
3. Set the other options as needed: forced update of all partitions, authentication file, and encryption;
4. Switch the module to be upgraded into upgrade mode (short the PG and EN pins);
5. Restart the device to be upgraded to start the upgrade;
6. Wait for the upgrade to complete. If it succeeds, the device automatically boots into the firmware code, and any power-on announcement will be heard.