Learn About Intelligent Speech
Basic knowledge
Intelligent speech is an important branch of artificial intelligence that covers speech recognition, semantic understanding, natural language processing, and speech interaction. Its goal is to let devices perceive the world around them through hearing and interact with people in the most natural way, making device control and daily life more convenient.
Intelligent speech is built on neural networks, which raise the accuracy of speech recognition. Semantic understanding then analyzes the user's intent so that the corresponding action can be carried out. For feedback, the device either plays a preset sound or generates the response with speech synthesis.
Intelligent speech can currently be processed in two main ways: online and offline. Because speech processing requires considerable computing power, early solutions relied on cloud servers for recognition and semantic processing. A common intelligent speech processing flow is shown in the following figure.
As the technology has developed, dedicated on-device intelligent speech chips have emerged. They run speech recognition, semantic understanding, and other functions directly on the terminal device using the chip's own computing power, and offline speech has grown as a result. Because offline speech protects user privacy, responds quickly, and works without a network, it has become the standard speech control mode for many control devices. In the future, more speech processing will run at the edge, which reduces server load and network bandwidth and saves resources. The cloud, as a provider of services and content, will work together with on-device speech to serve people's lives.
Offline speech introduction
The offline speech scheme performs speech recognition and other functions locally. It requires no network and offers faster response and better privacy than the online scheme. It relies on an intelligent speech chip to handle the speech functions and is best suited to control devices, such as household appliances (air conditioners, sockets, etc.).
A comparison of offline speech and online speech functions is shown in the following table.
Item | Offline speech | Online speech |
---|---|---|
Network connection | Not required | Required |
Response speed | Very fast (usually about 0.2 s) | Fast (affected by network quality) |
Number of instructions (speech library) | 1~1000 (local speech library) | Unlimited (cloud speech library) |
Fuzzy recognition | Not supported; fixed phrases must be used | Supported |
Speech analysis | A single chip queries the local speech database | Cloud computing queries a cloud database |
Extended functions | None | Entertainment, lifestyle services, and other functions |
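
The fixed-phrase behavior in the table above can be illustrated with a small lookup sketch. It is only an approximation of the idea: the phrases, command IDs, and function names below are hypothetical, and a real speech chip matches audio against a vendor-specific speech library rather than text strings.

```python
# Hypothetical command table: fixed phrase -> command ID.
# The entries are illustrative only, not taken from any real speech library.
COMMANDS = {
    "turn on the air conditioner": 1,
    "turn off the air conditioner": 2,
    "turn on the socket": 3,
}

# Upper bound on the local speech library size quoted in the table (1~1000).
MAX_COMMANDS = 1000
assert len(COMMANDS) <= MAX_COMMANDS


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial variations still match."""
    return " ".join(text.lower().split())


def match_offline(text: str):
    """Return the command ID for an exact fixed phrase, or None.

    There is no fuzzy matching: any wording outside the table is rejected.
    """
    return COMMANDS.get(normalize(text))
```

With this table, `match_offline("Turn ON  the socket")` resolves to command 3, while a free-form request such as "please switch the socket on" returns `None`, which mirrors the "fixed phrases must be used" restriction.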
At present, our company has launched several offline speech solutions, and an application block diagram is shown below.
Offline-online speech introduction
Offline speech needs no network and responds quickly, while online speech gives access to rich cloud content and services. A practical scheme can combine the two: control functions are handled by offline speech, and content and services are handled by online speech. This keeps the basic functions independent of the network and protects user privacy, while still letting the device fetch content and services over the network under the user's control and permission. This combined approach is already used in household equipment such as smart home appliances.
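The split described above can be sketched as a simple router that tries the local command table first and only goes to the cloud when the network is available and the user has allowed it. This is a minimal sketch under stated assumptions: `OFFLINE_COMMANDS`, `cloud_query`, and `route` are hypothetical names, and the cloud call is a stand-in rather than a real service API.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical local command table for control functions.
OFFLINE_COMMANDS = {
    "turn on the light": "light.on",
    "turn off the light": "light.off",
}


@dataclass
class Result:
    handled_by: str          # "offline", "online", or "none"
    action: Optional[str]    # resolved action, or None if nothing handled it


def cloud_query(text: str) -> str:
    """Stand-in for an online speech service; no real network call is made."""
    return f"cloud:{text}"


def route(text: str, network_up: bool, user_allows_cloud: bool,
          cloud: Callable[[str], str] = cloud_query) -> Result:
    """Handle control phrases locally; send everything else to the cloud
    only when the network is up and the user has given permission."""
    key = " ".join(text.lower().split())
    action = OFFLINE_COMMANDS.get(key)
    if action is not None:
        return Result("offline", action)
    if network_up and user_allows_cloud:
        return Result("online", cloud(key))
    return Result("none", None)
```

Checking the local table first means control phrases keep working when the network is down, and the explicit `user_allows_cloud` flag keeps speech on the device unless the user has opted in.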
Our company has launched a combined offline and online speech solution. It supports hundreds of service skills covering common everyday scenarios, including online music, video, social networking, news, encyclopedia, stocks, recipes, and children's education, and can meet the needs of most products. An application block diagram is shown below.
AIoT speech introduction
The Internet of Things (IoT) is now very mature. All kinds of devices can be connected through Ethernet, Wi-Fi, Bluetooth, and other links to achieve interconnected control. However, IoT control, especially of household equipment, still relies on a mobile phone or similar device as the control center. In practice, and especially when the device is right in front of you, reaching for a phone is not the most convenient way to control it. In addition, when the phone or other central device fails, the devices have no way to control one another and cannot be used, which is a real limitation. Speech, as the most natural mode of interaction, can be combined with IoT to address the network provisioning problem in IoT control and some of the pain points caused by depending on a central device. It also lets interconnected devices serve the user together, and a single speech entry device can conveniently control all IoT devices. With the emergence of dedicated intelligent speech chips in particular, the cost of such schemes has dropped sharply, and they are now widely used in IoT equipment such as central control screens, panels, sockets, and large and small household appliances.
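The idea of one speech entry device controlling several IoT devices without a phone in the loop can be sketched as a small dispatcher. This is an illustrative model only: `SpeechHub` and `make_switch` are hypothetical names, the devices are in-memory stand-ins, and a real system would send commands over Wi-Fi, Bluetooth, or another link.

```python
from typing import Callable, Dict


class SpeechHub:
    """Hypothetical speech entry device that forwards commands to devices."""

    def __init__(self) -> None:
        self._devices: Dict[str, Callable[[str], str]] = {}

    def register(self, name: str, handler: Callable[[str], str]) -> None:
        """Make a device reachable under the given name."""
        self._devices[name] = handler

    def dispatch(self, device: str, action: str) -> str:
        """Send an already recognized (device, action) pair to its handler."""
        handler = self._devices.get(device)
        if handler is None:
            return f"unknown device: {device}"
        return handler(action)


def make_switch(name: str) -> Callable[[str], str]:
    """Build an in-memory on/off device used only for this example."""
    state = {"on": False}

    def handle(action: str) -> str:
        if action not in ("on", "off"):
            return f"{name}: unsupported action {action}"
        state["on"] = action == "on"
        return f"{name} is {action}"

    return handle
```

For example, after `hub = SpeechHub()` and `hub.register("socket", make_switch("socket"))`, a call to `hub.dispatch("socket", "on")` reaches the socket directly, with no phone or other central controller involved.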
Our company has launched a speech AIoT solution, and an application block diagram is shown below.