Driven by continuous advances in AI technology and by consumer demand, voice interaction is moving toward the ultimate experience of natural, human-like conversation, featuring:
At the same time, these trends are bringing innovation and restructuring to the related industries. Take the auto industry as an example: in 2020, terms such as "HMI" (Human Machine Interaction) and "IV" (Intelligent Vehicle) were heard everywhere. While transforming and reinventing themselves, carmakers are also going all out on in-vehicle voice, relying on suppliers that can provide them with voice interaction capabilities.
However, only a handful of companies can provide "3H" voice products, i.e., ones featuring "High openness, High smoothness & High quality-price ratio". Among them, the Qing AI open voice platform of PATEO CONNECT+ can be counted as an "optimal solution".
As the first company in the world to work on in-car Chinese speech recognition, PATEO CONNECT+ is also the only IoV company in the industry that owns a voice platform. It has built the industry's only open voice platform that integrates professional voice companies, such as Baidu AIG, Xiaomi, vivo, iFLYTEK and AISpeech, and professional voice bots like Xiaoice. Through custom microphone arrays, skills and speakers, it can tailor an AI SDK within five minutes to empower the auto sector. After six iterations, the Qing AI open voice platform it built has evolved into Qing AI 3.0, which is fully independently controllable and capable of delivering an industry-leading user experience.
So why exactly does the Qing AI open voice platform qualify as an "optimal solution"? Let's take a look.
What is the Qing AI open voice platform?
The PATEO Qing AI open voice platform is a configurable capability platform backed by a complete R&D system, and it can serve as a voice assistant that offers intelligent and user-friendly services.
It supports multiple capabilities such as virtual characters, Voice Reproduce Service (VRS) and emotional TTS. Together these build a warm, emotionally aware voice assistant that can perceive and identify changes in the user's emotions, respond with emotion of its own, and proactively greet, look after and remind the user, thereby delivering emotional and personalized services.
It offers an industry-leading voice experience characterized by fast response, strong multi-turn dialogue capabilities, accurate recognition in high-noise environments, high-accuracy recognition of low-decibel speech, and precise recognition of ultra-long texts. It also supports the integration of offline and online voice services, realizing complex natural language dialogue scenarios with full-duplex interaction through efficient speech recognition, multi-dimensional semantic processing and personalized voice interaction.
What attributes does the Qing AI open voice platform possess?
Five features empower the platform to support custom configuration and personalized creation:
Custom pluggable business based on platform
Because it is built as a platform, it provides accurate and efficient voice distribution and multi-skill result arbitration, and it also allows custom services to be plugged in and products to be upgraded iteratively.
Efficient one-to-many application based on flexibility
Its one-to-many flexibility allows the platform to offer different voice service skills by project and vehicle model, meeting different business needs with a single set of efficient and flexible custom services. It can be applied quickly to various OEM projects, and it supports multiple integration methods across different systems, such as Linux and Android, and different terminals, such as IVI and cell phone. As a result, it can quickly deliver an AI-based interaction experience and provide intelligent voice products that are competitive in the IoV industry.
Intelligent mobile space & the Internet of Everything (IoE) empowered by Qing AI
The platform provides services for more than 1,000 scenarios in 40-plus areas, covering all aspects of the user's daily life, including mobility, media, entertainment, business, dining, accommodation and socializing. It also connects the vehicle with smart home devices such as TVs, as well as watches, sensors, drones and many other IoT terminals.
People-oriented interaction mode
In daily life, the platform is not only an assistant but also a companion. Drawing on user preference analysis and a multi-sensory interaction mode that combines image recognition, gesture interaction and emotion perception, it pushes reminders and recommends various life services to the user in real time, thoughtfully and attentively.
Qing AI provides a fast, accurate, stable and clear user experience, creating voice products that deliver the ultimate experience and outperform their peers.
Specifically, fast navigation search supports 22 POI retrievals in the navigation system within 30s; one-shot recognition accuracy can approach 100% at a speech level of about 45 dB and an environmental noise level of 90 dB; when a long text message is dictated, the platform splits sentences intelligently and keeps up with the user however fast the speech is; and the voice interaction design is clear and easy to understand.
Ensure stability of IVI voice service in seconds
1) Speech Recognition (SR): It takes no more than 800ms from the beginning of the user's speech to the feedback of the first word, and no more than 75ms from the end of the user's speech to the feedback of all recognition results.
2) Semantic parsing: The feedback time is stably kept within 150ms
3) CP: The feedback time is stably kept within 1s in 90% of cases
4) Wakeup & recognition accuracy rates: maintained above 95% even when the device has been running for a long time (over 24h)
5) Resource usage: Qing AI runs steadily on about 200MB of RAM, an extremely low level of system resource usage that helps save device hardware costs
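The timing and accuracy figures above can be read as a set of budgets to check measurements against. The following is only a minimal sketch of such a check, not PATEO's monitoring code: the metric names, the sample format and the function names are all hypothetical, and only the threshold values come from the list above.

```python
# Budgets in milliseconds, taken from the list above.
# The dictionary keys are hypothetical names chosen for this sketch.
BUDGETS_MS = {
    "sr_first_word": 800,     # start of speech -> first recognized word
    "sr_final_result": 75,    # end of speech -> all recognition results
    "semantic_parsing": 150,  # semantic parsing feedback time
}
CP_BUDGET_MS = 1000           # CP feedback time budget
CP_REQUIRED_FRACTION = 0.90   # share of CP calls that must meet the budget
MIN_ACCURACY = 0.95           # wakeup and recognition accuracy floor


def check_sample(sample):
    """Return the names of the latency metrics that exceed their budget."""
    return [name for name, limit in BUDGETS_MS.items() if sample[name] > limit]


def cp_within_budget(cp_latencies_ms):
    """True if at least 90% of the CP calls finished within 1s."""
    if not cp_latencies_ms:
        return False  # no data: treat as not verified
    within = sum(1 for t in cp_latencies_ms if t <= CP_BUDGET_MS)
    return within / len(cp_latencies_ms) >= CP_REQUIRED_FRACTION


def accuracy_ok(correct, total):
    """True if the measured accuracy stays at or above 95%."""
    return total > 0 and correct / total >= MIN_ACCURACY


if __name__ == "__main__":
    sample = {"sr_first_word": 650, "sr_final_result": 90, "semantic_parsing": 120}
    print("Over budget:", check_sample(sample))
    print("CP OK:", cp_within_budget([400] * 19 + [1500]))
    print("Accuracy OK:", accuracy_ok(970, 1000))
```

Keeping the thresholds in one table makes it easy to adjust a budget per project or vehicle model without touching the checking logic.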
24/7 automated monitoring to ensure the stability of the voice platform
The platform is monitored automatically around the clock. SR, semantic parsing and service skills are checked in real time, along with the accuracy, correctness and response time of each dimension of the results returned by CPs, so as to ensure the platform's stability.
After ten years of intensive cultivation, the platform now offers industry-leading multi-turn voice interaction capabilities. It supports context inheritance, scenario switching, multi-round retention and coreference resolution, making dialogue in the multi-round eco interaction process smoother and closer to reality. It also supports access to multiple NLU platforms and performs semantic arbitration through a self-developed central control system: the system combines the scenario data and user personal data furnished by the terminal with the semantic results returned by the NLU platforms, makes a comprehensive judgment, and returns the most reasonable result to the user, thereby preserving the continuity and accuracy of the conversation as far as possible.
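To illustrate the idea of semantic arbitration described above, here is a minimal sketch in which candidate results from several NLU platforms are scored and the highest-scoring one is returned. It is an assumption-laden illustration, not the actual central control system: the NluResult type, the additive bonus scheme and every name and weight are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class NluResult:
    """One candidate interpretation returned by an NLU platform (hypothetical shape)."""
    platform: str      # which NLU platform produced the result
    domain: str        # e.g. "navigation", "music"
    confidence: float  # platform-reported confidence, 0.0 to 1.0


def arbitrate(results, active_scenario, preferred_domains,
              scenario_bonus=0.2, preference_bonus=0.1):
    """Pick the most plausible result from several NLU platforms.

    Assumed scoring rule for this sketch: start from the platform's own
    confidence, add a bonus when the result matches the scenario currently
    active on the terminal, and another when it matches a domain the user
    is known to prefer.
    """
    if not results:
        raise ValueError("no NLU results to arbitrate")

    def score(result):
        total = result.confidence
        if result.domain == active_scenario:
            total += scenario_bonus
        if result.domain in preferred_domains:
            total += preference_bonus
        return total

    return max(results, key=score)


if __name__ == "__main__":
    candidates = [
        NluResult("nlu_a", "music", 0.80),
        NluResult("nlu_b", "navigation", 0.70),
    ]
    # The user is in the middle of a navigation dialogue and often uses navigation.
    chosen = arbitrate(candidates, "navigation", {"navigation"})
    print(chosen.platform, chosen.domain)
```

In this sketch the scenario bonus is what keeps a multi-turn dialogue on topic: a slightly less confident result that continues the current scenario can win over a more confident result from an unrelated domain.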
Reportedly, the next-generation Qing AI products will be based on big data, adopt deep neural networks (DNNs), enable continuous optimization and iteration, and integrate voice, image, vision, gesture, emotion and other multimodal interaction methods, so as to become a perceptive, emotional, caring and self-learning partner that grows with the user. In the future, all mass-produced vehicle models involved in PATEO's projects will be equipped with the Qing AI voice platform.