Market Research Report
Product Code: 1613812

China Automotive Multimodal Interaction Development Research Report, 2024

Publication Date: December 8, 2024
Publisher: ResearchInChina
Pages: 270 (English)
Delivery: Same day to next business day
Multimodal interaction research: AI foundation models deeply integrate into the cockpit, helping perceptual intelligence evolve into cognitive intelligence
China Automotive Multimodal Interaction Development Research Report, 2024 released by ResearchInChina combs through the interaction modes of mainstream cockpits, the application of interaction modes in key vehicle models launched in 2024, and the cockpit interaction solutions of OEMs/suppliers, and summarizes the development trends of cockpit multimodal interaction fusion.
Among current cockpit interaction applications, voice interaction is the most widely and most frequently used in intelligent cockpits. According to the latest statistics from ResearchInChina, from January to August 2024, automotive voice systems were installed in about 11 million vehicles, a year-on-year increase of 10.9%, for an installation rate of 83%. Li Tao, General Manager of Baidu Apollo's intelligent cockpit business, pointed out that "the frequency of people using cockpits has increased from 3-5 times a day at the beginning to double digits today, and has even reached nearly three digits on some models with leading voice interaction technology."
The frequent use of voice recognition not only greatly improves the user's interactive experience, but also promotes its fusion with other interaction modes such as touch and face recognition. For example, the full-cabin memory function of NIO Banyan 2.4.0 is based on face recognition, and NOMI proactively greets occupants whose information has been recorded (e.g., "Good morning, Doudou"); Zeekr 7X integrates voice recognition with eye contact, enabling the driver to control the car by looking and speaking, or by tilting his/her head while giving a voice command.
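A minimal sketch of how such "look and speak" fusion might work: an under-specified voice command borrows its target from the most recent gaze event if the two are close enough in time. The event types, target labels, and time window below are illustrative assumptions, not Zeekr's actual implementation.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class GazeEvent:
    target: str        # e.g. "sunroof", "left_window" (hypothetical labels)
    timestamp: float   # seconds

@dataclass
class VoiceCommand:
    action: str                # e.g. "open", "close"
    target: Optional[str]      # None when the utterance omits the object ("open that")
    timestamp: float

def resolve_command(voice: VoiceCommand, gaze: GazeEvent,
                    max_gap_s: float = 1.5) -> Optional[Tuple[str, str]]:
    """Fuse an under-specified voice command with the most recent gaze
    target when the two events are close enough in time."""
    if voice.target is not None:
        return voice.action, voice.target       # utterance already complete
    if abs(voice.timestamp - gaze.timestamp) <= max_gap_s:
        return voice.action, gaze.target        # borrow the target from gaze
    return None                                 # ambiguous: ask the user back

# Driver looks at the sunroof and says "open that":
print(resolve_command(VoiceCommand("open", None, 10.2), GazeEvent("sunroof", 9.8)))
# -> ('open', 'sunroof')
```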
Compared with mature interaction modes such as voice and face recognition, biometric technologies such as fingerprint, vein, and heart rate recognition are still in the early stage of exploration and development, but they are gradually being mass-produced and applied. For example, BYD launched a palm vein recognition function in 2024 that enables convenient vehicle unlocking; Genesis and Mercedes-Benz introduced fingerprint recognition systems in the 2025 Genesis GV70 and 2025 Mercedes-Benz EQE BEV respectively, allowing users to complete a range of operations such as identification, vehicle start, and payment with fingerprints alone; in addition, Exeed Sterra again uses visual perception technology provided by ArcSoft in its new ET model to realize an in-cabin intelligent health monitoring function that outputs health reports covering five major physical indicators: heart rate, blood pressure, blood oxygen saturation, respiratory rate, and heart rate variability.
The introduction of biometric technology not only improves driving convenience, but also significantly enhances vehicle safety protection, effectively preventing potential hazards such as fatigued driving and vehicle theft. In the future, these biometric technologies will be more widely integrated into the development of intelligent connected vehicles, providing drivers with a safer and more personalized mobility experience.
Case 1: The fingerprint recognition system of the Genesis 2025 GV70 allows users to quickly apply personalized settings (seat, driving position, etc.) through fingerprint authentication, and also supports vehicle start/drive. In addition, it offers personalized linked functions such as simplified operation, fingerprint payment, and valet mode.
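The flow described above, matched fingerprint to stored profile to applied settings and start authorization, could be sketched as follows. The profile schema, field names, and actuator functions are hypothetical, invented purely for illustration.

```python
# Hypothetical profile store keyed by a matched fingerprint template ID.
PROFILES = {
    "fp_001": {"seat_position": 42, "mirror_angle": 12, "payment_enabled": True},
}

def apply_seat(position: int) -> None:
    print(f"seat moved to position {position}")

def apply_mirrors(angle: int) -> None:
    print(f"mirrors set to {angle} degrees")

def on_fingerprint_match(template_id: str) -> bool:
    """Apply the matched user's stored settings and authorize vehicle start."""
    profile = PROFILES.get(template_id)
    if profile is None:
        return False                       # unknown finger: deny start/payment
    apply_seat(profile["seat_position"])
    apply_mirrors(profile["mirror_angle"])
    return True                            # start/drive authorized

print(on_fingerprint_match("fp_001"))      # applies settings, -> True
```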
Case 2: BYD's palm vein recognition system uses a camera to read palm vein data, recognizing at a distance of 8-20 cm, across 360 degrees horizontally and 15 degrees vertically. It uses a professional image acquisition module to capture images of the vein pattern, extracts and stores features through algorithms, and finally performs identification. In the future, it may first be installed in models of the high-end brand Yangwang.
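The enroll-then-identify pipeline described here can be made concrete with a small sketch. The feature extractor below (per-row mean intensities) and the cosine-similarity matcher are crude stand-ins for BYD's proprietary algorithms, used only to show the shape of the flow.

```python
def extract_features(image: list[list[int]]) -> list[float]:
    """Stand-in for the real vein-pattern feature extractor: here we just
    take per-row mean intensities of a grayscale image."""
    return [sum(row) / len(row) for row in image]

def similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

TEMPLATES: dict[str, list[float]] = {}    # user_id -> enrolled template

def enroll(user_id: str, image: list[list[int]]) -> None:
    TEMPLATES[user_id] = extract_features(image)

def identify(image: list[list[int]], threshold: float = 0.999):
    """Return the best-matching enrolled user, or None to keep the car locked."""
    probe = extract_features(image)
    best_uid, best_score = None, 0.0
    for uid, template in TEMPLATES.items():
        score = similarity(probe, template)
        if score > best_score:
            best_uid, best_score = uid, score
    return best_uid if best_score >= threshold else None

enroll("owner", [[10, 12, 11], [40, 42, 41], [22, 20, 21]])
print(identify([[10, 12, 11], [40, 42, 41], [22, 20, 21]]))   # -> 'owner'
```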
Case 3: The Exeed Sterra ET is equipped with a DHS intelligent health monitoring function. Based on an advanced visual multimodal algorithm, it analyzes health status in real time from body-surface data, measures the five major physical indicators of heart rate, blood pressure, blood oxygen saturation, respiratory rate, and heart rate variability, and outputs a health report.
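As a data structure, the output report might carry exactly those five indicators. The field names, units, and JSON serialization below are illustrative assumptions, not Exeed's or ArcSoft's actual schema.

```python
from dataclasses import dataclass, asdict
import json

@dataclass
class HealthReport:
    """The five indicators named in the report; names/units are assumed."""
    heart_rate_bpm: float
    systolic_mmhg: int           # blood pressure, systolic
    diastolic_mmhg: int          # blood pressure, diastolic
    spo2_percent: float          # blood oxygen saturation
    respiratory_rate_bpm: float
    hrv_ms: float                # heart rate variability

report = HealthReport(72, 118, 76, 98.0, 15, 42.0)
print(json.dumps(asdict(report)))
```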
The China Society of Automotive Engineers clearly defines and classifies intelligent cockpits in its jointly released white paper. The classification system is based on the capabilities achieved by intelligent cockpits, comprehensively considers the three dimensions of human-machine interaction capability, scenario expansion capability, and connected service capability, and subdivides intelligent cockpits into five levels from L0 to L4.
With the wide adoption of AI foundation models in intelligent cockpits, HMI capabilities have crossed the boundary of L1 perceptual intelligence and entered a new stage of L2 cognitive intelligence.
Specifically, in the perceptual intelligence stage, the intelligent cockpit relies mainly on the in-cabin sensor system, such as cameras, microphones, and touch screens, to capture and identify the behavior, voice, and gesture information of the driver and passengers, and then converts that information into machine-recognizable data. However, limited by fixed rules and algorithm frameworks, the cockpit interaction system at this stage still lacks the capability of independent decision-making and self-optimization, and is mainly confined to passive responses to input information, as the sketch below illustrates.
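A caricature of that L1 behavior: fixed rules map each recognized input directly to an action, with no context, memory, or self-optimization. The rule and action names here are invented for illustration.

```python
# Fixed rule table: (modality, recognized input) -> action.
RULES = {
    ("voice", "open window"): "window.open",
    ("gesture", "swipe_left"): "media.next_track",
    ("touch", "ac_button"): "hvac.toggle",
}

def handle(modality: str, recognized: str) -> str:
    # Passive response: inputs outside the rule table are simply ignored.
    return RULES.get((modality, recognized), "no_op")

print(handle("gesture", "swipe_left"))   # -> media.next_track
print(handle("voice", "I'm cold"))       # -> no_op (no understanding of intent)
```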
After entering the cognitive intelligence stage, intelligent cockpits can comprehensively analyze multiple data types such as voice, vision, and touch by virtue of the powerful multimodal processing capabilities of foundation models. This makes intelligent cockpits highly intelligent and humanized: they can actively think and serve, keenly perceive the actual needs of the driver and passengers, and provide users with personalized HMI services.
Case 1: SenseAuto introduced an intelligent cockpit AI foundation model product, A New Member For U, at the 2024 SenseAuto AI DAY. It can be regarded as the "Jarvis" of the vehicle: it can weigh up occupants' words, observe their expressions, and actively think, serve, and plan. For example, on the road it can turn up the air conditioning temperature and lower the music volume for a child sleeping in the rear seat, and switch the chassis and driving mode to comfort to create a better sleeping environment. It can also actively detect the physical condition of occupants, find the nearest hospital for anyone who falls ill, and plan the route there.
Case 2: NOMI Agents, NIO's multi-agent framework, uses AI foundation models to rebuild NOMI's cognition and complex task processing capabilities, allowing it to learn to use tools, for example calling search, navigation, and reservation services. Meanwhile, according to the complexity and time span of a task, NOMI can perform complex planning and scheduling. For example, among NOMI's six core multi-agent functions, "NOMI DJ" recommends a playlist that suits the context based on users' needs and actively creates an atmosphere; "NOMI Exploration" reasons from spatial orientation, matches map data with world knowledge, and answers children's questions such as "What is the tower on the side?".
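A minimal sketch of the tool-calling pattern the report attributes to NOMI Agents: the model produces a plan, and a framework dispatches each step to a registered tool. The tool names, plan format, and toy implementations below are assumptions, not NIO's actual API.

```python
# Toy tool implementations; a real framework would call vehicle/cloud services.
def search(query: str) -> str:
    return f"results for {query!r}"

def navigate(destination: str) -> str:
    return f"route planned to {destination}"

def reserve(place: str) -> str:
    return f"reservation made at {place}"

TOOLS = {"search": search, "navigate": navigate, "reserve": reserve}

def run_plan(plan: list[dict]) -> list[str]:
    """Execute a model-produced plan, one tool call per step."""
    return [TOOLS[step["tool"]](step["arg"]) for step in plan]

# e.g. for "find a restaurant, book it, and take me there", the model
# might emit a three-step plan:
plan = [
    {"tool": "search", "arg": "restaurants nearby"},
    {"tool": "reserve", "arg": "the top result"},
    {"tool": "navigate", "arg": "the reserved restaurant"},
]
print(run_plan(plan))
```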