SenseTime bets on multimodal AI to regain its edge

Chinese AI pioneer SenseTime is leveraging its computer vision roots to lead the next phase of AI, shifting towards multimodal systems and embodied intelligence in the physical world. Co-founder and chief scientist Lin Dahua stated that this approach mirrors Google's, starting with vision capabilities as the core and adding language to build true multimodal systems.

SenseTime, a Hong Kong-listed company long regarded as one of the world's leading facial recognition providers, is seeking a new role in the generative AI era that began with ChatGPT's launch three years ago. In an interview with the Post on Wednesday, co-founder and chief scientist Lin Dahua explained that the company's longstanding expertise in vision-based AI positions it strongly to lead in embodied intelligence, robotics, and AI agents operating in real-world environments, amid growing debates on the limits of large language models (LLMs).

"Our strategic approach is somewhat similar to Google’s in the United States, which primarily focuses on multimodal AI including the latest Nano Banana Pro. They also start with vision capabilities as the core, then add language abilities to create real multimodal systems," said Lin, who is also an associate professor of information engineering at the Chinese University of Hong Kong.

Extending the comparison to Google, which has deep capabilities across the AI stack, including its own TPU chips for training models, Lin noted that SenseTime's decision as early as 2018 to build large-scale data centres laid a solid foundation for its ambitions. As of August, the company's total computing power stood at about 25,000 petaflops, up 8.7 per cent since the start of the year, after surging 92 per cent over the whole of 2024.

The pivot signals a shift from hype towards more hardware-focused investment, as SenseTime aims to regain its edge in multimodal, real-world AI.

Related articles


Korean firms highlight AI innovations at CES 2026

Reported by AI. Image generated by AI.

Ahead of CES 2026 in Las Vegas, major Korean tech firms including LG Electronics, Hyundai Motor Group, and Samsung Electronics unveiled AI-centric products and visions. They presented strategies like 'AI in Action' and 'Physical AI,' showcasing advances in robotics, laptops, memory, and more across daily life and industry. The events emphasized AI extending beyond screens into real-world applications.

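Experts predict that 2026 will be a breakthrough year for world models, AI systems designed to understand the physical world more deeply than large language models do. These models aim to ground AI in reality, enabling advances in robotics and autonomous vehicles. Industry leaders such as Yann LeCun and Fei-Fei Li have emphasized their potential to revolutionize spatial intelligence.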


Hangzhou-based startup DeepSeek has not announced plans for its next major AI model release, but its technical papers suggest potential advances. The papers highlight how AI infrastructure innovations could drive efficiency and scale up model performance.

As AI platforms shift toward ad-based monetization, researchers warn that the technology could shape users' behavior, beliefs, and choices in unseen ways. This marks a turnabout for OpenAI, whose CEO Sam Altman once deemed the mix of ads and AI 'unsettling' but now assures that ads in AI apps can maintain trust.


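As the AI boom continues, chatbots such as GPT-5 are rapidly losing popularity. Industry insiders predict that 2026 will be the year of Qwen. The shift is underscored by innovations from Chinese startup Rokid.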

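Despite concerns about an AI investment bubble, Taiwanese investors remain committed to the technology. The island nation has shown no sign of worry about potential overvaluation in the sector.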


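At the Consumer Electronics Show in Las Vegas, companies including Nvidia, Razer, and HyperX unveiled AI-enhanced gaming technologies aimed at improving performance and user experience. The announcements highlight the growing integration of artificial intelligence into gaming peripherals and software. Some are immediate updates, while others remain conceptual prototypes.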

 

 

 
