The Rise of Chinese Open-Weight Models

Late last month, researchers from Hugging Face, MIT, and other institutions published a paper analyzing the balance of power among open-weight models.

Economies of Open Intelligence: Tracing Power & Participation in the Model Ecosystem

Since 2019, the Hugging Face Model Hub has been the primary global platform for sharing open weight AI models. By releasing a dataset of the complete history of weekly model downloads (June 2020-August 2025) alongside model metadata, we provide the most rigorous examination to-date of concentration dynamics and evolving characteristics in the open model economy. Our analysis spans 851,000 models, over 200 aggregated attributes per model, and 2.2B downloads. We document a fundamental rebalancing of economic power: US open-weight industry dominance by Google, Meta, and OpenAI has declined sharply in favor of unaffiliated developers, community organizations, and, as of 2025, Chinese industry, with DeepSeek and Qwen models potentially heralding a new consolidation of market power. We identify statistically significant shifts in model properties, a 17X increase in average model size, rapid growth in multimodal generation (3.4X), quantization (5X), and mixture-of-experts architectures (7X), alongside concerning declines in data transparency, with open weights models surpassing truly open source models for the first time in 2025. We expose a new layer of developer intermediaries that has emerged, focused on quantizing and adapting base models for both efficiency and artistic expression. To enable continued research and oversight, we release the complete dataset with an interactive dashboard for real-time monitoring of concentration dynamics and evolving properties in the open model economy.

www.arxiv.org

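The paper analyzes weekly download histories and model metadata from June 2020 through August 2025 to trace concentration and change in the open-weight model ecosystem. It covers roughly 851,000 models and 2.2 billion downloads over that period.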
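As a rough illustration of what a concentration analysis like this computes, the sketch below sums download counts per developer group and converts them into shares. The function name `download_share_by_group`, the group labels, and the numbers are all made up for illustration; this is not the paper's pipeline or its data.

```python
from collections import defaultdict


def download_share_by_group(records):
    """Aggregate (group, downloads) pairs into each group's share of the total.

    Returns a dict mapping group -> share in the range 0..1.
    An empty or all-zero input yields an empty dict.
    """
    totals = defaultdict(int)
    for group, downloads in records:
        totals[group] += downloads

    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {group: count / grand_total for group, count in totals.items()}


if __name__ == "__main__":
    # Hypothetical one-week download counts, not real Hub data.
    sample_week = [
        ("US industry", 300),
        ("Chinese industry", 450),
        ("Unaffiliated developers", 200),
        ("Chinese industry", 50),
    ]
    shares = download_share_by_group(sample_week)
    for group, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{group}: {share:.1%}")
```

Running the same aggregation week by week would give a time series of shares, which is the kind of view needed to see one group's dominance fade while another's grows.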

Open-weight leadership is shifting from the US to China

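The analysis finds that the dominance of the large US companies Google, Meta, and OpenAI has fallen sharply. As of 2025, Chinese players have risen in their place, with downloads concentrated in DeepSeek and Qwen in particular.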

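In other words, on the question of who leads on the Model Hub, the lead has clearly moved from the Llama family to the Qwen and DeepSeek families. I think this has long been common knowledge among model developers. In late August, The Economist ran an article suggesting that most founders knocking on VCs' doors are probably using Qwen.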

China is quietly upstaging America with its open models

How worried should OpenAI and other labs be?

www.economist.com

When entrepreneurs walk into the offices of Andreessen Horowitz (a16z), a big American venture-capital firm, the odds these days are that their startups are using AI models made in China. “I’d say 80% chance [they are] using a Chinese open-source model,” says Martin Casado, a partner at a16z.
…
In other words, while American labs are betting big on the fortunes to be made by pushing the frontiers of intelligence, their open-weight Chinese rivals are more focused on encouraging adoption of AI. If they succeed, the DeepSeek shock may be just the beginning.

Drawing on this paper, the Financial Times reported that China has overtaken the US in open models.¹

China leapfrogs US in global market for ‘open’ AI models

Beijing-backed technology gains ground as American giants hold fast to ‘closed’ AI strategies

www.ft.com

Beyond the balance of power, the paper also analyzes model properties and architecture, transparency, and the developer ecosystem.

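The average size of downloaded models grew 17x
Models that handle images, audio, and other modalities beyond text have become common, with multimodal generation growing 3.4x
Quantized models grew 5x and MoE architectures 7x
Transparency declined: the share of downloads going to models that disclose their training data fell from 79.3% (2022) to 39% (2025). Open-weight models that do not qualify as truly open source now surpass truly open source models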

Here, "truly open source" refers to models that conform to The Open Source AI Definition.

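To make this analysis more transparent, the authors have published a dashboard. Its Download Date Range setting lets you select dates from September 2025 onward, which is outside the paper's study period. However, openai/gpt-oss-120b, which was downloaded more than 4.4 million times last month, does not appear, so the underlying model data does not seem to have been updated.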

Open Model Evolution - a Hugging Face Space by economies-open-ai

This app lets you create and interact with data visualizations and UI controls directly in a web browser. Simply upload or connect your data, adjust filters or settings, and instantly see charts, t...

huggingface.co

Are Chinese models still in the lead?

What has happened since the end of the paper's analysis period?

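As mentioned above, OpenAI released gpt-oss this August. As its Model Card notes, it has an architecture well suited to agentic workflows, an adjustable reasoning effort that trades inference cost against reasoning depth, and strong Structured Outputs and CoT. This makes it a comparatively easy model to use.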

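gpt-oss-20B, which is a manageable size, has far more downloads than similarly sized Qwen or DeepSeek models. Surrounding tools such as vLLM are strongly shaped by this balance of power among models, so I think having a diversity of strong models matters.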
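A comparison like this can be spot-checked against the Hub. The sketch below is one way to do it, assuming the `huggingface_hub` package is installed: `HfApi().model_info(repo_id)` returns an object whose `downloads` field is, as far as I know, the download count over roughly the last 30 days. The helper names `fetch_recent_downloads` and `rank_by_downloads` are hypothetical, and the Hub is only queried when the default fetcher is actually called.

```python
from typing import Callable, Iterable, List, Tuple


def fetch_recent_downloads(repo_id: str) -> int:
    """Return the Hub's recent download count for one model repo.

    Requires network access and the huggingface_hub package.
    """
    # Imported lazily so rank_by_downloads can be used offline with a custom fetcher.
    from huggingface_hub import HfApi

    info = HfApi().model_info(repo_id)
    # downloads can be missing on some responses, so fall back to 0.
    return info.downloads or 0


def rank_by_downloads(
    repo_ids: Iterable[str],
    fetch: Callable[[str], int] = fetch_recent_downloads,
) -> List[Tuple[str, int]]:
    """Fetch a download count per repo and sort from most to least downloaded."""
    counts = [(repo_id, fetch(repo_id)) for repo_id in repo_ids]
    return sorted(counts, key=lambda pair: pair[1], reverse=True)
```

For example, `rank_by_downloads(["openai/gpt-oss-20b", "Qwen/Qwen3-30B-A3B", "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"])` would query the live Hub. The two comparison repos are my own picks for "similarly sized" models, not necessarily the ones compared above, and the result changes from day to day.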

¹ Japanese translation: [FT]中国、オープン型AIモデルで米国抜き世界トップに - 日本経済新聞
