Answer · Published 2 months ago · Last edited last month · 23 sources

The surprising reason Google's AI cannot spell "Google"

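Google's "AI Overview" gives wrong answers such as "there are two Ps in Google" because the AI processes words as chunks called "tokens" and does not perceive the individual letters inside them. Google has acknowledged that "counting letters within words is a known challenge for LLMs", but the problem is rooted in how large language models are designed, so it is not easily patched.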


The AI cannot answer that "Google" contains no P

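In late May 2026, Google's AI search feature "AI Overview" made a string of elementary spelling mistakes and drew wide attention. Asked "How many Ps are in the word 'Google'?", the AI confidently answered "two". It also claimed that "journalism" contains two "d"s, and went on to spell the word as "j-o-u-r-n-a-d-i-s-m". The next day Google acknowledged the problem, stating that "counting letters within words is a known challenge for LLMs, and we are fixing this specific issue."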

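This is not a random bug. It follows from the basic way mainstream large language models (LLMs) process text. It is a "structural blind spot": hard to avoid, and extremely difficult to resolve at the root in the short term.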

Why AI cannot read letters: the curse of "tokenization"

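We humans perceive a word as a sequence of individual letters: "G", "o", "o", "g", "l", "e". LLMs work in a fundamentally different way. An LLM breaks text into units called "tokens": chunks drawn from a predefined vocabulary, each of which may be a whole word, part of a word, or occasionally a single character. This process is called tokenization, and it is carried out by algorithms such as Byte Pair Encoding (BPE).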
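To make the idea of a learned vocabulary concrete, here is a minimal sketch of BPE-style merge learning in Python. It is a toy under stated assumptions: the three-word corpus and its frequencies are invented, and production tokenizers add byte-level handling, pre-tokenization rules, and much larger corpora that are omitted here.

```python
from collections import Counter

def most_frequent_pair(words):
    """Count adjacent symbol pairs across the corpus and return the most common one."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return max(pairs, key=pairs.get) if pairs else None

def merge_pair(words, pair):
    """Replace every adjacent occurrence of `pair` with one merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged

def train_bpe(corpus, num_merges):
    """Start from single characters and repeatedly merge the most frequent pair."""
    words = {tuple(word): freq for word, freq in corpus.items()}
    merges = []
    for _ in range(num_merges):
        pair = most_frequent_pair(words)
        if pair is None:  # every word is already a single symbol
            break
        words = merge_pair(words, pair)
        merges.append(pair)
    return merges, words

# Hypothetical toy corpus: word -> frequency (invented for illustration).
toy_corpus = {"google": 5, "goggle": 2, "ogle": 3}
merges, segmented = train_bpe(toy_corpus, num_merges=3)

for symbols, freq in segmented.items():
    print(symbols, freq)
```

Each merge fuses two symbols into one opaque unit. After enough merges a frequent word can become a single vocabulary entry, and the characters it was built from are no longer separate items in the sequence the model receives.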

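As an illustrative example, the word "Google" may be processed inside an LLM as a single token, ["Google"], or as several tokens such as ["Go", "ogle"]. The key point is that it is not broken down one character at a time, as in ["G", "o", "o", "g", "l", "e"]. In other words, the model does not natively retain information about which letters, and how many of them, a token contains.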
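The contrast between what a model receives and what a letter count needs can be sketched in a few lines of Python. Everything here is assumed for illustration: `TOY_VOCAB` and its IDs are invented, and `greedy_tokenize` is a simple longest-match segmenter standing in for a real BPE tokenizer, not the tokenizer of any Google model.

```python
# Hypothetical vocabulary: entries and integer IDs are invented for illustration.
TOY_VOCAB = {
    "Google": 1001, "Go": 17, "ogle": 342,
    "G": 1, "o": 2, "g": 3, "l": 4, "e": 5,
}

def greedy_tokenize(text, vocab):
    """Longest-match-first segmentation (a simplification, not real BPE)."""
    tokens, i = [], 0
    while i < len(text):
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            raise ValueError(f"no vocabulary entry covers {text[i]!r}")
    return tokens

def encode(text, vocab):
    """Map text to the integer IDs that a model would actually receive."""
    return [vocab[token] for token in greedy_tokenize(text, vocab)]

def count_letter(word, letter):
    """Character-level count, which needs access to the raw string."""
    return word.lower().count(letter.lower())

word = "Google"
ids_whole = encode(word, TOY_VOCAB)

# Drop the whole-word entry to see the split segmentation instead.
vocab_without_whole = {k: v for k, v in TOY_VOCAB.items() if k != "Google"}
ids_split = encode(word, vocab_without_whole)

print(ids_whole)                # [1001]
print(ids_split)                # [17, 342]
print(count_letter(word, "p"))  # 0
```

In this sketch the model-side view is `[1001]` or `[17, 342]`: integers with no letters attached. Counting characters is trivial on the raw string, but the string is exactly what the token IDs do not expose, so a model has to have learned the spelling of each token indirectly from its training data.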
