
How Fractile Aims to Remove the AI Inference Bottleneck

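UK AI chip company Fractile has raised $220 million in a Series B round and is developing dedicated hardware to speed up inference for AI models.[9][10] Its architecture uses "in-memory compute," placing memory and computation on the same chip to cut data movement, with the aim of lowering inference latency and power consumption.[5][12] If inference performance improves substantially, that is expected to expand newer AI workloads such as reasoning models that perform heavy computation at inference time, real-time AI assistants, and agentic AI.[6][13]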


Concept illustration of AI inference hardware integrating memory and compute. Fractile is developing AI chips designed to perform computation directly within memory to reduce inference latency and cost.

For the past several years, AI companies have raced to train ever-larger models. Now a different problem is quickly coming to the fore: running trained models efficiently in production.

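London startup **Fractile** is developing dedicated AI chips to address this challenge. The company recently announced that it has raised **$220 million in a Series B round (roughly ¥34 billion)** and will accelerate development of new hardware specialized for AI inference.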

Fractile's core premise is simple: what will limit AI progress from here is not the capability of the models themselves, but how quickly and cheaply they can be run.

Why AI inference is the new bottleneck

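Much of today's AI infrastructure is optimized for training, the process of building models. Accelerators such as Nvidia's GPUs are well suited to that work because they excel at massively parallel computation.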

Once a model is released, however, the work shifts to **inference**: the stage where the model builds a response to a user's query by generating tokens (units of text).

At this stage, raw compute is not the only constraint. The more important factors are:

Memory bandwidth (how fast large volumes of data can be read and written)
Memory access latency

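During inference, large models must repeatedly read enormous amounts of weight data and intermediate data. If the hardware cannot move that data fast enough, overall throughput will not improve no matter how fast the compute units are.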
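To make the memory-bound argument concrete, here is a rough back-of-the-envelope sketch in Python. It is not Fractile's method or a model of any specific chip. It assumes a dense model served to a single user (batch size 1), where every generated token requires reading all weights once and costs about 2 floating-point operations per parameter, and it ignores intermediate data such as the KV cache. The function name and all hardware figures are hypothetical and chosen only for illustration.

```python
def decode_step_estimate(param_count, bytes_per_param,
                         mem_bandwidth_bytes_per_s, peak_flops_per_s):
    """Rough per-token time estimate for single-stream decoding.

    Assumptions (illustrative only): dense model, batch size 1,
    all weights read once per token, ~2 FLOPs per parameter per token,
    intermediate data (e.g. KV cache) ignored.
    """
    bytes_per_token = param_count * bytes_per_param
    flops_per_token = 2 * param_count

    memory_time = bytes_per_token / mem_bandwidth_bytes_per_s
    compute_time = flops_per_token / peak_flops_per_s

    # The slower of the two sets the pace for each generated token.
    step_time = max(memory_time, compute_time)
    bound = "memory" if memory_time >= compute_time else "compute"
    return {
        "tokens_per_s": 1.0 / step_time,
        "memory_time_s": memory_time,
        "compute_time_s": compute_time,
        "bound": bound,
    }


# Hypothetical example: 70B parameters at 2 bytes each,
# 2 TB/s of memory bandwidth, 400 TFLOP/s of peak compute.
baseline = decode_step_estimate(70e9, 2, 2e12, 4e14)
faster_compute = decode_step_estimate(70e9, 2, 2e12, 8e14)
faster_memory = decode_step_estimate(70e9, 2, 4e12, 4e14)

for name, result in [("baseline", baseline),
                     ("2x compute", faster_compute),
                     ("2x bandwidth", faster_memory)]:
    print(f"{name}: {result['tokens_per_s']:.1f} tokens/s ({result['bound']}-bound)")
```

Under these assumed numbers, the time spent moving weights is far larger than the time spent computing, so doubling compute leaves tokens per second unchanged while doubling memory bandwidth roughly doubles it. Real systems batch requests and cache intermediate data, which shifts the balance, so treat this as an intuition aid rather than a performance prediction.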
