Answer · Published 2 months ago · Last edited last month · 15 sources

What Is AI ‘Model Collapse’? The Problem That Arises When AI Trains on AI-Generated Data

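Research indicates that repeatedly training on AI-generated synthetic data causes ‘model collapse’, in which rare patterns in the data distribution gradually disappear. During recursive training, only high-probability patterns are reinforced from one round to the next, so the uncommon cases in the ‘tail’ of the distribution quickly vanish.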


Figure: Concept illustration of AI model collapse, showing synthetic data loops shrinking a distribution and removing rare patterns. Recursive training on AI-generated data can gradually erase rare patterns from a model’s learned distribution, a phenomenon researchers call model collapse.

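Generative AI increasingly uses synthetic data, that is, text or images produced by earlier AI models, as training data. Recent studies, however, warn that in the long run this practice can cause a serious problem known as **‘model collapse’**.

Model collapse is the phenomenon in which a model gradually loses the diversity of the real world when it repeatedly trains on the outputs of other AI systems rather than on data produced by people or the real world.

In particular, as time goes on, patterns that appear only rarely in the data distribution disappear, and the model increasingly reproduces only ordinary, average patterns.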
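The drift toward "average" output can be illustrated with a toy experiment. The sketch below is not taken from any of the studies mentioned here. It assumes the simplest possible "model", a single Gaussian fitted by maximum likelihood, and the names `refit_gaussian` and `recursive_gaussian`, the sample size, and the number of generations are all hypothetical choices made for illustration. Each generation samples from the previous fit and refits on those samples only.

```python
import random
import statistics


def refit_gaussian(mu, sigma, n_samples, rng):
    """Draw n_samples from N(mu, sigma) and return the refitted (mean, std)."""
    sample = [rng.gauss(mu, sigma) for _ in range(n_samples)]
    # pstdev is the maximum-likelihood (population) standard deviation.
    return statistics.fmean(sample), statistics.pstdev(sample)


def recursive_gaussian(mu=0.0, sigma=1.0, n_samples=20, generations=200, seed=0):
    """Repeatedly refit a Gaussian on its own samples; return the std per generation."""
    rng = random.Random(seed)
    sigmas = [sigma]
    for _ in range(generations):
        mu, sigma = refit_gaussian(mu, sigma, n_samples, rng)
        sigmas.append(sigma)
    return sigmas


sigmas = recursive_gaussian()
print(f"initial std: {sigmas[0]:.3f}, final std: {sigmas[-1]:.3f}")
```

Because each refit sees only a finite sample of the previous generation, the estimated variance shrinks on average from one generation to the next, so the fitted spread tends to narrow and the samples cluster ever closer to the mean. Real generative models are far more complex, so this should be read only as an intuition pump.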

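What Is Model Collapse?

Model collapse is a degradation in performance that occurs when a generative model is trained recursively, using the outputs of previous models as its training data.

According to research, this process produces **irreversible defects**. Specifically, the ‘tail’ regions of the data distribution, meaning the rare but important cases, gradually disappear.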
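Why the loss of the tail is irreversible can be seen in a minimal sketch. It assumes a hypothetical "model" that is nothing more than a table of token frequencies, retrained each generation on a finite sample of its own output. The function names, the Zipf-like starting distribution, and the sample size are illustrative assumptions, not part of the research described above.

```python
import random
from collections import Counter


def next_generation(probs, n_samples, rng):
    """Sample tokens from probs, then re-estimate probs from that sample alone."""
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    sample = rng.choices(tokens, weights=weights, k=n_samples)
    counts = Counter(sample)
    # A token that was never sampled gets probability 0 in the new model.
    return {t: counts[t] / n_samples for t in tokens}


def simulate_collapse(probs, n_samples=200, generations=30, seed=0):
    """Return the list of distributions, one per generation (index 0 = original)."""
    rng = random.Random(seed)
    history = [dict(probs)]
    for _ in range(generations):
        probs = next_generation(probs, n_samples, rng)
        history.append(probs)
    return history


def support_size(probs):
    """Number of tokens the model can still generate."""
    return sum(1 for p in probs.values() if p > 0)


# Toy Zipf-like distribution: a few common tokens and a long tail of rare ones.
raw = {f"tok{i}": 1.0 / (i + 1) for i in range(50)}
total = sum(raw.values())
original = {t: w / total for t, w in raw.items()}

history = simulate_collapse(original)
sizes = [support_size(p) for p in history]
print("tokens still generable per generation:", sizes)
```

In this toy setup, once a rare token fails to appear in one generation's sample, its estimated probability becomes zero and no later generation can produce it again, so the number of generable tokens can only stay the same or shrink. Mixing fresh real-world data back into each generation would be one way to break this one-way loss in the sketch.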

Over time, the model exhibits the following characteristics.
