Open‑source maintainers have already reported being overwhelmed by floods of AI‑generated contributions, a phenomenon analysts sometimes describe as an "AI slop" wave that strains review capacity and project governance.
One of the most serious concerns is security.
Security researchers have documented a measurable increase in vulnerabilities linked to AI‑generated code. For example, researchers tracking vulnerability disclosures reported at least 35 new CVE entries in March 2026 directly tied to AI‑generated code, with the true number likely higher because many cases lack detectable metadata.
Multiple studies also suggest that AI coding tools frequently reproduce insecure patterns from their training data. Testing across major models found that around 45% of generated code samples introduced common security vulnerabilities, including issues in the OWASP Top 10 categories.
Another risk involves secret leakage. Analysis of real development workflows found that commits assisted by AI exposed credentials more than twice as often as human‑only commits (3.2% vs. 1.5%), contributing to a broader rise in hard‑coded credentials in public repositories.
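One common safeguard against this kind of leakage is to scan changes for credential-like strings before they are committed. The following is a minimal sketch of that idea, not a production scanner: the two patterns and the `find_secrets` helper are illustrative assumptions, and real tools rely on far larger rule sets plus entropy checks.

```python
import re

# Illustrative patterns only (assumptions for this sketch, not a complete rule set).
SECRET_PATTERNS = {
    # AWS access key IDs: "AKIA" followed by 16 uppercase letters or digits.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # A credential-like name assigned a quoted literal of 8+ characters.
    "generic_assignment": re.compile(
        r"(?i)\b(?:api[_-]?key|secret|token|password)\b\s*[:=]\s*['\"][^'\"]{8,}['\"]"
    ),
}


def find_secrets(text):
    """Return (line_number, pattern_name) pairs for lines that look like hard-coded credentials."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings


sample = (
    'import os\n'
    'API_KEY = "sk_live_1234567890abcdef"\n'   # hard-coded literal: flagged
    'key = os.environ["API_KEY"]\n'            # read from the environment: not flagged
    'aws = "AKIAIOSFODNN7EXAMPLE"\n'           # AWS's documented example key ID: flagged
)
print(find_secrets(sample))
```

A check like this could run as a pre-commit hook or a CI step, so that a flagged line blocks the change regardless of whether a human or an AI assistant wrote it.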
Together, these trends increase the probability that vulnerabilities, insecure configurations, or leaked keys reach production systems.
The risks become particularly visible in systems built around AI agents and automation tools.
The open‑source AI assistant platform OpenClaw has been cited by security researchers as a major example of exposed AI infrastructure. Investigations have identified tens of thousands of internet‑accessible deployments, many vulnerable to takeover due to misconfiguration or outdated software.
In some scans, more than 21,000 publicly accessible instances were found online, with misconfigured systems leaking API keys, OAuth tokens, and plaintext credentials.
Additional analysis of the platform’s ecosystem uncovered problems inside its extension marketplace as well: scanning nearly 4,000 skills revealed 283 packages—about 7.1%—containing critical credential‑handling flaws that could expose sensitive data.
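The reported share can be sanity-checked against the counts given. A quick sketch, using only the two figures quoted above (283 flagged packages, about 7.1%); the variable names are illustrative:

```python
flagged = 283           # packages with critical credential-handling flaws
reported_share = 0.071  # "about 7.1%"

# Total number of skills implied by those two figures.
implied_total = flagged / reported_share
print(round(implied_total))  # 3986
```

The implied total of roughly 3,986 is consistent with "nearly 4,000" skills scanned.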
These incidents illustrate a broader issue: when powerful AI agents are deployed without strong security practices, they can effectively become public control panels for the systems they integrate with.
Many developers emphasize that the real danger is not the tools themselves but who uses them and how.
Traditional software development assumes that the person writing the code understands its architecture, dependencies, and security boundaries. Vibe coding breaks that assumption.
If someone cannot read or reason about the generated code, they may still be able to deploy a working application, but they may not recognize flaws in its architecture, dependencies, or security boundaries.
In practice, this can produce software that works in demonstrations but fails under real‑world conditions. Engineers often describe these systems as “happy‑path software”—applications that function only under ideal scenarios because their creators cannot fully evaluate the underlying logic.
Even when AI‑generated code works correctly, it can accumulate technical debt quickly.
Because AI dramatically increases the amount of code produced per developer, organizations end up managing larger and more complex codebases. If that code contains redundant logic, inconsistent design patterns, or weak documentation, future changes become more expensive and risky.
Security researchers warn that this dynamic creates a form of “security debt,” where vulnerabilities accumulate faster than organizations can identify and fix them.
In other words, the productivity boost happens immediately—but the maintenance costs appear later.
The same dynamic—cheap generation combined with expensive evaluation—is beginning to appear in science.
AI systems are increasingly used to search literature, generate hypotheses, draft papers, and assist with peer review workflows.
In some cases, the results are promising. Experiments show that language models can generate plausible and sometimes novel scientific hypotheses that researchers can test experimentally.
However, large studies comparing human and AI‑generated research ideas suggest that AI‑generated hypotheses often perform worse than human‑generated ones when evaluated experimentally.
Editors and researchers are increasingly concerned about the possibility of an "AI slop" effect in academia as well. A 2026 editorial in Science warned that undisclosed or excessive use of AI in manuscript production could degrade the reliability of the scholarly record if oversight does not keep pace.
Across both software engineering and research, the core issue is the same.
AI dramatically reduces the cost of producing outputs—code, papers, hypotheses, or designs. But the cost of evaluating those outputs still depends on expert human judgment.
When generation becomes nearly free but evaluation remains scarce, systems can become flooded with plausible but unreliable work. In software, that appears as insecure code and fragile systems. In science, it may appear as large volumes of weak hypotheses or low‑quality manuscripts.
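A toy model makes the imbalance concrete. The sketch below assumes a fixed weekly review capacity and a constant stream of new output; the function name and all numbers are hypothetical and chosen only to show the shape of the problem.

```python
def review_backlog(generated_per_week, review_capacity_per_week, weeks):
    """Return the number of unreviewed items at the end of each week."""
    backlog = 0
    history = []
    for _ in range(weeks):
        backlog += generated_per_week                       # new output arrives
        backlog -= min(backlog, review_capacity_per_week)   # reviewers clear what they can
        history.append(backlog)
    return history


# Hypothetical numbers: reviewers can evaluate 50 items per week.
before = review_backlog(generated_per_week=40, review_capacity_per_week=50, weeks=4)
after = review_backlog(generated_per_week=120, review_capacity_per_week=50, weeks=4)
print(before)  # [0, 0, 0, 0]
print(after)   # [70, 140, 210, 280]
```

While generation stays below review capacity, the backlog remains at zero. Once generation exceeds capacity, unreviewed work grows every week by the difference between the two, no matter how good any individual item is.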
The challenge for organizations and institutions is not simply adopting AI tools—it is building the review processes, security practices, and governance needed to keep the resulting flood of output from becoming vibe slop.