LLMs may systematically favor or exclude certain study types, languages, or results. Researchers should compare AI screening decisions against a gold-standard set of human decisions to measure and calibrate for this.
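One way to make that comparison concrete is to score the AI's include/exclude calls against the human labels on the same records. The sketch below is illustrative only: the function name `screening_agreement` and the toy labels are hypothetical, and the choice of sensitivity, specificity, and Cohen's kappa as metrics is an assumption, not something the statements cited here prescribe.

```python
import math


def screening_agreement(human, ai):
    """Score AI include/exclude decisions against human gold-standard labels.

    Both arguments are equal-length sequences of booleans (True = include).
    Hypothetical helper for illustration; the metric choice is an assumption.
    """
    if len(human) != len(ai):
        raise ValueError("human and ai must have the same length")
    if not human:
        raise ValueError("at least one record is required")

    pairs = list(zip(human, ai))
    tp = sum(1 for h, a in pairs if h and a)          # both include
    tn = sum(1 for h, a in pairs if not h and not a)  # both exclude
    fp = sum(1 for h, a in pairs if not h and a)      # AI includes, human excludes
    fn = sum(1 for h, a in pairs if h and not a)      # AI misses a relevant study
    n = len(pairs)

    # Sensitivity matters most in screening: a missed study is lost for good.
    sensitivity = tp / (tp + fn) if (tp + fn) else math.nan
    specificity = tn / (tn + fp) if (tn + fp) else math.nan

    # Cohen's kappa: agreement corrected for what chance alone would produce.
    observed = (tp + tn) / n
    expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / (n * n)
    kappa = (observed - expected) / (1 - expected) if expected != 1 else math.nan

    return {
        "tp": tp, "tn": tn, "fp": fp, "fn": fn,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
    }


if __name__ == "__main__":
    # Toy labels, invented for this example only.
    human_labels = [True, True, True, False, False, False, False, False]
    ai_labels = [True, True, False, True, False, False, False, False]
    print(screening_agreement(human_labels, ai_labels))
```

Running the same comparison separately by language or study design would show whether the disagreements cluster in one group, which is the pattern a systematic bias would produce.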
Machine learning systems are often trained on conventional wisdom and published literature, which already skews toward positive results. This can silently amplify existing biases in the evidence base.
Do not blindly accept AI-suggested studies, extracted data, or risk-of-bias assessments. Cross-check a substantial random sample manually.
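Drawing that sample reproducibly makes the audit itself reportable. The following is a minimal sketch, assuming records are identified by sortable IDs. The function name `draw_audit_sample`, the 20% default fraction, and the seed value are all placeholders chosen for illustration; the text above does not define what counts as "substantial".

```python
import random


def draw_audit_sample(record_ids, fraction=0.2, seed=2025):
    """Pick a reproducible random subset of records for manual cross-checking.

    `fraction` and `seed` are illustrative defaults, not recommended values.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")

    ids = list(record_ids)
    if not ids:
        return []

    # Always audit at least one record, even for very small sets.
    k = max(1, round(len(ids) * fraction))

    # A dedicated, seeded generator keeps the draw repeatable and leaves
    # the global random state untouched.
    rng = random.Random(seed)
    return sorted(rng.sample(ids, k))


if __name__ == "__main__":
    to_check = draw_audit_sample(range(1, 101))
    print(f"{len(to_check)} records selected for manual review")
```

Recording the seed and fraction alongside the results lets another team redraw exactly the same sample.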
In 2025, Cochrane, the Campbell Collaboration, JBI, and the Collaboration for Environmental Evidence jointly released a statement requiring that all AI use in evidence syntheses be reported openly.
A three-pillar guideline for responsible AI in systematic reviews calls for retrieval-augmented generation (RAG) with verifiable source attribution, positioning AI as a "calibrated partner" rather than a replacement.
Improved transparency, clearer reporting standards, and greater user training are all needed to support responsible adoption of AI in evidence synthesis.
AI can reduce manual workload by 50–75% across literature screening, data extraction, and risk-of-bias assessment without sacrificing PRISMA-grade accuracy, provided it is paired with researcher oversight. But the same studies confirm that AI introduces its own biases (selection bias, confirmation bias, training-data bias). The antidote is human oversight, transparent reporting, and rigorous validation. Never outsource critical thinking to the tool.