Blog
Insights, commentary, and updates from the replication community.

Replication Soundtrack vol. 1
A fun look back at the Utrecht Replication Games
June 15, 2026 · Lenka Fiala
The AI Replication Engine: Inside the Benchmark Behind the Beta
A new paper, currently under peer review, tests the Engine on 74 economics and political-science papers across three distinct verification tasks, using two frontier models (GPT-5.5 and Claude Opus 4.7).
June 10, 2026 · Bruno Barbarioli

What Happens After A Bug Is Found?
Why is discovering an error often followed by anxiety?
June 2, 2026 · Ghina Abdul Baki
BITSS Annual Conference Takeaways
We presented our Second Meta Paper at the Berkeley Institute for Transparency in the Social Sciences (BITSS) Annual Conference on April 16th. Our presentation built on our First Meta Paper: assessing reproducibility and robustness in the social sciences.
May 25, 2026 · Derek Mikola
What Three UK Events Showed Me About the I4R Network
Last week, I4R felt less like an organization and more like a living research network. On Monday, researchers gathered in Cambridge. Two days later, another group met at University College London. The next day, we were at King’s College London. Different rooms, different participants, different local teams, but one shared purpose: to understand how AI is changing research.
May 22, 2026 · Juan P. Aparicio
A Sourdough Reminder About Decision Making Without Statistics
My mother and I troubleshoot her twice-weekly sourdough. It passes any reasonable test: it is bread. It is safe to eat. It is great buttered, toasted, sandwiched or crouton-ed. It keeps a couple of days. It is the delicious carb you want to eat.
May 21, 2026 · Derek Mikola

What is our place in the multiverse?
The notions of reproducibility and robustness are foundational to the work we do at the Institute for Replication. A related term that has been making the rounds recently is “multiverse analysis”, so I naturally had to look into it. The origin of the term can be traced back to S…
April 21, 2026 · Luna Fazio
Non-random coding errors
Just a normal conversation between researchers. A: "I ran the test, the effect is negative and not significant." B: "That must be wrong. I saw the graph, the effect clearly exists and is positive." A: "I'll double-check the code." Maybe person B is right:
April 14, 2026 · Lenka Fiala