About SafeBench
The SafeBench competition is designed to stimulate research on new benchmarks that assess and reduce risks associated with artificial intelligence. We are providing $250,000 in prizes: five $20,000 prizes and three $50,000 prizes for the top benchmarks.
Why Benchmarking?
Metrics drive the ML field. If we want to understand the progress of AI, we must define metrics that track progress on the problems we care about. Formalizing these metrics into benchmarks is crucial for evaluating and anticipating the risks these models may present.
AI has seen exciting advances recently. If progress continues at the current rapid rate, AI is positioned to be one of the most transformative technologies of our lifetime, giving rise to systems that can outperform humans in most or all intellectual domains. Such advances could greatly improve the world, resolving long-standing scientific challenges and bringing about unprecedented economic prosperity; however, they also pose risks. For example, advanced AI could be weaponized, used maliciously, concentrate power, or pursue objectives that are unsafe or otherwise harmful to humans.
We think that technical ML safety research can reduce these hazards. In particular, we believe that developing benchmarks is one of the most important ways to measure, and help reduce, potential harms.
As an example, benchmarks that may previously have won include:
For an in-depth discussion of how to develop good benchmarks, see this blog post.
SafeBench Competition Timeline
Mar 25, 2024: Competition Launch
Feb 25, 2025: Submission Deadline
Apr 25, 2025: Winners Announced
Center for AI Safety
SafeBench is a project of the Center for AI Safety, a technical research non-profit focused on reducing societal-scale risks from AI systems. Our past benchmarks include MMLU (the most widely used benchmark for LLMs), the Weapons of Mass Destruction Proxy, the out-of-distribution detection baseline, and the distribution shift benchmarks. See more of our research here.