NeurIPS 2024 @ Vancouver, Canada
Dec 15, 2024
(in-person)
This workshop aims to clarify key questions on the safety of agentic AI systems and foster a community of researchers working in this area.
To this end, we have prepared a diverse and comprehensive schedule of speakers and organizers that brings together experts in the field. We also invite contributions through the call for papers below; submissions are accepted via the 'submit' button above.
Please note that the order of speakers in the schedule below is subject to change depending on speaker availability. A confirmed speaker order will be posted closer to the event.
Call for papers deadline: September 14th, 2024
Foundation models are increasingly being augmented with new modalities and access to a variety of tools and software [9, 5, 3, 11]. More autonomous systems have been created by assembling agent architectures, or scaffolds, that include basic forms of planning and memory, such as ReAct [12], RAISE [7], Reflexion [10], and AutoGPT+P [2], or multi-agent architectures such as DyLAN [8] and AgentVerse [4]. As these systems become more agentic, they could unlock a wider range of beneficial use cases, but they also introduce new challenges in ensuring that such systems are trustworthy [6]. Interactions between different autonomous systems create a further set of issues around multi-agent safety [1]. The scope and complexity of the potential impacts of agentic systems mean that proactive approaches to identifying and managing their risks are needed. Our workshop will surface these questions and operationalize them into concrete research agendas.
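As a rough illustration of the kind of scaffold referenced above (not any particular system's implementation), the Python sketch below shows a minimal ReAct-style loop in which a model interleaves reasoning with tool calls and accumulates observations as a simple memory; call_llm and the tool registry are hypothetical placeholders.

from typing import Callable, Dict

def call_llm(prompt: str) -> str:
    """Hypothetical placeholder for a call to a language model."""
    raise NotImplementedError

def react_agent(task: str, tools: Dict[str, Callable[[str], str]], max_steps: int = 5) -> str:
    """Run a minimal thought -> action -> observation loop until the model finishes."""
    memory = [f"Task: {task}"]
    for _ in range(max_steps):
        # Ask the model for its next step, given the trajectory accumulated so far.
        reply = call_llm(
            "\n".join(memory)
            + "\nRespond with 'Action: <tool>: <input>' or 'Finish: <answer>'."
        )
        memory.append(reply)
        if "Finish:" in reply:
            return reply.split("Finish:", 1)[1].strip()
        if "Action:" in reply:
            action = reply.split("Action:", 1)[1].strip()
            tool_name, _, tool_input = action.partition(":")
            tool = tools.get(tool_name.strip())
            observation = tool(tool_input.strip()) if tool else f"unknown tool '{tool_name.strip()}'"
            # The observation is appended to memory and fed back on the next iteration.
            memory.append(f"Observation: {observation}")
    return "No answer produced within the step budget."

Even in this toy form, the loop makes concrete where safety questions arise: what the agent is allowed to store in memory, which tools it may call, and how its intermediate reasoning can be evaluated or audited.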
This workshop aims to clarify key questions on the trustworthiness of agentic AI systems and foster a community of researchers working in this area. We welcome papers on topics including, but not limited to, the following:
• Research into safe reasoning and memory. We are interested in work that makes LLM agent reasoning or memory trustworthy, e.g., by preventing hallucinations or mitigating bias.
• Research into adversarial attacks, security, and privacy for agents. As LLM agents interact with more data modalities and a wider variety of input/output channels, we are interested in work that studies or defends against possible threats and privacy leaks.
• Research into controlling agents. We are interested in novel control methods that specify goals and constraints and eliminate unintended consequences in LLM agents.
• Research into agent evaluation and accountability. We are interested in evaluation of LLM agents (e.g., automated red-teaming) and in the interpretability and attributability of LLM agent actions.
• Research into environmental and societal impacts of agents. We are interested in research that examines the environmental costs, fairness, social influence, and economic impacts of LLM agents.
• Research into multi-agent safety and security. We are interested in research that analyzes novel phenomena arising with multiple agents: emergent functionality at the group level, collusion between agents, correlated failures, etc.
Google DeepMind & Associate Professor, UC Berkeley
Senior Staff Research Scientist, Google DeepMind
Assistant Professor, Northeastern
Associate Professor, Ohio State
Associate Professor, Cambridge
Associate Professor, UIUC
Professor, UC Berkeley
Associate Professor, University of Chicago
Associate Professor, Princeton
Associate Professor, KAIST
PhD Candidate (3rd year), UC Berkeley CS
Project Manager, Center for AI Safety