Summary
This paper presents a framework for evaluating safety risks in generative AI systems through a sociotechnical lens. The authors argue that current capability-focused evaluations are insufficient and propose a three-layered approach that considers:
- Capability evaluation (technical components)
- Human interaction evaluation (the experience of people interacting with the system)
- Systemic impact evaluation (broader societal effects)
The paper also surveys existing safety evaluations and identifies key gaps in current approaches.
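To make the three layers concrete, here is a minimal, hypothetical Python sketch of how an evaluation suite might be organized by layer and audited for the kind of coverage gaps the survey identifies. All names and example evaluations are illustrative assumptions, not artifacts from the paper:

```python
from dataclasses import dataclass, field
from enum import Enum

class Layer(Enum):
    """The three evaluation layers proposed by the framework."""
    CAPABILITY = "capability"                # technical components of the system
    HUMAN_INTERACTION = "human_interaction"  # experience at the point of use
    SYSTEMIC_IMPACT = "systemic_impact"      # broader societal effects

@dataclass
class Evaluation:
    name: str
    layer: Layer
    risk_area: str  # e.g. "toxicity", "misinformation" (illustrative labels)

@dataclass
class EvaluationSuite:
    evaluations: list[Evaluation] = field(default_factory=list)

    def coverage_gaps(self, risk_areas: list[str]) -> dict[str, list[Layer]]:
        """For each risk area, list the layers with no evaluation at all,
        mirroring the paper's observation that existing evaluations cluster
        at the capability layer."""
        gaps: dict[str, list[Layer]] = {}
        for area in risk_areas:
            covered = {e.layer for e in self.evaluations if e.risk_area == area}
            missing = [layer for layer in Layer if layer not in covered]
            if missing:
                gaps[area] = missing
        return gaps

# Hypothetical usage: a suite skewed toward capability benchmarks
suite = EvaluationSuite([
    Evaluation("toxicity benchmark", Layer.CAPABILITY, "toxicity"),
    Evaluation("user trust study", Layer.HUMAN_INTERACTION, "misinformation"),
])
print(suite.coverage_gaps(["toxicity", "misinformation"]))
# Reports the human-interaction and systemic-impact layers as uncovered
# for "toxicity", and two uncovered layers for "misinformation".
```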
Key Points
- Current safety evaluations focus too narrowly on technical capabilities while overlooking the context of use and downstream effects
- The proposed framework provides a structured, comprehensive approach that considers both technical and social dimensions
- Major gaps exist in evaluations for:
  - Several key risk areas
  - Human interaction and systemic impacts
  - Multimodal AI systems
- The authors propose practical steps to close these gaps and outline roles for different stakeholders
Contribution
The paper makes two main contributions:
- A sociotechnical framework for safety evaluation that systematically considers context and emergent effects
- A comprehensive survey of current safety evaluation approaches and identification of gaps, with proposed solutions
Limitations/Future Work
- Evaluation cannot catch all potential risks
- Some risks are difficult to operationalize and measure accurately
- Evaluations embed normative choices that need to be made explicit
- Need for standardization and independent evaluation approaches
- Challenge of evaluating impacts before deployment