📖 tl;dr Responsible AI

    Tools and Utilities

    Folder: Evaluation-and-Testing/Tools-and-Utilities

    5 items under this folder.

    • Apr 13, 2025

      Azure AI Foundry Agent Evaluate SDK

    • Apr 13, 2025

      LLM Comparator

      • llm-evaluation
      • visualization
      • tool
      • side-by-side-evaluation
      • responsible-ai

    • Apr 13, 2025

      Microsoft RAI Impact Assessment Guide Summary

    • Apr 13, 2025

      SafeArena: Evaluating the Safety of Autonomous Web Agents

      • RAI
      • paper
      • agent-safety
      • web-agents
      • evaluation
      • benchmark

    • Apr 13, 2025

      WildGuard: Open One-stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs

      • RAI
      • paper
      • LLM-safety
      • moderation
      • jailbreak
