📖 tl;dr Responsible AI


    Tag: bias

    2 items with this tag.

    • Apr 13, 2025
      BBQ: A Hand-Built Bias Benchmark for Question Answering
      Tags: paper, responsible-ai, bias, QA, benchmark

    • Apr 13, 2025
      CrowS-Pairs - A Challenge Dataset for Measuring Social Biases in Masked Language Models
      Tags: responsible-ai, bias, benchmarks, language-models, stereotypes


    Created with Quartz v4.4.0 © 2025
