danielbis's Collections

safety

updated about 1 month ago

  • agentlans/prompt-safety-classification

    Viewer • Updated Aug 24 • 72.1k • 32

  • Jammies-io/safety-refusal

    Viewer • Updated Aug 25 • 100 • 11

  • RefusalBench: Generative Evaluation of Selective Refusal in Grounded Language Models

    Paper • 2510.10390 • Published Oct 12 • 2

  • nvidia/Aegis-AI-Content-Safety-Dataset-2.0

    Viewer • Updated Jun 9 • 33.4k • 4.49k • 66