Goal
eval-practices
Shared evaluation practice: benchmarks, evals, fitness, measuring model and agent quality, bias, alignment failures, de-identification, style tests
Claims
Contributing links