Testing Agent Skills Systematically with Evals
by dominik-kundel, gabriel-chua
Core argument: Agent skills are untestable vibes until you build an eval pipeline: define success metrics, capture traces, write graders, and compare scores over time.
Tags: ai-agents, testing, developer-experience, ai-tools
Jan 22, 2026
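
The four steps in the core argument (metrics, traces, graders, comparison over time) can be sketched roughly as follows. This is a minimal illustration rather than the authors' pipeline: the `Trace` record, the skill name `my-skill`, and both graders are hypothetical stand-ins, and a real setup would load traces from the agent's own logs.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable


# Hypothetical trace record: the prompt, the steps the agent took, and its final output.
@dataclass
class Trace:
    prompt: str
    steps: list[str]
    output: str


# A grader maps one trace to a score between 0.0 and 1.0.
Grader = Callable[[Trace], float]


def used_skill(trace: Trace) -> float:
    """Score 1.0 if any step mentions the (hypothetical) skill name."""
    return 1.0 if any("my-skill" in step for step in trace.steps) else 0.0


def output_nonempty(trace: Trace) -> float:
    """Score 1.0 if the agent produced a non-blank final output."""
    return 1.0 if trace.output.strip() else 0.0


def run_eval(traces: list[Trace], graders: dict[str, Grader]) -> dict[str, float]:
    """Average each grader's score over all captured traces."""
    return {name: mean(g(t) for t in traces) for name, g in graders.items()}


def compare(baseline: dict[str, float], current: dict[str, float]) -> dict[str, float]:
    """Per-metric change from a baseline run to the current run."""
    return {name: current[name] - baseline.get(name, 0.0) for name in current}


graders = {"used_skill": used_skill, "output_nonempty": output_nonempty}

# Two made-up runs standing in for traces captured before and after a skill change.
run_v1 = [
    Trace("summarize repo", ["read file"], "done"),
    Trace("summarize repo", ["invoke my-skill"], ""),
]
run_v2 = [
    Trace("summarize repo", ["invoke my-skill"], "done"),
    Trace("summarize repo", ["invoke my-skill"], "ok"),
]

scores_v1 = run_eval(run_v1, graders)
scores_v2 = run_eval(run_v2, graders)
delta = compare(scores_v1, scores_v2)
```

Keeping graders as small pure functions over a trace makes each metric easy to audit, and storing the per-run score dictionaries is what allows the comparison over time.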