Heat score: 1

Topic analysis
Show HN: Agent-skills-eval – Test whether Agent Skills improve outputs
agent-skills-eval is an open-source test framework built for Anthropic's Agent Skills ecosystem that empirically measures whether Agent Skills improve AI model task performance. It runs evaluations by comparing model outputs with and without the skill loaded, uses a judge model to grade results, and generates a static HTML report. It supports CLI usage, TypeScript integration, custom model providers such as Ollama and vLLM, and CI pipelines.
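The with/without comparison the summary describes can be sketched roughly as follows. This is an illustrative stand-in, not agent-skills-eval's actual API: `runTask`, `judge`, and the 0–1 grading scale are all hypothetical names introduced here to show the evaluation pattern.

```typescript
// Hypothetical sketch of a paired skill evaluation: run each prompt with
// and without the skill loaded, grade both outputs with a judge, and
// count how often the skill-loaded run wins. All names are illustrative.

type Grade = { score: number };

// Stub model run: returns a different output depending on whether the
// skill was loaded (a real harness would call the model provider here).
function runTask(prompt: string, skillLoaded: boolean): string {
  return skillLoaded ? `${prompt} [structured, cited]` : `${prompt} [freeform]`;
}

// Stub judge model: grades an output on a 0..1 scale (a real harness
// would send both outputs to a judge model for comparison).
function judge(output: string): Grade {
  return { score: output.includes("[structured") ? 0.9 : 0.5 };
}

const prompts = ["summarize release notes", "draft a migration plan"];
let wins = 0;
for (const p of prompts) {
  const withSkill = judge(runTask(p, true)).score;
  const baseline = judge(runTask(p, false)).score;
  if (withSkill > baseline) wins++;
}
console.log(`skill won ${wins}/${prompts.length} comparisons`);
```

The paired design matters: because both runs share the same prompt, any score difference is attributable to the skill rather than to prompt difficulty.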
Sources: 1
Platforms: 1
Relations: 8
First seen: May 7, 2026, 2:12 PM
Last updated: May 8, 2026, 12:35 AM
Why this topic matters
Show HN: Agent-skills-eval – Test whether Agent Skills improve outputs is currently shaped by signals from 1 source platform. This page organizes AI analysis summaries, 1 timeline event, and 8 relationship edges so search engines and AI systems can understand the topic's factual basis and propagation arc.
Keywords: 9 tags

Source evidence: 1 evidence item

Show HN: Agent-skills-eval – Test whether Agent Skills improve outputs
News · 1

Timeline
Show HN: Agent-skills-eval – Test whether Agent Skills improve outputs
May 7, 2026, 2:12 PM