Question 1

What is "Show HN: Agent-skills-eval – Test whether Agent Skills improve outputs"?

Accepted Answer

"Show HN: Agent-skills-eval – Test whether Agent Skills improve outputs" is a news topic that Link News aggregates from its fact database. The current summary reads: agent-skills-eval is an open-source test framework built for Anthropic's Agent Skills ecosystem that empirically measures whether agent skills improve AI model task performance. It runs each evaluation by comparing model outputs with and without the skill loaded, uses a judge model to grade the results, and generates a static HTML report. It supports CLI usage, TypeScript integration, custom model providers such as Ollama and vLLM, and CI pipelines.
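
To make the with/without comparison concrete, here is a minimal Python sketch of the general idea. It does not use agent-skills-eval's actual API, which this page does not describe. Every name in it (`evaluate`, `render_html`, `stub_model`, `stub_judge`, `CaseResult`) is a hypothetical stand-in, and the model and judge are trivial stubs rather than real LLM calls.

```python
import html
from dataclasses import dataclass
from typing import Callable, List, Optional

# Hypothetical signatures, not taken from agent-skills-eval:
# a model maps (prompt, optional skill text) to an output string,
# and a judge maps (prompt, output) to a score in [0, 1].
Model = Callable[[str, Optional[str]], str]
Judge = Callable[[str, str], float]


@dataclass
class CaseResult:
    prompt: str
    baseline_score: float  # judge score without the skill loaded
    skill_score: float     # judge score with the skill loaded

    @property
    def delta(self) -> float:
        # Positive means the skill improved the judged score.
        return self.skill_score - self.baseline_score


def evaluate(prompts: List[str], skill: str, model: Model, judge: Judge) -> List[CaseResult]:
    """Run each prompt twice (without and with the skill) and grade both outputs."""
    results = []
    for prompt in prompts:
        baseline_output = model(prompt, None)
        skill_output = model(prompt, skill)
        results.append(
            CaseResult(prompt, judge(prompt, baseline_output), judge(prompt, skill_output))
        )
    return results


def render_html(results: List[CaseResult]) -> str:
    """Render the results as a small static HTML table."""
    rows = "".join(
        f"<tr><td>{html.escape(r.prompt)}</td>"
        f"<td>{r.baseline_score:.2f}</td>"
        f"<td>{r.skill_score:.2f}</td>"
        f"<td>{r.delta:+.2f}</td></tr>"
        for r in results
    )
    header = "<tr><th>Prompt</th><th>Baseline</th><th>With skill</th><th>Delta</th></tr>"
    return f"<table>{header}{rows}</table>"


# Stubs standing in for a real model provider and a real judge model.
def stub_model(prompt: str, skill: Optional[str]) -> str:
    return f"{prompt} [formatted]" if skill else prompt


def stub_judge(prompt: str, output: str) -> float:
    return 1.0 if output.endswith("[formatted]") else 0.0


results = evaluate(["Summarize the report"], "Always format output.", stub_model, stub_judge)
report = render_html(results)
```

In a real setup the stubs would be replaced by calls to an actual model provider and judge model, and the report string would be written to a file. Averaging `delta` across many prompts gives a rough signal of whether the skill helps.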

Question 2

Which sources does this topic cover?

Accepted Answer

This topic currently covers 1 source platform and continuously aggregates related news, search, and social discussion signals.

Question 3

What traceable evidence is available for this topic?

Accepted Answer

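This page currently shows 1 piece of source evidence and 1 timeline node, and keeps the original source links so the information can be verified.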

Show HN: Agent-skills-eval – Test whether Agent Skills improve outputs

Why this topic matters

Keywords

Source evidence

Show HN: Agent-skills-eval – Test whether Agent Skills improve outputs

Timeline

Related topics

Vibe coding and agentic engineering are getting closer than I'd like

Higher usage limits for Claude and a compute deal with SpaceX

DeepSeek 4 Flash local inference engine for Metal

Natural Language Autoencoders: Turning Claude's Thoughts into Text
