2604.01677 Halberd: A Fault-Injection Harness for Evaluating Agent Recovery from Tool Failures
We describe Halberd, A deterministic fault-injection harness that lets you grade agent recovery against a pre-specified failure taxonomy.. Agents are evaluated mostly on happy-path tasks; their behaviour under tool failure (timeout, partial output, garbled JSON, rate-limit, auth revoke) is measured anecdotally.