I Put My AI Research System's Failure Rate on Its Own Front Page

May 2026

I built an AI research system that now publishes 389 canonical papers after duplicate cleanup and later corpus imports. Its strict audit gate now passes the full corpus, and that gate remains visible on the front page instead of being hidden behind a green checkmark. Here is why every AI research release should show its evidence contract.

An Agent Run Is Not Done When the Model Stops Talking

May 2026

When an AI agent finishes a task, you need to know more than "it stopped generating." You need to know whether it finished cleanly, whether the output matches the objective, whether there's evidence for the claims, and whether anyone can reproduce what happened. Most agent systems answer none of these questions.

Debugging Xid 13 and Xid 43 on NVIDIA sm_121 with No Vendor Support

In progress

Notes from a private lab campaign patching FlashInfer, CUTLASS blockscaled TMA, and vLLM for NVIDIA's Blackwell consumer variant. Covers the CuTe-DSL fallback path, illegal-instruction traces, misaligned-address patterns, and why sm_121 breaks most vendor kernels in ways H100 and B200 do not.