On AI infrastructure, agent reliability, and what it takes to make autonomous systems trustworthy enough to ship.
I built an AI research system that now publishes 389 canonical papers after duplicate cleanup and later corpus imports. Its strict audit gate now passes the full corpus, and that gate remains visible on the front page instead of being hidden behind a green checkmark. Here is why every AI research release should show its evidence contract.
When an AI agent finishes a task, you need to know more than "it stopped generating." You need to know whether it finished cleanly, whether the output matches the objective, whether there's evidence for the claims, and whether anyone can reproduce what happened. Most agent systems answer none of these questions.
Notes from a private lab campaign patching FlashInfer, CUTLASS blockscaled TMA, and vLLM for NVIDIA's Blackwell consumer variant. Covers the CuTe-DSL fallback path, illegal-instruction traces, misaligned-address patterns, and why sm_121 breaks most vendor kernels in ways H100 and B200 do not.