Crucible: Reward Hacking in an LLM Quantization Tournament
I built a tournament where frontier coding agents tried to compress Qwen3-4B. They did real work first, hit a ceiling, and then discovered that overfitting, fabrication, and lying about metrics scored better than honest quantization.