Paper accepted at ICML 2026

Our paper, "Benchmarking World-Model Learning via Environment-Level Queries," was accepted at ICML 2026. The paper introduces WorldTest, a protocol for evaluating world-model learning through environment-level queries that go beyond next-frame prediction, testing whether agents can predict unobserved states, plan action sequences, and detect changes in dynamics. We instantiate WorldTest with AutumnBench, a suite of 43 interactive grid-world environments and 129 tasks across three families: masked-frame prediction, planning, and predicting changes to causal dynamics. I’ll be presenting the poster at the conference.