Tigra 27-07-2025 18:59 Training LLMs to reason by making them play zero-sum games with each other: https://open.substack.com/pub/machinelearningatscale/p/doing-rl-without-the-costly-training ("Doing RL without the costly training data!") 👍3