Medmarks v0.1: A new LLM benchmark suite of medical tasks
Sophont (blog), 2025
Launches the Medmarks evaluation suite, which comprises 20 verifiable and open-ended medical benchmarks, evaluates 46 models across 56 configurations, and provides RL-ready environments for tracking LLM medical capabilities.