A Critical Look at AI Model Testing and the Risk of Overstated Abilities Recent findings from a new peer-reviewed study ...