Abstract: In this paper, we present CAST-Eval, a novel, comprehensive and domain-specific benchmark designed to assess the knowledge and reasoning capabilities of large language models (LLMs) in the ...