Abstract: In this paper, we present CAST-Eval, a novel, comprehensive and domain-specific benchmark designed to assess the knowledge and reasoning capabilities of large language models (LLMs) in the ...
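*Note: The prompts for all tasks underwent rigorous human evaluation to ensure that they are suitable for different models. The prompt evaluation panel consisted of 8 graduate students and 2 ...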
Abstract: Client-side attacks have become very popular in recent years. Consequently, third-party client software, such as Adobe's Acrobat Reader, remains a popular vector for infections. In order to ...
TAJS is a dataflow analysis for JavaScript that infers type information and call graphs. The current version of the analysis contains a model of ECMAScript 3rd edition, including the standard library, ...