Abstract: Although Large Language Models (LLMs) are widely adopted for Python code generation, the generated code can be semantically incorrect, requiring iterations of evaluation and refinement. Test ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results