ICLR Invited Talk: Designing, Building, and Training Effective Software Engineering Agents by Xingyao Wang

Invited talk in Workshop: The 3rd DL4C Workshop: Emergent Possibilities and Challenges in Deep Learning for Code

Xingyao Wang

Sun 27 Apr 8:10 p.m. PDT — 8:50 p.m. PDT

Abstract:

Recent advances in large language models (LLMs) have transformed them from simple code-completion tools into sophisticated software engineering agents. In this talk, I will explore three complementary dimensions of this evolution. I'll begin with CodeAct, a framework that enables agents to execute code as their primary action mechanism, creating a flexible yet powerful action space that outperforms traditional methods by achieving 20% higher success rates while requiring 30% fewer actions on complex tasks. Next, I'll introduce OpenHands (formerly OpenDevin), an open-source platform for developing generalist coding agents that interact with sandboxed environments through code writing, command execution, and web browsing—just as human developers do. This community-driven project has garnered over 52k GitHub stars and attracts 290+ active contributors worldwide. This platform supports rigorous evaluation across more than 15 tasks, including challenging benchmarks like SWE-Bench and WebArena. Finally, I'll present SWE-Gym, a training environment for software engineering agents, featuring 2,438 real-world Python tasks with executable runtimes and natural language specifications. LLMs trained in SWE-Gym show up to 19% absolute improvement on SWE-Bench, with 12% additional gains from inference-time scaling achieved through trained verifiers.
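The idea of executing code as the agent's action mechanism can be illustrated with a small sketch. This is not the CodeAct or OpenHands implementation: the function names `execute_action`, `scripted_policy`, and `run_episode` are hypothetical, the "policy" is a fixed script standing in for an LLM, and the code runs in-process with no sandboxing, which a real agent platform would need.

```python
import contextlib
import io
import traceback
from typing import Callable, List, Optional, Tuple

# One step of history: (code action, observation text)
Step = Tuple[str, str]


def execute_action(code: str, namespace: dict) -> str:
    """Run one code action; return printed output or a traceback as the observation."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            # Illustration only: exec runs in-process without any sandbox.
            exec(code, namespace)
    except Exception:
        # Errors become observations so the agent could react to them.
        buffer.write(traceback.format_exc())
    return buffer.getvalue()


def scripted_policy(history: List[Step]) -> Optional[str]:
    """Hypothetical stand-in for an LLM: emits the next code action, or None when done."""
    script = [
        "values = [3, 1, 2]",
        "print(sorted(values))",
    ]
    step = len(history)
    return script[step] if step < len(script) else None


def run_episode(policy: Callable[[List[Step]], Optional[str]],
                max_steps: int = 10) -> List[Step]:
    """Alternate between choosing a code action and observing its result."""
    namespace: dict = {}  # state persists across actions, like an interactive session
    history: List[Step] = []
    for _ in range(max_steps):
        action = policy(history)
        if action is None:
            break
        observation = execute_action(action, namespace)
        history.append((action, observation))
    return history


if __name__ == "__main__":
    for action, observation in run_episode(scripted_policy):
        print(f"action: {action!r} -> observation: {observation!r}")
```

Because every action shares one namespace, a variable defined in an earlier step remains available to later steps, and a failing action yields a traceback as its observation instead of ending the episode.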
