Amazon Launches SWE-PolyBench: A Multilingual Benchmark for Evaluating AI Coding Agents
Amazon has introduced SWE-PolyBench, a benchmark designed to evaluate AI coding agents across multiple programming languages, including Java, JavaScript, TypeScript, and Python. It addresses the limitations of earlier benchmarks such as SWE-bench, which primarily focused on Python and simple bug fixes. SWE-PolyBench is broader in scope, featuring over 2,000 curated issues that reflect real-world ...

