The Chinese tech giant is the only non-US firm to crack the top five in Code Arena’s latest leaderboard
Alibaba owns the South China Morning Post.
Unlike traditional coding benchmarks such as HumanEval or SWE-bench, which rely on standardised tests, Code Arena users test how well models can independently build complete, interactive web applications from scratch, based on user prompts.
Users then vote on anonymised outputs in blind comparisons, meaning the leaderboard closely reflects the preferences of real-world developers.
The benchmark is run by Arena, an organisation founded by researchers from the University of California, Berkeley, in collaboration with the University of California, San Diego, and Carnegie Mellon University.
Source: News - South China Morning Post