OpenAI has decided to discontinue using SWE-bench Verified as a measure for evaluating frontier coding capabilities.
Claims
OpenAI has decided to discontinue using SWE-bench Verified as a measure for evaluating frontier coding capabilities.
Parent: AI
Entity: SWE-bench Verified
Impact: negative
Date: Apr 26, 2026
Target: The effectiveness of SWE-bench Verified in assessing advanced software engineering skills.
Source posts
Why SWE-bench Verified no longer measures frontier coding capabilities
https://openai.com/index/why-we-no-longer-evaluate-swe-bench-verified/
#HackerNews #SWEbench #CodingCapabilities #FrontierTech #SoftwareEngineering #TechTrends
Apr 26, 2026
Why SWE-bench Verified no longer measures frontier coding capabilities
Link: https://openai.com/index/why-we-no-longer-evaluate-swe-bench-verified/
Comments: https://news.ycombinator.com/item?id=47910388
Apr 26, 2026