

Post #1673

@qin2dim

QIN2DIM's Tech Channel

Views: 138
Posted: Sep 21 (09/21/2025, 07:12 PM)
Post content

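SWE-Bench Pro (Public Dataset): the core challenges include multi-file edits, changes averaging several hundred lines, and complex dependencies across large codebases. The model currently ranked first on this benchmark is gpt-5-high.
benchmark | Read the paper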