

Talk

LMArena: An Open Platform for Crowdsourced AI Benchmarks

Wei-Lin Chiang

Mission City Ballroom
Mon 12 May 11:05 a.m. PDT — 11:25 a.m. PDT

Abstract:

Recent advances in AI have unlocked new capabilities and applications; however, evaluating these systems still poses significant challenges. We introduce LMArena, an open platform for evaluating AI models based on human preferences. Our methodology uses pairwise comparisons and draws on crowdsourced input from a global user base. The platform has been operational for over two years and has collected ~3 million community votes. LMArena has emerged as one of the most popular LLM leaderboards and is widely referenced by leading LLM developers and companies. Our website is publicly available at https://lmarena.ai
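The abstract does not say how pairwise votes are turned into a leaderboard. As an illustration only, the sketch below fits a Bradley-Terry model, one common choice for pairwise preference data, using a simple iterative update. The function name `bradley_terry`, the `(winner, loser)` vote format, and the model names are hypothetical and are not taken from LMArena.

```python
from collections import defaultdict


def bradley_terry(votes, iterations=200):
    """Estimate relative strengths from pairwise votes (hypothetical helper).

    votes: list of (winner, loser) tuples naming two distinct models.
    Returns a dict mapping each model to a strength; strengths sum to 1.
    Assumes every model wins at least one comparison, so no strength
    collapses to zero.
    """
    wins = defaultdict(int)         # total wins per model
    pair_counts = defaultdict(int)  # number of comparisons per unordered pair
    models = set()
    for winner, loser in votes:
        wins[winner] += 1
        pair_counts[frozenset((winner, loser))] += 1
        models.update((winner, loser))

    # Start from equal strengths and repeat the standard
    # minorization-maximization update for the Bradley-Terry likelihood.
    strength = {m: 1.0 / len(models) for m in models}
    for _ in range(iterations):
        updated = {}
        for m in models:
            denom = sum(
                count / (strength[m] + strength[other])
                for pair, count in pair_counts.items()
                if m in pair
                for other in pair - {m}
            )
            updated[m] = wins[m] / denom
        total = sum(updated.values())
        strength = {m: s / total for m, s in updated.items()}
    return strength


# Toy votes between three made-up models.
votes = (
    [("model_a", "model_b")] * 3 + [("model_b", "model_a")]
    + [("model_b", "model_c")] * 3 + [("model_c", "model_b")]
    + [("model_a", "model_c")] * 3 + [("model_c", "model_a")]
)
scores = bradley_terry(votes)
ranking = sorted(scores, key=scores.get, reverse=True)
```

Under this model, the estimated probability that one model is preferred over another is its strength divided by the sum of the two strengths, so a ranking follows from sorting by strength. A real leaderboard would also need to handle ties, confidence intervals, and vote quality, which this sketch leaves out.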
