Talk
LMArena: An Open Platform for Crowdsourced AI Benchmarks
Wei-Lin Chiang
Mission City Ballroom
Abstract:
Recent advances in AI have unlocked new capabilities and applications; however, evaluating AI systems still poses significant challenges. We introduce LMArena, an open platform for evaluating AI based on human preferences. Our methodology employs a pairwise comparison approach and leverages input from a global user base through crowdsourcing. The platform has been operational for over two years, collecting ~3 million community votes. LMArena has emerged as one of the most popular LLM leaderboards, widely referenced by leading LLM developers and companies. Our website is publicly available at https://lmarena.ai
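To make the pairwise comparison idea concrete, the following is a minimal sketch of how crowdsourced votes could be turned into a ranking. The abstract does not say how votes are aggregated, so the Bradley-Terry model used here is an assumption chosen for illustration, not a description of the platform's actual pipeline. The function name `bradley_terry`, the `(winner, loser)` vote format, and the model names are all hypothetical.

```python
from collections import defaultdict


def bradley_terry(votes, iters=200):
    """Estimate Bradley-Terry strengths from (winner, loser) votes.

    Assumptions (not from the abstract): ties are excluded, the two
    models in a vote are distinct, and every model wins at least once.
    Returns a dict mapping model name to a strength; strengths sum to 1.
    """
    wins = defaultdict(float)         # total wins per model
    pair_counts = defaultdict(float)  # number of votes per unordered pair
    models = set()
    for winner, loser in votes:
        wins[winner] += 1
        pair_counts[frozenset((winner, loser))] += 1
        models.update((winner, loser))

    strengths = {m: 1.0 for m in models}
    for _ in range(iters):
        updated = {}
        for i in models:
            # Standard minorize-maximize update:
            # p_i = wins_i / sum_j n_ij / (p_i + p_j)
            denom = 0.0
            for j in models:
                if j == i:
                    continue
                n_ij = pair_counts.get(frozenset((i, j)), 0.0)
                if n_ij:
                    denom += n_ij / (strengths[i] + strengths[j])
            updated[i] = wins[i] / denom if denom else strengths[i]
        total = sum(updated.values())
        strengths = {m: s / total for m, s in updated.items()}
    return strengths


if __name__ == "__main__":
    # Hypothetical votes: each tuple is (preferred model, other model).
    votes = (
        [("model_a", "model_b")] * 3 + [("model_b", "model_a")]
        + [("model_b", "model_c")] * 3 + [("model_c", "model_b")]
        + [("model_a", "model_c")] * 3 + [("model_c", "model_a")]
    )
    scores = bradley_terry(votes)
    for name in sorted(scores, key=scores.get, reverse=True):
        print(f"{name}: {scores[name]:.3f}")
```

Under this model, the probability that one model is preferred over another is its strength divided by the sum of the two strengths, so a leaderboard can be read directly off the sorted strengths.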