UI-Bench Leaderboard

Comprehensive evaluation of AI tools for frontend generation, ranked by expert assessment across diverse design challenges.

Overall leaderboard aggregated across n = 4,047 blinded pairwise matches
[Chart: "Tool win rates vs loss rates". Series: Rating (μ, with 95% CI) and Win Rate for ten tools, ordered by rating from Orchids (30.08, 67.5%) down to Replit (20.95, 38.9%). Per-tool values are listed in the complete performance data.]

Rankings based on TrueSkill model

Bar lengths are proportional to ratings


Blue bars represent win rates as percentages
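
The leaderboard says rankings come from a TrueSkill model but does not publish its parameters. As a rough illustration of how two Gaussian ratings translate into a head-to-head win probability, the sketch below uses the standard TrueSkill-style formula Φ((μa − μb) / sqrt(2β² + σa² + σb²)). The value β = 25/6 is an assumption (the commonly used TrueSkill default), draws are ignored, and the function name `win_probability` is hypothetical.

```python
import math

# Assumption: performance noise beta set to the common TrueSkill default (25/6).
# The leaderboard does not state which value it actually uses.
BETA = 25.0 / 6.0


def win_probability(mu_a: float, sigma_a: float,
                    mu_b: float, sigma_b: float,
                    beta: float = BETA) -> float:
    """Estimate P(A beats B) from two Gaussian skill ratings, ignoring draws."""
    # Combined spread: both players' performance noise plus both rating uncertainties.
    spread = math.sqrt(2.0 * beta ** 2 + sigma_a ** 2 + sigma_b ** 2)
    z = (mu_a - mu_b) / spread
    # Standard normal CDF expressed through the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


if __name__ == "__main__":
    # Ratings taken from the leaderboard: Orchids (30.08, 1.77) vs Replit (20.95, 1.73).
    p = win_probability(30.08, 1.77, 20.95, 1.73)
    print(f"Estimated P(Orchids beats Replit): {p:.3f}")
```

This estimate is model-based and need not match the observed win rates, which are computed over matches against all opponents.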

Complete performance data
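| Rank | Tool | Rating (μ) | Uncertainty (σ) | 95% CI | Win Rate |
|---|---|---|---|---|---|
| #1 | Orchids | 30.08 | 1.77 | [26.61, 33.55] | 67.5% |
| #2 | Figma Make | 27.46 | 1.71 | [24.11, 30.81] | 57.1% |
| #3 | Lovable | 27.14 | 1.72 | [23.77, 30.51] | 54.8% |
| #4 | Anything | 25.46 | 1.69 | [22.15, 28.77] | 51.2% |
| #5 | Bolt | 24.44 | 1.68 | [21.15, 27.73] | 48.9% |
| #6 | Magic Patterns | 24.23 | 1.70 | [20.90, 27.56] | 47.0% |
| #7 | Same.new | 23.57 | 1.70 | [20.24, 26.90] | 45.8% |
| #8 | Base44 by Wix | 23.47 | 1.69 | [20.16, 26.78] | 47.4% |
| #9 | v0 | 22.24 | 1.72 | [18.87, 25.61] | 41.2% |
| #10 | Replit | 20.95 | 1.73 | [17.56, 24.34] | 38.9% |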
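
The published 95% intervals are consistent with μ ± 1.96σ to two decimal places. The sketch below assumes that normal-approximation formula (the leaderboard does not state how the intervals were derived) and adds a simple overlap check. The names `Rating`, `ci95`, and `intervals_overlap` are hypothetical.

```python
from typing import NamedTuple, Tuple

# Assumption: intervals are mu +/- 1.96 * sigma (two-sided 95% normal interval).
Z_95 = 1.96


class Rating(NamedTuple):
    tool: str
    mu: float
    sigma: float


def ci95(r: Rating) -> Tuple[float, float]:
    """Return the (lower, upper) 95% interval, rounded to two decimals."""
    half_width = Z_95 * r.sigma
    return (round(r.mu - half_width, 2), round(r.mu + half_width, 2))


def intervals_overlap(a: Rating, b: Rating) -> bool:
    """True when the two 95% intervals share at least one point."""
    a_lo, a_hi = ci95(a)
    b_lo, b_hi = ci95(b)
    return a_lo <= b_hi and b_lo <= a_hi


if __name__ == "__main__":
    orchids = Rating("Orchids", 30.08, 1.77)
    figma = Rating("Figma Make", 27.46, 1.71)
    replit = Rating("Replit", 20.95, 1.73)
    print(orchids.tool, ci95(orchids))
    print("Orchids vs Figma Make overlap:", intervals_overlap(orchids, figma))
    print("Orchids vs Replit overlap:", intervals_overlap(orchids, replit))
```

Under this assumption, the intervals for Orchids and Figma Make overlap while those for Orchids and Replit do not, so differences between adjacent ranks should be read with more caution than differences between the top and bottom of the table.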