Reproducible benchmarks for AI systems. Public datasets, clear methodology, and open results. Built to advance the field, not just promote products.
Most AI benchmarks are marketing tools. They're designed to make specific products look good, not to advance the field. The methodology is hidden, the datasets are proprietary, and the results can't be reproduced.
I'm building open benchmarks because I believe the field needs better standards: public datasets, a documented methodology, and results that anyone can reproduce.
If you're building AI systems, these benchmarks give you a way to measure progress objectively. If you're evaluating vendors, they give you a common standard to compare against.
All benchmarks are open source. Submit results, suggest improvements, or use them to evaluate your systems.