A Commitment to Transparent Benchmarking
Our mission is to provide an open, data-driven, and reproducible benchmark for AI-powered smart contract security tools.
The Challenge
The rise of AI in Web3 security has introduced a new generation of powerful tools. However, without a standardized evaluation framework, it's difficult for developers and auditors to assess their true effectiveness. Performance claims can be opaque, and comparing tools is often an apples-to-oranges exercise. SCABench was created to solve this problem by establishing a clear, community-vetted methodology for testing and ranking these tools.
Our Core Principles
Verifiable Transparency
Our datasets, raw tool outputs, and final evaluated results will be publicly available, allowing for independent auditing of our findings.
Reproducibility
We will provide the necessary tools and datasets for anyone to independently verify our results.
Community-Driven
Our framework is not set in stone. We actively invite feedback from security researchers, tool developers, and the wider Web3 community to continuously refine our process.
The Proposed Benchmark Framework
Dataset Curation
We will compile a diverse and challenging dataset of Solidity smart contracts, including real-world vulnerabilities from past exploits (e.g., reentrancy, integer overflows), challenges from platforms like Ethernaut and Damn Vulnerable DeFi, and high-quality audited contracts to test for false positives.
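To make this concrete, each curated contract could be recorded with its provenance and known issues. The sketch below uses a hypothetical schema (field names are illustrative, not the finalized SCABench format):

```python
from dataclasses import dataclass, field

# Hypothetical representation of one benchmark case. The actual
# SCABench dataset schema may differ once the methodology is finalized.
@dataclass
class BenchmarkCase:
    contract_path: str                  # path to the Solidity source file
    source: str                         # e.g. "real-world-exploit", "ethernaut", "audited-clean"
    known_vulnerabilities: list = field(default_factory=list)  # empty for clean contracts

# A vulnerable contract drawn from a past exploit...
exploit_case = BenchmarkCase(
    contract_path="contracts/VulnerableVault.sol",
    source="real-world-exploit",
    known_vulnerabilities=["reentrancy"],
)

# ...and a clean, audited contract used to measure false positives.
clean_case = BenchmarkCase(
    contract_path="contracts/AuditedToken.sol",
    source="audited-clean",
)
```

Including clean contracts alongside vulnerable ones is what lets the benchmark measure precision, not just detection.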
Key Evaluation Metrics
Each tool will be scored against several key metrics, including Vulnerability Detection Rate (recall), Precision (the share of reported findings that are genuine vulnerabilities, which penalizes false positives), Performance (scan speed), and Gas Optimization Analysis.
Scoring Algorithm
Each tool's final ranking is determined by a weighted score across these metrics. Our datasets and overall scoring framework are open and public. To ensure accurate vulnerability matching, we use a specialized correlation algorithm. While the algorithm itself is proprietary, its inputs and outputs will be fully auditable to ensure a fair and verifiable process.
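A weighted score of this kind can be sketched as follows. The weights below are purely illustrative; the actual weighting belongs to the public scoring framework and may differ:

```python
# Illustrative weights only -- not the finalized SCABench weighting.
WEIGHTS = {"recall": 0.4, "precision": 0.3, "speed": 0.2, "gas": 0.1}

def weighted_score(metrics: dict) -> float:
    """Combine normalized per-metric scores (each in [0, 1]) into one rank score."""
    return sum(WEIGHTS[name] * metrics[name] for name in WEIGHTS)

# Hypothetical tool with each metric normalized to [0, 1].
tool = {"recall": 0.8, "precision": 0.9, "speed": 0.7, "gas": 0.5}
print(round(weighted_score(tool), 2))  # 0.78
```

Because the weights and per-metric scores are published, anyone can recompute a tool's final score and verify its rank, even though the internal vulnerability-matching step is proprietary.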
Current Status & Roadmap
We are currently in the community feedback phase, working towards our next major milestone.
- Q4 2025: Finalize V1 of the methodology and conduct the initial benchmark run.
Get Involved!
Our methodology will be strongest with your input. We invite you to review our plan and join the discussion.