A Commitment to Transparent Benchmarking
Our mission is to provide an open, data-driven, and reproducible benchmark for AI-powered smart contract security tools.
The Challenge
The rise of AI in Web3 security has introduced a new generation of powerful tools. However, without a standardized evaluation framework, it's difficult for developers and auditors to assess their true effectiveness. Performance claims can be opaque, and comparing tools is often an apples-to-oranges exercise. SCABench was created to solve this problem by establishing a clear, community-vetted methodology for testing and ranking these tools.
Our Core Principles
Verifiable Transparency
Our datasets, raw tool outputs, and final evaluated results will be publicly available, allowing for independent auditing of our findings.
Reproducibility
We will provide the necessary tools and datasets for anyone to independently verify our results.
Community-Driven
Our framework is not set in stone. We actively invite feedback from security researchers, tool developers, and the wider Web3 community to continuously refine our process.
The Proposed Benchmark Framework
Dataset Curation
We will compile a diverse and challenging dataset of Solidity smart contracts, including real-world vulnerabilities from past exploits (e.g., reentrancy, integer overflows), challenges from platforms like Ethernaut and Damn Vulnerable DeFi, and high-quality audited contracts to test for false positives.
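To make this concrete, each curated contract could be recorded with its provenance and known issues. The sketch below uses a hypothetical schema (field names are illustrative, not the finalized SCABench format):

```python
from dataclasses import dataclass, field

# Hypothetical representation of one benchmark case. The actual
# SCABench dataset schema may differ once the methodology is finalized.
@dataclass
class BenchmarkCase:
    contract_path: str                  # path to the Solidity source file
    source: str                         # e.g. "real-world-exploit", "ethernaut", "audited-clean"
    known_vulnerabilities: list = field(default_factory=list)  # empty for clean contracts

# A vulnerable contract drawn from a past exploit...
exploit_case = BenchmarkCase(
    contract_path="contracts/VulnerableVault.sol",
    source="real-world-exploit",
    known_vulnerabilities=["reentrancy"],
)

# ...and a clean, audited contract used to measure false positives.
clean_case = BenchmarkCase(
    contract_path="contracts/AuditedToken.sol",
    source="audited-clean",
)
```

Including clean contracts alongside vulnerable ones is what lets the benchmark measure precision, not just detection.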
Key Evaluation Metrics
Each tool will be scored against several key metrics, including Vulnerability Detection Rate (recall), Precision (the share of reported findings that are genuine vulnerabilities, which penalizes false positives), Performance (scan speed), and Gas Optimization Analysis.
Scoring Algorithm
Each tool's final ranking is determined by a weighted score across these metrics. Our datasets and overall scoring framework are open and public. To ensure accurate vulnerability matching, we use a specialized correlation algorithm. While the algorithm itself is proprietary, its inputs and outputs will be fully auditable to ensure a fair and verifiable process.
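A weighted score of this kind can be sketched as follows. The weights below are purely illustrative; the actual weighting belongs to the public scoring framework and may differ:

```python
# Illustrative weights only -- not the finalized SCABench weighting.
WEIGHTS = {"recall": 0.4, "precision": 0.3, "speed": 0.2, "gas": 0.1}

def weighted_score(metrics: dict) -> float:
    """Combine normalized per-metric scores (each in [0, 1]) into one rank score."""
    return sum(WEIGHTS[name] * metrics[name] for name in WEIGHTS)

# Hypothetical tool with each metric normalized to [0, 1].
tool = {"recall": 0.8, "precision": 0.9, "speed": 0.7, "gas": 0.5}
print(round(weighted_score(tool), 2))  # 0.78
```

Because the weights and per-metric scores are published, anyone can recompute a tool's final score and verify its rank, even though the internal vulnerability-matching step is proprietary.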
Current Status & Roadmap
We are currently in the community feedback phase, working towards our next major milestone.
- Q4 2025: Finalize V1 of the methodology and conduct the initial benchmark run.
Get Involved!
Our methodology will be strongest with your input. We invite you to review our plan and join the discussion.