
CellsWave Benchmark Report

Public distribution · v2 · 2026-04 · Hybrid drug discovery · Protease-focused

Executive Summary

CellsWave v1.0 is a proprietary accelerated virtual screening platform. This report summarises external validation on the community-standard DUD-E benchmark (Mysinger et al., J. Med. Chem. 2012), real-drug recovery on co-crystal structures, and library-scale speed measurements.

CellsWave v1.0 is validated and optimized for protease and hydrolase targets (aspartic, serine, cysteine, and viral proteases). Contact us at [email protected] to discuss your specific target.

Methodology

Input: SMILES strings. Output: ranked similarity scores.

Dataset: DUD-E (Directory of Useful Decoys, Enhanced), Mysinger et al. 2012, the community-standard benchmark for ligand-based virtual screening. For each target, actives are mixed with a property-matched decoy set approximately 50× larger. Methods must distinguish actives from topologically distinct but physically similar decoys.

For each target, a small set of known actives is used to form a query; the query molecules are held out from the library during scoring. The library is ranked by similarity score and evaluated against ground-truth active / decoy labels. Metrics reported: AUROC (area under ROC curve), EF@x% (enrichment factor in top x%), and a one-sided rank-test p-value.
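The two ranking metrics named above can be sketched in a few lines of Python. This is a minimal illustration, not the platform's evaluation code: the function names, the rounding of the top slice (rounded up, at least one molecule), and the tie handling in AUROC are assumptions. The rank-test p-value is left out because the report does not name the specific test.

```python
import math
from typing import Sequence


def enrichment_factor(scores: Sequence[float], labels: Sequence[int],
                      top_percent: float) -> float:
    """EF@x%: active rate in the top x% divided by the active rate overall."""
    n = len(scores)
    n_actives = sum(labels)
    if n == 0 or n_actives == 0:
        raise ValueError("need a non-empty library with at least one active")
    # Assumption: the top slice is rounded up and holds at least one molecule.
    k = max(1, math.ceil(n * top_percent / 100))
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    hits = sum(labels[i] for i in order[:k])
    return (hits / k) / (n_actives / n)


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Probability that a random active outscores a random decoy (ties count half)."""
    actives = [s for s, y in zip(scores, labels) if y == 1]
    decoys = [s for s, y in zip(scores, labels) if y == 0]
    if not actives or not decoys:
        raise ValueError("need at least one active and one decoy")
    wins = 0.0
    # Pairwise comparison is quadratic; fine for a sketch, slow for large libraries.
    for a in actives:
        for d in decoys:
            if a > d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(actives) * len(decoys))


# Toy library: 2 actives (label 1) among 10 molecules, scores already descending.
toy_scores = [0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10, 0.05]
toy_labels = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
print(enrichment_factor(toy_scores, toy_labels, top_percent=20))
print(auroc(toy_scores, toy_labels))
```

On the toy library the top 20% holds one of the two actives, so EF@20% is 2.5, and 15 of the 16 active/decoy pairs are ordered correctly, so AUROC is 0.9375.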

Architectural disclosure. The internal architecture of the screening engine is proprietary and not disclosed in this report.

Results — DUD-E 10-target panel

All targets were benchmarked on the public DUD-E dataset. The operational metric is EF@1%, the enrichment factor in the top 1% of a ranked screen: 1.0× corresponds to random ranking, and the top slice is the part of a screen that clients act on. A target is considered screening-ready at EF@1% ≥ 4. AUROC is shown for reference.
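To put EF@1% values in context, it helps to know the ceiling the metric can reach for a given library. The sketch below computes that ceiling and applies the screening-ready threshold stated above. The function names and the round-up of the top slice are assumptions made for illustration, and the library sizes in the usage lines are hypothetical, not the per-target counts used in this report.

```python
import math


def max_enrichment_factor(n_library: int, n_actives: int, top_percent: float) -> float:
    """Best possible EF at a cut-off: every slot in the top slice is an active,
    until the actives run out."""
    if n_library <= 0 or not 0 < n_actives <= n_library:
        raise ValueError("need 0 < n_actives <= n_library")
    # Assumption: the top slice is rounded up and holds at least one molecule.
    k = max(1, math.ceil(n_library * top_percent / 100))
    best_hits = min(k, n_actives)
    # Same as (best_hits / k) / (n_actives / n_library), written to avoid
    # intermediate rounding.
    return best_hits * n_library / (k * n_actives)


def is_screening_ready(ef_1pct: float, threshold: float = 4.0) -> bool:
    """Threshold from the report: screening-ready at EF@1% >= 4."""
    return ef_1pct >= threshold


# Hypothetical library: 100 actives plus 50 decoys per active.
print(max_enrichment_factor(n_library=5100, n_actives=100, top_percent=1))
print(is_screening_ready(27.69), is_screening_ready(0.90))
```

With exactly 50 decoys per active the ceiling is 51. Real target sets deviate from the nominal 50:1 ratio, and holding query actives out of the library shifts the ratio further, so the ceiling differs from target to target.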

Validated targets (EF@1% ≥ 4, hybrid scoring)

Target | Disease area | Class | EF@1% | EF@5% | AUROC
RENI | Hypertension | protease (asp) | 66.23 |  | 0.899
BACE1 | Alzheimer's | protease (asp) | 56.24 |  | 0.775
HIVPR | HIV / AIDS | protease (asp) | 40.09 |  | 0.858
THRB | Thrombosis | protease (ser) | 38.61 |  | 0.900
TRY1 | Broad protease | protease (ser) | 35.02 |  | 0.820
ACES | Alzheimer's / neuro | hydrolase (esterase) | 28.57 |  | 0.687
GLCM* | Gaucher disease | hydrolase (glycosidase) | 27.69 |  | 0.790

Results were measured with hybrid scoring. Pure shape-only channel results are available in the reproducibility pack.

Outside current envelope (EF@1% < 2)

Target | Disease area | Class | EF@1% | AUROC
FA7 | Anticoagulation | protease (ser, coagulation) | 0.90 | 0.685
FA10 | Anticoagulation | protease (ser, coagulation) | 0.75 | 0.533
UROK | Thrombolysis | protease (ser, coagulation) | 0.00 | 0.542

The validated panel spans aspartic proteases, classical serine proteases, and two hydrolase sub-classes. The three unsupported targets are all coagulation-cascade serine proteases; other ligand-based industry methods also report reduced performance on this sub-class with DUD-E decoys. We disclose both panels rather than cherry-pick.

* GLCM benchmark: n = 553 molecules (small sample). EF@1% is stable but has a wider confidence interval than the larger target sets.
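The small-sample effect is easy to see numerically: the top 1% of 553 molecules is only five or six compounds, so a single hit more or less moves EF@1% noticeably. The sketch below estimates an interval with a percentile bootstrap. This is a generic stand-in technique chosen for illustration; the report does not state how its confidence interval was obtained, and all function names and defaults here are assumptions.

```python
import math
import random
from typing import List, Sequence, Tuple


def top_slice_size(n: int, top_percent: float) -> int:
    """Molecules in the top slice (assumption: rounded up, at least one)."""
    return max(1, math.ceil(n * top_percent / 100))


def ef_at(scores: Sequence[float], labels: Sequence[int], top_percent: float) -> float:
    """Enrichment factor in the top x% of the ranked list."""
    n = len(scores)
    k = top_slice_size(n, top_percent)
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    hits = sum(labels[i] for i in order[:k])
    return (hits / k) / (sum(labels) / n)


def bootstrap_ef_interval(scores: Sequence[float], labels: Sequence[int],
                          top_percent: float = 1.0, n_resamples: int = 1000,
                          alpha: float = 0.05, seed: int = 0) -> Tuple[float, float]:
    """Percentile bootstrap interval for EF, resampling molecules with replacement."""
    if sum(labels) == 0:
        raise ValueError("need at least one active")
    rng = random.Random(seed)  # fixed seed so the interval is repeatable
    n = len(scores)
    estimates: List[float] = []
    while len(estimates) < n_resamples:
        idx = [rng.randrange(n) for _ in range(n)]
        resampled_labels = [labels[i] for i in idx]
        if sum(resampled_labels) == 0:
            continue  # EF is undefined without actives; draw again
        estimates.append(ef_at([scores[i] for i in idx], resampled_labels, top_percent))
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lower, upper


print(top_slice_size(553, 1))  # size of the top-1% slice for n = 553
```

Running `bootstrap_ef_interval` on the same scores at different library sizes gives a direct view of how the interval widens as n shrinks.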

Real-Drug Validation (approved drugs recovered)

Each approved drug was used as its own reference query against its known co-crystal structure and ranked within a 2.83M-molecule library. A rank in the top five, as obtained for all six drugs, is direct evidence that the platform recovers known therapeutics from real drug-discovery programs.

Drug | Disease | Target PDB | Rank in 2.83M
Lopinavir | HIV/AIDS | 1MUI | #1
Aliskiren | Hypertension | 2V0Z | #2
Ritonavir | HIV/AIDS | 1HXW | #3
Nirmatrelvir | COVID-19 | 6LU7 | #3
Oseltamivir | Influenza | 2HU0 | #3
Telaprevir | HCV | 3SU3 | #5
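For scale, the ranks in the table can be expressed as a share of the library. The short sketch below does that arithmetic; it assumes the rounded figure 2.83M means 2,830,000 molecules, and the helper name is hypothetical.

```python
# Assumption: "2.83M" is taken as exactly 2,830,000 molecules.
LIBRARY_SIZE = 2_830_000

# Ranks copied from the table above.
RANKS = {
    "Lopinavir": 1,
    "Aliskiren": 2,
    "Ritonavir": 3,
    "Nirmatrelvir": 3,
    "Oseltamivir": 3,
    "Telaprevir": 5,
}


def rank_as_percent(rank: int, library_size: int = LIBRARY_SIZE) -> float:
    """Position of a ranked molecule as a percentage of the library."""
    if not 1 <= rank <= library_size:
        raise ValueError("rank must lie between 1 and library_size")
    return 100.0 * rank / library_size


for drug, rank in RANKS.items():
    print(f"{drug}: rank {rank}, top {rank_as_percent(rank):.6f}% of the library")
```

Even the lowest-placed drug in the panel, at rank 5, falls within the top 0.001% of the library.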


Speed Benchmark

Metric | Value
Core search | 16.77 ms (shape channel, GPU)
Hybrid query | 3.3 s end-to-end (shape + structural similarity scoring)
API latency | ~3.3 s per hybrid query over HTTPS
Library scale | 2.83M molecules
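The timings above translate into throughput with simple division. The sketch below assumes that both the core-search and hybrid-query timings cover the full 2.83M-molecule library, which the table implies but does not state outright, and that 2.83M means 2,830,000; the helper name is hypothetical.

```python
# Figures from the table; the library size is the rounded 2.83M (assumption).
LIBRARY_SIZE = 2_830_000
CORE_SEARCH_S = 16.77e-3   # shape channel, GPU
HYBRID_QUERY_S = 3.3       # end-to-end hybrid query


def molecules_per_second(n_molecules: int, seconds: float) -> float:
    """Average screening throughput over one query."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return n_molecules / seconds


core_rate = molecules_per_second(LIBRARY_SIZE, CORE_SEARCH_S)
hybrid_rate = molecules_per_second(LIBRARY_SIZE, HYBRID_QUERY_S)
print(f"core search:  {core_rate:,.0f} molecules/s")
print(f"hybrid query: {hybrid_rate:,.0f} molecules/s")
```

Under these assumptions the core search works out to roughly 1.7 × 10^8 molecules per second and the hybrid query to roughly 8.6 × 10^5 molecules per second.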

Comparison with Industry Baselines

Method | HIV-1 PR AUROC | Speed
CellsWave v1.0 hybrid | 0.858 | 3.3 s per query on 2.83M molecules
Glide SP | ~0.75 | hours per 1K molecules
AutoDock Vina | ~0.65 | days per 1K molecules

CellsWave's HIV-1 PR AUROC of 0.858 sits just above the published range for enterprise docking (Glide SP band 0.70–0.82) while delivering library-scale screening several orders of magnitude faster. Industry AUROC ranges are taken from the standard literature for DUD-E targets.
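The "orders of magnitude" figure can be checked with a rough calculation. The sketch below assumes the most favourable reading of "hours per 1K molecules", namely one hour per 1,000 molecules, and scales it linearly to 2,830,000 molecules. Both are assumptions made for illustration, not measured docking timings, and the helper names are hypothetical.

```python
import math


def seconds_for_library(n_molecules: int, seconds_per_thousand: float) -> float:
    """Linear extrapolation of a per-1K-molecule timing to a whole library."""
    return n_molecules / 1000 * seconds_per_thousand


def orders_of_magnitude(slow_seconds: float, fast_seconds: float) -> float:
    """Base-10 logarithm of the speed-up ratio."""
    return math.log10(slow_seconds / fast_seconds)


# Assumption: 1 hour per 1K molecules, the low end of "hours per 1K molecules".
docking_seconds = seconds_for_library(2_830_000, seconds_per_thousand=3600)
speedup = orders_of_magnitude(docking_seconds, fast_seconds=3.3)
print(f"docking estimate: {docking_seconds / 3600:,.0f} h")
print(f"speed-up: about 10^{speedup:.1f}")
```

Under this assumption the docking estimate is 2,830 hours for the full library, and the ratio to a 3.3 s hybrid query is between six and seven orders of magnitude.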

Scope of Validation

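CellsWave v1.0 is validated by direct DUD-E benchmark on 10 targets, a 6-drug real-drug recovery panel on co-crystal structures, and a SARS-CoV-2 Mpro (PDB 6LU7) case study. The validated sub-envelope comprises aspartic proteases (renin, BACE-1, HIV-1 PR), classical serine proteases (thrombin, trypsin), and two hydrolase sub-classes (acetylcholinesterase, glucocerebrosidase).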

Known non-supported sub-envelope: coagulation-cascade serine proteases (factor Xa, factor VIIa, urokinase) — EF@1% < 2 on DUD-E decoys. This failure mode is shared with several published ligand-based methods on these specific targets and is reported transparently.

Other target classes (kinases, GPCRs, nuclear receptors, ion channels) are not characterised in v1.0. Contact [email protected] to discuss your specific target.

Scope of Use

Program area | Representative targets (validated + likely envelope)
Antiviral programs | HIV-1 PR (validated, EF@1% = 40.09); SARS-CoV-2 Mpro (case study, Nirmatrelvir recovered at rank #3 on PDB 6LU7)
Metabolic / cardiovascular proteases | renin (validated, hypertension, EF@1% = 66.23); BACE-1 (validated, Alzheimer's, EF@1% = 56.24)
General proteolysis | thrombin (validated, EF@1% = 38.61); trypsin (validated, EF@1% = 35.02)
Human hydrolases | acetylcholinesterase (validated, EF@1% = 28.57); glucocerebrosidase (validated, small-n, EF@1% = 27.69)
Not supported in v1.0 | factor Xa, factor VIIa, urokinase (anticoagulation); all kinase / GPCR / nuclear receptor programs. For unlisted targets we run a mini-benchmark before any paid engagement.

Reproducibility

DUD-E benchmark results in this report can be reproduced by independent external reviewers. Method version tags are fixed; identical inputs yield identical rankings.
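A reviewer can check the "identical inputs yield identical rankings" property by fingerprinting the ranked output of two runs and comparing the digests. The sketch below is one way to do that with the Python standard library; the function name and the molecule identifiers are hypothetical, and nothing here reflects the format of the actual reproducibility pack.

```python
import hashlib
from typing import Iterable


def ranking_fingerprint(ranked_ids: Iterable[str]) -> str:
    """SHA-256 digest over (position, molecule id) pairs of a ranked list."""
    digest = hashlib.sha256()
    for position, mol_id in enumerate(ranked_ids, start=1):
        # Position is included so that any reordering changes the digest.
        digest.update(f"{position}\t{mol_id}\n".encode("utf-8"))
    return digest.hexdigest()


# Hypothetical identifiers standing in for two runs on the same inputs.
run_a = ["mol_0003", "mol_0001", "mol_0002"]
run_b = ["mol_0003", "mol_0001", "mol_0002"]
print(ranking_fingerprint(run_a) == ranking_fingerprint(run_b))
```

Comparing digests rather than full ranked lists keeps the check cheap at library scale, and a mismatch signals that some position in the ranking changed between runs.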

Independent validation is available on request to qualified enquirers. Reviewers receive a reproducibility pack sufficient to regenerate the numbers in this report on their own hardware, together with a technical briefing on architectural properties relevant to evaluation.

Need the full technical report?

Request the reproducibility pack and per-target breakdown. We share them with qualified enquirers, along with API access credentials.


This document is for public distribution.