Public distribution · v2 · 2026-04 · Hybrid drug discovery · Protease-focused
CellsWave v1.0 is a proprietary accelerated virtual screening platform. This report summarises external validation on the community-standard DUD-E benchmark (Mysinger et al., J. Med. Chem. 2012), real-drug recovery on co-crystal structures, and library-scale speed measurements.
CellsWave v1.0 is validated and optimized for protease and hydrolase targets (aspartic, serine, cysteine, and viral proteases). Contact us at [email protected] to discuss your specific target.
Input: SMILES strings. Output: ranked similarity scores.
Dataset: DUD-E (Database of Useful Decoys: Enhanced), Mysinger et al. 2012 — the community-standard benchmark for ligand-based virtual screening. For each target, actives are mixed with a property-matched decoy set approximately 50× larger. Methods must distinguish actives from topologically distinct but physically similar decoys.
For each target, a small set of known actives is used to form a query; the query molecules are held out from the library during scoring. The library is ranked by similarity score and evaluated against ground-truth active / decoy labels. Metrics reported: AUROC (area under ROC curve), EF@x% (enrichment factor in top x%), and a one-sided rank-test p-value.
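As an illustration of how the AUROC and a one-sided rank-test p-value can be derived from a ranked library, the following is a minimal sketch. The report does not name the rank test it uses; this sketch assumes a Mann-Whitney U test with a normal approximation, and the function name `auroc_and_p_value` is hypothetical rather than part of the platform.

```python
import math


def auroc_and_p_value(scores, labels):
    """Return (AUROC, one-sided p-value) for actives (1) outranking decoys (0).

    Assumption: the rank test is Mann-Whitney U with a normal approximation
    and no tie correction of the variance. Requires at least one active and
    one decoy.
    """
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])  # ascending by score
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the end of the group of tied scores starting at position i.
        j = i
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank for the tied group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    n_act = sum(labels)
    n_dec = n - n_act
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    u = rank_sum - n_act * (n_act + 1) / 2  # U statistic for the actives
    auroc = u / (n_act * n_dec)             # fraction of active/decoy pairs ranked correctly

    mu = n_act * n_dec / 2
    sigma = math.sqrt(n_act * n_dec * (n + 1) / 12)
    z = (u - mu) / sigma
    p_one_sided = 0.5 * math.erfc(z / math.sqrt(2))  # P(Z >= z)
    return auroc, p_one_sided
```

Here `scores` would be the similarity scores of the library molecules (query molecules already held out) and `labels` the ground-truth active / decoy flags.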
Benchmarked against the DUD-E public dataset. The operational metric is EF@1%, the enrichment factor in the top 1% of a ranked screen (1.0 = random selection), because in practice only the top-ranked slice of a screen is followed up. A target is considered screening-ready at EF@1% ≥ 4. AUROC is shown for reference.
| Target | Disease area | Class | EF@1% | EF@5% | AUROC |
|---|---|---|---|---|---|
| RENI | Hypertension | protease (asp) | 66.23 | — | 0.899 |
| BACE1 | Alzheimer's | protease (asp) | 56.24 | — | 0.775 |
| HIVPR | HIV / AIDS | protease (asp) | 40.09 | — | 0.858 |
| THRB | Thrombosis | protease (ser) | 38.61 | — | 0.900 |
| TRY1 | Broad protease | protease (ser) | 35.02 | — | 0.820 |
| ACES | Alzheimer's / neuro | hydrolase (esterase) | 28.57 | — | 0.687 |
| GLCM* | Gaucher disease | hydrolase (glycosidase) | 27.69 | — | 0.790 |
Results were measured with hybrid scoring (the blend of shape and structural-similarity channels). Shape-only results are available in the reproducibility pack.
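The EF@1% figures above follow the standard enrichment-factor definition: the hit rate in the top x% of the ranked list divided by the hit rate in the whole library. A minimal sketch is given below; the function names, the rounding of the top-slice size, and the tie handling are assumptions for illustration, not the platform's implementation.

```python
def enrichment_factor(scores, labels, top_fraction=0.01):
    """EF@x%: active rate in the top slice divided by the overall active rate.

    Assumptions: the top-slice size is round(n * top_fraction), at least 1,
    and ties are broken by input order (Python's sort is stable).
    """
    n = len(scores)
    n_top = max(1, round(n * top_fraction))
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits_in_top = sum(label for _, label in ranked[:n_top])
    total_actives = sum(labels)
    return (hits_in_top / n_top) / (total_actives / n)


def is_screening_ready(ef_1pct, threshold=4.0):
    """Apply the report's screening-ready criterion (EF@1% >= 4)."""
    return ef_1pct >= threshold
```

For example, a 1,000-molecule library with 20 actives, 5 of which land in the top 10 positions, gives EF@1% = (5/10) / (20/1000) = 25. Under the report's criterion, the HIVPR value of 40.09 counts as screening-ready and the FA7 value of 0.90 does not.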
| Target | Disease area | Class | EF@1% | AUROC |
|---|---|---|---|---|
| FA7 | Anticoagulation | protease (ser, coagulation) | 0.90 | 0.685 |
| FA10 | Anticoagulation | protease (ser, coagulation) | 0.75 | 0.533 |
| UROK | Thrombolysis | protease (ser, coagulation) | 0.00 | 0.542 |
The validated panel spans aspartic proteases, classical serine proteases, and two hydrolase sub-classes. The three not-supported targets are all coagulation-cascade serine proteases; other ligand-based industry methods also report reduced performance on this sub-class on DUD-E decoys. We report both the validated and the not-supported results rather than cherry-picking.
* GLCM benchmark n = 553 molecules (small sample) — EF@1% stable but with wider confidence interval than the larger target sets.
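To see why a small benchmark widens the confidence interval, note that with 553 molecules the top 1% contains only about six compounds, so a single active moving in or out of that slice shifts EF@1% noticeably. One common way to quantify this is a percentile bootstrap, sketched below. The report does not state how its confidence intervals were obtained; the resampling scheme, function names, and defaults here are all assumptions.

```python
import random


def ef_at(scores, labels, top_fraction=0.01):
    """Enrichment factor in the top slice (slice size rounded, at least 1)."""
    n = len(scores)
    n_top = max(1, round(n * top_fraction))
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = sum(label for _, label in ranked[:n_top])
    return (hits / n_top) / (sum(labels) / n)


def bootstrap_ef_ci(scores, labels, top_fraction=0.01,
                    n_boot=1000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for EF@x% (illustrative only)."""
    rng = random.Random(seed)  # fixed seed so the interval is repeatable
    n = len(scores)
    samples = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        resampled_labels = [labels[i] for i in idx]
        if sum(resampled_labels) == 0:
            continue  # EF is undefined for a resample with no actives
        resampled_scores = [scores[i] for i in idx]
        samples.append(ef_at(resampled_scores, resampled_labels, top_fraction))
    if not samples:
        raise ValueError("no resample contained an active")
    samples.sort()
    lower = samples[int((alpha / 2) * len(samples))]
    upper = samples[min(len(samples) - 1, int((1 - alpha / 2) * len(samples)))]
    return lower, upper
```

Running this on a small and a large target set with comparable point estimates would be expected to show a visibly wider interval for the small one.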
Each approved drug was used as its own reference query against its known co-crystal structure, in a 2.83M-molecule library. A rank in the top five is direct evidence that the platform recovers known therapeutics from real drug-discovery programs.
| Drug | Disease | Target PDB | Rank in 2.83M |
|---|---|---|---|
| Lopinavir | HIV/AIDS | 1MUI | #1 |
| Aliskiren | Hypertension | 2V0Z | #2 |
| Ritonavir | HIV/AIDS | 1HXW | #3 |
| Nirmatrelvir | COVID-19 | 6LU7 | #3 |
| Oseltamivir | Influenza | 2HU0 | #3 |
| Telaprevir | HCV | 3SU3 | #5 |
| Metric | Value |
|---|---|
| Core search | 16.77 ms (shape channel, GPU) |
| Hybrid query | 3.3 s end-to-end (shape + structural similarity scoring) |
| API latency | ~3.3 s per hybrid query over HTTPS |
| Library scale | 2.83M molecules |
| Method | HIV-1 PR AUROC | Speed |
|---|---|---|
| CellsWave v1.0 hybrid | 0.858 | 3.3 s per query on 2.83M molecules |
| Glide SP | ~0.75 | hours per 1K molecules |
| AutoDock Vina | ~0.65 | days per 1K molecules |
CellsWave's HIV-1 PR AUROC (0.858) sits slightly above the published range for enterprise docking (Glide SP band 0.70–0.82) while delivering library-scale screening several orders of magnitude faster. Industry AUROC ranges are taken from the standard literature for DUD-E targets.
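The speed comparison can be made concrete with a short back-of-the-envelope calculation using the figures quoted in the tables above. The docking figure is an assumption: "hours per 1K molecules" is read here as one hour per 1,000 molecules, the reading most favourable to docking, so the result is a rough lower bound rather than a measured ratio.

```python
import math

LIBRARY_SIZE = 2_830_000   # molecules, from the speed table
HYBRID_QUERY_S = 3.3       # seconds per hybrid query, end-to-end
CORE_SEARCH_S = 16.77e-3   # seconds for the shape-channel core search

# Assumption: "hours per 1K molecules" taken as exactly 1 hour per 1,000.
DOCKING_MOLECULES = 1_000
DOCKING_SECONDS = 3_600.0


def throughput(molecules, seconds):
    """Molecules processed per second."""
    return molecules / seconds


def orders_of_magnitude(fast, slow):
    """Base-10 logarithm of the throughput ratio."""
    return math.log10(fast / slow)


hybrid_rate = throughput(LIBRARY_SIZE, HYBRID_QUERY_S)
core_rate = throughput(LIBRARY_SIZE, CORE_SEARCH_S)
docking_rate = throughput(DOCKING_MOLECULES, DOCKING_SECONDS)
hybrid_vs_docking = orders_of_magnitude(hybrid_rate, docking_rate)
```

Under these assumptions the hybrid query processes roughly 8.6 × 10^5 molecules per second against roughly 0.28 for docking, a gap of between six and seven orders of magnitude.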
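CellsWave v1.0 is validated by direct DUD-E benchmark on 10 targets, a 6-drug real-drug recovery panel on co-crystal structures, and a SARS-CoV-2 Mpro (PDB 6LU7) case study. The validated sub-envelope comprises aspartic proteases (renin, BACE-1, HIV-1 PR), classical serine proteases (thrombin, trypsin), and two hydrolase sub-classes (acetylcholinesterase, glucocerebrosidase).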
Known non-supported sub-envelope: coagulation-cascade serine proteases (factor Xa, factor VIIa, urokinase) — EF@1% < 2 on DUD-E decoys. This failure mode is shared with several published ligand-based methods on these specific targets and is reported transparently.
Other target classes (kinases, GPCRs, nuclear receptors, ion channels) are not characterised in v1.0. Contact [email protected] to discuss your specific target.
| Program area | Representative targets (validated + likely envelope) |
|---|---|
| Antiviral programs | HIV-1 PR (validated, EF@1% = 40.09); SARS-CoV-2 Mpro (case study, Nirmatrelvir recovered at rank #3 on PDB 6LU7) |
| Metabolic / cardiovascular proteases | renin (validated, hypertension, EF@1% = 66.23); BACE-1 (validated, Alzheimer's, EF@1% = 56.24) |
| General proteolysis | thrombin (validated, EF@1% = 38.61); trypsin (validated, EF@1% = 35.02) |
| Human hydrolases | acetylcholinesterase (validated, EF@1% = 28.57); glucocerebrosidase (validated, small-n, EF@1% = 27.69) |
| Not supported in v1.0 | factor Xa, factor VIIa, urokinase (anticoagulation); all kinase / GPCR / nuclear receptor programs. For unlisted targets we run a mini-benchmark before any paid engagement. |
DUD-E benchmark results in this report can be reproduced by independent external reviewers. Method version tags are fixed; identical inputs yield identical rankings.
Independent validation is available on request to qualified enquirers. Reviewers receive a reproducibility pack sufficient to regenerate the numbers in this report on their own hardware, together with a per-target breakdown, API access credentials, and a technical briefing on architectural properties relevant to evaluation.
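A reviewer who wants to check the "identical inputs yield identical rankings" property could compare two runs of the same query by fingerprinting each ranked output. The sketch below is one way to do that. It is not part of the reproducibility pack; the function names, the `(molecule_id, score)` tuple layout, and the six-decimal score rounding are assumptions.

```python
import hashlib


def ranking_fingerprint(ranking, decimals=6):
    """SHA-256 digest of an ordered list of (molecule_id, score) pairs.

    Scores are formatted to a fixed number of decimals so that the digest
    does not depend on how floats happen to be printed.
    """
    digest = hashlib.sha256()
    for molecule_id, score in ranking:
        line = f"{molecule_id}\t{score:.{decimals}f}\n"
        digest.update(line.encode("utf-8"))
    return digest.hexdigest()


def same_ranking(run_a, run_b, decimals=6):
    """True if two runs produced the same order and the same rounded scores."""
    return ranking_fingerprint(run_a, decimals) == ranking_fingerprint(run_b, decimals)
```

Because the digest covers the order as well as the scores, two runs that return the same molecules in a different order are reported as different.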
This document is for public distribution.