Independent replication achieved

Independently replicated across two cohorts. Wet lab is the next step.

PRR36 and DNAH5 were identified through a governed AI screen of 19,010 genes and independently replicated in the MSKCC cohort (GSE21034). Wet lab biological validation is the planned next phase.

Current report

Two prostate cancer gene candidates replicated across independent cohorts

Current status: replication complete · wet lab planning next

PRR36 replicated (MSKCC cohort) · DNAH5 replicated (p = 3.55×10⁻¹⁵) · Documentation in preparation · GSE21034 · 150 tumors · 29 normal

First successful governed AI discovery

Early discovery signal — PRR36

E-DICE-R surfaced a result for further validation.

E-DICE-R's governed AI platform screened 19,010 genes across six molecular evidence types and identified PRR36 as a prostate cancer candidate — with zero prior cancer publications at the time of discovery. No human hypothesis guided the result. It was surfaced from data alone, under full governance controls.

Gene candidate: PRR36 (Proline-rich region 36), identified from the TCGA prostate cancer dataset.

Key signal: 3.7× tumor overexpression, p = 2.57×10⁻²¹, 100% reproducibility across the tested TCGA sub-cohorts.

Literature status: Zero publications found connecting PRR36 to any cancer context at time of discovery.

Data source: Public TCGA dataset; no proprietary or patient-identifiable data used.

Discovery method: Autonomous governed AI workflow; no human guidance in candidate selection.

Discovery by the numbers

Screened at scale. Found through governance.

No pre-selection, no hypothesis. Every gene screened equally — governance controlled the quality gates.

19,010: total genes screened across all evidence types

95%+: candidates eliminated by quality gates, with no human steering

942: candidates advanced for cross-database literature review

3.7×: tumor overexpression in the lead discovery signal, PRR36

Independent cohort replication — MSKCC

Two genes. Two independent datasets. Both replicate.

Both genes tested in an entirely independent cohort — GSE21034 from MSKCC — different patients, different platform, no data overlap with the discovery set. Both held up.

Original discovery candidate

PRR36

Proline Rich Region 36 — prostate cancer overexpression candidate identified from TCGA-PRAD by governed AI screen. Zero prior cancer publications at time of discovery.

Overexpression (TCGA-PRAD): 3.7×
p-value (original cohort): 2.57×10⁻²¹
p-value (MSKCC replication): 0.003
Confidence post-replication: 75–80%

Co-candidate — stronger MSKCC signal

DNAH5

Dynein Axonemal Heavy Chain 5 — co-identified in the same governed screen. Shows an even stronger independent replication signal than PRR36 in the MSKCC cohort.

Overexpression (TCGA-PRAD): Significant
Statistical strength (original cohort): High
p-value (MSKCC replication): 3.55×10⁻¹⁵
Confidence post-replication: 80–85%

Confidence change — before vs. after independent replication

Confidence estimates reflect strength of computational evidence — not clinical validation probability.

PRR36 Original candidate

Pre-replication: 65–70% → after MSKCC: 75–80% (+8–10 pts)

p = 0.003

MSKCC cohort · 150 tumor samples · 29 normal

Replicated — MSKCC

DNAH5 Co-candidate

Pre-replication: 60–65% → after MSKCC: 80–85% (+18–22 pts)

p = 3.55×10⁻¹⁵

MSKCC cohort · extremely strong independent signal

Replicated — MSKCC

Discovery cohort vs. replication cohort

Discovery cohort

TCGA-PRAD

The Cancer Genome Atlas — Prostate Adenocarcinoma. Public multi-institution dataset. Used in original governed AI discovery screen that identified PRR36 and DNAH5 from 19,010 genes.

PRR36 p = 2.57×10⁻²¹ · 3.7× overexpression

DNAH5 Significant overexpression identified

Replication cohort — independent

MSKCC — GSE21034

Memorial Sloan Kettering Cancer Center. 150 prostate tumor samples + 29 normal samples. Independent platform, independent patients, no data overlap with TCGA discovery cohort.

PRR36 p = 0.003 — replicated ✓

DNAH5 p = 3.55×10⁻¹⁵ — strongly replicated ✓
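
The report gives replication p-values but does not name the statistical test behind them. As a rough illustration of how a tumor-versus-normal difference can be tested, here is a minimal sketch of a two-sided permutation test on the difference in group means. The function name, the number of permutations, and the expression values are all assumptions for illustration; the data are synthetic and only the group sizes (150 tumor, 29 normal) come from the cohort description.

```python
import random
from statistics import mean

def permutation_p_value(tumor, normal, n_permutations=2000, seed=0):
    """Two-sided permutation test on the difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(tumor) - mean(normal))
    pooled = list(tumor) + list(normal)
    n_tumor = len(tumor)
    hits = 0
    for _ in range(n_permutations):
        # Shuffle group labels by shuffling the pooled values.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_tumor]) - mean(pooled[n_tumor:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)

# Synthetic log2 expression values, NOT data from TCGA or GSE21034.
data_rng = random.Random(42)
tumor = [data_rng.gauss(8.0, 1.0) for _ in range(150)]
normal = [data_rng.gauss(7.0, 1.0) for _ in range(29)]
p = permutation_p_value(tumor, normal)
print(f"permutation p-value: {p:.4f}")
```

A permutation test cannot report a p-value smaller than 1 / (n_permutations + 1), so values as small as those quoted above would come from a parametric or rank-based test instead.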

Deeper Analysis — April 2026

A deeper layer of findings. An understudied pathway appears.

After replication, a systematic literature search revealed something broader: not just an unstudied gene, but a biological pathway that is unstudied in prostate cancer. DNAH5 is also overexpressed in isolation from its own protein complex.

PubMed survey — April 2026

The Ciliary Gap: 8 genes. Zero prostate cancer papers.

A search across all canonical axonemal dynein and ciliary motor genes returned zero prostate cancer publications for every gene in the family. Conventional research has not looked here.

Gene | Role in ciliary complex | Papers found
DNAH5 (key candidate) | Axonemal dynein heavy chain: primary ciliary motor component, overexpressed alone in our dataset | 0
DNAI1 | Dynein intermediate chain: outer dynein arm assembly | 0
DNAI2 | Dynein intermediate chain: cilia motility regulation | 0
DNALI1 | Dynein light intermediate chain: inner dynein arm component | 0
NME8 | Ciliary outer dynein arm: NME/NDP kinase family | 0
DNAL1 | Dynein axonemal light chain 1: outer arm structural subunit | 0
CCDC114 | Outer dynein arm docking complex: ciliary anchor protein | 0
RSPH4A | Radial spoke head 4A: inner ciliary structural component | 0

DNAH5 — Expression Pattern

Overexpressed alone — without its normal partners

DNAH5 has 10 high-confidence STRING interaction partners (scores 0.935–0.997), all ciliary components. None of the 10 appear in the 942 passing genes from the governed screen.

If the ciliary program were being activated in prostate cancer, the whole complex would be expected to rise together. Only DNAH5 rises. Two possible interpretations:

Moonlighting: DNAH5 may be serving a non-ciliary function in prostate tumor cells — a recognized phenomenon in cancer biology
Pathway disruption: Ciliary signaling may be dysregulated in a way distinct from other cancers

Neither interpretation is established. Both are testable in wet lab. This is an atypical expression pattern — the biological role remains unknown until further study.

PRR36 + ARHGEF38 — Convergence

Two independent paths converge on the same gene pair

PRR36 has one STRING interaction partner: ARHGEF38, a Rho GEF involved in cytoskeletal regulation and cell motility. ARHGEF38 independently appears in the 942 passing genes from the same governed screen — through a completely separate evidence path.

Two unrelated screening paths converging on the same gene pair is unlikely by chance. This provides functional context for an otherwise unknown gene:

ARHGEF38 is a cytoskeletal regulator — biology mechanistically linked to invasion and metastasis
The convergence is the most actionable functional lead from the current analysis
Pattern is consistent with dysregulated cell motility — a known driver of progression

This is a mechanistic observation. Association with clinical aggressiveness requires validation against outcomes data.

Preliminary interpretation — not a clinical claim

What these signals may suggest — and what they do not yet establish

PRR36 + ARHGEF38

Points toward cytoskeletal control and cell movement. This biology is mechanistically linked to invasion, metastasis, and disease progression. The signal is consistent with a more invasive phenotype — not yet demonstrated to track with clinical outcomes such as Gleason score or recurrence.

DNAH5 isolated overexpression

May reflect a cellular stress state, altered intracellular transport, or dedifferentiation — patterns that can appear in aggressive tumors, but less mechanistically direct than the PRR36/ARHGEF38 signal. Biological role remains uncharacterized.

To suggest aggression association, the following are needed:

Gleason score, metastatic vs. localized samples, biochemical recurrence, or overall/progression-free survival data. Current framing: "Consistent with more aggressive biology — not yet established." Wet lab and outcomes data are the decisive next steps.

Preliminary clinical context — not yet established

What these signals could mean for patients — if confirmed.

These are mechanistic observations from computational data, not clinical proof. What follows is what the biology may suggest — and what validation against outcomes data is needed to confirm.

Detection

Earlier, more precise risk stratification

If PRR36 and DNAH5 expression tracks with aggressive disease, they could flag higher-risk biology earlier — moving toward treating the patient's specific tumor rather than all prostate cancer identically. This requires association with Gleason score, recurrence, or outcomes data first.

Treatment context

A signal toward invasive potential

PRR36 + ARHGEF38 points to cytoskeletal regulation and cell motility — biology mechanistically tied to invasion and spread. If this pathway is active, it could inform decisions around surgery, radiation, or systemic therapy. Not yet demonstrated against clinical outcomes.

Drug targets

New mechanisms, new intervention points

Most treatments target well-studied pathways. An unstudied mechanism — whether DNAH5 repurposed outside its ciliary complex, or PRR36 feeding into cytoskeletal signaling — opens the door to more precise therapies with fewer off-target effects. Requires functional validation first.

Current framing

Consistent with more aggressive biology — not yet established

To move from "mechanistic hints" to clinical claims, validation is needed against: Gleason score, metastatic vs. localized samples, biochemical recurrence, and survival data. That is exactly what the next phase addresses.

Multi-omics methodology

Six evidence types. Each one governed.

Every candidate was evaluated across six independent molecular evidence streams. A candidate advances only if it passes the quality gate for each evidence type it is screened on. The governance layer enforces these gates: there is no manual override and no bias toward prior literature.

Gene expression (transcriptomics)

Differential expression analysis across tumor vs. normal tissue using TCGA RNA-seq data. Minimum fold-change threshold enforced by governance gate.

Passed — 3.7× overexpression
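
To make the expression gate concrete, here is a minimal sketch of a fold-change check. The threshold value, the function names, and the expression values are assumptions for illustration; the source states that a minimum fold-change is enforced but does not give the number or the exact calculation.

```python
from statistics import mean

# Hypothetical threshold; the source does not state the actual value.
MIN_FOLD_CHANGE = 2.0

def fold_change(tumor, normal):
    """Ratio of mean tumor expression to mean normal expression (linear scale)."""
    normal_mean = mean(normal)
    if normal_mean <= 0:
        raise ValueError("normal-tissue mean must be positive")
    return mean(tumor) / normal_mean

def passes_expression_gate(tumor, normal, min_fold_change=MIN_FOLD_CHANGE):
    """True when tumor expression exceeds normal by at least the threshold."""
    return fold_change(tumor, normal) >= min_fold_change

# Toy values chosen so the ratio matches the 3.7x figure; not real data.
tumor_values = [35.0, 37.0, 39.0]
normal_values = [9.0, 10.0, 11.0]
print(fold_change(tumor_values, normal_values))
print(passes_expression_gate(tumor_values, normal_values))
```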

Mutation profiling (genomics)

Somatic mutation frequency and pattern analysis across prostate cancer samples. Evaluates whether genetic alterations are consistent with oncogenic activity.

Passed — mutation pattern consistent

Survival analysis

Kaplan-Meier and Cox proportional hazards modeling across TCGA patient cohorts. High expression associated with survival outcome differences.

Passed — survival signal present
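
For readers unfamiliar with the Kaplan-Meier estimator named above, the following is a minimal sketch on toy data. It covers only the product-limit survival estimate, not the Cox model, and it is not the pipeline's implementation; the function name and the follow-up values are assumptions for illustration.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times:  follow-up time per patient
    events: 1 if the event was observed at that time, 0 if censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Patients with an event or censoring at t leave the risk set afterwards.
        at_risk -= leaving
    return curve

# Toy follow-up data (arbitrary time units), not taken from TCGA.
toy_times = [1, 2, 3, 4, 5]
toy_events = [1, 0, 1, 1, 0]
for t, s in kaplan_meier(toy_times, toy_events):
    print(t, round(s, 4))
```

In a survival gate of the kind described, one curve would typically be estimated for high-expression patients and one for low-expression patients, and the two compared.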

Cross-dataset reproducibility

Signal consistency observed across multiple sub-cohorts within TCGA. The governance layer requires perfect reproducibility before a candidate advances to literature review.

Passed — 100% reproducibility

Literature and database review

Automated cross-reference against PubMed, CancerGene, COSMIC, and primary cancer databases. Candidates with extensive prior publication are deprioritized — novelty is valued.

Passed — 0 cancer publications found

Structural druggability assessment

AlphaFold-based protein structure prediction combined with binding pocket detection and druggability scoring using standard bioinformatics pipelines.

In validation

Governance-enabled discovery pipeline

How PRR36 was found autonomously.

Policy-enforced quality gates at every stage — no human steering on which genes to examine, which thresholds to apply, or which candidates to advance.

Dataset ingestion — TCGA prostate cancer cohort

Public TCGA dataset loaded into the governed pipeline. No proprietary or patient-identifiable data. Governance layer validates dataset integrity and provenance before any analysis begins.

Input: 19,010 genes | Source: TCGA | Type: RNA-seq + mutation + clinical

Multi-evidence screening — six parallel evidence streams

All 19,010 genes are screened simultaneously across the primary evidence types (expression, mutation, survival, and reproducibility). Each evidence stream has a governed quality gate. Candidates failing any gate are eliminated at this stage.

Output: 942 candidates passed all primary quality gates (>95% eliminated)
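
The "fail any gate and you are out" rule can be sketched as follows. This is a minimal illustration, assuming hypothetical gate names, fields, and thresholds; none of them are taken from the E-DICE-R pipeline. The point is that gates are non-compensating: a strong result on one gate cannot offset a failure on another.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Evidence:
    gene: str
    fold_change: float
    p_value: float
    reproducibility: float  # fraction of sub-cohorts where the signal reappears

# Hypothetical gates and thresholds, for illustration only.
GATES = {
    "expression": lambda e: e.fold_change >= 2.0,
    "significance": lambda e: e.p_value <= 0.05,
    "reproducibility": lambda e: e.reproducibility >= 1.0,
}

def failed_gates(evidence):
    """Names of every gate the candidate fails (empty list means it advances)."""
    return [name for name, check in GATES.items() if not check(evidence)]

def screen(candidates):
    """Keep only candidates that pass every gate; no gate can offset another."""
    return [c for c in candidates if not failed_gates(c)]

candidates = [
    Evidence("GENE_A", fold_change=3.7, p_value=1e-9, reproducibility=1.0),
    # Very strong expression signal, but it does not reproduce everywhere.
    Evidence("GENE_B", fold_change=50.0, p_value=1e-30, reproducibility=0.8),
]
print([c.gene for c in screen(candidates)])
```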

Authority-weighted ranking — evidence strength scoring

Remaining 942 candidates are ranked by a governed authority score that weights evidence strength, statistical significance, reproducibility, and cross-modal consistency. Candidates with higher authority scores represent stronger, more consistent signals.

PRR36 ranked in top tier — combined authority score above threshold
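
The exact authority-score formula is not published here. As a rough sketch of what weighting the four named factors could look like, the example below uses a simple weighted sum. The weights, the function names, and the component scores are all assumptions for illustration.

```python
# Hypothetical weights over the four factors named above; they sum to 1.0.
WEIGHTS = {
    "evidence_strength": 0.4,
    "statistical_significance": 0.3,
    "reproducibility": 0.2,
    "cross_modal_consistency": 0.1,
}

def authority_score(components):
    """Weighted sum of component scores, each expected in [0, 1]."""
    return sum(weight * components[name] for name, weight in WEIGHTS.items())

def rank_candidates(candidates):
    """Gene names by descending score; the name breaks ties deterministically."""
    ordered = sorted(
        candidates.items(),
        key=lambda item: (-authority_score(item[1]), item[0]),
    )
    return [gene for gene, _ in ordered]

# Toy component scores, not values from the screen.
toy_candidates = {
    "GENE_B": {"evidence_strength": 1.0, "statistical_significance": 1.0,
               "reproducibility": 0.2, "cross_modal_consistency": 0.2},
    "GENE_A": {"evidence_strength": 0.9, "statistical_significance": 0.9,
               "reproducibility": 0.9, "cross_modal_consistency": 0.9},
}
print(rank_candidates(toy_candidates))
```

Sorting on a key that includes the gene name keeps the ranking stable across runs, which matters for a pipeline that claims deterministic, replayable output.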

Literature cross-reference — novelty validation

Top-ranked candidates are cross-referenced against PubMed, COSMIC, CancerGene, and primary cancer literature databases. Automated search across all indexed sources. The governance layer records all searches and their results as part of the audit trail.

PRR36: 0 publications found connecting this gene to any cancer (novel candidate)

Cryptographic audit trail — full execution log

Every decision in the pipeline — every quality gate pass, every evidence weight, every ranking score, every literature query — is logged in a SHA-256 hash-chained audit trail. The discovery is replayable from first principles at any time.

Hash-chained log: complete | Replayable: yes | Deterministic: yes
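
A hash-chained log of the kind described can be sketched in a few lines. This is a minimal illustration using Python's standard `hashlib` and `json` modules; the entry fields, the genesis value, and the function names are assumptions and do not describe the platform's actual log format.

```python
import hashlib
import json

GENESIS = "0" * 64  # hypothetical starting value for the first entry

def entry_hash(prev_hash, payload):
    """SHA-256 over the previous hash plus a canonical JSON form of the payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()

def append_entry(log, payload):
    """Append a decision record linked to the hash of the previous record."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({
        "payload": payload,
        "prev_hash": prev_hash,
        "hash": entry_hash(prev_hash, payload),
    })

def verify_chain(log):
    """Recompute every hash in order; any edited or reordered entry breaks it."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True

audit_log = []
append_entry(audit_log, {"step": "expression_gate", "gene": "PRR36", "passed": True})
append_entry(audit_log, {"step": "literature_query", "gene": "PRR36", "hits": 0})
print(verify_chain(audit_log))
```

Because each hash covers the previous one, changing any earlier record invalidates every record after it, which is what makes after-the-fact edits detectable.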

Output — PRR36 surfaced as lead discovery candidate

PRR36 emerged as the lead candidate from a fully autonomous, governance-controlled pipeline. It was not chosen by a human. It was not hypothesized in advance. It was surfaced by evidence, ranked by authority, and confirmed as having no prior cancer publications through automated literature review. This is what a governed AI platform can do.

Discovery candidate — currently under biological evaluation

Next phase — biological testing

Replication complete. Wet lab is next.

Computational discovery done. Independent replication complete. Wet lab experiments will determine whether PRR36 and DNAH5 behave in living cancer cells the way the data predicts.

01 — qPCR validation

Quantitative PCR — expression confirmation

Quantitative PCR experiments in prostate cancer cell lines will directly measure PRR36 and DNAH5 mRNA expression levels, confirming whether overexpression observed in public RNA-seq data holds in controlled laboratory conditions.

Planning
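
The quantification method for the planned qPCR work is not stated. One common choice is the comparative Ct (2^-ΔΔCt) method, sketched below under the assumption of roughly 100% amplification efficiency. The function name and the Ct values are hypothetical and serve only to show the arithmetic.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the comparative Ct (2^-ddCt) method.

    Each target Ct is first normalized to a reference (housekeeping) gene,
    then the sample is compared with the control condition.
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_sample - delta_control
    return 2 ** (-delta_delta_ct)

# Hypothetical Ct values: the target crosses threshold 2 cycles earlier in the
# tumor line than in the control line, relative to the reference gene.
fold = relative_expression(
    ct_target_sample=24.0, ct_ref_sample=18.0,
    ct_target_control=26.0, ct_ref_control=18.0,
)
print(fold)
```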

02 — Protein presence

Western blot & IHC — protein-level confirmation

RNA expression must be paired with protein-level evidence. Western blot and immunohistochemistry (IHC) assays will test whether the overexpression detected in transcriptomic data translates to measurable protein abundance in tumor tissue.

Planning

03 — Functional studies

Cell knockdown & overexpression assays

If a gene is truly oncogenic, disrupting its expression should affect cancer cell behavior. Knockdown and overexpression assays will test whether PRR36 and DNAH5 influence proliferation, migration, or apoptosis in prostate cancer cell lines — the first functional evidence of biological role.

Pending — follows qPCR & protein

Computational replication ≠ clinical validation

Independent computational replication across two cohorts is a strong scientific milestone — but it is not wet lab validation, and it is not a clinical claim. PRR36 and DNAH5 are computationally discovered candidates with independent replication confirmed. The next phase is biological testing. We will update this page as each stage completes and will not advance clinical claims before that evidence exists.

What comes next

Validation roadmap — careful science, open progress.

We will update this page as each stage completes.

Stage 01

Computational discovery

Governed multi-omics screen — 19,010 genes, six evidence types, PRR36 & DNAH5 identified, audit trail complete.

Complete

Stage 02

Independent replication

Both PRR36 (p=0.003) and DNAH5 (p=3.55×10⁻¹⁵) replicated in independent MSKCC cohort GSE21034 — 150 tumors, 29 normal samples.

Complete — April 2026

Stage 03

Scientific documentation

Governing documentation for "Two Previously Unreported Prostate Cancer Gene Candidates Identified Through Governed Multi-Omics Screening of TCGA-PRAD" — in preparation.

In preparation — Apr 2026

Stage 04

Wet lab validation

qPCR expression confirmation, Western blot/IHC protein presence, and functional cell-line knockdown assays for PRR36 and DNAH5. Follows computational replication phase.

Pending — next stage

Stage 05

Structural druggability

AlphaFold protein structure and binding pocket assessment. VQE-assisted druggability scoring under development for both candidates.

Pending

Stage 06

Clinical relevance review

Dr. Fontanez (DNP, FNP-BC) will assess signal relevance against clinical prostate cancer patient outcomes for both PRR36 and DNAH5.

Pending

The team behind the discovery

Interdisciplinary depth. Responsible execution.

This discovery was produced by a governed AI platform — but validating it requires human expertise. Our team spans the full validation chain: from computational biology and multi-omics analysis to applied mathematics, clinical research, and full-stack system delivery.

George Soto, MBA

Founder & Principal Investigator

System architect of the governed discovery platform. Designed the authority computation, quality gates, and deterministic execution pipeline that surfaced PRR36. Lead author, SPIE 2026.

Dr. Athar Hussain, PhD

Advisor — Data Science & Multi-omics

PhD Biotechnology, NIBGE-PIEAS. Leads biological validation, multi-omics integration, and interpretation of the PRR36 signal. Active research lab with undergraduate and graduate students.

Dr. Laura Fontanez, DNP

Advisor — Clinical Research

Board-certified Family Nurse Practitioner (FNP-BC). Provides clinical relevance assessment — bridging computational findings with real-world patient outcomes and healthcare settings.

Method and process — our proprietary IP

We claim the method that identifies, validates, and governs its use.

E-DICE-R is a proprietary governed multi-omics pipeline. The IP is the structured, reproducible, authority-scored process — not the gene. PRR36 is a result produced by that method. The method is what we sell. The method is what we protect.

E-DICE-R — Governed Pipeline

The method and process

E-DICE-R converts multi-omics discovery into a governed, repeatable decision system for identifying and prioritizing therapeutic and diagnostic targets. The invention defines a structured pipeline that ingests heterogeneous data, applies authority-based scoring and validation, and produces ranked, auditable outputs.

Method: defined pipeline for ingesting, processing, scoring, and ranking targets from multi-omics data
Decision engine: authority-based filtering, evidence weighting, and cross-cohort validation
Outputs: ranked candidates for diagnostics, biomarkers, and therapeutic targeting
System: architecture enforcing reproducibility, auditability, and deterministic workflows
Pharma linkage: integration into target selection, validation, and drug development workflows

PRR36 — Exploratory Signal

What our method found

PRR36 showed 3.7× overexpression in our TCGA prostate cancer analysis, with p = 2.57 × 10⁻²¹, supporting follow-up as an exploratory, under-studied candidate. These findings are statistically strong within the analyzed public dataset, but biomarker utility, novelty, IP position, and druggability still require separate validation.

Quantitative signal: 3.7× differential expression in the prostate cancer cohort
Statistical strength: p = 2.57 × 10⁻²¹ under the applied test model
Data source: public TCGA dataset, de-identified tier
Workflow: reproducible analysis pipeline with auditable outputs
Interpretation scope: association only — not causality or clinical utility
Next steps: cohort replication, subtype analysis, survival correlation, and functional characterization

E-DICE-R invention position

E-DICE-R turns multi-omics discovery into a governed decision system.

Method claim

A governed method for ingesting, processing, scoring, and ranking targets from multi-omics data with authority-based scoring and non-compensating validation gates.

Defined pipeline: ingestion → scoring → ranking → output
Evidence weighting and cross-cohort validation
Ranked outputs for diagnostics, biomarkers, and therapeutic targeting

System claim

A deterministic, auditable architecture that enforces governance, maintains a complete evidence chain, and guarantees replayable outputs.

Reproducibility and auditability by design
Deterministic workflows and evidence traceability
Architecture-level protection around the ranking engine

Pharma workflow claim

A pharmaceutical workflow layer connecting governed outputs to target selection, validation tracking, and downstream drug-development decision support.

Structured evidence packages for pharma portfolio review
Target selection and validation tracking
Application-level use of identified targets, including PRR36

SDK

E-DICE-Edge · E-DICE-Edge Partner Vault

Deployable molecular governance for partner pipelines

A programmable governance layer for pharmaceutical and multi-omics workflows

Authority-based scoring · Deterministic audit workflows · External dataset integration · Partner Vault

E-DICE-Edge SDK exposes the molecular governance layer of E-DICE-R as a deployable integration for external pipelines. This run serves as an illustrative example: a governed multi-omics analysis surfacing a high-confidence, underexplored candidate with full reproducibility and evidence traceability. Through E-DICE-Edge Partner Vault, partners apply authority-based scoring, non-compensating validation gates, and deterministic audit workflows directly to their own datasets, reducing downstream validation risk.

Replicated. Deeper pattern found. Wet lab is the next step.