Real-world evidence (RWE) has moved from a nice-to-have supplement for clinical trials to a critical component of regulatory submissions, payer negotiations, and clinical decision-making. Flatiron Health and Medidata (a Dassault Systèmes company) represent two distinct approaches to generating and analyzing RWE, and understanding their differences matters for anyone building products in the clinical evidence space.
Origin and Focus
Flatiron Health (acquired by Roche in 2018 for $1.9B) started as an oncology-focused EHR company and evolved into a real-world data platform. Their core asset is a curated, longitudinal oncology dataset derived from 280+ community cancer clinics using Flatiron's OncoEMR and partner EHR systems. They've expanded beyond oncology, but it remains their stronghold.
Medidata (acquired by Dassault Systèmes in 2019 for $5.8B) started as a clinical trial management platform (Medidata Rave) and expanded into synthetic control arms and RWE through Medidata Acorn AI. Their core asset is 30,000+ clinical trials with 9M+ patient records, plus partnerships with health data aggregators.
Data Sources and Quality
Flatiron: Structured EHR data + manually abstracted clinical data from medical charts. Flatiron employs 3,000+ clinical data abstractors (technology-assisted) who review physician notes, pathology reports, and imaging results to extract structured endpoints like progression-free survival, line of therapy, and biomarker status. This human-in-the-loop curation is expensive but produces research-grade data that the FDA has accepted in regulatory submissions.
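To make the idea of a "structured endpoint" concrete, here is a minimal sketch of how progression-free survival could be derived once abstractors have captured the key dates. The record layout and field names (`AbstractedRecord`, `therapy_start`, `last_contact`) are assumptions for illustration, not Flatiron's actual schema, and real endpoint definitions carry many more rules.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple


@dataclass
class AbstractedRecord:
    """Hypothetical per-patient record; field names are illustrative only."""
    patient_id: str
    therapy_start: date
    progression_date: Optional[date]  # first abstracted progression, if any
    death_date: Optional[date]
    last_contact: date  # last visit or other activity seen in the chart


def progression_free_survival(rec: AbstractedRecord) -> Tuple[int, bool]:
    """Return (days, event_observed) for one patient.

    The event is the earlier of progression or death. Patients with
    neither are censored at their last contact date.
    """
    events = [d for d in (rec.progression_date, rec.death_date) if d is not None]
    if events:
        return (min(events) - rec.therapy_start).days, True
    return (rec.last_contact - rec.therapy_start).days, False


# Made-up example patients
cohort = [
    AbstractedRecord("A", date(2021, 1, 10), date(2021, 7, 9), None, date(2021, 12, 1)),
    AbstractedRecord("B", date(2021, 3, 1), None, None, date(2021, 9, 27)),
    AbstractedRecord("C", date(2021, 2, 15), None, date(2021, 8, 24), date(2021, 8, 24)),
]
results = {r.patient_id: progression_free_survival(r) for r in cohort}
```

Patient B has no progression or death on record, so the function returns a censored duration rather than an event. Getting that distinction right is a large part of why curated dates matter for survival analyses.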
Medidata: Clinical trial data (structured, protocol-defined) + claims data + EHR partnerships. Medidata's trial data is inherently high-quality because it's collected under GCP protocols with monitoring. Their RWE offering (Acorn AI) synthesizes trial data with real-world claims to build synthetic control arms — using historical trial patients as comparators for single-arm studies.
Regulatory Acceptance
Flatiron: Strong FDA track record. Flatiron data has been used in 20+ FDA submissions, including label expansions and post-marketing commitments. The FDA's Oncology Center of Excellence has specifically cited Flatiron data in guidance documents. Roche/Genentech uses Flatiron data extensively for regulatory strategy.
Medidata: Growing regulatory footprint, particularly for external control arms. Medidata has reported that the FDA accepted a hybrid external control arm, built with its Synthetic Control Arm offering, in the design of Medicenna's Phase 3 trial of MDNA55 in recurrent glioblastoma, an early example of a regulator agreeing to synthetic controls in a registrational study. The approach is gaining traction for rare diseases where randomized controlled trials are impractical.
AI and Analytics
Flatiron: NLP models for chart abstraction acceleration (reducing manual review time by 40-60%), automated endpoint detection, and cohort identification. Flatiron's AI is primarily used to make human curation faster, not to replace it. Their approach prioritizes data quality over automation speed.
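The "make human curation faster" pattern can be illustrated with a toy triage step: scan a note, flag only the sentences that mention topics of interest, and hand that shortlist to an abstractor. This is a sketch under heavy assumptions. The regex patterns and the `flag_sentences` helper are hypothetical stand-ins for the trained NLP models described above.

```python
import re
from typing import Dict, List

# Hypothetical keyword patterns; a production system would use trained models
TOPIC_PATTERNS: Dict[str, "re.Pattern[str]"] = {
    "EGFR": re.compile(r"\bEGFR\b", re.IGNORECASE),
    "PD-L1": re.compile(r"\bPD-?L1\b", re.IGNORECASE),
    "progression": re.compile(r"\bprogress(?:ion|ed|ive)\b", re.IGNORECASE),
}


def flag_sentences(note: str) -> List[dict]:
    """Return sentences worth an abstractor's attention, with matched topics."""
    # Naive sentence split on whitespace that follows ., ! or ?
    sentences = re.split(r"(?<=[.!?])\s+", note.strip())
    flagged = []
    for idx, sentence in enumerate(sentences):
        hits = [name for name, pat in TOPIC_PATTERNS.items() if pat.search(sentence)]
        if hits:
            flagged.append({"index": idx, "topics": hits, "text": sentence})
    return flagged


note = (
    "Patient seen for follow-up. EGFR exon 19 deletion detected on biopsy. "
    "Tolerating therapy well. CT shows progression in the left lung."
)
queue = flag_sentences(note)  # shortlist for human review, not a final answer
```

The model only narrows the search. The abstractor still confirms each value, which is the quality-over-speed trade-off described above.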
Medidata: AI-powered patient matching for clinical trials, synthetic control arm generation using propensity score matching and causal inference, and predictive enrollment modeling. Medidata's Intelligent Trials platform uses ML to optimize trial design, site selection, and patient recruitment.
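For readers unfamiliar with propensity score matching, the following is a minimal, standard-library-only sketch of the general technique: fit a logistic model for the probability of being in the trial arm given baseline covariates, then pair each trial patient with the historical patient whose score is closest. It is not Medidata's implementation. The covariates (age, a performance-status score), the data, and every function name are assumptions made up for illustration.

```python
import math
from typing import List, Optional, Sequence, Tuple


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def standardize(rows: Sequence[Sequence[float]]) -> List[List[float]]:
    """Scale each covariate to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
           for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def fit_propensity(x: Sequence[Sequence[float]], treated: Sequence[int],
                   lr: float = 0.1, epochs: int = 2000) -> List[float]:
    """Fit P(treated | x) by batch gradient ascent and return the scores."""
    z = standardize(x)
    n, k = len(z), len(z[0])
    w = [0.0] * (k + 1)  # w[0] is the intercept
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for row, t in zip(z, treated):
            err = t - sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], row)))
            grad[0] += err
            for j, xj in enumerate(row):
                grad[j + 1] += err * xj
        w = [wj + lr * g / n for wj, g in zip(w, grad)]
    return [sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], row))) for row in z]


def match_nearest(scores: Sequence[float], treated: Sequence[int],
                  caliper: Optional[float] = None) -> List[Tuple[int, int]]:
    """Greedy 1:1 nearest-neighbor matching without replacement."""
    available = {i for i, t in enumerate(treated) if t == 0}
    pairs = []
    for i, t in enumerate(treated):
        if t != 1 or not available:
            continue
        # Closest remaining control by score; index breaks ties deterministically
        j = min(available, key=lambda c: (abs(scores[c] - scores[i]), c))
        if caliper is None or abs(scores[j] - scores[i]) <= caliper:
            pairs.append((i, j))
            available.remove(j)
    return pairs


# Made-up data: (age, performance status) per patient
covariates = [
    (62, 1), (70, 2), (55, 0),                              # single-arm trial patients
    (48, 0), (64, 1), (71, 2), (58, 1), (80, 3), (53, 0),   # historical pool
]
treated = [1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = fit_propensity(covariates, treated)
pairs = match_nearest(scores, treated)  # list of (trial_index, historical_index)
```

A real analysis would also check covariate balance after matching (for example, standardized mean differences) and would typically apply a caliper so that poorly matched trial patients are dropped rather than paired with a distant comparator.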
Use Cases
Choose Flatiron for: oncology-specific RWE studies, FDA regulatory submissions requiring curated real-world data, comparative effectiveness research, post-marketing safety surveillance in oncology, and commercial analytics (market share, treatment patterns).
Choose Medidata for: clinical trial optimization and design, synthetic/external control arms for single-arm trials, multi-therapeutic area RWE (Medidata's trial data spans all disease areas), and integrated trial-to-RWE workflows where you need both clinical trial management and real-world evidence from one platform.
The Convergence
Both platforms are converging toward the same vision: an integrated evidence generation platform that spans clinical trials and real-world data. Flatiron is adding trial capabilities (Flatiron Edge for decentralized trials). Medidata is deepening RWE partnerships. The winner will be whoever can seamlessly connect prospective trial data with retrospective real-world data in a single analytical environment.
For product managers in clinical evidence, the lesson is clear: the artificial boundary between "clinical trial data" and "real-world data" is dissolving. Build your products for a world where evidence is continuous, not episodic.
Frequently Asked Questions
What is real-world evidence (RWE) in healthcare?
Real-world evidence is clinical evidence derived from real-world data (RWD) — data collected outside of traditional clinical trials, including electronic health records, claims databases, patient registries, wearables, and genomic data. RWE is used to supplement clinical trial data for regulatory submissions, support payer coverage decisions, monitor post-market safety, and guide clinical practice. The FDA has increasingly accepted RWE in regulatory decisions since the 21st Century Cures Act of 2016.
What is a synthetic control arm?
A synthetic control arm uses historical patient data (from previous clinical trials or real-world sources) as the comparator group instead of enrolling new control patients in a randomized trial. This approach is particularly valuable for rare diseases where enrolling enough patients for a control arm is impractical or unethical. Medidata offers a commercial version built from its historical trial data, and the company has reported FDA acceptance of a hybrid external control arm in the design of Medicenna's Phase 3 trial in recurrent glioblastoma. The FDA accepts synthetic controls when properly validated, though they're not yet a substitute for randomized controlled trials in most indications.
How does Flatiron Health collect its oncology data?
Flatiron collects data through a network of 280+ community cancer clinics, many of which use Flatiron's OncoEMR system. The key differentiator is Flatiron's 3,000+ clinical data abstractors who manually review physician notes, pathology reports, and imaging results to extract structured research-grade endpoints like progression-free survival, line of therapy, and biomarker status. This technology-assisted human curation produces higher quality data than pure NLP extraction, which is why the FDA has accepted Flatiron data in 20+ regulatory submissions.