What Real-Time Clinical Trials Demand From Your Tech Stack
By John Oncea, Chief Editor, Clinical Tech Leader

The FDA’s real-time clinical trials initiative is being discussed as a policy story and a regulatory milestone. It is also, and perhaps more consequentially for the industry, an infrastructure story.
Making trial data visible to FDA reviewers as studies progress does not happen by itself. It requires an unbroken chain of capability: sites that capture data consistently and on time, sponsors that can integrate and transmit it without delay, and technology platforms that can do all of this in a way that regulators consider credible. Each link in that chain is a point where the industry often falls short today.
The most important technical clarification from the FDA’s May 15 Industry Information Session: RTCT is not continuous access to raw electronic health record data. The agency expects structured signal reporting, predefined endpoint feeds, aggregated safety outputs, and AI-assisted summaries transmitted through validated pipelines. That scope is narrower than some early coverage implied, but the infrastructure requirements are no less demanding.
This is Article 2 of a three-part series. Article 1 covered what the FDA announced and why it matters. Article 3 examines the legal, regulatory, and operational risks of RTCT.
The Fundamental Shift: From Batch Reporting To Continuous Flow
The most important technology change RTCT introduces is the replacement of end-of-study batch reporting with continuous data flow. Under the traditional model, data moves through a predictable sequence: sites capture it, sponsors clean and lock the database, analysts process it, and the FDA eventually receives a submission. There is time, sometimes a great deal of time, built into every step.
Under RTCT, that buffer disappears for the signals being transmitted. Every delay in site entry, every inconsistency in query resolution, every gap in the data pipeline becomes potentially visible to regulators before sponsors have had the opportunity to address it.
The practical implication is that real-time review is different from real-time execution. If the upstream workflow is still slow or fragmented, regulators will simply see flawed data sooner. The value of RTCT depends almost entirely on the quality of the operational infrastructure that produces the data being transmitted.
Data Capture And Integration: Where The Chain Breaks
The most common failure point in clinical data pipelines is also the most upstream: site-level data entry. Sites routinely enter data days after collection. Under batch reporting, a weekly clean-up cycle was standard and generally acceptable. Under RTCT, a significant lag means the FDA’s live view of a trial may already be outdated before a reviewer opens the dashboard.
Multi-site trials compound the problem. Different sites have different habits. One site resolves queries in 24 hours; another takes two weeks. In near-real-time monitoring, those inconsistencies no longer sit as footnotes in a submission that someone will eventually address. They surface as active data quality gaps that the FDA can observe as they occur.
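To make the site-to-site gap concrete, a sponsor could track entry lag per site before any data reaches a regulator. The following is a minimal sketch, not a description of any platform mentioned in this article: the record layout, the site names, and the two-day tolerance are all assumptions made for illustration.

```python
from datetime import date
from statistics import median

# Hypothetical records: (site_id, date collected, date entered in the EDC).
records = [
    ("site-A", date(2025, 3, 3), date(2025, 3, 4)),
    ("site-A", date(2025, 3, 5), date(2025, 3, 5)),
    ("site-B", date(2025, 3, 3), date(2025, 3, 12)),
    ("site-B", date(2025, 3, 6), date(2025, 3, 17)),
]

MAX_LAG_DAYS = 2  # assumed internal tolerance, not an FDA threshold


def median_entry_lag(rows):
    """Return {site_id: median days between collection and entry}."""
    lags = {}
    for site, collected, entered in rows:
        lags.setdefault(site, []).append((entered - collected).days)
    return {site: median(values) for site, values in lags.items()}


lag_by_site = median_entry_lag(records)
# Sites whose typical lag exceeds the tolerance are candidates for follow-up.
late_sites = sorted(s for s, lag in lag_by_site.items() if lag > MAX_LAG_DAYS)
print(lag_by_site, late_sites)
```

A report like this does nothing to fix slow entry on its own, but it shows a sponsor which sites would look stale on a live dashboard.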
Tala Fakhouri, who built AI policy at the FDA before implementing it for sponsors at Parexel, frames scalability as the central technical challenge. “The biggest bottleneck is how do we scale this technology solution to all of the players involved in the ecosystem?” she said. The proof-of-concept studies demonstrate that the model works for two sponsors with modern, integrated tech stacks. The harder question is how it works across dozens of sponsors, hundreds of sites, and the full range of legacy technology environments that currently define the clinical research industry.
Kent Thoelke, CEO of Paradigm Health, the technical platform used in AstraZeneca’s proof-of-concept, describes an architecture that begins not at the sponsor or the EDC, but at the clinical care environment itself. “The architecture starts at the clinical trial site, as data are generated during routine patient care and trial execution,” he said. “Our platform connects to the healthcare provider environment and ingests both structured and unstructured trial-relevant data, labs, vitals, medications, encounters, physician notes, imaging reports, pathology reports, and protocol-specific data.”
Once ingested, that data is mapped to both traditional Case Report Forms and a study-specific Clinical Trial Reporting Schema defined collaboratively by the sponsor, the FDA, and the technology partner before the trial begins. The schema specifies which protocol-defined events should be surfaced, when, and under what workflow controls. Signals are then routed through governance workflows, including site review, sponsor review, correction mechanisms, version control, and audit tracking, before or alongside FDA visibility, depending on the agreed study model.
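One way to picture that gating logic is as a lookup from event type to the workflow state a signal must reach before it becomes visible to the FDA. This is a hedged sketch only: the state names, event types, and the `visible_to_fda` function are hypothetical and do not come from Paradigm Health, AstraZeneca, or the FDA.

```python
from dataclasses import dataclass

# Hypothetical workflow states, in order. The article names the review steps
# but not any concrete state names.
STATES = ["captured", "site_reviewed", "sponsor_reviewed", "released"]

# Hypothetical schema: event type -> minimum state required before FDA
# visibility. Different entries model "before or alongside" review.
REPORTING_SCHEMA = {
    "serious_adverse_event": "site_reviewed",
    "primary_endpoint_reading": "sponsor_reviewed",
}


@dataclass
class Signal:
    event_type: str
    state: str
    version: int = 1


def visible_to_fda(signal: Signal) -> bool:
    """True if the signal is in the schema and has reached its required state."""
    required = REPORTING_SCHEMA.get(signal.event_type)
    if required is None:
        return False  # not a predefined signal, so never transmitted
    return STATES.index(signal.state) >= STATES.index(required)


print(visible_to_fda(Signal("serious_adverse_event", "site_reviewed")))
print(visible_to_fda(Signal("physician_note", "released")))
```

The design point is that anything absent from the agreed schema is never transmitted, regardless of how far it has progressed through review.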
“The FDA is not receiving a raw EHR feed or unrestricted patient-level data stream. The FDA-facing layer is a controlled transmission of predefined signal metadata tied to the agreed schema.” – Kent Thoelke, CEO, Paradigm Health
A credible real-time data pipeline also needs standardized data structures, validated transmission protocols, and governance documentation sufficient to demonstrate to regulators that the data flowing through the pipeline is as dependable as data submitted through traditional channels. On interoperability standards, Thoelke was specific about how that was built in the proof-of-concept: “We leveraged direct EHR integrations using government-mandated interoperability standards, such as FHIR, which is actually a great example of one part of the government benefiting from standards created and enforced by another part of the government.” That FHIR foundation was layered with normalized site-level data collection workflows and supplemented by JSON-based API integrations with FDA systems, built collaboratively with the agency’s own technology teams.
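For readers unfamiliar with FHIR, the sketch below shows roughly what flattening a single FHIR R4 Observation into a normalized record might look like. The Observation fields used here (`resourceType`, `status`, `code.coding`, `valueQuantity`, `effectiveDateTime`) are part of the FHIR standard; the sample values, the `normalize` function, and the shape of the output record are assumptions for illustration and do not represent the FDA-facing API or any vendor's format.

```python
import json

# A trimmed FHIR R4 Observation; the values are invented for illustration.
observation = {
    "resourceType": "Observation",
    "status": "final",
    "code": {"coding": [{"system": "http://loinc.org", "code": "718-7",
                         "display": "Hemoglobin [Mass/volume] in Blood"}]},
    "subject": {"reference": "Patient/example"},
    "effectiveDateTime": "2025-03-03T09:30:00Z",
    "valueQuantity": {"value": 13.2, "unit": "g/dL"},
}


def normalize(obs: dict) -> dict:
    """Flatten a finalized Observation into a hypothetical normalized record."""
    if obs.get("resourceType") != "Observation" or obs.get("status") != "final":
        raise ValueError("expected a finalized Observation")
    coding = obs["code"]["coding"][0]
    # The patient reference is deliberately left out of the normalized record.
    return {
        "code_system": coding["system"],
        "code": coding["code"],
        "value": obs["valueQuantity"]["value"],
        "unit": obs["valueQuantity"]["unit"],
        "collected_at": obs["effectiveDateTime"],
    }


record = normalize(observation)
payload = json.dumps(record, sort_keys=True)  # JSON body for a downstream API
print(payload)
```

Rejecting results that are not yet final mirrors the operational point Thoelke makes later in this article: unfinalized lab or imaging results are a source of latency, not something to transmit early.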
Building all of that took years of prior investment in healthcare provider integrations. For the broader industry, the implication is direct: sponsors and vendors that have not made comparable investments in site-level integration cannot quickly acquire or deploy their way to RTCT readiness.
Thoelke also identified where latency actually lives in a production pipeline, and it is not where most technology teams instinctively look. “The main latency risks are not usually the technology itself. They are operational dependencies, delayed site documentation, external lab, or imaging results not yet finalized, reconciliation workflows, or missing source information that requires clarification before release.” Proximity to the point of care is what mitigates those dependencies; distance from it is what creates them.
AI And Analytics: A Support Tool, Not A Black Box
The FDA’s RTCT initiative is explicitly built on AI and data science advances. The agency’s RFI is formally titled the “AI-Enabled Optimization of Early-Phase Clinical Trials Pilot Program.” AI is not peripheral to RTCT; it is load-bearing.
The May 15 Industry Information Session reinforced a point the April announcement underemphasized: AI tools used in RTCT workflows must be auditable, reproducible, traceable, and explainable. The FDA is not proposing that AI-generated outputs alone drive regulatory decisions. It is proposing that AI serve as a support tool for reviewers who remain accountable for the interpretations they make.
Fakhouri is direct about where the readiness gap actually lies. “You can’t have your AI ice cream unless you’ve had your data vegetables,” she said. “Your data has to be centralized, not siloed, and accessible for AI applications.” When she began working with sponsors on AI implementation, she encountered a consistent pattern: pressure from leadership to adopt AI, but data infrastructure too fragmented to support it. Silos, weak governance, and what she calls “pilotitis” (small-scale AI pilots that could never scale) were the norm. That has begun to change, with more emphasis on data readiness, nimble governance structures, and pairing domain experts with technical implementers.
Thoelke draws the same boundary from the implementation side. The AI-enabled portions of the workflow are focused on data organization, classification, mapping, extraction, and workflow acceleration: surfacing relevant information within unstructured clinical documents, mapping data to protocol-specific fields, and identifying potential events that meet predefined criteria. But interpretation remains human. “A signal that appears important still needs to be interpreted by a human reviewer within the broader medical, statistical, safety, and operational context of the study,” he said. “AI is not making independent clinical conclusions or regulatory determinations.”
“You have to just assume that your information, in the next six months or next year, will be analyzed by an AI tool from a regulator. So your data has to be AI-ready.” – Tala Fakhouri, Parexel
For RTCT specifically, the AI requirements are concrete: signal detection tools need to flag safety events, eligibility issues, and endpoint trends during the trial rather than after database lock, and those tools need to be validated, documented, and subject to change control so that mid-study model updates do not create integrity questions. Fakhouri also noted the regulatory direction of travel: the FDA has deployed an AI-assisted review tool called ELSA, and the European Medicines Agency has Regulus. Agencies are building AI-assisted workflows for submission analysis. AI-ready data is not a future requirement; it is a present one.
Monitoring And Safety Workflows: The Coordination Problem
Risk-based monitoring was designed around scheduled site visits and defined thresholds, a model built for a batch-reporting world where monitoring cycles could be aligned with data milestones. Real-time data does not wait for monitoring schedules.
RTCT forces sites, CROs, sponsors, and the FDA onto the same operational timeline, something the industry is not currently structured to handle. All four parties need to be operating in something approaching real time together. That is not a technology problem in the narrow sense. It is a coordination problem.
It requires written and practiced cross-party escalation protocols: documented answers to questions such as who acts when an AI flags a safety signal, whether the site calls the sponsor’s medical monitor directly or routes through the CRO, and what obligations exist when a discrepancy is visible before the sponsor or CRO has formally acknowledged it. Those escalation trees need to exist in contracts and training materials before trials go live.
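An escalation tree of this kind can be written down as data long before it is automated. The sketch below is purely illustrative: the role names, signal types, and routing order are invented, and real paths would come from each trial's contracts and safety management plan.

```python
# Hypothetical escalation table: (signal type, who raised it) -> ordered contacts.
ESCALATION = {
    ("safety_signal", "ai_flag"): [
        "site_investigator",
        "sponsor_medical_monitor",
        "cro_safety_desk",
    ],
    ("data_discrepancy", "site"): ["cro_data_manager", "sponsor_data_lead"],
}

# Assumed fallback so that no signal is left without an owner.
DEFAULT_PATH = ["cro_project_lead"]


def escalation_path(signal_type, raised_by):
    """Return the ordered list of roles to notify for a given signal."""
    return ESCALATION.get((signal_type, raised_by), DEFAULT_PATH)


print(escalation_path("safety_signal", "ai_flag"))
print(escalation_path("eligibility_issue", "monitor"))
```

Writing the table explicitly forces the questions above to be answered for every combination, including the ones nobody thought to discuss.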
On the data quality side, Thoelke described a feedback dynamic from the proof-of-concept that should recalibrate how sponsors think about what RTCT costs operationally: “One of the major learnings from the proof-of-concept work is that earlier visibility can actually improve data quality, because inconsistencies and missingness are identified while the clinical context is still fresh at the site.” When a discrepancy is flagged days after the clinical event rather than months later during batch review, the site coordinator who recorded the data is still available to explain it. That is not just a compliance advantage; it is a practical one.
Staff competency requirements also change. GCP training was designed for the batch-reporting world. Coordinators and monitors in an RTCT environment need additional capabilities: how to interpret live dashboards, how to respond to real-time queries, and what their obligation is when they see a discrepancy before the sponsor or CRO does.
Data Governance And Auditability
In traditional trials, data governance is important. In RTCT, it is existential. When the FDA has a near-real-time view of trial activity, every correction, every edit, and every late entry is potentially visible to regulators as it happens. Unexplained data modifications are not footnotes; they are red flags that reviewers may observe before the sponsor has had a chance to contextualize them.
Thoelke describes what live auditability looks like in practice: “Every output is tied back to underlying source information through controlled workflows, versioning, and role-based review. The platform tracks data lineage from source capture through mapping, review, correction, approval, and FDA-facing release, including who reviewed the information, when it was modified, whether it was corrected or updated, and which workflow state it was in at the time of transmission.” The goal is to make provenance questions answerable quickly because the workflow preserves the relevant traceability structure. “In many ways, the RTCT model is designed to improve auditability compared with traditional, later-stage resolution processes, where issues may surface months after the clinical event.”
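The lineage Thoelke describes can be approximated with an append-only event log in which each entry records who did what, in which workflow state, and a hash linking it to the previous entry. This is a minimal sketch of that general technique, assuming a hash-chained log; the article does not say how Paradigm Health implements its audit trail, and the class and field names here are hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone


class LineageLog:
    """Append-only event log for one data point, chained by SHA-256 hashes."""

    def __init__(self):
        self.events = []

    @staticmethod
    def _digest(body):
        return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()

    def append(self, actor, action, state, value):
        prev_hash = self.events[-1]["hash"] if self.events else ""
        body = {
            "actor": actor, "action": action, "state": state, "value": value,
            "at": datetime.now(timezone.utc).isoformat(), "prev": prev_hash,
        }
        self.events.append({**body, "hash": self._digest(body)})

    def verify(self):
        """Recompute every hash; False if any entry was altered or reordered."""
        prev_hash = ""
        for event in self.events:
            body = {k: v for k, v in event.items() if k != "hash"}
            if event["prev"] != prev_hash or event["hash"] != self._digest(body):
                return False
            prev_hash = event["hash"]
        return True


log = LineageLog()
log.append("coordinator-1", "capture", "captured", 13.2)
log.append("coordinator-1", "correct", "captured", 12.3)
log.append("monitor-7", "approve", "site_reviewed", 12.3)
print(log.verify())
```

Corrections are recorded as new events rather than overwrites, so the question of what a value was at the time of transmission can be answered by reading the log.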
Governance requirements that become non-negotiable in RTCT include: a unified data dictionary across all parties, since sites, sponsors, and CROs frequently use different terminology for the same data fields; a clearly defined and operationally enforced data entry standard; and validation documentation for every AI or automated tool used in the pipeline. In batch reporting, terminology discrepancies get harmonized during data management. In real time, they surface as ambiguity the FDA will observe before anyone has resolved it.
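In its simplest form, a unified data dictionary is a mapping from each party's local field names to one canonical name, applied before data enters the pipeline. The sketch below assumes invented field names and aliases; a real dictionary would be agreed among the site, CRO, and sponsor and would likely reference established terminologies.

```python
# Hypothetical unified dictionary: canonical field -> known local names.
DATA_DICTIONARY = {
    "systolic_bp_mmhg": {"SBP", "sys_bp", "Systolic Blood Pressure"},
    "hemoglobin_g_dl": {"HGB", "Hb", "hemoglobin"},
}

# Reverse lookup, case-insensitive: local alias -> canonical field.
ALIASES = {
    alias.lower(): canonical
    for canonical, aliases in DATA_DICTIONARY.items()
    for alias in aliases
}


def harmonize(row):
    """Rename known fields to canonical names; return (clean_row, unknown_fields)."""
    clean, unknown = {}, []
    for name, value in row.items():
        canonical = ALIASES.get(name.strip().lower())
        if canonical is None:
            unknown.append(name)  # surfaced for review instead of guessed
        else:
            clean[canonical] = value
    return clean, unknown


print(harmonize({"SBP": 120, "Hb": 13.2, "wt": 70}))
```

Unrecognized fields are returned separately so that someone resolves the ambiguity before a regulator sees it.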
Privacy controls also become more complex. Live regulatory visibility into ongoing trials creates questions about patient confidentiality, particularly for rare disease populations where individual data points can be identifying. Sponsors need to design data architectures that enable signal transmission without exposing patient-level source data, a distinction the FDA clarified on May 15, but one that many sponsors have not yet operationalized.
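One common technique for this kind of problem, offered here as an illustration and not as anything the FDA or the proof-of-concept sponsors have described, is small-cell suppression: aggregate counts are transmitted, and any count below a minimum size is withheld. The threshold of five and the event layout below are assumptions.

```python
from collections import Counter

MIN_CELL_SIZE = 5  # assumed threshold; the article does not specify one


def aggregate_events(events, min_cell=MIN_CELL_SIZE):
    """Count events by (arm, event_type); suppress counts below min_cell."""
    counts = Counter((e["arm"], e["event_type"]) for e in events)
    # None marks a suppressed cell; patient identifiers never leave this function.
    return {key: (n if n >= min_cell else None) for key, n in counts.items()}


# Invented sample data: six events in one arm, a single event in another.
events = [
    {"arm": "A", "event_type": "nausea", "patient_id": f"P{i}"} for i in range(6)
] + [{"arm": "B", "event_type": "rash", "patient_id": "P99"}]

summary = aggregate_events(events)
print(summary)
```

In a rare disease trial, where a single event can point to a single patient, whether suppression alone is sufficient is a question for the sponsor's privacy and regulatory teams, not something this sketch settles.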
Fakhouri identifies the governance controls that become non-negotiable under RTCT as a cluster: centralized data architecture, AI governance frameworks with documented model logic and change-control processes, audit readiness, and genuine cross-functional coordination between clinical operations, regulatory affairs, and pharmacovigilance. “Data needs to be centralized and AI-ready, not siloed in legacy systems,” she said. “AI governance needs to be in place before regulators are watching live.”
What This Means For The Broader Industry
In practical terms, the industry’s transition to RTCT is less a single new product category than a broad modernization pressure. The strongest platforms will be those supporting continuous, auditable, and interoperable research operations. Companies that built their tech stacks around batch-reporting assumptions now carry technical debt that real-time visibility will make apparent.
Thoelke put the site-level dependency plainly: “The RTCT effort reminds us all that regulatory review cannot purely be a downstream analytics problem. It depends on high-quality site-level data entry. The closer data capture occurs to the point of care, and the more integrated the workflow is within normal clinical operations, the more reliable and actionable the resulting signal metadata is.” No amount of sophisticated downstream analytics can reliably surface meaningful signals from data that was entered late, inconsistently, or without clinical context.
For sponsors, CROs, and vendors considering their own readiness, the proof-of-concept’s primary lesson may be less about what technology is required and more about what investment timeline is realistic. The infrastructure that made the AstraZeneca proof-of-concept work was built over years, not deployed in response to an announcement. Organizations that want to be positioned for the pilot, or for the broader adoption that follows, are already behind if they have not started.
Next in this series: Article 3 examines the legal, regulatory, and operational risks of RTCT, including unresolved questions about sponsor liability, blinding integrity, and DSMB governance.