
Real-World Data and the Potential to Identify Oncology Therapeutic Options

Dec 12, 2020

ORION by VieCure

Volume 1, Issue 12

Fred Ashbury, PhD and Devan Birch, BA

Randomized clinical trials (“RCTs”) are expensive, and considerable time and effort are needed to administer them, record and analyze the data, and generate results that, if the trial is successful, can change practice. Moreover, patient participation rates are abysmally low: studies have repeatedly shown that fewer than 1 in 20 cancer patients participate in trials. Participation is low for several reasons, including patient-related barriers (such as knowledge of and attitudes towards research, and practical concerns such as transportation, the need to care for others, and employment), a lack of available trials at the practice where patients are being treated, an absence of trained staff to facilitate recruitment in smaller community practices, and few or ineffective tools/platforms in oncology practices to identify potential trial participants efficiently and effectively.



Given the challenges of RCT implementation, recruitment, and retention in the community setting, where at least 70% of cancer patients are treated in the US, there has been considerable discussion recently on collecting real-world data (RWD) from cancer practices as an alternative to clinical trials to identify potential therapeutic options. It is posited that RWD, drawn from patient characteristics and treatment patterns in the practice setting, could be gathered more rapidly and in sufficient numbers, then deidentified, stored, cleaned, and analyzed to accelerate applications to the Food & Drug Administration for approval of novel agents or of existing drugs for new indications. Achieving this cost-effectively requires computerized tools that systematically collect RWD as discrete data.


RWD elements can include patient demographics, clinical data (e.g., diagnosis, stage, histology, molecular characteristics, comorbidities, allergies, concurrent medications), surgery (type), radiation (type, dose, fraction), chemotherapy, anti-hormonal therapy, immunotherapy, targeted therapy (drug(s), dose, frequency, duration, line of therapy, treatment intent), and outcomes data. A major source of these data is commercial electronic medical record (EMR) platforms, where the potential for discrete data capture exists. As Sweetenham notes, if the quality of data from these sources does not match that of data produced by well-designed, controlled studies, the conclusions drawn from analyses of these data are suspect. We examined the completeness of electronic data captured as discrete data in an EMR used in two different oncology practices. Clinical staff in both practices had considerable experience with the platform after several years of use. We looked at the data elements that would minimally be needed to understand practice patterns. The results of our analyses were not encouraging, as the tables below demonstrate.

Table 1 shows that, at both Site A and Site B, data completeness was high for certain elements, including patient demographic variables, physiologic data, diagnosis, and lab data. Other data, however, were poorly captured as discrete elements, including treatment-related data, toxicities, and assessments.


Table 1: Data Element Completeness at Site A & Site B
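As an illustration only, per-element completeness of the kind summarized in Table 1 could be computed along the following lines. This is a minimal sketch, not the method used in our analysis: the records, field names (`site`, `diagnosis`, `stage`, `toxicity_grade`), and values are hypothetical, and a field is assumed to be "complete" when it holds any non-blank discrete value.

```python
from collections import defaultdict

# Hypothetical, illustrative records; not data from the two practices.
records = [
    {"site": "A", "diagnosis": "NSCLC", "stage": "IV", "toxicity_grade": None},
    {"site": "A", "diagnosis": "Breast", "stage": "", "toxicity_grade": None},
    {"site": "B", "diagnosis": "Colon", "stage": "II", "toxicity_grade": "2"},
    {"site": "B", "diagnosis": "NSCLC", "stage": None, "toxicity_grade": None},
]

# Hypothetical list of discrete data elements to audit.
ELEMENTS = ["diagnosis", "stage", "toxicity_grade"]


def is_populated(value):
    """Assumption: None and blank strings count as missing."""
    if value is None:
        return False
    if isinstance(value, str) and value.strip() == "":
        return False
    return True


def completeness_by_site(records, elements):
    """Return {site: {element: percent of records with a populated value}}."""
    counts = defaultdict(lambda: defaultdict(int))
    totals = defaultdict(int)
    for rec in records:
        site = rec["site"]
        totals[site] += 1
        for el in elements:
            if is_populated(rec.get(el)):
                counts[site][el] += 1
    return {
        site: {el: 100.0 * counts[site][el] / totals[site] for el in elements}
        for site in totals
    }


result = completeness_by_site(records, ELEMENTS)
for site in sorted(result):
    for el in ELEMENTS:
        print(f"Site {site}  {el:<15} {result[site][el]:5.1f}%")
```

A check like this only measures whether a discrete field is populated; it says nothing about whether the value is accurate, and it cannot see data that exist only inside scanned or PDF documents.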

More importantly, genomic data are stored as documents in the practices’ EMR instead of being captured discretely. By implication, genomic, treatment, and outcome data must be extracted from the PDF reports and other notes, either manually or by electronic means such as natural language processing tools, to increase the quantity and quality of data captured for these patients before this EMR platform can be useful for understanding patterns of care and identifying other potential therapeutic options. Better still would be an electronic platform architected to capture these data easily and discretely, facilitating extraction, analysis, interpretation, and reporting.



Comments From Paul Bunn Jr MD FASCO

Distinguished Professor of Medicine and the James Dudley Endowed Professor of Lung Cancer at the University of Colorado School of Medicine


A lot of relevant physical, clinical and genetic/genomic data are not available discretely. Some data are easy to capture but frequently missing, including race, ethnicity, performance data, PD-L1 expression, weight, and weight change. Other data are more complicated to capture, such as quality-of-life assessments, response to treatment, duration of response, progression-free survival, and toxicities (grade and assessments), but are vital for facilitating clinical trial recruitment, identifying novel therapeutics, and supporting quality assurance. We need the National Cancer Institute, the Food & Drug Administration, the pharmaceutical and biotech industries, and radiation therapy machine manufacturers that collect necessary data to work with electronic health/medical record companies to capture the required data. Where any of these companies have been unwilling or unable to collaborate, these real-world data are less than optimal.
