June 7, 2022 | “If we think about the idea of a data ecosystem, and one that’s particularly focused on cancer, what we want to be able to do is integrate all different layers and types of data so that we can have potential maximal impact for cancer research,” the National Cancer Institute’s Jill Barnholtz-Sloan said at the Bio-IT World Conference & Expo last month. During a session focused on the ways AI is transforming oncology research and care delivery, she explained how the Cancer Research Data Commons (CRDC) can be leveraged for AI and highlighted several projects.
The CRDC is intended “to preserve the long-term value of NCI-funded data and to allow for ease in data submission, access, and interoperability,” added Barnholtz-Sloan, an associate director for informatics and data science at NCI’s Center for Biomedical Informatics & Information Technology and senior investigator in the Division of Cancer Epidemiology and Genetics. The aim is to speed along cancer research via “integrative analysis of multimodal data types.”
Work is underway on a cloud-based automated data tool for medical imaging being developed in collaboration with Google and Deloitte, and an NCI-Department of Energy collaboration spans several projects, all focused on predictive models for oncology. Another example can be found in the Surveillance, Epidemiology, and End Results (SEER) Program, which aims to use natural language processing (NLP) and other tools to extract information from clinical data and enhance the cancer registry system.
Barnholtz-Sloan was joined by several representatives from industry (AstraZeneca, OWKIN, and Tempus Labs). Together, the panelists discussed current applications and future directions for AI in oncology, including its use in drug discovery and development.
“You’ve heard a lot about data today so far—where to get the data, how to connect the data—and I wanted to just talk about one part here that’s kind of missing from the equation,” said James Chen, senior VP of cancer informatics at Tempus Labs, whose remarks were geared toward the intersection where AI meets the user.
Still a practicing medical oncologist at The Ohio State University, Chen shared that he decided to leave full-time clinical practice at least in part because oncologists devote so much of their time to interfacing with the electronic health record (EHR) and completing clerical tasks. Charting alone is a huge burden, and clinical guidelines are changing quickly.
To help illustrate, Chen gave the example of the arrival of a new breast cancer drug. How are oncologists supposed to know about all of the new developments taking place, and who is going to keep them updated? It’s not like there’s an ESPN “SportsCenter” recap to tune into every night, he quipped.
The result is that even when potentially game-changing developments emerge, it can be challenging to keep up. Data is great, he added, but now there is the problem of too much information, too much documentation, and too many rapidly changing guidelines, all of which leads to note bloat. A doctor doesn’t have an extra 30 seconds during a patient visit to hop onto a phone and log into a separate AI application.
So one crucial question is how AI-driven precision medicine can be incorporated into the EHR itself. From Chen’s vantage point, AI deployment needs to be “at the point of care, at the time of care, and at the course of care.” The data coming to physicians is monitored, siloed, and trapped, he said, but by breaking it into discrete elements, “one of the first things you can do is you can unleash the power of a lot of these AI technologies.”
Finding Undiagnosed and Miscoded Cancer Patients with Cachexia
Several other conference sessions were also geared toward AI and precision health. In one of them, presenters described how Pangaea Data’s AI technology is being used to discover the clinical features that characterize cachexia in cancer patients.
The condition impacts many patients with cancer as they lose muscle and often a significant amount of body weight. While it is not a new condition, it has remained under-diagnosed and poorly understood, explained Judith Sayers, a surgical training fellow at NHS Lothian. And it isn’t as simple as people with cancer losing their appetite. “It’s really a complex metabolic syndrome driven by inflammation and hormonal and metabolic changes,” she said.
The clinical consequences can be severe. Loss of muscle leads to diminished physical function and reduced physical activity, which in turn worsen quality of life. In addition, patients with cachexia tend to have impaired responses to cancer treatments and shorter survival.
According to Sayers, the condition impacts about one-third of all cancer patients, and that figure is even higher in certain cancer types, like pancreatic or lung. The management of cancer cachexia is multi-modal, reflecting the underlying multifactorial etiology. It generally involves nutritional support and rehabilitation. In some situations, appetite stimulants or anti-inflammatory drugs may also be used.
However, to implement any management strategies, there is a need to detect and identify these patients, ideally in the early stages. In addition to the clinical benefits of early diagnosis and treatment, there is also a substantial cost-benefit.
So how can cachexia patients be detected earlier? It’s a question London-based Pangaea Data is attempting to address with its AI-driven software product, PIES. Jinqing Zhang, the company’s head of AI, said the technology automatically determines the clinical features that characterize cachexia and then uses those features to identify patients at risk.
He highlighted the results of a study conducted on a dataset of discharge summaries from nearly 60,000 ICU patients. According to the results, PIES found roughly six times more cancer patients with cachexia than ICD codes alone did (316 patients versus 51).
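To make the comparison concrete, here is a minimal, hypothetical sketch of the two approaches: flagging patients by a cachexia ICD code versus scanning discharge-summary text for cachexia-related features. The patient records, feature terms, and two-feature threshold below are invented for illustration and are not Pangaea Data’s actual method, which has not been published in this form.

```python
# Hypothetical sketch: ICD-code lookup vs. simple text-based feature
# matching on discharge summaries. All records and patterns are invented.
import re

# Toy discharge summaries (fabricated for illustration).
patients = {
    "pt-001": {
        "icd_codes": {"C34.90"},  # lung cancer, no cachexia code entered
        "note": "Progressive muscle wasting and 10 kg weight loss; poor appetite.",
    },
    "pt-002": {
        "icd_codes": {"C25.9", "R64"},  # pancreatic cancer + cachexia (R64)
        "note": "Ongoing anorexia and significant weight loss over 3 months.",
    },
    "pt-003": {
        "icd_codes": {"C50.911"},  # breast cancer, no cachexia features
        "note": "Stable weight, good functional status, tolerating chemo well.",
    },
}

CACHEXIA_ICD = {"R64"}  # ICD-10 code for cachexia

# Clinical features one *might* search for; a real system would learn
# far richer features than these illustrative keyword patterns.
FEATURE_PATTERNS = {
    "weight_loss": re.compile(r"\bweight loss\b", re.I),
    "muscle_wasting": re.compile(r"\bmuscle wasting\b|\bsarcopenia\b", re.I),
    "anorexia": re.compile(r"\banorexia\b|\bpoor appetite\b", re.I),
}

def flagged_by_icd(p):
    """Patient is flagged only if a cachexia code was explicitly entered."""
    return bool(p["icd_codes"] & CACHEXIA_ICD)

def flagged_by_features(p, min_features=2):
    """Flag a patient when the note mentions enough cachexia features."""
    hits = [name for name, pat in FEATURE_PATTERNS.items() if pat.search(p["note"])]
    return len(hits) >= min_features, hits

for pid, p in patients.items():
    nlp_flag, hits = flagged_by_features(p)
    print(f"{pid}: ICD={flagged_by_icd(p)}, text={nlp_flag} (features: {hits})")
```

In this toy example, the text scan flags a patient whose note describes muscle wasting and weight loss even though no cachexia code was ever entered, which mirrors the under-coding gap the study quantifies.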
Next, the models will be applied to more patient data by including cancer patients from primary and secondary care in the United Kingdom’s National Health Service (NHS). After that, the plan is to find cancer patients with cachexia in collaboration with NHS Lothian and the University of Edinburgh.
Extracting Info from EHRs to Create Research-Quality Data
Zhang also took part in a separate presentation explaining how Pangaea Data uses its AI capabilities, especially NLP, to extract features from EHRs and produce research-quality data. Health records within EHR systems are rich sources of information, explained co-presenter VK Gadi, director of medical oncology at UI Health and associate director of the University of Illinois Cancer Center.
They can contain lab results, notes from medical professionals, and complex data files from external vendors. But getting at all that data and analyzing it to inform treatment decisions remains a significant challenge in medicine, he continued. The information is trapped; often the only part that surfaces is the final line indicating that a patient has a certain mutation or level of risk, while the surrounding context stays buried within the health record.
Extracting tumor genomic test results from EHRs is difficult, at least in part, because of the lack of standards for communicating genomic information, Gadi explained. The files can differ from one vendor to another, information cannot easily be shared across EHR systems, and results are often delivered in an unstructured format.
But building research-quality genomic data by extracting features from the clinical text is crucial for better understanding practice patterns and how to improve patient outcomes, and Pangaea Data is looking to tackle this issue at scale using AI.
The presentation included results from a pilot study that assessed the ability of NLP algorithms to convert unstructured text data and PDF-formatted tumor genomic test results into research-quality data. According to the results, PIES extracted 26 critical features (such as biomarkers and demographics) for breast cancer patients with 97% accuracy.
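As a rough illustration of what such feature extraction can look like, here is a hypothetical sketch that pulls a handful of structured fields out of a fabricated report snippet using regular expressions. The report text, field names, and patterns are all invented; PIES itself is proprietary, and a production pipeline would also need PDF parsing (for example, with a library such as pdfminer.six) plus handling for the far messier formats of real vendors.

```python
# Hypothetical sketch: turning unstructured genomic-report text into
# structured, research-ready fields. Everything here is illustrative.
import re

report_text = """
Patient: Jane Doe   DOB: 01/02/1960   Sex: F
Specimen: Left breast core biopsy
ER: Positive (95%)   PR: Positive (80%)   HER2 (IHC): Negative
Findings: PIK3CA p.H1047R mutation detected (VAF 22%).
"""

# Each entry maps a feature name to a capture pattern (invented formats).
PATTERNS = {
    "sex": re.compile(r"Sex:\s*([MF])"),
    "er_status": re.compile(r"ER:\s*(Positive|Negative)", re.I),
    "pr_status": re.compile(r"PR:\s*(Positive|Negative)", re.I),
    "her2_status": re.compile(r"HER2 \(IHC\):\s*(Positive|Negative)", re.I),
    "mutation": re.compile(r"([A-Z0-9]+ p\.[A-Za-z0-9]+) mutation detected"),
}

def extract_features(text):
    """Return a flat dict of fields found in the report, None if absent."""
    features = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        features[name] = match.group(1) if match else None
    return features

print(extract_features(report_text))
# e.g. {'sex': 'F', 'er_status': 'Positive', ..., 'mutation': 'PIK3CA p.H1047R'}
```

The hard part at scale, as the pilot suggests, is doing this reliably across many vendors and formats rather than against a single known template.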
A large NCI grant is being put together to apply this type of machine learning within several health systems. The plan is to extract features across two cancer types (breast and lung) and examine the impact of tumor genomic testing on health equity and outcomes. There will also be efforts to use the technology to help the participating health care systems discover patterns and self-correct.
Collaborating institutions include the University of Washington, Henry Ford Health in Detroit, Kaiser Permanente in Colorado, the University of North Carolina at Chapel Hill, Washington University in St. Louis, and the University of Illinois Cancer Center. While focusing on these two cancer types, there is potential to apply the AI to over 5 million data records from over 20,000 patients.
‘There’s No God in the Machine’
During a session entitled “Precision Medicine…First We Need Accurate Medicine,” panelists explained that precision medicine has evolved to stratify patient populations using an array of new technologies but cautioned that these technologies might not reflect the full complexity of clinical medicine. They see a need for greater transparency about the accuracy of a diagnosis before precision medicine approaches can be applied.
According to Michael Montgomery, co-founder and CEO of Stable Solutions, machine learning faces several bumps in the road. There are issues with data, the models themselves, and explainability. And then there is the human problem. To help illustrate, he shared the example of blood pressure measurements, which tend to be gathered inconsistently.
Maybe you’re in a noisy clinic. Maybe you just polished off a cup of coffee. Maybe you have a full bladder. Or maybe you’re worried about the doctor visit. These and other variables can affect the measurement that makes its way into databases. And if a machine learning model is applied to that data, it could lead to faulty results.
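Montgomery’s point can be illustrated with a small, hypothetical simulation: fit the same simple classifier to the same underlying cohort while adding increasing measurement noise to the blood pressure values, and watch test accuracy fall. All of the numbers below are invented for illustration.

```python
# Hypothetical simulation: measurement noise in blood pressure readings
# degrades a model trained on them. Cohort sizes and distributions are
# fabricated purely to illustrate the effect.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000

# "True" resting systolic BP: half the cohort hypertensive, half not.
hypertensive = rng.integers(0, 2, n)  # 0/1 ground-truth labels
true_bp = np.where(hypertensive == 1,
                   rng.normal(150, 8, n),   # hypertensive patients
                   rng.normal(118, 8, n))   # normotensive patients

for noise_sd in (0, 10, 20):  # clinic noise: coffee, full bladder, anxiety
    measured = true_bp + rng.normal(0, noise_sd, n)
    X_train, X_test, y_train, y_test = train_test_split(
        measured.reshape(-1, 1), hypertensive, random_state=0)
    model = LogisticRegression().fit(X_train, y_train)
    print(f"measurement noise sd={noise_sd:>2} mmHg -> "
          f"test accuracy {model.score(X_test, y_test):.2f}")
```

As the noise grows, the recorded values overlap more across groups and accuracy degrades, which is the “faulty results” risk Montgomery described.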
“There’s no God in the machine,” Montgomery reminded his audience. We’re the ones who gather the data. We’re the ones storing the data. We’re the ones creating the learning patterns for the models. And although some techniques attempt to get around this, the models are still not entirely self-correcting.
Michael Liebman, managing director of IPQ Analytics, provided the example of triple-negative breast cancer, which is sometimes thought of as a relatively easy diagnosis because the patient is negative on three different tests. But the tests are not administered uniformly, he pointed out, and patients deemed to be triple-negative in one center may not be at another.
The underlying content and systems that are part of the entire health care delivery environment are commonly thought of as sources of truth, but that isn’t necessarily the case, noted Jonathan Morris, VP of provider solutions and CMIO at IQVIA.
For decades, electronic medical records have been viewed as the future of healthcare, and they are still largely viewed in this light. Over time, we’ve built up system-centric EHRs as the system of record and the source of truth, even though the data within may not be entirely accurate.
The information may not be entirely complete or up-to-date, either, as one anecdote revealed. Morris explained that he received his first two COVID vaccinations in a high school gym, his flu shot in an airport, and his COVID booster in a pharmacy parking lot. But when he later took part in a telehealth appointment, his doctor had none of these updates on his end.
The site of care delivery is increasingly transitioning out of traditional locations like the hospital or clinic and into other settings, he noted. From his vantage point, we’re in the very beginning stages of a profound shift toward consumer wellness, prevention, and early engagement that will take place before patients ever encounter traditional medicine.
According to Morris, as location migrates, so will the systems that we commonly think of as the sources of truth. As we increasingly move away from traditional care settings, patients are becoming creators of content as apps and other tools make it possible to participate in health in new ways.
In the past, a patient may have been asked to describe their sleep during a doctor visit to explore the possibility of insomnia or other health issues. Today, wearables can provide far more detailed information about sleep patterns, among other insights. This ability to connect the patient can be useful in the context of modeling algorithms, he added. It can also help identify patients for clinical trials and keep them engaged after a trial’s completion.
Paul Nicolaus is a freelance writer specializing in science, nature, and health. Learn more at www.nicolauswriting.com.