May 28, 2019 | False positives and negatives are an uncomfortable reality in bioassays. A recent study published in JCO Precision Oncology (DOI:10.1200/PO.18.00191) suggests that most assay discordance is the result of technical variations. According to Leonora Balaj, Instructor of Neurosurgery at Massachusetts General Hospital (MGH), improving the accuracy of these tests will require a return to the basics, beginning with improvements to the pre-analytical process.
With the reliability of these tests in question, change is inevitable. A certain level of re-education is required, Balaj says, both for primary care doctors and for patients. Patients may come into a test expecting results the biopsy cannot provide, while doctors often cling to protocols they have practiced for years and are reluctant to abandon.
On behalf of Diagnostics World, Christina Lingham and Rory McCann spoke with Balaj about the issues with variant calling in assays, ways of improving the readability of the test, and how all of this impacts the consumer.
Editor's note: Lingham and McCann, Conference Producers at Cambridge Healthtech Institute, are planning a track dedicated to Clinical Application of Circulating Biomarkers at the upcoming Next Generation Dx Summit in Washington, DC, August 20-22. Balaj will be speaking on the program. Their conversation has been edited for length and clarity.
Diagnostics World News: A recent study in JCO Precision Oncology reports on faulty tests, referencing issues with variant calling and discordant data. What do you think is the impact of this finding?
Leonora Balaj: It's a huge issue, and I would say the common denominator that keeps coming up, independent of what the assay is—sequencing, or droplet PCR, or targeted sequencing, or arrays—is that we all seem to run into the same problems when we're dealing with low copy number input, a low amount of information, right? The liquid biopsy field is particularly affected by this low input, this low availability of sample. It doesn't matter how sensitive we think the assays are, they all have huge limitations in detecting these rare events that we think indicate the presence of a tumor. They did amazing work here. It's rare to see this type of work: an independent study with a really well-controlled set of samples and data analysis. I don't think it's a huge surprise to a lot of people in the field that this is happening. But it is good for the field to raise a red flag and say, "Hold on," because maybe what we see is not exactly what we think we're seeing, and maybe we are seeing what we want to see, because one of the issues is that typically we like to see changes that correlate with pathology.
And so even in this study, they see more concordance on gene variants that indicate a pathology than on variants that indicate something benign. So again, that goes with the idea that we like to see more variants that are an indication of a pathology, and so there's certainly work to be done in optimizing these algorithms and the way that we look at data. Remember that the problem is always these very rare events coupled with these very highly sensitive technologies. And if you think of an ideal world, none of them have been tested properly. We're testing new ways to look at liquid biopsy in biology, which we think we understand but maybe don't understand properly and exactly, with new technologies that have obviously been tested in an artificial setting, with spike-ins and all the great controls, but have never really been tested in a biological setting. It's a little bit like the chicken and the egg. We don't know what to test against, and we don't know which one is working perfectly, so it takes testing different things, in different settings, and then trying to optimize. Over the years, hopefully we'll get there, and I'm sure we will, but I feel like that's where we are a little bit now: stuck.
And I come from the exosome field, where we have a very similar issue, which is that exosomes are limited. Either there are not a lot of them, or there are a lot of them in the plasma but they don't necessarily come from the cell of origin that you're trying to study.
You may have a billion exosomes, but maybe only a hundred thousand of them are really the ones that you're interested in, so you can come up with really great technologies, but you're still looking at information coming from a billion exosomes. How do you really get to those hundred thousand exosomes? People are trying to enrich and deplete the background and all of that. But again, it's a work in progress to understand, really, what works in this biological context. Remember, we're talking biology, which we think we understand, but there are still things that we are still learning, day in and day out.
What do you think are some ideas on how we can improve reliability of the test?
This really goes back to the field of liquid biopsy, which is intriguing, but also facing major challenges. One challenge is that we don't have a lot of starting material, so we're doing this in 1 ml of plasma, 2 ml of plasma, sometimes 4 or 5 ml of plasma if we're lucky. But nobody really wants to optimize an assay on 10 ml of plasma, because a patient may not love it, and a clinician may not love drawing that much blood from their patients. Everybody has tended to stay at the lower limit, but 1 or 2 ml may give you very few copies.
Now on the other hand, you're trying to use this variant calling where you say, I'm going to look for these rare events. So variant calling, combined with a very rare event, is undoubtedly going to create a lot of false events and false positives.
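Balaj's point about very few copies can be made concrete with a back-of-the-envelope Poisson sketch. All numbers here are illustrative assumptions (a plausible cell-free DNA yield and variant allele fraction), not figures from the study or from Balaj's lab:

```python
# Sketch of why low plasma volumes make rare variants hard to call.
# Assumed inputs: ~1,500 genome equivalents of cfDNA per mL of plasma
# and a 0.1% variant allele fraction. Both are illustrative only.
import math

def expected_mutant_copies(plasma_ml, genomes_per_ml, vaf):
    """Expected number of mutant DNA copies in a blood draw,
    assuming the mutant fragments are uniformly mixed in plasma."""
    return plasma_ml * genomes_per_ml * vaf

def p_zero_copies(mean_copies):
    """Poisson probability that the draw contains no mutant copies at
    all -- a false negative that no assay chemistry can rescue."""
    return math.exp(-mean_copies)

for ml in (1, 2, 10):
    mean = expected_mutant_copies(ml, genomes_per_ml=1500, vaf=0.001)
    print(f"{ml} mL plasma: ~{mean:.1f} mutant copies expected, "
          f"P(zero copies in tube) = {p_zero_copies(mean):.3f}")
```

Under these assumptions, a 1-2 ml draw carries only a handful of mutant copies and a real chance of containing none at all, while a 10 ml draw largely removes that sampling risk, which is exactly the tension between assay comfort and assay reliability described above.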
This is a good red flag to raise, so that it sort of puts the brakes a little bit on the field. Everybody that wants to publish something similar, or imagine a company that's just working on a test, they'll have to measure themselves a little bit, and have to explain to their consumers, or to their audience, where they think they stand and what they think they're doing to reduce these biases. For me, I think it's a good thing if there are more studies like this, and I'm sure there will be.
One way of improving it is to go back to the basics. There is plenty of evidence showing that most of the errors in a test—upwards of 60% actually—are introduced at the pre-analytical stage. That includes sample collection, and how the sample is handled and processed. As you can imagine, collecting a blood sample is really straightforward, but also not, depending on whether you have a good patient, whether you have good phlebotomists, whether there's some hemolysis, whether the sample sat on the bench for an extra 30 minutes. All of those things combined can increase the error rates, the technical problems, of any test.
Now with variant calling, you have the added layer of the algorithms that you're developing, right? I guess what I'm trying to say is: pay attention to all these pre-analytical steps and optimize them as well as you can for a specific platform and specific kits, combined with well worked-out algorithms like they've used here, where you try to compare a few and see which one works best.
And the other thing that they mentioned in this study, which is a really good point, is: test yourself. Test your assays. Add in positive controls, add in your negative controls, add in samples that you're blinded to, so you can really challenge your tests and your algorithms to make sure that what you're seeing is not what you set out to see in the first place.
Who do you think the gatekeepers are that are ensuring test quality, and what do you think they're doing to ensure efficacy and improving the tests?
That's a really good question. I think a lot of this work really goes back to the lab that's optimizing the assays, and to their work ethic. What I would hope is that the people developing the tests really use external controls to challenge themselves and their assay, and work with different types of biofluids, to make sure that the assay is really working as it should, and not just in their ideal world.
That's one part of it. So again, the gatekeepers are the people who are actually developing the tests. The other part is the clinicians, and they're probably the biggest gatekeeper here, to make sure that there's enough education and enough will to shift clinical care a little bit, whether that's being open to new tests and new ideas. And what happens is that in academic centers—for example, at MGH—it's easier for a clinician to know the literature, understand what's going on, and be open to novel tests. But imagine that this may not be very easy for much of America, where patients see their local doctor or their local hospital, and a big academic hospital may be hours away, and so shifting this culture around clinical care will be a major thing. I don't know exactly how that's going to come, but that is a big problem at this point.
You said local hospitals may have a harder time than say, MGH. Is there a lack of accountability for a local hospital that may not have the same visibility that a larger research hospital might have?
So maybe a local hospital is not the right term, but imagine a small local doctor's office somewhere in the middle of the country where there's not a lot of new research going on. Either way, there's a part of the clinical world that, traditionally, has been doing things a certain way, and they keep doing things that way because the protocols are established, and there has to be a biopsy for them to be able to continue to do their work. So if you go in and say, "Hey, I have a great test for you, and you don't need to biopsy your patients anymore," they're going to be like, "Why? This is what I do." It's not going to change overnight, but there has to be a change where, maybe, the next generation of clinicians will not be so strongly tied to doing those biopsies, thinking instead of a holistic approach to patient care.
And also, let me just add one more thing, because in a way, clinicians should be very strict about what exactly to accept. They can't accept just any test. That's a given, but there has to be a good balance between understanding what's a good test, and what can actually change and improve a patient's life or a patient's care, versus saying "No" to everything. There has to be a good balance in there.
So how do you think this all will impact consumers?
I think a good consumer is always an educated consumer. Obviously, we don't want every patient to become a scientist, but understanding how things work and understanding what's actually possible goes a long way. For example, you see this in the news sometimes, where they'll say, "We can diagnose a tumor using one microliter of blood." Right? And it sounds so easy, it sounds so cool, and that's what everybody remembers. But that's not exactly the truth. Educating people and educating patients is always a good thing so that they understand that it takes time to actually get a good test, and a test that actually works.
But on the other hand, you can't expect patients to know everything. The other part of this will be with insurance companies, trying to see how the tests work and what they can cover. There will be a stage where a test is close enough, has shown enough validity, that it can actually help the patient, actually improve a life, or serve as a good companion diagnostic, or whatever it is. It's been validated enough, but maybe not enough for the insurance company to say, "Yes, I'm going to reimburse you for this test." A lot of patients will find themselves paying for the test out of pocket, or struggling because they don't have the money.
I think that's where consumers probably try and be a little bit more aware of what's happening, so that insurance companies, which do have a lot of power now, try and be more reasonable and accept these new tests when they're really valid.
Foundations, like patient advocacy groups and all of that, could do a better job, I think, in trying to screen through the sea of tests that are constantly being developed, to try and understand which ones are the real ones, which ones have more potential. For example, I work on brain tumors. Thankfully it's not a very common disease, and the test that we are developing can actually have a huge impact on these patients. But it will be rare to get a company to be interested in developing a test for this subset of disease because it's not cost-effective. They would spend a lot of money in developing a CLIA-validated test, but then the market for it probably would never cover the actual cost of developing the test. So that's where, again, foundations and patient advocacy groups could have a voice and try to raise money, or change policies to bring awareness to these rare, low incidence diseases that still require a test.
That's a great segue to how your work addresses some of these issues with variant calling and reliability. How do you ensure that the test is robust and reliable?
We don't necessarily do a lot of whole genome or whole transcriptome sequencing, but the issues are always the same. When you have a low amount of input material, every test is going to have issues; it's not going to be as robust. In brain tumors we're interested in two particular mutations for now, and the best way for us to assay them at this point is droplet PCR, because one of them in particular is actually a point mutation, and sequencing is a huge cost when you're only interested in one specific mutation. That's the reason we don't really do sequencing for this type of analysis. But nonetheless, like I said: low input, same issue. And the way we've optimized the assay is by challenging ourselves: having all our positive controls, all our negative controls, blinding ourselves.
We work very closely with an industry partner, so we'll send blinded samples to them and get back the data, and we match it up with the clinical data we have on our patients and work out the lower limit of sensitivity for detection. That's where you need a lot of work to ensure that what you see is a real signal and not a false positive, which is the issue that we're dealing with. The easier thing for us, at least in this case, is that this mutation does not really exist in normal people, so if it's present, then there is a tumor. But again, we want to make sure that we get zero copies when there is no mutation in the primary tumor, and so establishing that lower limit of detection has been very important for us.
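One common way to set the kind of detection threshold Balaj describes, sketched here with made-up droplet counts (this is not Balaj's actual protocol or data), is a limit-of-blank style cutoff derived from negative controls: a sample is only called positive if its mutant-droplet count clearly exceeds the background seen in wild-type-only wells.

```python
# Hedged sketch: set a droplet-PCR detection threshold from negative
# (wild-type only) control wells, then call samples against it. The
# mean + 3*SD rule and the counts below are illustrative assumptions.
from statistics import mean, stdev

def detection_threshold(blank_counts, k=3):
    """Limit-of-blank style threshold: mean plus k standard deviations
    of the mutant-positive droplet counts observed in blanks."""
    return mean(blank_counts) + k * stdev(blank_counts)

def call_sample(mutant_droplets, threshold):
    """Call a sample positive only if it exceeds the blank-derived
    threshold, guarding against low-level false positives."""
    return "mutation detected" if mutant_droplets > threshold else "not detected"

# Hypothetical mutant-droplet counts from eight wild-type control wells.
blanks = [0, 1, 0, 0, 2, 0, 1, 0]
threshold = detection_threshold(blanks)

print(f"threshold ≈ {threshold:.2f} mutant droplets")
print(call_sample(1, threshold))   # within blank noise
print(call_sample(8, threshold))   # well above background
```

The design choice here mirrors the point in the interview: because the mutation essentially never appears in normal samples, the blanks should sit near zero, and the threshold mainly protects against the occasional stray positive droplet being mistaken for tumor signal.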