Insight into tumour marker development

The growth of molecular genetic technologies has enabled advances in personalized medicine, particularly in the field of cancer diagnosis, prognosis and therapeutic decision-making. CLI caught up with Dr Jennifer Keynton (Diagnostics R&D Manager at Horizon Discovery) to find out more about the development of tumour markers and the role of reference standards in cancer diagnostics and treatment.

The development of next-generation sequencing (NGS) has revolutionized the amount of data that can be generated on potential cancer-causing mutations. This has allowed the development of precision or personalized medicine, with diagnosis and therapy regimens being specific to the nature of the individual patient’s tumour/cancer. What have been the discoveries in the past 20 years that have interested you most in this field?

The field of companion diagnostics has grown significantly in the past 20 years, since the regulatory approval of trastuzumab – an antibody treatment for HER2-positive breast cancer, the first drug approved with a companion diagnostic – in 1998. Terms relating to the genetics of breast cancer are now in the public domain, with phrases such as ‘triple-negative’ and ‘BRCA1/2 mutations’ broadly understood to confer poor prognosis and inform treatment pathways in patients. In 2014, the first poly(ADP-ribose) polymerase (PARP) inhibitor, olaparib, was approved to treat advanced ovarian cancer with germline BRCA mutations. Our understanding of the implication of BRCA mutations and DNA damage continues to grow, with indications for PARP inhibitors now in breast, ovarian, pancreatic and, most recently in 2020, prostate cancer. Companion diagnostics are essential to deliver these drugs to the correct patients. Another area where understanding has grown considerably is the epidermal growth factor receptor (EGFR) pathway, particularly the role of KRAS, BRAF and other downstream effectors in non-small cell lung cancers (NSCLCs). Several tyrosine kinase inhibitors have been approved for the treatment of NSCLC with EGFR exon 19 deletions or exon 21 substitution mutations, and, earlier this year, sotorasib, a drug targeting KRAS G12C mutants in NSCLC, was approved alongside two companion diagnostics. The high incidence and poor outcomes of NSCLC patients mean that improvements in the diagnosis and treatment of these cancers have the potential for great positive impacts.

One of the most exciting fields of research in diagnostics is epigenetics. Hypermethylation of the O6-methylguanine-DNA methyltransferase (MGMT) promoter in glioblastoma was one of the first epigenetic modifications identified as a prognostic indicator [1]. Since then, several epigenetic in vitro diagnostics have been approved for a broad range of cancers, including colorectal, breast, lung and cervical. Our understanding of epigenetic modifications and their impact in cancer continues to strengthen [2], and many pan-cancer diagnostic developers – such as Guardant, Grail, Foundation Medicine and Burning Rock – are beginning to employ methylation profiling. The impact of this understanding on cancer detection and patient outcomes holds great promise and I hope to see it realized in my lifetime.

There is a vast amount of literature about the identification of new tumour markers, but the translation of this information into new clinical molecular diagnostic assays is slow – what is the process for tumour marker validation and what hinders it?

There are multiple stages in the discovery, adoption and exploitation of tumour biomarkers as diagnostic indicators. Initial observations of biomarker correlation with disease incidence or prognosis must be rigorously validated, and this is complicated hugely by patient heterogeneity, genetic pleiotropy and the multifactorial nature of most cancers [3]. Forward genetic approaches, such as genome-wide association studies, also require extensive validation to understand biological mechanisms. The use of cellular and animal disease models can be challenging, expensive and time-consuming – here, the adoption of CRISPR (clustered regularly interspaced short palindromic repeats) technologies, including gene editing and more recent advances in gene inhibition and activation [4], has great potential to validate complex disease mechanisms using highly functional applications.

As more complex biomarkers emerge there is a greater need for standardization and harmonization efforts, the lack of which can hinder adoption and clinical utility. Recent examples include biomarkers such as tumour mutational burden, microsatellite instability (MSI) and homologous recombination deficiency, all of which are promising, but still in the process of standardization. Different test developers may use different metrics, risk score calculations, and cut-offs, which can complicate our understanding and ability to make informed decisions for patients. Collaborative efforts, such as Friends of Cancer Research [5], play a key role in these harmonization efforts.

Following on from identification, clinical utility of the biomarker must also be determined. Is the identified biomarker actionable, does it allow clear stratification of treatment pathways, and, finally, does this result in improved patient outcomes? The process of biomarker identification through translation could be viewed as a funnel, with a great many inputs yielding few outputs, and how we maximize that output is an ongoing challenge.

A variety of sample types have to be analysed when screening for tumour markers. What are the implications of these for assay development?

The format of a clinical sample, either a solid or liquid biopsy, is the key factor influencing the sample analyte properties and therefore the downstream assay workflow. Traditional formalin-fixed paraffin-embedded (FFPE) samples have numerous advantages; they form the largest archive of biological material available, are relatively stable and allow the preservation of both cellular and morphological details. They do, however, come with their own unique challenges.

The act of formalin fixation creates crosslinks between proteins and nucleic acids, which causes fragmentation of the nucleic acids. Fragmentation is enhanced by tissue autolysis during fixation – larger biopsies, through which fixatives are slower to penetrate, are particularly susceptible to this – and the act of paraffin embedding requires elevated temperatures which can further fragment nucleic acids. The subsequent storage temperature of samples also affects DNA integrity, with prolonged ambient storage leading to increased degradation. These factors combined can result in the loss of high molecular weight DNA from samples, which can potentially influence the results of downstream analysis. The crosslinks themselves can also lead to sequence artifacts by interfering with the activity of Taq polymerase during endpoint PCR or NGS library preparation [6]. By refining the steps of nucleic acid extraction from FFPE specimens, it is possible to limit the influence of these factors.

Deparaffinization is required to remove the paraffin prior to extraction, but some chemicals used to perform this step may also lead to artifacts. Proteinase K treatment is used to digest crosslinks and solubilize nucleic acids, although DNA and RNA respond differently, with shorter incubation times sufficient for RNA. Heat incubation can also be used to remove crosslinks and thereby reduce sequencing artifacts; however, heat incubation can also lead to increased fragmentation. DNA restoration can be performed by commercial kits designed for FFPE DNA repair prior to downstream applications.

When developing assays for FFPE material it is therefore important to consider the integrity of DNA or RNA required for analysis. Both the RNA integrity number (RIN) and DV200, the percentage of RNA fragments >200 nucleotides, are key metrics of RNA quality and have been reported to vary widely among FFPE tissues depending on tissue origin and to correlate with library yield [7]. For PCR-based techniques, amplicon length must also be considered; fragmented DNA may yield better results when amplicons are shorter. Orthogonal validation of determined variants is essential to eliminate artifacts caused by formalin-induced, non-reproducible sequence alteration.
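As an illustration of the DV200 metric described above, the following is a minimal Python sketch rather than any instrument vendor's implementation. It assumes the fragment-size trace is available as (fragment length in nucleotides, signal) pairs; that input format and the function name `dv200` are hypothetical and chosen only for this example.

```python
from typing import Iterable, Tuple


def dv200(trace: Iterable[Tuple[float, float]]) -> float:
    """Return the percentage of total signal from fragments longer than 200 nt.

    `trace` is an assumed input format: (fragment_length_nt, signal) pairs,
    where signal stands in for the amount of RNA at that fragment length.
    """
    total = 0.0
    above_200 = 0.0
    for length_nt, signal in trace:
        if signal < 0:
            raise ValueError("signal values must be non-negative")
        total += signal
        if length_nt > 200:
            above_200 += signal
    if total == 0:
        raise ValueError("trace contains no signal")
    return 100.0 * above_200 / total


# Illustrative values only, not real measurements.
example_trace = [(50, 10.0), (150, 30.0), (250, 40.0), (1000, 20.0)]
print(dv200(example_trace))  # 60.0
```

A heavily fragmented FFPE extract would shift signal toward the short fragments and lower this value, which is why the metric is useful when deciding whether a sample is suitable for library preparation.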

Another limitation of FFPE samples is that they may not fully capture tumour heterogeneity, owing to the localized nature of the biopsy. This is a key benefit of liquid biopsy, which can sample the tumour independently of the site, or even when the primary site is unknown. The minimally invasive nature of liquid biopsy is also attractive and allows repeated sampling and monitoring throughout disease progression and treatment. However, the nature of the analytes in a liquid biopsy is vastly different from that of a traditional FFPE specimen. The proportion of tumour DNA in plasma is relatively low, meaning that the variant allele frequencies (down to 0.1%, or even lower when monitoring minimal residual disease after treatment) that must be analysed can push the limit of detection for standard PCR and NGS techniques. As with FFPE samples, pre-analytical factors can influence sample properties in liquid biopsies. For example, the blood collection tube used can affect the amount of contaminating genomic DNA released by white blood cell lysis, among other factors. Although the field of liquid biopsy is still emerging, advances have been made to standardize pre-analytical protocols to limit the influence of these variables [8] – again, the role of consortia, such as BloodPAC, in this standardization is key [9].
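To illustrate why variant allele frequencies around 0.1% push the limit of detection, the Python sketch below estimates how many mutant copies are expected in a given DNA input. It rests on two assumptions that are not taken from the interview: a haploid human genome mass of roughly 3.3 pg, and Poisson sampling of mutant molecules. The function names are hypothetical.

```python
import math

# Assumption: one haploid human genome weighs roughly 3.3 pg.
HAPLOID_GENOME_PG = 3.3


def genome_copies(input_ng: float) -> float:
    """Approximate number of haploid genome copies in `input_ng` of DNA."""
    return input_ng * 1000.0 / HAPLOID_GENOME_PG


def expected_mutant_copies(input_ng: float, vaf: float) -> float:
    """Expected mutant copies at a locus for a given variant allele frequency."""
    if not 0.0 <= vaf <= 1.0:
        raise ValueError("vaf must be between 0 and 1")
    return genome_copies(input_ng) * vaf


def prob_at_least_one_mutant(input_ng: float, vaf: float) -> float:
    """Chance that the input holds at least one mutant copy (Poisson assumption)."""
    return 1.0 - math.exp(-expected_mutant_copies(input_ng, vaf))


if __name__ == "__main__":
    for ng in (5, 10, 30):
        copies = expected_mutant_copies(ng, 0.001)
        chance = prob_at_least_one_mutant(ng, 0.001)
        print(f"{ng} ng at 0.1% VAF: {copies:.1f} mutant copies, P(>=1) = {chance:.2f}")
```

Under these assumptions, 10 ng of cfDNA at 0.1% VAF contains only about three mutant copies before any losses in extraction or library preparation, so sampling alone limits detection regardless of how good the downstream chemistry is.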

Reference standards are a crucial part of an accurate assay. What goes into the development of good reference standards?

When selecting a suitable reference standard, it is important to consider the technical challenges of the assay, for example the analyte concentration, and variant class or frequency. Validation of a diagnostic assay requires rigorous analysis of the assay performance against a set of known variables. The sensitivity, specificity and accuracy of the assay must be established along with assay robustness and limit of detection. Having highly characterized samples on which to perform this validation, where crucial parameters are controlled, is essential. Clinical samples are, by their very nature, heterogeneous owing to pre-analytical variables such as sample quality (DNA yield, integrity, and purity), patient variability (including mutation, allele frequency, and cancer stage) and sample processing methods (fixative artifacts, discussed above). Clinical samples are also finite and can be difficult to obtain because of ethical considerations, with rare mutations challenging to come by.
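The performance measures mentioned above can be made concrete with a short Python sketch. It assumes the assay has been run on reference samples whose variant status is known, so that each call can be counted as a true or false positive or negative. The class name `Counts` and the numbers used are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Assay calls compared against known reference-standard truth."""
    tp: int  # variant present and called
    fp: int  # variant absent but called
    tn: int  # variant absent and not called
    fn: int  # variant present but missed

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def accuracy(self) -> float:
        return (self.tp + self.tn) / (self.tp + self.fp + self.tn + self.fn)


# Illustrative numbers, not data from any real validation study.
counts = Counts(tp=95, fp=10, tn=990, fn=5)
print(f"sensitivity={counts.sensitivity():.3f}")  # 0.950
print(f"specificity={counts.specificity():.3f}")  # 0.990
print(f"accuracy={counts.accuracy():.3f}")        # 0.986
```

These figures are only as trustworthy as the truth set behind them, which is why well-characterized samples with controlled parameters matter for validation.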

Cell-line-derived reference standards (e.g. OncoSpan, Horizon Discovery) can therefore act as a key tool in the development and validation of diagnostic assays [10]. The desired biomarker variants are either sourced or engineered in highly characterized cell lines and their presence is then confirmed through Sanger sequencing, NGS and droplet digital PCR (ddPCR). Cell-line variants are then combined to yield clinically relevant variant allele frequencies and biomarker composition before processing the cell blends into the desired format to mimic the patient sample: FFPE, gDNA or cfDNA. The design and manufacture should meet ISO 9001 and ISO 13485 quality standards to ensure the reference standards are fully validated, traceable and quality controlled. Cell-line-derived reference standards provide a source of highly characterized, reproducible, and renewable reference material, while offering great commutability to patient samples as a result of their genomic complexity – a property that is hard to mimic in synthetically derived samples. Orthogonal validation of reference standards is key to understanding their performance across multiple platforms; validating with assays such as ddPCR, Sanger sequencing and NGS provides confidence that the reference material will deliver the expected specifications regardless of the analysis method. As tumour profiling assays evolve to consider multiple biomarkers and metrics, cell-line-derived reference standards can deliver the sample complexity required to validate them and monitor their ongoing performance. Reference standards provide the constant by which you can rigorously assess your assay performance.
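The arithmetic behind combining cell lines to reach a target variant allele frequency can be sketched as follows. This is a simplified illustration, not Horizon Discovery's manufacturing method: it assumes that every cell line has the same copy number at the locus and contributes DNA in proportion to its share of the blend, which real blends would need to verify empirically (for example by ddPCR). The function names are hypothetical.

```python
from typing import Iterable, Tuple


def mutant_fraction_for_target(target_vaf: float, source_vaf: float) -> float:
    """Fraction of the blend that must come from the mutant line.

    Assumes the diluting line is wild type (VAF 0) at this locus and that
    both lines have the same copy number there.
    """
    if not 0.0 < source_vaf <= 1.0:
        raise ValueError("source_vaf must be in (0, 1]")
    if not 0.0 <= target_vaf <= source_vaf:
        raise ValueError("target_vaf must be between 0 and source_vaf")
    return target_vaf / source_vaf


def blended_vaf(components: Iterable[Tuple[float, float]]) -> float:
    """Expected VAF of a blend given (fraction_of_blend, vaf) per cell line."""
    components = list(components)
    if abs(sum(fraction for fraction, _ in components) - 1.0) > 1e-9:
        raise ValueError("blend fractions must sum to 1")
    return sum(fraction * vaf for fraction, vaf in components)


# Example: dilute a heterozygous mutant line (VAF 50%) to a 5% target.
fraction = mutant_fraction_for_target(target_vaf=0.05, source_vaf=0.5)
print(round(fraction, 4))  # 0.1
print(round(blended_vaf([(fraction, 0.5), (1.0 - fraction, 0.0)]), 4))  # 0.05
```

In practice the calculated blend is a starting point; the achieved allele frequency is what the orthogonal confirmation described above is there to establish.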

What do you envisage for the future development of tumour diagnostics?

The future of tumour diagnostics is an exciting field, with new technologies, biomarkers and analytes emerging at rapid speed. Liquid biopsy holds great promise, with the potential to lower barriers to testing and allow routine screening and monitoring. The impact of early detection on cancer patient outcomes is a great incentive, as we know that the earlier a cancer is found the more treatable it is. Screening of the healthy population is therefore an attractive prospect and, with pan-cancer tests like Galleri™ coming to the market, the hope is that these broad screens will complement existing targeted screening programmes. Diagnostic tests are being developed to analyse multiple analytes – for example, in prostate cancer, antigens, imaging, genetic signatures and gene expression levels are all used to identify and stratify disease. As our understanding improves, the information we must assimilate to successfully diagnose and prognose grows exponentially. This leads to the requirement for huge computational power and machine learning to analyse the data landscape. The cost of whole-genome and whole-exome sequencing continues to fall, and with the advancement of techniques like long-read sequencing the availability of high-quality genomic data has never been so high. In the not-so-distant future, one can imagine personal genomic data for each patient being commonplace. As the data we accumulate grows, we gain the opportunity to truly democratize healthcare outcomes, by improving our understanding of complex influences like race and ethnicity in disease. Ultimately, if the goal is to improve tumour diagnostics, then the most impact will be realized when the tests are accessible, cost-effective, and available to as broad a population as possible.

The author
Jennifer Keynton PhD, Diagnostics R&D Manager
at Horizon Discovery, a PerkinElmer company
Horizon Discovery, Cambridge Research Park,
Waterbeach, Cambridge CB25 9TL, UK
For further information visit Horizon Discovery

1. Weller M, Stupp R, Reifenberger G, Brandes AA, van den Bent MJ, et al. MGMT promoter methylation in malignant gliomas: ready for personalized medicine? Nat Rev Neurol 2010; 6: 39–51.
2. Saghafinia S, Mina M, Riggi N, Hanahan D, Ciriello G. Pan-cancer landscape of aberrant DNA methylation across human tumors. Cell Rep 2018; 25(4): 1066–1080.e8.
3. Bien SA, Peters U. Moving from one to many: insights from the growing list of pleiotropic cancer risk genes. Br J Cancer 2019; 120: 1087–1089.
4. Abarca J, Strezoska Z, Vermeulen A. What is CRISPRa vs CRISPRi? Horizon Discovery 2021; April.
5. Friends of Cancer Research.
6. Srinivasan M, Sedmak D, Jewell S. Effect of fixatives and tissue processing on the content and integrity of nucleic acids. Am J Pathol 2002; 161(6): 1961–1971.
7. Technical Note: RNA Sequencing. Evaluating RNA quality from FFPE samples. Illumina 2016.
8. Godsey JH, Silvestro A, Barrett JC, Bramlett K, Chudova D, et al. Generic protocols for the analytical validation of next-generation sequencing-based ctDNA assays: a joint consensus recommendation of the BloodPAC’s Analytical Variables Working Group. Clin Chem 2020; 66(9): 1156–1166.
9. BloodPAC.
10. Reference standards for confident assay validation. Horizon Discovery.