
Methylation landscape as a general test for cancer

DNA methylation at the cytosine of CpG dinucleotides is a key form of epigenetic regulation of gene expression, and aberrant hypermethylation of the promoter regions of certain genes has been identified in many cancers. The ability to analyse methylation status from non-invasively collected samples (such as saliva, sputum, stool and urine) as well as circulating tumour (ct)DNA in blood has led to much interest in methylation status as a potential biomarker for diagnosis, prognosis and treatment monitoring of cancer. Indeed, a multitarget stool DNA test for colorectal cancer screening (Cologuard®) that includes aberrant methylation testing of the NDRG4 and BMP3 genes was approved by the Food and Drug Administration in the USA in 2014. However, methylation state analysis of specific promoter regions requires the use of technically demanding methods, such as PCR of bisulfite-treated DNA, pyrosequencing, methylation-specific PCR, methyl-BEAMing and genomic sequencing, that have limitations of one sort or another in their use as high-throughput screening tools.
Recently, though, a paper by Sina et al. (“Epigenetically reprogrammed methylation landscape drives the DNA self-assembly and serves as a universal cancer biomarker” in Nature Communications 2018; 9(1): 4915) has described how the changes in methylation patterns in cancer genomes have a general effect on the physicochemical properties of DNA, and how this change can be used as a potential universal cancer biomarker. In the transition from normal to malignant neoplasm, the general genomic methylation pattern changes from one of dispersed methylation to general hypomethylation but with increased clustering of methylation at regulatory regions. This change in the ‘methylation landscape’ results in a difference in the solvation properties between the normal and the cancer DNA polymer, which in turn affects the affinity of DNA for gold: the more highly aggregated normal DNA exhibits low adsorption to gold, whereas the less aggregated cancer DNA shows high adsorption. The authors have been able to use these properties to create a highly sensitive and specific, non-invasive, quick (≤10 min) colorimetric assay for the detection of cancer that needs only minimal sample preparation and a small DNA input. So far, identification of this ‘Methylscape’ biomarker is only an indication of the presence of cancer – further work-up is needed to determine the location, type and stage of disease. However, this seems like the ideal first test to determine whether or not a patient’s symptoms are caused by cancer.


Methodology for finding and interpreting efficient biomarker signals

Modern ‘omics’ and screening technologies make possible the analysis of large numbers of proteins with the aim of finding biomarkers for individually tailored diagnosis and prognosis of disease. However, this goal will only be reached if we are also able to sensibly sort through the huge amounts of data that are generated by these techniques. This article discusses how data analysis techniques that have been developed and refined for over a century in the field of psychology may also be applicable and useful for the identification of novel biomarkers.

by Dr J. Michael Menke and Dr Debosree Roy

Introduction
The profession and practice of medicine are rapidly moving towards more specialization, more focused diagnoses and individualized treatments. The result is what will be called personalized medicine. Genetic predisposition will presumably remain the primary biological basis, but diagnosis and screening will also evolve from complex system outputs, observed as increases or decreases in the levels of biomarkers in human secretions and excretions. In this sense, exploration in the human sciences will undoubtedly expand into new frontiers, greater interdisciplinary cooperation, new disease reclassifications, and even the disappearance of entire scientific professions.

Big data and massive datasets by themselves can never answer our deepest and most troubling questions about mortality and morbidity. After all, data are dumb, and need to be properly coaxed to reveal their secrets [1]. Without theories, our great piles of data remain uninformative. Big data need to be organized for, and subjected to, theory testing or data fitting to best competing theories [2, 3] to avoid spurious significant differences, conceivably the biggest threat to science in history [4, 5].

Old tools for big data

New demands presented by our ubiquitous data require new inferential methods. We may discover that disease emerges from many factors working together, so that a diagnosis in one person in fact has quite different causes in another person with the same diagnosis. Perhaps there are new diseases to be discovered. There might be better early detection and treatment. Much like the earliest forms of life on Earth, pathology is far more complicated than the tidy rise of plant and animal kingdoms taught in mid-twentieth-century evolution classes.

Although new methodologies may meet the scientific requirements of big data, tools already in existence may obviate the need to invent new ones. In particular, methods developed by and for psychologists over more than 100 years may already be an answer: psychologists have long established ways of organizing and analysing data to test theories about nature’s most complex living systems. Inference and prediction from massive amounts of data from multiple sources might yield more from these ‘fine scalpels’ than from brute-force analyses, such as tests for statistical differences that look significant in many cases only because of systematic bias in population data arising from unmeasured heterogeneity. The development of some of the most applicable psychological tools began in the early 20th century for measuring intelligence, skills and abilities; these tools have therefore been used and refined for over a century. From psychological science emerged elegant approaches to data analysis and reduction that evaluate persons and populations for test validity, reliability, sensitivity, specificity, positive and negative predictive values, and efficiency. Psychological testing and medical screening share a common purpose: to measure the existence and extent of largely invisible, hard-to-measure ‘latent’ attributes by establishing how various indicators attached to the latent trait respond to the presence or absence of subclinical or unseen disease. Biomarkers are thus analogues of test questions, with each biomarker contributing information that helps establish the presence or absence of disease and its stage of progression. The analogous question posed in this paper is simply this: how many, and what kind of, biomarkers are sufficient to screen for disease?

Biomarkers for whole-person healthcare

Although the use of biomarkers seems to buck the popular trend of promoting whole-person diagnosis and treatment, biomarkers per se are nothing new. Biomarkers as products of human metabolism and waste have played an important role over centuries of disease diagnosis and prognosis, preceding science and often leading to catastrophic or ineffective results (think of ‘humours’ and ‘bloodletting’ as examples). Today, blood and urine chemistries are routinely used to focus on a common cause (disease) of a number of symptoms. Blood in the stools, excessive thirst, glucose in the urine and the colour of the eye sclera round out information attributable to a common and familiar cause, which is crucial for identifying and treating a system or body part. Signs of thirst and frequent urination may be necessary, but not sufficient, for a diagnosis of diabetes mellitus, yet can lead to quick referral or triage. The broad category of physiological signs (biomarkers) has extended, along with technology, to the microscopic and molecular.

Today, the general testing for and collection of biomarkers in bodily fluids is a growing medical research frontier. However, to many, biomarkers can be confused with genes and the epigenetic expression of genes. Small distinctions might lead to the discovery of new genes and, in turn, to new definitions of disease, more accurate detection and more personal treatment.

With the flood of data unleashed by research in these areas, a new and fundamental problem arises: how do we make sense of all these data? For now, the professions and the public may be putting their faith in ‘big data’ to make biomarkers clinically meaningful and informative. We are in good company with those who remind us that data are dumb and can be misused to support bias, and that large amounts of poor-quality data do not add up to good science. At heart, scientific theories need to be tested and scientific knowledge built in supported increments.
Biomarkers as medical tests
As with any medical test, some biomarkers are more accurate – more strongly related to disease presence and absence – and are therefore better indicators of the underlying disease state. Put another way, biomarkers represent ‘mini-medical’ tests, and their contribution to diagnosis and prognosis depends upon random factors, along with sensitivity, specificity and disease prevalence [6]. Some biomarkers increase with disease and are lower in health; others show the opposite pattern, with lower concentrations in disease. To complicate matters further, there are probably plenty of mixed signals, i.e. biomarker A is more sensitive than biomarker B, but B is more specific than A. The information acquired from multiple biomarkers needs to be organized and read in a sequence that reduces false signals – positives or negatives – or at least minimizes errors based on the risk of disease, morbidity and mortality.

Thus, the concern here is not managing and analysing the flood of biological diagnostic data, but rather its interpretation and clinical application. Balancing biomarker information at the clinical level is the function of translational research. Test-and-measurement (T&M) psychologists have worked for over a century on the science of organizing and interpreting individual items as indicators of underlying latent constructs. Tools already developed through the extremely tedious task of measuring human intelligence, skills and abilities could help improve the science, accuracy and interpretation of biomarkers [6].

Psychometric properties of biomarkers

Before embarking on a psychometric approach to biomarker interpretation, some common definitions are required. For instance, what is sensitivity or specificity? A psychometric or medical test shows high sensitivity when the underlying disease or personal characteristic is also high. For intelligence, a high test score implies high intelligence. On a single well-crafted test question, the probability of answering it correctly (formally called the probability of endorsing) increases with higher intelligence; the strength of that association determines whether the question is a strong or weak indicator of personal intelligence. When many test questions are indicators of intelligence, more correctly endorsed answers to good questions should indicate more intelligence. Indeed, some questions may be ‘easier’ than others, leading to the need to design questions that fill out the continuum of the underlying intelligence being measured. This procedure is item analysis, part of item response theory; see Figure 1 for an illustration of how multiple items ‘cover’ a given theta or disease.
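
The rising probability of endorsement with the latent trait can be written as an item characteristic curve. Below is a minimal sketch using a two-parameter logistic (2PL) item response model with invented discrimination and difficulty values (they are not taken from any published test); by analogy, theta could be disease severity and each ‘item’ a biomarker.

import math

def item_characteristic(theta, a, b):
    """Two-parameter logistic (2PL) model: probability of endorsing an item
    given latent trait theta; a = discrimination, b = difficulty."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# Three illustrative items that 'cover' different parts of the theta continuum
items = [("easy", 1.2, -1.0), ("medium", 1.5, 0.0), ("hard", 1.0, 1.5)]

for theta in (-2, -1, 0, 1, 2):
    probs = ", ".join(f"{name} {item_characteristic(theta, a, b):.2f}" for name, a, b in items)
    print(f"theta = {theta:+d}: P(endorse) {probs}")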

Notice how irrelevant the concept of sensitivity is to clinical screening and diagnosis. Sensitivity means that if we already know for sure that someone is smart or has a disease, the test and its questions will correctly describe the latent construct (referred to as ‘theta’) a certain percentage of the time, based upon the test’s ability to detect and describe the presence or degree of the latent trait. Thus, the proportion of the time the question is correct, given that we already know the person’s underlying status, is test or item sensitivity. Sensitivity is a test characteristic given that we already know the latent trait – disease status. Symbolically, sensitivity is p(T+|D+), the probability of a positive test score (T+) given we already know the person has the disease (D+). Similarly, specificity is p(T−|D−), the probability of a negative test (T−) or item given that we already know that the patient is confirmed disease-free (D−).
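
As a concrete illustration of these conditional probabilities, the short sketch below computes sensitivity and specificity from a hypothetical 2 × 2 table of test results against known disease status; the counts are invented purely for illustration.

# Hypothetical 2 x 2 counts (invented): test result versus known disease status
true_pos, false_neg = 90, 10     # results among the 100 people with disease (D+)
false_pos, true_neg = 40, 360    # results among the 400 people without disease (D-)

sensitivity = true_pos / (true_pos + false_neg)   # p(T+|D+)
specificity = true_neg / (true_neg + false_pos)   # p(T-|D-)

print(f"sensitivity p(T+|D+) = {sensitivity:.2f}")   # 0.90
print(f"specificity p(T-|D-) = {specificity:.2f}")   # 0.90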

Bayesian induction
Bayes’ Theorem is useful for many reasons, some controversial. But combining disease prevalence with biomarker sensitivity and specificity axiomatically gives the probability of an individual having a disease given a positive test.

In Bayesian terms, the positive predictive value (PPV) is the posterior probability of disease in a patient with a positive test. Two important properties of the PPV are: 1. it converts population prevalence into a personal probability of disease based on a person’s positive test; and 2. it varies directly with the population prevalence of the disease. One cannot interpret a PPV without starting from the known or estimated population prevalence. PPV decreases for rare diseases and increases for common diseases, irrespective of the test’s sensitivity or specificity estimates. For further details see Figure 3 in the open access article ‘More accurate oral cancer screening with fewer salivary biomarkers’ by Menke et al. [7].

Sensitivity and specificity are characteristics of the test, not of any patient. Such deductive quantities are not, by themselves, clinically useful. In fact, diagnosing and screening require exactly the inverted probability: what is the inferred disease state, D+ or D−, given positive or negative test results? In other words, we want p(D+|T+) instead of p(T+|D+), and p(D−|T−) instead of p(T−|D−). The probabilities are inverted from test characteristics to patient characteristics by applying Bayes’ Theorem. This inverted probability is highly influenced by disease prevalence, however, whereas sensitivity and specificity are not.
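
A minimal sketch of this inversion, assuming illustrative test characteristics (90 % sensitivity and 90 % specificity); it also shows how strongly the resulting PPV depends on prevalence while the test characteristics stay fixed.

def ppv_npv(prevalence, sensitivity, specificity):
    """Invert test characteristics into patient probabilities via Bayes' Theorem."""
    p_d, p_nd = prevalence, 1 - prevalence
    ppv = sensitivity * p_d / (sensitivity * p_d + (1 - specificity) * p_nd)    # p(D+|T+)
    npv = specificity * p_nd / (specificity * p_nd + (1 - sensitivity) * p_d)   # p(D-|T-)
    return ppv, npv

# The same hypothetical test applied at different disease prevalences
for prevalence in (0.001, 0.01, 0.1, 0.5):
    ppv, npv = ppv_npv(prevalence, 0.90, 0.90)
    print(f"prevalence {prevalence:>5.3f}: PPV = {ppv:.3f}, NPV = {npv:.3f}")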

Role of prevalence in disease detection
Generally, the higher the prevalence of a disease in a population, the easier it is to detect. Fortunately, this coincides with good intuitive sense. When screening for disease, we need to read the biomarker results diachronically to take advantage of the information added by each biomarker. ‘Diachronically’ refers to reading over time: in biomarker screening, the fewest biomarkers are required when each biomarker result – whether from antibodies or other detectors of biomarker presence – is read in the context of the other biomarkers present. Diachronic refers to the order in which biomarkers are read, not the order in which they are administered.

Biomarkers can be strongly or weakly informative. The indicator of a strong or weak biomarker is the diagnostic likelihood ratio.
More explicitly, this is called the positive diagnostic likelihood ratio, abbreviated +LR, and is calculated as sensitivity divided by (1 − specificity). The higher the +LR, the more information it conveys about the presence or absence of disease. The inverted probability, p(D+|T+), is called the positive predictive value of a test, the PPV.
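
A short sketch of the +LR and of how it updates a pre-test probability: multiplying the pre-test odds by the +LR gives the post-test odds, which convert back into a probability (the PPV after that one test). The test characteristics are the same invented values used above.

def positive_lr(sensitivity, specificity):
    """Positive diagnostic likelihood ratio."""
    return sensitivity / (1 - specificity)

def post_test_probability(pre_test_probability, lr):
    """Convert probability to odds, apply the likelihood ratio, convert back."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * lr
    return post_odds / (1 + post_odds)

lr = positive_lr(0.90, 0.90)
print(f"+LR = {lr:.1f}")                                    # 9.0
print(f"post-test probability at 5 % prevalence = "
      f"{post_test_probability(0.05, lr):.2f}")             # ~0.32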

Diachronic contextual reading
When used in conjunction with other biomarkers [p(D+|T1, T2, T3, …Tn)], the accuracy of testing can be increased, but only if the test results are read diachronically. For instance, ‘passing along’ only positive test findings to the next biomarker amounts to removing true negatives (and a few false negatives) from the sample, which increases the ability to detect diseased individuals in the now more prevalent screening pool. After five to ten of these ‘pass-alongs’, depending on the original disease prevalence, the PPV can approach 100 %, signifying great confidence that disease is present and that further testing and treatment are required. Panels of biomarkers – multiple biomarkers used as a single unit for screening – can also have a PPV. In some cases, biomarkers only appear in panels, in which case there is a resultant sensitivity, specificity and PPV for the entire panel.

Biomarkers that are too sensitive might generate too many false positives. This problem can be overcome with one or more additional biomarkers to ‘clean out’ the false positives: highly specific biomarkers will weed out false positives, a perspective balanced against that of sensitive biomarkers. Sensitivity and specificity generally vary inversely for any given biomarker; those high on one attribute tend to be low on the other. Overall, in our previous meta-analysis work, we found specificity to be the primary attribute for quickly and accurately screening a population.

The exceptional biomarker can be high on both test attributes. In most cases, the information from mediocre biomarkers can be improved by combining them into biomarker panels with a combined accuracy stronger than that of any individual biomarker. Once biomarkers are ranked from high to low dLR and positive test results are passed along from highest to lowest dLR, the number of biomarkers required to achieve a PPV near 1.0 is considerably smaller than if biomarkers are ordered from lowest to highest dLR (Fig. 2).
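
A sketch of diachronic reading under invented sensitivities and specificities: each positive result’s post-test probability becomes the pre-test probability for the next biomarker, and the screen stops once the PPV reaches a target. Ordering the pass-alongs from highest to lowest dLR reaches the target with fewer biomarkers, which is the point of Figure 2; the numbers themselves are illustrative only.

def bayes_update(prob, sensitivity, specificity):
    """The current disease probability acts as the prevalence for the next positive marker."""
    return sensitivity * prob / (sensitivity * prob + (1 - specificity) * (1 - prob))

def dlr(sensitivity, specificity):
    return sensitivity / (1 - specificity)

# Hypothetical biomarkers: (name, sensitivity, specificity) -- invented values
panel = [("A", 0.95, 0.60), ("B", 0.85, 0.80), ("C", 0.75, 0.95),
         ("D", 0.70, 0.98), ("E", 0.90, 0.70), ("F", 0.80, 0.90)]

def markers_needed(order, start_prevalence=0.01, target_ppv=0.95):
    prob, used = start_prevalence, 0
    for name, sens, spec in order:
        prob = bayes_update(prob, sens, spec)
        used += 1
        if prob >= target_ppv:
            break
    return used, prob

high_first = sorted(panel, key=lambda m: dlr(m[1], m[2]), reverse=True)
low_first = sorted(panel, key=lambda m: dlr(m[1], m[2]))
for label, order in (("highest-to-lowest dLR", high_first), ("lowest-to-highest dLR", low_first)):
    used, ppv = markers_needed(order)
    print(f"{label}: {used} positive markers needed to reach PPV {ppv:.3f}")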

Meta-analysis
As you may have inferred by now, the methodology for identifying the best biomarkers is meta-analysis. A word of caution applies to diagnostic meta-analyses. There are software packages for the meta-analysis of medical tests; Meta-DiSc is one such tool [8, 9], and material on its development may be found in reference [9]. When last checked, the Meta-DiSc program was being revised to correct some estimate errors and researchers were redirected to a Cochrane Collaboration page [10]. In short, it is important not to add up all cells as if they represent one large study, because this misrepresents study homogeneity and therefore variance.
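
A small sketch of why this matters, using invented 2 × 2 counts from three hypothetical studies: naively summing the cells lets the largest study dominate and hides the between-study spread in sensitivity that a proper meta-analysis must model.

# Invented per-study 2 x 2 counts: (true_pos, false_neg, false_pos, true_neg)
studies = {"study 1": (45, 5, 10, 140),
           "study 2": (30, 20, 5, 95),
           "study 3": (400, 100, 50, 450)}

per_study_sens = {name: tp / (tp + fn) for name, (tp, fn, fp, tn) in studies.items()}
print("per-study sensitivities:", {k: round(v, 2) for k, v in per_study_sens.items()})

# Naive pooling: adding up all cells as if they came from one large study
pooled_tp = sum(counts[0] for counts in studies.values())
pooled_fn = sum(counts[1] for counts in studies.values())
print(f"naively pooled sensitivity: {pooled_tp / (pooled_tp + pooled_fn):.2f}")
print(f"unweighted mean of the study estimates: {sum(per_study_sens.values()) / len(per_study_sens):.2f}")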

We recommend a meta-analysis that uses an index of evidential support [11–13]. In so doing, the weighting of data based on sample size alone may be avoided [7].

Partitioning panels with evidential support estimates

Biomarkers may be high on either sensitivity or specificity. Others may be very high in one attribute, but not the other. Few are high on both. This issue may be overcome by combining the biomarkers of interest into a panel, where the weaknesses of individual members may be averaged out by including other biomarkers with complementary strengths. A biomarker with high sensitivity and low specificity may be combined with biomarkers of complementary strengths, such as those with low sensitivity and high specificity. The aim is to combine biomarkers high in one attribute with those high in its complement. This can be tricky, as an average accuracy might fall along the diagonal of a receiver operating characteristic (ROC) chart, rendering the panel a useless test. Indeed, the idea is to maximize the area under the ROC curve by ‘pulling the curve’ up into the upper left corner; the area under the curve represents diagnostic accuracy. For further details see Figure 2 in the open access article ‘More accurate oral cancer screening with fewer salivary biomarkers’ by Menke et al. [7].

The question is whether the combined accuracy is synergistically greater from using two biomarkers, or becomes just an arithmetic average of the two. This conundrum is solved by making sure there are data points towards the upper left corner that ‘pull up’ the ROC curve and maximize the area under it, which translates roughly into diagnostic accuracy. In fact, sensitivity to cancer or any other disease must be inverted to PPV before the biomarker exhibits utility. Somewhat paradoxically, just using more biomarkers does not increase screening accuracy unless they are read in the diachronic context of the other tests done at the same time (again, refer to Fig. 2 in this article).
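
The panel idea can be sketched on simulated data: two individually modest biomarkers, combined into a simple summed score, give a larger area under the ROC curve than either marker alone. The simulation, the marker distributions and the naive sum-score are all assumptions for illustration, not a published panel algorithm.

import numpy as np

rng = np.random.default_rng(0)
n = 500
disease = np.concatenate([np.ones(n), np.zeros(n)])        # 1 = disease, 0 = healthy

# Two simulated, independent biomarkers, each only modestly shifted by disease
marker_a = rng.normal(0.7 * disease, 1.0)
marker_b = rng.normal(0.7 * disease, 1.0)

def auc(score, label):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation."""
    ranks = score.argsort().argsort() + 1
    n_pos, n_neg = label.sum(), (1 - label).sum()
    return (ranks[label == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

combined = marker_a + marker_b                              # naive panel score
for name, score in (("marker A alone", marker_a), ("marker B alone", marker_b), ("A + B panel", combined)):
    print(f"AUC {name}: {auc(score, disease):.2f}")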

Should cancer tests detect only binary signals?
From a test and measures perspective, each biomarker is a kind of test question, where the answer to each question is the state of disease in the body. Some questions or biomarkers or biomarker panels are more or less informative because they are more or less sensitive and specific to detecting disease. The answers sought are binary – yes or no. The patient either has a disease or does not. It is up to the properties of the tests to reveal the truth.

As mentioned before, biomarker accuracy varies. No medical test of any kind is 100% accurate. Biomarkers associated with cancer can and do appear at lower levels in healthy individuals. We must understand this principle to decide whether other tests or panels are necessary to improve screening or diagnostic information.

When educational psychologists measure traits and abilities, e.g. IQ, they ask a series of questions. To the degree that the questions are answered ‘correctly’, a person scores higher and has more of the trait or ability being measured. Creating a survey or questionnaire is a rigorous process. Think of an underlying variable (IQ) as the latent construct. ‘Construct’ is the intended concept we attempt to measure. The construct is not directly measurable, and is thus called latent. Each question is a kind of probe that, with varying degrees of accuracy, allows indirect observation of the latent construct or disease state. By analogy, biomarkers can be interpreted as test questions indicating the existence of a latent trait or disease.

Pushing the test analogy further, biomarkers might be negatively keyed, i.e. the levels of certain biomarkers are reduced in the presence of disease, or positively keyed with biomarker presence associated with disease. Whereas assessment of traits and abilities measures a continuous scale of latent construct presence, biomarkers answer a simple binary choice: Is the disease present or not?

Biomarker accuracy is estimated by sensitivity and specificity. Test questions are subjected to data reduction techniques (factor analysis), checks of internal consistency within factors, and item response theory to identify redundant questions and to design new questions that cover gaps in detecting an underlying disease state.
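
By analogy, internal consistency and redundancy of a biomarker panel can be screened with the same tools. The sketch below computes Cronbach’s alpha and the pairwise correlation matrix on simulated data (a latent disease severity plus noise); the data and the near-duplicate marker are invented, and in practice formal factor-analytic or IRT software would be used.

import numpy as np

rng = np.random.default_rng(1)
n = 300
latent = rng.normal(size=n)                            # unobserved disease severity

marker1 = latent + rng.normal(scale=0.6, size=n)
marker2 = latent + rng.normal(scale=0.6, size=n)
marker3 = marker2 + rng.normal(scale=0.1, size=n)      # nearly redundant with marker2
marker4 = rng.normal(size=n)                           # unrelated noise
panel = np.column_stack([marker1, marker2, marker3, marker4])

def cronbach_alpha(items):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of the summed score)."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_variances / total_variance)

print(f"Cronbach's alpha for the panel: {cronbach_alpha(panel):.2f}")
print("pairwise correlations (values near 1 flag redundant markers):")
print(np.round(np.corrcoef(panel, rowvar=False), 2))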

As we are not basic scientists, but rather behavioural and population scientists, we cannot address the clinical and laboratory aspects of biomarkers; however, in collaboration with colleagues at dental programmes here in Mesa, Arizona, and in Malaysia, we came to understand that some biomarkers are more informative than others in screening for and diagnosing disease.

Unidimensionality, monotonicity, and local independence properties
Test items should obey the conditions of unidimensionality, monotonicity, and local independence. Briefly applied to medical tests, biomarkers should be indicative of the same latent construct (presence of disease), and individual biomarkers should increase (be positive for disease) along with the actual presence of disease [14].
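
A minimal check of the monotonicity condition on simulated data: a biomarker that indicates disease should, on average, be higher when disease is actually present. Comparing group means (or the point-biserial correlation) flags markers that fail this; the simulated markers below are assumptions for illustration only.

import numpy as np

rng = np.random.default_rng(2)
disease = rng.integers(0, 2, size=400)                 # 0 = disease absent, 1 = present

monotone_marker = 1.2 * disease + rng.normal(size=400) # rises with disease, on average
flat_marker = rng.normal(size=400)                     # unrelated to disease

for name, marker in (("monotone marker", monotone_marker), ("flat marker", flat_marker)):
    difference = marker[disease == 1].mean() - marker[disease == 0].mean()
    r_pb = np.corrcoef(marker, disease)[0, 1]          # point-biserial correlation
    print(f"{name}: mean(D+) - mean(D-) = {difference:+.2f}, r = {r_pb:+.2f}")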

Applying item response theory to academic test scores can reveal gaps in assessment that miss progress along, or the degree of, the latent construct. When graphed on a person–item map, high-ability persons score higher on the test – i.e. endorse more items, especially the most difficult ones. The person–item map might show two areas of concern: redundant items that may be removed to make the test more efficient, and abilities that cannot be determined because items cluster over small ranges of the latent construct. This is exemplified in Figure 1 in Warholak et al. [15].

As for biomarker disease screening, test or panel gaps may miss a subclinical or early stage disease by not matching the stage with biomarkers that would alert us to that stage of disease. In effect, this would be a blind-spot that more research may be required to fill. On the one hand, for a binary screening outcome – yes or no – gaps are not crucial. On the other hand, the discovery of gaps may lead to better science and better early disease detection.

Generalizability theory

Generalizability theory – or G-Theory – is a tool developed by Lee Cronbach and colleagues at Stanford around 1972 [16]. Without getting into excessive detail, it suffices here to mention G-Theory as a methodology for identifying sources of error, bias or interference in the statistical modelling of complex systems. As an example of the reasons for developing G-Theory in the first place, students are taught by professors within classes, in courses, in schools, in states and in countries. Each level of this educational hierarchy may become a source of variability. If what we want to produce is a consistent product in graduating students – for example, minimal competency in medicine – we can identify the interference (variability) introduced by the various levels, or by one specific level. With G-Theory, the primary source of variance may be identified and addressed accordingly.

In the biomarker analogy, some biomarkers introduce more confusion than they resolve and can be eliminated or modified to improve reliability and consistent accuracy.
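
As a much-reduced illustration of the G-Theory idea, the sketch below estimates how much of the variability in a biomarker measurement comes from the laboratory (‘facet’) versus within-laboratory noise, using a one-facet random-effects decomposition on simulated data. A real G-study would model several facets simultaneously; the lab structure and numbers here are assumptions for illustration.

import numpy as np

rng = np.random.default_rng(3)
n_labs, n_reps = 6, 20
lab_effect = rng.normal(scale=0.8, size=n_labs)                    # systematic lab-to-lab differences
data = lab_effect[:, None] + rng.normal(scale=0.5, size=(n_labs, n_reps))

# One-way random-effects ANOVA estimates of the two variance components
grand_mean = data.mean()
ms_between = n_reps * ((data.mean(axis=1) - grand_mean) ** 2).sum() / (n_labs - 1)
ms_within = ((data - data.mean(axis=1, keepdims=True)) ** 2).sum() / (n_labs * (n_reps - 1))

var_within = ms_within
var_between = max((ms_between - ms_within) / n_reps, 0.0)
print(f"estimated between-lab variance: {var_between:.2f} (simulated true value 0.64)")
print(f"estimated within-lab variance:  {var_within:.2f} (simulated true value 0.25)")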

Conclusion
Although biomarker research is being funded and undertaken at unprecedented levels, it is important to remember that credible, scientific handling of the data is still the key to understanding and discovery. Big data still need to answer the question ‘What does it all mean?’ We therefore recommend starting with the highly refined methodology developed for the T&M of human skills, abilities and knowledge. At the very least, T&M science might minimize errors, increase medical test efficiency, and be used to complement or confirm findings in translational research.
References
1. Pearl J, Mackenzie D. The book of why: the new science of cause and effect. Basic Books 2018.
2. Platt JR. Strong inference: certain systematic methods of scientific thinking may produce much more rapid progress than others. Science 1964; 146(3642): 347–352.
3. Chamberlin TC. The method of multiple working hypotheses. Science 1897, reprint 1965; 148: 754–759.
4. Kline RB. Beyond significance testing: reforming data analysis methods in behavioral research. American Psychological Association 2004.
5. Ziliak ST, McCloskey DN. The cult of statistical significance: how the standard error costs us jobs, justice, and lives. The University of Michigan 2011.
6. Kraemer HC. Evaluating medical tests: objectives and quantitative guidelines. Sage Publications 1992.
7. Menke JM, Ahsan MS, Khoo SP. More accurate oral cancer screening with fewer salivary biomarkers. Biomark Cancer 2017; 9: 1179299X17732007 (https://journals.sagepub.com/doi/full/10.1177/1179299X17732007?url_ver=Z39.88-2003&rfr_id=ori%3Arid%3Acrossref.org&rfr_dat=cr_pub%3Dpubmed#articlePermissionsContainer).
8. Zamora J, Abraira V, Muriel A, Khan K, Coomarasamy A. Meta-DiSc: a software for meta-analysis of test accuracy data. BMC Med Res Methodol 2006; 6: 31.
9. Zamora J, Muriel A, Abraira V. Statistical methods: Meta-DiSc ver 1.4. 2006: 1–8 (ftp://ftp.hrc.es/pub/programas/metadisc/MetaDisc_StatisticalMethods.pdf).
10. Cochrane Methods: screening and diagnostic tests 2018  (https://methods.cochrane.org/sdt/welcome).
11. Goodman SN, Royall R. Evidence and scientific research. Am J Public Health 1988; 78(12): 1568–1574.
12. Menke JM. Do manual therapies help low back pain? A comparative effectiveness meta-analysis. Spine (Phila Pa 1976) 2014; 39(7): E463–472.
13. Royall R. Statistical evidence: a likelihood paradigm. Chapman & Hall/CRC 2000.
14. Beck CJ, Menke JM, Figueredo AJ. Validation of a measure of intimate partner abuse (Relationship Behavior Rating Scale-revised) using item response theory analysis. Journal of Divorce and Remarriage 2013; 54(1): 58–77.
15. Warholak TL, Hines LE, Song MC, Gessay A, Menke JM, Sherrill D, Reel S, Murphy JE, Malone DC. Medical, nursing, and pharmacy students’ ability to recognize potential drug-drug interactions: a comparison of healthcare professional students. J Am Acad Nurse Pract 2011; 23(4): 216–221.
16. Shavelson RJ, Webb NM. Generalizability Theory: a primer. Sage Publications 1991.

The authors
J. Michael Menke* DC, PhD, MA; Debosree Roy PhD
A.T. Still Research Institute, A. T. Still University, Mesa, AZ 85206, USA

*Corresponding author
E-mail: jmenke@atsu.edu


Novel biomarkers in malignant pleural mesothelioma

Mesothelioma is a fatal cancer of mesothelial cells caused by previous asbestos exposure. Numerous biomarkers have been tested for their ability to diagnose or monitor pleural mesothelioma, but none are in routine clinical practice. This article aims to briefly outline the literature to date and future research directions.

by Dr David T. Arnold and Prof. Nick A. Maskell

Introduction
Mesothelioma is a cancer of mesothelial cells that carries a very poor prognosis. It is almost exclusively caused by previous inhalation of asbestos fibres, usually through industrial employment (shipbuilding, lagging, railway work, etc.) or from working on pre-existing asbestos products (plumbing, carpentry, etc.). Given that there is a 40-year mean latency from exposure to presentation, the European incidence of mesothelioma is expected to rise until around 2020, in keeping with the use and subsequent banning of asbestos in the 1980s [1]. However, given the ongoing unregulated use of asbestos in China, India and Russia, cases of mesothelioma will continue to occur worldwide.

Mesothelioma can occur in the pleural cavity and the peritoneum (in a 4 : 1 ratio) and, more rarely, in the pericardium and tunica vaginalis. Malignant pleural mesothelioma (MPM) is the most common form, with around 2500 new cases in the UK every year, and will form the basis of the rest of this review [2].

Survival from MPM is dependent on histological subtype, of which there are four: epithelioid, sarcomatoid, biphasic and desmoplastic. Epithelioid accounts for 70 % of the overall cases and has the best prognosis, with a median survival of 13–14 months. Sarcomatoid has the worst prognosis, at 4 months, and is usually felt by clinicians to be not amenable to therapy.

Presentation
The main symptoms of MPM are shortness of breath, cough and chest pain. Given that it is a highly metabolically active tumour, patients can also develop systemic symptoms of fevers/sweats, fatigue and weight loss, indicating more advanced disease. Around 90 % of individuals with MPM present with a pleural effusion (fluid collection around the lung), and any male with a history of asbestos exposure and a unilateral effusion has a 60 % chance of having malignancy [3]. MPM is highly locally invasive, which can cause chest pain, but rarely metastasises unless the pleura is disrupted by diagnostic or therapeutic procedures, which can cause tract metastases.

Diagnosis and imaging

Patients who present with a pleural effusion will invariably have cytological analysis of the fluid first. However, pleural fluid cytology alone is not usually sufficient to make a diagnosis of MPM [3]. If the patient is well enough then a biopsy is performed either radiologically, via medical thoracoscopy or surgically. These procedures are invasive, so there has been significant interest in additional diagnostic methods.

The mainstay of radiological investigations is computerised tomography (CT), with magnetic resonance imaging and positron emission tomography (PET) currently limited to the research setting. However, differentiating MPM from benign pleural thickening or other pleural malignancies is unreliable using CT alone [2]. Given the drawbacks in current cytopathological and radiological investigations for MPM there is a huge potential role for serum or pleural fluid biomarkers. Biomarkers would allow earlier detection of malignancy in at-risk groups (e.g. the asbestos exposed), reduce the need for invasive biopsies and speed the diagnostic pathway to treatment.

Treatment and monitoring

Sadly, due to the highly invasive nature of MPM, treatment is often palliative from diagnosis. The current standard of care is based on the results of a non-placebo-controlled trial from 2003. Vogelzang and colleagues used a combination of pemetrexed (an anti-folate) and cisplatin (platinum-based agent) chemotherapy [4]. This combination adds a modest 2 months to overall survival, with a response rate of only 30 %. Treatment for MPM had not significantly advanced until the publication of the Mesothelioma Avastin Cisplatin Pemetrexed Study (MAPS) trial in 2015, which showed a 2-month survival advantage when bevacizumab [an anti-vascular endothelial growth factor (VEGF) immunotherapy] was added to standard chemotherapy [5].

The role of surgery for MPM is highly controversial, with significant variation in operative rates internationally. This controversy exists because there are no randomised trials of radical surgical intervention against best medical therapy. Large case series of patients with positive surgical outcomes exist, but they are often highly selective of younger patients with good performance status.

Both chemotherapeutic and surgical management would benefit from a greater ability to prognosticate patients at baseline and to assess response to treatment. Currently, serial CT scanning is the gold standard for disease monitoring in MPM. As in other malignancies, an attempt to measure change in the tumour is made using the RECIST criteria. However, unlike other malignancies, MPM grows as a rind around the chest wall, so volume measurement is difficult. Modified RECIST criteria have been developed, but they are time intensive and, with the added complications of pleural fluid and plaques, are rarely used outside the research setting. Other radiological methods to monitor disease have been examined, with a recognition that biomarkers would be ideal as a method of monitoring disease in a non-invasive manner [6].

Mesothelin
Soluble mesothelin (SM) is a 40 kDa glycoprotein over-expressed by the epithelioid component of malignant mesothelial cells. Its exact biological role remains uncertain. Discovered in the serum of patients with ovarian cancer, it was subsequently found in serum, pleural fluid and urine of patients with MPM. See Figure 1 in the open access article by Hassan et al. for a schematic showing the maturation and structure of mesothelin [7].

There has been considerable research attention on its utility in the diagnosis or monitoring of MPM. The majority of these studies have used a commercial platform, the Mesomark® ELISA. Unfortunately, despite some positive initial signals, a meta-analysis by Cui and colleagues in 2014 demonstrated that the overall sensitivity for detecting MPM was 0.61 in serum and 0.79 in pleural fluid [8]. The level of SM rises with increasing epithelioid disease bulk and can therefore be low in early-stage disease and may never rise in the sarcomatoid or desmoplastic subtypes. For a diagnosis with such profound consequences for the individual, as well as medico-legal implications, this inability to reliably detect MPM has limited its widespread diagnostic use.

More recent research has focused on its ability to monitor MPM during treatment or follow-up (Table 1).

Although there is considerable heterogeneity between these 10 studies in terms of primary outcome measures, each has demonstrated that a rising SM is correlated with clinical or radiological disease progression. A falling SM following chemotherapy or surgery was strongly indicative of treatment response. The future of SM in disease monitoring depends on the results of currently recruiting prospective trials.
Fibulin-3
Fibulin-3 is a glycoprotein that promotes tumour growth and invasion through the phosphorylation of the epidermal growth factor receptor. In 2012, the New England Journal of Medicine published the results of a landmark study which reported that serum fibulin-3 had a 100 % sensitivity for detecting early-stage MPM [9]. Unfortunately, several follow-up studies using the same commercial ELISA have been unable to replicate these results. Ren and colleagues published a meta-analysis of eight studies which found the sensitivity to be around 87 % with a specificity of 89 % [10].

Osteopontin

Osteopontin is over-expressed in several malignancies and several studies have focused on its prognostic abilities in MPM. Early studies used serum osteopontin, without appreciating the impact of its thrombin cleavage site on results. More recent studies have used more accurate plasma measurement and shown that osteopontin has no role in diagnosis or monitoring [11]. Interestingly, there is evidence that baseline osteopontin is a marker of poor prognosis, independent of histology, treatment modality or other biomarkers.

Vascular Endothelial Growth Factor (VEGF)
VEGF has been studied as a potential diagnostic or therapeutic target in MPM. Although baseline VEGF correlates with disease stage and survival it is not used in the clinical setting. However, following the publication of the MAPS trial (of the anti-VEGF immunotherapy bevacizumab) there has been a renewed focus on whether baseline VEGF can select responders from non-responders. These studies have been directed at pan-VEGF, as opposed to any specific isoform, and have not shown any definitive role to date.

Proteomic studies
A modern approach to biomarker discovery and validation is exemplified by the DIAPHRAGM study [12]. Tsim and colleagues took the results of a promising 13-protein diagnostic panel developed and internally validated by Ostroff [12]. The internal validation cohort had reported an area under the curve (AUC) for detecting MPM of 0.95 (38 patients with MPM). A flaw of previous biomarker validation work is that follow-up studies have been performed on small retrospective cohorts, and it often takes years or decades before the true utility of a biomarker is established. The DIAPHRAGM study aims to quickly and definitively validate or reject the SOMAscan assay, alongside fibulin-3, in a prospective, powered and clinically relevant manner. The final results of the research are awaited but, regardless, this approach has set a standard for biomarker discovery and validation in MPM.

Summary
MPM is a highly aggressive cancer that is difficult to diagnose and monitor. The potential scope for biomarkers is huge. Serum SM has shown the most promise in monitoring disease. A biomarker that can reliably diagnose early-stage MPM remains elusive.

References
1. Robinson BW, Lake RA. Advances in malignant mesothelioma. N Engl J Med 2005; 353: 1591–1603.
2. Woolhouse I, Bishop L, Darlison L, de Fonseka D, Edey A, Edwards J, Faivre-Finn C, Fennell DA, Holmes S, et al. BTS guideline for the investigation and management of malignant pleural mesothelioma. BMJ Open Respir Res 2018; 5(1): e000266.
3. Arnold DT, De Fonseka D, Perry S, Morley A, Harvey JE, Medford A, Brett M, Maskell NA. Investigating unilateral pleural effusions: the role of cytology. Eur Respir J 2018; 52(5): pii: 1801254.
4. Vogelzang NJ, Rusthoven JJ, Symanowski J, Denham C, Kaukel E, Ruffie P, Gatzemeier U, Boyer M, Emri S, Manegold C, et al. Phase III study of pemetrexed in combination with cisplatin versus cisplatin alone in patients with malignant pleural mesothelioma. J Clin Oncol 2003; 21(14): 2636–2644.
5. Zalcman G, Mazieres J, Margery J, Greillier L, Audigier-Valette C, Moro-Sibilot D, Molinier O, Corre R, Monnet I, et al. Bevacizumab for newly diagnosed pleural mesothelioma in the Mesothelioma Avastin Cisplatin Pemetrexed Study (MAPS): a randomised, controlled, open-label, phase 3 trial. Lancet 2016; 387(10026): 1405–1414.
6. Hooper CE, Lyburn ID, Searle J, Darby M, Hall T, Hall D, Morley A, White P, Rahman NM, et al. The South West Area Mesothelioma and Pemetrexed trial: a multicentre prospective observational study evaluating novel markers of chemotherapy response and prognostication. Br J Cancer 2015; 112(7): 1175–1182.
7. Hassan R, Bera T, Pastan I. Mesothelin: a new target for immunotherapy. Clin Cancer Res 2004; 10(12 Pt 1): 3937–3942 (http://clincancerres.aacrjournals.org/content/10/12/3937.long).
8. Cui A, Jin XG, Zhai K, Tong ZH, Shi HZ. Diagnostic values of soluble mesothelin-related peptides for malignant pleural mesothelioma: updated meta-analysis. BMJ Open 2014; 4(2): e004145.
9. Pass HI, Levin SM, Harbut MR, Melamed J, Chiriboga L, Donington J, Huflejt M, Carbone M, Chia D, et al. Fibulin-3 as a blood and effusion biomarker for pleural mesothelioma. N Engl J Med 2012; 367(15): 1417–1427.
10. Ren R, Yin P, Zhang Y, Zhou J, Zhou Y, Xu R, Lin H, Huang C. Diagnostic value of fibulin-3 for malignant pleural mesothelioma: A systematic review and meta-analysis. Oncotarget 2016; 7(51): 84851–84859.
11. Lin H, Shen YC, Long HY, Wang H, Luo ZY, Wei ZX, Hu SQ, Wen FQ. Performance of osteopontin in the diagnosis of malignant pleural mesothelioma: a meta-analysis. Int J Clin Exp Med 2014; 7(5): 1289–1296.
12. Tsim S, Kelly C, Alexander L, McCormick C, Thomson F, Woodward R, Foster JE, Stobo DB, Paul J et al. Diagnostic and Prognostic Biomarkers in the Rational Assessment of Mesothelioma (DIAPHRAGM) study: protocol of a prospective, multicentre, observational study. BMJ Open 2016; 6(11): e013324.
13. de Fonseka D, Arnold DT, Stadon L, Morley A, Keenan E, Darby M, Armstrong L, Virgo P, Maskell NA. A prospective study to investigate the role of serial serum mesothelin in monitoring mesothelioma. BMC Cancer 2018; 18(1): 199.
14. Bonotti A, Simonini S, Pantani E, Giusti L, Donadio E, Mazzoni MR, Chella A, Marconi L, Ambrosino N, et al. Serum mesothelin, osteopontin and vimentin: useful markers for clinical monitoring of malignant pleural mesothelioma. Int J Biol Markers 2017; 32(1):e126–e131.
15. Hassan R, Sharon E, Thomas A, Zhang J, Ling A, Miettinen M, Kreitman RJ, Steinberg SM, Hollevoet K, Pastan I. Phase 1 study of the antimesothelin immunotoxin SS1P in combination with pemetrexed and cisplatin for front-line therapy of pleural mesothelioma and correlation of tumor response with serum mesothelin, megakaryocyte potentiating factor, and cancer antigen 125. Cancer 2014; 120: 3311–3319.
16. Nowak AK, Brown C, Millward MJ, Creaney J, Byrne MJ, Hughes B, Kremmidiotis G, Bibby DC, Leske AF, et al. A phase II clinical trial of the vascular disrupting agent BNC105P as second line chemotherapy for advanced malignant pleural mesothelioma. Lung Cancer 2013; 81: 422–427.
17. Franko A, Dolzan V, Kovac V, Arneric N, Dodic-Fikfak M. Soluble mesothelin-related peptides levels in patients with malignant mesothelioma. Dis Markers 2012; 32: 123–131.
18. Hollevoet K, Nackaerts K, Gosselin R, De Wever W, Bosquée L, De Vuyst P, Germonpré P, Kellen E, Legrand C, et al. Soluble mesothelin, megakaryocyte potentiating factor, and osteopontin as markers of patient response and outcome in mesothelioma. J Thorac Oncol 2011; 6: 1930–1937.
19. Creaney J, Francis RJ, Dick IM, Musk AW, Robinson BW, Byrne MJ, Nowak AK. Serum soluble mesothelin concentrations in malignant pleural mesothelioma: relationship to tumor volume, clinical stage and changes in tumor burden. Clin Cancer Res 2011; 17: 1181–1189.
20. Wheatley-Price P, Yang B, Patsios D, Patel D, Ma C, Xu W, Leighl N, Feld R, Cho BC, et al. Soluble mesothelin-related Peptide and osteopontin as markers of response in malignant mesothelioma. J Clin Oncol 2010; 28: 3316–3322.
21. Grigoriu BD, Chahine B, Vachani A, Gey T, Conti M, Sterman DH, Marchandise G, Porte H, Albelda SM, Scherpereel A. Kinetics of soluble mesothelin in patients with malignant pleural mesothelioma during treatment. Am J Respir Crit Care Med 2009; 179: 950–954.

The authors
David T. Arnold* MBBCh, BSc, MRCP and Nick A. Maskell BMedSci, BM, BS, FRCP, DM, FCCP
Academic Respiratory Unit, Learning and Research Building, Southmead Hospital, Bristol, UK

*Corresponding author
E-mail: arnold.dta@gmail.com


Therapeutic drug monitoring of antiepileptic drugs

Antiepileptic drugs (AEDs) are widely used and their number is steadily increasing. Therapeutic drug monitoring of AEDs, when performed correctly, can be a valuable tool for the treating physician. This article describes the indications, limitations and pitfalls that must be observed when measuring and interpreting AED serum concentrations.

by Dr Arne Reimers and Prof. Eylert Brodtkorb

Why measure antiepileptic drug serum concentrations?
Antiepileptic drugs (AEDs) are widely used, not only for epilepsy, but also for a range of non-epilepsy conditions, such as bipolar (manic-depressive) disorder, migraine and neuropathic pain [1]. Thus, the total number of AED users substantially exceeds the number of people with epilepsy. Therapeutic drug monitoring (TDM) has for many years been used to support AED treatment, as many of these drugs have unfavourable pharmacokinetic properties, a potential for problematic drug interactions, and narrow therapeutic windows. TDM is a means of assisting clinical decision-making and should always be done with a specific question in mind. The general indications for TDM of AEDs are listed in Table 1.

Non-linear and linear pharmacokinetics
TDM of AEDs has a long clinical tradition. When the concept of TDM was introduced in the early 1970s, phenytoin was one of the first drugs to which it was applied [2]. This was mainly because phenytoin, then one of the most frequently used AEDs, has so-called non-linear pharmacokinetics. Linear kinetics means that the serum concentration is linearly correlated with dose – a doubling of the dose will double the serum concentration. This applies to almost all medicinal drugs. However, some drugs exhibit non-linear or saturation kinetics; phenytoin is one of them. Doubling the phenytoin dose may result in an unpredictable increase of the serum concentration. Thus, monitoring the phenytoin serum concentration was desirable and soon became available in large parts of the world.
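
The contrast between linear and saturation kinetics can be sketched numerically: with linear (first-order) elimination the steady-state concentration scales with the dosing rate (Css = rate / clearance), whereas with Michaelis–Menten elimination Css = Km × rate / (Vmax − rate), which climbs steeply as the dosing rate approaches Vmax. The parameter values below are purely illustrative and are not dosing guidance for phenytoin or any other drug.

def css_linear(dose_rate, clearance):
    """Steady-state concentration with linear (first-order) elimination."""
    return dose_rate / clearance

def css_michaelis_menten(dose_rate, vmax, km):
    """Steady-state concentration with saturable (Michaelis-Menten) elimination.
    At steady state, dose_rate = vmax * Css / (km + Css), so Css = km * rate / (vmax - rate)."""
    if dose_rate >= vmax:
        return float("inf")            # elimination can no longer keep up with intake
    return km * dose_rate / (vmax - dose_rate)

# Illustrative parameters only: clearance 50 L/day, Vmax 500 mg/day, Km 5 mg/L
for dose in (200, 300, 400, 450):
    print(f"dose {dose} mg/day: linear Css = {css_linear(dose, 50):4.1f} mg/L, "
          f"saturable Css = {css_michaelis_menten(dose, 500, 5):5.1f} mg/L")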

Most other AEDs, however, exhibit linear kinetics. Why, then, is it important to measure their serum concentrations? One reason is the nature of epilepsy itself and the issue of prophylactic treatment. The only clinical marker of successful management is the extent of seizure control. However, epileptic seizures may occur in random patterns. The intervals between seizures may be minutes or months, and if a seizure occurs, it may have dramatic consequences, not only for the patient but also for others. Thus, it can be very demanding to evaluate the therapeutic effect of AED treatment by clinical observation alone.

Absorption, distribution, metabolism and excretion
In addition, the pharmacokinetics of AEDs may be affected by changes in absorption, distribution, metabolism and excretion (ADME). Co-morbidity, pregnancy, drug interactions, pharmacogenetic polymorphisms, etc, all may considerably affect the ADME of AEDs (Fig. 1). Pregnancy may induce pronounced pharmacokinetic alterations, including increased volume of distribution, elevated renal clearance, and induction of hepatic metabolism. Breakthrough seizures in previously seizure-free patients may occur [3–5].

The serum concentration of carbamazepine may rise threefold and produce toxic symptoms when the patient is prescribed certain antibiotics that inhibit its metabolism, such as erythromycin. On the other hand, carbamazepine and other inducers of hepatic metabolism may reduce the serum concentrations of several other drugs, among them valproate, lamotrigine and hormonal contraceptives. Valproate is also a potent inhibitor of drug-metabolizing liver enzymes and may double lamotrigine concentrations. The clinically important induction of the metabolism of lamotrigine by combined oral contraceptives was detected by routine use of TDM [6]. Gabapentin is excreted almost exclusively by the kidneys; hence, reduced kidney function will lead to increased serum concentrations.

Adherence

Poor adherence to prescribed treatment is one of the most important obstacles to the management of epilepsy [7, 8]. It has been documented that roughly half of all patients take their medicine more or less irregularly [9]. A recent study in patients admitted to hospital with acute epileptic seizures found that almost 40 % had less than 75 % of their usual trough AED serum concentration, indicating one or more missed doses [8] (Fig. 2). In such situations, it is crucial that the treating clinician receives the laboratory result as soon as possible in order to decide how to proceed with the management of the patient. Should the daily AED dose be increased or not? If the seizure occurred because of a missed intake, a dose increase would not be appropriate and could even be harmful to the patient. If the serum concentration was adequate (according to the prescribed dose), the occurrence of a seizure would suggest that the daily dose was too low and should be increased. This decision must be made quickly, as the patient will usually be discharged from hospital the next morning. It is essential to identify pseudo-refractory epilepsy: clinically unrecognized non-adherence is often mistaken for drug-resistant epilepsy [10].

How it is normally done
The common convention is that blood samples for measuring the concentration of AEDs are taken drug-fasting in the morning (i.e. from 12 h to a maximum of 24 h after the last dose intake, and before the morning dose). Also, the patient must be in pharmacological steady state. This means that the amount of drug administered per unit time is in equilibrium with the amount of drug eliminated from the body during the same time. For all drugs, this state is reached after approximately five times the drug’s plasma half-life. These rules apply after every dose change (Fig. 2E). The difficulties in complying with these rules are an important obstacle to TDM and one major reason why its routine use is discredited in many parts of the world. If a blood sample is taken before steady state is reached, or when the patient is not drug-fasting, interpretation of the measured blood concentration is tricky and requires profound clinical-pharmacological experience.
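
The ‘five half-lives’ rule follows from first-order accumulation: after n half-lives, the concentration has reached 1 − 0.5^n of its eventual steady-state value (about 97 % at n = 5). The sketch below uses an arbitrary illustrative half-life, not a value for any specific AED, to show why sampling too soon after a dose change underestimates the true steady-state trough.

def fraction_of_steady_state(hours_since_dose_change, half_life_hours):
    """Fraction of the eventual steady-state level reached after a dose change (first-order kinetics)."""
    return 1 - 0.5 ** (hours_since_dose_change / half_life_hours)

half_life = 24.0   # illustrative half-life in hours; not a value for any specific AED
for days in (1, 2, 3, 5, 7):
    fraction = fraction_of_steady_state(days * 24, half_life)
    print(f"{days} day(s) after a dose change: {fraction:.1%} of steady state reached")
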
Most commonly, the analyses are performed in a central laboratory using serum or plasma, with either immunologic or chromatographic methods. Usually, the total AED concentration (protein-bound plus unbound drug) is measured. In certain situations, e.g. in elderly patients with hypoalbuminemia or in pregnant women, it is desirable to measure the unbound (free) fraction of an AED. This applies mainly to valproate and phenytoin, which are >90 % protein bound. Hypoalbuminemia may cause signs of overdose despite only a modest total AED concentration. However, unbound concentrations are rarely requested and are not offered by all labs.

Reference ranges for antiepileptic drugs
It must be noted that reference ranges (RRs) for AEDs apply to the treatment of epilepsy. RRs for bipolar disorder have been suggested [11] but are not broadly established, whereas in the treatment of chronic pain states, treatment is usually guided by the clinical response alone. Unfortunately, with few exceptions, most RRs are not well documented. The exceptions are those AEDs that have been around for decades, e.g. phenytoin, carbamazepine and valproate. For them, broadly accepted RRs are supported by long clinical experience.
For the newer AEDs (introduced after 1990), there is a considerable lack of data. One of the reasons for the poor documentation is that drug manufacturers rarely publish serum concentrations obtained in clinical phase III or IV studies. Another reason is a lack of studies specifically aimed at examining the correlation between serum concentrations and effect. Thus, RRs for AEDs are often based on extrapolation of pharmacokinetic data obtained in preclinical studies, or on data from large routine databases, i.e. by applying some sort of population kinetics. Such data often lack clinical correlates owing to incomplete information provided on the request forms.

One consequence of the above is that the RRs used by different labs, and reported in the literature, are often inconsistent. Another weakness of these population-based RRs is that many patients achieve a satisfactory therapeutic effect with serum concentrations below the RR, while others need concentrations above the RR without suffering symptoms of overdose. This is also the reason why the term ‘therapeutic range’ should not be used; it wrongly implies that any concentration outside that range is ‘non-therapeutic’.

The concept of individual RRs, where each patient serves as his or her own reference [12], is an alternative approach. An obvious prerequisite for this concept is the availability of several consecutive serum concentration measurements (within reasonable time intervals) in the individual patient, as well as close clinical follow-up, so that the various serum concentrations can be correlated with their corresponding clinical effects. It would also be desirable to have observed both insufficient and toxic concentrations. Most of these individual ranges would fall within the population-derived RRs. However, as mentioned above, some patients respond well to concentrations outside the common RR. For the sake of clarity, it has been suggested that such individual RRs be called individual therapeutic ranges [13]. Despite its advantages, neither the concept itself nor the term individual therapeutic range can be regarded as generally established.

Concluding remarks
TDM of AEDs is controversial, as it has been repeatedly emphasized that ‘treating patients is more important than treating blood levels’ [14]. Clinical evaluation and follow-up will continue to be the leading element in the management of epilepsy.
Nevertheless, when correctly applied, appropriately sampled and analysed, and correctly interpreted, TDM stands out as an important and relatively inexpensive tool for optimizing the drug treatment of epilepsy. Obviously, remaining blind to the actual serum concentrations may have severe untoward consequences in specific patient populations, such as pregnant women and patients with poor medication-taking behaviour.

References
1. Johannessen Landmark C. Antiepileptic drugs in non-epilepsy disorders: relations between mechanisms of action and clinical efficacy. CNS Drugs 2008; 22(1): 27–47.
2. Richens A. Drug estimation in the treatment of epilepsy. Proc R Soc Med 1974; 67(12 Pt 1): 1227–1229.
3. Cappellari AM, Cattaneo D, Clementi E, Kustermann A. Increased levetiracetam clearance and breakthrough seizure in a pregnant patient successfully handled by intensive therapeutic drug monitoring. Ther Drug Monit 2015; 37(3): 285–287.
4. Reimers A, Helde G, Becser Andersen N, Aurlien D, Surlien Navjord E, Haggag K, Christensen J, Lillestølen KM, Nakken KO, Brodtkorb E. Zonisamide serum concentrations during pregnancy. Epilepsy Res 2018; 144: 25–29.
5. Voinescu PE, Park S, Chen LQ, Stowe ZN, Newport DJ, Ritchie JC, Pennell PB. Antiepileptic drug clearances during pregnancy and clinical implications for women with epilepsy. Neurology 2018; 91(13): e1228–1236.
6. Sabers A, Buchholt JM, Uldall P, Hansen EL. Lamotrigine plasma levels reduced by oral contraceptives. Epilepsy Res 2001; 47(1–2): 151–154.
7. Faught E. Adherence to antiepilepsy drug therapy. Epilepsy Behav 2012; 25(3): 297–302.
8. Samsonsen C, Reimers A, Bråthen G, Helde G, Brodtkorb E. Nonadherence to treatment causing acute hospitalizations in people with epilepsy: an observational, prospective study. Epilepsia 2014; 55(11): e125–128.
9. Adherence to long-term therapies: evidence for action World Health Organization 2003; http://www.who.int/chp/knowledge/publications/adherence_report/en/.
10. Brodtkorb E, Samsonsen C, Sund JK, Bråthen G, Helde G, Reimers A. Treatment non-adherence in pseudo-refractory epilepsy. Epilepsy Res 2016; 122: 1–6.
11. Hiemke C, Bergemann N, Clement HW, Conca A, Deckert J, Domschke K, Eckermann G, Egberts K, Gerlach M, et al. Consensus guidelines for therapeutic drug monitoring in neuropsychopharmacology: update 2017. Pharmacopsychiatry 2018; 51(1–02): 9–62.
12. Landmark CJ, Johannessen SI, Tomson T. Dosing strategies for antiepileptic drugs: from a standard dose for all to individualised treatment by implementation of therapeutic drug monitoring. Epileptic Disord 2016; 18(4): 367–83.
13. Patsalos PN, Berry DJ, Bourgeois BF, Cloyd JC, Glauser TA, Johannessen SI, Leppik IE, Tomson T, Perucca E. Antiepileptic drugs – best practice guidelines for therapeutic drug monitoring: a position paper by the subcommission on therapeutic drug monitoring, ILAE Commission on Therapeutic Strategies. Epilepsia 2008; 49(7): 1239–1276.
14. Chadwick DW. Overuse of monitoring of blood concentrations of antiepileptic drugs. Br Med J (Clin Res Ed) 1987; 294(6574): 723–724.

The authors
Arne Reimers*1,2 MD PhD and Eylert Brodtkorb3,4 MD PhD
1Dept. of Clinical Chemistry and Pharmacology, Division of Laboratory Medicine, Skåne University Hospital, Lund, Sweden
2Department of Clinical Chemistry and Pharmacology, Lund University, Lund, Sweden
3Dept. of Neuromedicine and Movement Science, Norwegian University of Science and Technology (NTNU), Trondheim, Norway
4Dept. of Neurology and Clinical Neurophysiology, St. Olavs University Hospital, Trondheim, Norway

*Corresponding author
E-mail: arne.reimers@med.lu.se


Alternative sampling strategies for antiepileptic drug monitoring

The continued use of first-generation antiepileptic drugs (AEDs), together with their usually pronounced intra- and inter-individual variability, has made AEDs among the most common medications for which therapeutic drug monitoring (TDM) is performed. As the most cost-effective, rational and clinically useful methodologies are being pursued for TDM interventions, suitable sampling alternatives (e.g. dried blood samples and saliva) to conventional venous sampling have been proposed.

by Sofie Velghe and Prof. Christophe P. Stove

Background
Administration of appropriate antiepileptic drugs (AEDs) is the mainstay in the attempt to provide epilepsy patients with a seizure-free, normal life. AEDs constitute a structurally and pharmacologically diverse group of drugs for which different classification criteria are used, e.g. classification based on the time of their introduction by the pharmaceutical industry (i.e. first-, second- and third-generation AEDs) [1]. In this way, carbamazepine (CBZ), phenytoin (PHT), phenobarbital (PB) and valproic acid (VPA) belong to the first generation of AEDs, because of their introduction prior to 1990 [1]. Examples of the second generation of AEDs are, among others, oxcarbazepine, vigabatrin and topiramate, whereas lacosamide, retigabine and eslicarbazepine are categorized as third-generation AEDs [1]. Another, clinically relevant classification is based on their spectrum of activity. Here, a distinction can be made between AEDs with a broad spectrum (i.e. effective against multiple types of seizures) and those with a narrow spectrum (i.e. effective against specific types of seizures, for example focal epilepsy) [2]. Table 1 provides an overview of the AEDs licensed in Belgium, together with their plasma reference ranges, classified according to their activity spectrum. The treatment strategy for epilepsy is typically twofold: initially, treatment of acute tonic-clonic seizures, generally with benzodiazepines, is necessary, followed by initiation of chronic, preventive treatment with AEDs. Preferably, the latter consists of monotherapy with one AED, for which the dose is slowly titrated upwards when necessary. However, for some forms of epilepsy, or in cases where monotherapy at the maximum dosage is insufficient, combination therapy with multiple AEDs is needed.

The generally narrow therapeutic indices of first-generation AEDs, which make toxicity a common issue, together with their frequent use (not only for epilepsy but also for pain and bipolar disorder), have made them one of the most common medication groups for which therapeutic drug monitoring (TDM) is performed [3].

Owing to the large inter-individual variability in types of epilepsy and in the severity of epileptic seizures, the same dosage of an AED controls symptoms in some patients, whereas in others epileptic seizures remain poorly controlled. Furthermore, some patients experience complete seizure control with an AED blood concentration below or above a set reference range, making TDM of AEDs quite challenging. Therefore, dosage adjustment should preferably be performed by combining the results of TDM with the clinical outcome. In other words, at the start of an AED treatment, a clinician should aim to obtain an AED blood concentration within a set reference range, followed by titration upwards or downwards, depending on the clinical symptoms. In this context, the concept of the ‘individual therapeutic concentration/range’ arose, being the AED concentration, or range of concentrations, at which an individual patient experiences an optimum response [4]. Defining this ‘individual therapeutic concentration/range’ once the desired clinical outcome has been achieved can therefore also be seen as an indication for TDM of AEDs. This concentration or range can be determined for every AED, including those for which a reference range is currently still lacking. To do so, the steady-state AED concentration(s) should preferably be measured twice (2–4 months apart) once a patient has reached his/her optimum AED regimen [3].
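
The following minimal sketch, with hypothetical values, simply records the span between two such steady-state measurements as an individual therapeutic range; it is an illustration on our part, not a procedure taken from the cited guidelines.

```python
# Minimal sketch (illustrative assumption, not the authors' method):
# record an individual therapeutic range from two steady-state
# measurements taken 2-4 months apart at the patient's optimum regimen.

def individual_therapeutic_range(conc_1, conc_2):
    """Return the (lower, upper) bounds spanned by two steady-state concentrations."""
    return min(conc_1, conc_2), max(conc_1, conc_2)

# Hypothetical lamotrigine steady-state results (mg/L)
low, high = individual_therapeutic_range(4.8, 6.1)
print(f"Individual therapeutic range: {low}-{high} mg/L")
```
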
Alternative sampling strategies for TDM of AEDs
Limitations of the traditional way of performing TDM of AEDs (i.e. in plasma or serum samples) are the invasiveness of the sampling technique and the typically large blood volumes that are collected. In addition, sampling requires a phlebotomist, which necessitates a visit to a hospital or doctor. Therefore, a growing interest in the use of non-invasive or minimally invasive alternative sampling strategies for TDM of AEDs has arisen. In this regard, dried blood spots (DBSs) are, besides oral fluid, undoubtedly the most widely used alternative matrix. On the one hand, benefits of DBS use are: (i) the possibility of home sampling, since the samples are generally obtained via a finger prick; (ii) their non-infectious character, making it possible to send the samples via regular mail to a laboratory; (iii) the small sample volume required, which makes the approach very attractive for certain patients, such as those with anemia and young children; (iv) the suitability for automation of sample processing and analysis; and (v) the increased stability of many analytes, which can be of utmost importance for AEDs, given the controversy concerning the stability of some first-generation AEDs in serum collected via gel separator tubes [3, 5, 6]. On the other hand, DBS use also brings some challenges: (i) the small sample volume requires sensitive analytical instrumentation; (ii) there is a risk of contamination; (iii) the hematocrit (Hct) effect; (iv) possible analyte concentration differences between capillary and venous blood; (v) the need for adequate sampling, imposing proper training of patients in the sampling technique; and (vi) the influence of the spotted blood volume and the punch location, especially when partial DBS punches are analysed [5, 6].

Among these challenges, the Hct effect is undoubtedly the most discussed issue in DBS analysis. Variations in Hct influence the spreading of blood on the filter paper: blood with a higher Hct spreads less than blood with a lower Hct, affecting spot size and spot homogeneity. Furthermore, the Hct may also influence matrix effect and recovery. With this impact in mind, many strategies to cope with this issue have been developed over the past few years (reviewed in De Kesel et al. [7] and Velghe et al. [8]). Among these are volumetrically generated dried blood samples, which are analysed in their entirety. These can be DBSs on conventional filter paper [9] or, alternatively, samples generated via volumetric absorptive microsampling (VAMS) (Fig. 1), a technique by which a fixed volume of blood is wicked up via an absorbent tip [10]. We recently demonstrated the potential of VAMS for AED monitoring [11]. However, it should be noted that, if no large differences are anticipated in the Hct of the target population, the impact of the Hct can be assumed to remain limited, and partial-punch analysis will likely not pose an issue for DBS-based AED analysis [12–14].
As TDM is most often performed on plasma or serum samples, reference ranges for AEDs are typically established for these matrices. Hence, if one wants to derive a plasma concentration from a (dried) blood concentration, a ‘conversion’ is needed. This can be done by establishing average blood : plasma ratios or, alternatively, by plotting (dried) blood concentrations against plasma concentrations for a reference set of samples and using the resulting calibration equation to derive ‘calculated plasma concentrations’ for a test set of samples. Obviously, this is also accompanied by an additional level of uncertainty [11–14].
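
To illustrate how such a calibration-based conversion could work in practice, the sketch below (in Python, with made-up numbers; it is not the authors' validated procedure) fits a line through paired dried blood and plasma concentrations, uses it to compute a ‘calculated plasma concentration’, and also shows the simpler average-ratio approach for comparison.

```python
# Minimal sketch (not the authors' validated procedure): deriving a
# 'calculated plasma concentration' from a dried blood concentration
# using a calibration line fitted on a reference set of paired samples.
# All numbers below are illustrative, not real AED data.
import numpy as np

# Hypothetical reference set: paired dried blood and plasma concentrations (mg/L)
dried_blood = np.array([2.1, 4.0, 6.2, 8.1, 10.3, 12.0])
plasma      = np.array([3.0, 5.8, 9.1, 11.9, 15.2, 17.6])

# Fit plasma = slope * dried_blood + intercept (ordinary least squares)
slope, intercept = np.polyfit(dried_blood, plasma, 1)

def calculated_plasma(dried_blood_conc):
    """Convert a dried blood concentration to an estimated plasma concentration."""
    return slope * dried_blood_conc + intercept

# Example: a new patient dried blood result of 7.5 mg/L
print(f"Calculated plasma concentration: {calculated_plasma(7.5):.1f} mg/L")

# Alternative mentioned in the text: apply an average blood:plasma ratio
mean_ratio = np.mean(dried_blood / plasma)
print(f"Ratio-based estimate: {7.5 / mean_ratio:.1f} mg/L")
```
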

Alternatively, dried serum/plasma spots might be generated directly, using devices that contain filters that essentially allow passage of the liquid portion of blood but will stop the cellular portion [15–17]. Although several devices have been developed, it remains to be fully established (for AEDs, as well as for other analytes) whether the concentrations that can be derived from the resulting dried plasma/serum spots effectively mirror those in liquid plasma/serum.

Lastly, it should be noted that dried blood samples may also be used – without a need for conversion – for the follow-up of a patient’s ‘individual therapeutic concentration/range’ once this has been established. On the one hand, this overcomes the need for specialized dedicated devices, which typically come at an increased cost; on the other hand, it avoids introducing an additional level of conversion-associated uncertainty.

Conclusion
TDM of AEDs via DBS, VAMS or dried plasma/serum spots is an interesting application with the potential for better follow-up of patients. Large-scale studies are warranted to substantiate the benefit for the patient and the potential associated cost savings.

References
1. Milosheska D, Grabnar I, Vovk T. Dried blood spots for monitoring and individualization of antiepileptic drug treatment. Eur J Pharm Sci 2015; 75: 25–39.
2. Commented drug code. BCFI 2018 (www.bcfi.be) [In Dutch/French].
3. Patsalos PN, Spencer EP, Berry DJ. Therapeutic drug monitoring of antiepileptic drugs in epilepsy: a 2018 update. Ther Drug Monit 2018; 40: 526–548.
4. Patsalos PN, Berry DJ, Bourgeois BF, Cloyd JC, Glauser TA, Johannessen SI, Leppik IE, Tomson T, Perucca E. Antiepileptic drugs – best practice guidelines for therapeutic drug monitoring: a position paper by the subcommission on therapeutic drug monitoring, ILAE Commission on Therapeutic Strategies. Epilepsia 2008; 49: 1239–1276.
5. Wilhelm AJ, den Burger JC, Swart EL. Therapeutic drug monitoring by dried blood spot: progress to date and future directions. Clin Pharmacokinet 2014; 53: 961–973.
6. Velghe S, Capiau S, Stove CP. Opening the toolbox of alternative sampling strategies in clinical routine: A key-role for (LC-)MS/MS. Trac-Trend Anal Chem 2016; 84: 61–73.
7. De Kesel PM, Sadones N, Capiau S, Lambert WE, Stove CP. Hemato-critical issues in quantitative analysis of dried blood spots: challenges and solutions. Bioanalysis 2013; 5: 2023–2041.
8. Velghe S, Delahaye L, Stove CP. Is the hematocrit still an issue in quantitative dried blood spot analysis? J Pharm Biomed Anal 2018; 163: 188–196.
9. Velghe S, Stove CP. Evaluation of the Capitainer-B Microfluidic device as a new hematocrit-independent alternative for dried blood spot collection. Anal Chem 2018; 90: 12893–12899.
10. Denniff P, Spooner N. Volumetric absorptive microsampling: a dried sample collection technique for quantitative bioanalysis. Anal Chem 2014; 86: 8489–8495.
11. Velghe S, Stove CP. Volumetric absorptive microsampling as an alternative tool for therapeutic drug monitoring of first-generation anti-epileptic drugs. Anal Bioanal Chem 2018; 410: 2331–2341.
12. Linder C, Andersson M, Wide K, Beck O, Pohanka A. A LC-MS/MS method for therapeutic drug monitoring of carbamazepine, lamotrigine and valproic acid in DBS. Bioanalysis 2015; 7: 2031–2039.
13. Linder C, Wide K, Walander M, Beck O, Gustafsson LL, Pohanka A. Comparison between dried blood spot and plasma sampling for therapeutic drug monitoring of antiepileptic drugs in children with epilepsy: A step towards home sampling. Clin Biochem 2017; 50: 418–424.
14. Linder C, Hansson A, Sadek S, Gustafsson LL, Pohanka A. Carbamazepine, lamotrigine, levetiracetam and valproic acid in dried blood spots with liquid chromatography tandem mass spectrometry; method development and validation. J Chromatogr B 2018; 1072: 116–122.
15. Ryona I, Henion J. A Book-type dried plasma spot card for automated flow-through elution coupled with online SPE-LC-MS/MS bioanalysis of opioids and stimulants in blood. Anal Chem 2016; 88: 11229–11237.
16. Kim JH, Woenker T, Adamec J, Regnier F. Simple, miniaturized blood plasma extraction method. Anal Chem 2013; 85: 11501–11508.
17. Hauser J, Lenk G, Hansson J, Beck O, Stemme G, Roxhed N. High-yield passive plasma filtration from human finger prick blood. Anal Chem 2018; 90: 13393–13399.

The authors
Sofie Velghe PharmD and Christophe P. Stove* PharmD, PhD
Laboratory of Toxicology, Department of Bioanalysis, Faculty of Pharmaceutical Sciences, Ghent University, 9000 Ghent, Belgium

*Corresponding author
E-mail: christophe.stove@ugent.be

C365 Cawood Fig1

Benefits of specific drugs of abuse analysis by tandem mass spectrometry in urine and oral fluid

Quantitative specific drug analysis by tandem mass spectrometry allows a wide range of drugs to be analysed in either urine or oral fluid to confirmation standards. The repertoire of drugs is based on drugs of abuse implicated in drug-related deaths in Scotland and currently includes 27 specific drugs and metabolites.

by Dr Paul Cawood and Joanne McCauley

Background
Drugs of abuse have traditionally been identified by immunoassay screening methods. Some of these are relatively non-specific and require second-line confirmatory tests, traditionally by gas chromatography–mass spectrometry (GC-MS). As most drugs are not volatile, GC-MS requires derivatization to render them volatile. Tandem mass spectrometry (TMS) has the advantage that samples can be analysed directly, without derivatization.

Scotland has the highest rate of drug-related deaths in Europe, and deaths are increasing steeply [1, 2], even though the number of substance misusers has not changed recently. Most deaths are due to accidental overdosing with opiates, which causes death from cardiac or respiratory failure. The steep increase is the result of poly-drug use, with gabapentin/pregabalin and street benzodiazepines (such as etizolam and alprazolam) implicated in a large number of these deaths. Many of these drugs cannot be identified by traditional immunoassay screening methods, even with GC-MS confirmation. However, it is possible to identify many of them by TMS.

Specific quantitative drug analysis by TMS
Urine and oral fluid drugs of abuse method
A rapid method for the analysis of drugs of abuse in urine has been reported previously [3]. This method has been modified for the analysis of drugs implicated in drug-related deaths in Scotland [2]. Using only one transition per drug can increase the risk of false-positive results [4]; hence, each drug has two transitions and a closely matched deuterated internal standard in order to avoid these issues. Calibrators and quality control samples are made from Cerilliant certified standards. The standard set comprises morphine, codeine, 6-monoacetylmorphine (6-MAM), dihydrocodeine (DHC), oxycodone, gabapentin, pregabalin, methadone, EDDP (2-ethylidene-1,5-dimethyl-3,3-diphenylpyrrolidine, the methadone metabolite), buprenorphine, norbuprenorphine, tramadol, amphetamine, 3,4-methylenedioxymethamphetamine (MDMA, or ecstasy), methamphetamine, cocaine, benzoylecgonine (BEC), diazepam, nordiazepam, temazepam, oxazepam, 7-amino-clonazepam, nitrazepam, alprazolam, diclazepam, delorazepam and etizolam. The stock standard solution is made by adding 100 µg of each standard to a 20 mL volumetric flask, giving 5 000 µg/L. Calibrators are prepared at 5, 10, 20, 30, 100, 300 and 1000 µg/L, with quality controls at 10, 20, 50, 100, 300 and 400 µg/L, in 3 % human serum albumin. The albumin prevents non-specific binding to the container.
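
To make the preparation arithmetic explicit, the short sketch below (with a hypothetical working volume; it is not the laboratory's actual worksheet) reproduces the stock concentration calculation and derives the stock volume that would be needed for each calibrator level.

```python
# Illustrative check of the stock and calibrator arithmetic described above;
# a back-of-the-envelope sketch, not the laboratory's preparation protocol.

def concentration_ug_per_L(mass_ug, volume_mL):
    """Concentration in µg/L from a mass in µg made up to a volume in mL."""
    return mass_ug / (volume_mL / 1000.0)

# 100 µg of each standard made up to 20 mL gives the 5000 µg/L stock
stock = concentration_ug_per_L(100, 20)
print(f"Stock concentration: {stock:.0f} µg/L")

# Volume of stock needed to prepare each calibrator level
calibrators_ug_per_L = [5, 10, 20, 30, 100, 300, 1000]
final_volume_mL = 10  # hypothetical working volume per calibrator
for level in calibrators_ug_per_L:
    stock_volume_uL = level / stock * final_volume_mL * 1000
    print(f"{level:>5} µg/L calibrator: {stock_volume_uL:.0f} µL stock made up to {final_volume_mL} mL")
```
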

Spot urine samples are collected in universal containers and oral fluid is collected into a Sarstedt salivette cortisol collection device (without preservative).

To 50 µL of calibrator, quality control, patient urine or oral fluid, 20 µL of zinc sulphate (0.1 mol/L) and 150 µL of internal standard mixture (containing 17 deuterated internal standards at 1 µg/100 mL in methanol) are added. The sample is mixed and centrifuged, 75 µL of supernatant is removed and added to 300 µL of water, and a volume of 20 µL is injected.
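
These volumes imply a fixed overall dilution of the sample before injection, which the following sketch simply works out; the calculation is generic and the only inputs are the volumes quoted above.

```python
# Quick sketch of the overall sample dilution implied by the preparation
# volumes above (values taken from the text; the calculation is generic).

sample_uL = 50
zinc_sulphate_uL = 20
internal_standard_uL = 150
precipitation_volume = sample_uL + zinc_sulphate_uL + internal_standard_uL  # 220 µL

supernatant_uL = 75
water_uL = 300
injection_mix_volume = supernatant_uL + water_uL  # 375 µL

# Dilution at the precipitation step, then at the aqueous dilution step
dilution_step1 = precipitation_volume / sample_uL      # 4.4-fold
dilution_step2 = injection_mix_volume / supernatant_uL  # 5-fold
overall_dilution = dilution_step1 * dilution_step2      # 22-fold

print(f"Overall sample dilution before injection: {overall_dilution:.0f}-fold")
```
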

TMS analysis
Samples are analysed on a Waters Xevo tandem mass spectrometer using a Waters Acquity ultra-high-performance liquid chromatography HSS C18 column (1.8 µm, 100 mm) at 50 °C. The sample is eluted using a multi-step gradient of water (1 % formic acid, 2 mM ammonium acetate) and acetonitrile (1 % formic acid), starting at 98 % water/2 % acetonitrile, changing to 63 %/37 % at 3.4 min and then to 5 %/95 % at 4.5 min, before reverting to 98 %/2 % at 5.2 min (Fig. 1).
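
For clarity, the gradient can be written as a simple timetable; the sketch below is an illustrative representation of the steps listed above (not instrument method code) and interpolates the organic fraction at intermediate time points.

```python
# Sketch of the multi-step gradient as a timetable, with a helper that
# interpolates the % acetonitrile at any time point. Times and percentages
# are taken from the description above; re-equilibration is not modelled.

gradient = [  # (time in min, % acetonitrile)
    (0.0, 2),
    (3.4, 37),
    (4.5, 95),
    (5.2, 2),
]

def percent_acetonitrile(t_min):
    """Linearly interpolate % acetonitrile between gradient steps."""
    if t_min <= gradient[0][0]:
        return gradient[0][1]
    for (t0, b0), (t1, b1) in zip(gradient, gradient[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    return gradient[-1][1]

for t in (0.0, 1.7, 3.4, 4.0, 4.5, 5.2):
    print(f"t = {t:.1f} min: {percent_acetonitrile(t):.0f} % acetonitrile")
```
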

Drugs are identified when the quantifier ion transition has the same peak shape as the qualifier ion transition, the retention time matches that of the corresponding deuterated internal standard, and the quantifier-to-qualifier ion ratio matches that of the calibrators (Fig. 2). Drugs are reported as positive when above the corresponding threshold level. Threshold levels are broadly based on the Driving Under the Influence of Drugs (DRUID) or European Workplace Drug Testing Society (EWDTS) confirmation test levels for both urine and oral fluid (Table 1).
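
The sketch below illustrates this kind of acceptance logic in Python; the retention time and ion ratio tolerances are illustrative placeholders, not the laboratory's validated limits.

```python
# Hedged sketch of the acceptance logic described above: a drug is reported
# as positive only if retention time, ion ratio and concentration criteria
# are all met. Tolerances here are placeholders for illustration only.

def is_positive(result, threshold_ug_per_L,
                rt_tolerance_min=0.1, ion_ratio_tolerance=0.2):
    """result: dict with retention_time, is_rt (internal standard RT),
    quant_area, qual_area, calibrator_ion_ratio and concentration."""
    rt_ok = abs(result["retention_time"] - result["is_rt"]) <= rt_tolerance_min
    # qualifier-to-quantifier area ratio compared with that of the calibrators
    ion_ratio = result["qual_area"] / result["quant_area"]
    ratio_ok = (abs(ion_ratio - result["calibrator_ion_ratio"])
                <= ion_ratio_tolerance * result["calibrator_ion_ratio"])
    above_threshold = result["concentration"] >= threshold_ug_per_L
    return rt_ok and ratio_ok and above_threshold

example = {  # hypothetical codeine result
    "retention_time": 2.31, "is_rt": 2.30,
    "quant_area": 125_000, "qual_area": 44_000,
    "calibrator_ion_ratio": 0.36, "concentration": 85.0,
}
print("Report as positive:", is_positive(example, threshold_ug_per_L=30))
```
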

We analyse 4 000 urine and 17 000 oral fluid samples each year, predominantly from problem drug users (Fig. 3).

Drugs of abuse in urine

TMS greatly reduces the false-positive results seen with immunoassay methods and negates the need for second-line confirmatory tests. However, the use of urine as a sample medium still has a number of disadvantages: it is susceptible to adulteration or spiking with drugs; sample collection is not witnessed; and urine drug concentrations vary depending on hydration status, which can affect whether a drug is reported as positive or negative relative to threshold levels. Additionally, some drugs are excreted relatively unchanged in urine, whereas others are highly metabolized and conjugated, in which case unchanged parent drug levels can be low. In order to keep the sample preparation simple, it was decided not to hydrolyse drugs in urine but to measure predominantly parent drugs, including metabolites only where necessary. This required threshold levels to be adjusted to give positivity rates comparable to immunoassay methods (Table 1).

Drugs of abuse in oral fluid
Oral fluid overcomes many of the disadvantages of urine: sample collection can be witnessed; samples cannot be adulterated or spiked; and threshold levels are not affected by hydration status. Since we started offering an oral fluid service, most clinicians have switched from urine to oral fluid testing. Parent drugs predominate in oral fluid, with metabolite levels generally being absent or uninformative, with the exception of BEC and nordiazepam. Drugs of abuse are predominantly weak bases and diffuse from serum (pH 7.4) into oral fluid (pH 4.0–6.0), where the lower pH leaves a larger fraction ionized, so some drugs are unable to diffuse back out again. This can result in drug levels being higher in oral fluid than in blood, and levels can remain positive for longer in oral fluid than in blood or urine, giving a longer duration of detectability for some drugs (Table 1) [5].
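
The classical pH-partition (Henderson–Hasselbalch) model, a standard textbook description rather than a calculation from this article, illustrates why a weak base accumulates in the more acidic oral fluid; the sketch below computes the predicted oral fluid:plasma ratio for a hypothetical base, ignoring protein binding.

```python
# Theoretical illustration (not from the article) of ion trapping of weak
# bases in oral fluid, using the pH-partition model and ignoring protein
# binding. The pKa value below is a rough illustrative figure.

def oral_fluid_plasma_ratio_base(pka, ph_oral_fluid=6.0, ph_plasma=7.4):
    """Predicted oral fluid:plasma concentration ratio for a weak base."""
    return (1 + 10 ** (pka - ph_oral_fluid)) / (1 + 10 ** (pka - ph_plasma))

# Example: a hypothetical base with pKa ~8.5
print(f"Predicted oral fluid:plasma ratio: {oral_fluid_plasma_ratio_base(8.5):.1f}")
```
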

Opiates
Heroin contains diacetylmorphine and acetylcodeine, which are rapidly metabolized into 6-MAM and codeine, respectively; both 6-MAM and codeine are further metabolized to morphine. Morphine is the major excretory product of heroin in urine and is detectable for up to 72 h after heroin has been taken [6]. Finding 6-MAM confirms that heroin has been taken. Finding codeine in the absence of 6-MAM is also compatible with codeine consumption. 6-MAM is the major heroin marker in oral fluid and always indicates heroin use; morphine and codeine levels are generally lower than 6-MAM in oral fluid. Finding morphine in oral fluid in the absence of 6-MAM or codeine usually indicates that a pure morphine preparation has been taken. Long detection times for 6-MAM in oral fluid have been reported in a Norwegian study that analysed daily blood, urine and oral fluid samples in 20 heroin overdose cases: 6-MAM could remain positive in oral fluid for 5 days or more after heroin had been taken, and in one case the test was positive 8 days after exposure [7]. Dihydrocodeine, tramadol and oxycodone can be readily identified in both urine and oral fluid.
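
These interpretation rules can be summarized schematically; the sketch below is a deliberately simplified rendering of the oral fluid logic described above and omits cut-offs and clinical context.

```python
# Simplified sketch of the opiate interpretation rules described above for
# oral fluid; illustrative only, not the laboratory's reporting logic.

def interpret_opiates(found):
    """found: set of analytes detected above threshold in oral fluid."""
    if "6-MAM" in found:
        return "Consistent with heroin use (6-MAM detected)."
    if "codeine" in found:
        return "Consistent with codeine use (no 6-MAM detected)."
    if "morphine" in found:
        return "Morphine without 6-MAM or codeine: suggests a morphine preparation."
    return "No heroin/morphine/codeine markers detected."

print(interpret_opiates({"morphine", "codeine", "6-MAM"}))
print(interpret_opiates({"morphine"}))
```
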

Cocaine
Cocaine is rapidly metabolized into BEC. BEC is a better urine marker of cocaine use than cocaine itself and can be detected for 48–72 h after cocaine use [6]. However, cocaine predominates in oral fluid, at much higher levels than BEC, and can remain positive in oral fluid for up to 5 days after use.

Methadone/buprenorphine
Methadone and buprenorphine are prescribed for the treatment of opioid dependence and are metabolized into EDDP and norbuprenorphine, respectively. Both EDDP/methadone and norbuprenorphine/buprenorphine concentrations are measured in urine. Usually, EDDP levels are significantly higher than those of methadone, and norbuprenorphine levels are usually much higher than those of buprenorphine. Finding methadone or buprenorphine levels greater than those of EDDP or norbuprenorphine therefore indicates that the sample has been spiked. Parent methadone and buprenorphine appear in oral fluid, whereas EDDP and norbuprenorphine do not. Buprenorphine is administered sublingually, and levels in oral fluid are very high in samples collected immediately after administration; to avoid this, oral fluid samples should not be collected within 1 h of the buprenorphine dose. The buprenorphine half-life varies from 2 to 24 h [8], and oral fluid can be negative for buprenorphine if the sample is collected the day after a low dose.
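
The parent/metabolite plausibility check described above can be expressed as a simple rule; the following sketch, with hypothetical urine concentrations, is purely illustrative.

```python
# Sketch of the urine parent/metabolite check described above: parent
# methadone (or buprenorphine) exceeding its metabolite raises suspicion
# that the sample was spiked. Values below are hypothetical.

def flag_possible_spiking(parent_ug_per_L, metabolite_ug_per_L):
    """Return True if the parent drug concentration exceeds its metabolite."""
    return parent_ug_per_L > metabolite_ug_per_L

print("Methadone spiking suspected:",
      flag_possible_spiking(parent_ug_per_L=900, metabolite_ug_per_L=150))  # EDDP unusually low
print("Buprenorphine spiking suspected:",
      flag_possible_spiking(parent_ug_per_L=12, metabolite_ug_per_L=85))    # norbuprenorphine higher
```
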

Amphetamines
Amphetamine, MDMA and methamphetamine are excreted relatively unchanged in urine. Hence, the parent drugs are analysed in both urine and oral fluid.

Gabapentinoids
Gabapentin and pregabalin are predominantly excreted unchanged in urine, so the parent drug is readily detected in both urine and oral fluid. A survey of substance misusers in Lothian in 2012 indicated that gabapentin was taken to potentiate the high obtained from methadone and to increase the level of intoxication [9]. 92 % of samples positive for gabapentinoids are also positive for methadone or buprenorphine, confirming that these drugs are taken to boost the intoxicating effects of opiates and opioids.

Benzodiazepines
These drugs are highly metabolized and conjugated, with only a small amount of parent drug excreted unchanged in urine. As such, threshold levels are much lower than those used by immunoassay screening methods. Diazepam is metabolized into nordiazepam and temazepam, both of which are metabolized into oxazepam. Nordiazepam is also a metabolite of chlordiazepoxide. Finding diazepam, nordiazepam, temazepam and/or oxazepam is consistent with diazepam use, whereas finding nordiazepam in the absence of diazepam is also consistent with chlordiazepoxide use. Nordiazepam has a longer half-life than both diazepam and chlordiazepoxide and remains positive for longer than either parent drug. Detecting only temazepam, only nitrazepam or only oxazepam is consistent with those drugs having been taken. These patterns persist in both urine and oral fluid, although threshold levels are lower in oral fluid than in urine (Table 1).
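
As with the opiates, the benzodiazepine patterns can be sketched as simple rules; the example below is a simplified illustration of the metabolic relationships described above, not reporting logic used by the laboratory.

```python
# Sketch of the benzodiazepine pattern logic outlined above. Metabolic
# relationships only; thresholds, timing and prescription context are
# deliberately left out.

def interpret_benzodiazepines(found):
    """found: set of benzodiazepines/metabolites detected above threshold."""
    interpretations = []
    if "diazepam" in found:
        interpretations.append("Consistent with diazepam use.")
    elif "nordiazepam" in found:
        interpretations.append("Consistent with diazepam or chlordiazepoxide use.")
    if "temazepam" in found and not {"diazepam", "nordiazepam"} & found:
        interpretations.append("Consistent with temazepam use.")
    if found == {"oxazepam"}:
        interpretations.append("Oxazepam only: consistent with oxazepam use.")
    return interpretations or ["No benzodiazepine pattern detected."]

print(interpret_benzodiazepines({"nordiazepam", "oxazepam"}))
print(interpret_benzodiazepines({"temazepam"}))
```
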

Street benzodiazepines
Following the 2016 drug-related deaths in Scotland report [1], we introduced testing for etizolam, delorazepam, diclazepam and alprazolam into the standard set. These drugs are generally not available on prescription in the UK. Alprazolam and etizolam are short acting, whereas delorazepam and diclazepam are long acting. Alprazolam is six times more potent than diazepam [10].

Conclusion and future developments
Gabapentinoid use is widespread, and these drugs are almost always taken to potentiate methadone and other opiates or opioids. There is an increasing trend towards more potent street benzodiazepines. This poly-drug use has a detrimental effect on judgement and behaviour, leading to inadvertent overdosing, and is the main reason for the increase in drug-related deaths in Scotland in recent years [2]. Identifying the main drugs implicated in these deaths is only possible by TMS. In the future, additional drugs can be considered for inclusion, such as phenazepam (30 deaths in 2017), flubromazepam (9), fentanyl (15), mirtazapine (59), amitriptyline (36), sertraline (12), fluoxetine (12), olanzapine (9), quetiapine (11) and zopiclone (29). There is evidence that these are being abused by substance misuse clients, and all are implicated in significant numbers of drug-related deaths in Scotland [11].

References

1. Drug-related deaths in Scotland in 2016. A National Statistics report for Scotland. National Records of Scotland 2017 (https://www.nrscotland.gov.uk/files//statistics/drug-related-deaths/drd2016/drug-related-deaths-16-pub.pdf).
2. Drug-related deaths in Scotland in 2017. A National Statistics report for Scotland. National Records of Scotland 2018 (https://www.nrscotland.gov.uk/files//statistics/drug-related-deaths/17/drug-related-deaths-17-pub.pdf).
3. Eichhorst JC, Etter ML, Rousseaux N, Lehotay DC. Drugs of abuse by tandem mass spectrometry: a rapid, simple method to replace immunoassays. Clin Biochem 2009; 42: 1531–1542.
4. Sauvage FL, Gaulier JM, Lachatre G, Marquet P. Pitfalls and prevention strategies for liquid chromatography-tandem mass spectrometry in selected reaction-monitoring mode for drug analysis. Clin Chem 2008; 54(9): 1519–1527.
5. Bosker WM, Huestis MA. Oral fluid testing for drugs of abuse. Clin Chem 2009; 55(11): 1910–1931.
6. Baselt RC, Cravey RH. Disposition of toxic drugs and chemicals in man. 4th edition. Chemical Toxicology Institute 1995; ISBN: 978-0962652318.
7. Vindenes V, Enger A, Nordal K, Johansen U, Christophersen AS, Øiestad EL. Very long detection times after high and repeated intake of heroin and methadone, measured in oral fluid. Forensic Sci 2014; 20(2): 34–41.
8. Kuhlman JJ Jr, Lanlani S, Magluilo J, Levine B, Darwin WD. Human pharmacokinetics of intravenous, sublingual and buccal buprenorphine. J Anal Toxicol 1996; 20(6): 369–378.
9. Baird CRW, Fox P, Colvin LA. Gabapentinoid abuse in order to potentiate the effects of methadone: a survey among substance misusers. Eur Addict Res 2014; 20(3): 115–118.
10. Aden GC, Thein SG Jr. Alprazolam compared to diazepam and placebo in the treatment of anxiety. J Clin Psychiatry 1980; 41(7): 245–248.
11. Barnsdale L, Gounari X, Graham L. The National Drug-Related Deaths Database (Scotland) Report. Analysis of deaths occurring in 2015 and 2016. Information Services Division, NHS National Services Scotland 2018 (https://www.isdscotland.org/Health-Topics/Drugs-and-Alcohol-Misuse/Publications/2018-06-12/2018-06-12-NDRDD-Report.pdf).

The authors
Paul Cawood* PhD
Joanne McCauley BSc
Department of Clinical Biochemistry, Royal Infirmary of Edinburgh, Edinburgh, UK

*Corresponding author
E-mail: Paul.cawood@nhs.net
