Clinical coagulation assays are an important part of anticoagulation measurement and monitoring. Despite the rise of promising new technologies, traditional coagulation assays have remained largely unchanged in recent decades. Here we discuss the application of microfluidics and nanotechnology to clinical coagulation diagnostics and anticoagulation therapy monitoring.
by Dr Francesco Padovani and Prof. Martin Hegner
Fast, accurate and reliable determination of multiple coagulation parameters is crucial for a correct diagnosis of blood coagulation disorders. The two most common coagulation assays performed regularly in hospital environments are prothrombin time (PT) and activated partial thromboplastin time (aPTT). These two assays measure the time required for the onset of fibrinogen proteolysis that is followed by the formation of a fibrin network [1]. The measurement is usually performed by detecting an increase in impedance or turbidity. Upon determination of an abnormal coagulation time, further testing is required (e.g. one-stage clotting assays or chromogenic substrate assays). Despite their extreme usefulness, these assays are not factor-specific and they are sensitive only if the factor activity is below 50 %. Additionally, fibrinolysis, crosslinking, clot strength and initial blood plasma viscosity (important mechanical parameters that relate to coagulation) are not measured, and the assays do not evaluate or monitor acute bleeding or thrombosis risk. These drawbacks call for the development and standardization of novel strategies that can improve the clinical diagnosis process. Global hemostasis assays such as thromboelastography (TEG), thrombin generation, and overall hemostasis potential are promising technologies that, despite being around for decades, are not routinely used by hematologists. These assays are based on bench-top devices and require dedicated clinical laboratories and qualified personnel. Novel strategies based on microfluidics and nanotechnology may enable point-of-care testing (with potential for self-testing), self-monitoring and a great reduction in sample volume needed [2].
Anticoagulation monitoring and measurement
Accurate, reliable and frequent measurement and monitoring of anticoagulant therapies such as warfarin or heparin is vital to their effectiveness. When control is poor, patients experience more complications such as joint pain, bleeding and strokes [3]. The gold standards used for assessing the level of anticoagulation control are the percent time in therapeutic range (TTR) and international normalized ratio (INR). Both of these measures rely on standardization of the patient’s PT against an international standard. TTR is usually calculated with the method by Rosendaal that employs linear interpolation to assign an INR value to each day between successive observed INR values [4]. Therefore, patients who undergo an anticoagulation therapy must have their coagulation parameters assessed frequently. Systematic reviews showed that self-testing and self-management are an effective and safe intervention [5]. Self-testing devices should be simple to use, provide fast and analytically accurate results, and require a minimal amount of sample. Ideally, they should also be portable.
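The Rosendaal interpolation mentioned above can be sketched in a few lines of Python. This is a minimal illustration rather than a validated clinical tool. The function name `rosendaal_ttr`, the default therapeutic range of 2.0 to 3.0, the use of integer day numbers, and the convention of counting each day from one visit up to (but not including) the next are all assumptions made for this example.

```python
def rosendaal_ttr(days, inrs, low=2.0, high=3.0):
    """Percent of days whose linearly interpolated INR lies in [low, high].

    days : increasing integer day numbers of the INR measurements
    inrs : INR values observed on those days
    The range 2.0-3.0 is an illustrative default, not a recommendation.
    """
    if len(days) != len(inrs) or len(days) < 2:
        raise ValueError("need at least two paired (day, INR) observations")

    days_in_range = 0
    days_total = 0
    for i in range(len(days) - 1):
        d0, d1 = days[i], days[i + 1]
        v0, v1 = inrs[i], inrs[i + 1]
        span = d1 - d0
        if span <= 0:
            raise ValueError("day numbers must be strictly increasing")
        # Assign an INR to every day in [d0, d1) by linear interpolation.
        for k in range(span):
            inr = v0 + (v1 - v0) * k / span
            days_total += 1
            if low <= inr <= high:
                days_in_range += 1
    return 100.0 * days_in_range / days_total


# Hypothetical patient: INR rises steadily from 2.0 to 4.0 over ten days.
ttr_example = rosendaal_ttr([0, 10], [2.0, 4.0])
```

Day-counting conventions differ between implementations, so results from a sketch like this should not be compared directly with values reported by clinical software.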
Novel strategies exploiting microfluidics and nanotechnology
Novel approaches that employ microfluidics and nanotechnology have been developed in recent years. The main advantages of these techniques are high sensitivity and a great potential for miniaturization and point-of-care testing. Some studies proposed the use of quartz crystal microbalance (QCM) to measure the viscoelastic properties of blood plasma clot formation [6–9]. QCM consists of a quartz crystal resonator whose resonant frequency depends on the mass adsorbed onto the sensor and on the viscoelastic properties of the fluid surrounding the resonator. These studies showed performance superior to conventional TEG and required relatively small sample volumes. However, deconvolution of unspecific protein adsorption and liquid viscoelastic properties is very complex, hindering the potential to accurately measure clot strength development during coagulation. Other studies employed surface plasmon resonance (SPR) detection. SPR is a popular technology in the field of biomarker detection. A polarized light beam hits a glass/liquid interface, causing an electromagnetic field to exit the glass. If a thin metal film is applied between the glass and the liquid, surface plasmons are excited. The reflected light is collected by a sensor and upon receptor/target recognition the reflectivity curve shifts [10]. With SPR, extrapolation of viscoelastic parameters is not feasible. To the best of our knowledge, only PT was measured using this technology [11]. Our laboratory exploited nanomechanical resonators to quantify coagulation parameters. The resonators are arrays of microcantilevers (beams clamped at one end) that oscillate at high speed. When immersed in a fluid, the viscosity and density can be measured in real time by tracking quality factor and resonant frequency of the oscillation [12].
By combining microfluidics technology, ensuring uniform mixing of coagulation reagents, with a high degree of automation and accurate extrapolation of the results, nanoresonators demonstrated great ability to measure clinically relevant coagulation parameters [13]. Along with PT and aPTT, other parameters are measured within the same test run, such as initial plasma viscosity, clot strength (final viscosity), and initial and final coagulation rates. For example, patients with severe hemophilia showed a low initial plasma viscosity, low clot strength (bleeding), and low coagulation rates. By mixing hemophiliac patients’ plasma with 30 % normal control plasma, the coagulation rates and the clot strength were improved, but not completely restored, indicating the degree of severity (Fig. 1). To detect deficiencies of specific factors, an immunoassay can be integrated in situ, allowing for diagnosis of factor deficiency within a single test run. Furthermore, the diagnostic array can be reused repeatedly by regeneration in a cleaning solution [13]. The same microcantilever technology was applied to measure fibrinolysis in real time. It is well known that impaired function of the fibrinolytic system increases the risk of thrombosis [14]. By pre-mixing a patient’s blood plasma with tissue plasminogen activator and performing a PT (or aPTT) assay, the PT (or aPTT) and the following induced fibrinolysis can be measured. Parameters such as starting clot strength, final dissolved clot strength and 50 % lysis time (Fig. 2) provide useful information for assessing the patient’s thrombotic risk. Finally, the effect of anticoagulation treatment (heparin) was measured using low and high concentrations of heparin mixed with normal control plasma (Fig. 3). Potentially, a patient under anticoagulation treatment could self-monitor their status and self-manage their therapy according to the results.
For example, the final clot strength could indicate bleeding risk and the therapy can be adjusted to suit the particular needs of the specific patient (personalized medicine). All these measurements were performed with a low sample volume (<20 µl) and a high degree of automation (reducing operator intervention and complexity).
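As an illustration of how the fibrinolysis parameters named above (starting clot strength, final dissolved clot strength and 50 % lysis time) might be read off a recorded clot-strength curve, the following Python sketch may help. It is not the analysis used in [13]. The function name `lysis_parameters`, the choice of the curve maximum as the starting clot strength, the last sample as the final dissolved clot strength, and the measurement of the 50 % lysis time from the trigger at time 0 are all assumptions made for this example.

```python
def lysis_parameters(times, strength):
    """Return (peak strength, final strength, 50 % lysis time or None).

    times    : sample times, with coagulation triggered at time 0
    strength : clot strength (viscosity) readings at those times
    """
    if len(times) != len(strength) or len(times) < 2:
        raise ValueError("need at least two paired samples")

    # Index of the maximum reading, taken here as the starting clot strength.
    peak_idx = max(range(len(strength)), key=lambda i: strength[i])
    peak = strength[peak_idx]
    final = strength[-1]          # taken as the final dissolved clot strength
    if peak <= final:
        return peak, final, None  # no lysis visible in the record

    # Level halfway between the peak and the final plateau.
    half_level = final + 0.5 * (peak - final)

    # First crossing of the half level after the peak, linearly interpolated.
    for i in range(peak_idx + 1, len(strength)):
        if strength[i] <= half_level:
            t0, t1 = times[i - 1], times[i]
            y0, y1 = strength[i - 1], strength[i]
            t_half = t0 + (y0 - half_level) * (t1 - t0) / (y0 - y1)
            return peak, final, t_half
    return peak, final, None


# Hypothetical, noise-free readings in arbitrary units.
example = lysis_parameters([0, 1, 2, 3, 4, 5], [1.0, 5.0, 4.0, 3.0, 2.0, 1.0])
```

Real traces are noisy, so a practical implementation would likely smooth the curve or fit a model before locating the peak and the half level.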
Anticoagulation measurement and monitoring employ assays that have gone largely unchanged for decades. The rise of new technologies such as microfluidics and nanotechnology carries great potential for integration with standard clinical assays. Global hemostasis assays could pave the way for an improvement in the current clinical coagulation diagnostics. Miniaturization, personalized medicine, point-of-care testing, automation, self-testing and self-monitoring are all interesting approaches that could overcome current drawbacks of gold standards in coagulation measurements. However, all these strategies require more standardization and more clinical studies to assess and exploit their potential.
Figure 1. Representation of the suspended microresonators oscillating at high speeds (approx. 300 kHz) and microfluidics set-up. Clot strength (viscosity) curves over time for normal control samples, mild hemophilia and severe hemophilia patients’ plasma during activated partial thromboplastin time (aPTT) assays performed with nanoresonators. The array of sensors is first immersed in human blood plasma (green area) and then, at time 0 s, coagulation is triggered with the specific reagents (orange area). Final clot strength, coagulation rates and aPTT values are dependent on the degree of severity. (Padovani F, Duffy J, Hegner M. Nanomechanical clinical coagulation diagnostics and monitoring of therapies. Nanoscale 2017; 9(45): 17939–17947 [13] – Reproduced by permission of The Royal Society of Chemistry.)
Figure 2. Clot strength developing over time for tissue plasminogen activator (tPA) assisted fibrinolysis. Normal control plasma was mixed with a 350 ng/ml tPA solution. After the measurement of the plasma viscosity, the coagulation is triggered at time 0 s with PT reagents. As soon as the coagulation is triggered, the clot strength increases, but at the same time the activity of tPA starts to lyse the fibrin network. After approx. 32 min, the clot is completely dissolved and the final strength is lower than the starting plasma viscosity. This difference is due to the fibrin breakage into soft fibrin particles that have no viscosity. Some of the parameters that can be extracted are PT (see zoom plot), starting clot strength (C+B), final dissolved clot strength (C), and time (50 % Ly) required to reach half-clot strength (50 % B). (Padovani F, Duffy J, Hegner M. Nanomechanical clinical coagulation diagnostics and monitoring of therapies. Nanoscale 2017; 9(45): 17939–17947 [13] – Reproduced by permission of The Royal Society of Chemistry.)
Figure 3. Effects of heparin on the clot strength development during an aPTT test. After measurement of plasma viscosity, coagulation is triggered at time 0 s with aPTT reagents. Higher concentrations of heparin cause a more prolonged aPTT but the final clot strength is always in the normal range. (Padovani F, Duffy J, Hegner M. Nanomechanical clinical coagulation diagnostics and monitoring of therapies. Nanoscale 2017; 9(45): 17939–17947 [13] – Reproduced by permission of The Royal Society of Chemistry.)
The authors
Francesco Padovani PhD and Martin Hegner* PhD
Centre for Research on Adaptive Nanostructures and Nanodevices (CRANN), School of Physics, Trinity College Dublin, Dublin, Ireland
*Corresponding author
Modern ‘omics’ and screening technologies make possible the analysis of large numbers of proteins with the aim of finding biomarkers for individually tailored diagnosis and prognosis of disease. However, this goal will only be reached if we are also able to sensibly sort through the huge amounts of data that are generated by these techniques. This article discusses how data analysis techniques that have been developed and refined for over a century in the field of psychology may also be applicable and useful for the identification of novel biomarkers.
by Dr J. Michael Menke and Dr Debosree Roy
The profession and practice of medicine are rapidly moving towards more specialization, more focused diagnoses and individualized treatments. The result will be called personalized medicine. Presumably genetic predisposition will remain the primary biological basis, but diagnosis and screening will also evolve from complex system outputs observed as increases or decreases of levels of biomarkers in human secretions and excretions. In this sense, the exploration in the human sciences will undoubtedly expand to new frontiers, interdisciplinary cooperation, new disease reclassifications, and the disappearance of entire scientific professions.
Big data and massive datasets by themselves can never answer our deepest and most troubling questions about mortality and morbidity. After all, data are dumb, and need to be properly coaxed to reveal their secrets [1]. Without theories, our great piles of data remain uninformative. Big data need to be organized for, and subjected to, theory testing or data fitting to best competing theories [2, 3] to avoid spurious significant differences, conceivably the biggest threat to science in history [4, 5].
Old tools for big data
New demands presented by our ubiquitous data require new inferential methods. We may discover that disease is emergent from many factors working together to create a diagnosis in one person that, in fact, actually has many different causes in another person with the same diagnosis. Perhaps there are new diseases to be discovered. There might be better early detection and treatment. Much like the earliest forms of life on earth, pathology is much more complicated than just the rise of plant and animal kingdoms as taught mid-twentieth century in evolution.
Although new methodologies may meet scientific requirements of big data, tools already in existence may obviate the need to invent new ones. In particular, methods developed by and for psychologists over more than 100 years may already be an answer. Established data organization and analysis have already been developed by psychologists to test theories about nature’s most complex systems of life. Inference and prediction from massive amounts of data from multiple sources might yield more from these ‘fine scalpels’ without the need for brute force analyses, such as tests for statistical differences that look significant in many cases because of systematic bias in population data arising from unmeasured heterogeneity. The development of some of the most applicable psychological tools began in the early 20th century for measuring intelligence, skills and abilities. Thus, these tools have been used and refined for over a century. From psychological science emerged elegant approaches to data analysis and reduction to evaluate persons and populations for test validity, reliability, sensitivity, specificity, positive and negative predictive values, and efficiency. Psychological testing and medical screening share a common purpose: measure the existence and extent of largely invisible or hard to measure ‘latent’ attributes by establishing how various indicators that are attached to the latent trait react to the presence or absence of subclinical or unseen disease. Biomarkers are thus analogues of test questions, with each biomarker expressing information that helps establish the presence or absence of disease and its stage of progression. The analogous process recommended in this paper is simply this: How many and what kind of biomarkers are sufficient to screen for disease?
Biomarkers for whole-person healthcare
Although the use of biomarkers seems to buck the popular trend of promoting whole person diagnosis and treatment, biomarkers per se are actually nothing new. Biomarkers as products of human metabolism and waste have played an important role over centuries of disease diagnosis and prognosis, preceding science and often leading to catastrophic or ineffective results (think of ‘humours’ and ‘bloodletting’ as examples). Today, blood and urine chemistries are routinely used for focusing on a common cause (disease) of a number of symptoms. Blood in the stools, excessive thirst, glucose in urine, colour of eye sclera, round out information attributable to a common and familiar cause crucial for identifying and treating a system or body part. Signs of thirst and frequent urination may be necessary, but not sufficient for diagnosis of diabetes mellitus, yet can lead to quick referral or triage. The broad category of the physiological signs (biomarkers) has extended along with technology to the microscopic and molecular.
Today, the general testing for and collection of biomarkers in bodily fluids is a growing medical research frontier. However, to many, biomarkers can be confused with genes and epigenetic expressions of genes. Small distinctions might lead to the discovery of new genes and, in turn, to new definitions of disease, more accurate detection, and more personal treatment.
With the flood of data unleashed by research in these areas, a new and fundamental problem arises: How do we make sense of all these data? For now, professions and the public may be putting their faith in ‘big data’ in order to make biomarkers clinically meaningful and informative. We are in good company with those who remind us that data are dumb and can be misused to support bias, and that lots of poor quality data do not add up to good science. At heart, scientific theories need to be tested and scientific knowledge built in supported increments.
Biomarkers as medical tests
As with any medical test, some biomarkers are more accurate, or more related, to disease presence and absence and therefore are better indicators of underlying disease state. Thus, some biomarkers are more accurate than others; or put another way, biomarkers represent ‘mini-medical’ tests and their levels of contribution to diagnoses and prognoses depend upon random factors, along with sensitivity, specificity, and disease prevalence [6]. Some biomarkers may increase in presence with disease but lower with health, or the opposite – lower concentrations with disease. To complicate matters further, there are probably plenty of mixed signals, i.e. biomarker A is more sensitive than biomarker B, but B is more specific than A. Blending the information acquired by multiple biomarkers needs to be organized and read in a sequence to reduce false signals – positives or negatives – or at least minimize errors based on risk of disease and morbidities and mortalities.
Thus, managing and analysing the flood of biological diagnostic data is not the concern here, but rather its interpretation and clinical application. Balancing biomarker information at the clinical level is the function of translational research. Test-and-measurement (T&M) psychologists have worked on the science of organizing and interpreting individual items as revealing underlying latent constructs for over a century. Through the extremely tedious task of measuring human intelligence, skills and abilities, some already developed T&M tools could help improve the science, accuracy and interpretation of biomarkers [6].
Psychometric properties of biomarkers
Before embarking on a psychometric approach to biomarker interpretation, some common definitions are required. For instance, what is sensitivity or specificity? A psychometric or medical test shows high sensitivity when the underlying disease or person characteristic is also high. For intelligence, a high test score implies high intelligence. On a single well-crafted test question, the probability of answering it correctly (formally called probability of endorsing) increases along with higher intelligence; depending on how closely the question is associated with intelligence, it is a strong or weak indicator of personal intelligence. When many test questions are indicators of intelligence, more correctly endorsed answers of good questions should indicate more intelligence. Indeed, some questions may even be ‘easier’ than others, leading to the need to design questions to fill out the continuum of an underlying intelligence being measured. This procedure is item analysis, a part of item response theory. See Figure 1 for an illustration of how multiple items ‘cover’ a given theta or disease.
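The idea that the probability of endorsing an item rises with the latent trait, and that items of different difficulty ‘cover’ different parts of the continuum, can be sketched with a logistic item characteristic curve. The two-parameter logistic form used below is a common choice in item response theory but is an assumption here, as the text does not name a specific model. The function names and the five item difficulties are hypothetical.

```python
import math


def p_endorse(theta, difficulty, discrimination=1.0):
    """Probability of endorsing an item under a two-parameter logistic model.

    theta          : latent trait level of the person (or disease severity)
    difficulty     : trait level at which endorsement probability is 0.5
    discrimination : steepness of the curve (a stronger indicator is steeper)
    """
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))


# Hypothetical items spread along the latent continuum, easiest first.
ITEM_DIFFICULTIES = [-2.0, -1.0, 0.0, 1.0, 2.0]


def endorsement_profile(theta, difficulties=ITEM_DIFFICULTIES):
    """Endorsement probability for each item at a given trait level."""
    return [p_endorse(theta, b) for b in difficulties]


low_trait = endorsement_profile(-1.5)   # endorses mainly the easiest items
high_trait = endorsement_profile(1.5)   # endorses most items
```

If all items had nearly the same difficulty, persons at very low or very high trait levels would produce almost identical profiles, which is the kind of coverage gap that item analysis is meant to reveal.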
Notice how irrelevant the concept of sensitivity is in clinical screening and diagnosis. Sensitivity means that if we already know for sure someone is smart or has a disease, the test and its questions will be correct in describing the latent construct (referred to as ‘theta’) a certain percentage of the time, based upon the test’s ability to detect and describe the presence or degree of the latent trait. Thus, the proportion of time the question is correct, given that we already know the person’s underlying status, is test or item sensitivity. Sensitivity is a test characteristic given we already know the latent trait – disease status. Symbolically, sensitivity is p(T+|D+), the probability of a positive test score (T+) given we already know the person has the disease (D+). Similarly, specificity is p(T−|D−), the probability of a negative test (T−) or item given that we already know that the patient is confirmed disease-free (D−).
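In practice, these two conditional probabilities are estimated from a validation sample in which the true disease status is already known. A minimal Python sketch follows; the function names and the counts in the example are hypothetical.

```python
def sensitivity(true_pos, false_neg):
    """Estimate p(T+|D+): share of diseased persons who test positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Estimate p(T-|D-): share of disease-free persons who test negative."""
    return true_neg / (true_neg + false_pos)


# Hypothetical validation sample: 100 confirmed diseased, 200 confirmed healthy.
sens = sensitivity(true_pos=90, false_neg=10)    # 90 of 100 diseased detected
spec = specificity(true_neg=160, false_pos=40)   # 160 of 200 healthy cleared
```

Both estimates condition on known disease status, so neither changes when the proportion of diseased persons in the sample changes.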
Bayesian induction
Bayes’ Theorem is useful for many reasons, some controversial. But combining disease prevalence with biomarker sensitivity and specificity will axiomatically give the probability of an individual having a disease given a positive test.
In Bayesian terms, the positive predictive value (PPV) is the posterior probability of a patient with a positive test. Two important properties of the PPV are: 1. It is a conversion of population prevalence turned into personal probability of disease based on a person’s positive test; and 2. PPV varies directly with the population prevalence of the disease. One cannot interpret a PPV without starting from its known or estimated population prevalence. PPV decreases with rare disease and increases with common disease, irrespective of tests’ sensitivity or specificity estimates. For further details see Figure 3 in the open access article ‘More accurate oral cancer screening with fewer salivary biomarkers’ by Menke et al. [7].
Sensitivity and specificity are characteristics of the test, not any patient. Such deductive processes are not at all clinically useful. In fact, diagnosing and screening are exactly the inverted probability of that: what is the inferred disease state, D+ or D−, from positive and negative test results? In other words, we want p(D+|T+) instead of p(T+|D+), and p(D−|T−) instead of p(T−|D−). The method for inverting the probabilities from test to patient characteristics is by the application of Bayes’ Theorem. This inverted probability is highly influenced by disease prevalence, however, whereas sensitivity and specificity are not.
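The inversion from test characteristics to patient probabilities can be written out directly from Bayes’ Theorem. The sketch below is illustrative only; the function names and the example values for sensitivity, specificity and prevalence are assumptions chosen to show how strongly the result depends on prevalence.

```python
def positive_predictive_value(sens, spec, prevalence):
    """p(D+|T+) from sensitivity, specificity and prevalence."""
    true_pos = sens * prevalence                    # p(T+|D+) * p(D+)
    false_pos = (1.0 - spec) * (1.0 - prevalence)   # p(T+|D-) * p(D-)
    return true_pos / (true_pos + false_pos)


def negative_predictive_value(sens, spec, prevalence):
    """p(D-|T-) from sensitivity, specificity and prevalence."""
    true_neg = spec * (1.0 - prevalence)            # p(T-|D-) * p(D-)
    false_neg = (1.0 - sens) * prevalence           # p(T-|D+) * p(D+)
    return true_neg / (true_neg + false_neg)


# Same hypothetical test (90 % sensitive, 90 % specific), two prevalences.
ppv_rare = positive_predictive_value(0.9, 0.9, 0.01)    # rare disease
ppv_common = positive_predictive_value(0.9, 0.9, 0.50)  # common disease
```

With these illustrative numbers the same test yields a PPV of roughly 8 % when 1 in 100 persons has the disease and 90 % when half of them do, even though sensitivity and specificity are unchanged.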
Role of prevalence in disease detection
Generally, the higher the disease prevalence in a population, the easier it is to detect. Fortunately, this coincides with good intuitive sense. In fact, when screening for diseases, we need to read the biomarker results diachronically to take advantage of the information added by each biomarker. ‘Diachronically’ refers to reading over time. In the case of biomarker screening, all biomarker antibodies or other detectors of biomarker presence will require the fewest number of biomarkers when read in context of other present biomarkers. Diachronic refers to the order in which biomarkers are read, not the order in which they are administered.
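Biomarkers can be strongly or weakly informative. The indicator of strong or weak biomarkers is the diagnostic likelihood ratio: the probability of a positive test in persons with the disease divided by the probability of a positive test in persons without it, i.e. sensitivity / (1 − specificity).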
More explicitly this is called a positive diagnostic likelihood ratio, abbreviated +LR. The higher the +LR, the more information the biomarker conveys about the presence or absence of disease. The inverted probability, p(D+|T+), is called the positive predictive value of a test, PPV.
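The link between +LR and PPV is easiest to see in the odds form of Bayes’ Theorem: the odds of disease after a positive test equal the odds before the test multiplied by +LR. A short Python sketch follows, using the standard definition of +LR as sensitivity divided by (1 − specificity); the function names and example values are hypothetical.

```python
def positive_lr(sens, spec):
    """Positive diagnostic likelihood ratio: p(T+|D+) / p(T+|D-)."""
    return sens / (1.0 - spec)


def ppv_from_lr(prevalence, lr):
    """Convert prevalence to PPV via odds: posterior odds = prior odds * LR."""
    prior_odds = prevalence / (1.0 - prevalence)
    posterior_odds = prior_odds * lr
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical biomarker: 90 % sensitive and 90 % specific, so +LR is about 9.
lr_example = positive_lr(0.9, 0.9)
ppv_example = ppv_from_lr(0.01, lr_example)
```

A +LR close to 1 leaves the odds almost unchanged, which is why such a biomarker is only weakly informative whatever its sensitivity.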
Diachronic contextual reading
When used in conjunction with other biomarkers, [p(D+|T1, T2, T3, …Tn)], the tests’ accuracy can be increased, but only if the test results are read diachronically. For instance, ‘passing along’ only positive test findings to another biomarker amounts to throwing out true negatives in the sample (and a few false negatives), which increases the ability to detect suspected diseased screened persons from a more prevalent sample pool. After five to ten of these ‘pass-alongs’, depending on original disease prevalence, the PPV can approach 100%, signifying great confidence that a disease is present and further testing and treatment are required. Panels of biomarkers – multiple biomarkers used in a single unit for screening – can also have a PPV. In some cases, biomarkers only appear in panels, in which case there is a resultant sensitivity, specificity and PPV for the entire panel.
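The ‘pass-along’ reading can be sketched as repeated Bayesian updating, in which the PPV after one positive biomarker becomes the prevalence of the pool passed to the next. The sketch below assumes that the biomarkers are conditionally independent given disease status, which the text does not state and which real biomarkers may violate. The function name and the example values are hypothetical.

```python
def sequential_ppv(prevalence, tests):
    """PPV after each successive positive test.

    prevalence : disease prevalence in the original screened population
    tests      : list of (sensitivity, specificity) pairs, in reading order
    Assumes the tests are conditionally independent given disease status.
    """
    ppvs = []
    prior = prevalence
    for sens, spec in tests:
        true_pos = sens * prior
        false_pos = (1.0 - spec) * (1.0 - prior)
        prior = true_pos / (true_pos + false_pos)  # posterior becomes new prior
        ppvs.append(prior)
    return ppvs


# Three hypothetical biomarkers, each 90 % sensitive and 90 % specific,
# read one after another in a population with 1 % prevalence.
ppv_steps = sequential_ppv(0.01, [(0.9, 0.9)] * 3)
```

With these illustrative values the PPV climbs from roughly 8 % after the first positive result to 45 % after the second and about 88 % after the third, because each step enriches the pool that is passed along.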
Biomarkers that are too sensitive might generate too many false positives. This problem can be overcome with a biomarker or biomarkers to ‘clean out’ the false positives. Highly specific biomarkers will throw out the false positives, a role that balances that of the sensitive biomarkers. Sensitivity and specificity generally vary inversely for each given biomarker. Those high on one attribute tend to be low on the other. Overall, according to our previous experience in meta-analyses, we found specificity was the primary attribute for quickly and accurately screening a population.
The exceptional biomarker can be high on both test attributes. In most cases, the information from mediocre biomarkers can be improved by combining them into biomarker panels with a combined accuracy stronger than any individual biomarker. Once biomarkers are ranked by diagnostic likelihood ratio (dLR) and pass along positive test results from highest to lowest dLR, the number of biomarkers required to achieve a PPV near 1.0 is considerably fewer than if biomarkers are ordered from lowest to highest dLR (Fig. 2).
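The effect of reading order can be illustrated by counting how many positive biomarkers are needed before the PPV crosses a chosen threshold. The sketch below again assumes conditional independence between biomarkers; the function name, the five likelihood ratios, the 1 % prevalence and the 0.9 target are hypothetical values chosen for the example.

```python
def biomarkers_needed(prevalence, likelihood_ratios, target_ppv=0.9):
    """Number of successive positive biomarkers needed to reach target_ppv.

    likelihood_ratios are applied in the order given. Returns None if the
    target is not reached. Assumes conditionally independent biomarkers.
    """
    odds = prevalence / (1.0 - prevalence)
    for count, lr in enumerate(likelihood_ratios, start=1):
        odds *= lr                                  # odds form of Bayes' rule
        if odds / (1.0 + odds) >= target_ppv:
            return count
    return None


# Hypothetical positive likelihood ratios for five biomarkers.
dlrs = [2.0, 3.0, 5.0, 10.0, 20.0]
strongest_first = biomarkers_needed(0.01, sorted(dlrs, reverse=True))
weakest_first = biomarkers_needed(0.01, sorted(dlrs))
```

Under these assumptions the final PPV after all five positives is the same in either order; what changes is how soon the threshold is crossed, and therefore how many biomarkers must actually be read.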
As you may have inferred by now, the methodology of identifying the best biomarkers is via meta-analysis. A word of caution for diagnostic meta-analyses. There are software packages for the meta-analysis of medical tests. Meta-DiSc is one such tool [8, 9]. Material on its development may be found here [9]. When last checked, the Meta-DiSc program was being revised to correct some estimate errors and researchers were re-directed to a Cochrane Collaboration page [10]. In short, it is important not to add up all cells as if they represent one large study, because this misrepresents study homogeneity and therefore variance.
We recommend a meta-analysis that uses an index of evidential support [11–13]. In so doing, the weighting of data based on sample size alone may be avoided [7].
Partitioning panels with evidential support estimates
Biomarkers may be high on either sensitivity or specificity, and some are very high in one attribute but not the other. Few are high on both. This issue may be overcome by building a panel around the biomarker(s) of interest, in which the weaknesses of individual members are averaged out by including other biomarkers with complementary strengths. A biomarker with high sensitivity and low specificity may be combined with biomarkers of complementary strengths, such as those with low sensitivity and high specificity. This can be tricky as an average accuracy might fall along a diagonal in a receiver operating characteristic (ROC) chart, rendering it a useless test. Indeed, the idea is to maximize the area under the curve on a ROC chart by ‘pulling the curve’ up into the upper left corner to create more area under the curve, representing diagnostic accuracy. For further details see Figure 2 in the open access article ‘More accurate oral cancer screening with fewer salivary biomarkers’ by Menke et al. [7].
The question is whether the combined accuracy from using two biomarkers is synergistically greater than, or merely an arithmetic average of, the two individual accuracies. This conundrum is solved by making sure there are data points in the upper left corner to ‘pull up’ the ROC curve and maximize the area under the curve, which translates roughly to diagnostic accuracy. In fact, sensitivity to cancer or any other disease must be inverted to PPV before the biomarker exhibits utility. Somewhat paradoxically, just using more biomarkers does not increase screening accuracy unless they are read in the diachronic context of other tests done at the same time (again, refer to Fig. 2 in this article).
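The area under a ROC curve can be approximated from a handful of operating points with the trapezoidal rule. The sketch below is a simple illustration, not a substitute for a proper ROC analysis. The function name `roc_auc` and the example points are hypothetical; each point is given as (false-positive rate, true-positive rate), i.e. (1 − specificity, sensitivity).

```python
def roc_auc(points):
    """Trapezoidal area under a ROC curve.

    points : iterable of (false_positive_rate, true_positive_rate) pairs.
    The end points (0, 0) and (1, 1) are added automatically.
    """
    curve = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    area = 0.0
    for (x0, y0), (x1, y1) in zip(curve, curve[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0   # area of one trapezoid
    return area


# A panel whose averaged accuracy lands on the diagonal is uninformative.
auc_diagonal = roc_auc([(0.5, 0.5)])
# A point in the upper left corner pulls the curve up and adds area.
auc_corner = roc_auc([(0.1, 0.9)])
```

With these illustrative points the diagonal case gives an area of 0.5 and the upper-left point an area of about 0.9, which is the difference between a useless test and an accurate one.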
Should cancer tests detect only binary signals?
From a test and measures perspective, each biomarker is a kind of test question, where the answer to each question is the state of disease in the body. Some questions or biomarkers or biomarker panels are more or less informative because they are more or less sensitive and specific to detecting disease. The answers sought are binary – yes or no. The patient either has a disease or does not. It is up to the properties of the tests to reveal the truth.
As mentioned before, biomarker accuracy varies. No medical test of any kind is 100% accurate. Biomarkers associated with cancer can and do appear at lower levels in healthy individuals. We must understand this principle to decide whether other tests or panels are necessary to improve screening or diagnostic information.
When educational psychologists measure traits and abilities, e.g. IQ, they ask a series of questions. To the degree that the questions are answered ‘correctly’, a person scores higher and has more of the trait or ability to be measured. Creating a survey or questionnaire is a rigorous process. Think of an underlying variable (IQ) as the latent construct. ‘Construct’ is the intended concept we attempt to measure. The construct is not directly measurable, and thus called latent. Each question is a kind of probe that, to various degrees of accuracy, allows indirect observation of the latent construct or disease state. By analogy, biomarkers can be interpreted as test questions indicating the existence of a latent trait or disease.
Pushing the test analogy further, biomarkers might be negatively keyed, i.e. the levels of certain biomarkers are reduced in the presence of disease, or positively keyed with biomarker presence associated with disease. Whereas assessment of traits and abilities measures a continuous scale of latent construct presence, biomarkers answer a simple binary choice: Is the disease present or not?
Biomarker accuracy is estimated by its sensitivity and specificity. Test questions are subject to data reduction techniques (factor analysis), internal consistency within factors, and item response theory to identify redundant questions and design new questions to cover gaps in detecting an underlying disease state.
As we are not basic scientists, but rather behavioural and population ones, we cannot address the clinical and laboratory aspects of biomarkers, but in collaboration with colleagues at dental programmes here in Mesa, Arizona and in Malaysia, we came to understand that some biomarkers are more informative than others in screening and diagnosing disease.
Unidimensionality, monotonicity, and local independence properties
Test items should obey conditions of unidimensionality, monotonicity, and local independence. Briefly applied to medical tests, biomarkers should be indicative of the same latent construct (presence of disease), but individual biomarkers should increase (be positive for disease) along with the actual presence of disease [14].
The application of item response theory to academic test scores will reveal that there are gaps in assessment that miss progress or degree of the latent construct. When graphed on person–item maps, the high-ability persons will score higher on the test – i.e. endorse more items, especially the most difficult ones. The item–person map might show two areas of concern: redundant items that may be removed from the test to make the test more efficient, and abilities that cannot be determined owing to items clustering over small ranges of the latent construct. This is exemplified in Figure 1 in Warholak et al. [15].
As for biomarker disease screening, test or panel gaps may miss a subclinical or early stage disease by not matching the stage with biomarkers that would alert us to that stage of disease. In effect, this would be a blind-spot that more research may be required to fill. On the one hand, for a binary screening outcome – yes or no – gaps are not crucial. On the other hand, the discovery of gaps may lead to better science and better early disease detection.
Generalizability theory
Generalizability theory – or G-Theory – is a tool developed by Lee Cronbach and colleagues at Stanford around 1972 [16]. Without getting into excessive detail, it should suffice in this article that G-Theory be mentioned as a methodology for identifying sources of error, bias, or interference in statistical modelling of complex systems. As an example of the reasons for developing G-Theory in the first place, students are taught by professors within classes in courses in schools and states and countries. Each level of this education hierarchy may become a source of variability. If what we want to produce is a similar product in student graduates, as minimal competency in medicine, we may glean interference – variability – introduced by various levels or one specific level. With G-Theory, the primary source of variance may be identified and modified accordingly.
In the biomarker analogy, some biomarkers introduce more confusion than they resolve and can be eliminated or modified to improve reliability and consistent accuracy.
Although biomarker research is being funded and undertaken at unprecedented levels, it is important to remember that credible handling of the data in a scientific manner is still the key to understanding and discovery. Big data still needs to answer the question of ‘What does it all mean?’ Yet, we recommend starting with the highly refined methodology developed for T&M of human skills, abilities and knowledge. At the very least, T&M science might minimize errors, increase medical test efficiencies, and be used to complement or confirm findings for translational research.
The authors
J. Michael Menke* DC, PhD, MA; Debosree Roy PhD
A.T. Still Research Institute, A. T. Still University, Mesa, AZ 85206, USA
*Corresponding author
The activated partial thromboplastin time coagulation assay is one of the most frequently performed tests in hematology, and has a variety of uses in clinical practice. Accurate interpretation of the test depends on both clinical context (i.e. why the test was ordered) as well as an understanding of each laboratory’s normal reference range and assay sensitivity regarding detection of factor deficiencies, (unfractionated) heparin therapy and lupus anticoagulant.
by Dr Julianne Falconer and Dr Emmanuel J. Favaloro
The activated partial thromboplastin time (APTT) assay is a commonly requested coagulation test, perhaps second only to the prothrombin time (PT)/international normalized ratio (INR), as used to monitor vitamin K antagonist (VKA) therapy such as warfarin. The APTT test assesses the intrinsic pathway of coagulation and has a variety of clinical uses; however, it is primarily used to screen for hemostasis issues, factor deficiencies, lupus anticoagulant (LA) or to monitor unfractionated heparin (UFH) therapy dosing. The test is sensitive to, but not specific for, detection of these abnormalities or influences. APTT prolongation may also be seen in liver disease, disseminated intravascular coagulation (DIC) and in the presence of factor inhibitors. Interpretation of an APTT result, be it normal or prolonged, is dependent on both the clinical context and the characteristics of the reagents and the assay as performed on particular instruments. The establishment of normal reference intervals (NRIs) and assessment of the assay in terms of its sensitivity to heparin, LA and clotting factors are important to provide accurate information for clinical interpretation [1].
Uses of the APTT assay
The APTT test is a global assay that measures the time to fibrin clot formation via the contact factor (‘intrinsic’) pathway (Fig. 1). The APTT test is usually performed on fully automated platforms, and involves activation of coagulation within the test (plasma) sample by the addition of specific reagents (containing phospholipids, contact factor activator and calcium chloride). The type of contact factor activator, and the type and concentration of phospholipid, used in the APTT reagent affects the sensitivity of the assay to, and thus its prolongation by, factor deficiencies, as well as to the presence of UFH and LA [1, 2].
The APTT is commonly used to monitor anticoagulation therapy using UFH (Table 1). It may also be prolonged, however, in the presence of VKAs including warfarin, as well as direct oral anticoagulants (DOACs) such as dabigatran (direct thrombin inhibitor) and rivaroxaban (anti-FXa inhibitor). The APTT is generally less sensitive to, but may still be slightly prolonged by, anticoagulation with low molecular weight heparin (LMWH) and with apixaban, another DOAC (anti-FXa inhibitor).
In the absence of anticoagulation therapy, an ‘isolated’ prolonged APTT may be used to determine a clinically important factor deficiency, for example as a screen for hemophilia A (FVIII deficiency), hemophilia B (FIX deficiency), or hemophilia C (FXI deficiency), or even von Willebrand Disease (VWD; which may be associated with loss of FVIII) [1]. An ‘isolated’ prolonged APTT, however, could instead be indicative of a clinically unimportant factor deficiency such as FXII or other contact factor deficiency. Other alternatives for an ‘isolated’ prolonged APTT include a factor inhibitor or LA. Despite causing prolongation of APTT in vitro, LA may be associated clinically with increased risk of thrombosis rather than bleeding. A prolonged APTT may be accompanied by a prolonged PT in the context of liver disease, DIC or fibrinogen (or other ‘common factor pathway’) deficiency/ies. Clinical context, therefore, must form the basis for accurate interpretation of APTT, be it either normal or prolonged, and together with other routine coagulation studies is essential to guide further investigations (Fig. 2).
A large number of commercial APTT reagents are now available, with wide variation in the type of contact factor activator and in the phospholipid source and concentration used. This results in variation in sensitivity to all typical influences, and therefore in substantial variation in NRIs between APTT reagents, so that NRIs must be established and verified for both the reagent and the instrument in use. Failing to account for variation in APTT reagent sensitivity in the context of the clinical picture will lead to flawed clinical interpretation of results.
Establishing and verifying NRIs
A minimum of 20 normal individuals may be sufficient to establish an NRI for PT and APTT, according to guidance documents provided by the Clinical and Laboratory Standards Institute (CLSI) [3, 4]. However, a larger number of normal individuals is recommended to establish an initial NRI, following which a smaller sample of normal individuals may be used for future verification purposes [1].
As an example, Figure 3 shows an initial (historical) NRI estimation for APTT testing using a dataset of nearly 80 normal individuals. This included one outlier sample result (Fig. 3a), which was removed to produce the cleaner dataset used to derive the subsequent NRI. A statistical normality test was performed and showed the distribution to be near Gaussian, allowing parametric statistical assessment. For APTT testing, the NRI would aim to cover the central 95 % of normal values, approximated by a mean ± 2 standard deviation (SD) assessment (Fig. 3b). Logarithmic transformation can instead be used to normalize test data when they are not Gaussian but fit a log distribution (e.g. Fig. 3c).
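The parametric estimate described above can be sketched in a few lines of Python. This is an illustration only: the function name and the donor values are hypothetical, outliers are assumed to have been removed beforehand, and the optional log transformation simply back-transforms the limits to seconds.

```python
import math
import statistics


def reference_interval(values, log_transform=False):
    """Estimate a reference interval as mean +/- 2 SD (approx. central 95 %)."""
    data = [math.log(v) for v in values] if log_transform else list(values)
    mean = statistics.mean(data)
    sd = statistics.stdev(data)  # sample standard deviation
    lower, upper = mean - 2 * sd, mean + 2 * sd
    if log_transform:
        # Back-transform the limits to the original units (seconds)
        lower, upper = math.exp(lower), math.exp(upper)
    return lower, upper


# Hypothetical APTT results (seconds) from normal donors, outlier already removed
aptt_normals = [29.5, 31.0, 32.4, 30.2, 33.8, 34.1, 28.9, 32.0, 35.2, 31.7,
                30.8, 33.0, 32.6, 29.9, 34.5, 31.3, 32.9, 30.5, 33.4, 31.9]
low, high = reference_interval(aptt_normals)
print(f"NRI approx. {low:.1f}-{high:.1f} s")
```

In practice a normality test would decide between the untransformed and log-transformed calculation, and a much larger donor set than the 20 values shown here would be used for an initial NRI.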
If an NRI has been previously established by the laboratory or by the manufacturer of the APTT reagent using a specific reagent/instrument combination, the laboratory could use a process of transference to verify the ‘established’ NRI as fit for purpose. This may be done by establishing that a majority of samples in a small set of normal donors give values within the established NRI (e.g. >18 out of a set of 20 normal samples). Samples obtained from normal individuals or a dataset of normal patient test results may be used to assess a new lot of reagent to establish whether an existing NRI can be maintained when changing reagent lots.
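The transference check amounts to counting how many normal samples fall inside the established interval. The sketch below assumes a hypothetical helper `verify_nri` and invented donor values; the limits of 27–38 sec are borrowed from the Figure 3 example, and the acceptance threshold is passed in explicitly because it is a local decision.

```python
def verify_nri(values, lower, upper, min_within):
    """Return (passed, n_within) for a set of normal samples against an NRI."""
    n_within = sum(lower <= v <= upper for v in values)
    return n_within >= min_within, n_within


# Hypothetical APTT results (seconds) from 20 normal donors on a new reagent lot
new_lot = [29.1, 30.4, 31.2, 32.8, 33.5, 28.7, 34.9, 36.2, 30.0, 31.9,
           32.3, 35.1, 29.8, 33.0, 38.6, 31.5, 30.9, 34.2, 32.0, 27.4]

# ">18 out of 20" is expressed here as at least 19 results within the NRI
passed, n_within = verify_nri(new_lot, lower=27.0, upper=38.0, min_within=19)
print(f"{n_within}/20 within NRI; verified: {passed}")
```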
Factor (deficiency) sensitivity
Factor sensitivity of an APTT assay (representing a specific reagent/instrument combination) can be assessed in a number of ways. One method involves serial dilution of either in-house or commercially derived normal plasma into single-factor deficient plasma, in order to generate a series of aliquots with decreasing factor levels. These samples are then tested by APTT and for factor level. The APTT reagent is regarded as sensitive to the level of factor that correlates with the upper limit of the NRI.
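One way to read the factor sensitivity off such a dilution series is to interpolate between the two aliquots whose APTT values bracket the upper limit of the NRI. The sketch below uses linear interpolation, which is an assumption made for illustration rather than a method stated in the source; the function name and the dilution data are hypothetical.

```python
def factor_sensitivity(factor_levels, aptt_values, nri_upper):
    """Interpolate the factor level at which the APTT reaches the NRI upper limit.

    Expects paired results ordered from highest to lowest factor level.
    Returns None if no adjacent pair of aliquots brackets the upper limit.
    """
    points = list(zip(factor_levels, aptt_values))
    for (f_high, t_short), (f_low, t_long) in zip(points, points[1:]):
        if t_short <= nri_upper < t_long:
            frac = (nri_upper - t_short) / (t_long - t_short)
            return f_high + frac * (f_low - f_high)
    return None


# Hypothetical FVIII dilution series: factor level (U/dL) and APTT (seconds)
fviii = [100, 50, 40, 30, 20, 10]
aptt = [31.0, 34.5, 36.0, 40.0, 45.5, 54.0]
level = factor_sensitivity(fviii, aptt, nri_upper=38.0)
print(f"APTT exceeds the NRI below about {level:.0f} U/dL FVIII")
```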
A more accurate process, though particularly difficult to perform outside of a hemophilia centre, is to establish APTT values from true patients with various known factor levels [1, 2] (e.g. Fig. 4).
As a general guide, if the APTT is used for screening factor deficiencies, then the patient APTT value should be above the NRI when their factor level is below around 30–40 U/dL for FVIII, FIX, and FXI.
Sensitivity of APTT to UFH
Despite the changing landscape of anticoagulation therapy with the addition of direct anti-Xa inhibitors (rivaroxaban and apixaban) and a direct thrombin inhibitor (dabigatran) [5, 6], both LMWH and UFH continue to be frequently used in clinical practice. In turn, the APTT continues to be a generally preferred method of UFH monitoring over anti-FXa, given the wide availability and relatively low cost of the assay. However, unlike the calibrated anti-FXa assay, APTT results are subject to variation between different instruments, be they based on optical or mechanical clot detection methods [7], between different APTT reagents (including variation between different lots of the same reagent type) and between the algorithms used on instruments for raw data processing. This poses a substantial problem with regard to historical recommendations to maintain patients on UFH between 1.5 and 2.5 times the ‘normal reference value’ (as based on limited evidence [8]). Therapeutic ranges should therefore be defined with specific reference to the instrument/reagent combination used locally [9].
One ‘spiking method’ involves testing samples containing known quantities of UFH diluted into normal pool plasma, as then tested by APTT and anti-FXa methods, allowing an estimation of the APTT therapeutic interval [1]. However, variations in certain components of patient plasma, as well as the non-physiologically processed nature of the UFH used, can impact on the interpretation of data obtained using this method. A better method involves ex vivo assessment of plasma obtained from patients on UFH therapy, with these tested for both APTT and anti-FXa, and then to establish a UFH therapeutic range for APTT that matches the therapeutic range for anti-FXa (e.g. 0.3–0.7 U/mL). It is important to recognize that individual response to UFH according to APTT is affected by many influences, including (but not limited to): antithrombin level; high or low levels of coagulation factors and proteins such as von Willebrand factor or proteins released from endothelial cells or platelets, competing with antithrombin for heparin binding; or increased FVIII levels in acute phase response; or reduction in FXII; or presence of LA (etc).
To obtain a cleaner data set to establish UFH therapeutic ranges, the following steps can be undertaken during sample collection and processing [1].
• Ensure baseline PT, APTT and INR testing prior to commencement of UFH are within their NRIs.
• Exclude underfilled samples, samples with visible hemolysis or likely platelet activation and release of heparin neutralizer platelet factor 4 (PF4).
• Exclude samples containing LMWH or other anticoagulants (e.g. VKAs, DOACs).
• Adhere to manufacturer guidelines with regards to the window from time of blood collection to testing.
• Double centrifuge samples when freezing them for batch testing (to remove residual platelets, which release PF4 and phospholipids on thawing).
• Accumulate data over a suitable time period to account for day-to-day test result variability.
• Aim for 30 or more data points.
• Appropriately dilute samples with anti-Xa activity above the test’s linearity limit.
• Remove data points reflecting ‘gross’ outliers.
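Once a cleaned set of paired results is available, the APTT range corresponding to 0.3–0.7 U/mL by anti-FXa can be estimated by regression. The sketch below fits an ordinary least-squares line, which is an assumption made for illustration (the source does not prescribe a fitting method); the function name and data are hypothetical, and far fewer than the recommended 30 points are shown for brevity.

```python
import statistics


def aptt_therapeutic_range(anti_xa, aptt, xa_low=0.3, xa_high=0.7):
    """Fit APTT = intercept + slope * anti-FXa and read off the APTT range."""
    mean_x, mean_y = statistics.mean(anti_xa), statistics.mean(aptt)
    sxx = sum((x - mean_x) ** 2 for x in anti_xa)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(anti_xa, aptt))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept + slope * xa_low, intercept + slope * xa_high


# Hypothetical paired results from patients on UFH (anti-FXa in U/mL, APTT in s)
anti_xa_levels = [0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80]
aptt_seconds = [48, 61, 69, 83, 88, 102, 108, 121]
aptt_low, aptt_high = aptt_therapeutic_range(anti_xa_levels, aptt_seconds)
print(f"APTT therapeutic range approx. {aptt_low:.0f}-{aptt_high:.0f} s")
```

The resulting range applies only to the reagent/instrument combination that produced the data and would need to be re-derived when either changes.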
LA sensitivity
The LA sensitivity of a particular APTT reagent can be assessed by comparing APTT tests of samples containing LA, for example by comparison of mean clotting times for each reagent.
Given that the APTT is a phospholipid-dependent assay, the test may be susceptible to prolongation in the presence of LA. However, differences in the phospholipid type and concentration between APTT reagents account for wide variation seen in the degree of prolongation of APTT, including due to LA. The LA sensitivity of the APTT reagent also has bearing on the use of APTT to monitor UFH and must inform the establishment of an algorithm to further investigate unexpectedly prolonged APTTs.
In one empirical method, an LA sensitive method (e.g. dilute Russell viper venom time; dRVVT) is first used to formulate a set of LA-positive samples of various ‘strengths’. Different APTT reagents can then be used to test the samples, and the data for each sample can be plotted against the upper reference limit of the APTT for each reagent [1]. The ratio of the clotting time of each LA-positive sample (of varying strengths) to the mean normal APTT derived from normal plasma samples is calculated. The median of these ratios allows different reagents to be ranked according to LA sensitivity. It can then become clear which APTT reagents are most (versus least) sensitive to LA. These can then be selected according to the laboratory’s needs. For example, a laboratory may prefer to select an APTT reagent that is relatively LA ‘insensitive’, combined with good factor VIII/IX/XI and UFH sensitivity, if there is a desire to use a general purpose APTT screening reagent (i.e. a hospital laboratory monitoring UFH, but wishing to avoid LA detection in asymptomatic patients). Alternatively, a laboratory may select an LA sensitive and an LA insensitive APTT reagent pair if it wishes to assess for LA in symptomatic (thrombosis and/or pregnancy morbidity) patients.
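The ranking step described above (ratio to the mean normal APTT, then the median ratio per reagent) can be sketched as follows. The reagent names, clotting times and function name are hypothetical and serve only to show the arithmetic.

```python
import statistics


def rank_la_sensitivity(la_clot_times, mean_normal_aptt):
    """Rank reagents by the median ratio of LA-positive APTT to mean normal APTT."""
    medians = {
        reagent: statistics.median(t / mean_normal_aptt[reagent] for t in times)
        for reagent, times in la_clot_times.items()
    }
    # Highest median ratio first, i.e. most LA sensitive reagent first
    return sorted(medians.items(), key=lambda item: item[1], reverse=True)


# Hypothetical APTT results (seconds) for the same LA-positive samples
la_clot_times = {
    "reagent_A": [45.0, 60.0, 75.0],
    "reagent_B": [36.0, 40.0, 48.0],
}
# Hypothetical mean normal APTT (seconds) for each reagent
mean_normal_aptt = {"reagent_A": 30.0, "reagent_B": 32.0}

ranking = rank_la_sensitivity(la_clot_times, mean_normal_aptt)
for reagent, median_ratio in ranking:
    print(f"{reagent}: median ratio {median_ratio:.2f}")
```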
Interpretation of a normal or a prolonged APTT must take into account both clinical context, including presence of anticoagulant therapy, as well as the methods and reagents used by the laboratory. The sensitivity of a particular APTT reagent to detect UFH therapy, LA and factor deficiencies has significant bearing on diagnostic assessment and therapy monitoring, and thus reflects essential knowledge for laboratory and clinical staff alike.
Figure 1. The activated partial thromboplastin time (APTT) assay measures the clot time to formation of fibrin via the contact factor pathway and is dependent on contact factors (FXII and above), and then FXI, FIX, FVIII, FX, FV, and FII. The APTT is also affected by vitamin K antagonists (VKAs; ‘W’), but more importantly is used to monitor unfractionated heparin (UFH; ‘H’) therapy and also to assess for potential hemophilia (FVIII, FIX or FXI deficiency). The APTT is also sensitive to the presence of other anticoagulants, including direct oral anticoagulants (DOACs) such as dabigatran (‘D’) and rivaroxaban (‘R’), and potentially also apixaban (‘A’) for some reagents. The APTT may also be utilized as part of a panel of tests to help assess for lupus anticoagulant (LA). (Modified from Favaloro EJ, et al. How to optimize activated partial thromboplastin time (APTT) testing: solutions to establishing and verifying normal reference intervals and assessing APTT reagents for sensitivity to heparin, lupus anticoagulant, and clotting factors. Semin Thromb Hemost 2019; 45: 22–35 [1].)
Figure 2. An algorithm that provides one recommended approach for the follow-up of an abnormal APTT. Always exclude an anticoagulant effect first – there is no point investigating a prolonged APTT associated with anticoagulant use. Then consider the patient’s history, or the clinical reason for the test order, both of which assist in terms of follow-up approach. APTT, activated partial thromboplastin time; FBC/CBC, full blood count (UK/Australia)/complete blood count (USA); DIC, disseminated intravascular coagulation; DOAC, direct oral anticoagulant; EDTA, ethylenediaminetetraacetic acid; F, factor; LA, lupus anticoagulant; PT, prothrombin time. (Modified from Favaloro EJ, et al. How to optimize activated partial thromboplastin time (APTT) testing: solutions to establishing and verifying normal reference intervals and assessing APTT reagents for sensitivity to heparin, lupus anticoagulant, and clotting factors. Semin Thromb Hemost 2019; 45: 22–35 [1].)
Table 1. The APTT test. A multipurpose and sensitive assay, but not specific for any individual parameter. List is not meant to be all inclusive.
DOACs, direct oral anticoagulants; VWD, von Willebrand disease.
*PT should also be prolonged if APTT is prolonged in the indicated setting.
(Modified from Favaloro EJ, et al. How to optimize activated partial thromboplastin time (APTT) testing: solutions to establishing and verifying normal reference intervals and assessing APTT reagents for sensitivity to heparin, lupus anticoagulant, and clotting factors. Semin Thromb Hemost 2019; 45: 22–35 [1].)
Figure 3. Historical data from our laboratory to illustrate the process of deriving a normal reference interval (NRI) for the APTT, using nearly 80 normal individual plasma samples. (a) APTT of all samples tested shown as a dot plot; one clear outlier shown as a red asterisk. (b) Data cleaned of outliers [i.e. in this case the single red asterisk sample in (a)]. (c) NRI estimate as mean ± 2 standard deviations (SDs) to provide approximate 95 % coverage. Bar graphs of parametric data processing and log transformed data processing shown. The NRI for this data set approximates 27–38 sec. (Modified from Favaloro EJ, et al. How to optimize activated partial thromboplastin time (APTT) testing: solutions to establishing and verifying normal reference intervals and assessing APTT reagents for sensitivity to heparin, lupus anticoagulant, and clotting factors. Semin Thromb Hemost 2019; 45: 22–35 [1].)
Figure 4. Ex vivo heparin versus APTT evaluation. (a) Samples from all patients identified to be on heparin (as identified by our laboratory information system) and for which an APTT was performed at the time of evaluation are also tested for anti-FXa level. The APTT therapeutic range is that corresponding to a heparin level of 0.3–0.7 U/mL by anti-Xa. However, many data points in this figure do not reflect UFH alone. Some points may instead reflect low molecular weight heparin (e.g. likely to be the sample yielding an anti-Xa value close to 0.7 U/mL but with normal APTT) or alternatively UFH co-incident to FXII deficiency or LA, or else patients potentially transitioning from UFH to VKAs. These data points can be removed to yield a ‘cleaner’ data set, as shown in (b). (Modified from Favaloro EJ, et al. How to optimize activated partial thromboplastin time (APTT) testing: solutions to establishing and verifying normal reference intervals and assessing APTT reagents for sensitivity to heparin, lupus anticoagulant, and clotting factors. Semin Thromb Hemost 2019; 45: 22–35 [1].)
The authors
Julianne Falconer1 MBBS and Emmanuel J. Favaloro*1,2 PhD, FFSc (RCPA)
1Haematology, Institute of Clinical Pathology and Medical Research (ICPMR), NSW Health Pathology, Westmead Hospital, NSW, Australia.
2Sydney Centres for Thrombosis and Hemostasis, Westmead Hospital
*Corresponding author
