© 2001 by Oxford University Press
Journal of the National Cancer Institute Monographs, No. 30, 17-21, 2001
Interpreting and Integrating Risk Factors for Patients With Primary Breast Cancer
Correspondence to: Gary M. Clark, Ph.D., Breast Center at Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030 (e-mail: gmclark{at}breastcenter.tmc.edu).
| ABSTRACT |
|---|
The term risk factor has different meanings in different contexts. Some factors may be patient specific (e.g., race, age, socioeconomic status, and environment), while others may be disease specific (e.g., biomarkers measured on tumor specimens, serum, and bone marrow). These factors have several potential clinical uses, including diagnosing a disease or assessing the risk of developing a disease, estimating prognosis for patients diagnosed with a specific disease who receive no therapy, predicting response to a particular therapy, monitoring response to therapy during a treatment course, and identifying targets of opportunity for new therapies. This article focuses on prognostic and predictive biomarkers and provides guidelines for interpreting published reports about these biomarkers. Application of these guidelines to the assessment of micrometastases in bone marrow of patients with breast cancer yields the conclusion that standardized techniques that are sensitive and reproducible for detecting micrometastases are needed before we can evaluate their prognostic significance.
| INTRODUCTION |
|---|
The primary objective of this presentation is to give members of the Consensus Development Panel and readers of the scientific literature some guidelines for interpreting published reports about biomarkers. These guidelines are then applied to the assessment of micrometastases in bone marrow of patients with breast cancer.
| CLINICAL USES FOR BIOMARKERS |
|---|
There are several potential clinical uses for biomarkers (Table 1).
Although the clinical uses for biomarkers are quite varied, there are many common features that should be considered when evaluating biomarker studies. First, there should be a biologic hypothesis that underlies the study. What is the evidence that this biomarker might be a diagnostic, prognostic, or predictive factor? The authors should state whether their study is a first-generation pilot study or a definitive study designed to confirm previous findings. All too often a study that was designed as a pilot or feasibility study becomes a definitive study when the first paper is published. If the study is definitive, then the sample size should be sufficient to address the question, and the biomarker should be analyzed by using both univariate and multivariate techniques that include other established factors. The assay used to assess biomarker status should be validated before the study is conducted, and the definition of a positive assay or an abnormal result should be stated clearly. Before new biomarkers are recommended for routine clinical practice, the assay must be shown to be reproducible, and the results must be shown to generalize to other sets of patients.
| COMMON CLINICAL ENDPOINTS FOR BIOMARKER STUDIES |
|---|
Interpretation of biomarker studies requires a clear understanding of the clinical endpoints that were used in the studies. Some commonly used endpoints include overall survival, breast cancer-specific survival, relative survival, disease-free survival, progression-free survival, event-free survival, tumor response, and modulation of a biomarker. Some of these endpoints, such as overall survival, are clear and unambiguous. Other endpoints require explicit definition within the publication. For example, what types of events are included in event-free survival? Does disease-free survival include both local and distant recurrences? Does it include contralateral breast cancers and death caused by breast cancer or other causes? Breast cancer-specific survival requires ascertainment of cause of death. Given the notorious unreliability of cause of death given on death certificates (1,2), authors should state how causes of death were determined in their studies. Tumor response has been standardized, and most authors use a common definition, but some publications include stable disease as a response and others do not. The Southwest Oncology Group has published its criteria for the clinical endpoints that are used in its therapeutic studies (3). These criteria might form the basis for more standardized reporting of clinical outcomes in the future.
There are several different ways to summarize and report clinical outcomes. For example, comparisons of mortality between groups of patients can be presented as the absolute risk difference, the relative risk, or an odds ratio. Each is a legitimate summary statistic, but the same statistic must be used to compare different studies. The absolute difference is easy to interpret, but its impact depends on the individual mortality rates in each group. The relative risk is defined as the risk of dying in the treated group divided by the risk of dying in the control group. It is often estimated by the hazard ratio from a statistical model such as the Cox proportional hazards model (4). The odds ratio is the odds of dying (the number of patients who die divided by the number who survive) in the treated group divided by the corresponding odds in the control group. Suppose that a control group and a treated group each contain 100 patients. If 50 patients in the control group died but only 10 patients in the treated group died, the absolute risk difference would be 40%. The relative risk of dying is (10 of 100)/(50 of 100), or 0.20. The odds ratio is (10 of 90)/(50 of 50), or 0.11. Table 2 gives examples of these statistics under three different scenarios.
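The arithmetic behind these three summary statistics can be checked with a short sketch. The function name `risk_summaries` and its argument names are illustrative assumptions, not part of any published method; the input counts are the 100-patient example given in the text.

```python
def risk_summaries(deaths_treated, n_treated, deaths_control, n_control):
    """Return the three summary statistics discussed in the text.

    All names here are hypothetical and chosen for illustration only.
    """
    # Risk = proportion of patients who died in each group.
    risk_treated = deaths_treated / n_treated
    risk_control = deaths_control / n_control

    # Odds = number who died divided by number who survived.
    odds_treated = deaths_treated / (n_treated - deaths_treated)
    odds_control = deaths_control / (n_control - deaths_control)

    return {
        "absolute_risk_difference": risk_control - risk_treated,
        "relative_risk": risk_treated / risk_control,
        "odds_ratio": odds_treated / odds_control,
    }


# Example from the text: 10 of 100 treated patients and 50 of 100 controls died.
summary = risk_summaries(10, 100, 50, 100)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

For the example counts, the rounded values agree with the 40%, 0.20, and 0.11 quoted above. The sketch also makes plain why the relative risk and the odds ratio diverge when mortality is high: the odds use survivors, not all patients, as the denominator.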
| PROGNOSTIC VERSUS PREDICTIVE BIOMARKERS |
|---|
Discussion of risk factors for making treatment decisions about adjuvant therapy usually focuses on prognostic biomarkers and predictive biomarkers. It is important to distinguish between these types of factors. I have previously defined a prognostic factor for primary breast cancer as any measurement available at the time of diagnosis or surgery that is associated with disease-free or overall survival in the absence of systemic adjuvant therapy (5,6). Note that this definition permits application of a standard therapy (e.g., surgery) that all patients are likely to receive. A predictive factor is any measurement associated with response or lack of response to a particular therapy. Response can be defined by using any of the clinical endpoints described above.
The graphic examples in Fig. 1 help to illustrate the differences between prognostic factors and predictive factors. The lower line in Fig. 1, A, demonstrates that axillary lymph node status is a prognostic factor. Patients with negative lymph nodes who receive no adjuvant therapy have a better clinical outcome than untreated patients with positive lymph nodes. If patients are treated and the improvement in survival is the same for lymph node-negative and lymph node-positive patients, as shown in Fig. 1, A, then lymph node status does not differentially predict which patients will benefit from therapy. Thus, the lines are parallel and the biomarker is not a predictive factor. This is consistent with results from the overview analyses that have shown that the reduction in odds of death for either chemotherapy (7) or hormonal therapy (8) is the same for lymph node-negative and lymph node-positive patients.
If untreated patients with ER-positive (ER+) tumors have the same clinical outcome as untreated patients with ER-negative (ER-) tumors, as shown in Fig. 1, B
The common feature in Fig. 1, B-D, that demonstrates a predictive effect is the differential response to therapy represented by nonparallel lines. In statistical terms, this constitutes an interaction between response to treatment and biomarker status. When an interaction exists and investigators combine treated and untreated patients in analyses of potential prognostic factors, it is difficult, if not impossible, to separate the predictive effects from the prognostic effects. Unless these effects are separated, it is very difficult to make treatment recommendations based on the status of the biomarker. How can we recommend a particular therapy if we do not know if the expected good outcome will be caused by effective therapy or by the presence of a good prognostic factor?
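The distinction between parallel and nonparallel lines can be expressed as two simple contrasts. The sketch below is a simplified illustration on an additive scale, using hypothetical outcome percentages that are not taken from any study; the function names and the dictionary layout are assumptions made for this example. A formal analysis would instead test an interaction term in a statistical model, and the overview analyses cited in the text work with reductions in odds, a multiplicative scale.

```python
def prognostic_contrast(outcome):
    """Difference in outcome between marker groups among untreated patients."""
    return outcome[("negative", "untreated")] - outcome[("positive", "untreated")]


def interaction_contrast(outcome):
    """Difference in treatment benefit between marker groups.

    Zero corresponds to parallel lines (no predictive effect on this scale).
    """
    benefit_positive = outcome[("positive", "treated")] - outcome[("positive", "untreated")]
    benefit_negative = outcome[("negative", "treated")] - outcome[("negative", "untreated")]
    return benefit_positive - benefit_negative


# Hypothetical survival percentages, keyed by (marker status, treatment).
# Scenario 1: purely prognostic marker (parallel lines).
parallel = {
    ("negative", "untreated"): 70, ("negative", "treated"): 80,
    ("positive", "untreated"): 50, ("positive", "treated"): 60,
}
# Scenario 2: purely predictive marker (same untreated outcome, benefit in one group only).
nonparallel = {
    ("negative", "untreated"): 60, ("negative", "treated"): 60,
    ("positive", "untreated"): 60, ("positive", "treated"): 80,
}

for label, scenario in [("parallel", parallel), ("nonparallel", nonparallel)]:
    print(label, prognostic_contrast(scenario), interaction_contrast(scenario))
```

In the first scenario the prognostic contrast is nonzero and the interaction contrast is zero; in the second the pattern is reversed. If treated and untreated patients were pooled before comparing marker groups, the two contrasts could not be told apart, which is the difficulty described above.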
| FALSE-NEGATIVE AND FALSE-POSITIVE STUDIES |
|---|
Evaluation and interpretation of the published literature dealing with biomarkers is fraught with difficulties. Many reported biomarker results are either falsely negative or falsely positive.
Statistical power is the probability of detecting an effect when it really exists. Increasing the sample size increases power. Most studies of potential prognostic or predictive factors are substantially underpowered, so that a negative result may reflect either a small or nonexistent marker effect or a lack of statistical power. Evaluation of a potential predictive biomarker is much more difficult than evaluation of a new therapy. Biomarker status cannot be randomized, and imbalance must be taken into account. A test for an interaction between response to treatment and biomarker status can require up to four times more events than a test for a treatment main effect (9). Biomarker studies are often conducted as ancillary studies to already completed, randomized clinical trials that were designed to compare treatments. The subsets of patients who do or do not have the biomarker are much smaller than the randomized groups who received the treatments, so statistical power is greatly reduced. An additional consideration is that many biomarker studies require retrieval of paraffin blocks or other archived tissues for the assessment of biomarker status. The combined rate of specimen retrieval and assay evaluability is often less than 50% of the patients who participated in the clinical trial. Therefore, most studies of potential predictive biomarkers will produce false-negative results.
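A rough calculation suggests where a factor of four can come from. The sketch below relies on a common large-sample approximation, in which the variance of an estimated log hazard ratio is about 1/d1 + 1/d2 for d1 and d2 events in the two arms. It further assumes equal allocation to treatment, a biomarker present in half of the patients, and an interaction of the same magnitude as the main effect. These are illustrative assumptions and not necessarily the method used in reference 9; the function names are hypothetical.

```python
def var_log_hazard_ratio(events_arm_a, events_arm_b):
    """Approximate variance of a log hazard ratio from the event counts per arm."""
    return 1.0 / events_arm_a + 1.0 / events_arm_b


def variance_inflation_for_interaction(total_events):
    """Ratio of interaction variance to main-effect variance for the same events.

    Assumes events split evenly between arms and between marker groups.
    """
    # Main effect: all events are split between two treatment arms.
    var_main = var_log_hazard_ratio(total_events / 2, total_events / 2)

    # Interaction: difference of two log hazard ratios, one per marker group,
    # each estimated from half of the events (a quarter per cell).
    var_within_group = var_log_hazard_ratio(total_events / 4, total_events / 4)
    var_interaction = 2 * var_within_group

    return var_interaction / var_main


# Example with 400 total events (a hypothetical trial size).
print(variance_inflation_for_interaction(400))
```

Because the required number of events scales with the variance of the estimate, a fourfold larger variance implies roughly four times as many events for the same power. Unequal biomarker prevalence or a smaller interaction effect would make the requirement larger still.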
However, many of the results that appear in the literature about biomarkers are probably false-positive results. Publication bias is a well-known phenomenon. It is much easier to publish an apparent positive result than a negative result. Results may appear to be positive as a consequence of multiple tests of hypotheses. If we perform enough subset analyses, sooner or later one of these analyses will produce statistically significant results by chance alone. In the case of biomarkers, a common problem is the identification of a cut point that separates biomarker results into high- and low-risk categories. Searches for an optimal cut point are simply multiple tests of hypotheses that inevitably lead to significant results, and adjustments must be made before accepting these positive findings (10). Other problems include analysis of many potential biomarkers but publication of only the significant factors and analysis of multiple subsets of patients but conclusions based only on the significant results.
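The effect of multiple testing can be illustrated with a simple calculation. The sketch below assumes that the tests are independent and that no true effect exists, which is a simplification: candidate cut points on the same biomarker are correlated, so the exact inflation differs. The Bonferroni threshold is shown as one well-known conservative adjustment, not as the specific adjustment recommended in reference 10; the function names are hypothetical.

```python
def familywise_error_rate(num_tests, alpha=0.05):
    """Probability of at least one false-positive result among independent tests."""
    return 1.0 - (1.0 - alpha) ** num_tests


def bonferroni_threshold(num_tests, alpha=0.05):
    """Per-test significance level that keeps the familywise rate at or below alpha."""
    return alpha / num_tests


for k in (1, 5, 10, 20):
    print(
        f"{k:2d} tests: chance of a spurious 'significant' result = "
        f"{familywise_error_rate(k):.2f}, Bonferroni threshold = "
        f"{bonferroni_threshold(k):.4f}"
    )
```

Under these assumptions, 20 subset analyses or candidate cut points give a better-than-even chance of at least one nominally significant finding by chance alone.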
It is necessary to validate the results of apparently positive biomarker studies. This validation should include the use of standardized, reproducible assays with predefined scoring systems and cut points. A common mistake is to assess biomarker status in an additional cohort of patients, combine these patients with the initial cohort, and perform a single analysis. The new cohort must first be analyzed as a stand-alone validation set that is not contaminated with the overly optimistic results obtained in the original cohort after tests of multiple hypotheses. It may or may not be appropriate to combine the cohorts after this initial analysis. The literature is replete with multiple reports from the same investigators who included subsets of the same patients in analyses that were intended to validate their initial findings.
| MICROMETASTASES IN BONE MARROW |
|---|
The Consensus Development Panel asked for an assessment of the published literature concerning the clinical utility of micrometastases in the bone marrow of patients with breast cancer. Three specific questions are asked: 1) What is the prevalence of micrometastases in primary breast cancer? 2) Is the presence of micrometastases associated with known prognostic factors? 3) Does the presence of micrometastases predict clinical outcome?
Two excellent reviews (11,12) of this topic were published in 1998. Each article reviewed 11 studies, with eight in common. Subsequent to these publications, at least three additional studies (13-15) have been reported. It is immediately clear that there are many methodologic differences among these 17 published studies. Some used smears and others used cytospin preparations; the number of cells evaluated differed considerably and was often not reported; at least 23 different antibodies were used (antiepithelial cell-surface antigens, anti-milk fat globulins, anticytokeratin components, and antipolymorphic epithelial mucins); different staining procedures were used; some used healthy control subjects to define cut points and others used arbitrary definitions for the presence of micrometastases; and demographic and tumor characteristics of the patients differed among studies. If we ignore these methodologic differences, the weighted average of the prevalence of micrometastases in these 17 studies was 32% (range, 1%-49%).
An obvious question when interpreting this result is, What is the impact of using different antibodies to detect micrometastases? Braun and Pantel (12) described differential immunohistochemical staining of bone marrow from noncancer patients using a panel of antibodies (Table 3). These differences could have a major impact on the reported prevalences in the published studies. The inclusion of patients with different demographic and tumor characteristics could also bias this estimate of prevalence. To partially address this question, each published study was reviewed to determine whether correlations with other factors were examined and, if so, whether significant relationships were found (Table 4). The conclusions based on this review are less than satisfactory. Only a minority of the studies attempted to associate the presence of micrometastases in bone marrow with other prognostic factors. Of those that did, only five of 11 found a relationship with axillary lymph node status by routine hematoxylin-eosin evaluation, and three of nine found a relationship with tumor size. Thus, the answer to the second question about relationships with other prognostic factors is unclear.
It is difficult to determine from the published literature if the presence of micrometastases in bone marrow correlates with clinical outcomes. Most of these studies included univariate analyses with either disease-free or overall survival as the clinical endpoint. Only nine of the 17 studies performed multivariate analyses; three reported a significant association with disease-free survival, and five reported a relationship with overall survival. However, almost all studies included treated and untreated patients. The untreated subsets were generally too small for definitive analyses, and the treated subsets received a variety of heterogeneous therapies. None of the studies performed tests of interactions between response to treatment and the presence of micrometastases. Therefore, it is very difficult to separate the predictive effects from the prognostic effects in any of these studies.
The overall conclusion of this exercise is that meta-analyses of these types of biomarker studies are not appropriate. The differences in detection techniques, sensitivities, and specificities of the assays, patient characteristics, and treatments produce too much heterogeneity among studies to combine their results. To evaluate the prognostic significance of micrometastases, we need well-designed, prospective studies that use sensitive and reproducible standardized techniques for detecting micrometastases. The recent publication by Braun et al. (15) is a step in the right direction; however, we need additional studies to confirm their impressive findings.
| SUMMARY |
|---|
The evaluation of biomarkers as prognostic or predictive factors requires the development of standardized and validated assays. It requires study designs that differentiate between prognostic and predictive effects. These studies should have clear eligibility criteria and sufficient numbers of patients and tissues to answer clinically relevant questions with adequate statistical power. We should require a high level of evidence before incorporating new biomarkers into clinical practice (16). Criteria for determining the clinical utility of biomarkers have only recently been proposed. The American Society of Clinical Oncology used very conservative criteria to develop practice guidelines for using biomarkers (17). Partly in response to the lack of consensus about these criteria, the Tumor Marker Utility Grading System was developed to differentiate levels of evidence among published studies (18). The College of American Pathologists used a modification of this system to develop their consensus statements about prognostic factors in breast, colon, and prostate cancers (19).
Study designs to evaluate biomarkers for different clinical uses vary with respect to the types of subjects and/or tissues that need to be studied, the endpoints that need to be measured, and the number of subjects and/or tissues that need to be accrued. However, the basic methodologic principles for good study designs are common to all clinical uses. All study designs should be based on clearly stated hypotheses. Assays should be reproducible and should be performed without knowledge of the clinical data and patient outcome. Results for individual factors should be analyzed by use of multivariate techniques that incorporate standard biomarkers that are already in clinical use. All results should be validated in subsequent studies before they are incorporated into clinical practice.
Very few new prognostic or predictive factors have been validated and endorsed for clinical use during the past several years. Part of the reason is a lack of adherence to proposed guidelines for the design, conduct, analysis, and reporting of results from prognostic factor studies (20). It is time to translate the principles of good study design and analysis that have been developed for clinical trials to the evaluation of new biomarkers.
| NOTES |
|---|
Supported in part by Public Health Service grant P01CA30195 from the National Cancer Institute, National Institutes of Health, Department of Health and Human Services.
| REFERENCES |
|---|
1 Goldacre MJ. Cause-specific mortality: understanding uncertain tips of the disease iceberg. J Epidemiol Community Health 1993;47:491-6.
2 Hoel DG, Ron E, Carter R, Mabuchi K. Influence of death certificate errors on cancer mortality trends. J Natl Cancer Inst 1993;85:1063-8.
3 Green S, Weiss GR. Southwest Oncology Group standard response criteria, endpoint definitions and toxicity criteria. Invest New Drugs 1992;10:239-53.
4 Cox DR. Regression models and life-tables. J R Stat Soc 1972;34:187-220.
5 Clark GM. Do we really need prognostic factors for breast cancer? Breast Cancer Res Treat 1994;30:117-26.
6 Clark GM. Prognostic and predictive factors. In: Harris JR, Lippman ME, Morrow M, Osborne CK, editors. Diseases of the breast. Philadelphia (PA): Lippincott Williams & Wilkins; 2000. p. 489-514.
7 Early Breast Cancer Trialists' Collaborative Group. Polychemotherapy for early breast cancer: an overview of the randomised trials. Lancet 1998;352:930-42.
8 Early Breast Cancer Trialists' Collaborative Group. Tamoxifen for early breast cancer: an overview of the randomised trials. Lancet 1998;351:1451-67.
9 Peterson B, George SL. Sample size requirements and length of study for testing interaction in a 2 x k factorial design when time-to-failure is the outcome. Control Clin Trials 1993;14:511-22.
10 Hilsenbeck SG, Clark GM, McGuire WL. Why do so many prognostic factors fail to pan out? Breast Cancer Res Treat 1992;22:197-206.
11 Funke I, Schraut W. Meta-analysis of studies on bone marrow micrometastases: an independent prognostic impact remains to be substantiated. J Clin Oncol 1998;16:557-66.
12 Braun S, Pantel K. Prognostic significance of micrometastatic bone marrow involvement. Breast Cancer Res Treat 1998;52:201-16.
13 Molino A, Pelosi G, Micciolo R, Turazza M, Nortilli R, Pavanel F, et al. Bone marrow micrometastases in breast cancer patients. Breast Cancer Res Treat 1999;58:123-30.
14 Mansi JL, Gogas H, Bliss JM, Gazet JC, Berger U, Coombes RC. Outcome of primary-breast-cancer patients with micrometastases: a long-term follow-up study. Lancet 1999;354:197-202.
15 Braun S, Pantel K, Muller P, Janni W, Hepp F, Kentenich CRM, et al. Cytokeratin-positive cells in the bone marrow and survival of patients with stage I, II, or III breast cancer. N Engl J Med 2000;342:525-33.
16 Hayes DF. Do we need prognostic factors in nodal-negative breast cancer? Arbiter. Eur J Cancer 2000;36:302-6.
17 American Society of Clinical Oncology Expert Panel. Clinical practice guidelines for the use of tumor markers in breast and colorectal cancer. J Clin Oncol 1996;14:2843-77.
18 Hayes DF, Bast RC, Desch CE, Fritsche H Jr, Kemeny NE, Jessup JM, et al. Tumor marker utility grading system: a framework to evaluate clinical utility of tumor markers. J Natl Cancer Inst 1996;88:1456-66.
19 Fitzgibbons PL, Page DL, Weaver D, Thor AD, Allred DC, Clark GM, et al. Prognostic factors in breast cancer. College of American Pathologists Consensus Statement 1999. Arch Pathol Lab Med 2000;124:966-78.
20 Altman DG, Lyman GH. Methodological challenges in the evaluation of prognostic factors in breast cancer. Breast Cancer Res Treat 1998;52:289-303.