M.D. Anderson Cancer Center
Department of Biomathematics

Section of Computer Science

Staff and Activities

Staff

Barry W. Brown, Ph.D. (1963) University of California at Berkeley
Chief, Section of Computer Science
Larry and Pat McNeil Professor of Biomathematics
Areas: Mathematical modeling of cancer processes; statistical computing; statistical methodology; software engineering

E. Neely Atkinson, Ph.D. (1981) Rice University
Associate Professor of Biomathematics
Areas: Computational statistics; regression analysis; computer science; modeling of cancer processes

David Tuttle, M.A.M.S. (1985) Rice University
Software Systems Specialist III
Areas: UNIX, network, and TCP/IP administration

Chris Brauner, M.A. (1992) Rice University
Programmer Analyst II
Areas: Statistics and statistical computing

David Gutierrez, B.A. (1984) Rice University
Programmer Analyst II
Areas: Computer architecture; computer networks; distributed processing systems; human-computer interfaces

Lawrence Levy, B.A. (1976) University of Houston
Programmer Analyst II
Areas: Statistical graphics; computational statistics; survival analysis

Dan M. Serachitopol, M.S. (1974) Polytechnic Institute of Bucharest
Programmer Analyst II
Areas: Statistical modelling using the S language; symbolic computing; graphical user interfaces using the TCL/TK language; numerical algorithms; neural networks and genetic algorithms

F. Martin Spears,(*) Ph.D. (1992) Rice University
Visiting Assistant Professor
Areas: Statistical computing and statistics

Activities

Staff of this section are skilled in numerical analyses, systems analysis, computer simulation, programming, language theory, and statistical computing. This expertise is used in a wide variety of applications including: modeling and simulation of the spread of metastases, development of software tools for cancer research, statistical improvement in computational methods for likelihood estimation, and collaboration with clinical researchers and basic scientists.

This group has developed computer software for study planning and survival analyses that has been distributed to numerous researchers, including those at many of the comprehensive cancer centers.


Footnotes

(*)
Dr. Spears' current address: University of Houston at Clear Lake, Department of Mathematics, 2700 Bay Area Blvd., Houston, TX 77058.

Investigator-Initiated Research

  1. Progress Against Cancer Mortality: Another Look (Recent Publications, #6). Dr. B. W. Brown, Mr. C. Brauner, and Mr. L. Levy.
    In 1986, Bailar and Smith published a widely noted paper, "Progress against cancer?" in the New England Journal of Medicine, 314:1226-1232, 1986. They examined the mortality rate ascribed to cancer and noted that it had changed very little with time, although there were some modest improvements. Their conclusion was that there had been little progress and they suggested that more resources be placed in prevention rather than improved treatment. We examine the impact of cancer on mortality from a slightly different viewpoint.
    The SEER (Surveillance, Epidemiology, and End Results) data is used in this project. SEER is a government financed cancer registry that covers about 10% of the U.S. population. An examination of overall trends in cancer including incidence and survival requires the use of population-based data in registries that cover defined geographic areas.
    We disagree with Bailar and Smith in that we think that the concept of the primary cause of death is philosophically undefined: although death is certainly observed, its cause is not. We believe that death, like most human events, is frequently the culmination of several factors, and an attribution of primary cause is an unwarranted oversimplification.
    Because the cause of death is not observable, examining the correctness of cancer specific mortality rates appears impossible. It is, however, feasible to compare the rates of noncancer death in those diagnosed with cancer to those in the overall population matched by age and sex. If all excess mortality associated with cancer is recorded as due to cancer, then the patient and population noncancer death rates should be the same if the patient and general population do not differ. The patient population, however, may suffer an increased susceptibility to disease in general compared to the overall population. This susceptibility may be caused by factors such as unhealthy life styles or genetic defects. Consequently, a higher than population death rate from causes other than cancer might be expected in the cancer population. However, a generalized lack of resistance to disease would not explain systematic changes in relative noncancer mortality rates with time after the diagnosis of cancer.
    The ratio of patient-to-population noncancer death rates is greater than 1 for all cancers combined and for the common solid tumors. Leading noncancer causes of death in cancer patients were circulatory and respiratory failures. The noncancer relative risk decreased rapidly after diagnosis and decreased with the patient's age at diagnosis. It increased slightly with the calendar year of diagnosis. Because the largest excess of noncancer death occurred shortly after diagnosis, it seems almost certain that a large portion of the excess was caused by treatment of the cancer. The excess has increased with time from 1973 (the beginning of SEER) to 1986 (the last year examined), which we attribute to increased treatment intensity.
    We conclude that cancer-attributed mortality is not a complete measure of the cancer-associated mortality because it omits a number of deaths associated with cancer. From this observation, we suggest that trials of detection methodology (e.g., mammography) should use as the outcome the number of individuals diagnosed with the cancer and dead within some period of time. The number of deaths attributed to cancer underestimates the value of detection, by ignoring the lesser severity of treatment used in early stage disease. Decreased severity of treatment would be expected to result in fewer treatment related deaths.
    Increasing intensity of treatment is warranted if it decreases cancer mortality more than it increases treatment associated death rates. To determine whether this is so, three year rates of survival of cancer patients were examined. The three year period contains most of the excess noncancer deaths.
    For whites of both sexes and age groups, there is an improvement in three-year survival of about 1/2% per year of diagnosis. Statistical analysis shows that this improvement is not likely to be due to chance fluctuation. Increased intensity of treatment appears to save lives, at least in the short run. Blacks show somewhat less improvement in three-year survival than whites; overall the improvement is about 1/4% per year.
    Returning to the primary theme of this project, we suggest that progress against cancer mortality be evaluated by examining the proportion of the (age adjusted) general population who are diagnosed with cancer and dead by a particular age. This criterion has the advantage of being largely immune to early detection bias; if early detection does not prolong life the criterion will be unchanged. The suggested criterion has the advantage of not relying on the unobservable cause of death, and it does take into consideration both cancer incidence and the results of treatment.
    The suggested measure of progress has another advantage over the criterion of the rate of death from cancer. Deaths from cancer occur in patients diagnosed over a wide range of time and the mixture of these diagnosis times is generally unknown. Consequently, it is difficult to determine the outcome on those diagnosed in 1975, say, and to compare the outcome to those diagnosed in 1985, and so to evaluate progress in comparatively recent times. Our measure of progress makes this comparison possible, and we apply it to patients diagnosed in three time periods: 1975-76, 1980-81, and 1985-86.
    The difficulty in applying our criterion is in extrapolating survival beyond the period of observation. We focus on adult onset cancers and desire to determine the proportion of the population diagnosed and dead by ages up to 85 years. This agenda requires the determination of the probability of survival to ages up to 85 years of a patient diagnosed at age 20. The data from the first period studied provides a follow-up of only 15 years; the latest provides only 5 years of follow-up. Originally, it was hoped that fitting the very flexible Log-F model to the data would provide a believable extrapolation, but this was found to be inadequate. A large proportion of cancer patients die soon after diagnosis and this early survival experience is not predictive of later experience.
    A strategy for extrapolating survival at specific times after diagnosis was evolved. Observed survival rates from earlier time periods are used when available. For survival beyond 15 years, results were used from fitting the Log-F model to the 10 to 15 year survival experience of patients diagnosed in 1975-76. This model has been found good for fitting late survival experience. Note that these methods do not consider possible improvements with time in long term cancer survival; there is no data on which to estimate such improvements.
    Cancer incidence is increasing with time and survival after diagnosis is improving with time of diagnosis. The net effect on the proportion diagnosed with cancer and dead by particular ages is shown in the figures. For each race and sex combination, the leftmost chart shows the proportion graphed against age; the rightmost chart shows the change with respect to 1975-76 experience. The solid line in the figures shows the 1975-76 experience, the dotted line shows that for 1980-81, and the dashed line is for 1985-86. Note that in the difference figures, the experience in 1975-76 is taken as the baseline. Cancers related to AIDS are excluded from the graph for white men for the 1985-86 time period. AIDS causes a large number of cancers in young men who then die at an early age, but this was felt to be largely extraneous to the issue of cancer-associated mortality.
    For whites of both sexes, the proportion of the population diagnosed with cancer and dead by ages up to 65 or 70 has remained approximately constant over time. The effects of increasing incidence and survival approximately cancel. At quite advanced ages, everyone dies and this measure reduces to incidence of cancer, which is increasing. For black women, the cancer associated mortality has decreased with time. The incidence of cancer in young black women has decreased and the increase at older ages is more than compensated for by increased survival. The experience of black men has definitely gotten worse. There has been a great increase in cancer incidence in black men over 40, and this increase overwhelms any improvement in survival.

    Left panels. The proportion of the SEER population diagnosed with cancer and dead by particular ages. The solid line presents experience for 1975-76, the dotted line that for 1980-1981, and the dashed line, 1985-1986.
    Right panels. The change in proportions from the experience of 1975-1976.
    Note. AIDS related cancers have been ignored in white men. These cancers add a large number of diagnoses and deaths at young ages, but are deemed extraneous to changes in the impact of cancer on survival.
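    The criterion proposed in this project can be illustrated with a small calculation. The sketch below is only one plausible reading of "diagnosed with cancer and dead by a particular age": it sums, over ages at diagnosis, the probability of diagnosis at that age times the probability of death from any cause by the target age. The function name, the input layout, and all numbers are hypothetical and are not taken from the SEER analysis.

```python
def proportion_diagnosed_and_dead(diagnosis_prob, death_prob_after, by_age):
    """Proportion of a population diagnosed with cancer and dead by `by_age`.

    diagnosis_prob   -- dict mapping age at diagnosis to the probability that
                        an individual is first diagnosed at that age
                        (hypothetical input format)
    death_prob_after -- function (age_at_diagnosis, years) giving the
                        probability that a patient diagnosed at that age dies
                        from any cause within `years` years
    """
    total = 0.0
    for age, p_dx in diagnosis_prob.items():
        if age <= by_age:
            total += p_dx * death_prob_after(age, by_age - age)
    return total


# Hypothetical inputs, for illustration only.
incidence = {50: 0.01, 60: 0.02, 70: 0.03}


def toy_death_prob(age_at_diagnosis, years):
    # Made-up mortality: 10% per year after diagnosis, capped at 1.
    return min(1.0, 0.1 * years)


by_65 = proportion_diagnosed_and_dead(incidence, toy_death_prob, 65)
by_85 = proportion_diagnosed_and_dead(incidence, toy_death_prob, 85)
```

    With these made-up inputs the value at age 85 equals total incidence, which mirrors the remark that at advanced ages the measure reduces to the incidence of cancer. No cause of death enters the calculation.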
  2. Statistical Methods for the Diagnosis of CIN from Laser Spectroscopy. Dr. E. Neely Atkinson, Dr. R. Richards-Kortum, and Dr. M. F. Mitchell.
    The goal of this study is to develop methods for diagnosing the presence of cervical intraepithelial neoplasia (CIN) based on an analysis of the fluorescence excitation-emission spectra of cervical tissue in vivo. If the project is successful, it will greatly aid in avoiding unnecessary biopsies with a consequent decrease in cost, discomfort, and risk to patients.
    During colposcopy, suspicious areas of the cervix are illuminated by a laser probe at several excitation wavelengths and the intensity of the resulting fluorescence measured at a number of emission wavelengths, resulting in an excitation emission matrix (EEM) of values. Typically, there are approximately 3 excitation wavelengths and 50 emission wavelengths for each excitation wavelength. Based on various characteristics of the EEM, we seek to classify the tissue as normal, inflamed, squamous intraepithelial lesions (SIL), or metaplastic.
    Although we are exploring a number of potential algorithms, they all comprise three basic phases: data preprocessing, data reduction, and classification. For each individual patient, normal tissue consistently produces a higher peak emission intensity occurring at a lower emission wavelength than abnormal tissue. However, between patients there is great variability in the location and intensity of the peak emission for normal tissue.
    Similar relations hold between the other tissue types. It will probably be necessary to calibrate the EEM for each patient prior to attempting any classification to remove some of the between patient variability. Next, for each area of tissue under examination, there are approximately 150 values in the EEM; in order to derive a classification algorithm which will be small enough and fast enough to be clinically useful, these values need to be reduced to a small number of parameters which retain the relevant information. Two methods are under exploration: fitting parametric curves characterized by a small number of parameters to each excitation-emission spectrum and fitting each spectrum as a linear combination of several orthonormal basis vectors. Finally, the parameters derived from the data reduction must be used to classify the tissue. We are studying several classification algorithms: polychotomous logistic regression, classification trees, and neural networks.
    The number of combinations of techniques available for use at each phase of the data analysis means that this is a potentially lengthy and complicated project. However, if it is successful, the techniques developed in this study will be of great benefit to patients.
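    The second data-reduction method mentioned above, fitting each spectrum as a linear combination of orthonormal basis vectors, can be sketched as follows. This is a minimal illustration, not the study's algorithm: the polynomial basis, the 50-point wavelength grid, and the Gaussian-shaped spectrum are all assumptions chosen only to make the example self-contained.

```python
import math


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def gram_schmidt(vectors):
    """Turn linearly independent vectors into an orthonormal basis."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(dot(w, w))
        if norm > 1e-12:  # skip vectors already in the span
            basis.append([wi / norm for wi in w])
    return basis


def reduce_spectrum(spectrum, basis):
    """Coefficients of the least-squares fit: one number per basis vector."""
    return [dot(spectrum, b) for b in basis]


def reconstruct(coeffs, basis):
    """Rebuild the fitted spectrum from its coefficients."""
    n = len(basis[0])
    return [sum(c * b[i] for c, b in zip(coeffs, basis)) for i in range(n)]


# Hypothetical example: 50 emission wavelengths rescaled to [0, 1].
n_points = 50
grid = [i / (n_points - 1) for i in range(n_points)]
basis = gram_schmidt([[1.0] * n_points, grid, [x * x for x in grid]])

# Made-up emission spectrum with a single peak.
spectrum = [math.exp(-((x - 0.4) ** 2) / 0.02) for x in grid]
coeffs = reduce_spectrum(spectrum, basis)   # 50 values reduced to 3
approx = reconstruct(coeffs, basis)
```

    Because the basis is orthonormal, the coefficients are simple dot products and no linear system has to be solved. The few coefficients per excitation wavelength would then be passed to the classification step.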
  3. Dynamic Interactive Graphics for the Exploratory Analysis of Survival Data (Recent Publications, #3). Dr. E. Neely Atkinson.
    The goal of this project is to develop and implement interactive dynamic techniques which permit clinical researchers to graphically examine survival data in an intuitive fashion in order to discover interesting aspects of their data which may then be analyzed more formally using traditional statistical methods.
    Although powerful techniques are available for testing specific hypotheses in the analysis of survival data, these methods are only useful for answering particular questions about a data set; they are not convenient for examining data in a less structured way, looking for surprising aspects which may then be studied more extensively. The numerous techniques for exploratory data analysis developed for many other data types do not extend directly to survival data, which is characterized by censored data values. Recent advances in computer hardware and software now make it feasible to develop graphical methods for the exploration of survival data which permit clinical researchers to view survival data in various intuitive fashions and to implement these techniques on the computer systems likely to be available to most researchers.
    Suppose, for example, a researcher wishes to examine the effects of several covariates on survival, perhaps searching for subsets of patients who do particularly well or poorly or perhaps attempting to construct an appropriate model for survival analysis. The covariates can be displayed in a scatterplot matrix (a graphical matrix giving all pairwise scatterplots for the variables) and the scatterplot matrix linked to a survival curve. Using a mouse (or other input device), the researcher can select several data points from the scatterplot matrix and see the survival curve computed using only the selected points. As the selection of points changes, the survival curve is continuously updated, permitting the researcher to search for interesting regions of the data. Instead of a survival curve, the scatterplot matrix could be linked to another appropriate display, such as a censored boxplot or an event chart.
    A preliminary version of a computer package embodying these ideas has been implemented in the LISP-STAT system and posted to Netlib. As I obtain feedback and suggestions, I intend to modify and extend the package to make it as useful as possible to clinicians.
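    The computation behind the linked display can be sketched independently of any graphics system. The Python fragment below is not the LISP-STAT package; it only illustrates, under the assumption that the linked curve is a Kaplan-Meier estimate, how a survival curve could be recomputed from whichever patients are currently selected. The record layout (`time`, `died`, `age`) is hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (time, survival) pairs.

    events[i] is True when death was observed at times[i] and False when
    the observation was censored at that time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all observations that share the same time.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


def curve_for_selection(patients, is_selected):
    """Recompute the curve using only the currently selected patients."""
    chosen = [p for p in patients if is_selected(p)]
    return kaplan_meier([p["time"] for p in chosen],
                        [p["died"] for p in chosen])


# Hypothetical records; in an interactive display the predicate would come
# from the points brushed in the scatterplot matrix.
patients = [
    {"age": 70, "time": 1, "died": True},
    {"age": 40, "time": 9, "died": False},
    {"age": 65, "time": 2, "died": True},
    {"age": 52, "time": 4, "died": True},
]
older = curve_for_selection(patients, lambda p: p["age"] >= 60)
```

    Each change in the selection triggers one call to `curve_for_selection`, which is cheap enough for continuous updating on modest data sets.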
  4. Design of Dose-Response Studies (Submitted Publications, #5). Dr. B. W. Brown and Dr. F. M. Spears.
    This study is part of a general effort to use inexpensive computation to design efficient trials so as to obtain the maximum amount of information at minimal cost. The current application applies to studies of the response of cancer to differing doses of some therapeutic agent.
    Most previous work on such designs considers only the case in which the parameters of the dose-response curve are known when the experiment is being planned. This case is unrealistic: if the experimenter knew these parameters, he would not perform the experiment. We focus on designs that are efficient for specified degrees of initial investigator uncertainty.
    In the most widely used dose-response model, typical gains in efficiency are as follows. If the parameters of the dose-response curve are known in advance, a planned experiment typically allows one-half to two-thirds the number of subjects to be used for the same degree of accuracy as an experiment that equally spaces groups of subjects across doses. If the parameters are not well known in advance, a study planned for uncertainty leads to a similar increase in efficiency over a study that uses the most likely values.
    For large degrees of initial uncertainty, a two-stage study allows a similar savings over a one-stage study. In a two-stage study, only a portion of the subjects are used in the first stage of the study. The allocation of the remainder of the subjects depends on the outcome of the subjects in the first stage of the study.
    Chaloner and Larntz applied a similar overall approach with different forms of representation of uncertainty and criteria of efficiency in "Optimal Bayesian design applied to logistic regression experiments" in the Journal of Statistical Planning and Inference, 21:191-208, 1989. They found that the number of dose points increases with uncertainty; we have not. We are currently investigating the effect of the representation of uncertainty and the functional form of criteria on good designs.
  5. Hyperbolic Matrix Decompositions in Statistical Computing (Recent Publications, #2). Dr. E. Neely Atkinson.
    An algebraic system known as the hyperbolic complex numbers can be used to compute certain matrix decompositions which fail to exist when ordinary real numbers are used. This fact can be exploited in the computations associated with several models frequently used in the analysis of survival data.
    Parameter estimation in the fitting of statistical models generally requires the numerical minimization of an objective function. At each iteration of the minimization, a matrix factorization known as the Cholesky decomposition is applied to the Hessian, or matrix of second derivatives, of the objective function. The Cholesky factorization requires that the matrix being decomposed be positive definite; when this condition is not met, the Hessian must be modified to ensure positive definiteness. It is desirable to make the modifications as small as possible. By using hyperbolic computations, the Cholesky decomposition can be performed even when the condition of positivity is not met; as the decomposition is being performed, it is apparent what modification would be required to enforce positive definiteness. This approach may lead to smaller changes to the Hessian than currently used methods.
    The Hessian matrix frequently appears as the product of component matrices. By using hyperbolic computations, the required decomposition can be performed directly on the component matrices rather than actually forming the Hessian explicitly. This approach minimizes the effects of round-off error on the computations.
    Both of the above situations arise in the use of the class of accelerated failure time models. The Hessian contains a portion related to the linear elements of the model which can be written in factored form and a portion related to the nonlinear elements which may require modification to guarantee positivity. By applying hyperbolic techniques to this class of models, I hope to derive models which are both more efficient than current techniques and more resistant to numerical difficulties.
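    For context, the conventional remedy when the Hessian is not positive definite can be sketched briefly. The code below is not the hyperbolic decomposition described in this project; it shows a common baseline, assumed here for illustration, in which a multiple of the identity is added to the Hessian and increased until an ordinary Cholesky factorization succeeds. The shift schedule (`first_shift`, `growth`) is an arbitrary choice.

```python
import math


def cholesky(a):
    """Lower-triangular L with L * L^T == a.

    Raises ValueError when a is not (numerically) positive definite.
    """
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        d = a[j][j] - sum(l[j][k] ** 2 for k in range(j))
        if d <= 0.0:
            raise ValueError("matrix is not positive definite")
        l[j][j] = math.sqrt(d)
        for i in range(j + 1, n):
            s = a[i][j] - sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = s / l[j][j]
    return l


def cholesky_with_shift(a, first_shift=1e-3, growth=2.0, max_tries=60):
    """Factor a + tau * I, raising tau from 0 until the factorization works.

    Returns (L, tau). This is the textbook diagonal-shift baseline,
    not the hyperbolic method.
    """
    n = len(a)
    tau = 0.0
    for _ in range(max_tries):
        shifted = [[a[i][j] + (tau if i == j else 0.0) for j in range(n)]
                   for i in range(n)]
        try:
            return cholesky(shifted), tau
        except ValueError:
            tau = first_shift if tau == 0.0 else tau * growth
    raise RuntimeError("no suitable shift found")


# Hypothetical indefinite Hessian (eigenvalues 3 and -1).
hessian = [[1.0, 2.0], [2.0, 1.0]]
factor, shift = cholesky_with_shift(hessian)
```

    The shift found this way changes every diagonal element and overshoots the smallest value that would suffice, which is the kind of oversized modification the hyperbolic approach aims to reduce.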
  6. Exact Designs and Analyses for Single- and Multiple-Stage Studies with a Binomial Outcome. Dr. Barry W. Brown and Dr. F. M. Spears.
    Many cancer trials involve an outcome, such as complete remission, that is observed shortly after treatment. Ethical considerations usually dictate an early look at the data of such a trial when it is only partially complete. The trial should be terminated early if there is overwhelming evidence that the new treatment is better than the old. In this case, the new treatment graduates from being experimental to being standard practice. The trial should also terminate early if it appears that a demonstration of the superiority of the new treatment is quite unlikely. The new treatment may be more intense and toxic than the old or it may be more expensive. In either case, patients should not be subjected to the new treatment if it appears to lack additional benefits over the old one.
    The purpose of this project is to implement exact methods for planning and analysis of studies with and without early examinations of the data. The overall methodology is quite clear; the difficulty in implementation is the presentation of sufficient information on consequences of choices of stopping points in early looks to the designer of the study.
    The plural in "designs" refers to the fact that all of these methods involve sorting all possible outcomes of the trial in order so that the outcomes most favoring the superiority of the new treatment are on top of the sorted list. Statistical theory shows that there is no unique best sort order for this task; a choice between reasonable orders must be made. Fortunately, comparisons of several orders show them to be generally consistent.
    This project is closely related to a collaboration between these investigators and Dr. J. J. Lee and Mr. D. Serachitopol, investigating various confidence limits on the difference of two binomial outcomes. It is an interesting irony that the "exact" methods provide confidence limits that are wider than necessary; that is, some "inexact" methods are better. This finding is analogous to the Fisher "exact" test for the difference in binomial proportions. The Fisher test is an exact answer to a question rarely asked in practice and provides worse answers to the usual question than approximate methods.
  7. Methods for Dealing with Multiple Hypotheses. Dr. Barry W. Brown and Ms. Kathy Russell.
    The simultaneous investigation of a number of different questions is common in cancer research. For example, several investigators are looking at the pattern of joint occurrence of several genetic alterations suspected to be involved in the cancer process. For each pair of genetic loci, there is a separate question as to whether the alterations are related. In quality of life studies, scores in several psychological dimensions are related to the outcome of clinical treatment. Each pair of psychological dimensions and clinical measures must be evaluated for association. A third example is the examination of a number of patient characteristics at diagnosis for association with the outcome of treatment.
    Calculating statistical significance while ignoring the fact that a number of questions are being asked leads to erroneous findings of evidence. If each of 100 independent tests has a probability of 0.05 of falsely being found significant then there is a probability greater than 0.99 that at least one of the 100 will be found significant.
    In a few cases, specific methods have been devised to deal with the multiplicity of questions being asked. In other cases, an overall test exists, determining whether there is evidence that any of the many results are not due to chance. Unfortunately, these special cases do not cover the many instances of multiple hypotheses encountered in cancer research.
    One strategy for dealing with multiplicity is to drastically increase the amount of evidence required to declare a result as more than chance. Such stringent requirements may well lead to missing important real effects.
    Another strategy, and the one under examination, plots the ordered probabilities that each result is due to chance against its rank in the ordering. If the results are independent and all due to chance, the resulting plot should be a straight line. Deviations from linearity indicate results that are not likely to be due to chance. There are both graphic and algorithmic methods for assessing such graphs and we are adding some of our own. A problem with all such methods is that their behavior when the results are not independent but are correlated is unknown. This behavior is being examined via simulation to determine the generality with which such methods can be used.
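    Two of the ideas in this project lend themselves to a short numerical sketch: the probability of at least one false positive among many independent tests, and the plot of ordered probabilities against rank. The code below assumes independent tests and uses simulated p-values; the function names and the mix of 90 chance results with 10 strong effects are hypothetical.

```python
import random


def familywise_error(alpha, n_tests):
    """Chance that at least one of n independent true-null tests is 'significant'."""
    return 1.0 - (1.0 - alpha) ** n_tests


def ordered_p_value_points(p_values):
    """Pair each ordered p-value with its expected value under pure chance.

    For n independent uniform p-values the i-th smallest has expectation
    i / (n + 1), so the points (expected, observed) should fall near the
    line y = x when every result is due to chance.
    """
    ordered = sorted(p_values)
    n = len(ordered)
    return [(i / (n + 1), p) for i, p in enumerate(ordered, start=1)]


def max_deviation(points):
    """Largest vertical distance between the plot and the line y = x."""
    return max(abs(observed - expected) for expected, observed in points)


# 100 independent tests at the 0.05 level: about 0.994.
fwe = familywise_error(0.05, 100)

# Simulated data: 90 chance results plus 10 hypothetical strong effects.
rng = random.Random(0)
chance_only = [rng.random() for _ in range(90)]
strong = [rng.random() * 0.001 for _ in range(10)]
points = ordered_p_value_points(chance_only + strong)
deviation = max_deviation(points)
```

    The maximum deviation is only one crude summary of departure from linearity, and its behavior under correlated results is exactly the open question that the simulations in this project address.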
  8. Goodness of Fit via Smoothing Techniques. Dr. Barry W. Brown, Dr. Joan Staniswalis, and Mr. Dan Serachitopol.
    The determination of the functional relation of a set of covariates to some outcome variable is one of the most frequently encountered problems in statistics generally as well as cancer research specifically. The covariates could be patient characteristics at entry to a study or the dose of some therapeutic agent and the outcome could be whether or not there is a complete remission of cancer or survival time. Or the covariates could be the amount of a particular chemical generated in cell culture and the outcome the number of cells with a mutation at a specific locus. Models for such data usually are chosen for their simplicity and known properties; it is important to determine whether these models are adequate and if not to improve them until they are.
    This project deals with the problem of ascertaining whether an assumed form of functional relation is a reasonable representation of data and, if not, providing suggestions for its improvement. Standard statistical practice allows a simple model to be compared to a more complex one to determine whether the complexity is necessary, but it does not provide a general comprehensive model for comparison to the one used.
    Drs. Staniswalis and Severini published a paper, "Diagnostics for assessing regression models," in the Journal of the American Statistical Association, 86:684-691, 1991, that proposed a nonparametric model obtained by allowing the overall model to vary continuously with the covariates. This general model is constrained only by the need to be smooth; no particular functional form is necessary. The model is estimated using smoothing techniques, i.e., variations on the idea of a running mean. This estimation proved difficult in practice; smoothing greatly cut down the range of covariate values in the fit, making parameter estimation difficult.
    Mr. Serachitopol noted that the outcome statistic (e.g., probability of complete remission) is easy to estimate locally even though its relation with the covariates is not. Staniswalis' statistic measuring the difference between the models requires only this outcome statistic. The outcome statistic can be easily used to calculate changes in the model required to make the chosen form match the smoothed result and hence to suggest improvements to the original model.
    We are writing code to allow this technique to be applied to outcomes that are continuous, probabilities, and survival time. This is a somewhat arduous task. The asymptotic theory for the distribution of Dr. Staniswalis' statistic was derived in her paper; however, experimentation shows that this distribution rarely applies to the sample sizes seen in practice. Hence simulation (bootstrapping) under the hypothesis that the original model is good is necessary to determine the statistical significance of the difference between the original model and the smoothed one.
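    The simulation step can be sketched in a deliberately simplified setting. The code below is not Dr. Staniswalis' statistic or the code under development; it assumes a binary outcome, takes "constant probability of response" as the original model, uses a running mean as the smoother, and measures lack of fit by the sum of squared differences between the smooth and the fitted probability. All names and settings are hypothetical.

```python
import random


def running_mean(values, half_width):
    """Running mean over a window of up to 2 * half_width + 1 neighbours."""
    n = len(values)
    smooth = []
    for i in range(n):
        lo, hi = max(0, i - half_width), min(n, i + half_width + 1)
        window = values[lo:hi]
        smooth.append(sum(window) / len(window))
    return smooth


def lack_of_fit(outcomes, half_width):
    """Distance between the smoothed outcomes and the fitted constant model."""
    p_hat = sum(outcomes) / len(outcomes)
    return sum((s - p_hat) ** 2 for s in running_mean(outcomes, half_width))


def bootstrap_p_value(outcomes, half_width=3, n_boot=200, seed=0):
    """Significance of the lack of fit, by simulation under the fitted model.

    `outcomes` must be 0/1 values ordered by the covariate.
    """
    rng = random.Random(seed)
    observed = lack_of_fit(outcomes, half_width)
    p_hat = sum(outcomes) / len(outcomes)
    exceed = 0
    for _ in range(n_boot):
        # Simulate data for which the original (constant) model is true,
        # then refit and recompute the statistic.
        fake = [1 if rng.random() < p_hat else 0 for _ in outcomes]
        if lack_of_fit(fake, half_width) >= observed:
            exceed += 1
    return (exceed + 1) / (n_boot + 1)


# Hypothetical data: response depends strongly on the covariate ordering.
trend = [0] * 20 + [1] * 20
p_value = bootstrap_p_value(trend)
```

    Because the reference distribution is generated by simulation, the result does not rely on the asymptotic theory, at the cost of refitting the model once per simulated data set.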

Consultations and Collaborations

  1. Biostatistics Core for: Extensions of radiotherapy research. Dr. L. Peters, Dr. B. W. Brown, Dr. E. N. Atkinson, and Ms. M. J. Oswald.
  2. Biostatistics Core for: A mutational model for childhood cancer. Dr. L. Strong, Dr. B. W. Brown, and Mr. L. Levy.
  3. Human tumor cell radiosensitivity vs. radiocurability. Dr. W. A. Brock and Dr. B. W. Brown.
  4. Patterns of cancer occurrence in relatives of childhood sarcoma and osteosarcoma patients. Dr. L. Strong and Dr. B. W. Brown.
  5. Multiple primary cancer and mutagen-hypersensitivity. Dr. S. P. Schantz, Dr. B. W. Brown, and Mr. L. Levy.
  6. Head and neck cancer: clinical impact of natural immunity. Dr. S. P. Schantz and Dr. B. W. Brown.
  7. Patterns of incidence of glioblastoma and glioma by race, sex, and age. Dr. V. A. Levin, Dr. B. W. Brown, and Mr. L. Levy.
  8. Mathematical models for cell senescence. Dr. W. Brock, Dr. B. W. Brown, and Dr. E. N. Atkinson.
  9. Radiosensitivity as a prognostic factor in head and neck cancer. Dr. W. Brock and Dr. B. W. Brown.
  10. Phase I/II study of combined modality treatment for resectable non-small cell superior sulcus tumors. Dr. R. Komaki, Dr. J. B. Putnam, Dr. J. S. Lee, and Dr. B. W. Brown.
  11. Kinetics and radiosensitivity as predictors of treatment response in human cervical cancer. Dr. M. Morris, Dr. W. A. Brock, Dr. N. Terry, Dr. B. W. Brown, and Dr. D. M. Gershenson.
  12. Determination of hypoxia in squamous cell carcinomas of the head and neck. Dr. J. D. Morton, Dr. E. E. Kim, Dr. K. Ang, and Dr. B. W. Brown.
  13. Preoperative tumor cell kinetics and rectal carcinoma response to preoperative 5FU+XRT. Dr. T. A. Rich, Dr. D. C. Hohn, Dr. M. Meistrich, and Dr. B. W. Brown.
  14. A phase III randomized study comparing conventional palliative irradiation (30 Gy/10 fractions) vs. single fraction photon radiotherapy +/- strontium 89 for painful bone metastases. Dr. D. Podoloff, Dr. A. Porter, Dr. Payne, Dr. B. W. Brown, and Dr. L. Peters.
  15. A study of the relation between mucositis and outcome in patients with head and neck cancer. Dr. F. Geara, Dr. B. W. Brown, and Mr. L. Levy.
  16. Diagnosis of CIN from laser fluorescence spectroscopy. Dr. M. F. Mitchell, Dr. R. Richards-Kortum, and Dr. E. N. Atkinson.
  17. Risk factors in breast and ovarian carcinoma. Dr. D. Kieback and Dr. E. N. Atkinson.
  18. Clinical trial of 4-HPR and tamoxifen in breast neoplasia. Dr. K. Dhingra and Dr. E. N. Atkinson.
  19. Evaluation of chemotherapeutic regimens for treatment of gynecologic malignancies. Dr. R. Freedman, Dr. D. Gershenson, Dr. M. F. Mitchell, and Dr. E. N. Atkinson.
  20. Studies of administration of viral oncolysate (virus modified tumor extract) to patients with ovarian carcinoma. Dr. R. Freedman and Dr. E. N. Atkinson.
  21. Studies of two argyrophilic nucleolar organizer region counting methods. Dr. W. A. Mourad and Dr. E. N. Atkinson.
  22. Intrinsic resistance to anticancer agents in murine pancreatic adenocarcinoma. Dr. J. A. Nelson and Dr. E. N. Atkinson.
  23. Predicting acute graft rejection in renal transplantation. Dr. J. Grevel and Dr. E. N. Atkinson.

CCSG Shared Resource:
Cancer Information Systems

The Cancer Information Systems Resource is composed of members of the Section of Computer Science of the Department of Biomathematics. It provides scientific computational abilities to meet the diverse needs of the institution's cancer researchers in addition to direct collaboration. This Resource has been a continuously funded shared resource from the inception of M. D. Anderson's Cancer Center Support Grant in 1979.

In addition to their activities on this Resource, the faculty have their own research projects -- usually arising from issues noted during the course of consultation -- and actively participate in educational activities.

This Resource provides the scientific computing capabilities needed by the institution's cancer researchers. Powerful, easy-to-use software that meets the institution's diverse research needs is acquired where possible and written otherwise. Code created by this Resource is made freely available to any researcher who wants it. Interested readers can obtain these packages by anonymous ftp from odin.mda.uth.tmc.edu (129.106.3.17); a description of the available packages is in /pub/index. Code for the S statistical system is submitted to statlib and is not placed on our ftp account. Statlib can be accessed by ftp at lib.stat.cmu.edu (128.2.241.142); the user name is statlib, and your e-mail address serves as the password. Listed here are some of the development efforts of the previous two years.

  1. STPLAN - Performs Sample Size, Power, and Related Calculations. Dr. Barry W. Brown, Mr. James Lovato, and Mr. Chris Brauner.
    STPLAN calculates the sample size required to produce a specific power. The package is symmetric in that it can calculate any one of the following when the others are specified: sample size, minimal detectable difference, type one error (significance level), and power. Most of the commonly encountered clinical test situations are incorporated into the calculations of STPLAN.
    The most important recent improvements in STPLAN are a full written report of the test conditions for which calculations are made and the ability to calculate tables of values with a single command.
    New capabilities are added to STPLAN as needs become evident in the course of our group's collaborations. Recent additions include calculations for changes of a binomial parameter for subjects above and below the median value of a continuous variable, calculations for the log-normal distribution, and calculations for two normal distributions with differing variances. Documentation of all methods used, with derivations of most, is being completed.
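    The symmetry described above can be illustrated with a minimal sketch. It is written in Python, is not STPLAN code, and assumes the simplest case: a two-sided comparison of two normal means with equal group sizes and a known common standard deviation, using the usual normal approximation that ignores rejection in the far tail. The function name solve_two_sample and its argument names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution


def solve_two_sample(delta=None, n=None, alpha=None, power=None, sigma=1.0):
    """Return whichever of delta, n, alpha, power was left as None.

    delta : difference in means to detect
    n     : sample size per group (returned as a float; round up in practice)
    alpha : two-sided significance level
    power : probability of detecting delta
    sigma : known common standard deviation (an assumption of this sketch)
    """
    given = {"delta": delta, "n": n, "alpha": alpha, "power": power}
    if sum(v is None for v in given.values()) != 1:
        raise ValueError("leave exactly one of delta, n, alpha, power unspecified")

    # Normal quantiles for the quantities that were supplied.
    z_alpha = None if alpha is None else _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = None if power is None else _STD_NORMAL.inv_cdf(power)

    if n is None:
        return 2 * ((z_alpha + z_power) * sigma / delta) ** 2

    std_err = sigma * sqrt(2 / n)  # standard error of the difference in means
    if delta is None:
        return (z_alpha + z_power) * std_err
    if power is None:
        return _STD_NORMAL.cdf(delta / std_err - z_alpha)
    # Otherwise alpha is the unknown.
    return 2 * (1 - _STD_NORMAL.cdf(delta / std_err - z_power))


n_per_group = solve_two_sample(delta=0.5, alpha=0.05, power=0.80)
print(f"sample size per group: {n_per_group:.2f}")
```

    Supplying the computed sample size together with any two of the other three quantities returns the remaining one, which is the sense in which such a calculation is symmetric.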
  2. Asymptotic Sample Size Calculations. Dr. Barry W. Brown and Mr. James Lovato.
    A set of programs for performing asymptotic sample size calculations (using likelihood methods) has been written in the S statistical language. These methods provide required sample size for cases that are too complex for STPLAN; for example, detecting a quadratic term in logistic regression. These likelihood methods are extremely general.
  3. Stukel's Generalization of Logistic Regression. Dr. Barry W. Brown and Mr. Dan Serachitopol.
    The Stukel generalization of logistic regression has been implemented in the S system. This generalization parameterizes the link function and provides a very flexible set of models for fitting binary response data -- a very common case in cancer research. In a large number of cases, the fit is significantly better than that of the logistic model.
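    As an illustration only, the sketch below shows one published way to parameterize the link function: the two-parameter family of Stukel (1988), which we assume is the generalization meant here. It is written in Python rather than S, it is not the code of this Resource, and the names stukel_h and stukel_prob are hypothetical. The two shape parameters control the upper and lower tails separately, and setting both to zero recovers the ordinary logistic model.

```python
import math


def stukel_h(eta, a1=0.0, a2=0.0):
    """Transform the linear predictor eta; a1 shapes eta >= 0, a2 shapes eta < 0."""
    if eta >= 0:
        if a1 > 0:
            return (math.exp(a1 * eta) - 1) / a1
        if a1 < 0:
            return -math.log(1 - a1 * eta) / a1
        return eta
    x = -eta  # magnitude of a negative linear predictor
    if a2 > 0:
        return -(math.exp(a2 * x) - 1) / a2
    if a2 < 0:
        return math.log(1 - a2 * x) / a2
    return eta


def stukel_prob(eta, a1=0.0, a2=0.0):
    """Response probability: the logistic function applied to stukel_h."""
    h = stukel_h(eta, a1, a2)
    if h >= 0:
        return 1 / (1 + math.exp(-h))
    e = math.exp(h)  # avoids overflow for very negative h
    return e / (1 + e)


for shape in (-0.5, 0.0, 0.5):
    print(shape, round(stukel_prob(2.0, a1=shape), 4))
```

    A positive shape parameter makes the corresponding tail approach its limit faster than the logistic curve, and a negative one makes it approach more slowly.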
  4. Cumulative Distribution Functions, Inverses, etc. Dr. Barry W. Brown, Mr. James Lovato, and Ms. Kathy Russell.
    DCDFLIB is a collection of routines, available in Fortran and in C, for the double precision calculation of cumulative distribution functions, their inverses, and their parameters for the common statistical distributions listed below. DCDFLIB uses published algorithms cited in its documentation.
    Values associated with a statistical distribution include X, the upper limit of integration of the density; P, the cumulative distribution function evaluated at X; and auxiliary parameters such as degrees of freedom. Given all but one of these values, a routine in DCDFLIB will calculate the value that was not specified.
    Routines are provided for the following distributions: (1) Beta, (2) Binomial, (3) Chi-square, (4) Noncentral Chi-square, (5) F, (6) Noncentral F, (7) Gamma, (8) Negative Binomial, (9) Normal, (10) Poisson, (11) Student's t.
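    The idea of solving for whichever value is missing can be sketched for a single distribution. The Python example below is not DCDFLIB code and does not use its algorithms or calling conventions; it treats the Poisson distribution with direct summation for the cumulative probability and simple bisection for the mean, which is adequate only for modest means. The names poisson_cdf and solve_poisson are hypothetical.

```python
import math


def poisson_cdf(s, lam):
    """P(X <= s) for X ~ Poisson(lam), by direct summation of the terms."""
    term = math.exp(-lam)
    total = term
    for k in range(1, int(s) + 1):
        term *= lam / k
        total += term
    return min(total, 1.0)


def solve_poisson(p=None, s=None, lam=None):
    """Return whichever of p (cumulative probability), s (count), lam (mean) is None."""
    if [p, s, lam].count(None) != 1:
        raise ValueError("leave exactly one of p, s, lam unspecified")
    if p is None:
        return poisson_cdf(s, lam)
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if s is None:
        # Smallest count whose cumulative probability reaches p.
        s = 0
        while poisson_cdf(s, lam) < p:
            s += 1
        return s
    # The cumulative probability decreases as the mean grows, so bracket and bisect.
    lo, hi = 0.0, 1.0
    while poisson_cdf(s, hi) > p:
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if poisson_cdf(s, mid) > p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


p = solve_poisson(s=3, lam=2.0)
print(p, solve_poisson(p=p, lam=2.0), solve_poisson(p=p, s=3))
```

    The same pattern, a forward calculation plus a monotone search over the unknown, extends to the other arguments of a distribution function whenever the cumulative probability is monotone in that argument.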
  5. Random Number Generators. Dr. Barry W. Brown, Mr. James Lovato, and Ms. Kathy Russell.
    RANLIB is a collection of routines that provide generators of random numbers from a variety of distributions. RANLIB uses published algorithms cited in its documentation. Both Fortran and C versions are available.
    The uniform generator uses an algorithm of L'Ecuyer and Cote to provide 32 virtual random number generators. Each generator contains 1,048,576 blocks of numbers, and each block is of length 1,073,741,824. Any generator can be set to the beginning or end of the current block. Packaging is provided so that if these capabilities are not needed, a single generator with period 2.3 × 10^18 is seen.
    Using this uniform generator, routines are provided that return: (1) Beta random deviates, (2) Chi-square random deviates, (3) Exponential random deviates, (4) F random deviates, (5) Gamma random deviates, (6) Multivariate normal random deviates (mean and covariance matrix specified), (7) Noncentral chi-square random deviates, (8) Noncentral F random deviates, (9) Univariate normal random deviates, (10) Random permutations of an integer array, (11) Real uniform random deviates between specified limits, (12) Binomial random deviates, (13) Poisson random deviates, (14) Integer uniform deviates between specified limits, (15) Multinomial random deviates, (16) Negative binomial random deviates, and (17) Seeds for the random number generator calculated from a user provided character string.
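    For readers curious about how such a uniform generator can be organized, the following Python sketch combines two multiplicative congruential generators using the constants of L'Ecuyer's 1988 combined generator, which we believe to be the basis of the L'Ecuyer and Cote package. It is not RANLIB code: the class and method names are hypothetical, the scaling to the unit interval is a simplification, and the management of the 32 virtual generators is omitted. Its period, (M1 - 1)(M2 - 1)/2, is about 2.3 × 10^18, and jumping ahead by a whole block of 2^30 numbers needs only one modular exponentiation per component.

```python
M1, A1 = 2147483563, 40014  # modulus and multiplier of the first component
M2, A2 = 2147483399, 40692  # modulus and multiplier of the second component
PERIOD = (M1 - 1) * (M2 - 1) // 2
BLOCK_LENGTH = 2 ** 30      # 1,073,741,824 numbers per block


class CombinedGenerator:
    """Combination of two multiplicative congruential generators (a sketch)."""

    def __init__(self, seed1, seed2):
        if not (1 <= seed1 < M1 and 1 <= seed2 < M2):
            raise ValueError("seeds out of range")
        self.s1, self.s2 = seed1, seed2

    def next_int(self):
        """Advance both components and return an integer in [1, M1 - 1]."""
        self.s1 = (A1 * self.s1) % M1
        self.s2 = (A2 * self.s2) % M2
        z = (self.s1 - self.s2) % (M1 - 1)
        return z if z >= 1 else z + (M1 - 1)

    def random(self):
        """Uniform deviate strictly between 0 and 1."""
        return self.next_int() / M1

    def advance(self, k):
        """Jump k steps ahead without generating the intermediate numbers."""
        self.s1 = (pow(A1, k, M1) * self.s1) % M1
        self.s2 = (pow(A2, k, M2) * self.s2) % M2

    def skip_block(self):
        """Jump one full block (2**30 numbers) ahead of the current state."""
        self.advance(BLOCK_LENGTH)


gen = CombinedGenerator(12345, 67890)
print([round(gen.random(), 6) for _ in range(3)])
print(f"period is about {PERIOD:.3e}")
```

    Because each component is purely multiplicative, the state k steps ahead is the current state times the multiplier raised to the power k, which is what makes splitting a long sequence into widely separated blocks inexpensive.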
  6. Accelerated Failure Model. Dr. Barry W. Brown, Dr. F. Martin Spears, Mr. James Lovato, Mr. L. Levy, and Dr. E. Neely Atkinson.
    This model posits that time to failure accelerates or decelerates for each subject as a function of the values of a set of covariates. The model is parametric, and so complements the nonparametric proportional hazards model that is commonly used to analyze time to event data. The model was published in 1981 and its advantages are well known to statisticians. However, it is little used in practice because code for fitting it to data has been lacking. Fitting the model raises numerous serious numerical problems. Writing robust code has been a long-term effort, but the current version of the program appears to work reasonably well.
    This program, ACCFLF, fits the accelerated failure model to data with or without covariates. It can fit the general model or any submodel with p and q fixed. It can also automatically fit a set of models specified by a rectangular grid in (p,q) coordinates or a set of named models. Several options are provided for adding covariates to an existing model. Finally, ACCFLF will calculate time to event probabilities for accelerated failure models using either the times in the current data or a user-supplied list.

Recent Publications

  1. Abu-Farsakh, H.A., Katz, R.L., Atkinson, N., Champlin, R.E. Prognostic factors in bronchoalveolar lavage in 77 patients with bone marrow transplants. Acta Cytologica, in press.
  2. Atkinson, E.N. Computing A^T A - B^T B = L^T D L using generalized hyperbolic Householder transformations. Linear Algebra and Its Applications, 194:135-147, 1993.
  3. Atkinson, E.N. Interactive dynamic graphics for exploratory survival analysis. The American Statistician, in press.
  4. Bondy, M.L., Strom, S.S., Colopy, M.W., Brown, B.W., and Strong, L. Accuracy of family history of cancer obtained through interviews with relatives of patients with childhood sarcoma. Journal of Clinical Epidemiology, 47:89-96, 1994.
  5. Brown, B.W. STPLAN. Computational Statistics and Data Analysis, 17:597-598, 1994.
  6. Brown, B.W., Brauner, C., and Minnotte, M.C. Noncancer deaths in white adult cancer patients. Journal of the National Cancer Institute, 85:979-987, 1993.
  7. Brown, B.W. and Levy, L. Certification of Algorithm 708: Significant Digit Computation of the incomplete Beta. ACM Transactions On Mathematical Software, 20:393-397, 1994.
  8. Brown, B.W., Lovato, J., and Russell, K. RANLIB - random number generation library (C and F77): Version 1.1. Computational Statistics and Data Analysis, 17:598, 1994.
  9. Freedman, R.S., Bowen, J.M., Delcos, L., Edwards, C., Wallace, S., Atkinson, E.N., Ioannides, C.G., Kasi, L.P., Scott, W., and Patenia, R. Active intralymphatic immunotherapy of uterine cervical carcinoma with viral oncolysate: A pilot study. International Journal of Gynecological Cancer, 4:101-110, 1994.
  10. Freedman, R.S., Edwards, C.L., Kavanagh, J.J., Kudelka, A.P., Katz, R.L., Carrasco, C.H., Atkinson, E.N., Scott, W., Tomasovic, B., Templin, S., and Platsoucas, C.D. Intraperitoneal adoptive immunotherapy of ovarian carcinoma with tumor infiltrating lymphocytes and low-dose recombinant Interleukin-2: A pilot study. Journal of Immunotherapy, 16:198-210, 1994.
  11. Freedman, R.S., Tomasovic, B., Templin, S., Atkinson, E.N., Kudelka, A., Edwards, C.L. and Platsoucas, C.D. Large-scale expansion in Interleukin-2 of tumor-infiltrating lymphocytes from patients with ovarian carcinoma for adoptive immunotherapy. Journal of Immunological Methods, 167:145-160, 1994.
  12. Gershenson, D.M., Mitchell, M.F., Atkinson, N., Silva, E.G., Burke, T.W., Morris, M., Kavanagh, J.J., Warner, D., and Wharton, J.T. Age contrasts in patients with advanced epithelial ovarian cancer. The M.D. Anderson Cancer Center Experience. Cancer, 71:638-43, 1993.
  13. Gershenson, D.M., Silva, E.G., Mitchell, M.F., Atkinson, E.N., and Wharton, J.T. Transitional cell carcinoma of the ovary: A matched control study of advanced-stage patients treated with cisplatin-based chemotherapy. American Journal of Obstetrics and Gynecology, 168:1178-1186, 1993.
  14. Joshi, J.H., Newman, K.A., Brown, B.W., Finley, R.S., Ruxer, R.L., Moody, M.A. and Schimpff, S.C. Double β-lactam regimen compared to an aminoglycoside/β-lactam regimen as empiric antibiotic therapy for febrile granulocytopenic cancer patients. Supportive Care in Cancer, 1:186-294, 1993.
  15. Kieback, D.G., McCamant, S.K., Press, M.F., Atkinson, E.N., Gallager, H.S., Edwards, C.L., Hajek, R. A. and Jones, L.A. Improved prediction of survival in advanced adenocarcinoma of the ovary by immunocytochemical analysis and the composition adjusted receptor level of the estrogen receptor. Cancer Research, 53:5188-5192, 1993.
  16. Kieback, D.G., Press, M.F., Atkinson, E.N., Edwards, G.L., Mobus, V.J., Runnebaum, I.B., Kreienberg, R. and Jones, L.A. Prognostic significance of estrogen receptor expression in ovarian cancer. Immunoreactive score (IRS) vs. composition adjusted receptor level (CARL). Anticancer Research, 13:2489-2496, 1993.
  17. Koulos, J.P., Wright, T.C., Mitchell, M.F., Silva, E., Atkinson, E.N., and Richart, R.M. Relationships between c-Ki-ras mutations, HPV types, and prognostic indicators in invasive endocervical adenocarcinomas. Gynecologic Oncology, 48:364-369, 1993.
  18. Matthews, C.M., Burke, T.W., Tornos, C., Eifel, P.J., Atkinson, E.N., Stringer, C.A., Morris, M., and Silva, E.G. Stage I cervical adenocarcinoma: Prognostic evaluation of surgically treated patients. Gynecologic Oncology, 49:19-23, 1993.
  19. Miller, B., Morris, M., Rutledge, F., Mitchell, M.F., Atkinson, E.N., Burke, T.W., and Wharton, J.T. Aborted exenterative procedures in recurrent cervical cancer. Gynecologic Oncology, 50:94-99, 1993.
  20. Morris, M., Gershenson D.M., Burke, T.W., Follen Mitchell, M., Levenback, C., Atkinson, N., Wharton, J.T. A phase II study of carboplatin and cisplatin in advanced or recurrent squamous carcinoma of the uterine cervix. Gynecologic Oncology, 53:234-237, 1994.
  21. Mourad, W.A., Connelly, J.H., Sembera, D.L., Atkinson, E.N., and Bruner, J.M. The correlation of two argyrophilic nucleolar organizer region counting methods with bromodeoxyuridine-labeling index: a study of metastatic tumors of the brain. Human Pathology, 24:206-210, 1993.
  22. Patton, T.J., Mitchell, M.F., Atkinson, E.N., Eifel, P., Gastorf, L., Yancey, C., Miller, D. and Wharton, J.T. Parameters of small bowel dysfunction in cervical cancer patients undergoing radiotherapy. International Journal of Gynecologic Cancer, 3:175-182, 1993.
  23. Peters, L.J., Goepfert, H., Ang, K.K., Byers, R.M., Maor, M.H., Guillamondegui, O., Morrison, W.H., Weber, R.S., Garden, A.S., Frankenthaler, R.A., Oswald, M.J., and Brown, B.W. Evaluation of the dose for postoperative radiation therapy of head and neck cancer: first report of a prospective randomized trial. International Journal of Radiation, Oncology, Biology, Physics, 26:3-11, 1993.
  24. Peters, L.J., Withers, R.H., and Brown, B.W. Complicating issues in complication reporting (Editorial). International Journal of Radiation, Oncology, Biology, Physics, in press.
  25. Pryzant, R.M., Meistrich, M.L., Wilson, G., Brown, B.W., and McLaughlin, P. Long-term reduction in sperm count after chemotherapy with and without radiation therapy for non-Hodgkin's lymphomas. Journal of Clinical Oncology, 11:239-247, 1993.
  26. Robertson, L.E., Estey, E., Kantarjian, H., Koller, C., O'Brien, S., Brown, B., and Keating, M.J. Therapy-related leukemia and myelodysplastic syndrome in chronic lymphocytic leukemia. Leukemia, in press.
  27. Robins, D.B., Katz, R.L., Swan, Jr., F., Atkinson, E.N., Ordonez, N.G., and Huh, Y.O. Immunotyping of lymphoma by fine-needle aspiration: a comparative study of cytospin preparations and flow cytometry. American Journal of Clinical Pathology, 101:569-576, 1994.
  28. Rodriguez, M.A., Fuller, L.M., Zimmerman, S.O., Allen, P.K., Brown, B.W., Munsell, M.F., Hagemeister, F.B., McLaughlin, P., Velasquez, W.S., Swan, Jr., F., and Cabanillas, F.F. Hodgkin's disease: Study of treatment intensities and incidences of second malignancies. Annals of Oncology, 4:125-131, 1993.
  29. Roman, L.D., Morris, M., Mitchell, M.F., Eifel, P.J., Burke, T.W., and Atkinson, E.N. Prognostic factors for patients undergoing simple hysterectomy in the presence of invasive cancer of the cervix. Gynecologic Oncology, 50:179-184, 1993.
  30. Roth, J.A., Fossella, R., Komaki, R., Ryan, B.M., Putnam, Jr., J.D., Lee, J.S., Dhingra, H., DeCaro, L., Chasen, M., McGavran, M., Atkinson, E.N., Hong, W.K. A randomized trial comparing perioperative chemotherapy and surgery with surgery alone in resectable stage IIIA non-small-cell lung cancer. Journal of the National Cancer Institute, 86:673-680, 1994.
  31. Sneige, N., McNeese, M.D., Atkinson, E.N., Kemp, B., Sahin, A., Ayala, A.G. and Ames, F.C. Ductal carcinoma in situ treated with lumpectomy and irradiation: Histopathologic analysis of 49 cases with emphasis on risk factors and long-term results. Human Pathology, in press.

Submitted Publications

  1. McBride, C.M., Boddie, A.W., and Brown, B. Long term follow up of node negative breast cancer patients treated only by regional therapy.
  2. Lee, J.J., Serachitopol, D.M., and Brown, B.W. Likelihood weighted confidence intervals for the difference of two binomial probabilities.
  3. Brown, B.W., Spears, F.M., Levy, L.B., Lovato, J. and Russell, K. Algorithm XXX: LLDRLF, log-likelihood and some derivatives for log-F models.
  4. Spears, F.M., Brown, B.W. and Atkinson, E.N. The effect of incomplete knowledge of parameter values on single-stage designs for logistic regression.
  5. Spears, F.M. and Brown, B.W. Two-stage designs for logistic regression with incomplete knowledge of parameter values.
