Department of Biomathematics
Section of Computer Science
Staff and Activities
Staff
Barry W. Brown, Ph.D. (1963) University of California at Berkeley
Chief, Section of Computer Science
Larry and Pat McNeil Professor of Biomathematics
Areas: Mathematical modeling of cancer processes; statistical
computing; statistical methodology; software engineering
E. Neely Atkinson, Ph.D. (1981) Rice University
Associate Professor of Biomathematics
Areas: Computational statistics; regression analysis; computer
science; modeling of cancer processes
David Tuttle, M.A.M.S. (1985) Rice University
Software Systems Specialist III
Areas: UNIX, network, and TCP/IP administration
Chris Brauner, M.A. (1992) Rice University
Programmer Analyst II
Areas: Statistics and statistical computing
David Gutierrez, B.A. (1984) Rice University
Programmer Analyst II
Areas: Computer architecture; computer networks; distributed processing
systems; human-computer interfaces
Lawrence Levy, B.A. (1976) University of Houston
Programmer Analyst II
Areas: Statistical graphics; computational statistics; survival
analysis
Dan M. Serachitopol, M.S. (1974) Polytechnic Institute of Bucharest
Programmer Analyst II
Areas: Statistical modelling using the S language; symbolic computing;
graphical user interfaces using the TCL/TK language; numerical
algorithms; neural networks and genetic algorithms
F. Martin Spears,(*) Ph.D. (1992) Rice University
Visiting Assistant Professor
Areas: Statistical computing and statistics
Activities
Staff of this section are skilled in numerical analysis, systems
analysis, computer simulation, programming, language theory, and
statistical computing. This expertise is used in a wide variety of
applications, including modeling and simulation of the spread of
metastases, development of software tools for cancer research,
improvement of computational methods for statistical likelihood
estimation, and collaboration with clinical researchers and basic
scientists.
This group has developed computer software for study planning
and survival analyses that has been distributed to numerous researchers,
including those at many of the comprehensive cancer centers.
Footnotes
(*) Dr. Spears' current address: University of Houston at Clear
Lake, Department of Mathematics, 2700 Bay Area Blvd., Houston,
TX 77058.
- Progress Against Cancer Mortality: Another
Look (Recent Publications, #6).
Dr. B. W. Brown, Mr. C. Brauner, and Mr. L. Levy.
In 1986, Bailar and Smith published a widely noted paper, "Progress
against cancer?" in the New England Journal of Medicine,
314:1226-1232, 1986. They examined the mortality rate ascribed
to cancer and noted that it had changed very little with time,
although there were some modest improvements. Their conclusion
was that there had been little progress and they suggested that
more resources be placed in prevention rather than improved treatment.
We examine the impact of cancer on mortality from a slightly different
viewpoint.
The SEER (Surveillance, Epidemiology, and End Results) data is
used in this project. SEER is a government-financed cancer registry
that covers about 10% of the U.S. population. An examination of
overall trends in cancer including incidence and survival requires
the use of population-based data in registries that cover defined
geographic areas.
We disagree with Bailar and Smith in that we think that the concept
of the primary cause of death is philosophically undefined: although
death is certainly observed, its cause is not. We believe that
death, like most human events, is frequently the culmination of
several factors, and an attribution of primary cause is an unwarranted
oversimplification.
Because the cause of death is not observable, examining the correctness
of cancer specific mortality rates appears impossible. It is,
however, feasible to compare the rates of noncancer death in those
diagnosed with cancer to those in the overall population matched
by age and sex. If all excess mortality associated with cancer
is recorded as due to cancer, then the patient and population
noncancer death rates should be the same if the patient and general
population do not differ. The patient population, however, may
suffer an increased susceptibility to disease in general compared
to the overall population. This susceptibility may be caused by
factors such as unhealthy life styles or genetic defects. Consequently,
a higher than population death rate from causes other than cancer
might be expected in the cancer population. However, a generalized
lack of resistance to disease would not explain systematic changes
in relative noncancer mortality rates with time after the diagnosis
of cancer.
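The comparison described above can be sketched as an observed-to-expected ratio. This is only an illustration under stated assumptions, not the code used in the project: the function name is hypothetical, and the strata and numbers are made-up placeholders rather than SEER values.

```python
def noncancer_rate_ratio(strata):
    """Ratio of observed noncancer deaths among patients to the number
    expected if patients died at the matched population rate.

    Each stratum (e.g., an age-by-sex group) is a tuple:
    (patient_person_years, observed_noncancer_deaths, population_rate).
    """
    observed = sum(deaths for _, deaths, _ in strata)
    expected = sum(person_years * rate for person_years, _, rate in strata)
    return observed / expected


# Hypothetical strata; the numbers are illustrative only.
example_strata = [
    (1000.0, 12, 0.010),  # 10 deaths expected, 12 observed
    (2000.0, 50, 0.020),  # 40 deaths expected, 50 observed
]
ratio = noncancer_rate_ratio(example_strata)
```

A ratio above 1 indicates more noncancer deaths among patients than the matched population rates would predict.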
The ratio of patient-to-population noncancer death rates is greater
than 1 for all cancers combined and for the common solid tumors.
Leading noncancer causes of death in cancer patients were circulatory
and respiratory failures. The noncancer relative risk decreased
rapidly after diagnosis and decreased with the patient's age at
diagnosis. It increased slightly with the calendar year of diagnosis.
Because the largest excess of noncancer death occurred shortly
after diagnosis, it seems almost certain that a large portion
of the excess was caused by treatment of the cancer. The excess
has increased with time from 1973 (the beginning of SEER) to 1986
(the last year examined), which we attribute to increased treatment
intensity.
We conclude that cancer-attributed mortality is not a complete
measure of the cancer-associated mortality because it omits a
number of deaths associated with cancer. From this observation,
we suggest that trials of detection methodology (e.g., mammography)
should use as the outcome the number of individuals diagnosed
with the cancer and dead within some period of time. The number
of deaths attributed to cancer underestimates the value of detection,
by ignoring the lesser severity of treatment used in early stage
disease. Decreased severity of treatment would be expected to
result in fewer treatment related deaths.
Increasing intensity of treatment is warranted if it decreases
cancer mortality more than it increases treatment associated death
rates. To determine whether this is so, three year rates of survival
of cancer patients were examined. The three year period contains
most of the excess noncancer deaths.
For whites of both sexes and age groups, there is an improvement
in three-year survival of about 1/2% per year of diagnosis. Statistical
analysis shows that this improvement is not likely to be due to
chance fluctuation. Increased intensity of treatment appears to
save lives, at least in the short run. Blacks show somewhat less
improvement in three-year survival than whites; overall the improvement
is about 1/4% per year.
Returning to the primary theme of this project, we suggest that
progress against cancer mortality be evaluated by examining the
proportion of the (age adjusted) general population who are diagnosed
with cancer and dead by a particular age. This criterion has the
advantage of being largely immune to early detection bias; if
early detection does not prolong life the criterion will be unchanged.
The suggested criterion has the advantage of not relying on the
unobservable cause of death, and it does take into consideration
both cancer incidence and the results of treatment.
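A simplified sketch of the suggested criterion follows. It assumes a discrete table of diagnosis probabilities by age and a survival function for diagnosed patients; both inputs and the function name are hypothetical, and the sketch ignores the age adjustment and other refinements the project would require.

```python
def proportion_diagnosed_and_dead(incidence, survival, by_age):
    """Proportion of the population diagnosed with cancer and dead by `by_age`.

    incidence: dict mapping age at diagnosis -> probability of being
        diagnosed at that age (assumed input).
    survival(age_at_dx, years): probability that a patient diagnosed at
        age_at_dx is still alive `years` later, from any cause (assumed input).
    """
    total = 0.0
    for age_at_dx, prob_dx in incidence.items():
        if age_at_dx <= by_age:
            # Diagnosed at age_at_dx and dead (of any cause) by by_age.
            total += prob_dx * (1.0 - survival(age_at_dx, by_age - age_at_dx))
    return total


# Illustrative inputs only: 10% annual mortality after diagnosis.
toy_incidence = {50: 0.01, 60: 0.02}
toy_survival = lambda age_at_dx, years: 0.9 ** years
toy_result = proportion_diagnosed_and_dead(toy_incidence, toy_survival, 60)
```

Because the quantity counts death from any cause after diagnosis, no attribution of cause of death is needed.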
The suggested measure of progress has another advantage over the
criterion of the rate of death from cancer. Deaths from cancer
occur in patients diagnosed over a wide range of time and the
mixture of these diagnosis times is generally unknown. Consequently,
it is difficult to determine the outcome on those diagnosed in
1975, say, and to compare the outcome to those diagnosed in 1985,
and so to evaluate progress in comparatively recent times. Our
measure of progress makes this comparison possible, and we apply
it to patients diagnosed in three time periods: 1975-76, 1980-81,
and 1985-86.
The difficulty in applying our criterion is in extrapolating survival
beyond the period of observation. We focus on adult onset cancers
and desire to determine the proportion of the population diagnosed
and dead by ages up to 85 years. This agenda requires the determination
of the probability of survival to ages up to 85 years of a patient
diagnosed at age 20. The data from the first period studied provides
a follow-up of only 15 years; the latest provides only 5 years
of follow-up. Originally, it was hoped that fitting the very flexible
Log-F model to the data would provide a believable extrapolation,
but this was found to be inadequate. A large proportion of cancer
patients die soon after diagnosis and this early survival experience
is not predictive of later experience.
A strategy for extrapolating survival at specific times after
diagnosis was developed. Observed survival rates from earlier time
periods are used when available. For survival beyond 15 years,
results were used from fitting the Log-F model to the 10 to 15
year survival experience of patients diagnosed in 1975-76. This
model has been found good for fitting late survival experience.
Note that these methods do not consider possible improvements
with time in long term cancer survival; there is no data on which
to estimate such improvements.
Cancer incidence is increasing with time and survival after diagnosis
is improving with time of diagnosis. The net effect on the proportion
diagnosed with cancer and dead by particular ages is shown in
the figures. For each race and sex combination, the leftmost chart
shows the proportion graphed against age; the rightmost chart
shows the change with respect to the 1975-76 experience. The solid
line in the figures shows the 1975-76 experience, the dotted
line shows that for 1980-81, and the dashed
line is for 1985-86. Note that in the difference figures, the
experience in 1975-76 is taken as the baseline. Cancers related
to AIDS are excluded from the graph for white men for the 1985-86
time period. AIDS causes a large number of cancers in young men
who then die at an early age, but this was felt to be largely
extraneous to the issue of cancer-associated mortality.
For whites of both sexes, the proportion of the population diagnosed
with cancer and dead by ages up to 65 or 70 has remained approximately
constant over time. The effects of increasing incidence and survival
approximately cancel. At quite advanced ages, everyone dies and
this measure reduces to incidence of cancer, which is increasing.
For black women, the cancer associated mortality has decreased
with time. The incidence of cancer in young black women has decreased
and the increase at older ages is more than compensated for by
increased survival. The experience of black men has definitely
gotten worse. There has been a great increase in cancer incidence
in black men over 40, and this increase overwhelms any improvement
in survival.
Left panels. The proportion of the SEER population diagnosed
with cancer and dead by particular ages. The solid line presents
experience for 1975-76, the dotted line that for 1980-1981, and
the dashed line, 1985-1986.
Right panels. The change in proportions from the experience
of 1975-1976.
Note. AIDS related cancers have been ignored in white men.
These cancers add a large number of diagnoses and deaths at young
ages, but are deemed extraneous to changes in the impact of cancer
on survival.
- Statistical Methods for the Diagnosis of
CIN from Laser Spectroscopy. Dr. E. Neely Atkinson, Dr. R.
Richards-Kortum, and Dr. M. F. Mitchell.
The goal of this study is to develop methods for diagnosing the
presence of cervical intraepithelial neoplasia (CIN) based on
an analysis of the fluorescence excitation-emission spectra of
cervical tissue in vivo. If the project is successful,
it will greatly aid in avoiding unnecessary biopsies with a consequent
decrease in cost, discomfort, and risk to patients.
During colposcopy, suspicious areas of the cervix are illuminated
by a laser probe at several excitation wavelengths and the intensity
of the resulting fluorescence measured at a number of emission
wavelengths, resulting in an excitation emission matrix (EEM)
of values. Typically, there are approximately 3 excitation wavelengths
and 50 emission wavelengths for each excitation wavelength. Based
on various characteristics of the EEM, we seek to classify the
tissue as normal, inflamed, squamous intraepithelial lesions (SIL),
or metaplastic.
Although we are exploring a number of potential algorithms, they
all comprise three basic phases: data preprocessing, data reduction,
and classification. For each individual patient, normal tissue
consistently produces a higher peak emission intensity occurring
at a lower emission wavelength than abnormal tissue. However,
between patients there is great variability in the location and
intensity of the peak emission for normal tissue.
Similar relations hold between the other tissue types. It will
probably be necessary to calibrate the EEM for each patient prior
to attempting any classification to remove some of the between
patient variability. Next, for each area of tissue under examination,
there are approximately 150 values in the EEM; in order to derive
a classification algorithm which will be small enough and fast
enough to be clinically useful, these values need to be reduced
to a small number of parameters which retain the relevant information.
Two methods are under exploration: fitting parametric curves characterized
by a small number of parameters to each excitation-emission spectrum
and fitting each spectrum as a linear combination of several orthonormal
basis vectors. Finally, the parameters derived from the data reduction
must be used to classify the tissue. We are studying several classification
algorithms: polychotomous logistic regression, classification
trees, and neural networks.
The number of combinations of techniques available for use at
each phase of the data analysis means that this is a potentially
lengthy and complicated project. However, if it is successful,
the techniques developed in this study will be of great benefit
to patients.
- Dynamic Interactive Graphics for the Exploratory
Analysis of Survival Data (Recent Publications, #3).
Dr. E. Neely Atkinson.
The goal of this project is to develop and implement interactive
dynamic techniques which permit clinical researchers to graphically
examine survival data in an intuitive fashion in order to discover
interesting aspects of their data which may then be analyzed more
formally using traditional statistical methods.
Although powerful techniques are available for testing specific
hypotheses in the analysis of survival data, these methods are
only useful for answering particular questions about a data set;
they are not convenient for examining data in a less structured
way, looking for surprising aspects which may then be studied
more extensively. The numerous techniques for exploratory data
analysis developed for many other data types do not extend directly
to survival data, which is characterized by censored data values.
Recent advances in computer hardware and software now make it
feasible to develop graphical methods for the exploration of survival
data which permit clinical researchers to view survival data in
various intuitive fashions and to implement these techniques on
the computer systems likely to be available to most researchers.
Suppose, for example, a researcher wishes to examine the effects
of several covariates on survival, perhaps searching for subsets
of patients who do particularly well or poorly or perhaps attempting
to construct an appropriate model for survival analysis. The covariates
can be displayed in a scatterplot matrix (a graphical matrix giving
all pairwise scatterplots for the variables) and the scatterplot
matrix linked to a survival curve. Using a mouse (or other input
device), the researcher can select several data points from the
scatterplot matrix and see the survival curve computed using only
the selected points. As the selection of points changes, the survival
curve is continuously updated, permitting the researcher to search
for interesting regions of the data. Instead of a survival curve,
the scatterplot matrix could be linked to another appropriate
display, such as a censored boxplot or an event chart.
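The linking step can be sketched as recomputing a product-limit (Kaplan-Meier) survival curve from whichever points are currently selected. This is a minimal Python illustration, not the LISP-STAT implementation; the function names are hypothetical, and a real interactive display would call the second function each time the selection changes.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct death time.

    events[i] is True for an observed death and False for a censored value.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all observations that share the same time.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                deaths += 1
            removed += 1
            i += 1
        if deaths:
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed
    return curve


def survival_for_selection(times, events, selected):
    """Recompute the curve using only the points flagged in `selected`."""
    keep = [i for i, flag in enumerate(selected) if flag]
    return kaplan_meier([times[i] for i in keep], [events[i] for i in keep])
```

Censored observations leave the curve unchanged at their own time but reduce the number at risk for later deaths, which is what distinguishes this estimate from a simple empirical proportion.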
A preliminary version of a computer package embodying these ideas
has been implemented in the LISP-STAT system and posted to Netlib.
As I obtain feedback and suggestions, I intend to modify and extend
the package to make it as useful as possible to clinicians.
- Design of Dose-Response Studies (Submitted
Publications, #5).
Dr. B. W. Brown and Dr. F. M. Spears.
This study is part of a general effort to use inexpensive computation
to design efficient trials so as to obtain the maximum amount
of information at minimal cost. The current application applies
to studies of the response of cancer to differing doses of some
therapeutic agent.
Most previous work on such designs considers only the case in
which the parameters of the dose-response curve are known when
the experiment is being planned. This case is unrealistic: if
the experimenter knew these parameters, he would not perform the
experiment. We focus on designs that are efficient for specified
degrees of initial investigator uncertainty.
In the most widely used dose-response model, typical gains in
efficiency are as follows. If the parameters of the dose-response
curve are known in advance, a planned experiment typically allows
one-half to two-thirds the number of subjects to be used for the
same degree of accuracy as an experiment that equally spaces groups
of subjects across doses. If the parameters are not well known
in advance, a study planned for uncertainty leads to a similar
increase in efficiency over a study that uses the most likely
values.
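The known-parameter comparison can be illustrated with a small calculation. The sketch below assumes a two-parameter logistic dose-response model and uses the D-optimality criterion (determinant of the Fisher information matrix); the text does not state which criterion the study uses, so that choice, the parameter values, and the dose levels are all assumptions made for illustration.

```python
import math


def logistic_information_det(doses, weights, alpha, beta):
    """Determinant of the per-subject Fisher information matrix for
    P(response) = 1 / (1 + exp(-(alpha + beta * dose))).

    weights[i] is the fraction of subjects allocated to doses[i].
    """
    m11 = m12 = m22 = 0.0
    for x, w in zip(doses, weights):
        p = 1.0 / (1.0 + math.exp(-(alpha + beta * x)))
        v = w * p * (1.0 - p)
        m11 += v
        m12 += v * x
        m22 += v * x * x
    return m11 * m22 - m12 * m12


# Assumed "known" parameters: alpha = 0, beta = 1 (dose on a standardized scale).
equal_spacing = logistic_information_det([-3, -1.5, 0, 1.5, 3], [0.2] * 5, 0.0, 1.0)
# Commonly cited locally D-optimal design for this model: two doses at
# alpha + beta * dose = +/-1.5434, with half the subjects at each.
two_point = logistic_information_det([-1.5434, 1.5434], [0.5, 0.5], 0.0, 1.0)
# With two parameters, relative D-efficiency is the square root of the
# determinant ratio; it approximates the fraction of subjects the planned
# design needs to match the equally spaced design.
relative_subjects = math.sqrt(equal_spacing / two_point)
```

The size of the saving depends on the criterion and on how the equally spaced doses are chosen, so the ratio computed here should not be read as reproducing the figures quoted above.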
For large degrees of initial uncertainty, a two-stage study allows
a similar savings over a one-stage study. In a two-stage study,
only a portion of the subjects are used in the first stage of
the study. The allocation of the remainder of the subjects depends
on the outcome of the subjects in the first stage of the study.
Chaloner and Larntz applied a similar overall approach with different
forms of representation of uncertainty and criteria of efficiency
in "Optimal Bayesian design applied to logistic regression
experiments" in the Journal of Statistical Planning and
Inference, 21:191-208, 1989. They found that the number of
dose points increases with uncertainty; we have not. We are currently
investigating the effect of the representation of uncertainty
and the functional form of criteria on good designs.
- Hyperbolic Matrix Decompositions in Statistical
Computing (Recent Publications, #2).
Dr. E. Neely Atkinson.
An algebraic system known as the hyperbolic complex numbers can
be used to compute certain matrix decompositions which fail to
exist when ordinary real numbers are used. This fact can be exploited
in the computations associated with several models frequently
used in the analysis of survival data.
Parameter estimation in the fitting of statistical models generally
requires the numerical minimization of an objective function.
At each iteration of the minimization, a matrix factorization
known as the Cholesky decomposition is applied to the Hessian,
or matrix of second derivatives, of the objective function. The
Cholesky factorization requires that the matrix being decomposed
be positive definite; when this condition is not met, the Hessian
must be modified to ensure positive definiteness. It is desirable
to make the modifications as small as possible. By using hyperbolic
computations, the Cholesky decomposition can be performed even
when the condition of positivity is not met; as the decomposition
is being performed, it is apparent what modification would be
required to enforce positive definiteness. This approach may lead
to smaller changes to the Hessian than currently used methods.
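For contrast, one simple conventional modification is sketched below: add a growing multiple of the identity to the Hessian until an ordinary Cholesky factorization succeeds. This is explicitly not the hyperbolic method described above; the function names and the shift schedule are assumptions made for illustration.

```python
import math


def cholesky(a):
    """Lower-triangular L with L * L^T = a.

    Raises ValueError if a (a symmetric list-of-lists matrix) is not
    positive definite.
    """
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for j in range(n):
        d = a[j][j] - sum(low[j][k] ** 2 for k in range(j))
        if d <= 0.0:
            raise ValueError("matrix is not positive definite")
        low[j][j] = math.sqrt(d)
        for i in range(j + 1, n):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = s / low[j][j]
    return low


def cholesky_with_shift(a, first_shift=1e-3, factor=10.0, max_tries=50):
    """Factor a + shift * I, increasing shift until the factorization succeeds.

    Returns (L, shift). The shift schedule is arbitrary (an assumption).
    """
    n = len(a)
    shift = 0.0
    for _ in range(max_tries):
        shifted = [[a[i][j] + (shift if i == j else 0.0) for j in range(n)]
                   for i in range(n)]
        try:
            return cholesky(shifted), shift
        except ValueError:
            shift = first_shift if shift == 0.0 else shift * factor
    raise RuntimeError("no suitable shift found")
```

For the indefinite matrix [[1, 3], [3, 1]], any shift just above 2 would suffice, but this schedule stops only at 10, which shows how a crude scheme can change the Hessian far more than necessary.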
The Hessian matrix frequently appears as the product of component
matrices. By using hyperbolic computations, the required decomposition
can be performed directly on the component matrices rather than
actually forming the Hessian explicitly. This approach minimizes
the effects of round-off error on the computations.
Both of the above situations arise in the use of the class of
accelerated failure time models. The Hessian contains a portion
related to the linear elements of the model which can be written
in factored form and a portion related to the nonlinear elements
which may require modification to guarantee positivity. By applying
hyperbolic techniques to this class of models, I hope to derive
methods which are both more efficient than current techniques and
more resistant to numerical difficulties.
- Exact Designs and Analyses for Single- and Multiple-Stage
Studies with a Binomial Outcome. Dr. Barry W. Brown and Dr.
F. M. Spears.
Many cancer trials involve an outcome, such as complete remission,
that is observed shortly after treatment. Ethical considerations
usually dictate an early look at the data of such a trial when
it is only partially complete. The trial should be terminated
early if there is overwhelming evidence that the new treatment
is better than the old. In this case, the new treatment graduates
from being experimental to being standard practice. The trial
should also terminate early if it appears that a demonstration
of the superiority of the new treatment is quite unlikely. The
new treatment may be more intense and toxic than the old or it
may be more expensive. In either case, patients should not be
subjected to the new treatment if it appears to lack additional
benefits over the old one.
The purpose of this project is to implement exact methods for
planning and analysis of studies with and without early examinations
of the data. The overall methodology is quite clear; the difficulty
in implementation is the presentation of sufficient information
on consequences of choices of stopping points in early looks to
the designer of the study.
The plural in "designs" refers to the fact that all
of these methods involve sorting all possible outcomes of the
trial in order so that the outcomes most favoring the superiority
of the new treatment are on top of the sorted list. Statistical
theory shows that there is no unique best sort order for this
task; a choice between reasonable orders must be made. Fortunately,
comparisons of several orders show them to be generally consistent.
This project is closely related to a collaboration between these
investigators and Dr. J. J. Lee and Mr. D. Serachitopol, investigating
various confidence limits on the difference of two binomial outcomes.
It is an interesting irony that the "exact" methods
provide confidence limits that are wider than necessary; that
is, some "inexact" methods are better. This finding
is analogous to the Fisher "exact" test for the difference
in binomial proportions. The Fisher test is an exact answer to
a question rarely asked in practice and provides worse answers
to the usual question than approximate methods.
- Methods for Dealing with Multiple Hypotheses. Dr. Barry
W. Brown and Ms. Kathy Russell.
The simultaneous investigation of a number of different questions
is common in cancer research. For example, several investigators
are looking at the pattern of joint occurrence of several genetic
alterations suspected to be involved in the cancer process. For
each pair of genetic loci, there is a separate question as to
whether the alterations are related. In quality of life studies,
scores in several psychological dimensions are related to the
outcome of clinical treatment. Each pair of psychological dimensions
and clinical measures must be evaluated for association. A third
example is the examination of a number of patient characteristics
at diagnosis for association with the outcome of treatment.
Calculating statistical significance while ignoring the fact that
a number of questions are being asked leads to erroneous findings
of evidence. If each of 100 independent tests has a probability
of 0.05 of falsely being found significant then there is a probability
greater than 0.99 that at least one of the 100 will be found significant.
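The probability quoted above is 1 - (1 - 0.05)^100, or about 0.994. A short check, with a hypothetical function name:

```python
def prob_at_least_one_false_positive(n_tests, alpha):
    """Chance that at least one of n_tests independent tests is falsely
    significant when each has false-positive probability alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


p_any = prob_at_least_one_false_positive(100, 0.05)
```

The calculation relies on the tests being independent; correlated tests would give a different value.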
In a few cases, specific methods have been devised to deal with
the multiplicity of questions being asked. In other cases, an
overall test exists, determining whether there is evidence that
any of the many results is not due to chance. Unfortunately, these
special cases do not cover the many instances of multiple hypotheses
encountered in cancer research.
One strategy for dealing with multiplicity is to drastically increase
the amount of evidence required to declare a result as more than
chance. Such stringent requirements may well lead to missing important
real effects.
Another strategy, and the one under examination, plots the ordered
probabilities that each result is due to chance against its rank
in the ordering. If the results are independent and all due to
chance, the resulting plot should be a straight line. Deviations
from linearity indicate results that are not likely to be due
to chance. There are both graphic and algorithmic methods for
assessing such graphs and we are adding some of our own. A problem
with all such methods is that their behavior when the results
are not independent but are correlated is unknown. This behavior
is being examined via simulation to determine the generality with
which such methods can be used.
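The plot described above can be sketched as follows. This is a minimal illustration and not the project's code: the function names are hypothetical, the simulated p-values are artificial, and the expected position i/(n+1) is one common plotting convention.

```python
import random


def pvalue_plot_points(p_values):
    """Pair each ordered p-value with its expected value i/(n+1) under the
    hypothesis that all results are independent and due to chance."""
    ordered = sorted(p_values)
    n = len(ordered)
    return [((i + 1) / (n + 1), p) for i, p in enumerate(ordered)]


def max_deviation_from_line(points):
    """Largest vertical distance between the plotted points and the line
    observed = expected."""
    return max(abs(expected - observed) for expected, observed in points)


# Artificial data: 150 chance results plus 50 results with very small p-values.
rng = random.Random(0)
null_p = [rng.random() for _ in range(200)]
mixed_p = null_p[:150] + [p * 0.001 for p in null_p[150:]]
deviation_mixed = max_deviation_from_line(pvalue_plot_points(mixed_p))
```

A maximum deviation is only one crude algorithmic summary; as noted above, the behavior of such summaries when results are correlated still has to be studied by simulation.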
- Goodness of Fit via Smoothing Techniques. Dr. Barry
W. Brown, Dr. Joan Staniswalis, and Mr. Dan Serachitopol.
The determination of the functional relation between a set of covariates
and some outcome variable is one of the most frequently encountered
problems in statistics generally as well as cancer research specifically.
The covariates could be patient characteristics at entry to a
study or the dose of some therapeutic agent and the outcome could
be whether or not there is a complete remission of cancer or survival
time. Or the covariates could be the amount of a particular chemical
generated in cell culture and the outcome the number of cells
with a mutation at a specific locus. Models for such data usually
are chosen for their simplicity and known properties; it is important
to determine whether these models are adequate and if not to improve
them until they are.
This project deals with the problem of ascertaining whether an
assumed form of functional relation is a reasonable representation
of data and, if not, providing suggestions for its improvement.
Standard statistical practice allows a simple model to be compared
to a more complex one to determine whether the complexity is necessary,
but it does not provide a general comprehensive model for comparison
to the one used.
Drs. Staniswalis and Severini published a paper, "Diagnostics
for assessing regression models," in the Journal of the
American Statistical Association, 86:684-691, 1991, that proposed
a nonparametric model obtained by allowing the overall model to
vary continuously with the covariates. This general model is constrained
only by the need to be smooth; no particular functional form is
necessary. The model is estimated using smoothing techniques,
i.e., variations on the idea of a running mean. This estimation
proved difficult in practice; smoothing greatly cut down the range
of covariate values in the fit, making parameter estimation difficult.
Mr. Serachitopol noted that the outcome statistic (e.g., probability
of complete remission) is easy to estimate locally even though
its relation with the covariates is not. Staniswalis' statistic
measuring the difference between the models requires only this
outcome statistic. The outcome statistic can be easily used to
calculate changes in the model required to make the chosen form
match the smoothed result and hence to suggest improvements to
the original model.
We are writing code to allow this technique to be applied to outcomes
that are continuous, probabilities, and survival time. This is
a somewhat arduous task. The asymptotic theory for the distribution
of Dr. Staniswalis' statistic was derived in her paper; however,
experimentation shows that this distribution rarely applies to
the sample sizes seen in practice. Hence simulation (bootstrapping)
under the hypothesis that the original model is good is necessary
to determine the statistical significance of the difference between
the original model and the smoothed one.
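The simulation step can be sketched generically: generate data under the assumed model many times, recompute the statistic each time, and see how often it is at least as large as the observed value. The toy model and statistic below are placeholders chosen only to make the sketch runnable; they are not Dr. Staniswalis' statistic, and all names are hypothetical.

```python
import random


def bootstrap_p_value(observed_stat, simulate_under_model, statistic,
                      n_sim=999, seed=0):
    """Estimate P(statistic >= observed_stat) when the assumed model holds.

    simulate_under_model(rng) returns one simulated data set;
    statistic(sample) returns the test statistic for that data set.
    """
    rng = random.Random(seed)
    exceed = 0
    for _ in range(n_sim):
        if statistic(simulate_under_model(rng)) >= observed_stat:
            exceed += 1
    # Counting the observed value itself avoids reporting a p-value of zero.
    return (exceed + 1) / (n_sim + 1)


# Toy placeholder model: 20 outcomes, each a success with probability 0.5.
def simulate_coin_flips(rng, n=20):
    return [rng.random() < 0.5 for _ in range(n)]


# Toy placeholder statistic: distance of the success count from n/2.
def distance_from_half(sample):
    return abs(sum(sample) - len(sample) / 2)


p_value = bootstrap_p_value(6, simulate_coin_flips, distance_from_half)
```

The number of simulations limits the smallest p-value that can be reported, here 1/1000.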
- Biostatistics Core for: Extensions of radiotherapy research.
Dr. L. Peters, Dr. B. W. Brown, Dr. E. N. Atkinson, and Ms. M.
J. Oswald.
- Biostatistics Core for: A mutational model for childhood cancer.
Dr. L. Strong, Dr. B. W. Brown, and Mr. L. Levy.
- Human tumor cell radiosensitivity vs. radiocurability. Dr.
W. A. Brock and Dr. B. W. Brown.
- Patterns of cancer occurrence in relatives of childhood sarcoma
and osteosarcoma patients. Dr. L. Strong and Dr. B. W. Brown.
- Multiple primary cancer and mutagen-hypersensitivity. Dr.
S. P. Schantz, Dr. B. W. Brown, and Mr. L. Levy.
- Head and neck cancer: clinical impact of natural immunity.
Dr. S. P. Schantz and Dr. B. W. Brown.
- Patterns of incidence of glioblastoma and glioma by race,
sex, and age. Dr. V. A. Levin, Dr. B. W. Brown, and Mr. L. Levy.
- Mathematical models for cell senescence. Dr. W. Brock, Dr.
B. W. Brown, and Dr. E. N. Atkinson.
- Radiosensitivity as a prognostic factor in head and neck cancer.
Dr. W. Brock and Dr. B. W. Brown.
- Phase I/II study of combined modality treatment for resectable
non-small cell superior sulcus tumors. Dr. R. Komaki, Dr. J. B.
Putnam, Dr. J. S. Lee, and Dr. B. W. Brown.
- Kinetics and radiosensitivity as predictors of treatment response
in human cervical cancer. Dr. M. Morris, Dr. W. A. Brock, Dr.
N. Terry, Dr. B. W. Brown, and Dr. D. M. Gershenson.
- Determination of hypoxia in squamous cell carcinomas of the
head and neck. Dr. J. D. Morton, Dr. E. E. Kim, Dr. K. Ang, and
Dr. B. W. Brown.
- Preoperative tumor cell kinetics and rectal carcinoma response
to preoperative 5FU+XRT. Dr. T. A. Rich, Dr. D. C. Hohn, Dr. M.
Meistrich, and Dr. B. W. Brown.
- A phase III randomized study comparing conventional palliative
irradiation (30 Gy/10 fractions) vs. single fraction photon radiotherapy
+/- strontium 89 for painful bone metastases. Dr. D. Podoloff,
Dr. A. Porter, Dr. Payne, Dr. B. W. Brown, and Dr. L. Peters.
- A study of the relation between mucositis and outcome in patients
with head and neck cancer. Dr. F. Geara, Dr. B. W. Brown, and
Mr. L. Levy.
- Diagnosis of CIN from laser fluorescence spectroscopy. Dr.
M. F. Mitchell, Dr. R. Richards-Kortum, and Dr. E. N. Atkinson.
- Risk factors in breast and ovarian carcinoma. Dr. D. Kieback
and Dr. E. N. Atkinson.
- Clinical trial of 4-HPR and tamoxifen in breast neoplasia.
Dr. K. Dhingra and Dr. E. N. Atkinson.
- Evaluation of chemotherapeutic regimens for treatment of gynecologic
malignancies. Dr. R. Freedman, Dr. D. Gershenson, Dr. M. F. Mitchell,
and Dr. E. N. Atkinson.
- Studies of administration of viral oncolysate (virus modified
tumor extract) to patients with ovarian carcinoma. Dr. R. Freedman
and Dr. E. N. Atkinson.
- Studies of two argyrophilic nucleolar organizer region counting
methods. Dr. W. A. Mourad and Dr. E. N. Atkinson.
- Intrinsic resistance to anticancer agents in murine pancreatic
adenocarcinoma. Dr. J. A. Nelson and Dr. E. N. Atkinson.
- Predicting acute graft rejection in renal transplantation.
Dr. J. Grevel and Dr. E. N. Atkinson.
The Cancer Information Systems Resource is composed of members
of the Section of Computer Science of the Department of Biomathematics.
It provides scientific computational abilities to meet the diverse
needs of the institution's cancer researchers in addition to direct
collaboration. This Resource has been a continuously funded shared
resource from the inception of M. D. Anderson's Cancer Center
Support Grant in 1979.
In addition to their activities on this Resource, the faculty
have their own research projects -- usually arising from issues
noted during the course of consultation -- and actively participate
in educational activities.
This Resource provides the scientific computing capabilities needed
by the institution's cancer researchers. Powerful and easy-to-use
software that meets the institution's diverse research needs
is acquired where possible and written otherwise. Code created
by this Resource is made freely available to any researcher anywhere
who desires it. Interested readers can obtain these packages by
anonymous ftp to odin.mda.uth.tmc.edu (129.106.3.17). A description
of the available packages is in /pub/index. Code for the S statistical
system is submitted to statlib and is not placed on our ftp account.
Statlib can be accessed by ftp at lib.stat.cmu.edu (128.2.241.142);
the user name is statlib, and your mail address serves as the password.
Listed here are some of the development efforts of the previous two years.
- STPLAN - Performs Sample Size, Power, and Related Calculations.
Dr. Barry W. Brown, Mr. James Lovato, and Mr. Chris Brauner.
STPLAN calculates the sample size required to produce a specific
power. The package is symmetric in that it can calculate any one
of the following when the others are specified: sample size, minimal
detectable difference, type one error (significance level), and
power. Most of the commonly encountered clinical test situations
are incorporated into the calculations of STPLAN.
The most important recent improvements in STPLAN are a full written
report of the test conditions for which calculations are made and
the ability to calculate tables of values with a single command.
New capabilities are added to STPLAN as the need is made evident
in the course of collaborations of our group. Recent additions
include calculations for changes of a binomial parameter for subjects
above and below the median value of a continuous variable, calculations
for the log-normal distribution, and for two normal distributions
with differing variances. Documentation of all methods used,
including the derivation of most of them, is being completed.
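The symmetry described above can be illustrated with a minimal Python sketch. It is not STPLAN code: it assumes the simplest textbook case, a two-sided two-sample z-test with a known common standard deviation and equal group sizes, and the function names are hypothetical. Given any three of sample size, detectable difference, significance level, and power, the fourth follows from the same normal approximation.

```python
from math import ceil, sqrt
from statistics import NormalDist

_Z = NormalDist()  # standard normal distribution


def power_two_sample(n, delta, sigma, alpha):
    """Approximate power with n subjects per group.

    Ignores the (usually negligible) chance of rejecting in the wrong tail.
    """
    z_crit = _Z.inv_cdf(1 - alpha / 2)
    shift = abs(delta) / (sigma * sqrt(2.0 / n))
    return _Z.cdf(shift - z_crit)


def sample_size_two_sample(delta, sigma, alpha, power):
    """Smallest per-group n that reaches the requested power."""
    z_a = _Z.inv_cdf(1 - alpha / 2)
    z_b = _Z.inv_cdf(power)
    return ceil(2 * ((z_a + z_b) * sigma / delta) ** 2)


def detectable_difference(n, sigma, alpha, power):
    """Smallest difference in means detectable with n subjects per group."""
    z_a = _Z.inv_cdf(1 - alpha / 2)
    z_b = _Z.inv_cdf(power)
    return (z_a + z_b) * sigma * sqrt(2.0 / n)


if __name__ == "__main__":
    n = sample_size_two_sample(delta=0.5, sigma=1.0, alpha=0.05, power=0.80)
    print("per-group sample size:", n)
    print("power at that size:", power_two_sample(n, 0.5, 1.0, 0.05))
    print("detectable difference:", detectable_difference(n, 1.0, 0.05, 0.80))
```

The three functions are rearrangements of one relation, which is the sense in which such a calculation is symmetric. The test situations STPLAN actually covers are more varied than this single case.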
- Asymptotic Sample Size Calculations. Dr. Barry W. Brown and
Mr. James Lovato.
A set of programs for performing asymptotic sample size calculations
(using likelihood methods) has been written in the S statistical
language. These methods provide required sample size for cases
that are too complex for STPLAN; for example, detecting a quadratic
term in logistic regression. These likelihood methods are extremely
general.
- Stukel's Generalization of Logistic Regression. Dr. Barry
W. Brown and Mr. Dan Serachitopol.
The Stukel generalization of logistic regression has been implemented
in the S system. This generalization parameterizes the link function
and provides a very flexible set of models for fitting binary
response data -- a very common case in cancer research. In a large
number of cases, the fit is significantly better than that of the
logistic model.
- Cumulative Distribution Functions, Inverses, etc. Dr. Barry
W. Brown, Mr. James Lovato, and Ms. Kathy Russell.
DCDFLIB is a collection of routines, available in Fortran and in C, that
compute in double precision the cumulative distribution functions,
their inverses, and their parameters for the common statistical
distributions listed below. DCDFLIB uses published algorithms
cited in its documentation.
Values associated with a statistical distribution include X, the
upper limit of integration of the density, P, the cumulative distribution
function evaluated at X, and auxiliary parameters such as degrees
of freedom. Given all but one of these values, a routine in DCDFLIB
will calculate the value that was not specified.
Routines are provided for the following distributions: (1) Beta,
(2) Binomial, (3) Chi-square, (4) Noncentral Chi-square, (5) F,
(6) Noncentral F, (7) Gamma, (8) Negative Binomial, (9) Normal,
(10) Poisson, (11) Student's t.
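The "given all but one, compute the other" idea can be sketched in Python for the normal distribution. This is an illustration only and does not reproduce DCDFLIB's interface or algorithms: the function name solve_normal and its keyword arguments are hypothetical, and the normal case is chosen because each unknown has a closed form in terms of the standard normal quantile.

```python
from statistics import NormalDist


def solve_normal(p=None, x=None, mean=None, sd=None):
    """Return whichever of p, x, mean, sd was left as None.

    p is the cumulative probability at x for a normal(mean, sd) variable.
    """
    given = {"p": p, "x": x, "mean": mean, "sd": sd}
    if sum(v is None for v in given.values()) != 1:
        raise ValueError("exactly one of p, x, mean, sd must be None")

    if p is None:
        return NormalDist(mean, sd).cdf(x)

    z = NormalDist().inv_cdf(p)  # standard normal quantile for p
    if x is None:
        return mean + sd * z
    if mean is None:
        return x - sd * z

    # Solving for sd: x = mean + sd * z, so sd = (x - mean) / z.
    if z == 0:
        raise ValueError("sd is not determined when p is 0.5")
    result = (x - mean) / z
    if result <= 0:
        raise ValueError("no positive sd is consistent with these values")
    return result


if __name__ == "__main__":
    print(solve_normal(x=1.96, mean=0.0, sd=1.0))    # cumulative probability
    print(solve_normal(p=0.975, mean=0.0, sd=1.0))   # upper limit x
    print(solve_normal(p=0.975, x=3.92, mean=0.0))   # standard deviation
```

For distributions without closed-form inverses, the missing value would have to be found numerically, for example by a root search on the cumulative distribution function.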
- Random Number Generators. Dr. Barry W. Brown, Mr. James Lovato,
and Ms. Kathy Russell.
RANLIB is a collection of routines that provide generators of
random numbers from a variety of distributions. RANLIB uses published
algorithms cited in its documentation. Both Fortran and C versions
are available.
The uniform generator uses an algorithm of L'Ecuyer and Cote to
provide 32 virtual random number generators. Each generator contains
1,048,576 blocks of numbers, and each block is of length 1,073,741,824.
Any generator can be set to the beginning or end of the current
block. Packaging is provided so that if these capabilities are
not needed, a single generator with period 2.3 x 10^18 is seen.
Using this uniform generator, routines are provided that return:
(1) Beta random deviates, (2) Chi-square random deviates, (3)
Exponential random deviates, (4) F random deviates, (5) Gamma
random deviates, (6) Multivariate normal random deviates (mean
and covariance matrix specified), (7) Noncentral chi-square random
deviates, (8) Noncentral F random deviates, (9) Univariate normal
random deviates, (10) Random permutations of an integer array,
(11) Real uniform random deviates between specified limits, (12)
Binomial random deviates, (13) Poisson random deviates, (14) Integer
uniform deviates between specified limits, (15) Multinomial random
deviates, (16) Negative binomial random deviates, and (17) Seeds
for the random number generator calculated from a user provided
character string.
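The block structure of the uniform generator can be sketched in Python. This is not RANLIB code or its interface. It assumes the constants of L'Ecuyer's 1988 combined multiplicative congruential generator, whose period matches the figure quoted above, and it assumes that virtual generators are spaced one full set of blocks apart. The class name, function name, and default seeds are hypothetical.

```python
M1, A1 = 2147483563, 40014   # first component generator (assumed constants)
M2, A2 = 2147483399, 40692   # second component generator (assumed constants)
BLOCK_LEN = 2 ** 30          # 1,073,741,824 numbers per block
BLOCKS_PER_GEN = 2 ** 20     # 1,048,576 blocks per virtual generator


class CombinedLCG:
    """Two multiplicative congruential generators combined by subtraction."""

    def __init__(self, seed1, seed2):
        # seed1 must lie in [1, M1 - 1] and seed2 in [1, M2 - 1].
        self.s1, self.s2 = seed1, seed2

    def advance(self, steps):
        """Jump ahead by `steps` draws using modular exponentiation."""
        self.s1 = self.s1 * pow(A1, steps, M1) % M1
        self.s2 = self.s2 * pow(A2, steps, M2) % M2

    def next_int(self):
        """Return the next integer in [1, M1 - 1]."""
        self.s1 = A1 * self.s1 % M1
        self.s2 = A2 * self.s2 % M2
        z = self.s1 - self.s2
        return z + (M1 - 1) if z < 1 else z

    def uniform(self):
        """Return the next deviate strictly between 0 and 1."""
        return self.next_int() / M1


def virtual_generator(index, seed1=1234567890, seed2=123456789):
    """Return generator number `index` (0 to 31), positioned at its first block."""
    gen = CombinedLCG(seed1, seed2)
    gen.advance(index * BLOCKS_PER_GEN * BLOCK_LEN)
    return gen


if __name__ == "__main__":
    g0, g1 = virtual_generator(0), virtual_generator(1)
    print([round(g0.uniform(), 6) for _ in range(3)])
    print([round(g1.uniform(), 6) for _ in range(3)])
```

Because jumping ahead costs only a modular exponentiation, moving a stream to the start of another block is cheap. That is what makes independent, reproducible streams practical for simulation studies.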
- Accelerated Failure Model. Dr. Barry W. Brown, Dr. F. Martin
Spears, Mr. James Lovato, Mr. L. Levy, and Dr. E. Neely Atkinson.
This model posits that time to failure accelerates or decelerates
for each subject as a function of the values of a set of covariates.
The model is parametric, and so provides a nice complement to
the nonparametric proportional hazards model that is commonly
used to analyze time to event data. The model was published in
1981 and its advantages are well known to statisticians. However,
it is little used in practice due to the lack of code for fitting
it to data. Numerous serious numerical problems are encountered
in fitting the model. Writing robust code has been a long-term effort,
but the current version of the program appears to work reasonably
well.
This program, ACCFLF, fits the accelerated failure model to data with or
without covariates. Its capabilities include fitting the general
model or any submodel with p and q fixed. It also can automatically
fit to data a set of models specified by a rectangular grid in
(p,q) coordinates or a set of named models. Several options are
provided to add covariates to an existing model. Finally, ACCFLF
will calculate time to event probabilities for accelerated failure
models using either times from the current data or a user supplied
list.
Recent Publications
- Abu-Farsakh, H.A., Katz, R.L., Atkinson, N., Champlin,
R.E. Prognostic factors in bronchoalveolar lavage in 77 patients
with bone marrow transplants. Acta Cytologica, in press.
- Atkinson, E.N. Computing A^T A - B^T B = L^T D L using generalized
hyperbolic Householder transformations. Linear Algebra and Its Applications,
194:135-147, 1993.
- Atkinson, E.N. Interactive dynamic graphics
for exploratory survival analysis. The American Statistician,
in press.
- Bondy, M.L., Strom, S.S., Colopy, M.W., Brown, B.W.,
and Strong, L. Accuracy of family history of cancer obtained through
interviews with relatives of patients with childhood sarcoma. Journal
of Clinical Epidemiology, 47:89-96, 1994.
- Brown, B.W. STPLAN. Computational Statistics and
Data Analysis, 17:597-598, 1994.
- Brown, B.W., Brauner, C., and Minnotte,
M.C. Noncancer deaths in white adult cancer patients. Journal
of the National Cancer Institute, 85:979-987, 1993.
- Brown, B.W. and Levy, L. Certification of Algorithm
708: Significant Digit Computation of the incomplete Beta. ACM
Transactions on Mathematical Software, 20:393-397, 1994.
- Brown, B.W., Lovato, J., and Russell, K. RANLIB - random
number generation library (C and F77): Version 1.1. Computational
Statistics and Data Analysis, 17:598, 1994.
- Freedman, R.S., Bowen, J.M., Delcos, L., Edwards, C., Wallace,
S., Atkinson, E.N., Ioannides, C.G., Kasi, L.P., Scott,
W., and Patenia, R. Active intralymphatic immunotherapy of uterine
cervical carcinoma with viral oncolysate: A pilot study. International
Journal of Gynecological Cancer, 4:101-110, 1994.
- Freedman, R.S., Edwards, C.L., Kavanagh, J.J., Kudelka, A.P.,
Katz, R.L., Carrasco, C.H., Atkinson, E.N., Scott, W.,
Tomasovic, B., Templin, S., and Platsoucas, C.D. Intraperitoneal
adoptive immunotherapy of ovarian carcinoma with tumor infiltrating
lymphocytes and low-dose recombinant Interleukin-2: A pilot study.
Journal of Immunotherapy, 16:198-210, 1994.
- Freedman, R.S., Tomasovic, B., Templin, S., Atkinson, E.N.,
Kudelka, A., Edwards, C.L. and Platsoucas, C.D. Large-scale expansion
in Interleukin-2 of tumor-infiltrating lymphocytes from patients
with ovarian carcinoma for adoptive immunotherapy. Journal
of Immunological Methods, 167:145-160, 1994.
- Gershenson, D.M., Mitchell, M.F., Atkinson, N., Silva,
E.G., Burke, T.W., Morris, M., Kavanagh, J.J., Warner, D., and
Wharton, J.T. Age contrasts in patients with advanced epithelial
ovarian cancer. The M.D. Anderson Cancer Center Experience. Cancer,
71:638-43, 1993.
- Gershenson, D.M., Silva, E.G., Mitchell, M.F., Atkinson,
E.N., and Wharton, J.T. Transitional cell carcinoma of the
ovary: A matched control study of advanced-stage patients treated
with cisplatin-based chemotherapy. American Journal of Obstetrics
and Gynecology, 168:1178-1186, 1993.
- Joshi, J. H., Newman, K. A., Brown, B.W., Finley, R.S.,
Ruxer, R.L., Moody, M.A. and Schimpff, S.C. Double β-lactam
regimen compared to an aminoglycoside/β-lactam regimen as
empiric antibiotic therapy for febrile granulocytopenic cancer
patients. Supportive Care in Cancer, 1:186-294, 1993.
- Kieback, D.G., McCamant, S.K., Press, M.F., Atkinson, E.N.,
Gallager, H.S., Edwards, C.L., Hajek, R. A. and Jones, L.A. Improved
prediction of survival in advanced adenocarcinoma of the ovary
by immunocytochemical analysis and the composition adjusted receptor
level of the estrogen receptor. Cancer Research, 53:5188-5192,
1993.
- Kieback, D.G., Press, M.F., Atkinson, E.N., Edwards,
G.L., Mobus, V.J., Runnebaum, I.B., Kreienberg, R. and Jones,
L.A. Prognostic significance of estrogen receptor expression in
ovarian cancer. Immunoreactive score (IRS) vs. composition adjusted
receptor level (CARL). Anticancer Research, 13:2489-2496,
1993.
- Koulos, J.P., Wright, T.C., Mitchell, M.F., Silva, E., Atkinson,
E.N., and Richart, R.M. Relationships between c-Ki-ras mutations,
HPV types, and prognostic indicators in invasive endocervical
adenocarcinomas. Gynecologic Oncology, 48:364-369, 1993.
- Matthews, C.M., Burke, T.W., Tornos, C., Eifel, P.J., Atkinson,
E.N., Stringer, C.A., Morris, M., and Silva, E.G. Stage I
cervical adenocarcinoma: Prognostic evaluation of surgically treated
patients. Gynecologic Oncology, 49:19-23, 1993.
- Miller, B., Morris, M., Rutledge, F., Mitchell, M.F., Atkinson,
E.N., Burke, T.W., and Wharton, J.T. Aborted exenterative
procedures in recurrent cervical cancer. Gynecologic Oncology,
50:94-99, 1993.
- Morris, M., Gershenson D.M., Burke, T.W., Follen Mitchell,
M., Levenback, C., Atkinson, N., Wharton, J.T. A phase
II study of carboplatin and cisplatin in advanced or recurrent
squamous carcinoma of the uterine cervix. Gynecologic Oncology,
53:234-237, 1994.
- Mourad, W.A., Connelly, J.H., Sembera, D.L., Atkinson,
E.N., and Bruner, J.M. The correlation of two argyrophilic
nucleolar organizer region counting methods with bromodeoxyuridine-labeling
index: a study of metastatic tumors of the brain. Human Pathology,
24:206-210, 1993.
- Patton, T.J., Mitchell, M.F., Atkinson, E.N., Eifel,
P., Gastorf, L., Yancey, C., Miller, D. and Wharton, J.T. Parameters
of small bowel dysfunction in cervical cancer patients undergoing
radiotherapy. International Journal of Gynecological Cancer,
3:175-182, 1993.
- Peters, L.J., Goepfert, H., Ang, K.K., Byers, R.M., Maor,
M.H., Guillamondegui, O., Morrison, W.H., Weber, R.S., Garden,
A.S., Frankenthaler, R.A., Oswald, M.J., and Brown, B.W.
Evaluation of the dose for postoperative radiation therapy of
head and neck cancer: first report of a prospective randomized
trial. International Journal of Radiation Oncology, Biology,
Physics, 26:3-11, 1993.
- Peters, L.J., Withers, R.H., and Brown, B.W. Complicating
issues in complication reporting (Editorial). International
Journal of Radiation Oncology, Biology, Physics, in press.
- Pryzant, R.M., Meistrich, M.L., Wilson, G., Brown, B.W.,
and McLaughlin, P. Long-term reduction in sperm count after chemotherapy
with and without radiation therapy for non-Hodgkin's lymphomas.
Journal of Clinical Oncology, 11:239-247, 1993.
- Robertson, L.E., Estey, E., Kantarjian, H., Koller, C., O'Brien,
S., Brown, B., and Keating, M.J. Therapy-related leukemia
and myelodysplastic syndrome in chronic lymphocytic leukemia.
Leukemia, in press.
- Robins, D.B., Katz, R.L., Swan, Jr., F., Atkinson, E.N.,
Ordonez, N.G., and Huh, Y.O. Immunotyping of lymphoma by fine-needle
aspiration: a comparative study of cytospin preparations and flow
cytometry. American Journal of Clinical Pathology, 101:569-576,
1994.
- Rodriguez, M.A., Fuller, L.M., Zimmerman, S.O., Allen, P.K.,
Brown, B.W., Munsell, M.F., Hagemeister, F.B., McLaughlin,
P., Velasquez, W.S., Swan, Jr., F., and Cabanillas, F.F. Hodgkin's
disease: Study of treatment intensities and incidences of second
malignancies. Annals of Oncology, 4:125-131, 1993.
- Roman, L.D., Morris, M., Mitchell, M.F., Eifel, P.J., Burke,
T.W., and Atkinson, E.N. Prognostic factors for patients
undergoing simple hysterectomy in the presence of invasive cancer
of the cervix. Gynecologic Oncology, 50:179-184, 1993.
- Roth, J.A., Fossella, R., Komaki, R., Ryan, B.M., Putnam,
Jr., J.D., Lee, J.S., Dhingra, H., DeCaro, L., Chasen, M., McGavran,
M., Atkinson, E.N., Hong, W.K. A randomized trial comparing
perioperative chemotherapy and surgery with surgery alone in resectable
stage IIIA non-small-cell lung cancer. Journal of the National
Cancer Institute, 86:673-680, 1994.
- Sneige, N., McNeese, M.D., Atkinson, E.N., Kemp, B.,
Sahin, A., Ayala, A.G. and Ames, F.C. Ductal carcinoma in situ
treated with lumpectomy and irradiation: Histopathologic analysis
of 49 cases with emphasis on risk factors and long-term results.
Human Pathology, in press.
Submitted Publications
- McBride, C.M., Boddie, A.W., and Brown, B. Long term
follow up of node negative breast cancer patients treated only
by regional therapy.
- Lee, J.J., Serachitopol, D.M., and Brown, B.W. Likelihood
weighted confidence intervals for the difference of two binomial
probabilities.
- Brown, B.W., Spears, F.M., Levy, L.B., Lovato, J.
and Russell, K. Algorithm XXX: LLDRLF, log-likelihood and some derivatives for
log-F models.
- Spears, F.M., Brown, B.W. and Atkinson, E.N. The effect
of incomplete knowledge of parameter values on single-stage designs
for logistic regression.
- Spears, F.M. and Brown, B.W. Two-stage
designs for logistic regression with incomplete knowledge of parameter
values.