Research

Frontiers in Statistics

Monday, February 28, 2005

Title: Semiparametric Analysis of Longitudinal Data with Informative Observation Times
Speaker: Do-Hwan Park, University of Missouri
Time: 3:00pm‐4:00pm
Place: ENB 108

Abstract

Statistical analysis of longitudinal data is an important topic in a number of applied fields, including epidemiology, public health, and medicine. In general, the information contained in longitudinal data can be divided into two parts. One is the set of observation times, which can be regarded as realizations of an observation process. The other is the set of observed values of the response variable of interest, which can be seen as realizations of a longitudinal or response process. A number of methods have been proposed for analyzing such data, and most of them assume that the two processes are independent. This greatly simplifies the analysis, since one can rely on conditional inference procedures given the observation times.

However, the independence assumption may not hold in some applications. We will consider situations where it fails and propose a semiparametric regression model that allows for dependence between the observation and response processes. Inference procedures are proposed based on the estimating equation approach, and the asymptotic properties of the method are established. The results of simulation studies will be reported, and the method is applied to a bladder cancer study.

Friday, February 25, 2005

Title: Robust Estimation of Mixture Complexity
Speaker: Mi-Ja Woo, University of Georgia, Athens
Time: 2:00pm‐3:00pm
Place: PHY 109

Abstract

Developing statistical procedures to determine the number of components, known as mixture complexity, remains an area of intense research. In many applications, it is important to find the mixture with the fewest components that provides a satisfactory fit to the data. This talk focuses on consistent estimation of the unknown number of components in finite mixture models when the exact form of the component densities is unknown but is postulated to be close to members of some parametric family. Minimum Hellinger distances are used to develop a robust estimator of mixture complexity when all the parameters associated with the model are unknown. The estimator is shown to be consistent. When there is no model misspecification, Monte Carlo simulations for a wide variety of target mixtures illustrate the implementation and performance of the estimator. Robustness of the estimator, examined via model misspecification, shows that, in contrast to an estimator based on Kullback-Leibler distance, the performance is unaffected by model misspecification. An example concerning hypertension is revisited to further illustrate the performance of the estimator.
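As a rough illustration of the two distances contrasted in this abstract, the sketch below computes the Hellinger distance and the Kullback-Leibler divergence between two discrete distributions on a common support. It is not the speaker's estimator, which works with component densities and a minimum-distance criterion; the function names `hellinger` and `kl_divergence` are hypothetical and chosen only for this example.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete distributions.

    p and q are sequences of probabilities on the same support.
    The result always lies in [0, 1].
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(total / 2.0)


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) for discrete distributions.

    Returns math.inf when q assigns zero mass to a point where p does not.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for a, b in zip(p, q):
        if a == 0.0:
            continue  # terms with p_i = 0 contribute nothing
        if b == 0.0:
            return math.inf
        total += a * math.log(a / b)
    return total
```

One way to read the contrast: the Hellinger distance is bounded by 1 even for distributions with disjoint support, whereas the Kullback-Leibler divergence can be infinite, which is one common intuition for why Hellinger-based procedures tend to be less sensitive to small departures from the assumed model.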

Wednesday, February 23, 2005

Title: General Convex Stochastic Orderings and Related Martingale-type Structures
Speaker: Francisco Vera, University of South Carolina
Time: 3:00pm‐4:00pm
Place: LIF 267

Abstract

Over-dispersion of a population relative to a fitted baseline model can be accounted for in various ways. For example, one way is by using a mixture over the family of baseline models. Another is via a martingale structure if the Total Time on Test (TTT) Transform of the population “dominates” that of the baseline model. Here these latter ideas are extended to stochastic orderings in terms of Tchebycheff systems and related to a martingale-type structure, called a $$k$$-mart, between the population and the baseline model. These ideas are illustrated for a binomial baseline model using the Saxony 1876-85 sibship census for families with twelve siblings. In addition, the construction of a “most identical” distribution in the case of a $$1$$-mart is presented.
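To make the notion of over-dispersion relative to a binomial baseline concrete, here is a minimal sketch that compares the observed variance of a count with the variance implied by a fitted binomial model. This is a simple variance-ratio diagnostic, not the TTT-transform or $$k$$-mart construction from the talk; the function name `binomial_dispersion_ratio` and the frequency tables used to exercise it are hypothetical and are not the Saxony data.

```python
def binomial_dispersion_ratio(freq, n):
    """Ratio of observed variance to fitted binomial variance.

    freq maps k (number of successes out of n trials) to the number of
    units observed with that count. A ratio near 1 is consistent with the
    binomial baseline; a ratio above 1 suggests over-dispersion.
    """
    total = sum(freq.values())
    if total == 0:
        raise ValueError("freq must contain at least one observation")
    mean = sum(k * c for k, c in freq.items()) / total
    # Population variance (divides by the number of units, not by total - 1).
    var = sum(c * (k - mean) ** 2 for k, c in freq.items()) / total
    p_hat = mean / n  # fitted success probability of the binomial baseline
    baseline_var = n * p_hat * (1.0 - p_hat)
    if baseline_var == 0.0:
        raise ValueError("fitted binomial variance is zero")
    return var / baseline_var
```

For instance, with two trials per unit, a table with counts in proportion 1:2:1 for 0, 1, and 2 successes matches a binomial with success probability 0.5 and gives a ratio of 1, while a table with all units at 0 or 2 successes gives a ratio of 2.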

Friday, February 4, 2005

Title: Analysis of Gene Expression Data and Chemosensitivity Prediction
Speaker: Florence George
Time: 3:00pm‐4:00pm
Place: PHY 109

Abstract

Microarrays are part of a new class of biotechnologies that allow the monitoring of expression levels for thousands of genes simultaneously. Microarray technology can provide important insights into the genetic basis of many biological questions. We discuss computational methods for four important tasks: (1) the identification of differentially expressed genes, (2) the discovery of clusters of differentially expressed genes, (3) the identification of features from the clusters, and (4) the classification of biological samples.

The study concerns gene expression levels of 55 advanced-stage ovarian cancer patients. Of these, 33 showed a complete response to chemotherapy, while the remaining 22 had progressive disease at the completion of therapy. We sought to determine whether the gene expression levels were sufficient for the prediction of chemosensitivity.

Friday, January 21, 2005

Title: Review of Extreme Value Distributions with Examples
Speaker: Lakshminarayan Rajaram
Time: 3:00pm‐4:00pm
Place: PHY 109

Abstract

Extreme value theory has become one of the most important statistical disciplines of the last few decades. A distinguishing feature of extreme value analysis is its objective: to quantify the stochastic behavior of a process at unusually large (or small) levels. The cornerstone of the theory is the three-types theorem of Fisher and Tippett, which asserts that only three types of distributions can arise as limiting distributions of extreme values in random samples.

This seminar reviews extreme value distributions, especially the Generalized Extreme Value (GEV) and Generalized Pareto (GPD) distributions, with examples applied to existing real data on rainfall and sea levels.
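For readers unfamiliar with the two families, the following sketch writes out their distribution functions in the standard location-scale-shape parameterization. The parameter names `mu`, `sigma`, and `xi` and the function names are assumptions made for this example and are not taken from the talk. The sign of the shape parameter `xi` selects among the three limiting types: zero gives the Gumbel type, positive values the Fréchet type, and negative values the Weibull type.

```python
import math


def gev_cdf(x, mu=0.0, sigma=1.0, xi=0.0):
    """CDF of the Generalized Extreme Value distribution at x."""
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    if xi == 0.0:
        # Gumbel type, the limit of the general formula as xi -> 0.
        return math.exp(-math.exp(-z))
    t = 1.0 + xi * z
    if t <= 0.0:
        # Outside the support: below the lower endpoint when xi > 0,
        # above the upper endpoint when xi < 0.
        return 0.0 if xi > 0.0 else 1.0
    return math.exp(-(t ** (-1.0 / xi)))


def gpd_cdf(y, sigma=1.0, xi=0.0):
    """CDF of the Generalized Pareto distribution for an excess y >= 0."""
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    if y <= 0.0:
        return 0.0
    if xi == 0.0:
        # Exponential distribution, the limit as xi -> 0.
        return 1.0 - math.exp(-y / sigma)
    t = 1.0 + xi * y / sigma
    if t <= 0.0:
        # Only possible when xi < 0: y is beyond the upper endpoint -sigma/xi.
        return 1.0
    return 1.0 - t ** (-1.0 / xi)
```

In practice the GEV is typically fitted to block maxima (for example, annual maximum rainfall) and the GPD to excesses over a high threshold; fitting either to data would require an estimation step that this sketch does not include.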

Possible applications of extreme value theory to pharmacokinetics will also be discussed, in particular modeling the maximum drug concentration in blood after the infusion of a drug, together with appropriate covariates.

Friday, January 14, 2005

Title: How to Perform an Analysis of Variance Procedure in S-Plus
Speaker: George Kimber
Time: 3:00pm‐4:00pm
Place: PHY 109

Abstract

The commands in S-Plus that generate the one-way ANOVA, the factorial ANOVA, and the nonparametric ANOVA will be demonstrated using datasets from several disciplines. Illustrations of how to test the underlying assumptions will be presented. Several of the post-hoc procedures will also be reviewed. A general discussion of the rationale for and the interpretation of the Analysis of Variance procedures and their related tests will also be conducted.
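The talk demonstrates these procedures in S-Plus; the sketch below is not S-Plus code but a small Python illustration of the quantity a one-way ANOVA computes, namely the F statistic formed from the between-group and within-group mean squares. The function name `one_way_anova_f` is hypothetical, and converting the statistic to a p-value would additionally require the F distribution, which is omitted here.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    groups is a list of lists, one list of numeric observations per group.
    """
    k = len(groups)
    if k < 2 or any(len(g) == 0 for g in groups):
        raise ValueError("need at least two non-empty groups")
    n = sum(len(g) for g in groups)
    if n <= k:
        raise ValueError("need more observations than groups")

    group_means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n

    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Within-group sum of squares: spread of observations around their group mean.
    ss_within = sum(
        sum((x - m) ** 2 for x in g) for g, m in zip(groups, group_means)
    )
    if ss_within == 0.0:
        raise ValueError("within-group variation is zero; F is undefined")

    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within
```

As a made-up example, the three groups `[1, 2, 3]`, `[2, 3, 4]`, and `[3, 4, 5]` have between-group and within-group sums of squares both equal to 6, with 2 and 6 degrees of freedom respectively, so the F statistic is 3.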