A Generic Sure Independence Screening Procedure
Introduction
Extracting important features from ultra-high dimensional data is one of the primary tasks in statistical learning, information theory, precision medicine and biological discovery. Many of the sure independence screening methods developed to meet these needs are suited only to special models that satisfy certain assumptions. As more data types and candidate models become available, a model-free generic procedure with fewer and less restrictive assumptions on the data is required. In this paper, we propose a generic nonparametric sure independence screening procedure, called SBI-SIS, on the basis of a recently developed universal dependence measure: standardized ball information. We show that the proposed procedure has strong screening consistency even when the dimensionality is of exponential order in the sample size, without requiring subexponential moment assumptions on the data. We investigate the flexibility of this procedure by considering three commonly encountered challenging settings in biological discovery or precision medicine: iterative SBI-SIS, interaction pursuit, and survival outcomes. We use simulation studies and real data analyses to illustrate the versatility and practicability of our SBI-SIS method.
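To make the general idea of marginal screening concrete, the following is a minimal Python sketch of a rank-and-keep screening loop. It is not the SBI-SIS implementation: standardized ball information is not defined on this page, so the sketch plugs in absolute Pearson correlation as a stand-in dependence measure. The function names `abs_pearson` and `marginal_screen`, the default cutoff of roughly n / log(n) features, and the simulated data are all assumptions made for illustration.

```python
import numpy as np


def abs_pearson(x, y):
    """Stand-in dependence measure (assumption: NOT standardized ball information)."""
    xc = x - x.mean()
    yc = y - y.mean()
    denom = np.sqrt((xc ** 2).sum() * (yc ** 2).sum())
    # A constant feature or response carries no linear signal; score it as 0.
    return 0.0 if denom == 0 else abs(float((xc * yc).sum() / denom))


def marginal_screen(X, y, measure=abs_pearson, keep=None):
    """Score each feature against y marginally and keep the top-ranked ones.

    `measure` is any function (feature, response) -> nonnegative score.
    `keep` defaults to floor(n / log(n)), a common screening heuristic
    (an assumption here, not a value taken from the paper).
    """
    n, p = X.shape
    if keep is None:
        keep = max(1, min(p, int(n / np.log(n))))
    scores = np.array([measure(X[:, j], y) for j in range(p)])
    order = np.argsort(-scores)  # feature indices, highest score first
    return order[:keep], scores


# Hypothetical simulated data: 1000 features, only features 0 and 5 matter.
rng = np.random.default_rng(0)
n, p = 200, 1000
X = rng.standard_normal((n, p))
y = 3.0 * X[:, 0] - 2.0 * X[:, 5] + rng.standard_normal(n)

selected, scores = marginal_screen(X, y)
print("kept", len(selected), "of", p, "features")
print("feature 0 kept:", 0 in selected, "| feature 5 kept:", 5 in selected)
```

Because the dependence measure is passed in as a plain function, a model-free measure could in principle be substituted for `abs_pearson` without changing the screening loop, which is the sense in which such a procedure is generic.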
Software Download
References
1. Pan, W., Wang, X.Q., Xiao, W.N., Zhu, H.T., A Generic Sure Independence Screening Procedure, Submitted.