Analysis of Sparse Sufficient Dimension Reduction Models

Author: Yeshan Withanage
Publisher:
ISBN:
Category : Dimension reduction (Statistics)
Languages : en
Pages : 0

Book Description
Sufficient dimension reduction (SDR) in regression analysis with response variable y and predictor vector x focuses on reducing the dimension of x to a small number of linear combinations of the components of x. Since the introduction of the inverse regression method, SDR has become a very active topic in the literature. When the dimension p of x increases with the number of observations n, the traditional SDR methods may not perform well. The purpose of this study is twofold: theoretical and empirical. In the theoretical analysis, I provide a proof of the consistency of a variable selection procedure in sparse single-index models (a special SDR model) through an inverse regression method called CUME. For the case of multiple linear regression, I obtain the influence functions for estimators of the parameter vector with SCAD and MCP penalties by extending the idea of the LASSO influence function. In the empirical part, I combine the LASSO-SIR algorithm with the influence function of LASSO to construct a new metric for choosing the penalty parameter for variable selection, as an alternative to the usual cross-validation method. The empirical analysis shows that the newly proposed influence-function-based measure outperforms traditional cross-validation in a wide range of settings. Finally, I also propose an algorithm to estimate the structural dimension d of SDR models with large dimension p.
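For readers unfamiliar with CUME, the following is a minimal NumPy sketch of the plain (unpenalized) cumulative slicing estimator of Zhu et al. (2010) that the dissertation builds on; the sparse variable-selection layer and the influence-function-based penalty choice that are the thesis's contributions are not reproduced, and the function name cume_directions is our own.

```python
import numpy as np

def cume_directions(X, y, d=1):
    """Plain (unpenalized) CUME sketch. The kernel M = E[m(Y) m(Y)^T],
    with m(t) = E[(X - EX) 1{Y <= t}], is estimated by averaging over
    the observed responses; directions solve the generalized
    eigenproblem M b = lambda Cov(X) b."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    ind = (y[:, None] <= y[None, :]).astype(float)  # ind[i, j] = 1{y_i <= y_j}
    m_hat = ind.T @ Xc / n                          # row j = empirical m(y_j)
    M = m_hat.T @ m_hat / n                         # CUME kernel matrix
    Sigma = np.cov(X, rowvar=False)
    evals, evecs = np.linalg.eig(np.linalg.solve(Sigma, M))
    order = np.argsort(evals.real)[::-1]
    return evecs[:, order[:d]].real                 # estimated central-subspace basis
```

For a single-index model, as considered in the abstract, the leading eigenvector (d = 1) estimates the index direction up to scale.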

Bayesian Model Averaging Sufficient Dimension Reduction

Author: Michael Declan Power
Publisher:
ISBN:
Category :
Languages : en
Pages : 56

Book Description
In sufficient dimension reduction (Li, 1991; Cook, 1998b), original predictors are replaced by their low-dimensional linear combinations while preserving all of the conditional information of the response given the predictors. Sliced inverse regression [SIR; Li, 1991] and principal Hessian directions [PHD; Li, 1992] are two popular sufficient dimension reduction methods, and both SIR and PHD estimators involve all of the original predictor variables. To deal with the cases when the linear combinations involve only a subset of the original predictors, we propose a Bayesian model averaging (Raftery et al., 1997) approach to achieve sparse sufficient dimension reduction. We extend both SIR and PHD under the Bayesian framework. The superior performance of the proposed methods is demonstrated through extensive numerical studies as well as a real data analysis.
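As background, here is a minimal sketch of the classical (non-Bayesian) SIR estimator of Li (1991) that the thesis extends; the Bayesian model averaging machinery itself is not reproduced, and the function name sir_directions is our own.

```python
import numpy as np

def sir_directions(X, y, d=1, n_slices=10):
    """Minimal sliced inverse regression (SIR; Li, 1991): standardize X,
    average the standardized predictors within each response slice, and
    eigen-decompose the weighted covariance of the slice means."""
    n, p = X.shape
    Sigma = np.cov(X, rowvar=False)
    L = np.linalg.cholesky(Sigma)                       # Sigma = L L^T
    Z = np.linalg.solve(L, (X - X.mean(axis=0)).T).T    # standardized predictors
    edges = np.quantile(y, np.linspace(0, 1, n_slices + 1))
    M = np.zeros((p, p))
    for s in range(n_slices):
        upper = (y < edges[s + 1]) if s < n_slices - 1 else (y <= edges[s + 1])
        mask = (y >= edges[s]) & upper
        if not mask.any():
            continue
        zbar = Z[mask].mean(axis=0)                     # slice mean of Z
        M += mask.mean() * np.outer(zbar, zbar)         # weight by slice proportion
    _, evecs = np.linalg.eigh(M)
    B = evecs[:, ::-1][:, :d]                           # top-d eigenvectors
    return np.linalg.solve(L.T, B)                      # back to the original X scale
```

Note that, as the abstract points out, every coordinate of the returned directions is generically nonzero; the Bayesian model averaging approach is what introduces sparsity over subsets of predictors.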

Sparse Group Sufficient Dimension Reduction and Covariance Cumulative Slicing Estimation

Author: Bilin Zeng
Publisher:
ISBN:
Category : Analysis of covariance
Languages : en
Pages : 115

Book Description
"This dissertation contains two main parts: In Part One, for regression problems with grouped covariates, we adopt the idea of sparse group lasso (Friedman et al., 2010) to the framework of the sufficient dimension reduction. We propose a method called the sparse group sufficient dimension reduction (sgSDR) to conduct group and within group variable selections simultaneously without assuming a specific model structure on the regression function. Simulation studies show that our method is comparable to the sparse group lasso under the regular linear model setting, and outperforms sparse group lasso with higher true positive rates and substantially lower false positive rates when the regression function is nonlinear or (and) the error distributions are non-Gaussian. One immediate application of our method is to the gene pathway data analysis where genes naturally fall into groups (pathways). An analysis of a glioblastoma microarray data is included for illustration of our method. In Part Two, for many-valued or continuous Y, the standard practice of replacing the response Y by a discrete version of Y usually results in the loss of power due to the ignorance of intra-slice information. Most of the existing slicing methods highly reply on the selection of the number of slices h. Zhu et al. (2010) proposed a method called the cumulative slicing estimation (CUME) which avoids the otherwise subjective selection of h. In this dissertation, we revisit CUME from a different perspective to gain more insights, and then refine its performance by incorporating the intra-slice covariances. The resulting new method, which we call the covariance cumulative slicing estimation (COCUM), is comparable to CUME when the predictors are normally distributed, and outperforms CUME when the predictors are non-Gaussian, especially in the existence of outliers. The asymptotic results of COCUM are also well proved."--Abstract, page iv.

Dimension Reduction and Sufficient Graphical Models

Author: Kyongwon Kim
Publisher:
ISBN:
Category :
Languages : en
Pages :

Book Description
The methods I develop in my thesis are based on linear or nonlinear sufficient dimension reduction. The basic principle of linear sufficient dimension reduction is to extract a small number of linear combinations of the predictor variables that represent the original predictors without loss of information about the conditional distribution of the response given the predictors. Nonlinear sufficient dimension reduction generalizes this principle to the nonlinear context. I focus on applying sufficient dimension reduction methods to two areas: regression modeling and graphical models. The first project concerns statistical inference in the regression context after sufficient dimension reduction. In the second, I apply nonlinear sufficient dimension reduction to well-known statistical graphical models in machine learning. The projects share a common theme: discovering areas where sufficient dimension reduction can be applied and establishing the statistical theory behind those applications.

My first project is about post-dimension-reduction statistical inference. The methodologies of sufficient dimension reduction have undergone extensive development in the past three decades. However, there has been a lack of systematic and rigorous development of post-dimension-reduction inference, which has seriously hindered its applications. The current common practice is to treat the estimated sufficient predictors as the true predictors and use them as the starting point of the downstream statistical inference. However, this naive approach would grossly overestimate the confidence level of an interval, or the power of a test, leading to distorted results. In this project, we develop a general and comprehensive framework of post-dimension-reduction inference, which can accommodate any dimension reduction method and model building method, as long as their corresponding influence functions are available. Within this general framework, we derive the influence functions and present explicit post-reduction formulas for combinations of numerous dimension reduction and model building methods. We then develop post-reduction inference methods for both confidence intervals and hypothesis testing. We investigate the finite-sample performance of our procedures by simulations and a real data analysis.

My second project applies nonlinear dimension reduction techniques to graphical models. We introduce the Sufficient Graphical Model by applying recently developed nonlinear sufficient dimension reduction techniques to the evaluation of conditional independence. The graphical model is nonparametric in nature, as it does not make distributional assumptions such as the Gaussian or copula Gaussian assumptions. However, unlike a fully nonparametric graphical model, which relies on a high-dimensional kernel to characterize conditional independence, our graphical model is based on conditional independence given a set of sufficient predictors with a substantially reduced dimension. In this way, we avoid the curse of dimensionality that comes with a high-dimensional kernel. We develop the population-level properties, convergence rate, and consistency of our estimate. By simulation comparisons and an analysis of the DREAM 4 Challenge data set, we demonstrate that our method outperforms existing methods when the Gaussian or copula Gaussian assumptions are violated, and that its performance remains excellent in the high-dimensional setting.
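The gap between naive and honest post-reduction inference is easy to see numerically. The toy sketch below uses a nonparametric bootstrap that re-runs the reduction step inside every resample as a simple stand-in for the influence-function-based corrections developed in the thesis (which are not reproduced); the OLS-based direction estimate and all names are illustrative choices of ours.

```python
import numpy as np

rng = np.random.default_rng(0)

def pipeline(X, y):
    """Reduce then fit: estimate one direction (here simply by OLS,
    standing in for an SDR step), form the estimated sufficient
    predictor t = Xc b, and return the downstream slope of y on t."""
    Xc, yc = X - X.mean(0), y - y.mean()
    b = np.linalg.lstsq(Xc, yc, rcond=None)[0]
    b /= np.linalg.norm(b)
    t = Xc @ b
    return (t @ yc) / (t @ t)

# single-index data: y depends on X only through its first coordinate
n, p = 200, 10
X = rng.standard_normal((n, p))
y = X[:, 0] + 0.5 * rng.standard_normal(n)

# naive SE: condition on the estimated predictor t as if it were fixed
Xc, yc = X - X.mean(0), y - y.mean()
b = np.linalg.lstsq(Xc, yc, rcond=None)[0]
b /= np.linalg.norm(b)
t = Xc @ b
slope = (t @ yc) / (t @ t)
resid = yc - slope * t
naive_se = np.sqrt(resid @ resid / (n - 2) / (t @ t))

# bootstrap SE: re-estimate the direction in every resample
boot = [pipeline(X[i], y[i]) for i in (rng.integers(0, n, n) for _ in range(500))]
print(f"naive SE: {naive_se:.4f}  bootstrap SE: {np.std(boot):.4f}")
```

The bootstrap standard error accounts for the sampling variability of the estimated direction, which the naive plug-in calculation ignores; that ignored variability is exactly what the thesis's influence-function framework quantifies analytically.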

Sufficient Dimension Reduction

Author: Bing Li
Publisher: CRC Press
ISBN: 1351645730
Category : Mathematics
Languages : en
Pages : 362

Book Description
Sufficient dimension reduction is a rapidly developing research field with wide applications in regression diagnostics, data visualization, machine learning, genomics, image processing, pattern recognition, and medicine, because these fields produce large data sets with many variables. Sufficient Dimension Reduction: Methods and Applications with R introduces the basic theories and the main methodologies, provides practical and easy-to-use algorithms and computer codes to implement them, and surveys the recent advances at the frontiers of this field.

Features:
- Provides comprehensive coverage of this emerging research field.
- Synthesizes a wide variety of dimension reduction methods under a few unifying principles such as projection in Hilbert spaces, kernel mapping, and von Mises expansion.
- Reflects the most recent advances, such as nonlinear sufficient dimension reduction, dimension folding for tensorial data, and sufficient dimension reduction for functional data.
- Includes a set of computer codes written in R that are easily implemented by the readers.
- Uses real data sets available online to illustrate the usage and power of the described methods.

Sufficient dimension reduction has undergone momentous development in recent years, partly due to the increased demand for techniques to process high-dimensional data, a hallmark of our age of Big Data. This book will serve as the perfect entry into the field for beginning researchers or a handy reference for advanced ones. The author, Bing Li, obtained his Ph.D. from the University of Chicago. He is currently a Professor of Statistics at the Pennsylvania State University. His research interests cover sufficient dimension reduction, statistical graphical models, functional data analysis, machine learning, estimating equations and quasi-likelihood, and robust statistics. He is a fellow of the Institute of Mathematical Statistics and the American Statistical Association. He is an Associate Editor for The Annals of Statistics and the Journal of the American Statistical Association.

A Study of Sufficient Dimension Reduction Methods

Author: Chong Wang
Publisher:
ISBN:
Category :
Languages : en
Pages : 96

Book Description


Dimension Reduction

Author: Christopher J. C. Burges
Publisher: Now Publishers Inc
ISBN: 1601983786
Category : Computers
Languages : en
Pages : 104

Book Description
We give a tutorial overview of several foundational methods for dimension reduction. We divide the methods into projective methods and methods that model the manifold on which the data lies. For projective methods, we review projection pursuit, principal component analysis (PCA), kernel PCA, probabilistic PCA, canonical correlation analysis (CCA), kernel CCA, Fisher discriminant analysis, oriented PCA, and several techniques for sufficient dimension reduction. For the manifold methods, we review multidimensional scaling (MDS), landmark MDS, Isomap, locally linear embedding, Laplacian eigenmaps, and spectral clustering. Although the review focuses on foundations, we also provide pointers to some more modern techniques. We also describe the correlation dimension as one method for estimating the intrinsic dimension, and we point out that the notion of dimension can be a scale-dependent quantity. The Nyström method, which links several of the manifold algorithms, is also reviewed. We use a publicly available dataset to illustrate some of the methods. The goal is to provide a self-contained overview of key concepts underlying many of these algorithms, and to give pointers for further reading.
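Of the ideas surveyed, the correlation dimension is compact enough to sketch. Below is a rough Grassberger-Procaccia-style estimate, assuming SciPy is available; the function name and the radius range are our choices, and the scale dependence noted in the abstract shows up directly in the choice of that range.

```python
import numpy as np
from scipy.spatial.distance import pdist

def correlation_dimension(X, r_lo=0.1, r_hi=1.0, n_r=20):
    """Correlation dimension estimate: the slope of log C(r) versus
    log r, where C(r) is the fraction of point pairs closer than r."""
    d = pdist(X)                                   # all pairwise distances
    rs = np.geomspace(r_lo, r_hi, n_r)
    C = np.array([(d < r).mean() for r in rs])
    keep = C > 0                                   # avoid log(0) at tiny radii
    slope, _ = np.polyfit(np.log(rs[keep]), np.log(C[keep]), 1)
    return slope

# points on a circle embedded in R^3: intrinsic dimension is 1
rng = np.random.default_rng(0)
theta = rng.uniform(0, 2 * np.pi, 2000)
X = np.column_stack([np.cos(theta), np.sin(theta), np.zeros_like(theta)])
print(correlation_dimension(X))   # close to 1 for a suitable radius range
```

Shrinking or enlarging [r_lo, r_hi] changes the answer, which is precisely the sense in which dimension is a scale-dependent quantity.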

Sufficient Dimension Reduction Based on Normal and Wishart Inverse Models

Author: Liliana Forzani
Publisher:
ISBN:
Category :
Languages : en
Pages : 358

Book Description


Covariate Information

Author: Debmalya Nandy
Publisher:
ISBN:
Category :
Languages : en
Pages :

Book Description
In two major parts, as described below, this dissertation presents two novel methods for reducing the dimension of the covariate space in large supervised problems. One performs dimension reduction by identifying linear combinations of the original covariates (we use the terms covariates, features, predictors, and explanatory variables interchangeably) in high-dimensional regressions. The other screens features as a preliminary step of analysis in ultrahigh-dimensional regressions.

In Part A, building upon recent research on the applications of the Density Information Matrix (DIM), we develop a tool for Sufficient Dimension Reduction (SDR) in regression problems called the Covariate Information Matrix (CIM). CIM exhaustively identifies the Central Subspace (CS) and provides a rank ordering of the reduced covariates in terms of their regression information. Compared to other popular SDR methods, CIM does not require distributional assumptions on the covariates, or estimation of the mean regression function. CIM is implemented via eigen-decomposition of a matrix estimated with a previously developed efficient nonparametric density estimation technique. We also propose a bootstrap-based diagnostic plot for estimating the dimension of the CS. Results of simulations and real data applications demonstrate superior or competitive performance of CIM compared to that of some other SDR methods. In its current formulation, CIM is applicable to scenarios where the number of covariates (p) is high but still smaller than the number of sample units (n).

In Part B, we consider a different type of scenario, such as those generated by contemporary high-throughput experimental and surveying techniques. These give rise to ultrahigh-dimensional supervised problems with sparse signals, i.e. a limited number of observations (n), each with a very large number of covariates (p >> n), only a small share of which is truly associated with the response. In these settings, major concerns about computational burden, algorithmic stability, and statistical accuracy call for substantially reducing the feature space by eliminating redundant covariates before the use of any sophisticated statistical analysis. Following the development of Sure Independence Screening (Fan and Lv, 2008) and other model- and correlation-based feature screening methods, we propose a model-free procedure called Covariate Information Number - Sure Independence Screening (CIN-SIS, or CIS in short). Notably, the CIN is the univariate version of the CIM in Part A. CIS uses a marginal utility built upon Fisher Information, possesses the sure screening property, and is applicable to any type of response. Simulations and an application to transcriptomic data on rats reveal CIS's comparative strengths over some popular feature screening methods.

Finally, in Part C, we discuss potential future research avenues stemming from Parts A and B. We consider some existing strategies for sparse sufficient dimension reduction and sparse estimation of large matrices that can be adopted to estimate the CIM in scenarios with p > n. Regarding CIS, we consider iterative versions of feature screening algorithms, leading to iterative CIS. We also discuss some thresholding strategies to determine the cardinality of the set of features selected during screening. In this context, we propose a novel bootstrap-based graphical diagnostic applicable to any feature screening algorithm. Our preliminary simulation results demonstrate the effectiveness of this diagnostic.
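The CIN-based marginal utility is not spelled out in the abstract, but the screening scaffold it plugs into is standard. Here is a minimal SIS-style sketch in the spirit of Fan and Lv (2008), using plain absolute marginal correlation as a stand-in utility (the CIN itself is not reproduced); the retained-set size n/log(n) is a commonly used default, and the function name and data are illustrative only.

```python
import numpy as np

def sis_screen(X, y, keep=None):
    """Marginal screening sketch: rank covariates by a marginal utility
    (here absolute correlation with the response) and keep the top ones."""
    n, p = X.shape
    if keep is None:
        keep = int(n / np.log(n))              # commonly used default size
    Xs = (X - X.mean(0)) / X.std(0)
    ys = (y - y.mean()) / y.std()
    util = np.abs(Xs.T @ ys / n)               # absolute marginal correlations
    return np.argsort(util)[::-1][:keep]       # indices of retained covariates

# ultrahigh-dimensional toy data: p >> n, only 3 active covariates
rng = np.random.default_rng(0)
n, p = 100, 5000
X = rng.standard_normal((n, p))
y = 2 * X[:, 0] - 1.5 * X[:, 1] + X[:, 2] + rng.standard_normal(n)
selected = sis_screen(X, y)
print(sorted(selected[:10]))   # the active indices 0, 1, 2 should typically survive
```

CIS replaces the correlation utility with a Fisher-information-based one, which is what makes it model-free and applicable to any type of response.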

Dimension Reduction and Variable Selection

Author: Hossein Moradi Rekabdarkolaee
Publisher:
ISBN:
Category : Multivariate analysis
Languages : en
Pages :

Book Description
High-dimensional data are becoming increasingly available as data collection technology advances. Over the last decade, significant developments have taken place in high-dimensional data analysis, driven primarily by a wide range of applications in many fields such as genomics, signal processing, and environmental studies. Statistical techniques such as dimension reduction and variable selection play important roles in high-dimensional data analysis. Sufficient dimension reduction provides a way to find the reduced space of the original space without a parametric model, and in recent years it has been widely applied in scientific fields such as genetics, brain imaging analysis, econometrics, and environmental sciences. In this dissertation, we worked on three projects. The first combines local modal regression and Minimum Average Variance Estimation (MAVE) to introduce a robust dimension reduction approach. In addition to being robust to outliers or heavy-tailed distributions, our proposed method has the same convergence rate as the original MAVE. Furthermore, we combine local-modal-based MAVE with an L1 penalty to select informative covariates in a regression setting. This new approach can exhaustively estimate directions in the regression mean function and select informative covariates simultaneously, while being robust to possible outliers in the dependent variable. The second project develops sparse adaptive MAVE (saMAVE). SaMAVE has advantages over the adaptive LASSO because it extends the adaptive LASSO to multi-dimensional and nonlinear settings without any model assumption, and it has advantages over sparse inverse dimension reduction methods in that it does not require any particular probability distribution on X. In addition, saMAVE can exhaustively estimate the dimensions in the conditional mean function. The third project extends the envelope method to multivariate spatial data. The envelope technique is a refinement of the classical multivariate linear model, and the envelope estimator asymptotically has less variation compared to the maximum likelihood estimator (MLE). The current envelope methodology is for independent observations; while the assumption of independence is convenient, it does not address the additional complications associated with spatial correlation. This work extends the envelope method to cases where independence is an unreasonable assumption, specifically multivariate data from spatially correlated processes. This novel approach provides estimates for the parameters of interest with smaller variance compared to the maximum likelihood estimator, while still capturing the spatial structure in the data.
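MAVE itself alternates between estimating directions and refitting local linear smoothers, which is too long to reproduce here; the sketch below shows the closely related outer-product-of-gradients (OPG) estimator of Xia et al. (2002), often used to initialize MAVE. The robust local-modal and penalized variants that are the dissertation's contributions are not reproduced, and the bandwidth rule and function name are our own choices.

```python
import numpy as np

def opg_directions(X, y, d=1, h=None):
    """Outer-product-of-gradients sketch: fit a kernel-weighted local
    linear regression at each sample point, collect the estimated
    gradients, and take the top eigenvectors of their average outer
    product, which span the central mean subspace."""
    n, p = X.shape
    if h is None:
        h = 1.5 * n ** (-1.0 / (p + 6))         # rough rule-of-thumb bandwidth
    Xs = (X - X.mean(0)) / X.std(0)             # standardize the predictors
    M = np.zeros((p, p))
    for j in range(n):
        D = Xs - Xs[j]                          # deviations from the anchor point
        w = np.exp(-(D ** 2).sum(axis=1) / (2 * h ** 2))
        sw = np.sqrt(w)
        A = np.column_stack([np.ones(n), D])    # intercept plus local slopes
        coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
        g = coef[1:]                            # local gradient estimate
        M += np.outer(g, g) / n
    _, evecs = np.linalg.eigh(M)
    return evecs[:, ::-1][:, :d]                # top-d directions
```

MAVE refines these directions by re-solving the local fits within the estimated subspace; the dissertation's first two projects replace the local least-squares fit with a local modal fit for robustness and add penalties for variable selection.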