A Study of the Asymptotic Properties of Lasso Estimates for Correlated Data

A Study of the Asymptotic Properties of Lasso Estimates for Correlated Data

Author: Shuva Gupta
Languages: en

Book Description
ABSTRACT: In this thesis we investigate post-model selection properties of L1 penalized weighted least squares estimators in regression models with a large number of variables M and correlated errors. We focus on correct subset selection and on the asymptotic distribution of the penalized estimators. In the simple case of AR(1) errors we give conditions under which correct subset selection can be achieved via our procedure. We then provide a detailed generalization of this result to models with errors that have a weak-dependency structure (Doukhan 1996). In all cases, the number M of regression variables is allowed to exceed the sample size n. We further investigate the asymptotic distribution of our estimates, when M
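To make the setup concrete, here is a minimal sketch of lasso estimation after an AR(1) prewhitening transform, in the spirit of L1-penalized weighted least squares for correlated errors. This is an illustrative reconstruction, not the thesis's actual procedure: the known AR(1) coefficient, the simulated design, and the fixed penalty level are all assumptions made for the example.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)

# Sparse regression with AR(1) errors: e_t = rho * e_{t-1} + u_t.
n, M, rho = 200, 50, 0.6
X = rng.standard_normal((n, M))
beta = np.zeros(M)
beta[:3] = [3.0, -2.0, 1.5]          # true sparse signal
u = rng.standard_normal(n)
e = np.zeros(n)
for t in range(1, n):
    e[t] = rho * e[t - 1] + u[t]
y = X @ beta + e

# Prewhitening (quasi-differencing): with rho known, the transformed model
# has i.i.d. errors, so the penalized weighted least squares problem reduces
# to an ordinary lasso on the transformed data.
Xw = X[1:] - rho * X[:-1]
yw = y[1:] - rho * y[:-1]

fit = Lasso(alpha=0.1).fit(Xw, yw)
selected = np.flatnonzero(fit.coef_)
print("selected variables:", selected)
```

With a signal this strong, the three true variables are recovered; in practice rho must itself be estimated before prewhitening.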

Lasso for Autoregressive and Moving Average Coefficients Via Residuals of Unobservable Time Series

Author: Hanh Nguyen
Category: Autoregression (Statistics)
Languages: en
Pages: 115

Book Description
This dissertation contains four topics in time series data analysis. First, we propose oracle model selection for autoregressive time series whose observations are contaminated with a trend. An adaptive least absolute shrinkage and selection operator (LASSO) type model selection method is used after the trend is estimated by a non-parametric B-spline method. The first step is to estimate the trend with B-splines and compute the detrended residuals; the second step is to use those residuals as if they were observations to optimize an adaptive LASSO type objective function. The oracle properties of this adaptive LASSO model selection procedure are established; that is, the proposed method identifies the true model with probability approaching one as the sample size increases, and the asymptotic properties of the estimators are not affected by replacing the observations with detrended residuals. Extensive simulation studies of several constrained and unconstrained autoregressive models confirm the theoretical results. The method is illustrated with two time series data sets: annual U.S. tobacco production and annual tree ring width measurements. Second, we generalize the first topic to a more general class of time series, the autoregressive moving-average (ARMA) model, which is the building block of stationary time series analysis. We adopt the same two-step method: non-parametric trend estimation with B-splines, followed by model selection and estimation with the adaptive LASSO, and we prove that this procedure possesses the oracle properties. Another important objective of this topic is forecasting time series with trend. We approach the forecasting problem by two methods: an empirical method using one-step-ahead prediction, and bagging.
Our simulation studies show that both methods are efficient, with mean square error decreasing as the sample size increases. Simulation studies also illustrate the asymptotic results of the proposed model selection and estimation method for twelve ARMA(p, q) models, with p and q ranging from 1 to 15. The method is further illustrated with two time series data sets from the New York State Energy Research and Development Authority (NYSERDA), a public benefit corporation that offers data and analysis to help New Yorkers increase energy efficiency. Third, we propose a new model class motivated by lag effects of covariates on the dependent variable. The aim is a more accurate statistical analysis of the relationship, for example, between the outcome of an event that occurs once every several years and covariates observed every year. Lag effects have received a great deal of attention since Almon (1965) proposed linear distributed lag models for the dependence of a time series on several regressors from a correlated sequence. Motivated by the linear distributed lag model, we propose distributed generalized linear models together with an estimation procedure for the model coefficients. The estimators from the proposed procedure are shown to be oracle, i.e. asymptotically efficient. Simulation studies confirm the asymptotic properties of the estimators, illustrate the consequences of model misspecification, and demonstrate improved prediction accuracy. The application is illustrated by an analysis of the 2016 presidential election data. Fourth, we aim to provide an easy-to-use data analysis procedure for linear regression with non-independent errors. In practice, the errors in a regression model may be non-independent; in such situations it is usually suitable to assume that the error terms follow a time series structure.
In fact, this type of model structure (referred to as RegARMA) has received great interest from researchers. Pierce (1971) discussed nonlinear least squares estimation of RegARMA; Greenhouse et al. (1987) studied biological rhythm data using the RegARMA model; more recently, Wu and Wang (2012) used a shrinkage estimation procedure to analyze data with RegARMA. However, the trend component of the time series has not been considered in this literature. We use the same two-step procedure as in the first and second projects: we first estimate the trend of the time series by a non-parametric method such as B-splines or a linear kernel, and then use the adaptive LASSO for model selection and estimation of both the linear part and the time series error part. Simulation results show that our approach works quite well. It would, however, be interesting and challenging to improve the estimates and extend the method to more complicated models, which will be the focus of future research.
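The two-step idea running through these projects (B-spline detrending, then an adaptive LASSO on the residuals treated as observations) can be sketched as follows. This is an illustration, not the dissertation's code: the simulated AR(2) model, the spline smoothing level, and the penalty value are all arbitrary choices made for the example.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline
from sklearn.linear_model import Lasso, LinearRegression

rng = np.random.default_rng(1)

# Trend plus AR(2) noise: x_t = 0.6 x_{t-1} - 0.4 x_{t-2} + u_t.
n = 500
t = np.arange(n)
trend = 0.01 * t + np.sin(2 * np.pi * t / n)
u = rng.standard_normal(n)
x = np.zeros(n)
for i in range(2, n):
    x[i] = 0.6 * x[i - 1] - 0.4 * x[i - 2] + u[i]
y = trend + x

# Step 1: estimate the trend with a smoothing spline, then detrend.
spl = UnivariateSpline(t, y, s=730.0)  # s chosen near the noise sum of squares
resid = y - spl(t)

# Step 2: adaptive LASSO on an AR(p) design built from the residuals,
# used as if they were the unobserved noise series.
p = 8
Z = np.column_stack([resid[p - k - 1:n - k - 1] for k in range(p)])  # lags 1..p
r = resid[p:]
ols = LinearRegression(fit_intercept=False).fit(Z, r)
w = 1.0 / np.abs(ols.coef_)                       # adaptive weights
fit = Lasso(alpha=0.02, fit_intercept=False).fit(Z / w, r)
coef = fit.coef_ / w                              # undo the column rescaling
print("selected lags:", np.flatnonzero(coef) + 1)
```

The column rescaling turns the weighted L1 penalty into a plain lasso: penalizing w_j|b_j| is equivalent to an unweighted penalty after dividing column j by w_j and scaling the fitted coefficients back.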

Asymptotic Properties of a Rank Estimate in Heteroscedastic Linear Regression

Author: Kristi Kuljus
Languages: en
Pages: 38

Book Description


Statistical Learning with Sparsity

Author: Trevor Hastie
Publisher: CRC Press
ISBN: 1498712177
Category: Business & Economics
Languages: en
Pages: 354

Book Description
Discover New Methods for Dealing with High-Dimensional Data. A sparse statistical model has only a small number of nonzero parameters or weights; therefore, it is much easier to estimate and interpret than a dense model. Statistical Learning with Sparsity: The Lasso and Generalizations presents methods that exploit sparsity to help recover the underl
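The sparsity the description refers to comes from the L1 penalty's soft-thresholding behavior, which sets small coefficients exactly to zero. A minimal sketch of the operator (standard material, not taken from the book's code):

```python
import numpy as np

def soft_threshold(z, lam):
    """Elementwise minimizer of 0.5 * (b - z)**2 + lam * |b|."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

# Entries smaller than lam in magnitude are zeroed; the rest shrink toward 0.
z = np.array([-3.0, -0.4, 0.1, 2.5])
print(soft_threshold(z, 0.5))
```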

TWO-STAGE SCAD LASSO FOR LINEAR MIXED MODEL SELECTION

Author: Mohammed A. Yousef
Category: Linear models (Statistics)
Languages: en
Pages: 116

Book Description
The linear regression model is the classical approach to explaining the relationship between a response (dependent) variable and predictors (independent variables). However, as the number of predictors grows, so does the likelihood of correlation among them, which is problematic. To address this, the linear mixed effects model was proposed, consisting of a fixed effects term and a random effects term. The fixed effects term represents the traditional linear regression coefficients, and the random effects term represents values drawn randomly from the population. The linear mixed model thus represents both the mean and the covariance structure of the data in a single model. As the fixed and random effects terms grow in dimension, selecting an appropriate model, i.e. the optimal fit, becomes increasingly difficult. Due to this natural complexity inherent in the linear mixed model, in this dissertation we propose a two-stage method for selecting the fixed and random effects terms. In the first stage, we select the most significant fixed effects in the model based on the conditional distribution of the response variable given the random effects. This is achieved by minimizing the penalized least squares criterion with a SCAD Lasso penalty term, using the Newton-Raphson optimization algorithm for parameter estimation. In this process, the coefficients of unimportant predictors shrink to exactly zero, eliminating noise from the model. Subsequently, in the second stage we choose the most important random effects by maximizing the penalized profile log-likelihood function, again with a SCAD Lasso penalty and the Newton-Raphson algorithm. Unlike the fixed effects, the random effects are drawn randomly from the population; hence, they need to be predicted.
This prediction is done by estimating the diagonal elements (variances) of the covariance structure of the random effects. During this step, the variance components of all unimportant random effects shrink to exactly zero, similar to the shrinking of the fixed effects parameters in the first stage; this eliminates noise from the model while retaining only significant effects, and completes the selection of the random effects. In both stages it is shown that the selection is consistent: the proposed method identifies all true effects, fixed as well as random, with probability tending to one. The method is also shown to satisfy the oracle properties, namely asymptotic normality and sparsity. At the end of these two stages we have an optimal linear mixed model that can be readily applied to correlated data. To test the overall effectiveness of the proposed approach, four simulation studies are conducted, each with a different number of subjects, a different number of observations per subject, and a different covariance structure for generating the data. The simulation results illustrate that the proposed method can effectively select the fixed and random effects in the linear mixed model, and comparisons with other model selection methods show that it performs better in choosing the true model. Subsequently, two applications, the Amsterdam growth and health study data (Kemper, 1995) and the Messier 69 data from astronomy (Husband, 2017), are used to investigate how the proposed approach behaves with real-life data; in both, the proposed method is compared with other methods.
The proposed method proves to be more effective than its counterparts in identifying the appropriate mixed model.
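For reference, the SCAD penalty of Fan and Li (2001) used in both stages above can be sketched as follows; this is the standard textbook form with the usual a = 3.7 default, not code from the dissertation:

```python
import numpy as np

def scad_penalty(beta, lam, a=3.7):
    """SCAD penalty, applied elementwise to |beta| (Fan & Li, 2001)."""
    t = np.abs(beta)
    small = lam * t                                         # L1-like near zero
    mid = (2 * a * lam * t - t**2 - lam**2) / (2 * (a - 1)) # quadratic blend
    big = lam**2 * (a + 1) / 2                              # flat for large |beta|
    return np.where(t <= lam, small, np.where(t <= a * lam, mid, big))

def scad_threshold(z, lam, a=3.7):
    """Univariate SCAD solution: soft-thresholds small coefficients but
    leaves large ones unshrunk, giving near-unbiasedness."""
    az = np.abs(z)
    soft = np.sign(z) * np.maximum(az - lam, 0.0)
    linear = ((a - 1) * z - np.sign(z) * a * lam) / (a - 2)
    return np.where(az <= 2 * lam, soft, np.where(az <= a * lam, linear, z))

print(scad_threshold(np.array([0.05, 0.5, 5.0]), lam=0.2))
```

Unlike the plain lasso penalty, SCAD flattens out beyond a*lam, so large coefficients are retained without shrinkage; this is the source of the oracle properties cited in the abstract.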

Asymptotic Properties of Restricted L1-Estimates of Regression

Author: Jitka Dupačová
Languages: en
Pages: 17

Book Description


Asymptotic Properties of Nonlinear Least Squares Estimates in Stochastic Regression Models

Author: Stanford University, Department of Statistics
Languages: en
Pages: 12

Book Description


Nonlinear Programming

Author: Anthony V. Fiacco
Publisher: SIAM
ISBN: 0898712548
Category: Mathematics
Languages: en
Pages: 224

Book Description
Analyzes the 'central' or 'dual' trajectory used by modern path-following and primal/dual methods for convex and general linear programming.

Asymptotic Properties in Space and Time of an Estimator in Errors-in-Variables Models in the Presence of Validation Data

Languages: en
Pages: 24

Book Description


Theory of Ridge Regression Estimation with Applications

Author: A. K. Md. Ehsanes Saleh
Publisher: John Wiley & Sons
ISBN: 1118644611
Category: Mathematics
Languages: en
Pages: 384

Book Description
A guide to systematic analytical results for ridge, LASSO, preliminary test, and Stein-type estimators with applications.

Theory of Ridge Regression Estimation with Applications offers a comprehensive guide to the theory and methods of estimation. Ridge regression and LASSO are at the center of all penalty estimators in a range of standard models used in many applied statistical analyses. Written by noted experts in the field, the book contains a thorough introduction to penalty and shrinkage estimation and explores the role that ridge, LASSO, and logistic regression play in the computer-intensive areas of neural networks and big data analysis. Designed to be accessible, it presents detailed coverage of the basic terminology of various models, such as the location and simple linear models, and of normal and rank theory-based ridge, LASSO, preliminary test, and Stein-type estimators. The authors also include problem sets to enhance learning. This book is a volume in the Wiley Series in Probability and Statistics and provides essential reading for all statisticians. This important resource:
- Offers theoretical coverage and computer-intensive applications of the procedures presented
- Contains solutions and alternate methods for prediction accuracy and model selection procedures
- Is the first book to focus on ridge regression, unifying past research with current methodology
- Uses R throughout the text and includes a companion website containing convenient data sets

Written for graduate students, practitioners, and researchers in various fields of science, Theory of Ridge Regression Estimation with Applications is an authoritative guide to the theory and methodology of statistical estimation.
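As a concrete anchor for the estimators the book covers, here is the closed-form ridge estimator; this is the textbook formula in Python (the book itself uses R), not code from the book or its companion site:

```python
import numpy as np

def ridge(X, y, lam):
    """Closed-form ridge estimator: (X'X + lam * I)^{-1} X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 5))
beta = np.array([2.0, 0.0, -1.0, 0.0, 0.5])
y = X @ beta + 0.1 * rng.standard_normal(100)

b_ols = ridge(X, y, 0.0)      # lam = 0 recovers ordinary least squares
b_ridge = ridge(X, y, 100.0)  # heavy shrinkage pulls coefficients toward zero
print(np.linalg.norm(b_ridge), np.linalg.norm(b_ols))
```

Unlike the lasso, ridge shrinks coefficients smoothly toward zero without setting any of them exactly to zero, which is why the two are complementary throughout the book.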