Global Testing and Large-Scale Multiple Testing for High-Dimensional Covariance Structures

Author: Tony Cai
Publisher:
ISBN:
Category :
Languages : en
Pages :

Book Description
Driven by a wide range of contemporary applications, statistical inference for covariance structures has been an active area of current research in high-dimensional statistics. This review provides a selective survey of some recent developments in hypothesis testing for high-dimensional covariance structures, including global testing for the overall pattern of the covariance structures and simultaneous testing of a large collection of hypotheses on the local covariance structures with false discovery proportion and false discovery rate control. Both one-sample and two-sample settings are considered. The specific testing problems discussed include global testing for the covariance, correlation, and precision matrices, and multiple testing for the correlations, Gaussian graphical models, and differential networks.

Large Scale Multiple Testing for High-Dimensional Nonparanormal Data

Author: Yanhui Xu
Publisher:
ISBN:
Category :
Languages : en
Pages : 107

Book Description
False discovery control in high-dimensional multiple testing arises in many areas of scientific research. Under a multivariate normal assumption, Fan et al. (2012) proposed an approximate expression for the false discovery proportion (FDP) in large-scale multiple testing when a common threshold is used, and provided a consistent estimate of the realized FDP when the covariance matrix is known. The approach was later extended to the case of an unknown covariance matrix (Fan and Han, 2017). In practice, however, the multivariate normal assumption is often violated. This work relaxes the normal assumption by developing a testing procedure for the nonparanormal distribution, which extends the Gaussian family to a much larger class of distributions. The nonparanormal distribution is a high-dimensional Gaussian copula with nonparametric marginals, and estimating the underlying monotone functions is key to a good FDP approximation. In simulation studies, the procedure achieved the smallest mean error in approximating the FDP among the methods compared. Theoretical results are given on the performance of the estimated covariance matrix and on false rejections. On a real dataset, the method detected more differentially expressed genes while keeping the FDP at a low level. This thesis provides an important tool for approximating the FDP in a given experiment where the normal assumption may not hold. It also develops a dependence-adjusted procedure that provides more power than the fixed-threshold method. In numerical studies, the procedure is also robust for heavy-tailed data under a variety of distributions.
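As a rough illustration of the marginal-transformation idea in the description above, the sketch below maps one variable to approximate standard-normal scores using its ranks. This is a generic rank-based normal-scores transform, assumed here for illustration only; the thesis may estimate the monotone functions differently, and the function name `normal_scores` and the sample data are hypothetical.

```python
from statistics import NormalDist


def normal_scores(column):
    """Map one variable to approximate standard-normal scores via its ranks.

    Illustrative only: a plain rank-based estimate of the monotone marginal
    transformation, not necessarily the estimator used in the thesis.
    """
    n = len(column)
    # Rank the observations (1 = smallest); ties are broken by original order.
    order = sorted(range(n), key=lambda i: column[i])
    ranks = [0] * n
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    std_normal = NormalDist()
    # Dividing by n + 1 keeps the empirical CDF strictly inside (0, 1),
    # so the normal quantile function is always finite.
    return [std_normal.inv_cdf(r / (n + 1)) for r in ranks]


# Hypothetical heavily skewed sample; the transform preserves its ordering.
skewed = [0.1, 2.5, 0.4, 40.0, 1.2]
scores = normal_scores(skewed)
```

Applying the transform to every variable and then computing correlations of the scores is one simple way to estimate the latent Gaussian dependence structure without assuming normal marginals.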

Multivariate Statistical Modeling in Engineering and Management

Author: Jhareswar Maiti
Publisher: CRC Press
ISBN: 1000618390
Category : Mathematics
Languages : en
Pages : 637

Book Description
The book focuses on problem solving for practitioners and model building for academicians in multivariate settings. It helps readers understand issues such as characterizing variability, extracting patterns, building relationships, and making objective decisions. A large number of multivariate statistical models are covered. Readers will learn how a practical problem can be converted into a statistical problem and how the statistical solution can be interpreted as a practical solution. Key features: links the data generation process with statistical distributions in the multivariate domain; provides a step-by-step procedure for estimating the parameters of the developed models; provides a blueprint for data-driven decision making; includes practical examples and case studies relevant to the intended audiences. The book will help everyone involved in data-driven problem solving, modeling, and decision making.

Computational Statistics in Data Science

Author: Richard A. Levine
Publisher: John Wiley & Sons
ISBN: 1119561086
Category : Mathematics
Languages : en
Pages : 672

Book Description
An indispensable guide to applying computational statistics in modern data science. In Computational Statistics in Data Science, a team of well-known mathematicians and statisticians presents a well-founded compilation of concepts, theories, techniques, and practices of computational statistics for readers looking for a single, comprehensive reference work on statistics in modern data science. The book contains numerous chapters on the key areas of computational statistics, presenting state-of-the-art techniques in a timely and accessible way. In addition, Computational Statistics in Data Science offers free access to the finished entries in the online reference work Wiley StatsRef: Statistics Reference Online. Readers also get a thorough introduction to computational statistics, with relevant and accessible information for practitioners and researchers in a range of data-intensive fields, and comprehensive explanations of current topics in statistics, including big data, data stream processing, quantitative visualization, and deep learning. The work is ideally suited to researchers and scientists in all disciplines who need to apply computational statistics techniques at an upper or advanced level. Computational Statistics in Data Science also belongs on the bookshelf of scientists who research and develop techniques in computational statistics and statistical graphics.

High-Dimensional Covariance Estimation

Author: Mohsen Pourahmadi
Publisher: John Wiley & Sons
ISBN: 1118034295
Category : Mathematics
Languages : en
Pages : 204

Book Description
Methods for estimating sparse and large covariance matrices. Covariance and correlation matrices play fundamental roles in every aspect of the analysis of multivariate data collected from a variety of fields including business and economics, health care, engineering, and environmental and physical sciences. High-Dimensional Covariance Estimation provides accessible and comprehensive coverage of the classical and modern approaches for estimating covariance matrices as well as their applications to the rapidly developing areas lying at the intersection of statistics and machine learning. Recently, the classical sample covariance methodologies have been modified and improved upon to meet the needs of statisticians and researchers dealing with large correlated datasets. High-Dimensional Covariance Estimation focuses on the methodologies based on shrinkage, thresholding, and penalized likelihood with applications to Gaussian graphical models, prediction, and mean-variance portfolio management. The book relies heavily on regression-based ideas and interpretations to connect and unify many existing methods and algorithms for the task. High-Dimensional Covariance Estimation features chapters on: Data, Sparsity, and Regularization; Regularizing the Eigenstructure; Banding, Tapering, and Thresholding; Covariance Matrices; Sparse Gaussian Graphical Models; Multivariate Regression. The book is an ideal resource for researchers in statistics, mathematics, business and economics, computer sciences, and engineering, as well as a useful text or supplement for graduate-level courses in multivariate analysis, covariance estimation, statistical learning, and high-dimensional data analysis.

Large Sample Covariance Matrices and High-Dimensional Data Analysis

Author: Jianfeng Yao
Publisher: Cambridge University Press
ISBN: 9781107065178
Category : Mathematics
Languages : en
Pages : 0

Book Description
High-dimensional data appear in many fields, and their analysis has become increasingly important in modern statistics. However, it has long been observed that several well-known methods in multivariate analysis become inefficient, or even misleading, when the data dimension p is larger than, say, several tens. A seminal example is the well-known inefficiency of Hotelling's T²-test in such cases. This example shows that classical large sample limits may no longer hold for high-dimensional data; statisticians must seek new limiting theorems in these instances. Thus, random matrix theory (RMT) serves as a much-needed and welcome alternative framework. Based on the authors' own research, this book provides a first-hand introduction to new high-dimensional statistical methods derived from RMT. The book begins with a detailed introduction to useful tools from RMT, and then presents a series of high-dimensional problems with solutions provided by RMT methods.

Sequential Multiple Testing for Variable Selection in High Dimensional Linear Model

Author: Hailu Chen
Publisher:
ISBN: 9781369300451
Category : Analysis of covariance
Languages : en
Pages : 137

Book Description
The covariance test was proposed for testing the significance of the predictor variable that enters the current lasso model along the lasso solution path. In this work, we propose a sequential multiple testing structure based on covariance test p-values, which has good power properties while keeping the error rate controlled at a desired level. Specifically, we consider the full set of underlying hypotheses and error rate control both within each step and across all steps along the lasso solution path.

Handbook of International Large-Scale Assessment

Author: Leslie Rutkowski
Publisher: CRC Press
ISBN: 1439895120
Category : Mathematics
Languages : en
Pages : 650

Book Description
Technological and statistical advances, along with a strong interest in gathering more information about the state of our educational systems, have made it possible to assess more students, in more countries, more often, and in more subject domains. The Handbook of International Large-Scale Assessment: Background, Technical Issues, and Methods of Data Analysis brings together recognized scholars in the field of ILSA, behavioral statistics, and policy to develop a detailed guide that goes beyond database user manuals. After highlighting the importance of ILSA data to policy and research, the book reviews methodological aspects and features of the studies based on operational considerations, analytics, and reporting. The book then describes methods of interest to advanced graduate students, researchers, and policy analysts who have a good grounding in quantitative methods, but who are not necessarily quantitative methodologists. In addition, it provides a detailed exposition of the technical details behind these assessments, including the test design, the sampling framework, and estimation methods, with a focus on how these issues impact analysis choices.

Multivariate Reduced-Rank Regression

Author: Raja Velu
Publisher: Springer Science & Business Media
ISBN: 1475728530
Category : Mathematics
Languages : en
Pages : 269

Book Description
In the area of multivariate analysis, there are two broad themes that have emerged over time. The analysis typically involves exploring the variations in a set of interrelated variables or investigating the simultaneous relationships between two or more sets of variables. In either case, the themes involve explicit modeling of the relationships or dimension-reduction of the sets of variables. The multivariate regression methodology and its variants are the preferred tools for the parametric modeling, and descriptive tools such as principal components or canonical correlations are the tools used for addressing the dimension-reduction issues. The two are complementary, and data analysts typically want to make use of these tools for a thorough analysis of multivariate data. A technique that combines the two broad themes in a natural fashion is the method of reduced-rank regression. This method starts with the classical multivariate regression model framework but recognizes the possibility for the reduction in the number of parameters through a restriction on the rank of the regression coefficient matrix. This feature is attractive because regression methods, whether they are in the context of a single response variable or in the context of several response variables, are popular statistical tools. The technique of reduced-rank regression and its encompassing features are the primary focus of this book. The book develops the method of reduced-rank regression starting from the classical multivariate linear regression model.

Some New Developments on Multiple Testing Procedures

Author: Lilun Du
Publisher:
ISBN:
Category :
Languages : en
Pages : 134

Book Description
In the context of large-scale multiple testing, hypotheses are often accompanied by certain prior information. In Chapter 2, we present a single-index modulated multiple testing procedure, which maintains control of the false discovery rate while incorporating prior information, by assuming the availability of a bivariate p-value for each hypothesis. To find the optimal rejection region for the bivariate p-value, we propose a criterion based on the ratio of the probability density functions of the bivariate p-value under the true null and the non-null. In the bivariate normal setting, this criterion further motivates us to project the bivariate p-value to a single-index p-value, for a wide range of directions. The true null distribution of the single-index p-value is estimated via parametric and nonparametric approaches, leading to two procedures for estimating and controlling the false discovery rate. To derive the optimal projection direction, we propose a new approach based on power comparison, which is further shown to be consistent under some mild conditions. Multiple testing based on chi-squared test statistics is commonly used in many scientific fields such as genomics research and brain imaging studies. However, the challenges associated with designing a formal testing procedure when there exists a general dependence structure across the chi-squared test statistics have not been well addressed. In Chapter 3, we propose a Factor Connected procedure to fill this gap. We first adopt a latent factor structure to construct a testing framework for approximating the false discovery proportion (FDP) for a large number of highly correlated chi-squared test statistics with finite degrees of freedom k. The testing framework is then connected to simultaneously testing k linear constraints in a large-dimensional linear factor model involving some observable and unobservable common factors, resulting in a consistent estimator of the FDP based on the associated unadjusted p-values.
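
For readers unfamiliar with the quantity being approximated, the sketch below computes the realized FDP of a common-threshold rule when the truth of each hypothesis is known, as it would be in a simulation study. It uses the standard definition (false rejections divided by total rejections, taken as zero when nothing is rejected); the function name and the example p-values are hypothetical, and this is not the estimator developed in the thesis, which has to work without knowing which nulls are true.

```python
def false_discovery_proportion(p_values, is_true_null, threshold):
    """Realized FDP when every p-value at or below `threshold` is rejected."""
    rejected = [p <= threshold for p in p_values]
    total_rejections = sum(rejected)
    # A false rejection is a rejected hypothesis whose null is actually true.
    false_rejections = sum(
        1 for rej, null in zip(rejected, is_true_null) if rej and null
    )
    # max(..., 1) makes the FDP zero when there are no rejections at all.
    return false_rejections / max(total_rejections, 1)


# Hypothetical example: five hypotheses, the last three are true nulls.
p_values = [0.001, 0.02, 0.03, 0.20, 0.70]
is_true_null = [False, False, True, True, True]
fdp = false_discovery_proportion(p_values, is_true_null, threshold=0.05)
```

In real data `is_true_null` is unobserved, which is why the FDP has to be approximated or estimated from the p-values and their dependence structure.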