Permutation-based Inference for High-dimensional Data

Author:
Publisher:
ISBN: 9789461828859
Category :
Languages : en
Pages : 125

Book Description


Statistical Inference from High Dimensional Data

Author: Carlos Fernandez-Lozano
Publisher: MDPI
ISBN: 3036509445
Category : Science
Languages : en
Pages : 314

Book Description
• Real-world problems can be high-dimensional, complex, and noisy.
• More data does not imply more information.
• Different approaches deal with the so-called curse of dimensionality to reduce irrelevant information.
• A process with multidimensional information is not necessarily easy to interpret or process.
• In some real-world applications, one class has clearly fewer elements than the other. Models tend to assume that the majority class matters most for the analysis, which is usually not the case.
• The analysis of complex diseases such as cancer focuses on multidimensional omic data.
• The falling cost of high-throughput experiments has increased the amount of available data, opening a new era for integrative data-driven approaches.
• Entropy-based approaches are of interest for reducing the dimensionality of high-dimensional data.

Permutation Tests for Complex Data

Author: Fortunato Pesarin
Publisher: John Wiley & Sons
ISBN: 9780470689523
Category : Mathematics
Languages : en
Pages : 448

Book Description
Complex multivariate testing problems are frequently encountered in many scientific disciplines, such as engineering, medicine and the social sciences. As a result, modern statistics needs permutation testing for complex data with low sample size and many variables, especially in observational studies. The authors give a general overview of permutation tests, with a focus on recent theoretical advances in univariate and multivariate complex permutation testing problems, and bring the reader up to date with current thinking.
Key Features:
• Examines the most up-to-date methodologies of univariate and multivariate permutation testing.
• Includes extensive software codes in MATLAB, R and SAS, featuring worked examples, and uses real case studies from both experimental and observational studies.
• Includes a standalone free software NPC Test Release 10 with a graphical interface, which allows practitioners from every scientific field to easily implement almost all complex testing procedures included in the book.
• Presents and discusses solutions to the most important and frequently encountered real problems in multivariate analyses.
• A supplementary website contains all of the data sets examined in the book, along with ready-to-use software codes.
Together with a wide set of application cases, the authors present a thorough theory of permutation testing, with formal descriptions and proofs, and analyse real case studies. Practitioners and researchers working in different scientific fields, such as engineering, biostatistics, psychology or medicine, will benefit from this book.

High-Dimensional Probability

Author: Roman Vershynin
Publisher: Cambridge University Press
ISBN: 1108415199
Category : Business & Economics
Languages : en
Pages : 299

Book Description
An integrated package of powerful probabilistic tools and key applications in modern mathematical data science.

Foundations of Linear and Generalized Linear Models

Author: Alan Agresti
Publisher: John Wiley & Sons
ISBN: 1118730038
Category : Mathematics
Languages : en
Pages : 471

Book Description
A valuable overview of the most important ideas and results in statistical modeling. Written by a highly experienced author, Foundations of Linear and Generalized Linear Models is a clear and comprehensive guide to the key concepts and results of linear statistical models. The book presents a broad, in-depth overview of the most commonly used statistical models by discussing the theory underlying the models, R software applications, and examples with crafted models to elucidate key ideas and promote practical model building. The book begins by illustrating the fundamentals of linear models, such as how model fitting projects the data onto a model vector subspace and how orthogonal decompositions of the data yield information about the effects of explanatory variables. Subsequently, the book covers the most popular generalized linear models, which include binomial and multinomial logistic regression for categorical data, and Poisson and negative binomial loglinear models for count data. Focusing on the theoretical underpinnings of these models, Foundations of Linear and Generalized Linear Models also features:
• An introduction to quasi-likelihood methods that require weaker distributional assumptions, such as generalized estimating equation methods
• An overview of linear mixed models and generalized linear mixed models with random effects for clustered correlated data, Bayesian modeling, and extensions to handle problematic cases such as high-dimensional problems
• Numerous examples that use R software for all text data analyses
• More than 400 exercises for readers to practice and extend the theory, methods, and data analysis
• A supplementary website with datasets for the examples and exercises
An invaluable textbook for upper-undergraduate and graduate-level students in statistics and biostatistics courses, Foundations of Linear and Generalized Linear Models is also an excellent reference for practicing statisticians and biostatisticians, as well as anyone who is interested in learning about the most important statistical models for analyzing data.

Medical Image Computing and Computer Assisted Intervention – MICCAI 2018

Author: Alejandro F. Frangi
Publisher: Springer
ISBN: 3030009289
Category : Computers
Languages : en
Pages : 918

Book Description
The four-volume set LNCS 11070, 11071, 11072, and 11073 constitutes the refereed proceedings of the 21st International Conference on Medical Image Computing and Computer-Assisted Intervention, MICCAI 2018, held in Granada, Spain, in September 2018. The 373 revised full papers presented were carefully reviewed and selected from 1068 submissions in a double-blind review process. The papers have been organized in the following topical sections:
• Part I: Image Quality and Artefacts; Image Reconstruction Methods; Machine Learning in Medical Imaging; Statistical Analysis for Medical Imaging; Image Registration Methods.
• Part II: Optical and Histology Applications: Optical Imaging Applications; Histology Applications; Microscopy Applications; Optical Coherence Tomography and Other Optical Imaging Applications. Cardiac, Chest and Abdominal Applications: Cardiac Imaging Applications; Colorectal, Kidney and Liver Imaging Applications; Lung Imaging Applications; Breast Imaging Applications; Other Abdominal Applications.
• Part III: Diffusion Tensor Imaging and Functional MRI: Diffusion Tensor Imaging; Diffusion Weighted Imaging; Functional MRI; Human Connectome. Neuroimaging and Brain Segmentation Methods: Neuroimaging; Brain Segmentation Methods.
• Part IV: Computer Assisted Intervention: Image Guided Interventions and Surgery; Surgical Planning, Simulation and Work Flow Analysis; Visualization and Augmented Reality. Image Segmentation Methods: General Image Segmentation Methods, Measures and Applications; Multi-Organ Segmentation; Abdominal Segmentation Methods; Cardiac Segmentation Methods; Chest, Lung and Spine Segmentation; Other Segmentation Applications.

Dimension Reduction and High-dimensional Data

Author: Maxime Turgeon
Publisher:
ISBN:
Category :
Languages : en
Pages :

Book Description
"Recent technological advances in many domains including both genomics and brain imaging have led to an abundance of high-dimensional and correlated data being routinely collected. A widespread analytical goal in these fields is to investigate the relationships between, on the one hand, a group of genomic markers or anatomical brain measurements and, on the other hand, a set of clinical variables or phenotypes. To leverage the correlation within each set of measurements, and to improve the interpretability of a measure of the association, one can use dimension reduction techniques: one, or both, group of variables can be summarised by a small set of latent features that summarise the structure of interest and capture association through an appropriately chosen statistic. But the high-dimensionality of contemporary datasets brings many computational and theoretical challenges, and most classical multivariate methods cannot be used directly. This thesis is comprised primarily of three manuscripts that investigate the issues related to measuring association in high dimensional datasets. In the first manuscript, I explore the optimality properties of a dimension reduction method known as Principal Component of Explained Variance (PCEV). This method seeks a linear combination of the outcome variables that maximises the proportion of variance explained by a set of covariates of interest. I then explain how PCEV can be extended to a computationally simple and efficient estimation strategy for high-dimensional outcomes (p > n) that relies on a "block-independence" assumption. In the second manuscript, I study the problem of inference with high-dimensional datasets: given two datasets Y and X, with one or both being high-dimensional, how can we perform a test of association in a computationally efficient way? Specifically, I look at the set of multivariate methods that can be described as a double Wishart problem; PCEV, Canonical Correlation Analysis (CCA), and Multivariate Analysis of Variance (MANOVA) are all examples of double Wishart problems. I show that valid high-dimensional p-values can be derived using an empirical estimator of the null distribution. This is achieved by performing a small number of permutations, and then fitting a location-scale family of the Tracy-Widom distribution of order 1 to the test statistics computed from the permuted data. Finally, in the third manuscript, I apply the concepts developed in the two other manuscripts to a data analysis of targeted custom capture bisulfite methylation data. I show how PCEV can be used in conjunction with the ideas in the second manuscript to test for a region-level association between the methylation levels of CpG dinucleotides and levels of anti-citrullinated protein antibody (ACPA), an antigen thought to be a predictor of rheumatoid arthritis onset. In this study, the CpG dinucleotides are naturally grouped by design, and several of these groups contain a number of methylation measurements that is larger than the sample size."
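The permutation step described in this abstract can be illustrated with a small sketch. The Python code below is not the thesis author's implementation. It assumes the low-dimensional case (more observations than outcomes plus covariates, so that the residual matrix is invertible), uses the largest root of V_R^{-1} V_M as the test statistic, and replaces the Tracy-Widom fit with a plain permutation p-value. The function names `largest_root` and `permutation_pvalue` are hypothetical.

```python
import numpy as np


def largest_root(Y, X):
    """Largest eigenvalue of V_R^{-1} V_M for outcomes Y (n x p) and covariates X (n x q).

    V_M is the variance of Y explained by X and V_R is the residual variance.
    The largest root lam relates to the maximal proportion of variance
    explained by lam / (1 + lam). Assumes V_R is invertible (n > p + q).
    """
    Yc = Y - Y.mean(axis=0)
    Xc = X - X.mean(axis=0)
    # Least-squares regression of every outcome on the covariates
    coef, *_ = np.linalg.lstsq(Xc, Yc, rcond=None)
    fitted = Xc @ coef
    resid = Yc - fitted
    v_model = fitted.T @ fitted
    v_resid = resid.T @ resid
    roots = np.linalg.eigvals(np.linalg.solve(v_resid, v_model))
    return float(np.max(roots.real))


def permutation_pvalue(Y, X, n_perm=199, seed=0):
    """Plain permutation p-value: shuffle the rows of X to break the Y-X link."""
    rng = np.random.default_rng(seed)
    observed = largest_root(Y, X)
    n = Y.shape[0]
    exceed = 0
    for _ in range(n_perm):
        idx = rng.permutation(n)
        if largest_root(Y, X[idx]) >= observed:
            exceed += 1
    # The +1 terms count the observed data as one of the permutations
    return observed, (1 + exceed) / (n_perm + 1)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(60, 1))
    # Simulated outcomes that depend on X (illustrative data only)
    Y = X @ np.array([[1.0, -1.0, 0.5]]) + rng.normal(size=(60, 3))
    stat, p = permutation_pvalue(Y, X)
    print(f"largest root = {stat:.3f}, permutation p-value = {p:.3f}")
```

With a plain permutation p-value the smallest attainable value is 1 / (n_perm + 1), which is one motivation for fitting a parametric family to a small number of permuted statistics, as the abstract describes.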

Sparse Graphical Modeling for High Dimensional Data

Author: Faming Liang
Publisher: CRC Press
ISBN: 0429582900
Category : Mathematics
Languages : en
Pages : 150

Book Description
This book provides a general framework for learning sparse graphical models with conditional independence tests. It includes complete treatments for Gaussian, Poisson, multinomial, and mixed data; unified treatments for covariate adjustments, data integration, and network comparison; unified treatments for missing data and heterogeneous data; efficient methods for joint estimation of multiple graphical models; effective methods of high-dimensional variable selection; and effective methods of high-dimensional inference. The methods possess an embarrassingly parallel structure in performing conditional independence tests, and the computation can be significantly accelerated by running in parallel on a multi-core computer or a parallel architecture. This book is intended to serve researchers and scientists interested in high-dimensional statistics, and graduate students in broad data science disciplines.
Key Features:
• A general framework for learning sparse graphical models with conditional independence tests
• Complete treatments for different types of data: Gaussian, Poisson, multinomial, and mixed data
• Unified treatments for data integration, network comparison, and covariate adjustment
• Unified treatments for missing data and heterogeneous data
• Efficient methods for joint estimation of multiple graphical models
• Effective methods of high-dimensional variable selection
• Effective methods of high-dimensional inference

Two Graph-based Tests for High-dimensional Inference

Author: Hao Chen
Publisher:
ISBN:
Category :
Languages : en
Pages :

Book Description
With modern science there is a growing emphasis on multivariate, complex data types. Some of these data are high dimensional. Others, such as survey preference, network, and tree data, cannot be characterized easily with standard models on Euclidean spaces. This dissertation details the investigation in this new setting of two classic statistical problems: change-point detection and two-sample comparison of categorical data. Change-point models are widely used in various fields for detecting lack of homogeneity in a sequence of observations. In many applications, the dimension of the observations in the sequence can be very high, even much larger than the length of the sequence. Testing the homogeneity of such sequences is a challenging but important problem. Existing approaches are limited in many ways. We propose a new non-parametric approach that can be applied to data in high dimension, and even to non-Euclidean object data, as long as an informative similarity measure on the sample space can be defined. The approach adapts graph-based two-sample tests to the scan-statistic setting. Graph-based two-sample tests are tests based on graphs connecting observations by similarity [Friedman and Rafsky, 1979, Rosenbaum, 2005]. We show that this new approach is powerful in high dimensions compared to parametric approaches. We also derive accurate analytic $p$-value approximations for very general situations, which lead to easy off-the-shelf homogeneity testing for large multivariate data sets. This approach has been applied to two data sets: determining the authorship of a classic novel, and detecting change in a social network over time. Two-sample comparison of categorical data is a classic problem in statistics. In many modern applications, the number of categories can be quite large, even comparable to the sample size, causing existing methods to have low power. When the number of categories is large, there is often underlying structure on the sample space that can be exploited. We propose a general non-parametric approach that makes use of similarity information on the space of categories in two-sample tests. Our approach addresses a shortcoming of existing graph-based two-sample tests by no longer requiring uniqueness of the underlying graph, thus allowing ties in the distance matrix defining the graph. We found two types of statistics that are both powerful and fast to compute. We show that their permutation null distributions are asymptotically normal and that their $p$-value approximations under typical settings are quite accurate, facilitating the application of this approach.
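The idea of a graph-based two-sample test can be illustrated with a short sketch. The Python code below is not the dissertation's method. It is a minimal version of the classic edge-count test on a minimum spanning tree in the spirit of Friedman and Rafsky, assuming Euclidean data with no ties in the distances, and it uses a plain permutation p-value rather than an analytic approximation. The function names `mst_edges` and `edge_count_test` are hypothetical.

```python
import numpy as np


def mst_edges(D):
    """Minimum spanning tree of a dense distance matrix D via Prim's algorithm."""
    n = D.shape[0]
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True
    best_dist = D[0].copy()             # cheapest known link from the tree to each node
    best_from = np.zeros(n, dtype=int)  # tree node that provides that link
    edges = []
    for _ in range(n - 1):
        # Pick the node outside the tree that is closest to the tree
        candidates = np.where(in_tree, np.inf, best_dist)
        j = int(np.argmin(candidates))
        edges.append((int(best_from[j]), j))
        in_tree[j] = True
        closer = D[j] < best_dist
        best_dist = np.where(closer, D[j], best_dist)
        best_from = np.where(closer, j, best_from)
    return edges


def edge_count_test(A, B, n_perm=999, seed=0):
    """Count MST edges joining the two samples; few such edges suggest a difference."""
    Z = np.vstack([A, B])
    labels = np.r_[np.zeros(len(A), dtype=int), np.ones(len(B), dtype=int)]
    D = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=2)
    edges = np.array(mst_edges(D))

    def cross_edges(lab):
        return int(np.sum(lab[edges[:, 0]] != lab[edges[:, 1]]))

    observed = cross_edges(labels)
    rng = np.random.default_rng(seed)
    # The graph stays fixed; only the sample labels are permuted
    perm = [cross_edges(rng.permutation(labels)) for _ in range(n_perm)]
    p_value = (1 + sum(c <= observed for c in perm)) / (n_perm + 1)
    return observed, p_value


if __name__ == "__main__":
    rng = np.random.default_rng(2)
    A = rng.normal(0.0, 1.0, size=(20, 5))
    B = rng.normal(5.0, 1.0, size=(20, 5))  # shifted sample (illustrative data only)
    count, p = edge_count_test(A, B)
    print(f"between-sample edges = {count}, permutation p-value = {p:.3f}")
```

Because the test only needs a distance matrix, the same sketch applies to any data for which an informative dissimilarity can be computed, which is the property the abstract emphasizes.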

Statistical Modeling for Biological Systems

Author: Anthony Almudevar
Publisher: Springer Nature
ISBN: 3030346757
Category : Medical
Languages : en
Pages : 361

Book Description
This book commemorates the scientific contributions of distinguished statistician, Andrei Yakovlev. It reflects upon Dr. Yakovlev’s many research interests including stochastic modeling and the analysis of micro-array data, and throughout the book it emphasizes applications of the theory in biology, medicine and public health. The contributions to this volume are divided into two parts. Part A consists of original research articles, which can be roughly grouped into four thematic areas: (i) branching processes, especially as models for cell kinetics, (ii) multiple testing issues as they arise in the analysis of biologic data, (iii) applications of mathematical models and of new inferential techniques in epidemiology, and (iv) contributions to statistical methodology, with an emphasis on the modeling and analysis of survival time data. Part B consists of methodological research reported as a short communication, ending with some personal reflections on research fields associated with Andrei and on his approach to science. The Appendix contains an abbreviated vitae and a list of Andrei’s publications, complete as far as we know. The contributions in this book are written by Dr. Yakovlev’s collaborators and notable statisticians including former presidents of the Institute of Mathematical Statistics and of the Statistics Section of the AAAS. Dr. Yakovlev’s research appeared in four books and almost 200 scientific papers, in mathematics, statistics, biomathematics and biology journals. Ultimately this book offers a tribute to Dr. Yakovlev’s work and recognizes the legacy of his contributions in the biostatistics community.