Machine Learning Approaches to Refining Post-translational Modification Predictions and Protein Identifications from Tandem Mass Spectrometry

Author: Clement Chung
Publisher:
ISBN: 9780494794173
Category :
Languages : en
Pages :

Book Description


Analysis of Protein Post-Translational Modifications by Mass Spectrometry

Author: John R. Griffiths
Publisher: John Wiley & Sons
ISBN: 1119045851
Category : Science
Languages : en
Pages : 414

Book Description
Covers all major modifications, including phosphorylation, glycosylation, acetylation, ubiquitination, sulfonation and glycation. Discusses the chemistry behind each modification, along with key methods and references. Includes contributions from some of the leading researchers in the field. A valuable reference source for all laboratories undertaking proteomics, mass spectrometry and post-translational modification research.

Novel Data Analysis Approaches for Cross-linking Mass Spectrometry Proteomics and Glycoproteomics

Author: Lei Lu
Publisher:
ISBN:
Category :
Languages : en
Pages :

Book Description
Bottom-up proteomics has emerged as a powerful technology for biological studies. The technique is used for a myriad of purposes, including protein identification, post-translational modification identification, protein-protein interaction analysis, protein quantification analysis, and protein structure analysis. The data analysis approaches of bottom-up proteomics have evolved over the past two decades, and many different algorithms and software programs have been developed for these varied purposes. In this thesis, I have focused on improving the database search strategies for two important special applications of bottom-up proteomics: cross-linking mass spectrometry proteomics and O-glycoproteomics. In cross-linking mass spectrometry proteomics, a sample of proteins is treated with a chemical cross-linking reagent. This causes peptides within the proteins to be cross-linked to one another, forming peptide doublets that are released by treatment of the sample with a protease such as trypsin. The data analysis tools are designed to identify the cross-linked peptides. In O-glycoproteomics, the peptides that are released by protease digestion of the protein sample can be modified with one or even several distinct O-glycans, and the data analysis tools should be able to identify all of the glycans and the modification sites at which they are located. In both cases, traditional database searching strategies, which try to match the experimental spectra to all potential theoretical spectra, are not practical because of the large increase in search space. Researchers have lacked efficient data analysis tools for these two applications. Here we devised new search algorithms to address these problems and implemented them in two new software modules in our laboratory's bottom-up software engine MetaMorpheus (cross-linking data analysis via MetaMorpheusXL and O-glycoproteomics data analysis via O-Pair Search).
The new search strategies used in the software program are both based on ion-indexed open search, which was first developed for large-scale proteomic studies in the programs MSFragger and Open-pFind. The ion-indexed open search was optimized for cross-linking mass spectrometry proteomics and O-glycoproteomics in this study, and combined with other algorithms. In O-glycoproteomics, a graph-based algorithm is used to speed up the identification and localization of O-glycans. Other useful features have been added to the software program, such as analysis of both cleavable and non-cleavable cross-links in the cross-link search module, and calculation of localization probabilities in the O-glyco search module. Further optimizations, including machine learning methods for false discovery rate (FDR) analysis, retention time prediction and spectral prediction, could further improve the current best search approaches for cross-link proteomics and O-glycoproteomics data analysis. Chapter 1 provides an overview of bottom-up proteomics data analysis methods and outlines how ion-indexed open search could be useful for special bottom-up proteomics studies. Chapter 2 describes the development of a cross-linking mass spectrometry proteomics search module, resulting in efficiency improvements for both cleavable and non-cleavable cross-link proteomics data analysis. Chapter 3 describes the development of an O-glycoproteomics search module; by combining the ion-indexed open search algorithm with the graph-based localization algorithm, O-Pair Search is more than 2000 times faster than the widely used software program Byonic. In Chapter 4, a novel top-down data acquisition method is described. Chapter 5 provides conclusions and future directions.
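
To make the idea of an ion-indexed open search more concrete, the following is a minimal Python sketch and not the implementation used in MetaMorpheus, MSFragger or Open-pFind. It assumes singly charged b and y ions only, a fixed bin width, and a three-peptide toy database; the function names (`fragment_mzs`, `build_index`, `open_search`) are hypothetical. The point it illustrates is that fragments which do not contain the modified residue still match the index, so a peptide carrying an unexpected mass shift can be retrieved without enumerating every modified form.

```python
from collections import defaultdict

# Monoisotopic residue masses (Da) for the residues used in this toy example.
MONO = {
    "A": 71.03711, "D": 115.02694, "E": 129.04259, "I": 113.08406,
    "K": 128.09496, "L": 113.08406, "M": 131.04049, "P": 97.05276,
    "R": 156.10111, "S": 87.03203, "T": 101.04768, "V": 99.06841,
}
PROTON = 1.007276
WATER = 18.010565
BIN_WIDTH = 0.02  # assumed fragment bin width in m/z units


def fragment_mzs(peptide, mod_pos=None, mod_mass=0.0):
    """Singly charged b and y ion m/z values, optionally with one mass shift."""
    masses = [MONO[aa] for aa in peptide]
    if mod_pos is not None:
        masses[mod_pos] += mod_mass
    ions = []
    for i in range(1, len(masses)):
        ions.append(sum(masses[:i]) + PROTON)          # b ion: first i residues
        ions.append(sum(masses[i:]) + WATER + PROTON)  # y ion: remaining residues
    return ions


def to_bin(mz):
    return int(round(mz / BIN_WIDTH))


def build_index(peptides):
    """Map each fragment bin to the ids of the peptides that produce it."""
    index = defaultdict(set)
    for pid, pep in enumerate(peptides):
        for mz in fragment_mzs(pep):
            index[to_bin(mz)].add(pid)
    return index


def open_search(index, peaks, n_peptides):
    """Count matched fragments per peptide, ignoring the precursor mass.

    Only the exact bin is looked up; a real tool would also handle peaks
    that fall close to a bin boundary.
    """
    counts = [0] * n_peptides
    for mz in peaks:
        for pid in index.get(to_bin(mz), ()):
            counts[pid] += 1
    return counts


peptides = ["PEPTIDEK", "SAMPLER", "ELVISK"]
index = build_index(peptides)
# Simulated spectrum: PEPTIDEK with an unanticipated +79.96633 Da shift on T.
spectrum = fragment_mzs("PEPTIDEK", mod_pos=3, mod_mass=79.96633)
counts = open_search(index, spectrum, len(peptides))
best = counts.index(max(counts))
print(peptides[best], counts)
```

The index is built once per database, so each spectrum costs roughly one lookup per peak rather than one comparison per candidate peptide.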

Characterization and Identification of Protein Posttranslational Modifications Using Protein Enrichment and Mass Spectrometry

Author: Liwen Wang
Publisher:
ISBN:
Category :
Languages : en
Pages :

Book Description
Abstract: This dissertation describes a proteomic workflow for the analysis of protein post-translational modifications (PTMs). The workflow combines techniques for protein enrichment, multi-dimensional separations, mass spectrometry (MS) and automatic data analysis. The workflow was developed to improve the application of proteomic analysis in the realms of biomarker discovery and experimental therapeutic research. Chapter 2 presents an immunoaffinity chromatography method that was developed to enrich acetylated histones. A self-packed immunoaffinity capillary column was developed using commercial antibodies that could be recycled and used for on-line and off-line enrichment. The acetylated fractions were collected and identified by matrix-assisted laser desorption/ionization (MALDI) MS and electrospray ionization (ESI) liquid chromatography tandem mass spectrometry (LC-MS/MS). In Chapter 3, an optimized phosphoproteomic analysis workflow based on phosphopeptide enrichment, data-dependent neutral loss mass spectrometry and a novel hierarchical database search is described. The combination of these approaches improved the confidence of phosphopeptide identifications. Chapter 4 describes the use of phosphoprotein enrichment and a tandem phosphoprotein and phosphopeptide enrichment to improve the identification of phosphoproteins and the localization of phosphorylation sites. Purification of global phosphoproteins from primary CLL B-cells was conducted using the PhosTag Zn2+ enrichment strategy at neutral pH. An SDS-PAGE gel was used to separate the purified phosphoprotein fraction, and Pro-Q Diamond staining was employed to visualize the phosphoprotein bands. Shotgun proteomic analysis was then performed to identify all the enriched phosphoproteins in the gel. Phosphopeptide enrichment was used in tandem to map phosphorylation sites of the enriched phosphoproteins.
Chapter 5 describes the identification of tyrosine phosphoproteins associated with immunotherapy of malignant cells with the small modular immunopharmaceutical targeted against CD37 (CD37-SMIP™). This drug induces apoptosis and antibody-dependent cellular cytotoxicity (ADCC) in primary chronic lymphocytic leukemia (CLL) cells. Tyrosine phosphorylation of proteins was investigated as an early activation event for the cytotoxicity. Immunoprecipitation was used to purify the phosphotyrosine proteins from treated and untreated cell lysates. Detecting the modulation of tyrosine phosphorylation after treatment and identifying the affected tyrosine phosphoproteins by proteomic approaches revealed proteins associated with the signaling pathway activated by immunotherapy. Chapter 6 describes a direct application of the proteomic platform developed in Chapter 3 combined with LC-MS protein profiling. The modulation of histone phosphorylation isoforms induced by various chemotherapy drugs was detected by LC-MS screening. We detected the dephosphorylation of histone H1 and the hyperphosphorylation of H2A.X associated with the different drug treatments.
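
The data-dependent neutral loss step mentioned for Chapter 3 rests on simple arithmetic: a phosphopeptide that loses phosphoric acid (H3PO4, about 97.977 Da) during fragmentation produces a peak at the precursor m/z minus 97.977 divided by the charge. The Python sketch below shows that check under stated assumptions; the tolerance, the relative-intensity threshold and the function names are illustrative choices, not values taken from the dissertation.

```python
H3PO4 = 97.976896  # monoisotopic mass of phosphoric acid, Da


def neutral_loss_mz(precursor_mz, charge, loss_mass=H3PO4):
    """Expected m/z of the precursor after losing a neutral fragment."""
    return precursor_mz - loss_mass / charge


def has_neutral_loss(peaks, precursor_mz, charge, tol=0.5, min_rel_intensity=0.5):
    """Return True if a prominent peak sits at the expected neutral-loss m/z.

    peaks: list of (mz, intensity) tuples from an MS/MS spectrum.
    tol and min_rel_intensity are assumed, instrument-dependent settings.
    """
    if not peaks:
        return False
    base_peak = max(intensity for _, intensity in peaks)
    target = neutral_loss_mz(precursor_mz, charge)
    return any(
        abs(mz - target) <= tol and intensity >= min_rel_intensity * base_peak
        for mz, intensity in peaks
    )


# A doubly charged precursor at m/z 700.0 with a dominant peak near 651.01.
example_peaks = [(300.2, 40.0), (651.0, 100.0)]
print(has_neutral_loss(example_peaks, 700.0, 2))
```

In a data-dependent method, a positive result from a check like this would typically trigger further fragmentation of the neutral-loss ion.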

Computational Methods for Mass Spectrometry Proteomics

Author: Ingvar Eidhammer
Publisher: John Wiley & Sons
ISBN: 9780470724293
Category : Medical
Languages : en
Pages : 296

Book Description
Proteomics is the study of the subsets of proteins present in different parts of an organism and how they change with time and varying conditions. Mass spectrometry is the leading technology used in proteomics, and the field relies heavily on bioinformatics to process and analyze the acquired data. Since recent years have seen tremendous developments in instrumentation and proteomics-related bioinformatics, there is clearly a need for a solid introduction to the crossroads where proteomics and bioinformatics meet. Computational Methods for Mass Spectrometry Proteomics describes the different instruments and methodologies used in proteomics in a unified manner. The authors put an emphasis on the computational methods for the different phases of a proteomics analysis, but the underlying principles in protein chemistry and instrument technology are also described. The book is illustrated by a number of figures and examples, and contains exercises for the reader. Written in an accessible yet rigorous style, it is a valuable reference for both informaticians and biologists. Computational Methods for Mass Spectrometry Proteomics is suited for advanced undergraduate and graduate students of bioinformatics and molecular biology with an interest in proteomics. It also provides a good introduction and reference source for researchers new to proteomics, and for people who come into more peripheral contact with the field.

Expanding the Toolbox of Tandem Mass Spectrometry with Algorithms to Identify Mass Spectra from More Than One Peptide

Author: Jian Wang
Publisher:
ISBN: 9781303217050
Category :
Languages : en
Pages : 124

Book Description
In high-throughput proteomics, the development of computational methods and the development of novel experimental strategies often rely on each other. In several areas, mass spectrometry methods for data acquisition are ahead of computational methods to interpret the resulting tandem mass (MS/MS) spectra. While there are numerous situations where two or more peptides are co-fragmented in the same MS/MS spectrum, nearly all mainstream computational approaches still assume that each MS/MS spectrum comes from only one peptide. In this thesis we addressed problems in three emerging areas where computational tools that relax this assumption are crucial for the successful application of these approaches on a large scale. In the first chapter we describe algorithms for the identification of mixture spectra that arise from more than one co-eluting peptide precursor. The ability to interpret mixture spectra not only improves peptide identification in traditional data-dependent-acquisition (DDA) workflows but is also crucial for the successful application of emerging data-independent-acquisition (DIA) techniques that have the potential to greatly improve the throughput of peptide identification. In chapter two, we address the problem of identifying peptides with complex post-translational modifications (PTMs). Detection of PTMs is important for understanding the functional dynamics of proteins. Complex PTMs result from the conjugation of another macromolecule onto the substrate protein. The resulting modified peptides not only generate spectra that contain a mixture of fragment ions from both the PTM and the substrate peptide, but also display substantially different fragmentation patterns compared with conventional, unmodified peptides. We describe a hybrid experimental and computational approach to build search tools that capture the specific fragmentation patterns of modified peptides.
Finally, in chapter three, we address the problem of identifying linked peptides. Linked peptides are two peptides that are covalently linked together. The generation and identification of linked peptides has recently been demonstrated to be a versatile tool to study protein-protein interactions and protein structures; however, the identification of linked peptides faces many challenges. We integrate lessons learned in the previous chapters to build an efficient and sensitive tool to identify linked peptides from MS/MS spectra.
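
One of the challenges with linked peptides is the size of the search space: with n candidate peptides there are n(n+1)/2 possible pairs. The Python sketch below is an illustration of that constraint and not the method described in the thesis. It assumes the precursor mass equals the sum of the two peptide masses plus a fixed linker mass (138.06808 Da is used here as an illustrative value), and it uses a sorted list with binary search to list the pairs that fit within a tolerance. The names `n_pairs` and `candidate_pairs` are hypothetical.

```python
from bisect import bisect_left, bisect_right


def n_pairs(n_peptides):
    """Number of unordered peptide pairs, including a peptide linked to itself."""
    return n_peptides * (n_peptides + 1) // 2


def candidate_pairs(peptide_masses, precursor_mass, linker_mass, tol=0.01):
    """Index pairs (i, j), i <= j, whose masses plus the linker fit the precursor."""
    target = precursor_mass - linker_mass
    order = sorted(range(len(peptide_masses)), key=peptide_masses.__getitem__)
    sorted_masses = [peptide_masses[k] for k in order]
    pairs = []
    for a, mass in enumerate(sorted_masses):
        # Partner masses must lie in [target - mass - tol, target - mass + tol].
        lo = max(a, bisect_left(sorted_masses, target - mass - tol))
        hi = bisect_right(sorted_masses, target - mass + tol)
        for b in range(lo, hi):
            i, j = order[a], order[b]
            pairs.append((min(i, j), max(i, j)))
    return sorted(pairs)


LINKER = 138.06808  # assumed linker mass in Da, for illustration only
masses = [500.0, 800.25, 1000.5, 1200.0]
precursor = 800.25 + 1000.5 + LINKER
print(n_pairs(1000), candidate_pairs(masses, precursor, LINKER))
```

Filtering by precursor mass in this way narrows the pair list before any fragment-level scoring is attempted.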

New Methods for the Analysis of Noncovalent Biological Complexes and Protein Posttranslational Modifications

Author: Yonghao Yu
Publisher:
ISBN:
Category :
Languages : en
Pages : 480

Book Description


Penalty-Based Dynamic Programming for the Identification of Post-Translational Modifications in Peptide Mass Spectra

Author: Laurence Elliot Bernstein
Publisher:
ISBN:
Category :
Languages : en
Pages : 125

Book Description
Tandem mass spectrometry (MS/MS) has long been the leading method of identifying peptides and proteins in complex biological samples, and many algorithms have been created for this purpose. Many of the methods for searching MS/MS spectra against a database of known proteins must restrict the number of post-translational modifications (PTMs) that they can identify, because the larger the number of PTMs being considered, the larger the search space, which in turn increases both computational complexity and the potential for false matches. In addition, these algorithms cannot discover new peptides or homologues or be used with species for which a protein database does not exist. Newer algorithms have been developed that perform "open" or "blind" searches capable of finding any possible modification; however, these methods increase the search space even further, often resulting in lower performance and the generation of many putative modification masses that must be sifted through manually to determine which are real. To address the shortcomings of the existing methods, we created a new blind database search algorithm based on spectral networks. Our method uses a modification of the standard spectral tagging filtration techniques tailored for contig-consensus spectra generated from spectral networks, along with a first-of-its-kind penalty-based dynamic programming spectrum-database alignment algorithm that is able to accurately identify both a priori specified modifications and novel PTMs. We then developed a workflow based on these new techniques that combines previous work in clustering, spectral alignment, spectral networks, and multi-spectral assembly.
Because our new algorithm only identifies spectra that lie within the spectral networks, we created a workflow, called RaVen, that merges our method with MS-GF+ and combines the results from both, yielding a massive improvement in overall identification rates over existing methods while at the same time identifying many more rare modifications in samples. We also propose an improved way of measuring the accuracy of blind search algorithms: "peptide variants", which better captures the goals of blind search methods and does not rely on precise localization of modifications (which is very difficult to achieve for most algorithms).
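
The claim that the search space grows with the number of PTMs considered can be made concrete with a small count. The Python sketch below is an illustration under simplifying assumptions, not part of the described algorithm: every modifiable site can carry at most one modification, any of the modification types can occur at any site, and a cap limits the total number of modifications per peptide. The function name `modified_forms` is hypothetical.

```python
from math import comb


def modified_forms(n_sites, n_mod_types, max_mods):
    """Count the distinct modified forms of one peptide.

    Choose k of the n_sites positions, then one of n_mod_types for each,
    for every k from 0 (unmodified) up to the cap max_mods.
    """
    return sum(
        comb(n_sites, k) * n_mod_types ** k
        for k in range(min(n_sites, max_mods) + 1)
    )


# One peptide with 5 modifiable sites under increasingly permissive settings.
for n_types, cap in [(1, 1), (3, 2), (10, 3)]:
    print(n_types, cap, modified_forms(5, n_types, cap))
```

Each additional modification type multiplies the number of candidate forms per peptide, which is why restricted searches cap both the list of modifications and the number allowed per peptide.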

Machine Learning for Protein Subcellular Localization Prediction

Author: Shibiao Wan
Publisher: Walter de Gruyter GmbH & Co KG
ISBN: 1501501526
Category : Technology & Engineering
Languages : en
Pages : 213

Book Description
Comprehensively covers protein subcellular localization from single-label prediction to multi-label prediction, and includes prediction strategies for virus, plant, and eukaryote species. Three machine learning tools are introduced to improve classification refinement, feature extraction, and dimensionality reduction.