Query Processing in Database Systems
Author: W. Kim
Publisher: Springer Science & Business Media
ISBN: 3642823750
Category : Computers
Languages : en
Pages : 367
Book Description
This book is an anthology of the results of research and development in database query processing during the past decade. The relational model of data provided tremendous impetus for research into query processing. Since a relational query does not specify access paths to the stored data, the database management system (DBMS) must provide an intelligent query-processing subsystem which will evaluate a number of potentially efficient strategies for processing the query and select the one that optimizes a given performance measure. The degree of sophistication of this subsystem, often called the optimizer, critically affects the performance of the DBMS. Research into query processing thus started has taken off in several directions during the past decade. The emergence of research into distributed databases has enormously complicated the tasks of the optimizer. In a distributed environment, the database may be partitioned into horizontal or vertical fragments of relations. Replicas of the fragments may be stored in different sites of a network and even migrate to other sites. The measure of performance of a query in a distributed system must include the communication cost between sites. To minimize communication costs for queries involving multiple relations across multiple sites, optimizers may also have to consider semi-join techniques.
Database Systems for Advanced Applications
Author: Weiyi Meng
Publisher: Springer
ISBN: 3642374875
Category : Computers
Languages : en
Pages : 509
Book Description
This two volume set LNCS 7825 and LNCS 7826 constitutes the refereed proceedings of the 18th International Conference on Database Systems for Advanced Applications, DASFAA 2013, held in Wuhan, China, in April 2013. The 51 revised full papers and 10 short papers presented together with 2 invited keynote talks, 1 invited paper, 3 industrial papers, 9 demo presentations, 4 tutorials and 1 panel paper were carefully reviewed and selected from a total of 227 submissions. The topics covered in part 1 are social networks; query processing; nearest neighbor search; index; query analysis; XML data management; privacy protection; and uncertain data management; and in part 2: graph data management; physical design; knowledge management; temporal data management; social networks; query processing; data mining; applications; and database applications.
Query Processing over Uncertain Databases
Author: Lei Chen
Publisher: Springer Nature
ISBN: 3031018966
Category : Computers
Languages : en
Pages : 91
Book Description
Due to measurement errors, transmission loss, or injected noise for privacy protection, uncertainty exists in the data of many real applications. However, query processing techniques for deterministic data cannot be directly applied to uncertain data because they do not have mechanisms to handle data uncertainty. Therefore, efficient and effective manipulation of uncertain data is a practical yet challenging research topic. In this book, we start from the data models for imprecise and uncertain data, move on to defining different semantics for queries on uncertain data, and finally discuss the advanced query processing techniques for various probabilistic queries in uncertain databases. The book serves as a comprehensive guideline for query processing over uncertain databases. Table of Contents: Introduction / Uncertain Data Models / Spatial Query Semantics over Uncertain Data Models / Spatial Query Processing over Uncertain Databases / Conclusion
Query Processing over Incomplete Databases
Author: Yunjun Gao
Publisher: Springer Nature
ISBN: 303101863X
Category : Computers
Languages : en
Pages : 106
Book Description
Incomplete data is part of life and of almost all areas of scientific study. Users tend to skip certain fields when they fill out online forms; participants choose to ignore sensitive questions on surveys; sensors fail, resulting in the loss of certain readings; publicly viewable satellite map services have missing data in many mobile applications; and in privacy-preserving applications, the data is deliberately left incomplete in order to protect sensitive attribute values. Query processing is a fundamental problem in computer science, and is useful in a variety of applications. In this book, we mostly focus on query processing over incomplete databases, which involves finding a set of qualified objects from a specified incomplete dataset in order to support a wide spectrum of real-life applications. We first elaborate on the three general kinds of methods for handling incomplete data: (i) discarding the data with missing values, (ii) imputing the missing values, and (iii) depending only on the observed data values. For the third type of method, we introduce the semantics of k-nearest neighbor (kNN) search, skyline query, and top-k dominating query on incomplete data, respectively. For these three representative queries over incomplete data, we investigate some advanced processing techniques, including indexing, pruning, and crowdsourcing.
Scalable Uncertainty Management
Author: Sergio Greco
Publisher: Springer Science & Business Media
ISBN: 3540879927
Category : Computers
Languages : en
Pages : 411
Book Description
This book constitutes the refereed proceedings of the Second International Conference on Scalable Uncertainty Management, SUM 2008, held in Naples, Italy, in October 2008. The 27 revised full papers presented together with the extended abstracts of 3 invited talks/tutorials were carefully reviewed and selected from 42 submissions. The papers address artificial intelligence researchers, database researchers, and practitioners to demonstrate theoretical techniques required to manage the uncertainty that arises in large scale real world applications and to cope with large volumes of uncertainty and inconsistency in databases, the Web, the semantic Web, and artificial intelligence in general.
Advanced Research on Computer Science and Information Engineering
Author: Gang Shen
Publisher: Springer Science & Business Media
ISBN: 3642214010
Category : Computers
Languages : en
Pages : 524
Book Description
This two-volume set (CCIS 152 and CCIS 153) constitutes the refereed proceedings of the International Conference on Computer Science and Information Engineering, CSIE 2011, held in Zhengzhou, China, in May 2011. The 159 revised full papers presented in both volumes were carefully reviewed and selected from a large number of submissions. The papers present original research results that are broadly relevant to the theory and applications of Computer Science and Information Engineering and address a wide variety of topics such as algorithms, automation, artificial intelligence, bioinformatics, computer networks, computer security, computer vision, modeling and simulation, databases, data mining, e-learning, e-commerce, e-business, image processing, knowledge management, multimedia, mobile computing, natural computing, open and innovative education, pattern recognition, parallel computing, robotics, wireless networks, and Web applications.
Grid and Cloud Database Management
Author: Sandro Fiore
Publisher: Springer Science & Business Media
ISBN: 3642200451
Category : Computers
Languages : en
Pages : 353
Book Description
Since the 1990s Grid Computing has emerged as a paradigm for accessing and managing distributed, heterogeneous and geographically spread resources, promising that we will be able to access computer power as easily as we can access the electric power grid. Later on, Cloud Computing brought the promise of providing easy and inexpensive access to remote hardware and storage resources. Exploiting pay-per-use models and virtualization for resource provisioning, cloud computing has been rapidly accepted and used by researchers, scientists and industries. In this volume, contributions from internationally recognized experts describe the latest findings on challenging topics related to grid and cloud database management. By exploring current and future developments, they provide a thorough understanding of the principles and techniques involved in these fields. The presented topics are well balanced and complementary, and they range from well-known research projects and real case studies to standards and specifications, and non-functional aspects such as security, performance and scalability. Following an initial introduction by the editors, the contributions are organized into four sections: Open Standards and Specifications, Research Efforts in Grid Database Management, Cloud Data Management, and Scientific Case Studies. With this presentation, the book serves mostly researchers and graduate students, both as an introduction to and as a technical reference for grid and cloud database management. The detailed descriptions of research prototypes dealing with spatiotemporal or genomic data will also be useful for application engineers in these fields.
Inconsistency Tolerance
Author: Leopoldo Bertossi
Publisher: Springer Science & Business Media
ISBN: 3540242600
Category : Computers
Languages : en
Pages : 300
Book Description
Inconsistency arises in many areas in advanced computing. Often inconsistency is unwanted, for example in the specification for a plan or in sensor fusion in robotics; however, sometimes inconsistency is useful. Whether inconsistency is unwanted or useful, there is a need to develop tolerance to inconsistency in application technologies such as databases, knowledge bases, and software systems. To address this situation, inconsistency tolerance is being built on foundational technologies for identifying and analyzing inconsistency in information, for representing and reasoning with inconsistent information, for resolving inconsistent information, and for merging inconsistent information. The idea for this book arose out of a Dagstuhl Seminar on the topic held in summer 2003. The nine chapters in this first book devoted to the subject of inconsistency tolerance were carefully invited and anonymously reviewed. The book provides an exciting introduction to this new field.
Database Systems for Advanced Applications. DASFAA 2021 International Workshops
Author: Christian S. Jensen
Publisher: Springer Nature
ISBN: 3030732169
Category : Computers
Languages : en
Pages : 446
Book Description
This volume constitutes the papers of several workshops which were held in conjunction with the 26th International Conference on Database Systems for Advanced Applications, DASFAA 2021, held in Taipei, Taiwan, in April 2021. The 29 revised full papers presented in this book were carefully reviewed and selected from 84 submissions. DASFAA 2021 presents the following five workshops: 6th International Workshop on Big Data Quality Management (BDQM 2021); 5th International Workshop on Graph Data Management and Analysis (GDMA 2021); First International Workshop on Machine Learning and Deep Learning for Data Security Applications (MLDLDSA 2021); 6th International Workshop on Mobile Data Management, Mining, and Computing on Social Network (MobiSocial 2021); and 2021 International Workshop on Mobile Ubiquitous Systems and Technologies (MUST 2021). Due to the Corona pandemic this event was held virtually.
Data Cleaning
Author: Ihab F. Ilyas
Publisher: Morgan & Claypool
ISBN: 1450371558
Category : Computers
Languages : en
Pages : 284
Book Description
This is an overview of the end-to-end data cleaning process. Data quality is one of the most important problems in data management, since dirty data often leads to inaccurate data analytics results and incorrect business decisions. Poor data across businesses and the U.S. government are reported to cost trillions of dollars a year. Multiple surveys show that dirty data is the most common barrier faced by data scientists. Not surprisingly, developing effective and efficient data cleaning solutions is challenging and is rife with deep theoretical and engineering problems. This book is about data cleaning, which is used to refer to all kinds of tasks and activities to detect and repair errors in the data. Rather than focus on a particular data cleaning task, this book describes various error detection and repair methods, and attempts to anchor these proposals with multiple taxonomies and views. Specifically, it covers four of the most common and important data cleaning tasks, namely, outlier detection, data transformation, error repair (including imputing missing values), and data deduplication. Furthermore, due to the increasing popularity and applicability of machine learning techniques, it includes a chapter that specifically explores how machine learning techniques are used for data cleaning, and how data cleaning is used to improve machine learning models. This book is intended to serve as a useful reference for researchers and practitioners who are interested in the area of data quality and data cleaning. It can also be used as a textbook for a graduate course. Although we aim at covering state-of-the-art algorithms and techniques, we recognize that data cleaning is still an active field of research and therefore provide future directions of research whenever appropriate.