Author: Claude Muller
Publisher: Presses Univ. Franche-Comté
ISBN: 9782848670621
Category : Applied linguistics
Languages : fr
Pages : 370
Book Description
INTEX pour la linguistique et le traitement automatique des langues
Author: Claude Muller
Publisher: Presses Univ. Franche-Comté
ISBN: 9782848670621
Category : Applied linguistics
Languages : fr
Pages : 370
Book Description
Automatic Processing of Natural-Language Electronic Texts with NooJ
Author: Linda Barone
Publisher: Springer
ISBN: 3319550020
Category : Computers
Languages : en
Pages : 266
Book Description
This book constitutes the refereed proceedings of the 10th International Conference, NooJ 2016, held in České Budějovice, Czech Republic, in June 2016. The 21 revised full papers presented in this volume were carefully reviewed and selected from 45 submissions. NooJ is a linguistic development environment that provides tools for linguists to construct linguistic resources that formalise a large gamut of linguistic phenomena: typography, orthography, lexicons for simple words, multiword units and discontinuous expressions, inflectional and derivational morphology, local, structural and transformational syntax, and semantics.
Applications of Finite-State Language Processing
Author: Svetla Koeva
Publisher: Cambridge Scholars Publishing
ISBN: 1443826030
Category : Language Arts & Disciplines
Languages : en
Pages : 225
Book Description
NooJ is both a corpus processing tool and a linguistic development environment. It allows linguists to formalize several levels of linguistic phenomena: orthography and spelling, lexicons for simple words, multiword units and frozen expressions, inflectional, derivational and productive morphology, and local, structural and transformational syntax. For each of these levels, NooJ provides linguists with one or more formal tools specifically designed to facilitate the description of each phenomenon, as well as parsing tools designed to be as computationally efficient as possible. This approach distinguishes NooJ from most computational linguistic tools, which provide a single formalism that should describe everything. As a corpus processing tool, NooJ allows users to apply sophisticated linguistic queries to large corpora in order to build indices and concordances, annotate texts automatically, perform statistical analyses, etc. NooJ is freely available and linguistic modules can already be downloaded for Acadian, Arabic, Armenian, Bulgarian, Catalan, Chinese, Croatian, French, English, German, Hebrew, Greek, Hungarian, Italian, Polish, Portuguese, Spanish and Turkish. The present volume contains papers from the 2008 International NooJ conference, which was held 8–10 June 2008 in Budapest. While the focus of the Budapest conference was on making NooJ compatible with other applications, the papers vary with respect to whether they regard Natural Language Processing (NLP) as a research goal or as a tool. However, they all present a slightly different problem either in the field of NLP or in one that can be solved using NLP, or present a new development in the tool itself. The range of problems dealt with in the volume is quite varied, which should help readers find contributions relevant to their field of interest.
Automatic Processing of Various Levels of Linguistic Phenomena
Author: Božo Bekavac
Publisher: Cambridge Scholars Publishing
ISBN: 1443837121
Category : Language Arts & Disciplines
Languages : en
Pages : 280
Book Description
Every year since 2002, the linguistic development environment NooJ has been enhanced with new online features that allow social scientists to develop new applications and explore new domains. The 2011 conference was no exception, and the arrival of v3.0 has brought many more features and a new range of applications, from the analysis of ancient Arabic and Old English texts to the analysis of conversations held by the Mars500 mission’s astronauts. At the 2011 conference, members of the European Meta-Net CESAR project announced that NooJ will soon be available as open source and will become the de facto standard tool for corpus processing in European research in social science. Today, NooJ is used as a research tool in over 30 academic and research centers in the world and there are NooJ modules available for over 20 languages. The international NooJ conference is organized every year; 50 participants present their work in the domains of linguistic formalization, corpus processing and Natural Language Processing applications. The present volume contains a selection of papers from the NooJ 2011 International Conference, which was held on 13–15 June 2011 in Dubrovnik, Croatia. This volume presents problems dealing with machine translation, information extraction, processing of multi-word units, automatic disambiguation, semantic analysis, and psychological and literary analysis of various corpora.
Constructing Interpersonality
Author: Enrique Lafuente-Millán
Publisher: Cambridge Scholars Publishing
ISBN: 144382027X
Category : Foreign Language Study
Languages : en
Pages : 380
Book Description
The view that academic discourse is, by definition, impersonal has long been superseded. It seems unquestionable now that the interpersonal component of texts, that is, the ways in which the writers project themselves and their audience in the discourse, is an essential factor determining the success of scholarly communication and has become a fundamental issue in the field of English for Academic Purposes (EAP). Interpersonality is the key issue on which the articles in this edited book focus. The eighteen contributions included in this volume provide a wide exploratory view of the many academic genres in which interpersonality is manifested and the various analytical approaches from which the textual manifestation of that interpersonality can be studied. The varied origin of the contributors is also representative of the global interest that the issue of interpersonality arouses in the field of academic discourse analysis. The present volume constitutes a highly valuable tool for applied linguists and discourse analysts with an interest in EAP as well as for students, instructors and language teachers interested in academic discourse. The book may also be of interest to other agents intervening in the research publication process, such as translators, proofreaders, reviewers and editors.
Building and Exploring Web Corpora (WAC3 - 2007)
Author: Cédrick Fairon
Publisher: Presses univ. de Louvain
ISBN: 9782874630828
Category : Language Arts & Disciplines
Languages : en
Pages : 186
Book Description
WAC: More and more people are using Web data for linguistic and NLP research. The Web as Corpus workshop (WAC) provides a venue for exploring how we can use it effectively and the advancements to which this could lead. This book is a collection of the talks presented at the 3rd WAC in Louvain-la-Neuve (Belgium). The focus is on the description of Web corpus collection projects, the exploration of Web data characteristics from a linguistics/NLP perspective, and on the use of crawled Web data for NLP purposes. CLEANEVAL: Any use of Web data requires that it be cleaned in order to get rid of unwanted material including, for example, HTML markup, navigation bars, advertisements. To date there has been no sharing of resources or expertise in this particular domain and the cleaning has often been done minimally. Cleaneval was an exercise aimed at promoting collaboration and improving our understanding of the issues. Results and perspectives are presented in this book.
Formalising Natural Languages with NooJ 2013
Author: Svetla Koeva
Publisher: Cambridge Scholars Publishing
ISBN: 1443860670
Category : Computers
Languages : en
Pages : 250
Book Description
This volume contains 17 articles, developed from papers that were chosen from among the 44 presentations of work on NooJ given at the 2013 International NooJ Conference in Saarbrücken in June 2013. NooJ is a linguistic development environment that allows linguists to formalize a wide gamut of linguistic phenomena, and then test, adapt, share and accumulate each elementary description to build linguistic “modules”, that is, structured libraries of linguistic resources. NooJ is also used as a corpus processor that can launch sophisticated queries over large corpora of texts, in order to produce various results, including concordances, statistical analyses, information extraction, and automatic translation. NooJ is used in many research centers; it has recently been endorsed by the European Metashare CESAR Project, and is now available as open source software at the METASHARE repository. NooJ is also used by a growing number of software companies to construct various Natural Language Processing applications.
Natural Language Processing of Semitic Languages
Author: Imed Zitouni
Publisher: Springer Science & Business
ISBN: 3642453589
Category : Computers
Languages : en
Pages : 477
Book Description
Research in Natural Language Processing (NLP) has rapidly advanced in recent years, resulting in exciting algorithms for sophisticated processing of text and speech in various languages. Much of this work focuses on English; in this book we address another group of interesting and challenging languages for NLP research: the Semitic languages. The Semitic group of languages includes Arabic (206 million native speakers), Amharic (27 million), Hebrew (7 million), Tigrinya (6.7 million), Syriac (1 million) and Maltese (419 thousand). Semitic languages exhibit unique morphological processes, challenging syntactic constructions and various other phenomena that are less prevalent in other natural languages. These challenges call for unique solutions, many of which are described in this book. The 13 chapters presented in this book bring together leading scientists from several universities and research institutes worldwide. While this book devotes some attention to cutting-edge algorithms and techniques, its primary purpose is a thorough explication of best practices in the field. Furthermore, every chapter describes how the techniques discussed apply to Semitic languages. The book covers both statistical approaches to NLP, which are dominant across various applications nowadays, and the more traditional rule-based approaches, which have proven useful for several other application domains. We hope that this book will provide a "one-stop shop" for all the requisite background and practical advice when building NLP applications for Semitic languages.
Corpus Linguistics, Computer Tools, and Applications - State of the Art
Author: Barbara Lewandowska-Tomaszczyk
Publisher: Peter Lang
ISBN: 9783631583111
Category : Computers
Languages : en
Pages : 772
Book Description
Contents: Barbara Lewandowska-Tomaszczyk: PALC 2007: Where are we now? - Paul Rayson/Dawn Archer/Alistair Baron/Nicholas Smith: Travelling through time with corpus annotation software - Eugene H. Casad: Parsing texts and compiling a dictionary with shoebox - Belinda Maia/Rui Silva/Anabela Barreiro/Cecília Fróis: 'N-grams in search of theories' - Piotr Pęzik/Jung-jae Kim/Dietrich Rebholz-Schuhmann: MedEvi - A permuted concordancer for the biomedical domain - Patrick Hanks: Why the «word sense disambiguation problem» can't be solved, and what should be done instead - Rafał
Computational Linguistics and Intelligent Text Processing
Author: Alexander Gelbukh
Publisher: Springer Science & Business Media
ISBN: 3642194362
Category : Computers
Languages : en
Pages : 541
Book Description
This two-volume set, consisting of LNCS 6608 and LNCS 6609, constitutes the thoroughly refereed proceedings of the 12th International Conference on Computational Linguistics and Intelligent Text Processing, held in Tokyo, Japan, in February 2011. The 74 full papers, presented together with 4 invited papers, were carefully reviewed and selected from 298 submissions. The contents have been ordered according to the following topical sections: lexical resources; syntax and parsing; part-of-speech tagging and morphology; word sense disambiguation; semantics and discourse; opinion mining and sentiment detection; text generation; machine translation and multilingualism; information extraction and information retrieval; text categorization and classification; summarization and recognizing textual entailment; authoring aid, error correction, and style analysis; and speech recognition and generation.