Comparable Corpora and Computer-assisted Translation PDF Download

Comparable Corpora and Computer-assisted Translation

Author: Estelle Maryline Delpech
Publisher: John Wiley & Sons
ISBN: 1119002702
Category: Computers
Languages: en
Pages: 221

Book Description
Computer-assisted translation (CAT) has always used translation memories, which require the translator to have a corpus of previous translations that the CAT software can use to generate bilingual lexicons. This can be problematic when the translator does not have such a corpus, for instance, when the text belongs to an emerging field. To solve this issue, CAT research has looked into leveraging comparable corpora, i.e. sets of texts in two or more languages that deal with the same topic but are not translations of one another. This work has two primary objectives. The first is to assess the contribution of lexicons extracted from comparable corpora in the context of a specialized human translation task. The second is to identify the bilingual-lexicon-extraction methods that best match translators' needs, determining the current limits of these techniques and suggesting improvements. The author focuses, in particular, on the identification of fertile translations, the management of multiple morphological structures, and the ranking of candidate translations. The experiments are carried out on two language pairs (English–French and English–German) and on specialized texts dealing with breast cancer. This research puts significant emphasis on applicability: methodological choices are guided by the needs of the final users. The book is organized in two parts: the first part presents the applicative and scientific context of the research, and the second part is given over to efforts to improve compositional translation. The research work presented in this book received the PhD Thesis award 2014 from the French association for natural language processing (ATALA).

Using Comparable Corpora for Under-resourced Areas of Machine Translation

Author: Inguna Skadina
Publisher:
ISBN: 9783319990057
Category: Corpora (Linguistics)
Languages: en
Pages: 323

Book Description
This book provides an overview of how comparable corpora can be used to overcome the lack of parallel resources when building machine translation systems for under-resourced languages and domains. It presents a wealth of methods and open tools for building comparable corpora from the Web, evaluating comparability and extracting parallel data that can be used for the machine translation task. It is divided into several sections, each covering a specific task such as building, processing, and using comparable corpora, focusing particularly on under-resourced language pairs and domains. The book is intended for anyone interested in data-driven machine translation for under-resourced languages and domains, especially for developers of machine translation systems, computational linguists and language workers. It offers a valuable resource for specialists and students in natural language processing, machine translation, corpus linguistics and computer-assisted translation, and promotes the broader use of comparable corpora in natural language processing and computational linguistics.

Building and Using Comparable Corpora

Author: Serge Sharoff
Publisher: Springer Science & Business Media
ISBN: 3642201288
Category: Computers
Languages: en
Pages: 333

Book Description
The 1990s saw a paradigm change in the use of corpus-driven methods in NLP. In the field of multilingual NLP (such as machine translation and terminology mining) this implied the use of parallel corpora. However, parallel resources are relatively scarce: many more texts are produced daily by native speakers of any given language than translated. This situation resulted in a natural drive towards the use of comparable corpora, i.e. non-parallel texts in the same domain or genre. Nevertheless, this research direction has not produced a single authoritative source suitable for researchers and students coming to the field. The proposed volume provides a reference source, identifying the state of the art in the field as well as future trends. The book is intended for specialists and students in natural language processing, machine translation and computer-assisted translation.

Using Comparable Corpora for Under-Resourced Areas of Machine Translation

Author: Inguna Skadiņa
Publisher: Springer
ISBN: 3319990047
Category: Computers
Languages: en
Pages: 326

Book Description
This book provides an overview of how comparable corpora can be used to overcome the lack of parallel resources when building machine translation systems for under-resourced languages and domains. It presents a wealth of methods and open tools for building comparable corpora from the Web, evaluating comparability and extracting parallel data that can be used for the machine translation task. It is divided into several sections, each covering a specific task such as building, processing, and using comparable corpora, focusing particularly on under-resourced language pairs and domains. The book is intended for anyone interested in data-driven machine translation for under-resourced languages and domains, especially for developers of machine translation systems, computational linguists and language workers. It offers a valuable resource for specialists and students in natural language processing, machine translation, corpus linguistics and computer-assisted translation, and promotes the broader use of comparable corpora in natural language processing and computational linguistics.

Machine Learning in Translation Corpora Processing

Author: Krzysztof Wolk
Publisher: CRC Press
ISBN: 0429588836
Category: Computers
Languages: en
Pages: 205

Book Description
This book reviews ways to improve statistical machine speech translation between Polish and English. Research has been conducted mostly on dictionary-based, rule-based, and syntax-based machine translation techniques. Most popular methodologies and tools are not well-suited for the Polish language and therefore require adaptation, and language resources are lacking in parallel and monolingual data. The main objective of this volume is to develop an automatic and robust Polish-to-English translation system to meet specific translation requirements and to develop bilingual textual resources by mining comparable corpora.

Parallel Corpora for Contrastive and Translation Studies

Author: Irene Doval
Publisher: John Benjamins Publishing Company
ISBN: 9027262845
Category: Language Arts & Disciplines
Languages: en
Pages: 313

Book Description
This volume assesses the state of the art of parallel corpus research as a whole, reporting on advances both in recent developments of parallel corpora (with some particular references to comparable corpora as well) and in ways of exploiting them for a variety of purposes. The first part of the book is devoted to new roles that parallel corpora can and should assume in translation studies and in contrastive linguistics, to the usefulness and usability of parallel corpora, and to advances in parallel corpus alignment, annotation and retrieval. There follows an up-to-date presentation of a number of parallel corpus projects currently being carried out in Europe, some of them multimodal, with certain chapters illustrating case studies developed on the basis of the corpora at hand. In most of these chapters, attention is paid to specific technical issues of corpus building. The third part of the book reflects on specific applications and on the creation of bilingual resources from parallel corpora. This volume will be welcomed by scholars, postgraduate and PhD students in the fields of contrastive linguistics, translation studies, lexicography, language teaching and learning, machine translation, and natural language processing.

New directions in corpus-based translation studies

Author: Claudio Fantinuoli
Publisher: Language Science Press
ISBN: 3944675835
Category: Language Arts & Disciplines
Languages: en
Pages: 175

Book Description
Corpus-based translation studies has become a major paradigm and research methodology and has investigated a wide variety of topics in the last two decades. The contributions to this volume add to the range of corpus-based studies by providing examples of some less explored applications of corpus analysis methods to translation research. They show that the area keeps evolving as it constantly opens up to different frameworks and approaches, from appraisal theory to process-oriented analysis, and encompasses multiple translation settings, including (indirect) literary translation, machine (assisted)-translation and the practical work of professional legal translators. The studies included in the volume also expand the range of corpus applications in terms of the tools used to accomplish the research tasks outlined.

Neural Machine Translation

Author: Philipp Koehn
Publisher: Cambridge University Press
ISBN: 1108497322
Category: Computers
Languages: en
Pages: 409

Book Description
Learn how to build machine translation systems with deep learning from the ground up, from basic concepts to cutting-edge research.

Corpus-based Perspectives in Linguistics

Author: Yuji Kawaguchi
Publisher: John Benjamins Publishing
ISBN: 9789027233189
Category: Language Arts & Disciplines
Languages: en
Pages: 464

Book Description
UBLI has conducted field surveys since 2002 and built spoken language corpora for French, Spanish, Italian (Salentino dialect), Russian, Malaysian, Turkish, Japanese, and Canadian multilinguals. This volume features new research presented at the second UBLI workshop on Corpus Linguistics – Research Domain, which was held on September 14, 2006. The first part, consisting of eleven presentations given at this workshop, covers a wide range of subjects within the area of corpus-based research, such as dictionaries, linguistic atlases, dialects, translation, ancient texts, non-standard texts, sociolinguistics, second language acquisition, and natural language processing. The second part of this volume comprises ten additional contributions on both written and spoken corpora by the members and research assistants of UBLI.

Corpus Use and Translating

Author: Allison Beeby
Publisher: John Benjamins Publishing
ISBN: 9027291063
Category: Language Arts & Disciplines
Languages: en
Pages: 166

Book Description
Professional translators are increasingly dependent on electronic resources, and trainee translators need to develop skills that allow them to make the best use of these resources. The aim of this book is to show how CULT (Corpus Use for Learning to Translate) methodologies can be used to prepare learning materials, and how novice translators can become autonomous users of corpora. Readers interested in translation studies, translator training and corpus linguistics will find the book particularly useful. Not only does it include practical, technical advice for using and learning to use corpora, but it also addresses important issues such as the balance between training and education and how CULT methodologies reinforce student autonomy and responsibility. Not only is this a good introduction to CULT, but it also incorporates the latest developments in this field, showing the advantages of using these methodologies in competence-based learning.