With post-editing of MT output becoming a common practice in fast-paced Computer-Assisted Translation (CAT) workflows, research on Quality Estimation (QE) has thrived in recent years. Despite the link between MT errors and the cognitive effort involved in correcting them, current QE studies often focus on finding informative features that capture the monolingual and bilingual properties of given source/MT output pairs and estimate overall post-editing effort at word, sentence or document level, without making a distinction between MT error types. This thesis presents a comprehensive approach to automatic error detection as a basis for understanding the relationship between different types of MT errors and the corresponding post-editing effort, and takes a first step towards informative quality estimation systems for machine translation, which are able to justify the basis for estimated quality.

In order to study the relationship between MT errors and post-editing effort on a large scale, we developed an error taxonomy and a corpus of MT errors originating from statistical (SMT), rule-based and neural machine translation systems for English-Dutch, and obtained post-edited versions of the MT output of this corpus. The error taxonomy is grounded in the translation quality assessment literature and allows for an MT-specific, fine-grained error annotation based on the main distinction between accuracy and fluency errors. Moreover, the hierarchical nature of this taxonomy enables the analysis of MT errors at different levels. To demonstrate the validity and reliability of this taxonomy, we proposed a novel method for alignment-based Inter-Annotator Agreement (IAA) analysis and show that this method can be used effectively on large annotation sets.

Different effort indicators have been introduced in the literature to measure post-editing effort. Under the assumption that correcting MT errors requires cognitive demand, one would ideally like to measure cognitive effort directly in order to understand the relationship between MT errors and post-editing effort. However, as this is extremely complex and difficult to achieve for large data sets, we used, as suggested in the literature, temporal effort (or post-editing time) as an indirect measurement of cognitive post-editing effort. Apart from temporal post-editing effort, we also use technical post-editing effort indicators based on edit distance throughout this thesis, as these are commonly used in the research community.

As supervised machine learning algorithms rely on the availability of large labelled data sets, the manually annotated corpus of SMT errors serves as training data to build, in a first step, error detection systems and, in a second step, quality estimation systems. Using the corpus of SMT errors, in a series of experiments, we sought to determine whether MT errors can explain the post-editing effort indicators, and analyzed the informativeness of the individual error types for estimating post-editing time. Our results show that post-editing time can be estimated with high accuracy when all the translation errors in the MT output are known. Furthermore, we applied feature selection methods and investigated the predictive power of different MT error types on post-editing time. These results suggest that it is possible to find a minimal set of MT error types without reducing the estimation performance, which, in addition, indicates that informative QE systems that are based on individual error detection systems can be built with less effort, namely by detecting only the error types with the highest predictive power.
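The technical post-editing effort indicators based on edit distance mentioned above can be illustrated with a minimal sketch. This is not the thesis implementation; it computes a word-level Levenshtein distance between the MT output and its post-edited version, normalised by the length of the post-edited reference, in the spirit of (H)TER. Function names are illustrative.

```python
def word_edit_distance(mt: str, post_edit: str) -> int:
    """Levenshtein distance over word tokens (insertions, deletions, substitutions)."""
    a, b = mt.split(), post_edit.split()
    # prev[j] holds the edit distance between the first i-1 tokens of a
    # and the first j tokens of b (standard dynamic-programming recurrence)
    prev = list(range(len(b) + 1))
    for i, wa in enumerate(a, 1):
        curr = [i]
        for j, wb in enumerate(b, 1):
            cost = 0 if wa == wb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def ter_like_score(mt: str, post_edit: str) -> float:
    """Word-level edit distance normalised by reference length."""
    ref_len = len(post_edit.split())
    return word_edit_distance(mt, post_edit) / max(ref_len, 1)
```

Note that TER proper also allows block shifts as a single edit operation; this sketch covers only the basic Levenshtein operations, which is sufficient to convey the idea of a technical effort indicator.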
Despite the recent advances in the field of machine translation (MT), today, MT systems cannot guarantee that the sentences they produce will be fluent and coherent in both syntax and semantics. Various studies show that statistical machine translation (SMT) systems suffer from fluency errors, especially in the form of grammatical errors and errors related to idiomatic word choices. In this study, we investigate the effectiveness of using monolingual information contained in the machine-translated text to estimate word-level quality of SMT output. We propose a recurrent neural network architecture which uses morpho-syntactic features and word embeddings as word representations within surface and syntactic n-grams. We test the proposed method on two language pairs and for two tasks, namely detecting fluency errors and predicting overall post-editing effort. Our results show that this method is effective for capturing all types of fluency errors at once. Moreover, on the task of predicting post-editing effort, while solely relying on monolingual information, it achieves on-par results with the state-of-the-art quality estimation systems which use both bilingual and monolingual information.
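The surface n-grams that serve as word representations in the proposed architecture can be sketched as follows. This is an illustrative simplification, not the authors' implementation: each token of an MT hypothesis is represented by the n-gram centred on it, padded at sentence boundaries; in the actual system such contexts are filled with word embeddings and morpho-syntactic features before being fed to the recurrent network. The padding symbols `<s>` and `</s>` are assumptions for illustration.

```python
from typing import List, Tuple

def surface_ngrams(tokens: List[str], n: int = 3) -> List[Tuple[str, ...]]:
    """Return one centred surface n-gram per token (n is assumed odd).

    Sentence boundaries are padded with <s> and </s> so that every
    token, including the first and last, gets a full-width context.
    """
    half = n // 2
    padded = ["<s>"] * half + tokens + ["</s>"] * half
    return [tuple(padded[i:i + n]) for i in range(len(tokens))]
```

For example, `surface_ngrams("dit is een zin".split())` yields one trigram per token, starting with `('<s>', 'dit', 'is')`; a syntactic variant would build the same windows over a dependency or PoS sequence instead of the surface word order.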