Machine Translation Infrastructure for Turkic Languages (MT-Turk)

Emel Alkım and Yalçın Çebi

Department of Computer Engineering, Dokuz Eylul University, Turkey

Abstract: This study presents "MT-Turk", a multilingual, extensible machine translation infrastructure for grammatically similar Turkic languages. The MT-Turk infrastructure supports multi-word expressions and is designed around a combined rule-based translation approach that unites the strengths of the interlingual and transfer approaches. This design makes the system easy to extend with new Turkic languages: a newly added language can be used both as a source and as a destination language, which gives two-way extensibility. In addition, the infrastructure can learn from previous translations and uses the suggestions of previous users for disambiguation. Finally, the success of MT-Turk is evaluated for three Turkic languages (Turkish, Kirghiz and Kazan) using the Bilingual Evaluation Understudy (BLEU) metric, and the suggestion system is found to improve the success by 43.66% on average. Although the lack of linguistic resources affected the success of the system negatively, this study led to the introduction of an extensible infrastructure that can learn from previous translations.

Keywords: Rule-based machine translation, Turkic languages, semi-language-specific interlingua, disambiguation by suggestions.

Received April 21, 2015; accepted November 8, 2016
 