This article compares and evaluates how well machine translation tools help school-leavers recognise and understand idiomatic expressions. The evaluation is based on the use of machine translation services in the academic environment of Skyeng students. Readers are provided with the results of a survey on participants’ experience of using seven services: Google Translate, Yandex Translate, Deepl Translate, Bing Translator, Reverso, MyMemory Translation, and PROMT Online Translator. A comparative analysis of the effectiveness of idiomatic translation showed that school-leavers would continue using only two of the seven translators, since the rest tended to translate literally. Additionally, students evaluated each service according to their individual experience, pointing out the strengths and weaknesses of the websites. The students’ answers were then converted into five-point scores by a formula, in accordance with three criteria.
Keywords: machine translation, idiomatic expressions, online-school Skyeng, school-leavers, learning process, academic environment, ESL learners, tools, evaluation, Google Translate, Yandex Translate, Deepl Translate, Bing Translator, Reverso, MyMemory Translation, PROMT Online Translator.
This article compares and evaluates the quality of machine translation tools for the successful understanding and recognition of idiomatic expressions by school-leavers, based on the introduction of machine translation services into the learning environment of students of the Skyeng online school. Readers are presented with the results of a questionnaire completed by participants who were asked to use seven translation services during their studies. A comparative analysis of the effectiveness of idiomatic translation showed that school-leavers would continue using only two of the seven translators (Google Translate, Yandex Translate, Deepl Translate, Bing Translator, Reverso, MyMemory Translation, and PROMT Online Translator), since the rest translated literally. The students also evaluated each service according to their individual experience, pointing out the weaknesses and strengths of the websites. The students’ scores were additionally converted to a five-point scale by a formula in accordance with the criteria.
Keywords: machine translation, idiomatic expressions, Skyeng online school, school-leavers, educational process, learning environment, evaluation, Google Translate, Yandex Translate, Deepl Translate, Bing Translator, Reverso, MyMemory Translation, PROMT Online Translator.
During study and language immersion, students encounter groups of words whose meaning does not arise from the definitions of their constituent components. These indivisible phrases are called phraseological units or idiomatic expressions. Machine Translation (MT) has always handled idioms unsatisfactorily because it translates phrases in the literal sense instead of treating them as single lexical units [2]. According to Algiers [1], phraseologisms are puzzling because of their nature and unpredictability. As a result, this integral part of the language remains unassimilated by school-leavers.
The present article provides a comparative analysis of seven online translators (Google Translate, Yandex Translate, Deepl Translate, Bing Translator, Reverso, MyMemory Translation, and PROMT Online Translator) that may help English learners understand idiomatic expressions that can be encountered in the Unified State Exam in Russia. The aim is to identify which online tools perform most effectively by gathering feedback from Skyeng students, who were asked to use the above-mentioned engines.
Literature Review
As mentioned in the introduction, recognising idioms tends to be the most challenging part of improving language skills. It often takes many years of exposure to these expressions before a learner is sure of their meaning.
Modern research suggests that Neural Machine Translation (NMT) systems are preferable to Statistical Machine Translation (SMT) for professional work owing to the larger corpora and resources used. Moreover, according to Shofner, the NMT method is the most advanced and has higher accuracy. The information is sent to different ‘layers’ and is processed similarly to the human brain, using neural networks [5].
However, idiomatic phrases pose problems even for NMT models. A 2019 investigation of NMT performance demonstrated that the BLEU scores for regular sentences were higher than for those that contained phraseological units [6]. An error analysis of English-Latvian SMT and NMT translations of 196 sentences from a balanced evaluation set showed that NMT output is more fluent. Nevertheless, word order errors, morphological inaccuracy, wrong lexical choice, and missing phrases could still be encountered in the translations [8].
To date, few articles have been published on inaccurate idiomatic translation, especially on Russian-English idiom interpretation and vice versa. Engines can identify the inherent complexity of these expressions only if the neural architecture is developed for it, which means that the training corpus has to focus on idioms explicitly. Generating such parallel sentences is time-consuming and expensive. Despite that, ESL learners prefer online translators because the tools are affordable and easy to handle. School-leavers have no resources for in-depth decoding of sophisticated expressions, so the most practical solution for them is to turn to websites that are easy to use.
At the same time, some online translators work better than others because of more effective treatment of lexical gaps and more thorough self-training techniques. For instance, the National Institute of Standards and Technology compared 22 machine translation systems in 2005 and found that Google Translate was a top performer in the field [9]. It is important to emphasise that this result refers to sentences with no metaphorical sense. By contrast, Veena Yadav, a Google Translate user, posted a public complaint about dissatisfaction with the service’s literal translation of idioms, which suggests that third-party dictionaries are still needed to find the meaning. The complaint was upvoted more than 20 times by other users [10, URL: https://clck.ru/VnFj2].
The usual criterion of assessment for the comparative analysis of the engines is the BLEU score (bilingual evaluation understudy), an algorithm for evaluating the quality of text that has been machine-translated from one natural language to another. However, the quality of the translation of idiomatic expressions has to be estimated differently. The shortened list of criteria presented by Jabbari is the following: modulation (M), that is, the absence of literal interpretation, which can be allowed only in rare cases; transposition (T), that is, a change in grammatical structure; and the overall consistency of translation (C) [3].
Existing research comparing Machine Translation tools does not cover such domain-specific interpretation. The next part of the article therefore tests the practical application of online services to idiomatic expressions.
The Methods of the Research
The quasi-experimental research consists of several parts, which are based on the analysis of quantitative and qualitative data. Firstly, students from an online English school called Skyeng were asked to get acquainted with seven Machine Translation tools (Google Translate, Yandex Translate, Deepl Translate, Bing Translator, Reverso, MyMemory Translation, and PROMT Online Translator). The websites had to be used for several weeks so that the tools became part of the students’ academic environment. Secondly, participants were asked to use the engines to translate all the unknown idiomatic expressions they encountered during learning and exam preparation. Thirdly, feedback on the services’ performance was gathered by means of a questionnaire. The section «Key Results» provides information about the efficiency of each tool according to the school-leavers’ experience. Additionally, the survey identifies the main issues connected with inadequate translation and provides data for the assessment of the engines’ performance.
Key Results: Quantitative Data
The survey was created to gather students’ feedback and compare MT tools with regard to idiomatic translation. Seventeen anonymous respondents (aged 15 to 18) from the online school Skyeng were asked to use seven MT engines: Google Translate, Yandex Translate, Deepl Translate, Bing Translator, Reverso, MyMemory Translation, and PROMT Online Translator. After three weeks, the participants were asked to share their opinions on the translation quality of the idioms they had encountered. The aim was to analyse and evaluate the overall performance of each tool using a Google Form [7, URL: https://clck.ru/WjofH]. The form includes seven sections, one per engine. The questions for evaluation are the same in every part of the questionnaire:
1. Which idioms did you translate using the tools?
2. Evaluate the overall performance (ten-point Likert scale, where 1 means inaccurate translation, 5 means adequate translation, and 10 means close to perfect interpretation)
3. Which idioms were translated accurately? (list at least 3 of them, providing the original text and its interpretation)
4. Which idioms were translated inaccurately? (list at least 3 of them, providing the original text and its interpretation)
5. According to your experience, what were the main errors connected with?
6. What advantages of the tool can you mark?
7. Would you continue using the tool for idiomatic translation?
Table A.1 shows the percentage of respondents who would keep using each engine even though not all the phrases were translated well. In addition, the table gives the overall score of each tool (questions 2 and 7). According to the survey results, the most effective MT service is Reverso, with 82.4 % (14 students) of respondents who would stick to it. Reverso’s performance was rated 7.3 out of 10, whereas Bing Translator received the lowest score of 4.9 points. PROMT Online Translator, MyMemory Translation, Deepl Translate, and Google Translate showed mediocre results, with overall scores between 5 and 5.6 points. Yandex Translate was ranked as the second-best translation engine, with 70 % (12 students) of participants willing to use the tool for idiomatic translation, which is about 12 percentage points lower than Reverso’s figure.
Having compared the seven tools, one may conclude that some features and translations were favoured more than others. It is essential to proceed with qualitative data analysis to identify possible reasons for poor translation and the strengths of the tools.
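The percentages in Table A.1 can be rechecked from the raw student counts. The following is a minimal sketch, assuming that every share is computed over all 17 respondents; the names `would_continue` and `share_percent` are illustrative and do not come from the survey itself.

```python
# Number of students (out of 17) who would continue using each tool,
# copied from Table A.1.
RESPONDENTS = 17

would_continue = {
    "Google Translate": 8,
    "Yandex Translate": 12,
    "Deepl Translate": 4,
    "Bing Translator": 5,
    "Reverso": 14,
    "MyMemory Translation": 7,
    "PROMT Online Translator": 6,
}


def share_percent(count, total=RESPONDENTS):
    """Return the share of respondents as a percentage, rounded to one decimal."""
    return round(100 * count / total, 1)


shares = {tool: share_percent(n) for tool, n in would_continue.items()}

# Gap between the two leading tools, in percentage points.
gap = round(shares["Reverso"] - shares["Yandex Translate"], 1)

for tool, share in sorted(shares.items(), key=lambda item: item[1], reverse=True):
    print(f"{tool}: {share} %")
print(f"Reverso vs Yandex Translate: {gap} percentage points")
```

Under this assumption, 12 students out of 17 correspond to 70.6 %, which the article rounds to 70 %, and the gap between Reverso and Yandex Translate is 11.8 percentage points.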
Key Results: Qualitative Data
The qualitative part analyses the text-based feedback (questions 3–6) that was submitted via the same Google Form questionnaire [7, URL: https://forms.gle/92gALmMJwZ6NTGpw7]. The analysis is divided into two parts. The first part (questions 3–4) examines which idioms the MT engines translated accurately and inaccurately. The second part (questions 5–6) deals with the possible reasons for inadequate translation and the beneficial aspects noted by Skyeng students.
Part 1. Participants were asked to list all the idiomatic expressions that were used in the experiment.
Table A.2 shows the number of correctly translated idioms and provides examples of the tools’ performance. The figures in Table A.2 are consistent with the students’ evaluation results presented in Table A.1. Reverso showed the highest accuracy, with 54 acceptable translations out of 72. However, it did not cope with some idioms that were accurately interpreted by MyMemory Translation, for example, «throw good money after bad». Similarly, the Russian saying «подложить свинью» (to play a dirty trick on someone) was translated inappropriately by Yandex Translate, which is ranked second in the experiment. This shows that even MT engines with poor overall results can provide precise translations of individual idioms, because different MT tools rely on different parallel corpora.
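The counts in the last column of Table A.2 can be turned into accuracy rates and a ranking. This is a small sketch that assumes each count is the number of accurately translated idioms out of 72; the variable and function names are illustrative only.

```python
# Accurately translated idioms out of 72, copied from Table A.2.
TOTAL_IDIOMS = 72

accurate = {
    "Google Translate": 35,
    "Yandex Translate": 47,
    "Deepl Translate": 31,
    "Bing Translator": 30,
    "Reverso": 54,
    "MyMemory Translation": 27,
    "PROMT Online Translator": 28,
}


def accuracy_percent(correct, total=TOTAL_IDIOMS):
    """Return the share of accurate translations as a percentage (one decimal)."""
    return round(100 * correct / total, 1)


# Tools ordered from the highest to the lowest number of accurate translations.
ranking = sorted(accurate, key=accurate.get, reverse=True)

for place, tool in enumerate(ranking, start=1):
    print(f"{place}. {tool}: {accurate[tool]}/{TOTAL_IDIOMS} "
          f"({accuracy_percent(accurate[tool])} %)")
```

With these figures Reverso reaches 75 % and Yandex Translate about 65 %, while the remaining five tools stay below 50 %.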
Part 2. This section is devoted to school-leavers’ answers to Question 5. Moreover, it contains the interpretation of students’ replies in accordance with the three criteria (modulation, transposition, consistency) presented in «Literature Review», which were evaluated on a scale from 1 to 5. The conversion formula is: score = (100 % − the percentage of respondents who chose the answer option) / 20. The first criterion (M) corresponds to the answer «the translation was literal; however, the main idea was still vivid»; the second one (T) to «the connotation was changed»; the third one (C) to «the translation was incoherent, the phrase became a primitive set of words». This part of the experiment additionally includes the advantages of each tool noted by respondents.
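The conversion of answer shares into criterion marks can be sketched as follows. This is an illustration of the formula stated above, not the authors’ own script; the function name `criterion_score` and the sample percentages are hypothetical, since the article reports only the resulting marks.

```python
def criterion_score(percent_choosing_option):
    """Convert the share of respondents who chose an error option into a mark.

    Implements: score = (100 - percent) / 20.
    The more students report a given error, the lower the mark.
    """
    if not 0 <= percent_choosing_option <= 100:
        raise ValueError("percentage must be between 0 and 100")
    return (100 - percent_choosing_option) / 20


# Hypothetical shares, chosen only to illustrate the arithmetic.
for percent in (0, 20, 60, 100):
    print(f"{percent} % of respondents -> {criterion_score(percent)} points")
```

For example, a mark of 2 out of 5 would correspond to 60 % of respondents choosing the error option. Note that, taken literally, the formula yields 0 when every respondent chooses the option, so the lowest mark depends on how such a case is treated.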
Table A.3 shows that Reverso occupies the leading position with 9.9 points out of 15 across the three criteria. PROMT Online Translator, MyMemory Translation, and Bing Translator performed slightly better in terms of modulation (2.6 points each against Reverso’s 2.4), which means that these tools translated fewer idioms literally. In terms of transposition, that is, changed grammatical structure, Google Translate is the leader with 4 points out of 5, and Yandex Translate also demonstrated strong results with 3.8 points. The overall consistency of translation (C) is one of the weakest sides of the tools. According to Kelly [4], consistency embodies cultural adaptation and provides the closest equivalent to the source-language message. On this criterion, Reverso is the clear leader in the experiment with 4 points out of 5. Students’ feedback confirms that the main issues with translation are connected with «word-for-word» interpretations. What is more, some of the engines have narrow databases, so the websites simply translate the phrase literally.
As for the tools’ benefits, Skyeng students noted the opportunity to rate translations (Reverso, Deepl Translate). The most helpful feature, which improves both the performance of the sites and the user experience, is «idioms in context» (PROMT Online Translator, MyMemory Translation, Reverso, Yandex Translate, Google Translate). It makes a phrase easy to remember and helps to build associations. It is an extremely promising development towards higher accuracy and the ability to extend to new language pairs.
A summary of the experiment’s main results and the limitations of the present research are given in the next section of the article.
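The totals across the three criteria can be recomputed from the marks in Table A.3. The sketch below assumes that the total is a plain unweighted sum of M, T, and C; the dictionary layout and names are illustrative.

```python
# Criterion marks (out of 5) copied from Table A.3:
# M = modulation, T = transposition, C = consistency.
criteria = {
    "Google Translate": {"M": 2.0, "T": 4.0, "C": 2.0},
    "Yandex Translate": {"M": 1.8, "T": 3.8, "C": 3.5},
    "Deepl Translate": {"M": 2.4, "T": 3.5, "C": 2.6},
    "Bing Translator": {"M": 2.6, "T": 3.0, "C": 2.0},
    "Reverso": {"M": 2.4, "T": 3.5, "C": 4.0},
    "MyMemory Translation": {"M": 2.6, "T": 3.5, "C": 3.2},
    "PROMT Online Translator": {"M": 2.6, "T": 3.0, "C": 2.0},
}

# Unweighted total out of 15, rounded to avoid floating-point noise.
totals = {tool: round(sum(marks.values()), 1) for tool, marks in criteria.items()}
best_overall = max(totals, key=totals.get)

# Highest mark reached on each individual criterion.
best_by_criterion = {
    name: max(marks[name] for marks in criteria.values()) for name in ("M", "T", "C")
}

for tool, total in sorted(totals.items(), key=lambda item: item[1], reverse=True):
    print(f"{tool}: {total}/15")
```

The sum for Reverso reproduces the 9.9 points mentioned in the text. On the same unweighted basis, MyMemory Translation (9.3) comes slightly ahead of Yandex Translate (9.1), although far fewer students would continue using it.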
Discussion
Machine translation of texts and idioms has improved over the years. However, idiomatic translation remains a separate problem for MT systems, which process such unique phrases from natural language with only limited success. MT engines can therefore cope with idiomatic expressions successfully only if dedicated NLP tools are implemented in them. Summing up the practical part of the work, Skyeng school-leavers acknowledged the tools’ imperfection by preferring only two translators out of seven for further study of idioms. The total number of accurately interpreted phrases confirms the students’ choice. Although engines like PROMT Online Translator and MyMemory Translation have distinctive features and even cope with rare language pairs, the tools cannot be fully trusted because of their low scores for the absence of literal interpretation and for overall consistency. The limitations of the experiment include the dissimilar experiences students had with the same tools, since different language pairs were translated within the usage limits.
Conclusion
In conclusion, the comparison of seven different MT tools has shown that five engines are not preferable for learning idioms. Skyeng students tested the websites and found that idiomatic expressions are frequently translated literally because of the absence of context and the lack of parallel texts. This study can be valuable and practical for those English language learners (ELLs) who often face these phraseological units in a learning environment. The results can also be used for a future research project, which may include other languages, less popular expressions, and significantly more participants.
Appendix A
The results of the Skyeng school-leavers survey
Table A.1
Students’ evaluation results
| Name of the tool | Respondents who would continue using the tool (%) | Tool’s overall score (ten-point scale) |
| --- | --- | --- |
| Google Translate | 47 % (8 students) | 5.5/10 |
| Yandex Translate | 70 % (12 students) | 7.2/10 |
| Deepl Translate | 23.5 % (4 students) | 5/10 |
| Bing Translator | 29.4 % (5 students) | 4.9/10 |
| Reverso | 82.4 % (14 students) | 7.3/10 |
| MyMemory Translation | 41.2 % (7 students) | 5.6/10 |
| PROMT Online Translator | 35.3 % (6 students) | 5.6/10 |
Table A.2
MT tools’ performance in a specific context
| Name of the tool | Accurately translated idioms | Inaccurately translated idioms | Accurate translations in total |
| --- | --- | --- | --- |
| Google Translate | Come hell or high water: что бы ни случилось; Вертеться как белка в колесе: spin around like a squirrel in a wheel | Spill the beans: пролить бобы; Подложить свинью: add a pig | 35/72 |
| Yandex Translate | Spill the beans: проболтаться; Пара пустяков: piece of cake | Have somebody on the string: есть кого-то на строке; Подложить свинью: plant a pig | 47/72 |
| Deepl Translate | For a rainy day: на чёрный день; Лезть на рожон: going to the trouble | Piece of cake: кусок пирога; Вот где собака зарыта: that's where the dog is buried | 31/72 |
| Bing Translator | Piece of cake: проще простого; Вот где собака зарыта: that's where the hen scratches | Make up for lost time: наймите потерянное время; Подложить свинью: put a pig | 30/72 |
| Reverso | Make up for lost time: наверстать время; Подложить свинью: dirty trick | Throw good money after bad: бросать хорошие деньги после плохих; Котелок не варит: boiling pot | 54/72 |
| MyMemory Translation | Throw good money after bad: бросать деньги на ветер; В гостях хорошо, а дома лучше: east or west home is best | Be under the weather: быть под непогоду; Как в воду кануть: how to sink into the water | 27/72 |
| PROMT Online Translator | Not on speaking terms: в ссоре; Как в воду кануть: to disappear into thin air | Apple of eye: яблоко глаза; Душа ушла в пятки: soul went to heels | 28/72 |
Table A.3
Features of the tools and errors evaluation
| Name of the tool | Advantages | What caused mistakes (students’ comments) | Criteria evaluation |
| --- | --- | --- | --- |
| Google Translate | verified translations; idioms in context; speed; widespread use | «The translation did not fully match the Russian equivalent» | M: 2/5, T: 4/5, C: 2/5 |
| Yandex Translate | idioms in context; appealing design; explanation of the figurative meaning | «The translation was literal.» | M: 1.8/5, T: 3.8/5, C: 3.5/5 |
| Deepl Translate | one can rate translations; document conversion; inspires confidence (AI, neural networks) | «The tool lacks accuracy, database is narrow» | M: 2.4/5, T: 3.5/5, C: 2.6/5 |
| Bing Translator | corrects sentence structure; hotkeys; cache memory | «I was not pleased with this translator; it changed the meaning of phrases» | M: 2.6/5, T: 3/5, C: 2/5 |
| Reverso | interface; idioms in context; differentiates AmE and BrE; 5-star feedback system | «Direct translation or word-for-word translation can be met» | M: 2.4/5, T: 3.5/5, C: 4/5 |
| MyMemory Translation | idioms in context; «human contributions» and «computer translation» sections; translations marked by experts | «It takes time to find an appropriate option among all the translations» | M: 2.6/5, T: 3.5/5, C: 3.2/5 |
| PROMT Online Translator | great performance with Ru-Eng pairs; idioms in context; works better with unpopular idioms | «Some words were not even translated by the tool» | M: 2.6/5, T: 3/5, C: 2/5 |
References:
1. Algiers, A. (2021). The Challenge of Idioms for Language Learners. Confianza. [Electronic resource]. URL: https://ellstudents.com/blogs/the-confianza-way/challenge-of-idioms-for-language-learners (last viewed 10.05.2021).
2. Anastasiou, D. (2010). Idiom treatment experiments in machine translation. Cambridge Scholars Publishing. pp. 2–3.
3. Jabbari, M. (2016). Idiomatic Expressions in Translation. Journal of Advances in Humanities. pp. 507–514.
4. Kelly, L. (1970). Cultural Consistency in Translation. The Bible Translator. pp. 170–175.
5. Shofner, K. (2021). The Challenge of Idioms for Language Learners. United Language Group. [Electronic resource]. URL: https://www.unitedlanguagegroup.com/blog/statistical-vs-neural-machine-translation (last viewed 10.05.2021).
6. Parmar, J., & Estrada-Arias, D. (2019). Idiomatic Language Translation and Transfer Learning. Stanford. pp. 3–4.
7. Pakulova, D. (2021). Google Forms: «Comparison of MT tools». [Electronic resource]. URL: https://docs.google.com/forms/d/1Eou1Tri1t9z3j1YqMI5gMe9-JvFBx-PHawlKBvuj-Ck/edit?usp=sharing (last viewed 17.05.2021).
8. Skadiņš, R., Goba, K., & Šics, V. (2010). Improving SMT for Baltic Languages with Factored Models. pp. 125–132.
9. Vanjani, M., & Aiken, M. (2020). A Comparison of Free Online Machine Language Translators. Journal of Management Science and Business Intelligence. pp. 1–2.
10. Yadav, V. Sometimes Google translates «Idioms» Literally Not Figuratively? [Electronic resource]. URL: https://support.google.com/translate/thread/46838266/sometime-google-translate-idioms-literally-not-figuratively?hl=en (last viewed 12.05.2021).